
Archive for March, 2007
Management Fad
March 29, 2007Writing a Digg-Style Popularity Algorithm
March 27, 2007Not so long ago I was tasked with creating a web site similar to Digg – users voted on records in the database, and those deemed “popular” were promoted to the front page. But how exactly does such an algorithm work? Let’s take a look under the hood…
Before we begin, a quick disclaimer: These are my own thoughts on such an algorithm, from scratch. I don’t know how Digg, Reddit and other such sites rank popular items, so this may be a quite different.
Records and Votes
We have a database record, whether it be a news story, picture, video, podcast, whatever. And we have voting, a method for users to place a single vote on a particular database record. We could order the “Popular” category by number of votes and be done with it.
Popularity = Votes
However records very quickly become stale, as new submissions are entered and the voting process begins again. If records from last year have 300 votes each, but popular records this year only have 100 or so, they won’t see the front page. So we need to look at time as a factor as well.
Records and Time
Let’s introduce the age of the record as another variable. If the record is newer it should have a higher prominence on the front page, yes?
Popularity = Votes / Record Age
The older the record, the more votes it requires to achieve popularity. But that’s not really fair – if a record takes a little while to receive votes, it doesn’t get the credit it deserves. This is especially a problem if your site doesn’t yet have enough traffic.
So what’s the solution? Let’s take a look at the age of each vote for the record.
Records and Vote Time
To keep the front page fresh, we can give more weight to votes placed recently. That way if a story is hasn’t yet received the credit it deserves, it’s still in for a chance if several users notice its value and vote accordingly. So let’s iterate through votes and calculate a popularity score:
Popularity = (V1/A1) + (V2/A2) + … + (Vn/An)
Vn is a vote, and An is the age of that vote (for example, in minutes). If a vote is 60 minutes old, it is worth 1/60th of a vote placed 1 minute ago. All the values of all the votes are added to achieve a popularity score.
This seems to solve the previous problem, but introduces a new one. If a single person votes on a record a year old, his vote will be worth more than 200 votes on a different record posted yesterday. Old material comes back to haunt the front page. So we’re close, but no cigar yet.
Let’s take a look again at the age of the record …
Records and Time and Vote Time
If we put together everything we’ve discussed so far, we get something that looks like this:
Popularity = [ (V1/A1) + (V2/A2) + … + (Vn/An) ] / Record Age
It’s a bit of a mouthful, but basically it adds together the weighted votes based on age, then divides that total by the age of the record. It doesn’t impose too much of a time limit on becoming popular, it dampens votes based on age, and prevents old stories from leaping back to the front page. It solves all our problems.
Dampening Popularity
Admission: In writing this article, I’ve discovered a more advanced algorithm than used previously on my project. Guess what I’m doing this evening?
But already I can see a problem – I think a dampening effect will need to be introduced to prevent wild jumps of increased popularity and back down again. I will update this post when I’ve had a chance to implement the new algorithm in the wild.
Other Variables
What other variables could we introduce to the algorithm?
- Number of page views – a form of popularity, but far less useful than voting
- Page views versus people who don’t vote – If 60% of readers vote, should it be more popular than if 10% vote? Remember this isn’t the number of votes, it’s the percentage of readers who vote
- Iterative algorithm of amount of time between individual votes
- Voting down as well as up
- Trustworthiness of user who originally submitted the record, maybe based on votes of previous submissions
- If a web URL is involved, maybe use metadata such as Google PageRank, inbound links (Google/Yahoo API), Blogosphere activity, and so forth
Further Reading
Say What Again
March 23, 2007The basic idea of Jarratt Moody’s project is to take a piece of audio from wherever (movie, song, poetry reading, answering machine) and then represent that audio on screen using only typography.

(watch)
Jarratt chose a famous bit of dialogue from Pulp Fiction as his subject matter. (Okay, what dialogue from Pulp Fiction isn’t famous?) The resulting piece is full of whimsy and style. Jarratt does some great things with scale and simple but effective camera movements. Get those headphones on and prepare yourself for several lashings of Samuel L. Jackson’s naughty tongue.
The Life and Times of the Thunderbolt Kid
March 15, 2007
Bill Bryson is one of my all-time favorite writers. He’s a master of observation, which makes him an ideal travel writer. His obsession with minutia and his pessimistic but abundant sense of humor, combined with a love for library-style research, results in engaging books that make me delighted, wistful, astonished, and nostalgic, often at the same time.
His latest book, The Life and Times of the Thunderbolt Kid, also happens to be his best book. It’s his memoir of growing up in the 1950s and 1960s in Des Moine, Iowa, and it’s as close to being a time machine back to that wonderful era than anything I’ve encountered.
His descriptions of the independently owned businesses in Des Moines in the 1950s — the diners, dime stores, theaters, department stores, and supermarkets — had me aching to return to an era where grocery stores had a “kiddy coral” loaded with comic books where kids could read in bliss while their parents bought TV dinners and chubs of cheese so shiny you could see your reflection in them (I’m paraphrasing Bryson on the cheese part).
He covers a lot of territory in under 300 pages: the cold war, the civil rights movement, the generally poor quality of kids toys (he goes off on a hysterically funny account of those old vibrating football games), farm living, juvenile delinquency, the postwar economic boom, Main Street, Disneyland, childhood friends, teenage lust, homemade explosives, and unfriendly dogs. Every page is laced with witty gems of rare insight and dry humor. He ends the book with a head-shaking look at the end of this magical era, where Wal-Marts have replaced dry goods stores and neighborhood streets, once full of kids riding bikes, rollerskating, playing kick the can, and running around in large packs, are silent, save for cars going by, because all the kids are locked up inside watching TV as a result of our culture being scared to death of everything.
Alternate Soundtrack To Old ’70s PSA For Union Workers
March 14, 2007Awesomely hilarious union commercial from the 70s with alternate (and filthy) narration. According to the person who posted it on YouTube: “This is a rare gem. This was a PSA that the voice-over person decided to record an “alternate” version of for fun. This comes from the archives of a local tv station. You won’t find this anywhere.”
The Weird and Wonderful Design of Japanese Sake Bottle Labels
March 3, 2007These bottles are not just pleasing to the eye of alcoholics. Japanese sake labels have bold designs decorated with pictures of crashing waves, strong calligraphy letterings, and gildings. Read more…

(from PingMag)