
Two Programmers Claim Reddit's Voting Algorithm Is Flawed

Dec 10, 2013, 21:00 IST

Bryan Cranston on Reddit. (Photo: @BryanCranston / Twitter)

Two programmers have separately concluded that the algorithm Reddit uses to sort its posts is flawed in a way that discriminates against new posts that briefly trigger a negative reaction from Reddit readers.

That seems trivial at first, especially if you don't understand code. Reddit appears to be working just fine - so who cares about a typo among thousands of lines of code?

But Reddit is a massive distributor of traffic around the web. It had 90 million unique users visit its pages last month. Publishers (Business Insider included) benefit hugely when a post becomes popular on Reddit. A single hot link from Reddit can pour hundreds of thousands of readers into your site within hours. And those pageviews are easily monetized with ads.

So there is a lot at stake. People trust Reddit to get it right.

Ian Greenleaf, a San Diego-based programmer, claims in a blog post that the sorting mechanism Reddit uses to rank new posts can bury those posts if they initially receive a few negative votes. It's complicated, but basically Reddit's code - which has been released publicly so developers can examine it - ranks posts on two factors: time, so that new posts are favored over old posts; and net votes, so that posts people like are favored over those that people don't like.

The problem, Greenleaf says, occurs when a new post gets a few negative votes before it gets any positive votes, rendering its vote score less than zero:

... imagine one submission made a year ago, and another submission made just now. The year-old submission received 2 upvotes, and today's submission received two downvotes. This is a small difference - perhaps today's submission got off to a bad start and will rebound shortly with several upvotes. But under this implementation, today's submission now has a negative hotness score and will rate lower than the submission from last year.

Greenleaf says that the formula condemns some posts to a "purgatory" in Reddit, where they never get seen by other redditors. "These posts are sad, alone, and afraid. And notably, they are sorted oldest first, just as I predicted."
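For readers who want to see the effect rather than take it on faith, here is a minimal Python sketch of the ranking formula as it has been commonly described in blog write-ups. It is an illustration, not Reddit's actual source: the function name `hot`, the two constants, and the example dates are assumptions made for this sketch. It reproduces Greenleaf's scenario of a year-old post with two upvotes versus a brand-new post with two downvotes.

```python
from datetime import datetime, timedelta, timezone
from math import log10

# Assumed constants, taken from commonly circulated descriptions of the
# formula; neither value is confirmed by this article.
EPOCH_OFFSET = 1134028003  # reference timestamp, in seconds
DECAY_SECONDS = 45000      # seconds of age that offset a 10x change in votes


def hot(ups, downs, posted_at):
    """Hypothetical 'hotness' score combining net votes and posting time."""
    score = ups - downs
    # Vote magnitude on a log scale, so the first votes count the most.
    order = log10(max(abs(score), 1))
    # +1, 0 or -1 depending on whether the net score is positive, zero or negative.
    sign = (score > 0) - (score < 0)
    seconds = posted_at.timestamp() - EPOCH_OFFSET
    # The sign multiplies the time term, which is the behavior being disputed:
    # for a negative score, a newer timestamp makes the result more negative.
    return round(order + sign * seconds / DECAY_SECONDS, 7)


now = datetime(2013, 12, 10, tzinfo=timezone.utc)
year_ago = now - timedelta(days=365)

old_post = hot(2, 0, year_ago)      # year-old post with 2 upvotes
new_post = hot(0, 2, now)           # brand-new post with 2 downvotes
old_disliked = hot(0, 2, year_ago)  # year-old post with 2 downvotes

print(new_post < old_post)      # the new post ranks below the year-old one
print(old_disliked > new_post)  # among disliked posts, older ones rank higher
```

Under these assumptions, the time component that normally lifts a fresh post is flipped as soon as the net score drops below zero, so the newest disliked post ends up at the very bottom. That matches Greenleaf's observation that the buried posts are sorted oldest first.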

Systems librarian Jonathan Rochkind has made the same claim:

So it turns out there's a significant typo, that keeps the algorithm from working right, in the several previously blogged descriptions of reddit's story-ranking algorithm.

... More oddly, this same significant typo is in the public version of reddit's code released on github.

On Hacker News, the claim has been rebutted by a user named "ketralnis," who appears to be a Reddit administrator. Ketralnis says that making sure disliked posts don't show up is kinda the point:

This comes up every 6 months or so, always with some sensational title like this.

... The thing is, the two most important pages are the front page (or a subreddit's own hot page) and the new page. The new page is sorted by date ignoring hotness, and if something has a negative score it's not going to show up on the front/hot page anyway. The two other main opportunities to get popular (rising and the organic box) don't really use hotness either.

So when it comes down to it, what happens below 0 is pretty moot. Smoothness around the real life dates and scores on the site is more important than smoothness around 0, where we don't really have listings that will display it anyway.

In summary, there don't exist listings in which the discontinuities at 0 really matter.

That comment has sparked a big debate here and here.
