Quoted from evanmiller.org:
CORRECT SOLUTION: Score = Lower bound of Wilson score confidence interval for a Bernoulli parameter
Say what: We need to balance the average rating with the uncertainty of a small number of observations. Fortunately, the math for this was worked out in 1927 by Edwin B. Wilson. What we want to ask is: Given the ratings I have, there is a 95% chance that the “real” average rating is at least what? Wilson gives the answer. For simplicity we suppose that there are only positive ratings with value 1 and negative ratings with value 0. Then this lower bound on the average rating is given by:

$$\frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p}) + \frac{z_{\alpha/2}^2}{4n}}{n}}}{1 + \frac{z_{\alpha/2}^2}{n}}$$

(For a lower bound use minus where it says plus/minus.) Here $\hat{p}$ is the observed fraction of positive ratings, $z_{\alpha/2}$ is the $(1-\alpha/2)$ quantile of the standard normal distribution, and $n$ is the total number of ratings. If that doesn’t make sense to you, maybe this Ruby code will:
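
    require 'statistics2'

    # Lower bound of the Wilson score interval for pos positive ratings out of n.
    def ci_lower_bound(pos, n, power)
      if n == 0
        return 0
      end
      # z is the (1 - power/2) quantile of the standard normal distribution
      z = Statistics2.pnormaldist(1-power/2)
      # observed fraction of positive ratings (1.0* forces float division)
      phat = 1.0*pos/n
      (phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
    end

pos is the number of positive ratings, n is the total number of ratings, and power, despite its name, is the significance level α of the interval rather than the statistical power: I would pick 0.05, which gives z ≈ 1.96 (the two-sided 95% interval).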
Now for any item that has a bunch of positive and negative ratings, use that function to arrive at a score appropriate for sorting on, and be confident that you are using a good algorithm for doing so.
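As a rough illustration of that sorting step, here is a minimal sketch. It assumes z is hardcoded to 1.96 (the two-sided 95% value) so that it runs without the statistics2 gem. The method name wilson_lower_bound and the items and their counts are hypothetical, made up for this example.

```ruby
# Assumption: z is fixed at 1.96 (two-sided 95% interval) instead of being
# computed from a significance level, so no gem is required.
Z = 1.96

# Lower bound of the Wilson score interval for pos positives out of n ratings.
def wilson_lower_bound(pos, n, z = Z)
  return 0.0 if n == 0
  phat = pos.to_f / n
  (phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat) + z*z/(4*n))/n)) / (1 + z*z/n)
end

# Hypothetical items: [name, positive ratings, total ratings]
items = [
  ["A", 1, 1],     # 100% positive, but only a single rating
  ["B", 90, 100],  # 90% positive over many ratings
  ["C", 5, 10],    # 50% positive over a few ratings
]

# Sort descending by the lower bound, not by the raw average.
ranked = items.sort_by { |_, pos, n| -wilson_lower_bound(pos, n) }

ranked.each do |name, pos, n|
  puts format("%s %.3f", name, wilson_lower_bound(pos, n))
end
```

With these made-up counts, B ranks first, and A (one rating, 100% positive) lands below C, because a single observation leaves the lower bound close to 0.2 even though the raw average is 1.0.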