When you need to correctly display the leader-board based on ratings, you can’t just display the average rating for each entry.
------------------------------------ Restaurant A | 1 Vote(s) | Rating 10 ------------------------------------ Restaurant B | 3 Vote(s) | Rating 6 | Rating 5 | Rating 4 ------------------------------------ Restaurant C | 2 Vote(s) | Rating 3 | Rating 10
To correctly rank the data above you need to get the Bayesian Average for each restaurant. We have to take into account the number of votes for each restaurant. More votes would push it up the ranking while less votes would have little weight. This means that we if 1 restaurant had 1 vote and with rating of 10, it would not be the number one. (If we just got the average, it would be at the top! – and that would be wrong)
//In PHP $avg_num_votes = 2; //The average number of votes for all restaurants (1+3+2)/3 = 2 $avg_rating = 6.3333; //The average rating for all restaurants (10+6+5+4+3+10)/6 = 6.3333 $this_num_votes = 3; //The number of votes for current restaurant (Restaurant B) $this_rating = 5; //The average rating for current restaurant (Restaurant B: (6+5+4)/3 = 5) $bayesian_average = (($avg_num_votes * $avg_rating) + ($this_num_votes * $this_rating)) / ($avg_num_votes + $this_num_votes); |
Using the formula above, we would have the following ratings for each restaurant:
Restaurant A would have 7.55553 = ((2*6.3333) + (1*10)) / (2+1)
Restaurant B would have 5.53332 = ((2*6.3333) + (3*5)) / (2+3)
Restaurant C would have 6.41665 = ((2*6.3333) + (2*6.5)) / (2+2)
A quick solution using MySQL View Tables
Create View `ratings` AS SELECT restaurant_id, (SELECT count(restaurant_id) FROM ratings) / (SELECT count(DISTINCT restaurant_id) FROM ratings) AS avg_num_votes, (SELECT avg(rating) FROM ratings) AS avg_rating, count(restaurant_id) as this_num_votes, avg(rating) as this_rating FROM ratings GROUP BY restaurant_id |
To get the ratings for the restaurants:
SELECT restaurant_id, ((avg_num_votes * avg_rating) + (this_num_votes * this_rating)) / (avg_num_votes + this_num_votes) as real_rating FROM `ratings` |
Would output something like:
restaurant_id real_rating 1 7.555533333333 2 5.533320000000 3 6.416650000000
Sources:
http://blog.linkibol.com/2010/05/07/how-to-build-a-popularity-algorithm-you-can-be-proud-of/
http://snipplr.com/view/22559/