A web developer's blog. PHP, MySQL, CakePHP, Zend Framework, Wordpress, Code Igniter, Django, Python, CSS, Javascript, jQuery, Knockout.js, and other web development topics.

Getting the Bayesian Average for rankings (PHP / MySQL)

When you need to correctly display the leader-board based on ratings, you can’t just display the average rating for each entry.

Restaurant A | 1 Vote(s) | Rating 10
Restaurant B | 3 Vote(s) | Rating 6
                         | Rating 5
                         | Rating 4
Restaurant C | 2 Vote(s) | Rating 3
                         | Rating 10

To correctly rank the data above you need to get the Bayesian Average for each restaurant. We have to take into account the number of votes for each restaurant. More votes would push it up the ranking while less votes would have little weight. This means that we if 1 restaurant had 1 vote and with rating of 10, it would not be the number one. (If we just got the average, it would be at the top! – and that would be wrong)

//In PHP
$avg_num_votes = 2;   //The average number of votes for all restaurants (1+3+2)/3 = 2
$avg_rating = 6.3333; //The average rating for all restaurants (10+6+5+4+3+10)/6 = 6.3333
$this_num_votes = 3;  //The number of votes for current restaurant (Restaurant B)
$this_rating = 5;     //The average rating for current restaurant (Restaurant B: (6+5+4)/3 = 5)
$bayesian_average = (($avg_num_votes * $avg_rating) + ($this_num_votes * $this_rating)) / ($avg_num_votes + $this_num_votes);

Using the formula above, we would have the following ratings for each restaurant:

Restaurant A would have 7.55553 = ((2*6.3333) + (1*10)) / (2+1)
Restaurant B would have 5.53332 = ((2*6.3333) + (3*5)) / (2+3)
Restaurant C would have 6.41665 = ((2*6.3333) + (2*6.5)) / (2+2)

A quick solution using MySQL View Tables

Create View `ratings` AS
    (SELECT count(restaurant_id) FROM ratings) / (SELECT count(DISTINCT restaurant_id) FROM ratings) AS avg_num_votes,
    (SELECT avg(rating) FROM ratings) AS avg_rating,
    count(restaurant_id) as this_num_votes,
    avg(rating) as this_rating

To get the ratings for the restaurants:

    ((avg_num_votes * avg_rating) + (this_num_votes * this_rating)) / (avg_num_votes + this_num_votes) as real_rating 
FROM `ratings`

Would output something like:

restaurant_id    real_rating
1                7.555533333333
2                5.533320000000
3                6.416650000000


This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>