Brewing a Better Rating System: The 2nd Steep
A couple weeks ago we launched our new slider rating system and we’ve received tons of great feedback. But as people play with the slider and rate each of their teas, they’ve also come up with some questions. Mainly: How exactly are the average ratings determined?
As you may have noticed, a single user rating of 87 does not mean the overall tea rating will be the same. Now you might be thinking, “I rate my Darjeeling a 72 and the overall rating comes out as a 77? What crazy, experimental tea are those Steepster guys drinking?” Although we’ve had some questionable ideas after a few late night cups of yerba mate, there actually is a method to our madness.
A Bayesian Brew
Before the switch to the new slider rating system we used a regular average. This created situations where a single outlier rating caused a tea’s overall rating to be higher or lower than a different tea with many more “normal” ratings. Sure, one person may think their custom blend of Assam, lime and dirt is the greatest tea ever to exist and give it a single rating of 100, but we don’t think that should mean its overall rating is higher than a Golden Yunnan 50 people rate at 95.
We wanted to accommodate the multiple factors that play into the popularity/rating of a tea, so after studying the thousands of ratings on Steepster we went with an approach known as a Bayesian Average. Basically, this allows us to include related factors in the calculation so small numbers of extreme ratings don’t skew the overall tea rating. These are the factors we think affect a tea’s rating:
- Average rating across all teas on Steepster
- Average number of times each tea has been rated
- Number of ratings for the tea you’re looking at
- Average rating for that specific tea
We took all of those variables and smashed them into an equation that made sense. Now, ratings are more relevant and outliers won’t have as significant an effect on a tea’s overall rating. The more people that rate a tea, the more valid and supported the overall rating becomes.
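We haven’t published the exact equation, but a common way to combine the four factors above into a Bayesian average looks like the sketch below. The function name and the choice of the site-wide average number of ratings as the prior weight are our illustration, not necessarily the exact formula running on Steepster:

```python
def bayesian_average(tea_ratings, global_mean, avg_num_ratings):
    """Blend a tea's own average with the site-wide average.

    tea_ratings     -- list of ratings (0-100) for this tea
    global_mean     -- average rating across all teas on the site
    avg_num_ratings -- average number of ratings per tea (acts as the
                       weight of the site-wide prior; an assumption here)
    """
    n = len(tea_ratings)
    if n == 0:
        # No ratings yet: fall back to the site-wide average.
        return global_mean
    tea_mean = sum(tea_ratings) / n
    # The more ratings a tea has, the more its own mean dominates;
    # with few ratings, the site-wide mean keeps outliers in check.
    return (avg_num_ratings * global_mean + n * tea_mean) / (avg_num_ratings + n)

# One rating of 100 barely moves the needle off a site-wide mean of 77:
print(round(bayesian_average([100], 77, 20)))       # → 78
# Fifty ratings of 95 pull the overall rating close to 95:
print(round(bayesian_average([95] * 50, 77, 20)))   # → 90
```

This is why a single 72 can still show an overall rating near the site average: with only one vote, the prior carries most of the weight.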
We’re pretty happy with what came out, but that doesn’t mean we’re above improvements and suggestions. We’re also working to make it more visually obvious that you’re looking at a weighted average and not just a straight-up average. So, if you have any brilliant ideas or even just a thought about how to make our system the tiniest bit better, we’d love to hear it.