Netflix, I think, has killed their own recommendation algorithm when they removed stars and made it boolean. I don't know if those buttons even do anything anymore because I think they're just matching based off demographic and who they're trying to market to now. It's recommending shows that I never would watch in a million years and giving them high matches despite me disliking most similar shows just because I'm a 20-something male.
I think Netflix probably did it because people are inherently bad at being objective. For the average person, 2 stars for one movie and 4 for another isn't based on anything measurable; even they couldn't explain the difference. I'm shocked at some of the Amazon product reviews, most of which are 5-star reviews even if the product is absolutely terrible. Movies are different from products, but it's the same people doing the reviewing. Remember, the average user is not a thinking, analytical HN user. Average people are much better at boolean choices.
I know I'm in the minority here, but I am a big fan of the new system. I would torture myself trying to decide between, e.g., 3 or 4 stars for a movie. And then go back and re-rate other movies that I realized I liked more but rated lower than the just-rated movie.
Their % match numbers are fairly accurate, but I have had to go into the watch history and delete the occasional movie watched and finished that we actually hated. No number of 1-stars (or thumbs downs) would eradicate its effects on the recommendations.
5 - Absolutely loved it, will buy a disc
4 - Good, but won't buy a disc
3 - Movie was okay
2 - Not a good movie
1 - Stopped watching 20 minutes into it
My problem with binary choice is that 1 == 2 and 3 == 4 == 5, whilst 1 and 5 were very special for me. :(
Plus the scale bias differing vastly between people and cultures makes the data a mess. Like, say, for me a 5 means 100% perfect. That's why discrete choice stuff is all the rage in the market research world (unless that's changed in the last few years).
Asking people "which of these 3 things you like best" vs. "rate these 3 things 1-5" will usually give you much more useful data, plus be easier for respondents.
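To make the "pick the best of 3" idea concrete, here is a rough sketch of how such answers could be tallied. It is only a counting heuristic (share of times an option was picked when it was shown); real discrete choice analysis usually fits a proper choice model instead. The function name `choice_shares` and the sample data are made up for illustration.

```python
from collections import Counter

def choice_shares(tasks):
    """tasks: iterable of (options_shown, option_picked) pairs.

    Returns, for each option, the fraction of its appearances in which
    it was picked as the best one. Hypothetical helper, not a real API.
    """
    shown = Counter()
    picked = Counter()
    for options, choice in tasks:
        if choice not in options:
            raise ValueError(f"{choice!r} was not among the options shown")
        shown.update(options)   # every displayed option gets one exposure
        picked[choice] += 1     # only the winner gets a pick
    return {option: picked[option] / shown[option] for option in shown}

# Made-up survey answers: which of these 3 movies did you like best?
tasks = [
    (("A", "B", "C"), "A"),
    (("A", "B", "D"), "A"),
    (("B", "C", "D"), "C"),
]
shares = choice_shares(tasks)
```

Nobody has to decide what "3 stars" means here; each answer is just a comparison, so per-person scale differences never enter the data.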
Popular recommendation algorithms like collaborative filtering by matrix factorization account for user and item biases (the simplest method is to normalize a particular user's ratings by that user's average rating).
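A minimal sketch of that bias idea, for anyone curious. This is not Netflix's method and not a full matrix factorization; it only estimates a global mean, a per-user offset and a per-item offset by plain averaging, which is the "baseline" part that biased factorization models typically learn alongside their latent factors. The names `fit_biases` and `baseline` and the toy ratings are assumptions for illustration.

```python
from collections import defaultdict

def fit_biases(ratings):
    """ratings: list of (user, item, stars) tuples.

    Returns (global_mean, user_bias, item_bias), estimated by simple
    averaging rather than by the regularized fit a real system would use.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # User bias: how far above/below the global mean this user tends to rate.
    by_user = defaultdict(list)
    for user, _, r in ratings:
        by_user[user].append(r - mu)
    user_bias = {u: sum(d) / len(d) for u, d in by_user.items()}

    # Item bias: what is left over once the user's own tendency is removed.
    by_item = defaultdict(list)
    for user, item, r in ratings:
        by_item[item].append(r - mu - user_bias[user])
    item_bias = {i: sum(d) / len(d) for i, d in by_item.items()}

    return mu, user_bias, item_bias

def baseline(mu, user_bias, item_bias, user, item):
    """Predicted rating from biases alone; unknown users/items get 0 offset."""
    return mu + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)

# Toy data: ann rates everything 2 stars higher than bob does.
ratings = [("ann", "A", 5), ("ann", "B", 4), ("bob", "A", 3), ("bob", "B", 2)]
mu, user_bias, item_bias = fit_biases(ratings)
```

In this toy data both users agree that A is better than B; the only difference between them is the scale offset, which ends up entirely in `user_bias`.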
Couldn't you control for that by weighting people's ratings by the range in which they provide them? Like weighting a 5-star review a bit more from someone who averages 3's than someone whose ratings average 4's? Far from perfect sure but I bet it could save a lot of results from needing to be thrown out.
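One rough way to do what this comment describes is to rescale each person's ratings by their own mean and spread (a per-user z-score), so a 5 from a harsh rater counts for more than a 5 from a generous one. This is just a sketch under that assumption; `normalize_per_user` and the sample users are hypothetical.

```python
from statistics import mean, pstdev

def normalize_per_user(ratings_by_user):
    """ratings_by_user: {user: {item: stars}}.

    Returns the same structure with each rating replaced by its z-score
    within that user's own ratings. A user who gives every item the same
    score carries no preference information, so their ratings map to 0.0.
    """
    normalized = {}
    for user, ratings in ratings_by_user.items():
        values = list(ratings.values())
        mu = mean(values)
        sigma = pstdev(values)  # population std dev of this user's ratings
        normalized[user] = {
            item: (stars - mu) / sigma if sigma > 0 else 0.0
            for item, stars in ratings.items()
        }
    return normalized

# Made-up users: "harsh" averages 3 stars, "generous" averages 4.
ratings_by_user = {
    "harsh": {"a": 2, "b": 2, "c": 3, "d": 5},
    "generous": {"a": 3, "b": 4, "c": 4, "d": 5},
    "flat": {"a": 5, "b": 5},
}
z = normalize_per_user(ratings_by_user)
```

Both "harsh" and "generous" gave movie "d" five stars, but after normalization the harsh user's 5 comes out as the stronger signal, which is the effect the weighting idea is after.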
With stars you can cross compare with others to see if they have the same score. With simple thumbs up recommendations you cannot compare the ratings as the score is whether it appears to you or not.
I have to wonder if Netflix did this because a lot of their original or exclusive content seems to debut to mediocre star ratings. When the new system says "x% match" I assume that value is derived more from genre match or search relevance than whether I'll actually like it or not.
"In addition to the new rating system, Netflix has new match percentages (up to 100%) to more accurately predict how much users will like something.
...
The new rating system received 200% more ratings in A/B tests, according to Netflix VP of Product Todd Yellin.
When it comes to rating movies and shows, stars reflect the preferences that people want to have, rather than how people actually behave. Todd gave the example of users giving 5 stars to a documentary but just 3 stars to an Adam Sandler movie that they watched over and over again. “What you do versus what you say you like are different things,” said Todd."
If Netflix was trying to ensure that what I was most likely to watch next had the highest star rating, no wonder they gave up on it. Our opinion of the quality of something is not a good predictor of our likelihood of watching it.
Their users are surely not confused about that. So why does Netflix want to present a prediction as a rating? Is it to flatter their users by telling them that the thing that feels instantly gratifying right now is actually an amazing movie? "Hey, great choice. Billy Madison is a five-star movie. What? Why would you feel bad about not watching Raging Bull instead? It's a two-and-a-half star movie at best."
In other words, Netflix, like Facebook, like Doritos, is engineering itself for maximum addictiveness without regard for honesty. It will shelter you from even what you know and reassure you that whatever triggers a pleasure response in your brain is the best. Relax and enjoy it.
The truth is that we consume easy things a lot more often than we consume challenging things. It would be exhausting otherwise. But the best things are often the most challenging. We know that, we know that easy movies are just a way to kill time, but Netflix wants to do us the service of helping us forget it, because then we might be 1% more at ease when we watch Netflix and 1% less likely to switch to another service.
>When it comes to rating movies and shows, stars reflect the preferences that people want to have, rather than how people actually behave. Todd gave the example of users giving 5 stars to a documentary but just 3 stars to an Adam Sandler movie that they watched over and over again. “What you do versus what you say you like are different things,” said Todd."
There's a reasonable objection to this behavioral definition of "like", which is that it doesn't actually make people's lives better, it just fills them with more compulsive behavior. It's not necessarily "irrational" to wish you were more patient, or to want to ask Netflix to show you useful things rather than useless fluff. That you occasionally betray your stated goals does not mean you should be denied the right of self-definition.
In other words, what people say they like is more important, to me at least. See, e.g., "Thinking, Fast and Slow" by Kahneman.
I can't agree more - I'm a huge proponent of star rating systems. I get that they are perhaps more complicated than a Boolean value, but they help me out personally.
I miss the days of "tap tap scroll four clicks" on the iPod to help me rate new music, specifically.
Thumbs up/down might not be the best for training a recommendation engine, but as someone who just switched his product's rating system from 1-5 to simple up/down, let me tell you: people have no idea how to use a star rating system. I would get people raving but leave a 1 star review, some people would leave a 5 star review and say bad things, some would leave a 1 star review but seem pretty neutral in their review.