Show HN: The problem with the epsilon greedy method

marketforlemmas · on Dec 19, 2014

Interesting comparison but, by my understanding, epsilon greedy and A/B testing do not solve the same problem.

Epsilon greedy is a method for minimizing regret, that is, the expected loss you incur from choosing sub-optimal options.

A/B testing's goal (or one of many goals) is to maximize the chance that, after the test is over, you select the best option going forward.

So epsilon greedy makes a conscious choice not to maximize its statistical confidence in certain options, because it is trying to exploit the options it currently believes are good. Meanwhile, A/B testing spreads its exploration evenly across the options so that it can reach that statistical confidence.
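
A rough sketch of that difference, in case it helps. Everything here is an assumption made for illustration: the two conversion rates, the round count, epsilon = 0.1, and the function names are all made up, and the "A/B test" is simplified to a strict even split with no stopping rule or significance test. Regret is counted as the expected loss per pull (best true rate minus the chosen option's true rate), which the simulation can only do because it knows the true rates.

```python
import random


def epsilon_greedy(rates, n_rounds, epsilon=0.1, seed=0):
    """Simulate epsilon greedy on Bernoulli options with the given true rates.

    Returns (pulls per option, cumulative expected regret).
    """
    rng = random.Random(seed)
    k = len(rates)
    pulls = [0] * k
    wins = [0] * k
    best = max(rates)
    regret = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick any option uniformly at random.
            arm = rng.randrange(k)
        else:
            # Exploit: pick the option with the best observed rate so far
            # (untried options count as 0.0; ties go to the lowest index).
            arm = max(
                range(k),
                key=lambda i: wins[i] / pulls[i] if pulls[i] else 0.0,
            )
        pulls[arm] += 1
        wins[arm] += rng.random() < rates[arm]  # simulated conversion
        regret += best - rates[arm]
    return pulls, regret


def even_split(rates, n_rounds):
    """Simplified A/B allocation: rotate through the options in order."""
    k = len(rates)
    pulls = [0] * k
    best = max(rates)
    regret = 0.0
    for t in range(n_rounds):
        arm = t % k
        pulls[arm] += 1
        regret += best - rates[arm]
    return pulls, regret


if __name__ == "__main__":
    true_rates = [0.10, 0.30]  # hypothetical conversion rates
    print("epsilon greedy:", epsilon_greedy(true_rates, 10_000))
    print("even split:    ", even_split(true_rates, 10_000))
```

With a gap this large, I would expect epsilon greedy to send most of its traffic to the better option and rack up less regret, while the even split ends up with equal sample sizes for both options, which is what you want if the goal is a confident comparison at the end.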

Hopefully someone with more expertise can chime in but I think this is the gist of it.

bcbrown · on Dec 19, 2014

http://engineering.richrelevance.com/bandits-recommendation-...

crobertsbmw · on Dec 20, 2014

Awesome read. This would have satisfied my curiosity and probably saved me an entire day of messing around. Thanks.