Experimenting with a Neural Network-based Poker Bot (mattmazur.com)
39 points by matt1 on Oct 13, 2009 | hide | past | favorite | 25 comments



Certainly a fun thing to ruminate over, but the model Matt uses here is way too simplistic to draw any conclusions about whether a NN is sufficient.

A short stack raising >11% when he shouldn't be is a pretty severe leak ;)

Just a thought: I don't see the implementation of the algorithm here, but if you're using the values of individual cards scaled to 1, then you are saying that v(A) - v(K) = v(8) - v(7), which seems absurd. Imagine that there were (fantasize for the sake of argument) a few shoves in front of you; then it should be that v(AK) - v(KQ) >> v(87) - v(76).

I'd need to give more thought to this problem in general, but I have to ask Matt...why did you choose short stacking full ring NLHE? It seems mechanical for someone very familiar with the game, but it's so situational. The first area I'd think you'd be most effective would be LHE-HU-SNG's.


Hey jp, great points. I would have posted the code, but for the life of me I can't find it anywhere. I had the Excel sheet stored in a different folder, which is why I was able to post it.

Anyway, I think the idea was that including the difference from the average value would help the neural network identify pairs. In retrospect, it should have been able to recognize pairs from the card values alone, no?

Limit holdem had been done a lot and No Limit was my area of expertise, which is why I chose it. I didn't have that much experience with shortstacking, but I had read pretty much everything there was to read on it and it seemed like a good start because of its simplicity. Little did I know that the profits from shortstacking come from a very few carefully timed moves that are close to impossible for a rules-based bot to emulate (more on that in another post). Eventually I decided to build a no limit heads up bot, which is where my success wound up being.

I didn't mention it in the post, but the key was moving away from a rules-based to a value-based system, where the bot calculated the profitability of its options and decided based on that. It was a lot easier to debug, but it came with a whole set of challenges of its own, like estimating your opponent's range.


Just a thought: I don't see the implementation of the algorithm here, but if you're using the values of individual cards scaled to 1, then you are saying that v(A) - v(K) = v(8) - v(7) which seems absurd

Instantiating the features correctly is crucial to getting good neural network accuracy.

It might be the case that a thermometer representation for the card value is more appropriate: there are thirteen bits, and for a card of rank n the first n bits are on.

Using this sort of input representation, the network can model the non-linear relationship between different card values, instead of assuming a strictly linear correlation.
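A minimal sketch of that thermometer encoding, assuming ranks run 1 (deuce) through 13 (ace):

```python
import numpy as np

def thermometer(rank, num_ranks=13):
    """Encode a card rank (1 = deuce .. 13 = ace) as a thermometer
    bit vector: the first `rank` bits are on, the rest are off."""
    bits = np.zeros(num_ranks)
    bits[:rank] = 1.0
    return bits

# A king (rank 12) versus a seven (rank 6): the network sees how many
# bits differ, and can learn a separate weight per threshold instead of
# a single linear "card value" coefficient.
print(thermometer(12))
print(thermometer(6))
```

Because each bit gets its own weight, the net can make the A-to-K gap matter more than the 8-to-7 gap, which a single scaled value cannot.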


It's a neat idea, but I doubt it would work well because the hand value in this context is so qualitative (whereas in HU LHE it would be more quantitative.)


His tests are so simple that he's not really testing anything. For some real analysis of how neural nets stand up to other player prediction models, check this out:

http://spaz.ca/poker/

In particular, read the paper titled "Opponent Modeling in Poker: Learning and Acting in a Hostile Environment". Section 4 outlines several prediction methods, one of which is a neural net.


Does anybody know of anything like these 2 ideas? 1) A site that lets people upload their bots and play bot vs. bot 24/7. The idea is to try and find the best algorithm.

2) A site where a human player plays against the computer. The computer also tries to model the player using a NN and tells the player how "big" a NN was needed to model their play. Perhaps also how quickly the computer was able to start predicting the player's moves? The idea being that you want your play to be very hard to model with a NN, so that you are more unpredictable in real life.


Based on personal experience, I feel like (2) would not work very well (nor would using a similar idea to test your bot). When real money is on the line, my entire state of play changes. There is a large psychological difference between playing a bot for fun and playing a bot for money. Playing for fun, I am more likely to place large bets, bluff, test its logic, and in general play loosely. For real money? No thanks.

Note that this is all based on personal experience, and may not be the common case, but I feel like the average hacker will play to test the bot, not to win.


You really don't need anything that complicated to beat low-stakes no-limit poker short stacking. I used to know a few people who made decent hourly rates just mechanically following a very small spreadsheet. I did it myself for a while and it seemed like an easy win, though not as profitable as simply being good at the limit games I was playing, and nowhere near as challenging/fun. My personal sample size was small, but I still suspect a small set of if/then statements would have been a winner.

The real challenge is finding enough tables to play. The bad thing about short stacking is that sometimes someone calls you and you win, and now you've got at least a medium stack. Most sites don't let you come back to a no-limit table with less money than you left with until some period of time (for instance 2 hours) has passed. Party Poker did not have this feature (maybe still doesn't), so it would have been easy back then, but for an American now I don't know of any site of substantial volume where you wouldn't quickly burn through all available tables and find yourself waiting.

Even for a bot, there might not be much profit in it.


It seems easy, right? That's what I thought too.

I purchased 7M+ NL200 hand histories from a guy who mined them all day, every day. I filtered that down to hands where someone was playing with a 15-25 BB stack, and then for every person who played more than 5,000 hands in that range, calculated their profits and winrates.
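That filtering could be sketched with pandas; the column names and toy data here are hypothetical (the actual hand-history schema isn't shown), and the volume cutoff is lowered from 5,000 to 2 to fit the toy data:

```python
import pandas as pd

# Hypothetical schema: one row per (player, hand), with the player's stack
# and that hand's profit both measured in big blinds.
hands = pd.DataFrame({
    "player":    ["a", "a", "b", "b", "b", "c"],
    "stack_bb":  [20,  18,  50,  22,  16,  24],
    "profit_bb": [1.5, -0.5, 3.0, 0.0, 2.0, -1.0],
})

# Keep only short-stacked hands (15-25 BB)...
short = hands[hands["stack_bb"].between(15, 25)]

# ...then compute per-player volume and winrate in bb/100, keeping only
# players above the volume cutoff (2 here; the post used 5,000).
stats = short.groupby("player")["profit_bb"].agg(["count", "sum"])
stats = stats[stats["count"] >= 2]
stats["bb_per_100"] = stats["sum"] / stats["count"] * 100
print(stats)
```

The volume cutoff is exactly what produces the survivorship skew discussed below: anyone who busted early never reaches it.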

Here's the breakdown: http://www.mattmazur.com/images/ShortstackingNL200.png

It's important to remember when looking at these numbers that the vast majority of people who try shortstacking never make it to 5,000 hands, so this is heavily skewed toward the survivors. It would be like averaging the revenue of only those startups that are over five years old and then using that as the average revenue of all startups. There are only 80 people in the full ring group and 19 in the 6-max group. Just looking at these folks, only about 1 in 5 win more than 2bb/100, and in my experience that's not easy to do. The people who are profitable shortstackers would likely win a lot more playing with a normal stack.

I don't think there are many long term profitable shortstackers following rules. Again, I think the majority of the profits come from a few strategic plays that would be impossible for a spreadsheet, or a bot, to emulate.


Idk, those numbers seem pretty reasonable to me and don't indicate any real level of difficulty. 58% of the people in it are even or worse, which means if there is any sample bias at all, it's very small. The fact that losers tend to drop out over time is probably mitigated at that level by the fact that winners will quickly rise from low stakes to at least middle ones.

2-4 bb/100 is great even for a human, since you can do huge volume that way. The spreadsheet method can easily be applied to 8 tables at once, maybe more. 2 bb/100 (that's $4 as calculated by PT, right?) could be $30/hr. 4 bb is obviously $60, which is more in line with what people I knew were making.
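The hourly-rate arithmetic above can be sketched as follows; the hands-per-hour figure is an assumption (roughly full-ring dealing speed), not something from the thread:

```python
def hourly_rate(bb_per_100, big_blind, tables, hands_per_hour=90):
    """Rough hourly win rate implied by a bb/100 figure.
    hands_per_hour is an assumed per-table full-ring dealing speed."""
    hands = tables * hands_per_hour
    return bb_per_100 / 100 * big_blind * hands

# 2 bb/100 at NL200 ($2 big blind) over 8 tables:
print(round(hourly_rate(2, 2, 8), 2))  # roughly $29/hr under these assumptions
```

At 4 bb/100 the same volume doubles to roughly $58/hr, consistent with the $60 figure above.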

Also, when short stacking, opponents at higher-level games don't necessarily respond any more appropriately. In fact, a lot of times they play worse against it, because what they know is how to play against normal opponents. Low-level players like to call big bets too much, which accidentally makes them better against a short stacker. My friends claimed they were getting higher bb/100 at the $3/$6 and $5/$10 games than at $1/$2 because of that.


I just realized that I have the hand histories for all cash No Limit hands played on PokerStars from Dec 1, 2007 to Mar 12, 2008. I've got all the tools built to do the analysis, so I can pick apart the results from the lower and higher stakes as well. Any recommendations on how to do it?

Not PTBB/100, just BB/100. PokerTracker couldn't handle all the data.


Wow, how many hands is that? That's really cool.

I suppose you could possibly control for play style, things like what Poker Tracker tracks (VPIP, etc.), and of course buy-ins if you do some technomagic to figure them out. (I'm assuming someone who short stacks all the time plays differently than a normal player who just lost half their stack.) Not sure how good your findings would be, though.


Look up mpethybridge on 2+2 forums. I believe he does analysis on big databases of poker hands; maybe he could help.


Have you checked out "The Mathematics of Poker" by Bill Chen and Jerrod Ankenman? This book does some neat game-theory-based analysis of poker and comes up with some surprising results. I've pondered what some sharp hacker type might do given that text and some time to grok on a bot.

I think the one result most applicable in this scenario is the section with jam-or-fold tables based on hand and stack size. Basically, they show that there are completely defined hand ranges for jamming or folding when stacked at a certain number of BB at no-limit.
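The EV comparison underlying such jam-or-fold decisions can be sketched as below. The numbers are hypothetical illustrations; Chen and Ankenman derive the actual ranges game-theoretically rather than from point estimates like these:

```python
def shove_ev(pot, stack, fold_prob, equity_when_called):
    """EV of jamming `stack` into `pot`, in big blinds, relative to folding.
    fold_prob: chance everyone folds.
    equity_when_called: pot equity vs. the calling range (one caller
    with a covering stack assumed)."""
    win_uncontested = fold_prob * pot
    called_pot = pot + 2 * stack        # our stack plus the caller's match
    showdown = (1 - fold_prob) * (equity_when_called * called_pot - stack)
    return win_uncontested + showdown

# 10 BB stack, 1.5 BB pot, folds 60% of the time, 35% equity when called:
print(round(shove_ev(1.5, 10, 0.6, 0.35), 2))  # -0.09: a marginally losing jam
```

The "completely defined ranges" fall out of solving this kind of equation simultaneously for the jammer's and caller's ranges at each stack depth.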

I've pondered doing a neural network, but for limit rather than no-limit. I still may, mainly because I think working on a bunch of inputs that are hard to quantify on the fly when playing would be fun to experiment with via the neural net.

Thanks for your posts on poker bots - although I have to admit I've basically quit playing online over the last year because of the proliferation of HUDs, data mining automation, and bots.

Oh, and for anybody interested in game theory and/or poker, you simply have to grab the book. It's a bunch of fun and not scary on the math for those on this forum.


I've read The Mathematics of Poker; the binding is broken from keeping it propped open too much. That being said, it was not very helpful actually building a bot. They do a lot of nice little game theory examples, but most of it is not applicable to real life game play.

The push/fold tables are another story, and I'll write about that soon.


For those interested in the PokerBot aspect, check out: http://pokerai.org/openholdem.html

It allows for bots in Perl and has a DLL interface. I've been meaning to make a V8 based DLL to write bots in Javascript for a while, but it's not a high priority for me.

P.S. Did anyone notice this guy has a Tetrinet bot?! That game made high school tolerable.


The TetriNET bot was a blast. You can download the full source code on the site as well.


One of the issues with NNs is that there is no way to find out the relative significance of the input variables and how they affect the output response. At least, I couldn't find a way when I tried using them a couple of years back. Secondly, I guess there isn't a way to cluster similar cases together to analyze what rules bind different decisions together. Also, one cannot find the partial dependence of the output on a given input variable. All these problems are solved by the random forest classifier, which I have been working with and really like.


I agree that analysis of neural networks is not easy, but there are certainly ways to do it:

there is no way to find out the relative significance of the input variables and how they affect the output response

Calculate partial derivatives of the outputs w.r.t. the inputs.
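A minimal sketch of that gradient computation for a hypothetical one-hidden-layer tanh network (the random weights stand in for a trained model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny network: 4 inputs -> 3 hidden tanh units -> 1 output.
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(3, 1))

def forward(x):
    h = np.tanh(x @ W1)        # hidden activations
    return h, h @ W2           # output is linear in h

def input_saliency(x):
    """d(output)/d(input_i) via the chain rule: larger magnitude means
    the input has more influence on the output at this point."""
    h, _ = forward(x)
    dh = 1 - h**2              # tanh'(z) expressed in terms of tanh(z)
    return (W1 * dh) @ W2      # shape (4, 1)

x = rng.normal(size=4)
print(input_saliency(x).ravel())
```

Note these derivatives are local to a particular input point, so in practice you would average their magnitudes over a dataset to rank feature significance.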

there isn't a way to cluster similar cases together to analyze what rules bind different decisions together

Do PCA on the activations of the hidden layer. You could also do k-means clustering of the hidden-layer activations.
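A sketch of both ideas with scikit-learn; the activation matrix here is random noise standing in for a trained net's hidden-layer outputs:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# Hypothetical: hidden-layer activations for 200 poker situations from a
# net with 10 hidden units (in practice, collected during forward passes).
activations = rng.normal(size=(200, 10))

# Project to 2-D to inspect the dominant directions of variation.
coords = PCA(n_components=2).fit_transform(activations)

# Cluster the activations: situations landing in the same cluster are
# ones the network treats similarly, a rough analogue of shared "rules".
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(activations)
print(coords.shape, np.bincount(labels))
```

Examining which hands fall in each cluster is then a way to reverse-engineer what "rule" a group of decisions shares.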

The random forest classifier looks interesting, I'll have to investigate that. Any suggestions for papers/tutorials?


Okay, I think I should accept that I am probably outdated or misinformed about what one can do with neural nets. Re random forests, I was referring to the Leo Breiman implementation on the UC Berkeley website. Andy Liaw of Merck also maintains the R version of Leo Breiman's algorithm (http://cran.r-project.org/web/packages/randomForest/index.ht...). I know neural nets have been successfully used for really compute-intensive applications like face recognition or vehicle autopilot, but for applications like poker, predicting car sales, or the data mining done by Walmart etc., there has been increasing use of simpler regression-based / decision-tree-based approaches. There is an impression in the machine learning community that a NN is a black box, which is probably the reason for its falling popularity, if that is the case.


there is no way to find out the relative significance of the input variables and how they affect the output response

"Calculate partial derivatives of the outputs w.r.t. the inputs."

Yes. There are other ways to interpret the output. See, for example, "Visualizing Higher Layer Features of a Deep Network" by Erhan et al. (2009): http://www.iro.umontreal.ca/~lisa/publications/?page=publica...

It is true, though, that more powerful models are less explainable. But in return you get more compact modeling, and training via gradient descent, which is far faster than the combinatorial optimization involved in ensembles of trees. (Trust me, I've implemented both.)

there isn't a way to cluster similar cases together to analyze what rules bind different decisions together

"Do PCA on the activations of the hidden layer. You could also do k-means clustering of the hidden-layer activations."

I concur. However, for 2-D or 3-D visualization, you should use the more recently developed t-SNE algorithm instead of PCA or other alternatives. t-SNE does way better. Software (in Matlab and Python) is available at: http://ict.ewi.tudelft.nl/~lvandermaaten/t-SNE.html

Also, you should look at the JMLR paper (http://ict.ewi.tudelft.nl/~lvandermaaten/t-SNE_files/vanderm...) and supplemental material (http://ict.ewi.tudelft.nl/~lvandermaaten/t-SNE_files/Supplem...), to see the visualizations produced by t-SNE and competing methods. This qualitative evaluation by looking at pictures speaks for itself.
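As an aside, scikit-learn also ships a t-SNE implementation (`sklearn.manifold.TSNE`), so a minimal usage sketch, again with random stand-in activations, looks like:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(2)

# Hypothetical hidden-layer activations for 100 situations, 10 hidden units.
activations = rng.normal(size=(100, 10))

# perplexity must be smaller than the number of samples; 30 is a common default.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(activations)
print(embedding.shape)  # (100, 2)
```

The resulting 2-D coordinates can then be scatter-plotted and colored by, say, the bot's chosen action, to see whether similar decisions cluster together.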

Also, one cannot find the partial dependence of the output on a given input variable.

If your model assumes that the output is a non-linear combination of inputs, then yes, it is hard to express the output in terms of a linear decomposition. But that was your choice of modelling assumption, presumably because linear models are insufficiently powerful to fit the underlying variations.

"The random forest classifier looks interesting, I'll have to investigate that. Any suggestions for papers/tutorials?"

Random forests were developed by Leo Breiman (RIP).

"Ensemble methods in machine learning" by Dietterich (2000) compares different tree ensemble methods. He concludes that boosting an ensemble of decision trees is better, except when the data are very noisy, in which case randomized trees are better. (Boosting means focusing on the examples the model is currently doing the worst on; randomization instead trains on random subsets of the examples.) The main reason boosting does worse than randomized trees in the noisy case is that the AdaBoost exponential loss is sensitive to outliers. Which is to say, AdaBoost boosts the wrong loss function. Boosting an appropriate choice of loss function (perhaps a regularized log-loss) is probably superior to randomized trees in most circumstances.
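That comparison is easy to reproduce in miniature with scikit-learn. This sketch pits a random forest against gradient boosting on a synthetic problem with deliberately noisy labels (the dataset and hyperparameters are illustrative, not from the paper):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic noisy classification problem: flip_y mislabels 20% of the
# examples, i.e. the regime where randomized trees tend to hold up better.
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.2, random_state=0)

scores = {}
for model in (RandomForestClassifier(n_estimators=100, random_state=0),
              GradientBoostingClassifier(n_estimators=100, random_state=0)):
    scores[type(model).__name__] = cross_val_score(model, X, y, cv=5).mean()
print(scores)
```

Note that scikit-learn's gradient boosting uses a log-loss for classification rather than AdaBoost's exponential loss, so it is closer to the "boost a better loss" suggestion above than to the setup Dietterich criticized.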

"Improved boosting algorithms using confidence-rated predictions" by Schapire and Singer (2000) is a great introduction to boosting.

Around the same time, Llew Mason and Jerome Friedman independently demonstrated that boosting essentially fits an additive model using gradient-based methods to select the features with the steepest loss gradient. So you should follow up by looking at their work.


Your statements are all false and I can refute them later when I am not on my phone.

If you want to criticize neural nets as tricky and confusing to newbies with a steep learning curve, that is fair. But it's irresponsible criticism like yours that has given neural nets an undeserved bad reputation in the machine learning community.


If I were you, I would have commented after finishing up with the phone.


What I meant is that I would comment further when not on my mobile device, but instead at a keyboard. See my forthcoming comments above.


This bot attacked Texas Hold'em. Presumably most of the others do as well, given that it is far and away the most commonly played game online (disclaimer: complete assumption).

What about attacking 7-card stud, stud8, omaha, and o8? These games have more information to analyze than hold'em because more cards are visible. It seems to me that would provide more of an edge to a computer, which could calculate drawing odds faster and more accurately than a person.



