Hold'em Killer: April 2005

Wednesday, April 27, 2005

How to write a winning poker bot (index)

Chapters in order:
Raw Hand Odds
Computing Actual Hand Odds
Implied Odds
Check or Bet? Call or Raise?
Opponent Modeling
Beyond the Sklansky Preflop

Copyright © Ervin Peretz 2005. All rights reserved.

See also my free online poker bot at:
http://www.holdemkiller.com

======================================================

Opponent Modeling

(continued from previous post)

Now let's start to look at some opponent modeling.

There have been some attempts recently to build a large database of all online players. I believe that modeling opponents from a long-term database may be useful for identifying strong/weak players, but is not as useful for predicting playing style as short-term modeling (i.e. modeling each opponent as you play them in the current sitting).

So here I'll discuss some ideas for modeling and using your opponents' preflop behavior as you play them. That "and using" clause is key; there's not much point to any piece of analysis unless you can rationally scale and combine it with the other available data and statistics.

There is an unbounded amount of modeling you can do from observing what cards the opponent has shown in the past, in what position, etc. In reality, however, it will take a very long time to model that level of detail, because very few of an opponent's hands actually get shown. Online, only called or winning hands on the river are regularly shown (although any hand played through the river is available by request). So this type of model may take too long to build before you can apply it. More seriously, an opponent may choose to show a hand when folding, which allows him to selectively distort this type of model. Any opponent who voluntarily shows you his cards is either showing off or intentionally misleading your mental model of him.

So I'll talk about a very simple type of preflop model: one that simply looks at percentage of flops seen by each opponent. Even this simple statistic gets complex in implementation.

To build this statistic, simply keep a #flopsSeen/#totalHands ratio for each opponent. You may halve both numerator and denominator periodically to reduce the impact of older data.
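Here is a minimal sketch of that counter in Python. The class name, the `decay_every` interval and the neutral default of 20% are my own illustrative assumptions, not values from a tuned bot.

```python
from dataclasses import dataclass


@dataclass
class FlopsSeenStat:
    """Running #flopsSeen/#totalHands ratio for one opponent."""
    flops_seen: float = 0.0
    total_hands: float = 0.0
    decay_every: int = 50      # hypothetical: halve the counts every 50 hands
    _since_decay: int = 0

    def record_hand(self, saw_flop: bool) -> None:
        self.total_hands += 1
        if saw_flop:
            self.flops_seen += 1
        self._since_decay += 1
        if self._since_decay >= self.decay_every:
            # Halving both counts leaves the ratio unchanged but makes
            # each new hand weigh more than the older ones.
            self.flops_seen /= 2
            self.total_hands /= 2
            self._since_decay = 0

    def ratio(self, default: float = 0.20) -> float:
        # With no observations yet, fall back to a neutral default (assumption).
        if self.total_hands == 0:
            return default
        return self.flops_seen / self.total_hands
```

Note that the halving does not change the current ratio; it only changes how quickly the ratio moves when new hands arrive.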

Opponent Model for the preflop itself

There are 4 pieces of information available to you on the preflop:

a. Your hole cards
b. Your betting position
c. Opponent bets behind
d. Opponent betting history (i.e. some kind of opponent model)

(this is ignoring opponent appearance and behavior, as we're talking about what is available to a bot)

There is this corresponding set of questions whose answers need to be combined somehow:

a. Your hole cards: In what situations are they profitably playable?
b. Your betting position: How much can I predict about the situation at the end of the preflop?
c. Opponent bets behind: What is the significance of their bets?
d. Opponents ahead: How should I expect them to bet?

If it were just (a) and (b), you could use a precomputed preflop model like Sklansky's. To be fair, Sklansky does pepper his advice with clauses like 'if the game is loose preflop', etc, but we need something deterministic for every situation.

We're still not discussing bluffing here, or influencing the opponents ahead; just trying to play optimally for an honestly profitable hand at the flop.

Your preflop model (e.g. Sklansky) tells you that certain hole cards are playable against a certain number of opponents, as if they were all the same. Now you want to apply a preflop model that attributes unique characteristics to each opponent. The only thing to do is to take the available statistics when it's your turn to bet (the bets behind and the opponents ahead) and scale them somehow. But how?

a). For the "made" hands (pairs and high cards), you want few opponents. Lots of opponents behind is bad. To scale the number of bets behind based on opponent's looseness, start with a preconceived figure of the "ideal" flops-seen rate for the game size, e.g. 20%. If the opponent is seeing twice the "ideal" number of flops, you may count his bet behind as half a bet. A tight bettor behind might count as more than one scaled bet, etc. For the opponents ahead, however, their looseness doesn't count; you simply make a statistical estimate of how many stronger hands there are ahead.

b). For the "draw" hands (e.g. suited connectors), the equation is entirely different. For draw hands, the desire is to have many callers who will pay you off in the minority of cases where you make your hand. In this case, the bets behind are not scaled; they represent money that is there to pay you off; each bet behind is counted as one caller. It is the opponents ahead who are scaled on their looseness. Loose opponents ahead means a greater likelihood of enough callers to make your hand (e.g. 78s) playable. Scale each opponent ahead by his flops-seen percentage, and add that to the bets behind, to obtain an estimate for the number of players seeing the upcoming flop. Here, a tight player is scaled down instead of up. Obviously, the later your position, the more accurate your estimate is going to be.
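The two scalings in (a) and (b) can be sketched roughly as follows. The function names are hypothetical, and the linear ideal/actual weighting (with a small floor to avoid dividing by zero) is just one simple reading of "twice the ideal rate counts as half a bet"; the estimate of stronger hands ahead for made hands is left out.

```python
IDEAL_FLOPS_SEEN = 0.20  # the "ideal" flops-seen rate from the example above


def scaled_bets_behind_made(flops_seen_behind):
    """Made hands: weight each bet behind by ideal/actual flops-seen rate.

    flops_seen_behind holds one flops-seen rate (0..1) per opponent who
    has already bet. A loose bettor counts as less than one bet, a tight
    bettor as more than one.
    """
    # The 0.01 floor is an assumption to keep the weight finite.
    return sum(IDEAL_FLOPS_SEEN / max(rate, 0.01) for rate in flops_seen_behind)


def estimated_callers_draw(num_bets_behind, flops_seen_ahead):
    """Draw hands: bets behind count as 1 each; each opponent still to act
    counts as his flops-seen rate."""
    return num_bets_behind + sum(flops_seen_ahead)


# One bettor behind who sees 40% of flops counts as half a bet.
print(scaled_bets_behind_made([0.40]))
# Two callers behind plus a loose and a tight player still to act.
print(estimated_callers_draw(2, [0.50, 0.10]))
```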

Now you have a decision to make. You have a fixed number of callers behind, a scaled estimate of number of callers for the flop, and a range of number of callers against which your hole cards are playable according to your preflop strategy. At the conservative extreme, you can choose to play only if your hand is playable against the entire possible range of callers for the flop; or, you can choose to play if your cards are playable against the number of opponents given by your scaled estimate; or you may require a 'playable range' of #callers around the estimate.

Applying preflop modeling to RHO later in the hand

Earlier I discussed Reflective Hand Odds, which is a way of filtering the hand spectrum based on opponent betting behavior to obtain more accurate hand odds. The RHO filter assumes rational behavior by the opponents, who are usually betting according to the strength of their hands. Part of that filtering is for the preflop, i.e. if an opponent bet voluntarily on the preflop, the hand spectrum is filtered for "playable" hole cards.

The opponent preflop model is applicable to this filter. If an opponent should "ideally" be playing only the top 25% Sklansky hands in his position (and given the betting situation at the time that he bet), but he has a modeled preflop looseness factor of 2x, then the filter may adjust to consider the top 50% of Sklansky hands as having been "playable".
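As a rough sketch of that adjustment (the names are hypothetical, and `ranked_hands` stands in for whatever best-to-worst ordering of hole cards your preflop model supplies):

```python
def playable_fraction(ideal_fraction, looseness):
    """Fraction of the ranked hand list treated as "playable" for this opponent."""
    # Cap at 1.0: an opponent cannot play more than every hand.
    return min(1.0, ideal_fraction * looseness)


def filter_playable(ranked_hands, ideal_fraction, looseness):
    """Keep the top slice of a best-to-worst ranked list of hole cards."""
    cutoff = round(len(ranked_hands) * playable_fraction(ideal_fraction, looseness))
    return ranked_hands[:cutoff]


# Top 25% "ideal" range with a 2x looseness factor widens to the top 50%.
print(playable_fraction(0.25, 2.0))
```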

Depending on the board, this could make your RHO higher or lower - if only a maniac would play 46 offsuit but that is now the nuts, the loose player may have you beat; but if the board is playable with top Sklansky hole cards, then the loose player is less likely to be holding them. The RHO computation will express that numerically.

Note that it's easier to have the "Reflective Nuts" against a tight player. In many situations you can actually bet stronger against a tight player due to having made the "Reflective Nuts" (or close).

Monday, April 18, 2005

Beyond the Sklansky Preflop

(continued from previous post)

Poker programs are very different from chess programs. In chess, the program is strong at the opening and end-games, where it can work from a ‘book’, or do a full forward simulation in the endgame; chess programs are weakest in the middle game, where despite complete information the game is not fully computable. With poker bots it’s the exact opposite: bots are strongest in the middle game, when the game is most dominated by calculable (but not humanly calculable) statistics of the cards to come; in the beginning and end-games, the standard play is simple enough that the bots have no advantage, and the psychological factors surrounding the unknown information dominate.

That said, it really seems like a bot should be able to play the preflop right and be able to adjust to the game.

Earlier I wrote that, rather than trust existing preflop models like Sklansky’s, with all their ambiguities and possible errors, I wanted to come up with a new, comprehensive and deterministic preflop strategy.

One of the biggest mistakes that new players make in the preflop is to decide ahead of time which hands to play, and then ignore position and opponent bets. The odds impact of a single bet to the right is huge. If you’re in middle position, a Sklansky 5 hand that is raisable with no bets to the right usually becomes a fold with one bet to the right. So any preflop model that doesn’t take in the entire situation is lost from the outset.

To keep things well-defined, we’re still talking about honest play. There is no consideration of playing loose to create a table image for the benefit of future hands, bluffing, or anything like that. We’re talking about honest play against modeled (but not necessarily honest!) opponents.

The incredible thing is that given each completely specified situation, there does exist a precise statistically correct action within the limits of the available information. Any lookup table or software library that takes your hole cards, position, and opponent bets (plus any available opponent history) and produces *the* correct preflop action is easily a million dollar asset. But as far as I know, no one has ever publicly produced one. I’m going to take a partial stab here today.

It is not possible to run a forward simulation of holdem from the preflop through the river in real-time.

And even pregenerating the game, with all its possible variations, is not practical. You may be able to precompute all card combinations through the river, but once you add in all possible opponent actions and modeling (all of which really matter), that’s too many parameters to pre-generate (we’re potentially talking about years of supercomputer time to produce that).

The middle game is a sweet spot for the bot because holdem is thoroughly computable in real time at the flop. What we want to do is produce a scheme by which we can precompute enough data to make the preflop computable in a real game.

FlopHand:

Since it’s not possible to pregenerate a comprehensive lookup table for the preflop, let’s define an intermediate goal: the situation at the flop. I’ll call the cards visible to you at the flop a FlopHand. A FlopHand is simply a pair of hole cards plus a set of 3 table cards. This is not to be confused with a 5-card hand. There is a big difference between aces in your hole and aces on the board. The cards of a FlopHand are un-ordered within the hole and table card groups, but the groups themselves are ordered. So whereas there are about 2.6 million 5-card hands, there are about 26 million FlopHands. For each of the 1326 sets of hole cards, however, there are only 19600 FlopHands, making the set of FlopHands for your preflop trivially available in a lookup table.
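The counts are easy to check with binomial coefficients. This is just arithmetic on a standard 52-card deck; the variable names are mine.

```python
from math import comb

five_card_hands = comb(52, 5)   # unordered 5-card hands
hole_card_sets = comb(52, 2)    # unordered pairs of hole cards
flops_per_hole = comb(50, 3)    # unordered flops from the remaining 50 cards
flop_hands = hole_card_sets * flops_per_hole

print(five_card_hands)   # 2598960
print(hole_card_sets)    # 1326
print(flops_per_hole)    # 19600
print(flop_hands)        # 25989600
```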

You’ve probably seen tables that order all preflop card pairs from best to worst. These are usually based on simple simulations that simply deal all cards through the river. Some actually take #opponents into account and order them by (trivially estimated) expected winnings; in this case the ordering changes depending on #opponents. In reality, of course, the rankings depend on situation; if you put the opponent on a high pair, that diminishes the value of you having a medium pair vs. e.g. flush cards.

The FlopHands are also orderable. A FlopHand has a trivial rank, which can be computed by dealing all combinations of the remaining 2 cards and ordering the hands by #wins. In reality, however, the opponent hand is not random. Flops that benefit commonly played hole cards (according to the opponent model) are worse than those that don’t, independent of how they directly benefit your hand. Therefore, the rank of a FlopHand is conditional on the game situation.

FlopConfig:

Now let’s define a FlopConfig. A FlopConfig is a FlopHand plus the players’ positions and preflop actions and opponent preflop model (previous segment). A FlopConfig is the fully specified situation at the flop, comprehending the pot size, the expected types of cards that the opponents may have, and the resulting Reflective Hand Odds (RHO).

Expected Value:

What we’re working towards is a computable Expected Value (EV) for a FlopConfig. If we can calculate an expected win/loss for each possible FlopConfig, then at the preflop we can take the weighted average across the possible FlopConfigs and determine the correct action. Once again it comes down to Reflective filtering and math.

What we want is a formula where we can efficiently take a FlopHand and some precomputed parameters for it, plug in the extra FlopConfig info, and produce the RHO and Implied Odds (IO), which we plug into the simple F() function (from earlier segment) to determine if the FlopConfig is “bettable”. If it’s not bettable, we will fold, so its cost is the cost of seeing the flop; otherwise, its EV will be some positive amount that we want to generate from the computed RHO.

p(FlopConfig):

All possible FlopHands are equally likely (ignoring minor variation for dependencies between opponent hole cards and the deck), so

p(FlopHand) = 1/19600

The FlopConfig entails opponent bets, etc. Part of that information is available in the form of bets to the right. The remainder has to come from your preflop model. In the last segment I talked about scaling callers behind and ahead based on preflop modeling. The missing information for the FlopConfigs is comprised of each possible action ahead, weighted according to the preflop model.

p(FlopConfig) = p(FlopHand) * p(actions ahead that contribute to this FlopConfig)

Although each possible FlopHand is equally likely, the FlopConfigs are not.

The later your position, the fewer possible FlopConfigs there are per possible FlopHand. There are always multiple FlopConfigs/FlopHand, however, since your preflop action is part of the state space.

EV(FlopConfig):

The EV of an unbettable FlopConfig is zero, because we will fold it. We don’t count the negative cost of seeing the flop. Below is the raw formula for EV of a FlopConfig. If the raw formula produces a negative value, we decide that the FlopConfig is unbettable, making the EV zero. Therefore, EV(FlopConfig) is always non-negative.

EVraw(FlopConfig) = p(win) * ($ExpectedPot - $ExpectedBets - $rake) - p(lose) * $ExpectedBets,
where:
p(win) = RHO
p(lose) = 1-RHO
$ExpectedBets and $ExpectedPot are taken from the Implied Odds calculation (earlier segment)
$rake is given

EV(FlopConfig) = max(EVraw(FlopConfig), 0)
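In code, the two formulas come out as below. This is a direct transcription, with RHO given as a fraction between 0 and 1; the function and parameter names are my own, and the dollar figures in the example are made up.

```python
def ev_raw(rho, expected_pot, expected_bets, rake):
    """EVraw(FlopConfig): win the pot net of our bets and the rake with
    probability rho, lose our expected bets otherwise."""
    return rho * (expected_pot - expected_bets - rake) - (1.0 - rho) * expected_bets


def ev_flop_config(rho, expected_pot, expected_bets, rake):
    """EV(FlopConfig): a negative raw EV means we fold, so the floor is zero."""
    return max(ev_raw(rho, expected_pot, expected_bets, rake), 0.0)


# Illustrative numbers only: $20 expected pot, $8 of our own bets, $1 rake.
print(ev_flop_config(0.5, 20.0, 8.0, 1.0))   # bettable
print(ev_flop_config(0.1, 20.0, 8.0, 1.0))   # unbettable, floored at zero
```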

Now, it’s true that in late position, our evaluation of the FlopConfig changes as opponents check/call/raise behind, and we may for example end up betting a FlopConfig (honestly, no bluff) that in early position would be unbettable. But we are talking about the FlopConfig as anticipated from the preflop. We only want to assign positive values to FlopConfigs that would be initially bettable by us (against n opponents as stipulated by the FlopConfig).

The fact that EV(FlopConfig) >= 0 has huge impact on the IO for the preflop. It invalidates the full forward simulations that rank preflop hands on how they fare through the river. It also rewards preflop hands with highly polarized sets of outcomes, e.g. suited connectors, where the strong hands depend on getting the right flop and the other flops cause us to fold. Now the preflop hands will be ranked by having the highest average FlopConfig, and the only negative factor is the cost of seeing the flop.

It’s well known that 22 has >50% odds of winning over AKs. Most of us know intuitively that AKs is the preferred hand, but why? You can see the answer if you think about the FlopConfigs. 22 gives a large number of FlopConfigs with EVraw slightly greater than zero (because you have at least a pair), and only very few with large EV (when you hit a set). Therefore the average EVraw of the bettable FlopConfigs is low. With AKs, you get a smaller number of bettable (EVraw > 0) FlopConfigs, when you hit a pair or flush draw; but the average of those bettable ones is much higher; the EV is then much higher because you strip away much more of the downside in the EVraw -> EV conversion than you do for 22. That’s true in heads-up, and it’s even more true in multiway due to the higher positive EVraw’s for AKs.

So these simplistic rankings of hole cards that we see so often are flawed; the real value of a preflop hand comes from the weighted average of its EV(FlopConfig)’s, which depends on the game situation.

Computing EV(FlopConfig):

The only complex part of the equation is the RHO. The RHO for a specific FlopConfig is computable in a few seconds. Pregenerating RHO for even one FlopConfig for each of the 26 million FlopHands is already months of processing time.

What we want though is to precompute and cache enough parameters with each FlopHand that we can compute the EV for all possible FlopConfigs (which are far more numerous than the 19600 possible FlopHands for a given PreflopHand) in real time.

To make it worse, recall also, from the first posting, that even if you precompute hand odds against each opponent type in each position for each FlopHand, you cannot then simply multiply those together to get hand odds against all opponents (to briefly restate: the probability of beating one opponent is highly dependent on the probability of beating the other opponents, especially in draw situations).

The good news is that most FlopConfigs are known to be non-bettable from a simple heuristic (e.g. no pair, draw, or strong overcards). Their EV’s are a constant zero, regardless of other factors.
Also, an even larger set of FlopConfigs are ignorable because you would never (in honest play) arrive at the given flop situation (e.g. having called with a clearly unplayable set of hole cards following a bet on the right).

I haven’t thought it through much further than this. Suffice it to say that as the bot plays, its model of each opponent stabilizes over time; the bot starts with a neutral model for each player, then chooses a stereotype based on the first few hands, then slowly develops a more accurate model. It can then use its downtime to slowly update its ‘book’ of possible FlopConfigs based on those models, and assign RHO to each possible situation. Given a preflop hand, it can then quickly compute the weighted average of the EV(FlopConfig)’s, and decide if the hole cards are playable.

EV(PreflopHand):

Now we want to take the statistics we have for the spectrum of possible FlopConfigs, and use them to evaluate the preflop hand.

EV for a PreflopHand is influenced by your preflop action. Your raise increases your investment and also influences opponent folds and bets. That choice produces different sets of possible FlopConfigs.

EV(PreflopHand w/ call) = ( Sum( p(FlopConfig) * EV(FlopConfig) ) / #FlopConfigs ) – $CallAmount

EV(PreflopHand w/ raise) = ( p(no flop) * PotAmount + Sum( p(FlopConfig) * EV(FlopConfig) ) / #FlopConfigs ) – $RaiseAmount (**)

EV(PreflopHand w/ fold) = 0

The honest play is to take the preflop action with the highest EV, which is most often zero (fold).

Note that the formula is comprehensive. E.g., since EV(FlopConfig) is always non-negative, checking the big blind (i.e. $CallAmount = 0) never has negative EV.

(**) If you raise, everyone may fold and there may not be a flop. Stated in math terms, in the EV(PreflopHand w/ raise) equation, the p(FlopConfig)’s don’t add up to 1. Therefore, you fill in the missing probability component for the “null” FlopConfig, whose EV is the pot amount before your raise.
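Here is one hedged way to turn the three cases into code. Be aware of an assumption: this sketch takes the p(FlopConfig) values to be normalized so that, together with p(no flop) in the raise case, they sum to 1. Under that assumption the probability-weighted sum is already the weighted average, so the sketch leaves out the division by #FlopConfigs that appears in the formulas as written. All names and example numbers are hypothetical.

```python
def weighted_ev(flop_configs):
    """flop_configs: iterable of (p, ev) pairs, one per possible FlopConfig."""
    return sum(p * ev for p, ev in flop_configs)


def ev_preflop_call(flop_configs, call_amount):
    return weighted_ev(flop_configs) - call_amount


def ev_preflop_raise(flop_configs, p_no_flop, pot_amount, raise_amount):
    # The "null" FlopConfig (everyone folds) pays the pot as it stood
    # before our raise.
    return p_no_flop * pot_amount + weighted_ev(flop_configs) - raise_amount


def best_preflop_action(call_configs, raise_configs, call_amount,
                        raise_amount, p_no_flop, pot_amount):
    """Honest play: take the action with the highest EV; folding is worth 0."""
    options = {
        "fold": 0.0,
        "call": ev_preflop_call(call_configs, call_amount),
        "raise": ev_preflop_raise(raise_configs, p_no_flop, pot_amount, raise_amount),
    }
    return max(options, key=options.get)


# Made-up distributions: calling and raising lead to different FlopConfig sets.
call_configs = [(0.5, 0.0), (0.3, 4.0), (0.2, 10.0)]
raise_configs = [(0.4, 0.0), (0.2, 6.0), (0.1, 20.0)]   # plus p(no flop) = 0.3
print(best_preflop_action(call_configs, raise_configs, 2.0, 4.0, 0.3, 3.0))
```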

A Reductive look at the Preflop:

The preflop is very hard to wrap your head around. Even at this initial stage there is all this mixing of factors from the starting hand, position, implied odds, bets behind, and opponent model. How to make sense of it all?

Since we can’t humanly rationalize all these factors, let’s start with a game that we can thoroughly analyze. Bear with me as we build on this simple model:

1-Chip Zero-Blind Hold’em:

1-Chip Zero-Blind holdem (1C0B) is a simple imaginary game where you may bet 1 chip at the preflop. There is no more betting thereafter, and if you win on the river you don’t win the pot – you just get back your chip plus 1 more chip. The betting order is the same (starting with UTG), but the blind amounts are zero. This eliminates all pot odds and implied odds; you’re simply betting even money that you have the best hand on the preflop.

To make it even more unreal, let’s assume no one is bluffing. What does it take then to bet first in 1C0B? We said earlier that preflop hands are not orderable; but for the first bet they pretty much are. So we can define this in terms of top %-ile preflop needed to bet first.

If you’re in the “SB” and no one has bet, you’re just betting that you have a better hand than the BB, which is random; so trivially you just need an above-average hand. When OTB, you’re betting that both of the random hands ahead are lower. The required %-ile is the solution to (1-x)^2 = ½, i.e. ~29%. In seat 9, it’s the solution to (1-x)^3 = ½, etc.

Seat	Required %-ile preflop hand to bet first
UTG	(1-x)^9 = ½	7%
4	(1-x)^8 = ½	8%
5	(1-x)^7 = ½	9%
6	(1-x)^6 = ½	11%
7	(1-x)^5 = ½	13%
8	(1-x)^4 = ½ ...	16%
9	(1-x)^3 = ½ ; x = 1 – 1/cuberoot(2)	21%
OTB	(1-x)^2 = ½ ; x = 1 – 1/sqrt(2)	29%
SB	1/2	50%
BB	1/1	100%
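The table can be regenerated with a few lines of Python. The function name is mine; it assumes, as the table does, that the hands still to act are independent and uniformly random.

```python
def first_bet_percentile(players_ahead, required_win_prob=0.5):
    """Top fraction x of hands that solves (1 - x) ** players_ahead == required_win_prob."""
    if players_ahead == 0:
        return 1.0  # nobody left to beat: any hand qualifies (the BB row)
    return 1.0 - required_win_prob ** (1.0 / players_ahead)


seats = ["UTG", "4", "5", "6", "7", "8", "9", "OTB", "SB", "BB"]
for seat, players_ahead in zip(seats, range(9, -1, -1)):
    pct = 100.0 * first_bet_percentile(players_ahead)
    print(f"{seat}\t{pct:.0f}%")
```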

As you can see, the set of hands playable as the first bet grows almost exponentially in later position. Your exact seat position matters big-time.

1-Chip Zero-Blind Hold’em w/ Pot Odds:

Now let’s add in the blinds and the resulting Pot Odds, just for the preflop. There is still no betting after the flop. The stake is now 1 ½ chips instead of 1 chip. And the small blind only bets ½ chip; note that whereas the first 8 bettors are risking 1 chip to win 1 ½ (2-to-3), the SB is risking ½ to win 1 ½ (1-to-3), a huge difference. There’s still no bluffing. Call this 1CwB.

The first 8 seats are now getting 2-to-3 pot odds, so they need a 40% chance of winning.

Seat	Required %-ile preflop hand to bet first
UTG	(1-x)^9 = 2/5	10%
4	(1-x)^8 = 2/5	11%
5	(1-x)^7 = 2/5	12%
6	(1-x)^6 = 2/5	14%
7	(1-x)^5 = 2/5	17%
8	(1-x)^4 = 2/5 ...	20%
9	(1-x)^3 = 2/5 ; x = 1 – cuberoot(2/5)	26%
OTB	(1-x)^2 = 2/5 ; x = 1 – sqrt(2/5)	37%
SB	3/4	75% *
BB	1/1	100%

That covers the first bet. What quality hand is required to call a first bet, still assuming no bluffing and only a 1 ½ chip stake? To call a single bet, you need a 40% chance of beating a hand in the range that’s bettable for the opponent behind who bet first. The preflop hands are now re-sorted by win rate against hole cards in the bettable range of the opponent who first bet (in a future segment I will be producing my 169x169 table showing win rate of each generic hand against each other generic hand; so this is well defined).

Seat N	Required %-ile preflop hand (re-sorted) to CALL a bet by opponent in seat N
UTG	10%*(1-40%) = 6%
4	11%*(1-40%) = 6.6%
...	etc ...

Now, we can add in the Pot Odds for the extra bet that the first opponent put in. So instead of a 1 ½ chip stake (2-to-3), we have a 2 ½ chip stake (2-to-5). The first bettor still needs a 40% chance to win. To call him, the next bettor has 2-to-5 odds so only needs 29% chance of winning.

Seat N	Required %-ile preflop hand to bet first in seat N	Required %-ile preflop hand to CALL a bet by opponent in seat N
UTG	(1-x)^9 = 2/5 ; 10%	10%*(1-29%) = 7.1%
4	(1-x)^8 = 2/5 ; 11%	11%*(1-29%) = 7.8%
...	...
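The same arithmetic, sketched in Python so the remaining rows can be filled in. The function names are mine. Note that the tables above multiply the already rounded figures (10%, 29%), so unrounded results will differ slightly in the last digit.

```python
def required_win_prob(risk, reward):
    """Break-even win probability when risking `risk` chips to win `reward`."""
    return risk / (risk + reward)


def first_bet_percentile(players_ahead, win_prob):
    """Top fraction x of hands solving (1 - x) ** players_ahead == win_prob."""
    return 1.0 - win_prob ** (1.0 / players_ahead)


def call_percentile(first_bet_pct, caller_win_prob):
    """Top fraction of (re-sorted) hands needed to call a first bettor who
    plays the top `first_bet_pct` of hands."""
    return first_bet_pct * (1.0 - caller_win_prob)


first_bettor = required_win_prob(1.0, 1.5)   # 2-to-3: 40%
caller = required_win_prob(1.0, 2.5)         # 2-to-5: about 29%
utg = first_bet_percentile(9, first_bettor)  # about 10%
print(f"UTG bets top {utg:.1%}; calling him needs top {call_percentile(utg, caller):.1%}")
```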

I’ll leave it to someone else to add in Implied Odds for the entire hand.

Bluffing on the Preflop:

Now we can consider bluffing. You may think that bluffing will throw a wrench in the entire model. But not really. Let’s think about what it means for an opponent behind to be bluffing when he bets first on the preflop. I would argue that for any competent player, any regular bluffing in that case would appear as looseness in your model. An opponent has no reason not to bet his profitably bettable cards. So if he bluffs, he will be playing more than his average. You may not be able to predict which way he is bluffing, but you’ll know that he is bluffing with some known frequency. You can then adjust the strength of the hand needed to call/raise him accordingly.

Your bot itself should also bluff. Sklansky writes that the optimum bluffing frequency is such that it makes your opponent’s Hand Odds equal his Implied Odds.

Summary

Lots of people view the preflop as a simple and automatic part of the game. But it is definitely not, due largely to the effect of opponent actions and modeling on the odds. And, because a good bot is so strong on the flop, correct preflop play will get the bot into the right set of flops and put it over the top. The bot doesn’t have the human element to recognize opponent play patterns from just a few hands, but it can compensate by analyzing all available data to a level that is not humanly possible. Once the bot is in the right set of flops, it is in the zone.

At the end of the day, all the bot has to do is beat the rake. If it does that, it can keep its head above water for hours and occasionally capitalize on an opponent’s mistake. That’s all that it needs to be a winner. Unless it is up against other bots ;-)

---------------
Ervin Peretz is a software engineer at a major web search company.
He is author of HoldemKiller, a free online poker bot.

Check or Bet? Call or Raise?

(continued from previous post)

I want to zero in on one fine point in the Check-or-Bet and Call-or-Raise questions. This is still addressing straight play, i.e. statistically straightforward play with no deviousness.

Even if you have odds to bet first or raise, you may not want to push extra money into the pot. This is if your extra bet does not “pay for itself” statistically.

1. Should I Bet First / Raise ?

Let’s say, for example, that you are on a nut flush draw against 2 opponents, and the pot is large, e.g. the bet is $4 and the pot is already $40. You do not have a made hand yet, but your RHO are 25% due to the flush draw, and the IPO are somewhere around 10%. Well, by that comparison, you certainly have odds to bet first or raise according to the F() function in the last segment. But you usually don’t want to do so in this case because you haven’t made your hand yet. Certainly you’ll call anything, but you don’t want to voluntarily put money in the pot until you have a made hand. This is because IPO is missing something. IPO (when compared to RHO) defines acceptable risk for the reward of the pot. But it doesn’t address the risk/reward of taking an extra risk to win a possible increment to the pot.

We’re talking about straight play here, so there’s no consideration yet of a semi-bluff to get an opponent out. When considering an extra bet (i.e. a first bet or a raise), the size of the pot doesn’t matter, because it is not at risk. Your required odds to call a bet in the above example are roughly 10%, due to the size of the pot.

On the other hand, against 2 opponents you can only win back twice your extra bet, but you will only do so 1 in 4 times. Your required odds to bet first or raise are roughly 33%. Since your RHO are way above 10%, but below 33%, you should call anything but not bet first or raise.

This may seem trivial and you may think you can deal with it by a simple check for a draw hand in the program. But the difficulty arises in complex hands where, in addition to the powerful draw, you may have a small pair or some other draw. The bot can’t just eyeball it. Furthermore, in a multiway hand, raising the draw hand often is correct. So we need a way to deal with this effect statistically.

To summarize, defining:

RO (raise odds) as the implied odds for the incremental bet-first-or-raise, ignoring the current pot, and
IPOr as the regular implied pot odds but with you raising instead of calling,

when F(rho, ipo) is TRUE but F(rho, RO) is FALSE, straight play for the bot is to check/call, even if F(rho, IPOr) is TRUE.
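A small sketch of that rule in Python, with all odds given as fractions (25% is 0.25). F() is the function from the earlier segment. The `raise_odds` helper is my own simplification: it assumes every remaining opponent calls the extra bet, so you risk one bet to win one bet per opponent. The IPOr value of 0.12 in the example is made up.

```python
def F(rho, odds):
    """Callable/bettable test from the earlier segment: rho must beat the
    break-even win rate implied by the odds."""
    return rho > odds / (1.0 + odds)


def raise_odds(num_opponents):
    """RO: the extra bet divided by what it can win, ignoring the current pot.
    Assumes every opponent calls the extra bet (an assumption of this sketch)."""
    return 1.0 / num_opponents


def straight_play(rho, ipo, ipo_raise, ro):
    if not F(rho, ipo):
        return "check/fold"
    if F(rho, ro) and F(rho, ipo_raise):
        return "bet/raise"
    # Odds to stay in, but the extra bet does not pay for itself.
    return "check/call"


# The nut flush draw example: RHO 25%, IPO about 10%, two opponents.
print(straight_play(0.25, 0.10, 0.12, raise_odds(2)))
```

With two opponents RO is 50%, so the break-even win rate for the extra bet is 0.5/1.5, the roughly 33% quoted above.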


Implied Odds

(continued from previous post)

The last segment covered Reflective Hand Odds, which are your basic, real-world hand odds, taking opponent play into account, albeit naively. At some point we can try to complete that model with opponent modeling data and a fuzzy, dynamic heuristic for putting a read on opponents’ play. What’s nice is that we have a model that will take in these innovations; we can incorporate opponent knowledge in a statistically meaningful way by evolving the “is-bettable” heuristic and customizing for each opponent.

1. Simple Pot Odds

So now let’s look at the other major component of straight play: pot odds. Pot odds are the ratio of your current contemplated bet or call to the current size of the pot. If the bet is $4, and the pot is $16, then the pot odds are 4-to-1; I’ll invert this as 25% for math purposes i.e. your bet is 25% of the current pot.

2. Implied Odds

As any player knows, simple pot odds are insufficient. What’s really meaningful is the ratio of two possibly larger numbers: your total bets for the rest of the hand, and the final pot minus your future bets; at any moment in the hand, that is the risk/reward equation. This is called your Implied Odds, and unlike pot odds it may depend on projections for opponent behavior, making the result fuzzy. In the preceding example, if there is also an expectation of one more call on the river against one opponent, then the total bet is $8 and the total pot (not counting your future bets) is $20, so the implied odds are 20-to-8 (10-to-4), or 40%.
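Both ratios from the running example, as a quick Python sketch. The function and parameter names are mine; the projection of future bets is exactly the fuzzy part that a real bot would have to estimate.

```python
def pot_odds(bet, pot):
    """Current bet as a fraction of the current pot."""
    return bet / pot


def implied_odds(bet_now, pot_now, future_own_bets, future_opponent_bets):
    """Total bets we expect to make, divided by the final pot not counting
    our own future bets."""
    total_bets = bet_now + future_own_bets
    final_pot = pot_now + future_opponent_bets
    return total_bets / final_pot


print(pot_odds(4, 16))             # 0.25
print(implied_odds(4, 16, 4, 4))   # 0.4: $8 of our bets against a $20 pot
```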

In heads-up, implied odds are usually worse (higher percent) than pot odds; in multiway, they can be much better (lower percent).

In draw hands, there is a special complexity to implied odds, because your behavior will change based on the cards; on a flush draw, you will bet the river only if you make your flush. There is also the possibility of an opponent being on a draw. So you have to have an implied odds calculation that adjusts the numbers accordingly. This implied odds calculation is another big piece of AI required for good play.

This will be a trivial point for most: As the hand moves forward, more of your money ends up in the pot. As you recalculate your IPO in later betting rounds, that money is part of the pot, which tends to make your IPO better (lower percent) as the hand progresses (this is countered in limit holdem by moving to the higher limit at the turn). For this reason, a small statistical error early in the hand gets magnified in terms of your final losses, because your own mis-bet money winds up in the pot and gives you odds for staying in the hand and continuing to bet. So getting these statistics right in marginal cases matters big time.

Now let’s use these figures to compute straight play by the bot. This means no misleading or psychological trickery; no bluffs, check-raises, etc, just simple, predictable, statistically correct play. You can actually go pretty far with basic play in low-limit holdem, where no one is paying too much attention to you; and so can your bot, I suppose.

3. Check/Bet/Call/Raise/Fold ?

First let’s define a function that identifies a callable situation in straight play, based on Reflective Hand Odds (RHO) and Implied Pot Odds (IPO):

F(rho, ipo) = TRUE if rho > ipo/(100%+ipo); else FALSE

In straight play, deciding whether to call or not depends on a simple comparison of RHO and IPO, and is returned by F(). You can also trivially compute IPO for a raise by you; this will usually produce higher (worse) IPO since a greater proportion of the money is coming from you (unless it’s a very multiway hand). If RHO are still greater, then you may contemplate a raise.
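To see where the ipo/(100%+ipo) term comes from: if you risk a bet b to win a pot P, you break even when rho*P = (1-rho)*b, i.e. when rho = b/(P+b), and dividing through by P gives ipo/(1+ipo). A minimal Python version, with odds as fractions and a helper name of my own choosing:

```python
def break_even_win_rate(ipo):
    """Win probability at which risking `ipo` units to win 1 unit has zero EV."""
    return ipo / (1.0 + ipo)


def F(rho, ipo):
    """TRUE when the Reflective Hand Odds beat the break-even rate."""
    return rho > break_even_win_rate(ipo)


# A $4 bet into a $16 pot is 25% pot odds, which needs better than a 20% win rate.
print(break_even_win_rate(0.25))
print(F(0.25, 0.25))
```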

If the first bet is to you, the minimum requirement to bet first (still assuming straight play only) is F(rho, ipo)==TRUE. With multiple players or a loose raiser ahead, you may require sufficient RHO to cover a raise before betting first. On the other hand, with any made hand, you don’t want to check and “give a free card”. Determining when to bet first is a creative part of the program, and may inject random behavior, but there is nothing very advanced or complex here; it comes down to expectation of an opponent betting first or raising, which is a direct consequence of opponent modeling and the “is-bettable” heuristic used in RHO to put each opponent on a hand. Even a static algorithm, with no opponent modeling, may be enough here.

4. The Preflop

On the preflop, the same calculations for RHO and IPO apply as for the rest of the game in theory. However, in reality, these are not computable on the preflop for several reasons.

First of all, there is much less information available on the preflop. There is no reflective component to RHO if an opponent has not yet acted. For IPO, the possibilities are too chaotic and dependent on the situation at the flop. Also, even if the hand odds information were available, it would not be computable in real time from the preflop.

For this reason, it makes much more sense to follow a precomputed strategy on the preflop, such as Sklansky’s. Sklansky gives you a general preflop strategy for playing each set of hole cards in early, middle, and late position. It is somewhat ambiguous with respect to different size games, and different types of opponents. Still, by maintaining a preflop history for each opponent, you can tweak the basic Sklansky action to respond to the game.

At some point, I’ll go through some of my ideas on generating the preflop strategy from scratch. I once estimated that generating the entire table would take 70 hours of computer time. One of my ambitions is to create my own preflop strategy which is adjustable statistically to variations in the game (as opposed to ad hoc, like Sklansky’s).

5. Straight Play

To summarize, the RHO/IPO calculation for straight play applies only after the flop. Straight play for a bot means following (some variant of) a precomputed strategy (like Sklansky’s) for the preflop, and (some variant of) the F() function thereafter.
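Put together, straight play is just a dispatch on the betting round. The code below is only a sketch under loud assumptions: the tiny table is a made-up placeholder (it is not Sklansky's table), and the function and parameter names are my own.

```python
# Placeholder preflop table keyed by (hole-card class, position).
# These two entries are illustrative only, NOT Sklansky's actual table.
TOY_PREFLOP_TABLE = {
    ("AA", "early"): "raise",
    ("72o", "early"): "fold",
}


def straight_play(street, rho=0.0, ipo=0.0, hole_class="", position="",
                  table=TOY_PREFLOP_TABLE):
    """Precomputed lookup on the preflop, F(rho, ipo) on later rounds."""
    if street == "preflop":
        # Unknown combinations default to folding in this toy version.
        return table.get((hole_class, position), "fold")
    # F(rho, ipo) with both arguments as fractions (0.25 == 25%).
    return "call" if rho > ipo / (1.0 + ipo) else "fold"
```

This leaves out checking when there is no bet to you, and the bet-first and raise considerations from section 3.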

---------------
Ervin Peretz is a software engineer at a major web search company.
He is author of HoldemKiller, a free online poker bot.

Computing Actual Hand Odds

(continued from previous post)

So after getting raw hand odds, the bot needs to react to the table action if it’s going to be any good. This means adjusting the odds to opponent betting behavior. But how to do that? How do you adjust your hand odds at the turn based on the fact that one of the four opponents raised preflop?

For a first stab, let’s assume that all opponents are playing rationally and competently, with no deceptive play. If that’s true, then we can look at the opponents’ betting history for previous rounds and compute which sets of opponent hole cards were playable as played on those rounds, using some heuristic. The choice of that heuristic is a central piece of AI for the bot. For now, let’s just consider a boolean heuristic.

1. Revised 7-card hand spectrum

Going back to the 7-card hand spectrum, we can then X out the red dots that assume opponent hole cards which, according to the heuristic, were not bettable as played on those rounds, given the table cards present on each round.

[_._...xx.._...._..__xx_..._._..._.....x..._...._.xx_.._._.._...._.._.._..._..._..xx.._.._xxx_]
133million

Again, the spectrum represents all ~133million possible 7-card hands. Blue dots represent ones you can make; red dots represent ones the opponent can make, and which assume hole cards that are considered playable as played up to now. Red x’s represent hands the opponent can make, but which include hole cards that are not considered playable as played in some preceding round.
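Here is a rough sketch of the x-ing out step for a single opponent who raised preflop. The heuristic itself is a toy I made up for illustration (pocket pair or two cards ten or higher); the real heuristic is the hard part and is not specified here. Cards are two-character strings like "Ah", which is also my own convention.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]


def toy_raise_heuristic(hole):
    """Hypothetical boolean heuristic: a preflop raise counts as 'playable
    as played' only with a pocket pair or two cards ten or higher."""
    (r1, _), (r2, _) = hole
    return r1 == r2 or (RANKS.index(r1) >= 8 and RANKS.index(r2) >= 8)


def filter_opponent_holes(known_cards, heuristic):
    """Split all possible opponent hole cards into kept dots and x'd-out ones."""
    unseen = [c for c in DECK if c not in known_cards]
    kept, crossed_out = [], []
    for hole in combinations(unseen, 2):
        (kept if heuristic(hole) else crossed_out).append(hole)
    return kept, crossed_out


# Your hole cards plus the flop are known; everything else is unseen.
kept, crossed_out = filter_opponent_holes(
    ["Ah", "Kh", "2c", "7d", "9s"], toy_raise_heuristic)
```

Each kept pair of hole cards then expands into its possible 7-card hands (the remaining red dots); the crossed-out pairs expand into the x's.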

2. Revised 5-card hand spectrum

As before, the 7-card possibilities map to a sorted spectrum of 5-card hands, which are the “best hands” taken from the 7-card hands. This time we filter out the red x’s above, so we only consider opponent hands that are reasonable according to the betting. Recall that the dots are weighted according to the number of 7-card hands that map to them. So, depending on what the heuristic returned for the corresponding 7-card hands, some red dots are eliminated, some are unaffected, and some have their weights reduced.

[__..._..__.._..__....__._..__...___.__..__.___._...___..__..__] 2.6 million

The spectrum above represents all possible 5-card hands (ordered worst to best). Blue dots represent "best hands" taken from the revised spectrum of 7-card user hands; red dots represent best hands taken from the possible 7-card opponent hands, which were bettable in the past, i.e. the red dots above only, not the x’s. Multiple 7-card hands can map to the same best 5-card hand.
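The reweighting can be sketched with a simple counter. I'm assuming each opponent 7-card hand has already been reduced to a pair (best 5-card rank, playable flag); the integer ranks below are made up for illustration.

```python
from collections import Counter


def revised_weights(opp_hands):
    """opp_hands: list of (best5_rank, playable) pairs, one per 7-card hand.

    Returns the dot weights before and after dropping the x'd-out hands.
    """
    before = Counter(rank for rank, _ in opp_hands)
    after = Counter(rank for rank, playable in opp_hands if playable)
    return before, after


hands = [(10, True), (10, True), (20, True), (20, False), (30, False)]
before, after = revised_weights(hands)
```

In this toy data, rank 10 is unaffected (weight 2 both times), rank 20 has its weight reduced from 2 to 1, and rank 30 is eliminated.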

3. Reflective Hand Odds

We can now compute new hand odds against 1 or n opponents exactly as before, but using the revised 5-card hand spectrum. These new odds take into account opponent play, reflecting on past rounds and which opponent hole cards were bettable up until now; I call these the Reflective Hand Odds.

A player may not have the nuts, but may still have the “reflective nuts”. This means that, assuming reasonable play by the opponents, they cannot be playing hole cards that have a chance of winning at this point. In this case, every red dot to the right of a blue dot in the unrevised 5-card hand spectrum comes only from 7-card hands that the heuristic x’d out in the revised 7-card hand spectrum.

“Having the reflective nuts” looks like this:
[_...__.___._..__.__...__.___._..___....._.._..._] ; 5-card hand spectrum (unrevised)
[_...__.___._..__.__...__.___._..___._..___.._] ; Revised 5-card hand spectrum
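In code, the difference between the nuts and the reflective nuts is just which opponent hands you compare against. A small sketch, with made-up integer ranks and hypothetical function names:

```python
def has_nuts(user_ranks, opp_ranks):
    """True when the user's worst possible hand beats every opponent hand."""
    return min(user_ranks) > max(opp_ranks)


def has_reflective_nuts(user_ranks, opp_hands):
    """opp_hands: list of (rank, playable) pairs; only playable ones count."""
    kept = [rank for rank, playable in opp_hands if playable]
    # With no playable opponent hands left, nothing can beat the user.
    return not kept or min(user_ranks) > max(kept)


user = [50, 60]
opponent = [(10, True), (40, True), (70, False), (90, False)]
```

Here the opponent could in principle hold a 70 or a 90, so the user does not have the nuts; but both of those were x'd out by the heuristic, so the user has the reflective nuts.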

Applying the heuristic is the computational equivalent of “putting the opponent on a hand”. This is where a bot, with perfect memory and consistency, can have a significant advantage over a human player, especially in a multiway situation.

But to be effective, the heuristic needs to be fuzzy, and take opponent modeling and possible deceptive play into account. A too-rigid heuristic will make the bot easily susceptible to deceptive play.


Raw Hand Odds

Welcome to holdemkiller, my poker bot blog and guide to writing your own pokerbot.

I thought that to kick things off, I’d lay some groundwork and terminology for the art of software poker analysis. There has been a lot of speculation recently about bots playing hold’em online, and how feasible or powerful such a bot might be. Before getting into any of the “human level” qualifications for playing good poker, the bot has to compute basic odds for the hand. This is the very lowest requirement, and so it’s a good place to begin understanding how complex this is in the presence of incomplete information. I hope we can all take part in refining and building on these ideas, creatively and with specificity. More topics will follow.

Raw hand odds:

Before we start talking about any kind of opponent modeling or hand reading, we need to establish basic hand odds. This is harder than it sounds. Most players establish raw hand odds by “putting the opponent on a hand”, and then counting their “out cards”, assuming both are correct and absolute. No player can calculate his exact hand odds in a typical hand. Let’s look at how to do it rigorously.

For raw hand odds, we assume a simple sort of hold’em game where the cards are dealt, there is no betting or folding, and we’re simply computing our expectation of coming out on top against n opponents.

Below I’ve outlined a process for computing the raw hand odds, starting with your hole cards, the board, and n opponents with unknown hole cards. I’m going into graphic detail here so we can all appreciate the problem.

1. Starting info

Your hole (2 cards)
Current Board (0-5 cards)
(opponents …)

2. Possible 7-card hand spectrum

There are about 133million 7-card hands total (52C7 i.e. "52 choose 7" =~ 133million). Given the starting info, you can trivially come up with the possible final 7-card hands for you and an opponent. I call this a hand spectrum. It has a dot for each possible final 7-card hand for the user and opponent. There are many times more opponent dots than user dots (hundreds of times more once the flop is out), because you only know the board cards for the opponent. All 7-card final hands are equally likely.
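To get a feel for the sizes involved, here is a quick count using Python's math.comb. The function name and the way I count opponent dots (any 7-card set that contains the known board and none of your hole cards) are my own framing.

```python
from math import comb

TOTAL_7_CARD_HANDS = comb(52, 7)  # 133,784,560


def spectrum_sizes(board_cards_known):
    """Number of (user dots, opponent dots) given 0, 3, 4 or 5 board cards."""
    unseen = 52 - 2 - board_cards_known   # cards you cannot see
    to_come = 5 - board_cards_known       # board cards still to be dealt
    user = comb(unseen, to_come)          # your hole cards are fixed
    opponent = comb(unseen, to_come + 2)  # opponent's hole cards are unknown too
    return user, opponent
```

At the flop this gives 1,081 user dots against 178,365 opponent dots; at the river, 1 against 990.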

[_._......_...._.._..._..._._..._......._......_...._.._._.._..._.._.._..._..._......_.._..._] 133million

The diagram above is of course conceptual and doesn't represent the immense number of hands involved. It represents all possible specific 7-card hands that can be made by you or an opponent. Blue dots represent ones you can make. Red dots represent ones the opponent can make. Note lots more red than blue dots (because there are fewer known cards for the opponents). All are equally likely (for raw hand odds).

3. Possible 5-card “best hand” spectrum

Each of the 7-card hands maps to a 5-card “best hand”. People can eyeball this, but computationally it’s pretty intensive, and you have to pre-generate the required tables to do it in real time. So now you can produce a 5-card hand spectrum, sorted by hand rank. Multiple 7-card hands (up to thousands) can map to the same 5-card hand, so each possible 5-card hand has a “weight” associated with it. Note that adjacent hands may have equal rank.

[__...._..__....._..__....__._..__...___.__...__.___._....___....__..__] 2.6 million

This diagram represents the ~2.6million specific 5-card hands (52C5 = 2,598,960, ordered worst to best). Blue dots represent "best hands" taken from the possible 7-card user hands. Red dots represent best hands taken from the possible 7-card opponent hands (there are also some red/blue dots in cases where both play the board; we'll ignore that for now). Multiple 7-card hands can map to the same best 5-card hand. So all are not equally likely (for analysis, each must be weighted by the number of equally-likely 7-card hands that map to it).
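The 7-card to 5-card mapping can be sketched without any precomputed tables by simply trying all 21 five-card subsets. This is far too slow for real-time use and is not the table-driven approach mentioned above; it is only meant to make the mapping concrete. Card strings like "Ah" and the rank tuples are my own conventions.

```python
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"


def rank5(cards):
    """Sortable rank for a 5-card hand such as ["Ah", "Kd", ...]; higher wins."""
    vals = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    # Group equal values: biggest group first, then highest value first.
    groups = sorted(Counter(vals).items(), key=lambda g: (g[1], g[0]), reverse=True)
    shape = tuple(count for _, count in groups)
    order = tuple(value for value, _ in groups)
    flush = len({c[1] for c in cards}) == 1
    straight = len(groups) == 5 and vals[0] - vals[4] == 4
    if vals == [12, 3, 2, 1, 0]:  # A-2-3-4-5: the ace plays low
        straight, order = True, (3, 2, 1, 0, -1)
    if straight and flush:
        category = 8
    elif shape == (4, 1):
        category = 7
    elif shape == (3, 2):
        category = 6
    elif flush:
        category = 5
    elif straight:
        category = 4
    elif shape == (3, 1, 1):
        category = 3
    elif shape == (2, 2, 1):
        category = 2
    elif shape == (2, 1, 1, 1):
        category = 1
    else:
        category = 0
    return (category, order)


def best5(seven):
    """Best 5-card rank obtainable from a 7-card hand (21 subsets)."""
    return max(rank5(hand) for hand in combinations(seven, 5))
```

Two different 7-card hands that contain the same royal flush map to the same rank, which is exactly why each 5-card dot needs a weight.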

“Having the nuts” looks like this: your min is greater than the opponents’ max:

[__...._..__....._..__....__._..__...___.__...__.___._..___....__..__ ]

What does it mean to have a better partial hand than 1 opponent? It means that in the 5-card spectrum, the average blue dot (applying “weight” for the number of 7-card hands that map to it) is to the right of most of the red dots.

4. Raw Hand Odds vs. 1 Opponent

What does “Having a winning hand” against 1 opponent mean? Looking at the top spectrum in #3, it’s tempting to just compare average ranks for red and blue. But that’s flawed, as illustrated by the following spectrum:

[ _..___......._______________________________________._]

Here, you have an expected loss, but the average rank (pulled up by the rightmost dot), would make you think you have a likely win. Remember, for raw hand odds, we’re not talking about betting power or anything like that -- just the raw likelihood of winning the hand.

So the right answer, for the simple case of 1 opponent, is to compare the medians of the red and blue hand ranks. That answers the boolean question of which are more to the right of the others.
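A toy version of the lopsided spectrum above, with made-up integer ranks, shows how the average misleads where the median does not:

```python
from statistics import mean, median

# Made-up ranks: blue is mostly weak with one monster; red is middling.
blue = [1, 2, 3, 4, 100]
red = [10, 11, 12, 13, 14]

blue_mean, red_mean = mean(blue), mean(red)          # 22 vs 12
blue_median, red_median = median(blue), median(red)  # 3 vs 12

# Fraction of (blue, red) pairings that blue actually wins.
win_rate = sum(b > r for b in blue for r in red) / (len(blue) * len(red))
```

Blue has the higher average but the lower median, and it wins only 20% of the pairings.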

But, as we all know, at the end of the day we’re going to need odds (to compare with pot odds), not just a boolean "is better than" metric. So what are the raw hand odds against 1 opponent? (Think about this before reading on …)

Well, it’s the probability that, for random red and blue dots from the 7-card spectrum, the blue dot maps to the right of the red dot in the 5-card spectrum. You can do this with a brute-force count. There are accelerated methods, but I won’t go into perf here.
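A brute-force count might look like the sketch below. It takes the description above literally: each side is a list of (rank, weight) pairs, where weight is the number of 7-card hands mapping to that rank, and the two sides are drawn independently. A fuller implementation would deal both hands over the same board; that detail, and the function name, are outside what is described here.

```python
def odds_vs_one(user_hands, opp_hands):
    """Return (win, tie) probabilities for weighted (rank, weight) lists."""
    wins = ties = total = 0
    for user_rank, user_weight in user_hands:
        for opp_rank, opp_weight in opp_hands:
            weight = user_weight * opp_weight
            total += weight
            if user_rank > opp_rank:
                wins += weight
            elif user_rank == opp_rank:
                ties += weight
    return wins / total, ties / total


# Made-up spectrum: one strong user hand, three weak ones.
win, tie = odds_vs_one([(5, 1), (1, 3)], [(3, 1), (2, 1)])
```

Here the user wins only when holding the rank-5 hand, so the raw hand odds come out to 25%.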

5. Raw Hand Odds vs. n Opponents

So, given the raw hand odds against 1 opponent, what are the raw hand odds against 2 opponents?

Opponent hole cards are almost independent, and you may recall the formula that for independent events A and B, p(A and B) = p(A)*p(B). So you may think that raw hand odds against 2 opponents is simply the square of the raw hand odds against one opponent; and that for n opponents, you raise the raw hand odds against one opponent to the power n.

Although this is sometimes close (if the dots are evenly distributed), it is not so in many cases (especially in draw hands against many opponents, in which case it is way off). Let’s see an illustration of why:

Suppose you are on a 1-card draw for the nut flush. If you don’t make your flush, you have nothing. The 5-card spectrum looks something like this:

[__...________........____....___...___...._______......._______.]

The opponent’s expected hand is roughly in the 50th percentile of rank, and yours is somewhere around the 25th percentile. Your odds against 1 opponent are exactly the odds of making your nut flush, i.e. 25%.

Now, applying the raise-to-the-power-n idea for n opponents above, you would calculate that against 2 opponents, your raw hand odds are 25%^2 = ~6%, and against 3 opponents they’re 25%^3 = ~1.6%. But that’s clearly wrong! If you make your nut flush, you’ll beat all opponents no matter how many there are; and if you don’t, you’ll lose.

This effect, due to the uneven distribution of the dots, comes up in subtle ways all the time. Therefore, even if they knew their raw hand odds against 1, and could exponentiate fractions in their head, no one can calculate their odds against n players accurately in their head.

The solution is to exponentiate the odds for each possible final user hand separately, before taking the weighted average over those hands. That way, the high-rank possible hands don’t lose value in the exponentiation:

for 2 opponents: (3/4)*(~0%)^2 + (1/4)*(~100%)^2 = ~25%
for 3 opponents: (3/4)*(~0%)^3 + (1/4)*(~100%)^3 = ~25% …
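The same arithmetic as a sketch in Python, next to the flawed shortcut for comparison. Hands are (rank, weight) pairs with made-up ranks, opponents are treated as independent, and ties count as losses to keep it short; all names are my own.

```python
def odds_vs_n(user_hands, opp_hands, n):
    """Per-hand exponentiation. Both arguments are lists of (rank, weight)."""
    user_total = sum(w for _, w in user_hands)
    opp_total = sum(w for _, w in opp_hands)
    odds = 0.0
    for rank, weight in user_hands:
        # Chance this particular final hand beats one random opponent hand.
        beats_one = sum(w for r, w in opp_hands if r < rank) / opp_total
        odds += (weight / user_total) * beats_one ** n
    return odds


def naive_odds_vs_n(user_hands, opp_hands, n):
    """The flawed shortcut: raise the overall 1-opponent odds to the power n."""
    return odds_vs_n(user_hands, opp_hands, 1) ** n


# Nut flush draw from the text: 1 hand in 4 beats everything, the rest lose.
user = [(100, 1), (0, 3)]
opponent = [(rank, 1) for rank in range(1, 100)]
```

Against 3 opponents, odds_vs_n stays at 0.25 while the naive version drops to 0.015625.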

With messier dot distributions, it gets more interesting and the effects are more chaotic.

---------------
Ervin Peretz is a software engineer at a major web search company.
He is author of HoldemKiller, a free online poker bot.
He is author of BibleCodex, a free Bible Codes research tool.
He is founder of Terra Bite Lounge, a voluntary-payment cafe/restaurant chain.

Opponent Modeling

Monday, April 18, 2005

Beyond the Sklansky Preflop

Check or Bet? Call or Raise?

Implied Odds

Computing Actual Hand Odds

Raw Hand Odds
