Tuesday, April 06, 2010

Fixing Razz

Trying to Get it Right

Joel Spolsky wrote a famous piece on how to do job interviews. His mantra throughout the article is that you want to hire people who are "Smart" and "Get Things Done" (I hope he hasn't copyrighted these phrases yet). To these two attributes, I would humbly add "Tries to Get It Right", though I'm hardly the first one to suggest this. When hiring a software engineer, you want someone who does his best to get it right the first time, and if necessary, gets it right the second or third or fourth time if things go wrong.

And things will go wrong. When it is discovered, for instance, that something he has written is based on faulty assumptions, your engineer shouldn't react too defensively, at least not for long. Ideally, he should look like someone just punched him in the gut, and then he should scramble to fix whatever it is that needs fixing, throwing away reams of code as necessary.

Enough with the Digression, What Needs Fixing?

Thanks to a number of posters on the 2+2 forums (the thread is here - the discussion starts toward the middle of page two), a highly counter-intuitive result was brought to my attention involving randomized razz simulations. In a sense, it isn't a bug - the code is actually performing the way I intended it. No, it is worse than a bug - it is a behavior, hidden from the user, which gives results different from what, in almost all cases, was probably intended.

The Old Approach


On the propokertools razz simulator, you can enter ranges of hands for each player. For instance, if you have a2 with a 9 up, and you see an early raiser showing a seven with a bunch of low cards following, you might put him on a range of 'a seven up with two downcards seven or lower without any duplicates'. Here is a link to this situation in propokertools with some dead cards added, assuming everyone but the raiser folds to you. Our simulation gives a29 around 42% all-in equity in this situation.

"But how are the random hands generated?" you ask. Fair enough. Essentially, for each possible rank in each hand, a selection is made randomly. Then, some statistics are performed to ensure a fair probability for that choice (please pardon the hand-waving here as this part of the algorithm is not important for the purposes of this discussion). Then, the next rank is chosen, etc. etc.

For instance, for our a29 hand, we always choose an ace, a two, and a nine, since no 'range of hands' syntax was employed for any of the cards.

For the (7-7-7) hand, we first choose a rank for the first downcard at random (let's say 6), then the second downcard (let's say 4), and then we always choose a 7 for the third card. Great, we chose 647. (If the second card chosen had also been a 6, we would have had to start over, as our range stipulates no duplicate ranks in the first three cards.)

So, we have two 'real' hands a29 vs 647 - we can now deal out the rest of the cards for each hand and see who won (or if there was a tie).

Wash/rinse/repeat 600,000 times and VOILA! 42% equity.
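As a rough illustration of the rank-by-rank selection described above, here is a minimal Python sketch. It is not the actual propokertools code: the function name and the slot representation are made up for this example, every rank is picked uniformly, and the "statistics" correction step is left out entirely.

```python
import random

def sample_ordered_ranks(slots, rng=random):
    """Pick one rank per slot, left to right; start over on any duplicate rank."""
    while True:
        ranks = [rng.choice(allowed) for allowed in slots]
        if len(set(ranks)) == len(ranks):
            return ranks

# '(7-7-7)': two downcards seven or lower, then a fixed seven, no duplicate ranks.
ACE_TO_SEVEN = list(range(1, 8))  # ace counts as 1
villain_slots = [ACE_TO_SEVEN, ACE_TO_SEVEN, [7]]

hero = [1, 2, 9]  # a29 uses no range syntax, so it is the same every time
villain = sample_ordered_ranks(villain_slots, random.Random(2010))
print(hero, villain)
```

Each call yields one concrete pair of starting hands; the rest of the deal and the showdown would follow from there.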

Investigating the Old Approach


Let's say we race "(3-3-)45678" vs "A345678" with no dead cards. The Old Approach involves picking a random rank for each of the "3-" cards.

For the first card, if we randomly pick a deuce, then our choices for the next card are an ace or a three, with equal probability. We win with the ace and lose with the three, giving 50% equity on average when we pick a deuce for the first card.

For the first card, if we randomly pick an ace, we then have four ways to pick a deuce and three ways to pick a three for the second card. Picking a deuce for the second card nets a win, and a three gets a tie. So, on average, we win 4 times and tie 3 times for every seven random ace picks.

For the first card, if we randomly pick a three, the situation is the opposite of picking an ace. We lose 4 times and tie 3 times for every seven times this happens.

It should be obvious at this point that the Old Approach gives us 50% equity on average, which matches our intuition nicely.

Now, for the problem scenario, let's race "(3-2-)45678" vs "A345678". Notice that the only thing I did was change the second rank of the first hand from '3-' to '2-'. This causes the equities to change from being even to being 2 to 1. That's right - with the Old Approach, the "(3-2-)45678" hand gets 66.66% equity. I now quote from jbrennan's post on 2+2, which efficiently explains why this is the case:
the simulator would seem to do this:

A2 -- valid hand
A3 -- NOT valid (second card not a deuce or less)
2A -- valid hand
23 -- NOT valid (second card not a deuce or less)
3A -- valid hand
32 -- valid hand

Since A2, 2A, or 32 can each happen 12 ways, and 3A can happen 9 ways, we end up with 45 possible hands -- 24 winners, 12 losers, and 9 ties. That matches up exactly with the output of the simulator
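The counts in the quote can be checked by brute force. The Python sketch below models the Old Approach the way the quote describes it: the two downcards are dealt in order, and each position is tested against its own limit. The helper names are made up for this example, and the suits of the opponent's ace and three are an arbitrary assumption.

```python
from itertools import permutations

SUITS = "shdc"
VILLAIN = [1, 3, 4, 5, 6, 7, 8]  # A345678, ace counts as 1
BOARD = [4, 5, 6, 7, 8]          # the fixed 45678 of the first hand
# Assumption: the opponent's ace and three are spades. Only ranks three and
# lower can be downcards here, so no other removed card matters.
LIVE = [(r, s) for r in (1, 2, 3) for s in SUITS
        if (r, s) not in {(1, "s"), (3, "s")}]

def best_low(ranks):
    """Five lowest distinct ranks, highest first (needs five distinct ranks)."""
    return tuple(sorted(set(ranks))[:5][::-1])

def ordered_counts(max_first, max_second):
    """Return (wins, losses, ties) over ordered, non-paired downcard deals."""
    target = best_low(VILLAIN)
    wins = losses = ties = 0
    for first, second in permutations(LIVE, 2):
        if first[0] > max_first or second[0] > max_second or first[0] == second[0]:
            continue  # position limit violated, or a pair
        mine = best_low([first[0], second[0]] + BOARD)
        if mine < target:      # the lower tuple is the better razz hand
            wins += 1
        elif mine > target:
            losses += 1
        else:
            ties += 1
    return wins, losses, ties

print(ordered_counts(3, 3))  # (3-3-)45678
print(ordered_counts(3, 2))  # (3-2-)45678
```

For (3-2-) this gives 24 wins, 12 losses and 9 ties out of 45, the same numbers as in the quote. For (3-3-) it gives 24 wins, 24 losses and 18 ties out of 66, which is an even race.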

The Problem with the Old Approach

The difference in equity between the '(3-3-)' and '(3-2-)' hands is highly counter-intuitive. In both cases, it seems pretty clear that the person running the simulation intended 'two downcards 3 or lower that are different'. That person most certainly did NOT mean 'look at the first downcard - if it is a three or lower, look at the second downcard - if a nonmatching ace or deuce, play the hand'.

Downcards should not be ordered.

The New Approach

I scrapped the old random razz hand generation code and instead retrofitted the existing, well-used code for generating hands in hold'em and omaha. Here is the algorithm that translates a 'suitless' razz simulation into a 'suitfull' one.
  • Examine each rank in the simulation.
  • For cards where only one rank is possible (dead cards, ranks in hands like 'a23'), pick a card from the deck with the appropriate rank and assign it as part of the range. Remove that card from the deck.
  • For cards where more than one rank is possible, assign a range consisting of the specified ranks and any suit.
  • Remove any sequences of ranks in hands that violate the 'no pairs' constraints (the parentheses used to force no duplicates).
  • Race!
Here is an example of the procedure applied to "A345678" vs "(3-3-)45678"
  • Assigning arbitrary suits to non-range-based ranks, we have "As3s4s5s6s7s8s" vs "(3-3-)4h5h6h7h8h", and As3s4s5s6s7s8s4h5h6h7h8h are removed from the deck.
  • Assigning ranges, we have "As3s4s5s6s7s8s" vs "[3*|2*|a*][3*|2*|a*]4h5h6h7h8h" (where * indicates any suit). (I will not list all of the possibilities for the second hand here.)
  • Remove hands that violate the constraints (hands like 33..., 22... from hand two).
  • We're done!
You can walk through these steps yourself and see that the ranges do not change if you alter the (3-3-) to read (3-2-) - the equities are 50% for both simulations.
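Here is a small Python sketch of that walk-through. It is not the real hold'em/omaha range code. It assumes that a hand in a range is an unordered set of cards and that every distinct hand carries equal weight. The helper names are invented for this example, the suits follow the example above, and ties count as half a win.

```python
from itertools import product

SUITS = "shdc"
VILLAIN = [1, 3, 4, 5, 6, 7, 8]  # As3s4s5s6s7s8s, ace counts as 1
BOARD = [4, 5, 6, 7, 8]          # 4h5h6h7h8h
TAKEN = {(1, "s"), (3, "s")}     # the only removed cards ranked three or lower

def best_low(ranks):
    """Five lowest distinct ranks, highest first (needs five distinct ranks)."""
    return tuple(sorted(set(ranks))[:5][::-1])

def slot_cards(max_rank):
    """All live cards a single 'max_rank or lower' slot can stand for."""
    return [(r, s) for r in range(1, max_rank + 1) for s in SUITS
            if (r, s) not in TAKEN]

def downcard_range(max_first, max_second):
    """Unordered, pair-free downcard hands allowed by the two slots."""
    return {frozenset((a, b))
            for a, b in product(slot_cards(max_first), slot_cards(max_second))
            if a[0] != b[0]}

def equity(hands):
    """Share of showdowns won against VILLAIN, ties counted as half."""
    target = best_low(VILLAIN)
    score = 0.0
    for hand in hands:
        mine = best_low([r for r, _ in hand] + BOARD)
        score += 1.0 if mine < target else 0.5 if mine == target else 0.0
    return score / len(hands)

same = downcard_range(3, 3) == downcard_range(3, 2)
print(same, equity(downcard_range(3, 3)), equity(downcard_range(3, 2)))
```

Because the hands are stored as sets, "3 then A" and "A then 3" collapse into a single hand, so the two range strings describe the same 33 downcard combinations and the same equity.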
