Home > Uncategorized > Sharp Cards in Haskell: The Odds

## Sharp Cards in Haskell: The Odds

#### Recap

In a previous post I introduced a way to specify the action of drawing cards, culminating in the DrawM type:

```data DrawM card a = DrawFail | Done a | DrawOne (card -> DrawM card a)
| DrawAny (DrawM card a)```

So a draw can fail, succeed (with a result), require another card to be examined first or require any card to be drawn first. There is a monad instance that makes building items of this type quite straightforward. What it allows is not just to draw the cards using random chance, but also to calculate probabilities based on those specifications of drawing. In this post I’ll look at how to calculate those probabilities.

#### The Cards Type

Something that I omitted from my previous post was the definition of the Cards type. This represents a collection (deck) of cards. To aid in the way it is commonly used by the rest of the code, it is stored as a map from card to frequency, with cached card size:

`data Cards a = Cards { cardsMap :: Map a Int, cardCount :: Int }`

We can then define helpers functions such as this one to remove one card:

```removeOneCard :: Ord a => a -> Cards a -> Cards a
removeOneCard x (Cards m s) = Cards (update (`maybeMinus` 1) x m) (s - 1)

maybeMinus :: Int -> Int -> Maybe Int
maybeMinus x y | x <= y = Nothing
| otherwise = Just (x - y)```

There is also this function that is like fmap (but we can’t declare an actual Functor instance due to the Ord constraint):

```mapCards :: (Ord b) => (a -> b) -> Cards a -> Cards b
mapCards f (Cards m s) = Cards (mapKeysWith (+) f m) s```

#### Calculating the Odds

The strategy for calculating the probabilities for a draw (i.e. something of type DrawM card a) is fairly straightforward. In the cases where the draw requires another card, we give it another distinct card from the deck — and keep doing this until it fails or is done. But we can make a couple of optimisations.

If we use the Event type that I explained in a previous post for doing the card drawing, we end up calculating the Rational probability for all the different outcomes, then summing them all afterwards. Adding two rationals is a slow process: it involves multiplication to get a common denominator, then a simplification of the fraction — for each addition. Adding thousands or millions of rationals will get expensive.

In this case, however, the fractions involved will be fairly regular. The chance of a card on the first draw will be that card’s frequency divided by the total number of cards (e.g. 4/52 probability of getting an ace), so we have different numerators over a common denominator. Similarly for the second card, that will be the appropriate adjusted frequency divided by the total number of cards minus one, and so on. For example, the chance of getting an ace on the second card is 4/52*3/51 + 48/52*4/51, or to put it another way: (4*3 + 48*4)/(52*51); the denominator will always be common for the same number of cards drawn. So the denominator at any given point will be n!/(n-d)! where d is the number of cards drawn (aka depth), or to put it in Haskell: product [(n-d+1)..n].

We capture this regularity in the denominator by changing how we store the probabilities. Instead of using an intermediate structure like Map a Rational (i.e. a map from outcome to fractional probability), we use Map Int (Map a Integer): the outer map is from number of cards drawn (aka depth, or d) and the inner map is from outcome to the numerator of the fraction; the depth inherently determines the denominator, as we have seen. Here’s the code:

```chanceMap' :: (Ord a, Ord b) => Int -> Cards a -> DrawM a b
-> Map Int (Map b Integer)
chanceMap' n cards (Done x)
| cardCount cards >= n = singleton 0 (singleton x 1)
| otherwise = empty
chanceMap' _ _ DrawFail = empty
chanceMap' n deck (DrawAny m) = chanceMap' (n+1) deck m
chanceMap' n deck (DrawOne f)
= mapKeysMonotonic (+1) \$ unionsWith (unionWith (+))
[fmap (toInteger firstCount *) <\$>
chanceMap' n (removeOneCard firstCard deck) (f firstCard)
| (firstCard, firstCount) <- toList \$ cardsMap deck]
```

We’ll come back to the Int parameter momentarily. If the draw is done already, we return a map from 0 depth to the outcome mapped to a frequency count of 1. If the draw failed, our map is empty. The Any case is related to the integer, so we’ll leave that aside. The last case is if (at least) one more card is needed. The list comprehension explores all the distinct cards (with associated frequencies) from the deck: in each case, we recurse with the deck minus the card we drew, and feeding the card to the draw to see what the outcome is. We merge the maps for the different cards by unions that sum the counts inside the inner map, and then we add one to all keys because everything we processed is actually one deeper than it thought it was. This key-changing operation is why I use Map Int instead of IntMap, although I guess I could pass the depth down the tree instead of building it in on the way back up. I haven’t profiled to check the difference.

The Int parameter is an optimisation. If at some point we need to draw any card from the deck and ignore it, that doesn’t actually affect the odds of what comes afterwards, provided there are still enough cards in the deck to complete the draw. This is evident in Pontoon: there are many cards drawn after yours and given to other players, but if you don’t know what they are, that doesn’t change the odds of the next cards you draw. So our optimisation is this: if you do draw any card from the deck, then instead of actually drawing, we just keep count of how many cards you wanted to do this with. As long as when we’re finished drawing, there are enough cards to draw and discard (which we keep track of with the Int parameter), that’s fine and the odds are unaffected. If there aren’t, the draw fails (as it would have done if we’d drawn the cards at the specified time).

#### Further Processing

Let’s take the example of drawing an ace in the first three cards from a full deck. We can specify this using our DrawM type thus:

drawUntil (== Ace) (Just 3) >> return ()

The paths that lead to a successful draw involve finding one of 4 aces (in 52 cards) or one of the other 48 (of 52) followed by one of the 4 (of 51) aces or one of the 48 (of 52) followed by one of the 47 (of 51) followed by one of the 4 (of 50) aces. Note how the denominators are determined by the depth, as previously discussed. The map returned by chanceMap’ therefore looks like this:

```fromList [(1, singleton () 4), (2, singleton () (48*4)),
(3, singleton () (48*47*4))]```

What happens to that is coded in the outer function:

```chanceMap :: (Ord card, Ord a) => Cards card -> DrawM card a -> Map a Rational
chanceMap deck m = (% head permutes) <\$> unionsWith (+) \$ elems \$
mapWithKey (\k v -> fmap ((permutes !! k) *) v) depthToCount
where
depthToCount = chanceMap' 0 deck m
maxDepth = maximum (0 : keys depthToCount)
deckSize = cardCount deck

permutes :: [Integer]
-- A lookup for (deckSize `permute`) . (maxDepth -)
permutes | maxDepth == 0 = [1, 1]
| otherwise = reverse \$ 1 : scanl1 (*)
[toInteger (deckSize - maxDepth + 1) .. toInteger deckSize]
```

Let’s pick through it. depthToCount calls chanceMap' to get the aforementioned map. We use a slight optimisation to share the permutes function. Let’s briefly use an example of drawing up to 5 cards from 52. We will want the values 48, 49*48, 50*49*48, 51*50*49*48 and 52*51*50*49*48 at various points; there’s no point duplicating the calculation involved, so we use scanl1 to calculate the last number in that list, but keep track of the intermediary values encountered along the way.

When we get back our earlier map for the probability of drawing aces fromList [(1, singleton () 4), (2, singleton () (48*4)), (3, singleton () (48*47*4))], we multiply each value in the inner map by the permutes value for its depth. For 3, which was the maximum depth, that’s actually 1. For 2 we multiply by 50, and for 3 we multiply by 51*50. So we end up with the calculation ((4 * 51*50) + (48*4 * 50) + (48*47*4 * 1)) / (52*51*50). This is just the standard way of calculating 4/52 + (48*4)/(52*51) + (48*47*4)/(52*51*50), by converting to a common denominator. The division and simplification only ever happens once; all the other operations are multiplication and addition. Our calculation simplifies to the Rational value 1201%5525, incidentally.

#### Shortening the Odds Calculation

So how long does this calculation take? For a game with, say, nine distinct cards then no matter how many times those nine distinct cards appear in the deck, to examine all paths of drawing, say, 5 cards is in the worst case (i.e. where no draw fails early) going to look at 9*8*7*6*5 different sequences, or around 17,000. However, for a deck of playing cards with 52 distinct cards, the same calculation is 52*51*50*49*48 or a bit over 300,000,000. Ouch! Playing cards are actually the degenerate case, in my opinion: I struggle to think of any other card system where each card is distinct (and only appears once). But they are quite widely used, so it would be nice to optimise where possible.

There is one further class of optimisations that can be made, but it must be made by the user of the library. Consider the Pontoon example. We were looking for certain combinations of ranks, but we didn’t care about suit at all. So rather than operate on some deck :: Cards PlayingCard, we can use the mapCards function to get: mapCards rank deck :: Cards Rank. In this latter deck of cards, we still have 52 cards, but only thirteen distinct: Two through Ace (or Ace through King, if you prefer), each with a frequency of four. This means that examining all possible cards requires the worst case of 13*12*11*10*9 sequences, or around 150,000 — an improvement by a factor of 2,000 on examining all the 52 separately! You’ll note that our original specifications in the previous post already feature this optimisation and use DrawM Rank a instead of DrawM PlayingCard a. In general, reducing the number of distinct cards like this will always produce a large speed-up. However, it cannot be done by the library itself because we can’t know which aspects the drawing function will be interested in examining.

The work described in this post and the last features in the new game-probability-1.1 release on Hackage.