Sharp Cards in Haskell: The Odds

September 8, 2010

Recap

In a previous post I introduced a way to specify the action of drawing cards, culminating in the DrawM type:

data DrawM card a = DrawFail | Done a | DrawOne (card -> DrawM card a)
                  | DrawAny (DrawM card a)

So a draw can fail, succeed (with a result), require another card to be examined first or require any card to be drawn first. There is a monad instance that makes building items of this type quite straightforward. What it allows is not just to draw the cards using random chance, but also to calculate probabilities based on those specifications of drawing. In this post I’ll look at how to calculate those probabilities.
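
For concreteness, here is one plausible way such instances could be written. This is only a sketch based on the DrawM declaration above, and the instances in the actual library may differ; the Functor and Applicative instances are included so the code compiles on modern GHC:

import Control.Monad (ap)

instance Functor (DrawM card) where
  fmap f (Done x)    = Done (f x)
  fmap _ DrawFail    = DrawFail
  fmap f (DrawOne g) = DrawOne (fmap f . g)
  fmap f (DrawAny m) = DrawAny (fmap f m)

instance Applicative (DrawM card) where
  pure  = Done
  (<*>) = ap

instance Monad (DrawM card) where
  return = pure
  Done x    >>= f = f x
  DrawFail  >>= _ = DrawFail
  DrawOne g >>= f = DrawOne (\c -> g c >>= f)
  DrawAny m >>= f = DrawAny (m >>= f)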

The Cards Type

Something that I omitted from my previous post was the definition of the Cards type. This represents a collection (deck) of cards. To aid in the way it is commonly used by the rest of the code, it is stored as a map from card to frequency, with the total card count cached:

data Cards a = Cards { cardsMap :: Map a Int, cardCount :: Int }

We can then define helper functions such as this one to remove one card:

removeOneCard :: Ord a => a -> Cards a -> Cards a
removeOneCard x (Cards m s) = Cards (update (`maybeMinus` 1) x m) (s - 1)

-- Data.Map's update deletes the key when the function returns Nothing,
-- so a card whose frequency drops to zero disappears from the map:
maybeMinus :: Int -> Int -> Maybe Int
maybeMinus x y | x <= y = Nothing
               | otherwise = Just (x - y)

There is also this function that is like fmap (but we can’t declare an actual Functor instance due to the Ord constraint):

mapCards :: (Ord b) => (a -> b) -> Cards a -> Cards b
mapCards f (Cards m s) = Cards (mapKeysWith (+) f m) s

Calculating the Odds

The strategy for calculating the probabilities for a draw (i.e. something of type DrawM card a) is fairly straightforward. In the cases where the draw requires another card, we give it another distinct card from the deck — and keep doing this until it fails or is done. But we can make a couple of optimisations.

If we use the Event type that I explained in a previous post for doing the card drawing, we end up calculating the Rational probability for all the different outcomes, then summing them all afterwards. Adding two rationals is a slow process: it involves multiplication to get a common denominator, then a simplification of the fraction — for each addition. Adding thousands or millions of rationals will get expensive.

In this case, however, the fractions involved will be fairly regular. The chance of a card on the first draw will be that card’s frequency divided by the total number of cards (e.g. 4/52 probability of getting an ace), so we have different numerators over a common denominator. Similarly for the second card, that will be the appropriate adjusted frequency divided by the total number of cards minus one, and so on. For example, the chance of getting an ace on the second card is 4/52*3/51 + 48/52*4/51, or to put it another way: (4*3 + 48*4)/(52*51); the denominator will always be common for the same number of cards drawn. So the denominator at any given point will be n!/(n-d)! where d is the number of cards drawn (aka depth), or to put it in Haskell: product [(n-d+1)..n].

We capture this regularity in the denominator by changing how we store the probabilities. Instead of using an intermediate structure like Map a Rational (i.e. a map from outcome to fractional probability), we use Map Int (Map a Integer): the outer map is from number of cards drawn (aka depth, or d) and the inner map is from outcome to the numerator of the fraction; the depth inherently determines the denominator, as we have seen. Here’s the code:

chanceMap' :: (Ord a, Ord b) => Int -> Cards a -> DrawM a b
           -> Map Int (Map b Integer)
chanceMap' n cards (Done x)
  | cardCount cards >= n = singleton 0 (singleton x 1)
  | otherwise = empty
chanceMap' _ _ DrawFail = empty
chanceMap' n deck (DrawAny m) = chanceMap' (n+1) deck m
chanceMap' n deck (DrawOne f)
  = mapKeysMonotonic (+1) $ unionsWith (unionWith (+))
      [fmap (toInteger firstCount *) <$>
        chanceMap' n (removeOneCard firstCard deck) (f firstCard)
      | (firstCard, firstCount) <- toList $ cardsMap deck]

We’ll come back to the Int parameter momentarily. If the draw is done already, we return a map from 0 depth to the outcome mapped to a frequency count of 1. If the draw failed, our map is empty. The Any case is related to the integer, so we’ll leave that aside. The last case is if (at least) one more card is needed. The list comprehension explores all the distinct cards (with associated frequencies) from the deck: in each case, we recurse with the deck minus the card we drew, and feeding the card to the draw to see what the outcome is. We merge the maps for the different cards by unions that sum the counts inside the inner map, and then we add one to all keys because everything we processed is actually one deeper than it thought it was. This key-changing operation is why I use Map Int instead of IntMap, although I guess I could pass the depth down the tree instead of building it in on the way back up. I haven’t profiled to check the difference.

The Int parameter is an optimisation. If at some point we need to draw any card from the deck and ignore it, that doesn’t actually affect the odds of what comes afterwards, provided there are still enough cards in the deck to complete the draw. This is evident in Pontoon: there are many cards drawn after yours and given to other players, but if you don’t know what they are, that doesn’t change the odds of the next cards you draw. So our optimisation is this: if you do draw any card from the deck, then instead of actually drawing, we just keep count of how many cards you wanted to do this with. As long as when we’re finished drawing, there are enough cards to draw and discard (which we keep track of with the Int parameter), that’s fine and the odds are unaffected. If there aren’t, the draw fails (as it would have done if we’d drawn the cards at the specified time).

Further Processing

Let’s take the example of drawing an ace in the first three cards from a full deck. We can specify this using our DrawM type thus:

drawUntil (== Ace) (Just 3) >> return ()

The paths that lead to a successful draw involve finding one of the 4 aces (in 52 cards); or one of the other 48 (of 52) followed by one of the 4 (of 51) aces; or one of the 48 (of 52), followed by one of the 47 (of 51), followed by one of the 4 (of 50) aces. Note how the denominators are determined by the depth, as previously discussed. The map returned by chanceMap' therefore looks like this:

fromList [(1, singleton () 4), (2, singleton () (48*4)),
          (3, singleton () (48*47*4))]

What happens to that is coded in the outer function:

chanceMap :: (Ord card, Ord a) => Cards card -> DrawM card a -> Map a Rational
chanceMap deck m = fmap (% head permutes) $ unionsWith (+) $ elems $
  mapWithKey (\k v -> fmap ((permutes !! k) *) v) depthToCount
  where
    depthToCount = chanceMap' 0 deck m
    maxDepth = maximum (0 : keys depthToCount)
    deckSize = cardCount deck

    permutes :: [Integer]
    -- A lookup for (deckSize `permute`) . (maxDepth -)
    permutes | maxDepth == 0 = [1, 1]
             | otherwise = reverse $ 1 : scanl1 (*)
                    [toInteger (deckSize - maxDepth + 1) .. toInteger deckSize]

Let’s pick through it. depthToCount calls chanceMap' to get the aforementioned map. We use a slight optimisation to share the permutes function. Let’s briefly use an example of drawing up to 5 cards from 52. We will want the values 48, 49*48, 50*49*48, 51*50*49*48 and 52*51*50*49*48 at various points; there’s no point duplicating the calculation involved, so we use scanl1 to calculate the last number in that list, but keep track of the intermediary values encountered along the way.

When we get back our earlier map for the probability of drawing aces, fromList [(1, singleton () 4), (2, singleton () (48*4)), (3, singleton () (48*47*4))], we multiply each value in the inner map by the permutes value for its depth. For depth 3, which was the maximum depth, that multiplier is 1. For depth 2 we multiply by 50, and for depth 1 we multiply by 51*50. So we end up with the calculation ((4 * 51*50) + (48*4 * 50) + (48*47*4 * 1)) / (52*51*50). This is just the standard way of calculating 4/52 + (48*4)/(52*51) + (48*47*4)/(52*51*50), by converting to a common denominator. The division and simplification only ever happens once; all the other operations are multiplications and additions. Our calculation simplifies to the Rational value 1201%5525, incidentally.
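
You can check that arithmetic directly in GHCi with Data.Ratio:

> import Data.Ratio
> (4*51*50 + 48*4*50 + 48*47*4) % (52*51*50)
1201 % 5525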

Shortening the Odds Calculation

So how long does this calculation take? For a game with, say, nine distinct cards, no matter how many times each of those cards appears in the deck, examining all paths of drawing, say, 5 cards will in the worst case (i.e. where no draw fails early) look at 9*8*7*6*5 different sequences, or around 15,000. However, for a deck of playing cards with 52 distinct cards, the same calculation is 52*51*50*49*48, or a bit over 300,000,000. Ouch! Playing cards are actually the degenerate case, in my opinion: I struggle to think of any other card system where each card is distinct (and only appears once). But they are quite widely used, so it would be nice to optimise where possible.

There is one further class of optimisations that can be made, but it must be made by the user of the library. Consider the Pontoon example. We were looking for certain combinations of ranks, but we didn’t care about suit at all. So rather than operate on some deck :: Cards PlayingCard, we can use the mapCards function to get: mapCards rank deck :: Cards Rank. In this latter deck of cards, we still have 52 cards, but only thirteen distinct: Two through Ace (or Ace through King, if you prefer), each with a frequency of four. This means that examining all possible cards requires the worst case of 13*12*11*10*9 sequences, or around 150,000 — an improvement by a factor of 2,000 on examining all the 52 separately! You’ll note that our original specifications in the previous post already feature this optimisation and use DrawM Rank a instead of DrawM PlayingCard a. In general, reducing the number of distinct cards like this will always produce a large speed-up. However, it cannot be done by the library itself because we can’t know which aspects the drawing function will be interested in examining.
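
To make that concrete, the ace example from earlier could be run against a rank-reduced deck along these lines; fullDeck and rank are hypothetical names assumed to be defined by the caller, while chanceMap, mapCards and drawUntil are the functions discussed in this post:

-- Hypothetical usage: fullDeck :: Cards PlayingCard and
-- rank :: PlayingCard -> Rank are assumed to exist in the calling code.
aceInThree :: Map () Rational
aceInThree = chanceMap (mapCards rank fullDeck)
                       (drawUntil (== Ace) (Just 3) >> return ())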

The work described in this post and the last features in the new game-probability-1.1 release on Hackage.


Nice Dice in Haskell

August 13, 2010

Carrying on my theme of (board/card) game design, I recently wanted to play around with probabilities and dice. I wanted to find out the frequencies of the outcomes for various combinations of dice, and things like the probability of getting a repeated result on that combination. I began to envisage a neat way to represent dice in Haskell — and then a quick google led to discovering that the representation had already been thought of, published and released on Hackage. That saves a fair bit of work, at least. I lightly wrapped the probability library with the API that I wanted, and just released my game-probability library/EDSL on Hackage. The library has documentation and examples; I will focus more on looking at the design rationale in this post.

The Dice API

The central type is EventM a, which is a probabilistic event with outcomes of type a: each EventM a contains one or more outcomes of type a, each with an associated probability stored as a Rational. That means we get exact fractions, so 1/6th really is stored as 1/6th, not the nearest approximation that a Double can manage. For those who are interested in how EventM relates to the underlying probability library, it builds on the data-type from Numeric.Probability.Distribution:

import Numeric.Probability.Distribution (T, decons, norm, uniform)

newtype EventM a = EventM (T Rational a)

type DieRoll = EventM Int

With the aid of a helper function we can immediately define lots of standard dice:

makeEvent :: [a] -> EventM a
makeEvent = EventM . uniform

d4, d6, d8, z10, d10, d12, d20, d100 :: DieRoll
d4 = makeEvent [1..4]
d6 = makeEvent [1..6]
d8 = makeEvent [1..8]
z10 = makeEvent [0..9]
d10 = makeEvent [1..10]
d12 = makeEvent [1..12]
d20 = makeEvent [1..20]
d100 = makeEvent [1..100]

For those who don’t know their dice, d6 is a standard abbreviation for a 6-sided die. d10s have 0 to 9 written on the faces, and depending on the game the zero may either be read as 0 or as 10: d10 is the 1-10 interpretation, while z10 is the 0-9 interpretation. This is consistent with dice notation (wikipedia really does have a page for everything!).

When you show a die, you get a faithful but not especially readable representation:

> d6
fromFreqs [(1,1 % 6),(2,1 % 6),(3,1 % 6),(4,1 % 6),(5,1 % 6),(6,1 % 6)]

We can improve on that by customising the Show instance for our EventM wrapper. What would be really nice is a more visual representation of the chances associated with each event. To that end, I’ve chosen to display a horizontal bar-chart of outcomes with their associated probabilities. We find a common denominator for all the different probabilities, scale them up to integers and print them out with the corresponding bars. Now when we ask to see a die, we get a nicer visualisation:

> d6
1: #
2: #
3: #
4: #
5: #
6: #
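
As a rough idea of how such an instance could be written, here is a sketch that scales every probability up by the lowest common denominator and prints that many bars per outcome. It is not the library's actual instance (for one thing, it ignores column alignment):

import Data.List (sortOn)
import Data.Ratio (denominator, numerator)

instance (Ord a, Show a) => Show (EventM a) where
  show (EventM d) = unlines
      [show x ++ ": " ++ replicate (bar p) '#' | (x, p) <- outcomes]
    where
      outcomes = sortOn fst (decons (norm d))
      -- Lowest common denominator of all the probabilities:
      common   = foldr (lcm . denominator) 1 (map snd outcomes)
      bar p    = fromInteger (numerator (p * fromInteger common))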

Of course all the standard dice above have an equal chance of getting each number, so they’re quite dull. What we really want it for is more complex combinations, like for example the sum of a d4 and a d6:

> d4 + d6
2 : #
3 : ##
4 : ###
5 : ####
6 : ####
7 : ####
8 : ###
9 : ##
10: #

Note that we added two dice together there! The DieRoll type has a Num instance which allows dice to be added together. Intuitively, adding two dice adds their outcomes, and works out the corresponding probability for each summed outcome. We’ll come back to the details of the Num instance later, but we can also add constant factors to the dice:

> d4 + 1
2: #
3: #
4: #
5: #

And we can multiply by a constant:

> d6 * 2
2 : #
4 : #
6 : #
8 : #
10: #
12: #

Note that 2 * d6 is the same as d6 + d6, while d6 * 2 scales the outcome of the d6 by 2. So multiplication of dice is not commutative! This fits with how dice are usually written, and saves some confusion at the expense of causing other confusion (we’ll come back to this aspect later as well):

> 2 * d6
2 : #
3 : ##
4 : ###
5 : ####
6 : #####
7 : ######
8 : #####
9 : ####
10: ###
11: ##
12: #

We can actually allow rolling of the dice using a small roll :: DieRoll -> IO Int function:

> replicateM 10 (roll d6) 
[6,5,6,2,5,5,2,3,5,3]
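
Incidentally, one plausible way a function like roll could be implemented (not necessarily what the library does) is to pick a uniform random number in [0,1) and walk down the list of outcomes, subtracting probabilities until the number falls inside an outcome's bucket:

import System.Random (randomRIO)

-- A sketch only; roll' is a made-up name to avoid clashing with the
-- library's own roll function.
roll' :: EventM a -> IO a
roll' (EventM d) = do r <- randomRIO (0, 1 :: Double)
                      return (pick (toRational r) (decons d))
  where
    pick _ [(x, _)] = x
    pick r ((x, p) : rest)
      | r < p       = x
      | otherwise   = pick (r - p) rest
    pick _ []       = error "roll': no outcomes"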

With a little bit of point-free Haskell (which I’ll leave you to dig through if you’re interested) we can check that the frequencies roughly pan out if we roll lots of times:

> map (head &&& length) . group . sort <$> replicateM 10000 (roll d6)
[(1,1701),(2,1681),(3,1669),(4,1619),(5,1667),(6,1663)]

So that sorts out rolling the dice. However, the benefit of a reified probability representation is that we can also perform queries on the resulting distributions.

Equality

A DieRoll stores probabilities associated with each outcome. We can use these to prove equality of different dice rolls. A simple example:

> d6 + d6 + d6 == 3 * d6
True

Returning to our earlier mention of d10 and z10, one way to get from a z10 to a d10 is simply to add one to the former:

> z10 + 1 == d10
True

That’s not actually the way we tend to map between the two with real dice, though. We’ll define a handy helper function, subst, which changes one specific outcome to another, and use this to show that reading a 0 on the low die as a 10 accomplishes the same effect:

subst :: Eq a => a -> a -> EventM a -> EventM a
subst tgt rep = fmap (\x -> if x == tgt then rep else x)

> subst 0 10 z10 == d10
True

We can also check various ways for converting two d10/z10 to a d100, for example:

> z10 * 10 + d10 == d100
True
> subst 0 100 (z10 * 10 + z10) == d100
True

Further Queries

As well as checking equality of the entire roll, we can query specific probabilities. The chancePred function checks the chance of a particular predicate being satisfied:

> chancePred (>= 16) (3*d6)
5 % 108

So 5/108 chance of getting 16 or higher when rolling 3 six-sided dice. You can use the similar function chanceRel to check the chance that a particular relation between two events is satisfied. For example, you might want to know the chance of a d20 beating 3d6:

> chanceRel (>) d20 (3*d6)
19 % 40

The Full Monad

EventM is a monad (and functor and applicative functor, of course). This means you can perform all sorts of complicated combinations of dice — for example, if you want to prototype a game involving dice, you can write your rules in Haskell and see the chances of various outcomes. To demonstrate this, I’m going to borrow an example from the library documentation. Bonus points if you spot which game this comes from without looking at the library docs.

You roll a given number of 10-sided dice, rolling one extra die for every 10 you score if you are specialised in that skill. The number of 1s on the original roll is subtracted from the total number of dice that equal or exceed the difficulty target:

successes :: Int -> Int -> Bool -> EventM Int
successes target dice specialised
   = do initial <- replicateM dice d10
        extra <- if specialised
                   then replicateM (count (== 10) initial) d10
                   else return []
        return (count (>= target) (initial ++ extra) - count (== 1) initial)
   where
     count f xs = length (filter f xs)
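
Having built that event, you can query it just like the plain dice earlier; for example, this hypothetical query asks for the chance of at least one net success on five unspecialised dice against difficulty 7 (result not shown here):

atLeastOneSuccess :: Rational
atLeastOneSuccess = chancePred (>= 1) (successes 7 5 False)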

The exact mechanics of that system caused much dispute back in the day with other players. I still may not have it correct now, but I can tell you that an unambiguous description like the above would have prevented much confusion! I wonder if we could get all game-writers to write down their intricate dice-based systems in Haskell.

The Inner Workings — Num

Finally, I’ll explain the inner workings of the probability type. You can think of EventM a as being a synonym for Map a Rational, where the Rational type represents a probability as explained earlier. (Due to some problems with type-class constraints, it has to be held in a list, but same idea.) We can imagine fromList and toList functions (like those for Map) to help us in this section:

toList :: EventM a -> [(a, Rational)]
fromList :: [(a, Rational)] -> EventM a

We’ll assume that fromList adds the probabilities associated with any duplicates in the list, i.e. it acts like Map.fromListWith (+). Armed with these, I’ll explore the Num instance for EventM Int (aka DieRoll).

Addition and subtraction

Let’s start with: what does it mean to add two dice? For each combination of outcomes from the two dice, their probabilities are multiplied and their outcomes added; any duplicate outcomes (e.g. 1 and 6 give the same sum as 2 and 5) then have their probabilities added together at the end (which we’ve said fromList will do):

addDice :: DieRoll -> DieRoll -> DieRoll
addDice a b = fromList [(ax+bx, ap*bp) | (ax,ap)<-toList a, (bx,bp)<-toList b]

Subtracting one die from another is trivially similar. So how do we add constants to a die? Well, one way to do it — and perhaps in a sense the most obvious way — is to use EventM’s functor instance. Using fmap on EventM a applies a pure function to all the outcomes of the die, so fmap (+3) would add the constant value 3 to each outcome. However, there is another way. We can convert 3 to a die by making it a certain outcome:

certain :: a -> EventM a
certain x = fromList [(x, 1)]

So every time you roll the “die” certain 3, it will come up with a 3 (talk about a loaded die!). Now think about the result of addDice (certain 3) d for some die d: all the probabilities of the second die will be multiplied by 1, leaving them untouched, while the outcomes will have 3 added to them: exactly the effect we wanted. So hopefully you can see that we can form a Num instance for our dice (which we’ve used throughout this post), where (+) = addDice, negate = fmap negate, and fromInteger = certain . fromInteger. But what about multiplication?

Multiplication

From a user’s perspective, multiplication of, say, 2 and d6 could be taken to mean one of two things — either: roll one d6 and multiply all outcomes by two, or: roll two d6 and add them together. We can implement the former using our certain function and a trivial modification to addDice:

multDice :: DieRoll -> DieRoll -> DieRoll
multDice a b = fromList [(ax*bx, ap*bp) | (ax,ap)<-toList a, (bx,bp)<- toList b]

It should be obvious from the definition that this function is commutative. multDice (certain 2) d6 multiplies all outcomes of the d6 by 2, and multDice d6 (certain 2) does the same. multDice (certain 3) (certain 4) is the certain outcome 12, and multDice d4 d4 rolls two d4 and multiplies their outcomes, giving this interesting distribution:

1 : #
2 : ##
3 : ##
4 : ###
6 : ##
8 : ##
9 : #
12: ##
16: #

The multDice function is a consistent, commutative definition of multiplication. But it holds a hidden surprise for the user if we use this for our definition of (*): 2 * d6 is not the same as d6 + d6! The former doubles the outcome of one d6 (if we use multDice), whereas the latter adds the outcomes of two d6. I think this is sufficiently surprising that it is worth exploring the other alternative: what if we define a function that takes its first argument and rolls that many of its second argument (summing the outcomes):

rollN :: DieRoll -> DieRoll -> DieRoll
rollN a b = do n <- a
               sum <$> replicateM n b

That uses the monad instance, which I haven’t got into much in this post, but hopefully its meaning is clear: find out the result of the first die roll, then roll that many of the second die, summing the results. We can work out that rollN (certain 2) d6 is the same as d6 + d6. What do you think the result will be of rollN d6 (certain 2)? Well, we roll the d6, then that many times we receive the certain outcome 2, and add all these together… which means we effectively scale the d6 result by 2 — the other behaviour we were using earlier! rollN (certain 3) (certain 4) is still the certain outcome 12, while rollN d4 d4 rolls one d4 to determine how many further d4 to roll, giving a rather complex distribution!

This rollN function has a nice parallel to dice notation where 2d6 means roll two d6, whereas d6 * 2 means double the result of a single d6. So even though it’s not commutative, I prefer this multiplication operator for our dice EDSL. (I believe that rollN is associative, but I don’t have a proof for this.) So (*) = rollN in my library, as being the most useful of two potentially-confusing options. The documentation tries to make clear how it works.
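
Putting the pieces from this and the previous section together, the Num instance could look something like the following sketch. The library's actual instance may well be more general, and the choices for abs and signum here are just one plausible option:

{-# LANGUAGE FlexibleInstances #-}

instance Num (EventM Int) where
  (+)         = addDice
  (*)         = rollN
  negate      = fmap negate
  abs         = fmap abs
  signum      = fmap signum
  fromInteger = certain . fromInteger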

Summary

The library/EDSL is up on Hackage now for you to experiment with. I mainly intend it to be played with using ghci or small Haskell files for experimenting with dice combinations and calculating odds, but there’s nothing stopping it being used for concisely specifying actual dice in some RPG-related program. The library uses some functionality from the probability library developed by Martin Erwig and Steve Kollmansberger and maintained by Henning Thielemann. The title of this post no doubt invites disagreement, but I couldn’t resist the rhyme.


SVGutils, or: Doomed to Program

August 11, 2010

Recently, I’ve been taking a bit of a break from programming in my spare time. In this vein, I’ve been playing some board/card games — and, having played a few, I thought it might be fun to try my hand at designing my own card game. I think some people prototype card games using index cards, but that seems to me inefficient if you need duplicate cards: if you need twenty of a particular card, that obviously calls for mass production by printing rather than scribbling by hand. So I decided to at least use a computer to help produce a prototype — but the computer use would only be for doing a bit of text on some cards and then printing them out. No programming involved.

Input

Card games tend to consist of different text/images, often with duplicates of each card, on only a handful of templates. I could knock up a template, copy it many times and then change the text on each copy. But what if I then want to change the template? This is a maintainability issue: clearly we should separate the templates from the text and images to be used on them. But that means we’ll need a way to substitute all the text and images into the template when it comes time to print. Hmmm. I tend to use Inkscape for vector graphics editing, and since Inkscape stores SVG files, which are XML-based, we can actually do simple text-based substitution on the SVG files to put in our custom text (and image file names) for each card. A fairly simple scripting task which only involves a modicum of programming.

We will need a format to store the bits of text (e.g. card title, card type, main text) to be substituted for the various cards. We could store the data for cards in a CSV-like format, with a different column for each bit of text, and a row per card. But many cards will probably share the same substitution values: for example, they may have the same card-type, like “instant” or “attack”. It seems a shame to have to type that out many times, and it is also bad practice that again impacts maintainability. So some sort of grouping via a hierarchical format would help here, to share values among groups of cards. YAML is a nice lightweight markup language, which is easier to write manually than XML, so that’s a good candidate. YAML libraries exist already, so that’s not adding too much burden. And in the long run it will save effort, so that excuses a bit more programming.

Since we have this substitution mechanism, we can save ourselves even more repetition. Sometimes you have cards that are fairly similar to each other in content, perhaps only changing a particular word. For example, you might have bronze, silver and gold cards of the same variety. Or you might want to have something like classic playing cards: each combination of rank and suit, but you wouldn’t want to write out 52 cards. Much better to write the 4 suits and 13 ranks and then express that you want each possible combination. So we can add some basic “for-each” functionality to our YAML markup (don’t worry, it’s not going to end up Turing-complete — I think!). A satisfying addition that shrinks the card definitions further, which makes the added programming worthwhile.

A template plus a concise definition (the complete YAML file is shown) allows easy generation of card groups.

Note: This isn’t the software I’m releasing today (SVGutils just does the tiling described in the next section) but in time I will probably release the card creation software too.

Output

Once the substitution into the templates is done we’ll need to print the cards (I’m only printing on a standard desktop printer) — but all the cards will be in separate post-substitution files. It would be fiddly to print them all, and would be a terrible waste to have one card per sheet. What would be really useful is a program to tile the cards together onto a handful of printable A4-size pages so that it’s easy for me to print them out and cut them up. And extra nice would be the ability to print them double sided to give the cards backs automatically. More programming, but a big saving in effort for physically creating the cards.

How Did It Come To This?

So having started with the intention of designing a non-computer-related card game, I ended up a few hours later knee-deep in YAML and SVG processing. Once you’re a programmer, I think you just can’t help yourself when you see tasks like these that call for automation and tools. Doomed to program! Haskell was an obvious choice of language to make the process enjoyable. In the next section I’ll explain which Haskell libraries I ended up using (for those that might be interested), and then I’ll finally talk about the library I wrote and released to do the tiling for printing: SVGutils.

A Tour of Haskell Libraries

Haskell has Hackage, a centralised repository of lots of Haskell libraries and programs. I was pleased to find that most of the libraries I needed were already on Hackage.

YAML

The first thing I wanted was a library to read in YAML. I’d come across YAML during some Ruby on Rails work, and it seemed like a nice way to store some structured information about the cards. YAML is nicer than XML to write in a text editor, which is what I was going to be doing. Hackage had several libraries relating to YAML, which I quickly whittled down to:

  • yst: The yst library was actually in the same vein as several parts of the project, seeing as it provided the ability to read from YAML files and substitute into templates. From a read of its documentation, it’s centred around HTML, and I decided it looked like it would be easier to start from scratch than adjust a sizeable existing application (I may have been wrong, though).
  • data-object-yaml: A library with a pleasingly short API, and a quick click on its main data type for representing YAML data looked good:
    data Object key val =
        Mapping [(key, Object key val)]
        | Sequence [Object key val]
        | Scalar val

    That’s nice and clear. You can use the richer YamlObject type as key and value, or you can just use String or Text if you don’t need the anchors and references that YAML supports (I don’t).

  • HsSyck: Another good candidate, but its types always use the richer YAML format, and with an abstract text type (which is actually ByteString), which isn’t as handy as being able to use String or Text.

So I chose data-object-yaml, based on looking at the API.

Templating

The next thing I needed was a library for performing string substitution in the SVG files (aka templating). There are many template libraries on Hackage. Some of them are specifically targeted at HTML, so I skipped over those. Some can work on XML, which is a possibility given that I’m working with SVG files (SVG is an XML-based format). But that requires me to put comments or custom tags into the XML file, which is slightly awkward to do from Inkscape. The simplest solution was to use the small template library. This can do simple substitution, substituting $foo in the document for whatever value you map from the key “foo”. This allowed me to construct my SVG files almost exactly as I wanted them, but put “$Name” as the content of the Inkscape text object where I wanted the name (which then showed on my screen as “$Name” while I fiddled with the font, centring, etc). Then I could use the template library to substitute for “$Name”, and my name text would end up exactly where I wanted it in the image. So the solution that knew nothing about the structure of the SVG file turned out to be the easiest. The template library lacked one or two functions that I needed, so I downloaded the source and tweaked it, then submitted some patches which ended up contributing to the latest version. Open source in action!

SVG/XML

Finally, I needed a library that could tile my resulting SVG cards together for printing. A quick search on Hackage for SVG revealed a dearth of SVG tools, and certainly nothing like what I needed. So it looked like I was going to have to write that one myself. Thankfully there were many libraries on Hackage for dealing with XML, so I could at least get the SVG documents parsed as a generic XML document ready for me to do some SVG-specific bits. Since there are many XML libraries, I needed to narrow down the choice a bit. I remembered that Don Stewart had recently posted about popular libraries on Hackage in various categories (including XML), so I decided to follow the crowd and pick from the ones that were popular: xml, HaXml and hexpat.

HaXml can be used in a totally different way to xml and hexpat. HaXml can generate a Haskell source file with a data-type that matches the DTD or XSD schema of an XML file (in this case, the SVG spec), to allow you to manipulate the document in a type-safe manner. I’m all for type-safety, and I had a quick try with HaXml. The file it generated with the complete Haskell types for the SVG spec was certainly large — lots of types, and many of the attribute types had towards 100 members. The thing that put me off this approach is that XML documents often have cross-cutting concerns. For example, many items in SVG have the color attribute. In the DtdToHaskell output, each color attribute is a differently-named member of a different attributes item (and all have the type Maybe String, which doesn’t help to mark them out). If I want to write a program to manipulate the colours in an SVG file, then with the DtdToHaskell approach I have to write a lot of code to pick out each color. With the standard approach that treats XML files very uniformly it’s actually much easier, as I just skim down the tree looking for any color attributes. I glanced at the API for xml and hexpat — there didn’t seem much to distinguish them, but the former had simpler central types so I went with that.

SVGutils

So I had most of the bits I needed except for code dealing with combining many SVG files together into one for printing. That bit I wrote myself, and have bundled up the code into a package I’ve called SVGutils. It’s a light wrapper around the XML library that has a few functions for dealing with SVG files: primarily, tiling several into one. It has both a library and a command-line utility for doing this tiling. The tiling works by finding out the size of each SVG file and then placing the contents appropriately in a larger file with a paper size of your choosing, which can then be converted into PDF using Inkscape and stuck together using PDFtk (is it worth integrating this into the tool, I wonder?). It supports having “backs” for each file, which will also be tiled correspondingly (but mirrored) so that you can print out both the front and back then stick them together before cutting them up, or just print out the whole document double-sided. I’ve no idea if anyone other than me will find this useful, but it seems worth sharing, so it’s now on Hackage. In time I may expand the SVGutils library with more functionality — patches also welcome. I don’t expect it to end up as a library doing very complex SVG manipulations or rendering, but time will tell.

An example output page from tiling the card SVG files together.

The End Result

So I’ve ended up releasing an SVG library, and have also got my complete program for taking in a YAML file and SVG templates and producing a PDF of the resulting images. What I had actually set out to do was to design a card game. I did actually finish the first version, and tested it last weekend. It turns out that, like most things, game design is harder than it first appears. More work needed, when I can tear myself away from programming!


Beginners’ Programming Environments, and Haskell

July 15, 2010

Greenfoot is a beginner’s programming environment for ages 13 upwards. It is a programming framework and IDE for Java targeted towards games and simulations: anything where the main output is 2D graphics and sound. Here are some screenshots:

Greenfoot has several nice aspects (disclaimer: I’m a developer on the Greenfoot project). One is that the visual setting allows instant feedback on what your code is doing. You write a line of code to get your crab moving across the screen, and two clicks later your code is running and the crab is moving around. This sort of instant and obvious feedback is very useful when you are beginning programming. (Actually, it’s always useful — but we have to learn to do without as we progress to programming more complicated systems!) You can invoke methods on objects by right-clicking on the object and selecting the method from a pop-up menu, which is a little bit like a graphical version of a REPL. So if you are given a scenario with some pre-coded behaviour, you can interact with the objects before even getting to the code.

Another nice aspect of Greenfoot is that it embodies object-orientation well. Not everyone (and probably especially not Haskell programmers) is sold on OOP, but set that aside for this post. The object/class divide is immediately present: Wombat is a class, you can create many Wombat objects in the world that have the same fields and same behaviour, but different state. This is obvious from them running around separately: the concept comes across implicitly rather than having to be taught too explicitly, because of the design of the system.

A while ago people used to scoff at Java, declaring “public static void main(String[] args)” as a reason to stay away: too much is revealed upfront before the students can start programming. Incidentally, one could equally apply the same criticism to Haskell: the monadic type of the main function forces beginners to confront monads immediately. Greenfoot and other frameworks have shown this is easily dealt with. You will never see the main method in Greenfoot, and care is taken to steer beginners away from blank screens or from having to learn too much before they can get going. Writing Hello World starting with a blank file takes at least seven lines of Java code to print out a line of text, but adding one line of Java code in your first part-filled Greenfoot scenario makes a crab run across the screen. It isn’t the language that makes the difference for how hard it is to get started, it’s the framework.

The original designers of Greenfoot wanted to teach OOP to beginners, so they found a way to get across the key concepts of OOP easily, put it in a setting (interactive games) that can attract young beginners and produced the software (and supporting teaching material). Idea times execution.

What about Haskell?

This leads to my question: if a beginner’s environment can be produced for OOP and Java, surely it can be done for functional programming: specifically, Haskell. (The same question applies to the process-oriented programming model that CHP espouses — I believe the answer there is a clear yes, with a relatively close correspondence to the OOP design of Greenfoot, so it’s a bit less interesting to discuss.) Let’s toy with a potential design for a Greenfoot-like system in Haskell: that is, an interactive, graphics-based environment for beginner Haskell programmers. I’ll give our hypothetical system the name StartHaskell, just to make referring to it easier for the rest of the post.

First of all, what are the key concepts we would like people to learn in Haskell at the beginning? Function calls are central: without understanding functions and function calls, you can’t do anything in Haskell. Types are quite important, too — especially since they are bound to be the second class of error you encounter (your first will be a syntax error). These are very obvious: functions are as fundamental to functional programming as variables are to imperative programming, and types are very important in functional programming.

A key and distinguishing feature of Haskell is the notion of a monad. Different people have different opinions on monads. Is it best to introduce them via a simple monad like Maybe or State, or is it better to use the IO monad? A myriad of monad tutorials on the Internet provide all the different answers you can imagine. The IO monad is the one that does tend to push monads in the programmer’s path sooner rather than later: the main function has a monadic IO type. Should this be hidden by providing a main function that calls the programmer’s pure functions, or should it just be exposed straight away? Whether or not you expose monads is closely tied to the API design of StartHaskell.

API Designs

The notion of API in a programming environment like Greenfoot or StartHaskell actually swings both ways. There is the API that the StartHaskell system provides to the programmer, which will include things like functions dealing with images. But there is also the API that the programmer implements for the StartHaskell system — which is mainly what I want to discuss (and which has a heavy influence on the first kind, the system-provided API). In Greenfoot, every active thing in the world inherits from an Actor super-class, which has an act() method that the programmer implements to provide the functionality of that actor. These act() methods are called once each frame by the Greenfoot system: this is the API that the programmer implements. Since our hypothetical StartHaskell system will be functional rather than object-oriented, we will likely have one function to provide the functionality for the whole system (called once per frame) rather than bundling the functionality with individual objects. So let’s explore some different designs.

Option #1: Avoid monads entirely

We could focus entirely on pure functions. Conal Elliott’s picture work comes to my mind — and Simon Thompson’s Haskell book uses pure picture-based examples from the outset (using lists of lists of pixels, rather than Conal’s more abstract approach). You could conceive of a graphical framework that perhaps used a function from input-set to output-picture, i.e. step :: (KeyboardInput, MouseInput) -> Picture that would then be called each frame to produce an output picture given the input. Persistent state — that is, state that persists between frames, such as the position of the player in the world — is needed, so you might allow state to be returned as a result and passed to the next invocation, i.e. step :: (KeyboardInput, MouseInput) -> a -> (Picture, a). You could then provide various combinators for composing pictures that the programmer can use. (An approach based on Functional Reactive Programming, aka FRP, might also be possible — I don’t know enough FRP to talk about this, but feel free to write your own blog post explaining how FRP is the way to go!)
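
Purely as an illustration of the shape of Option #1, here is a sketch in which every name is invented for this post rather than taken from any real API:

{-# LANGUAGE EmptyDataDecls #-}

-- Hypothetical StartHaskell types, provided by the environment:
data KeyboardInput
data MouseInput
data Picture

-- The user writes a pure step function, threading their own state s:
type Step s = (KeyboardInput, MouseInput) -> s -> (Picture, s)

-- The environment would call the step function once per frame:
runScenario :: s -> Step s -> IO ()
runScenario = undefined  -- supplied by the environment, not the user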

Option #2: Have a pure/monad split

Kodu is a new, very polished system for young programmers that uses a visual approach based around “when/do” rules. The “when” part specifies a condition that constrains when the rule is run, and the “do” aspect specifies the actions to take when the condition is satisfied. This easily translates into an API such as: rule :: (Environment -> Bool, SomeMonad ()). Different rules could be combined via combinators, or by changing that API into a list of rules. (One problem I have with lists of these rule-based interfaces is precedence: when one rule fires, does that preclude all other eligible rules firing? Or do they all fire together — and if so, in what order?) The system could then provide functions for accessing the environment data-type, and various monadic actions for affecting the system, such as drawing. Persistent state would probably be handled by making the SomeMonad a state monad.

Option #3: Introduce monads immediately, using do-notation

A lot of people try to understand monads by looking at the type signatures and going from there. If you completely ignore how monads are implemented, and just look at do-notation, many monads (especially the IO monad) are very close to imperative programming, and thus this is one way to introduce them. Haskell is a very nice imperative language! In this case we might require the user to provide an API like step :: SomeMonad (), or to make persistent state more apparent, step :: a -> SomeMonad a. The system would then provide an API with monadic functions to ask about the world, and to affect the world.

Option #4: Don’t Call Me, I’ll Call You

In all my options above I have assumed that the StartHaskell system will call the programmer’s function each frame. This design is taken from Greenfoot, and has various benefits. One is that if the programmer’s function is empty, the surrounding system will still be rendering frames and receiving input. The interact function in Haskell has this same flavour: here’s a function that contains my desired behaviour, you handle the heavy lifting of doing the IO and call me at the appropriate point. One problem with being called by the system is state. A graphical interactive program such as those in Greenfoot (e.g. games) will tend to maintain state: the positions of things in the world and the state of the player. I mainly dealt with that above by allowing the programmer to return some state that will be passed back in next frame. I’ve waved my hand slightly over the type of the state: each program will have a different type for the state, but that shouldn’t be hard to deal with in the environment.

An alternative approach is to take control: I’ll write my own code to drive the program and will call you at the appropriate points. There could be a single call into the framework, like nextFrame :: SomeMonad (), that does everything necessary: waits until the next frame is due, renders it, reads all the input devices and so on. So the most basic program would be forever nextFrame. Not too heavy to get started. Then you could pass state around yourself more easily instead of having to return it to the StartHaskell environment to be passed back to you later.

API Summary

I don’t know which of the above options is best — there may be other better options that I haven’t considered. What I do know is that it is surprisingly important. What kind of API you ask the programmer to implement will have a big effect on the style of programming that they learn, and on the concepts that they will have to master. This is particularly true in languages where multiple styles of programming are supported, like Haskell or C++, whereas Java only really has one main style of programming.

Environment

Greenfoot is not just a software library/framework for graphics. It is a complete programming environment, including interactive invocation, debugging, buttons for compiling and running the program (with no build scripts in sight), and a source code editor. Having a complete environment helps beginners a lot. You do not want beginners who you are teaching any programming language to learn the compiler command and flags on day one: that can come later, the point is to teach them programming! It is also important to offer a simple environment: giving Eclipse to someone who hasn’t programmed before is just unnecessarily complicating their learning process. A beginner’s environment should be simpler than that of a professional programmer.

Editor/Compiler

I think one major obstacle in Haskell is likely to be its error messages, particularly those for types. (Is every error message in Haskell one of: a syntax error, a type error, or an unknown identifier?) There has been work done before, for example in Helium, to try to improve this for Haskell. A better approach to teaching functional programming as a whole might be to use a language with a less powerful type system that allows for simpler errors, but I’m constraining myself to use Haskell for this post. So if you do have to teach Haskell, how can you best help with type errors?

Here is a mock-up of something that I always thought would be useful for type errors:

The idea being that those call-outs appear one-by-one as you move through the source. I like it because it visualises the process that I go through in my head when I encounter a type error, and the computer can doubtless do it faster than me. However, it is always worth bearing in my mind that what is helpful for an expert (as I guess I am in Haskell) is not necessarily helpful for a beginner. You need some level of understanding of type inference to work with that representation, and if a beginner doesn’t have that understanding then the interface may only serve to confuse.

Thoughts on Teaching Programming

I am a beginner when it comes to teaching and educational research, but here’s a few thoughts I’ve picked up so far:

  • Consider the following as a guiding principle. There are only two reasons that things are hard to teach: what you are trying to teach needs to be improved (e.g. it needs more research), or you haven’t found a good way to present it. (Warning: this is an unfalsifiable statement!) Recursion is meant to be hard, but there are many who believe recursion (particularly structural recursion) is easy when taught right. Objects were meant to be hard, but I think objects-first works quite well at getting OO concepts across. Concurrency can be taught to beginners when it’s a good model (occam in combination with robotics has proved easy to teach to beginners). And so on and so on.
  • Beginners can benefit from tools that are specialised to beginners. It is helpful if both beginners and experts realise this! Beginners shouldn’t start with Eclipse because they’ve heard that’s what Real Developers(TM) use, and experts should not pour scorn on beginners’ tools because they are too simple or hide intricate detail — that’s the point!

None of what I have said here about teaching is new, and much of the advice here is already followed on many courses. Most teachers give out partially-complete code to avoid students beginning with a blank screen, and it’s well-known that the key to teaching is to find good examples and present the concepts in the right way: it’s much easier said than done! My main intention was to think about what an equivalent to Greenfoot for functional programming would be like. It may be that an interactive graphics-based setting is not a good fit for a first functional programming environment. Perhaps there are other alternative settings that could be used — but I think it should be something with a more widely appealing end result than calculating the Fibonacci numbers!


Haskell API Design: Scoped Resources

June 30, 2010

A common idiom in any programming language is that of acquiring a resource, using it, and then releasing (de-acquiring?) it. The resource may be a file, a network socket or all sorts of other things. The main challenges with supporting this programming idiom in an API are three-fold:

  1. Making sure that the programmer cannot use the resource before acquisition or after release.
  2. Making sure that the programmer remembers to release the resource once they are finished with it.
  3. Making sure that the resource is released if a problem occurs while it is acquired.

In Java you might implement this as a try/finally block (takes care of 3; not so good on 1 or 2). In C++ you might use the RAII pattern: acquiring in a constructor and releasing in a destructor (takes care of all three reasonably well). In Haskell, you can use things like the bracket function (written about in detail elsewhere) to write what we might call “with..” APIs. Some classic examples:

withMVar :: MVar a -> (a -> IO b) -> IO b
withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r

These functions take as an argument the block to execute while the resource is acquired. If only this API is offered, it is impossible to forget to release the resource: release happens as soon as the block terminates. And if an exception occurs, that is also taken care of. There is a slight hole with respect to challenge 1 above: see if you can figure it out (we’ll return to it at the end of the post).

In this post, I thought I’d explore a few uses of this “scoped resource” pattern in my CHP library which use Haskell’s types to indicate the acquisition status.

Sharing

CHP features communication channels that allow you to send values between different threads. For example, something of type Chanout Integer allows you to send integer values. CHP’s channels are designed to be used by a single reader and a single writer. Sometimes, however, you want multiple writers and/or multiple readers to be able to use the same channel. For example, you may want to have many worker processes communicating results to a single harvester process. This can be achieved using shared “any-to-one” channels, which can be constructed simply by adding a mutex to the sending channel end. This is another case of the scoped resource pattern: we want to allow claiming of the shared channel (acquisition), writing to the channel while it is claimed, and to make sure the channel is released afterwards.

We can define a Shared wrapper for any type that simply holds a mutex alongside it:

data Shared a = Shared Mutex a

When we export this type we will not export its constructor, meaning that users of our shared API will not have access to the mutex, and will not be able to take the channel out of its shared wrapper, except using our API function, namely:

claim :: Shared a -> (a -> CHP b) -> CHP b

(I perhaps should have called it withClaimed to fit in with the other Haskell APIs, but never mind.) This specialises in the case of our Chanout Integer into:

claim :: Shared (Chanout Integer) -> (Chanout Integer -> CHP b) -> CHP b

Note that Chanout is a channel that can be used for sending, with functions such as:

writeChannel :: Chanout a -> a -> CHP ()

This sharing mechanism uses an opaque Shared wrapper type (i.e. one that has hidden internals) to stop the channel being written to unless the claim function is used to acquire/unwrap it into a normal channel for the duration of the block.
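
For example, a worker process sending results over the shared channel might look like this; the value being sent is just a parameter here, and claim and writeChannel are the functions shown above:

-- Hypothetical worker: the channel is claimed only for as long as it
-- takes to send one result, then the mutex is released again.
worker :: Shared (Chanout Integer) -> Integer -> CHP ()
worker out result = claim out (\c -> writeChannel c result)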

Enrolling

A barrier in CHP is a communication primitive that supports a single primitive operation: synchronising on the barrier. The rule for using them is straightforward: when you want to synchronise on a barrier, you must wait until all other processes that are members of the barrier also synchronise, at which point you all do synchronise and continue onwards. CHP supports dynamic membership of barriers: processes can enroll on the barrier (become members) and resign (leave the barrier) at any time. This is again an instance of the scoped resource pattern: enrollment is acquisition, you can synchronise while you are a member, and you should leave again at the end.

With barriers, we wrap them when you are enrolled on them, and only allow synchronisation on enrolled barriers:

data Enrolled a = Enrolled a

syncBarrier :: Enrolled Barrier -> CHP ()

class Enrollable a where
  enroll :: a -> (Enrolled a -> CHP b) -> CHP b

As before, we don’t expose the details of the Enrolled type, so you can’t just wrap something up yourself and pretend that you Enrolled. The enroll function does the acquisition and release (aka enrolling and resigning), and during the block in the middle you can synchronise on the enrolled barrier.
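
For example, a process that joins a barrier for one phase of work could be written like this; the body of the phase is just a parameter:

-- Hypothetical: enroll for the duration of the block, synchronise with
-- the other members at the end of the phase, and resign automatically
-- when the block finishes.
phased :: Barrier -> CHP () -> CHP ()
phased b phaseWork = enroll b (\eb -> phaseWork >> syncBarrier eb)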

Different Ways to Wrap

We’ve seen two examples of scoped resources. In the Shared example, we used our wrapper to block access to the key functionality: we might call this negative wrapping. With negative wrapping you need to acquire the resource to remove the wrapper and use the object inside. In contrast, in the Enrolled example, we used our wrapper to enable access to the key functionality: we could correspondingly call this positive wrapping. With positive wrapping you need to acquire the resource to add the wrapper and use the object inside.

So which is better: positive or negative wrapping? With our negative wrapping, we were able to make use of the existing functions for channels once the value was unwrapped. Consider for a moment the possibility of adding a barrier type that had a fixed membership: no need to enroll. To fit with our existing API for synchronising on barriers, we would have to use the Enrolled Barrier type for our fixed-membership barrier, but that seems a little misleading. It might have been better to have a (negative) Unenrolled wrapper on barriers that is removed by enrolling, which would allow our API to allow synchronisation on a plain Barrier. Negative wrappers also allow for resources to be acquired in different fashions. You could have an Unenrolled (Chanout Integer) that can only be used by enrolling, which is orthogonal to a Shared (Chanout Integer); with positive wrappers this is not the case. So overall I think it is probably better to use negative wrappers.

(The other way to do it is to have totally separate types, e.g. FilePath and Handle in withFile, above. This is fine if you only have one specific scoped resource to deal with, but loses out on the chance to have a simple type-class like Enrollable, above, that encompasses the same operation (here: enroll) on multiple different types being wrapped.)

The Hole

I said earlier in the post that these scoped APIs have a problem (which exists for both positive and negative wrappers). Consider again the claim function:

claim :: Shared a -> (a -> CHP b) -> CHP b

Now consider this code:

do escapedChan <- claim sharedChan return
   writeChannel escapedChan "uh-oh!"

If you look at the code above, you’ll see it type-checks: the types “a” and “b” are both Chanout String. What has happened is that we have claimed the channel and then passed it back out of the claim block. So now the mutex has been released, but we still hold the inner channel, which we then use on the second line, outside the claim block. In general, I trust users not to make this mistake. But there are ways to fix it: the technique known as lightweight monadic regions does exactly that (paper here, subsequent library here). I’m going to try to convey the key idea (as best I understand it).

The idea behind improving the scoped resource API is to parameterise the acquired resource by the monad in which it is allowed to be used. Here’s a simple example intended to illustrate the idea:

{-# LANGUAGE RankNTypes, KindSignatures #-}

-- The bodies below are placeholder stubs so that the example compiles;
-- only the types matter for the technique being illustrated.

data UnacquiredResource = UnacquiredResource
makeUnacquiredResource :: IO UnacquiredResource
makeUnacquiredResource = return UnacquiredResource

data AcquiredResource (m :: * -> *) = AcquiredResource
useResource :: Monad m => AcquiredResource m -> m ()
useResource _ = return ()

class MonadIO m where liftIO :: IO a -> m a
instance MonadIO IO where liftIO = id

withResource :: UnacquiredResource ->
  (forall m. (Monad m, MonadIO m) => AcquiredResource m -> m b)
  -> IO b
withResource _ body = body (AcquiredResource :: AcquiredResource IO)

main :: IO ()
main = do u <- makeUnacquiredResource
          withResource u (\r -> do useResource r
                                   liftIO $ putStrLn "middle"
                                   useResource r)

What’s happening here is that the acquired resource has an extra type parameter indicating which monad it can be used in. The type signature of useResource makes sure you can only use the resource in its corresponding monad. Our withResource function uses a forall to make sure that the block you pass must work with any appropriate monad it is given; internally, withResource is then free to instantiate that monad to plain IO. In fact, we have ended up with two levels of protection in this API: the forall makes it hard to return the resource from the block (see the rank 2 types section of this post for more information), and even if the user did return the resource from the block, the type signature of useResource means it couldn’t be used afterwards outside the monad it originated from.
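
As a hedged illustration of the first level of protection, the analogue of the earlier escape trick is now rejected by the type-checker, because the resource’s monad parameter would have to escape the forall:

-- escaped = do u <- makeUnacquiredResource
--              r <- withResource u return   -- rejected: the block's monad escapes
--              useResource r                -- so the resource cannot leak out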

This is only a very basic version of lightweight regions: the actual implementation supports nesting of these scopes, and even allows releasing in a different order than scoping would usually enforce (e.g. acquire A, acquire B, release A, release B). The only problem with lightweight regions is that they create a formidable API: glance at the types in the library and see if you can follow them. As is often the case in Haskell, the cleverer API that makes extensive use of the type system ends up being quite hard to understand; CHP is also guilty of this in places. There is sometimes a trade-off between type-safety and usability, and I don’t think either should always win.

Categories: Uncategorized

Haskell API Design: Having your own Monad atop IO

June 28, 2010 9 comments

I was recently writing my PhD thesis (which is about CHP), and writing the conclusions got me thinking about weaknesses of CHP. (For newcomers: CHP is an imperative message-passing concurrency library for Haskell.) When you’ve been working on some software for several years, you want to believe it’s fantastic, wonderful, flawless stuff and that everyone should use it. But why might CHP not be the best thing ever? (Other suggestions are welcome, honest!) Some aspects of the API design might fit into this category, and in this post I’ll discuss one of the API choices behind CHP, namely: having a special CHP monad instead of using the IO monad that sits underneath. This may have wider relevance to other libraries that similarly build on top of IO.

I’m going to invert the flow of this post: later on I’ll explain why I created a CHP monad, but first I’ll discuss what problems it causes.

IO vs CHP

The CHP monad is a layer on top of the IO monad (think of it as a few monad transformers on top of IO). What difference does it make to the user to have the CHP monad instead of the IO monad that underlies it?

Lifting IO

CHP being a layer on top of IO means that you can “lift” IO actions into the CHP monad in order to use them in your CHP processes, but this lifting cannot be done automatically. In Haskell, all actions in a monadic block must be of the same type. So if you have an IO action like touchFile :: String -> IO () and the standard CHP function readChannel :: Chanin a -> CHP a, you cannot write:

do fileName <- readChannel fileNameChan
   touchFile fileName

The first action is CHP and the second is IO — Haskell can’t reconcile the two. There are two fixes for this. One is to use a package like monadIO, which defines several standard IO actions to have types like: touchFile :: MonadIO m => String -> m (), i.e. to be able to fit into any monad built on top of IO (some argue all IO-based functions should have a type like this). Then you can write the code above. But the monadIO package only supports a few core IO actions, and any time you use another IO-based library (like, say, network) you are stuck. The other fix is to use the liftIO_CHP :: IO a -> CHP a function that exists exactly for this purpose. That works, but it gets rather verbose:

do fileName <- readChannel fileNameChan
   liftIO_CHP $ touchFile fileName

If you have lots of IO and CHP actions interspersed, it clutters up the code and makes it less readable. What’s galling is that liftIO_CHP has no particular semantic effect; it’s only really needed (to my mind) to placate the type-checker.

This lifting problem is not specific to CHP, incidentally: it is a problem for all monad transformer libraries. Adding any monad transformer on top of IO requires lifting functions such as liftIO_CHP.
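
One partial mitigation, sketched here as an assumption rather than something the CHP library necessarily provides, is to give the CHP monad a MonadIO-style instance, so that any IO action already written against the MonadIO class can be used in CHP without an explicit lift:

instance MonadIO CHP where
  liftIO = liftIO_CHP

-- with touchFile :: MonadIO m => String -> m () from a monadIO-style library,
-- the earlier example then works directly in the CHP monad:
example :: Chanin String -> CHP ()
example fileNameChan = do fileName <- readChannel fileNameChan
                          touchFile fileName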

Just a little CHP

Imagine that you have written a program that doesn’t use CHP. For all the imperative I/O parts it probably uses the ubiquitous IO monad. Now you want to add some concurrency. If you use MVars, you just need to create them and use them: writing and reading MVars is done in the IO monad. The STM library has its own monad, but it is a monad in which you write a small transaction and then execute it in the IO monad, so the rest of your program can happily remain in IO. But if you want to add a little CHP, suddenly you need to be in the CHP monad! You can’t just use runCHP :: CHP a -> IO (Maybe a) at the points where you want to use CHP. Firstly, the library is not built to be used like that, and the API doesn’t support it. Secondly, if you have to tag all the CHP actions in the IO monad with runCHP, you’re not much better off than you were when you had to tag all the IO actions in the CHP monad with liftIO_CHP.

The “correct” way to add some CHP to your program is to adjust the whole program to be inside the CHP monad. You either need to change the types of the functions that used to be in the IO monad so that they are in the CHP monad (with varying amounts of liftIO_CHP), or you need to wrap them in CHP wrappers. That’s quite an overhead, especially if you would prefer to introduce CHP gradually. I don’t like forcing people to put their entire program inside CHP just to get anywhere. (This reminds me a bit of emacs, where questions about leaving emacs to do other tasks often elicit the response: don’t quit emacs, just do everything from inside emacs.)

So what is the CHP monad for?

So far we’ve explored all the problems of the CHP monad, which may have convinced you that it was a bad design decision. Now I want to explain what the CHP monad does (over and above IO) and where it might be difficult to replace CHP with plain IO.

Poison

CHP has the notion of poison. Channels can be set into a poisoned state, and any attempt to use them thereafter results in a poison exception being thrown. The CHP monad incorporates this using a very standard error-monad(-transformer) mechanism. This could easily be implemented using the standard Haskell exception mechanisms that exist in the IO monad. I guess in general, an error monad transformer on top of IO is somewhat redundant.
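
For instance, here is a hedged sketch of how poison could be carried by IO’s ordinary exception mechanism instead of an error transformer; PoisonException and these helpers are illustrative names, not the real CHP implementation:

import Control.Exception (Exception, throwIO, try)

data PoisonException = PoisonException deriving Show
instance Exception PoisonException

throwPoison :: IO a
throwPoison = throwIO PoisonException

onPoisonTrap :: IO a -> IO a -> IO a
onPoisonTrap body handler = do r <- try body
                               case r of
                                 Left PoisonException -> handler
                                 Right x              -> return x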

Traces

A nice feature of CHP is that tracing support is built in, and can be turned on or off at run-time. This used to be done using a state-monad(-transformer), but the problem with implementing it that way is that if a process deadlocks, the state is lost. Since the main time you want the trace is when a process deadlocks, this was quite a limitation! So now it works by having a mutable (IORef/TVar) variable for recording the trace, with a reader-monad(-transformer) containing the address of that mutable variable along with an identifier for the current process.

At first, I thought that to replicate this functionality I would need some sort of thread-local storage, which is discussed a little on an old Haskell wiki page. Thinking about it now, I could hold the information on whether tracing is turned on in some sort of singleton variable (initialised once near the beginning of the program and read-only thereafter), and use ThreadId in lieu of a process identifier. Process identifiers in CHP currently carry information on their ancestry, but I could easily record (during the runParallel function) a map from ThreadId to ancestry information in the trace.
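
A hedged sketch of that idea (these names are made up for illustration): a single global, mutable trace keyed by ThreadId, which is only written to if tracing was switched on at start-up, and which survives even if individual processes deadlock:

import Control.Concurrent (ThreadId, myThreadId)
import Data.IORef
import qualified Data.Map as Map
import System.IO.Unsafe (unsafePerformIO)

{-# NOINLINE globalTrace #-}
globalTrace :: IORef (Maybe (Map.Map ThreadId [String]))
globalTrace = unsafePerformIO (newIORef Nothing)   -- Nothing means tracing is off

recordEvent :: String -> IO ()
recordEvent ev = do tid <- myThreadId
                    atomicModifyIORef' globalTrace $ \tr ->
                      (fmap (Map.insertWith (++) tid [ev]) tr, ())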

Choice actions

The other functionality that the CHP monad supports is choosing between the leading actions of two code blocks. That is, this code:

(readChannel c >>= writeChannel x) <|> (readChannel d >>= writeChannel y)

chooses between reading from channels “c” and “d”. Once a value arrives on one of those channels, the choice is committed; after that, the value is sent on channel x or channel y respectively. This is slightly unusual for Haskell, but I’ve found it makes writing CHP programs a lot easier. (I discussed this in more detail in the appendix of a recent draft paper.) There is no way to replicate the same functionality in the IO monad.

One alternative to this style of API is to use a style more like CML. In fact I’ve previously released a “sync” library that is a cut-down version of CHP with a CML-style API in the IO monad. So something similar to that may be acceptable; one possibility might be:

choose :: [forall a. (Event a, a -> IO b)] -> IO b

(I hope I have my forall in the right place there! Is that technically an impredicative type?) That would mean I could write the above as:

choose [(readChannel c, writeChannel x), (readChannel d, writeChannel y)]

This could be done without the nasty types if I use a special bind-like operator (the CML library calls this exact same function wrap):

(|>=) :: Event a -> (a -> IO b) -> Event b

That, along with sync :: Event a -> IO a, would allow me to write something like:

sync $ (readChannel c |>= writeChannel x) <|> (readChannel d |>= writeChannel y)

That might not be too terrible, although a different combination of symbols for the bind-like operator might read better.
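
Pulling those pieces together, here is a hedged sketch (the Choice wrapper is my own naming, and Event, sync, |>= and <|> are assumed to behave as proposed above) of how the list-based choose could be written without the questionable impredicative type, by hiding the intermediate type in an existential:

{-# LANGUAGE ExistentialQuantification #-}

data Choice b = forall a. Choice (Event a) (a -> IO b)

choose :: [Choice b] -> IO b
choose cs = sync (foldr1 (<|>) [ ev |>= k | Choice ev k <- cs ])

-- the earlier example then becomes:
-- choose [Choice (readChannel c) (writeChannel x),
--         Choice (readChannel d) (writeChannel y)]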

Summary

The more I think about (and write about) this issue, the more I begin to think that I would be better off removing the CHP monad. Slipping back to the IO monad would allow easier integration with existing Haskell code, and hence easier uptake of the library. But that’s quite a shift in the API — almost every single function in the CHP library would have to change its type. Is that a jump that can be made by shifting from CHP 2.x to 3.x, or is it perhaps better to start again with a new library and build from there? (This brings to mind several recent discussions on haskell-cafe and the resulting wiki page on whether to rename/re-version — although CHP doesn’t have as many users as FGL, so far as I know!) There may actually be a relatively easy way to transition with such a major change; defining:

type CHP a = IO a
liftIO_CHP = id
runCHP p = (Just <$> p) `onPoisonTrap` return Nothing

That should probably allow most existing code to keep working without modification, except for code that uses choice.

All opinions are welcome on whether to dump the CHP monad for IO, especially if you use the library.

Categories: Uncategorized

Transposing Sequential and Parallel Composition

June 22, 2010 Leave a comment

Recently, I spent some time discussing traces with Marc Smith (we previously wrote two papers on the matter). Something that came up is the idea of representing a sequential composition of parallel processes as a parallel composition of sequential processes. Let me explain…

We, as programmers, often start off reasoning sequentially and then add reasoning about parallel processes later, usually with much difficulty. Think of producing a control-flow graph for an imperative program, for example. If you have a sequential program, it is usually quite straightforward to form a flow graph for it. When you try to add parallel processes, you need multiple flow graphs side by side. And they will interact, so really you need to link them together at the points of interaction. It quickly gets quite involved, and leaves you wishing that you didn’t have to deal with concurrency at all.

What Marc and I found is the opposite: our reasoning was straightforward for parallel processes, but sequential composition was a headache. We were trying to represent the trace of a program as a tree, with each node representing either a parallel composition or a process without parallelism. In fact, due to the associativity and commutativity of parallel composition, this type of tree can always be flattened into an unstructured bag of parallel processes.

We then wanted to link processes in the tree that communicated with each other, to work out information about dependence and to look at neighbourhoods of connected processes. That was all fine, but it quickly gets difficult when you need to consider sequential composition, specifically the sequential composition of parallel compositions.

Imagine that one of the tree nodes is first a program without parallel composition, and later becomes the parallel composition of two sub-processes, i.e. P = Q;(R || S). We don’t have a way to represent sequence in our neat parallel trees above, and this causes our representation to become awkward. We can’t have a simple ordered sequential list of trees, because if two processes have sequence information, they can proceed in either order, so really we would need a sequential lattice of trees, which is getting nasty.

At this point we wished we didn’t have to deal with sequential composition — at least, not sequential composition that can contain parallel composition. And via a small transformation, it turns out that we don’t. Consider the process: a; ((b; b’) || c); d, where everything in there is assumed to be a communication. We can refactor this into a set of parallel processes (P1 || P2 || P3):

P1 = a ; par1start ; par1end ; d
P2 = par1start ; b ; b' ; par1end
P3 = par1start ; c ; par1end

The key insight is really this: if you are running multiple processes in parallel, starting them all at the outset of the program and then synchronising with them when they are actually needed is equivalent to setting them going only at the point where they are needed. So even though the process doing c cannot proceed until a has happened, we don’t wait until partway through the program to set that process off. We set that process (P3 above) off straight away, then send it a message (par1start) when we want to really set it going. The par1end message makes sure that P1 doesn’t do anything more until P2 and P3 have completed: all three processes synchronise together on the par1start and par1end messages.

With this transformation, we never have any parallel composition inside sequential composition in our traces. All we have is one giant parallel composition of every process in the system, where each process is a sequential composition of events. There is no more nesting than that; we have flattened the process network into an unstructured bag of lists of events. In fact, we can now represent the processes as a graph (with links formed by the communications: the dotted lines in the diagram above). The semantics of the process are unchanged, and all the information about the ordering dependencies is still there if you want it (embodied in the unique parstart/parend communications: the italicised events in the graph above). This form of representation actually makes it really nice to track dependencies and relations throughout the process network, by following the communication links.
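
To make the transformation concrete, here is a hedged sketch in Haskell (my own encoding, not taken from our papers): a process tree built from sequential and parallel composition is flattened into a bag of purely sequential event lists, with fresh parNstart/parNend events generated for each parallel composition:

import Control.Monad.State

data Proc = Ev String | Seq [Proc] | Par [Proc]

type SeqProc = [String]

-- the state is a counter for fresh par labels plus the extra processes emitted so far
type M = State (Int, [SeqProc])

fresh :: M Int
fresh = do (n, ps) <- get
           put (n + 1, ps)
           return n

emit :: SeqProc -> M ()
emit p = modify (\(n, ps) -> (n, p : ps))

-- the sequential "spine" of a process: each parallel composition contributes a
-- start/end pair to the spine, and its branches are emitted as separate
-- processes bracketed by the same start/end events
spine :: Proc -> M [String]
spine (Ev e)   = return [e]
spine (Seq ps) = concat <$> mapM spine ps
spine (Par ps) = do n <- fresh
                    let start = "par" ++ show n ++ "start"
                        end   = "par" ++ show n ++ "end"
                    mapM_ (\q -> do evs <- spine q
                                    emit (start : evs ++ [end])) ps
                    return [start, end]

flatten :: Proc -> [SeqProc]
flatten p = let (top, (_, extras)) = runState (spine p) (1, [])
            in top : reverse extras

-- flatten (Seq [Ev "a", Par [Seq [Ev "b", Ev "b'"], Ev "c"], Ev "d"]) gives
--   [ ["a","par1start","par1end","d"]
--   , ["par1start","b","b'","par1end"]
--   , ["par1start","c","par1end"] ]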

Addendum

You could even go one step further and transform the system into a collection of parallel processes in which each process has at most three events in sequence, at most one of which is an event from the original program; i.e. the above would become:

P1A = a ; seq1
P1B = seq1 ; par1start ; seq2
P1C = seq2 ; par1end ; seq3
P1D = seq3 ; d
P2A = par1start ; b ; seq4
P2B = seq4 ; b' ; par1end
P3 = par1start ; c ; par1end

I’m not sure how useful this is, though.

Categories: Uncategorized