Archive

Author Archive

Automatic Model Generation, part 5: Iteration

May 1, 2010 1 comment

Previous parts of this guide have shown how to represent and print CSP models, and how to redefine the CHP monad and the primitives for choice, IO and communication in order to generate CSP models from CHP programs. This part deals with the final difficult item: iteration.

Most processes in CHP run in an infinite loop, terminated only by poison. Many are written using the helper function forever, which is defined: forever p = p >> forever p. Consider what happens when we try to generate the model for such a process, e.g. forever (syncBarrier b). We first encounter the syncBarrier and we store in the model that the process would engage in event “b”. Then in the forever function we loop round and reach the syncBarrier again, and add on to the model that the process would next engage in “b” again. And so on and so on. An infinite loop in our program would cause an infinite loop in our model generation, rendering it useless. We cannot observationally determine that the program is looping forever; think how our approach could possibly distinguish the above from replicateM_ 1000000 (syncBarrier b) >> syncBarrier c.

Forever in an Instant

To solve this particular problem, we supply a foreverP function in the CHP library and its redefinition. In the original CHP library, foreverP acts as forever. In the forthcoming chp-spec library, it is defined differently: it runs the code once to record the model, then makes a note in the model that this code should have run forever. It also stops further modelling; any code after a call to forever will be ignored in the normal library, and so it should be here, too:

foreverP :: CHP a -> CHP b
foreverP p = stopSpecT . liftM (Repeat . snd) . finSpecT

stopSpecT :: Monad m => m SpecItem -> CHPSpecT m a
stopSpecT m = CHPSpecT $ const $ liftM (\sp -> (error "stopSpecT", (sp :))) m

The stopSpecT makes sure that no further specification is performed after this point. The crucial part is the const which ignores the continuation function — this is something that is easily done with our continuation-passing monad (and is like a short-circuiting error monad).

The above takes care of forever, which is used when the program carries no state around. But other processes have direct recursion and cannot use this function.

Process Annotations

To solve the recursion problem, we introduce a process annotation. The process annotation surrounds a process and captures the value of its arguments. The assumption is made (or rather, a condition is placed on the user) that the process will have the same behaviour (excluding any external input from channels and liftIO functions) when given the same arguments. In a pure language like Haskell, this is reasonable and will commonly be the case. The annotation should be added at the beginning of any process that recurses — when the recursion is performed, the process is modelled iff it has never been run before with these arguments; if it has been run before, its behaviour has already been modelled and recorded, so it is returned directly. Here is the annotation in action, on the security guard from the dining philosophers:

security :: [(Chanin (), Chanin ())] -> CHP ()
security chans = security' chans 0
 where
  security' :: [(Chanin (), Chanin ())] -> Int -> CHP ()
  security' = process "security" $ \chans satDown ->
    let (ups, downs) = unzip chans
        maxSatDown = length chans - 1 in
    (alt $
      [readChannel c >> return (satDown - 1) | c <- ups, satDown > 0] ++
      [readChannel c >> return (satDown + 1) | c <- downs, satDown < maxSatDown]
    ) >>= security' chans

It takes as its first parameter the name of the process — it is user-supplied, but should be unique in the program. The second parameter is the process itself. Here, the process has two arguments — but it could be any number. The process annotation is designed using type-classes so that it can be used with processes that take any number of arguments.

We need to store the arguments that each process took when it was modelled. So we need to store the models in a data structure like Map String (Map Args Model). But Args needs to be a set of differently-typed arguments for each process — we can’t statically assign a type to it. If we only supported self-recursion we could probably solve this with phantom type parameters in the monad and so forth, but sometimes, even in Haskell, it is appropriate to use dynamic typing. The Data.Dynamic module supports safe dynamic typing (in that casts from the Dynamic type have a run-time check).

In fact, we don’t actually need to store the arguments themselves. What we need to store with a previous model is a function like a -> Bool that says whether the latest parameter is the same as was used for generating the previous model:

type CheckArg = Dynamic -> Bool

checkArgs :: [Dynamic] -> [CheckArg] -> Bool
checkArgs ds fs
  | length ds /= length fs = False
  | otherwise = and $ zipWith ($) fs ds

We create a Process type-class to implement our process annotation:

process :: Process p => String -> p -> p
process s = process' True s []

class Process p where
  process' :: Bool -> String -> [(Dynamic, CheckArg)] -> p -> p

instance (Eq a, Typeable a, Process b) => Process (a -> b) where
 process' topLevel name args f x
  = process' topLevel name (args ++ [(toDyn x,(== Just x) . fromDynamic)]) (f x)

The process' function takes a Bool (ignore that for now), the process name, a list of arguments so far (each pair is the argument itself, and its function to check against a future value), and then wraps a process “p”. The instance shown above is the one that captures all a process’s parameters. Each parameter is appended to the list. toDyn turns a value into a Dynamic, and fromDynamic returns a Maybe value (Nothing if the type-cast is unsuccessful, Just if was successful). Comparing the result of fromDynamic to Just x checks both the type and the value at once. To support dynamic typing, parameters must have a Typeable instance (which GHC can derive for most types — and is supplied for all CHP library types) and an Eq instance to check for equality. The most notable types that cannot be used for a parameter are functions. This is a limitation of the approach — and indeed, CSP itself does not support any notion of functions being passed around. Any processes that take functions as parameters would have to be made first-order (Neil Mitchell’s Firstify comes to mind).

The base-case instance for Process is the one that actually uses the parameter list:

data CHPState = CHPState { ... , 
 chpProcessMap :: Map.Map String (Map.Map Integer ([CheckArg], (Spec,Dynamic))),
 chpFreeNames :: [(Dynamic, CheckArg)],
 chpNextProcess :: !Integer }

instance (Typeable a, Eq a) => Process (CHP a) where
  process' topLevel name immArgs p = addSpecT1 $ do
    st <- get
    let possibles = Map.toList <$> (Map.lookup name $ chpProcessMap st)
        args = if topLevel then immArgs
                 else chpFreeNames st ++ immArgs
    case possibles >>= L.find (checkArgs (map fst args) . fst . snd) of
      Just (n, (_, (_, r)))
        -> return (flip fromDyn (error "process-lookup") r, Call n)
      Nothing -> 
        do let n = chpNextProcess st
           put $ st { chpProcessMap = insertMapMap name n
                        (map snd args, (error "process", toDyn ()))
                          $ chpProcessMap st
                    , chpFreeNames = args
                    , chpNextProcess = succ n
                    }
           (r, f) <- finSpecT p
           modify $ \st' -> st' { chpProcessMap = insertMapMap name n
                                    (map snd args, (f, toDyn r))
                                      $ chpProcessMap st'
                                  -- Restore original free names:
                                , chpFreeNames = chpFreeNames st }
           return (r, Call n)

The case statement checks if any previously-modelled processes match. If one does (the Just case), its model and return value are returned. If Nothing is found, the process is modelled. It is crucial that the state is first updated with an entry for the process — that way, when the process recurses, it can find itself in the collection of recorded processes (if the recursion uses the same argument values). The dummy entry has the right parameters but an invalid model (that will never be accessed before it is later updated) and an incorrect return type; processes that recurse and then examine the value of the recursive call are not supported here. However, almost every recursive CHP process is tail-recursive, which can be modelled just fine (if they weren’t tail recursive, they would probably feature a space-leak). After the process has been modelled, its entry is updated with the real return value and real specification.

The model returned by the process annotation is always simply a Call item. Therefore, any process that recurses will simply have a Call item added to the end, stopping the model from extending forever — provided that at some point the same parameters are used to the process. A parameter with continually-changing parameters that never repeat — for example, one that outputs ascending integers — cannot be modelled here. This is yet another limitation of the approach.

Example

As an example, we’ll use the security guard shown earlier in this post, and how that is modelled with three philosophers (i.e. three sets of channels). Our approach produces a model for each different set of arguments to the process — the models for different arguments can potentially be completely different. Some effort could be put into collapsing them back down during post-processing, but here are the three models:

security_10=
  (((security.up.phil0?x_14 -> security_9)
    []
    (security.up.phil1?x_15 -> security_9)
    []
    (security.up.phil2?x_16 -> security_9)
    []
    (security.down.phil0?x_17 -> security_11)
    []
    (security.down.phil1?x_21 -> security_11)
    []
    (security.down.phil2?x_22 -> security_11)))
security_11=
  (((security.up.phil0?x_18 -> security_10)
    []
    (security.up.phil1?x_19 -> security_10)
    []
    (security.up.phil2?x_20 -> security_10)))
security_9=
  (((security.down.phil0?x_13 -> security_10)
    []
    (security.down.phil1?x_23 -> security_10)
    []
    (security.down.phil2?x_24 -> security_10)))

The top process, security_10, is the state where one philosopher is currently seated and thus all up and down events are offered. If an up event occurs, the next process is security_9, the state where no philosophers are seated, and thus only down events are offered. If a down event occurs in security_10, the next process is security_11, where two philosophers are seated and only up events are offered: this prevents a third philosopher sitting down (which could potentially lead to the classic deadlock in the dining philosophers).

That concludes all the in-depth technical parts of this guide. There’ll be one more post explaining the top-level specify method, which should also include the announcement of the release of the chp-spec library based on this guide.

Categories: Uncategorized Tags: ,

Automatic Model Generation, part 4: Communication

April 22, 2010 2 comments

This is part of a multi-part guide on redefining the CHP monad and primitives to generate a CSP model rather than executing the program. In this relatively self-contained part, we examine the issue of tracking values as they are communicated around the process network.

Modelling the mere occurrence of communications and synchronisations themselves is quite straightforward. We just label each created channel/barrier with a unique identifier, and then every time it is used we record in the model which channel/barrier was used. We can, however, do a bit better than that. Consider this code:

do x <- readChannel input
   writeChannel output x

What if we could tell that the value sent on the output channel was the same value as the one received on the input channel — that would be much more powerful than simply recording that a communication took place on the input channel followed by a communication on the output channel.

Identifiable Bottoms

Haskell has a function, undefined :: a. This produces a “bottom” value of any type, and will give an error if evaluated. It has a counterpart, error :: String -> a that allows the error message to be customised to be something useful. The Control.Exception module provides a try function that can catch these messages, and an evaluate function to examine values. If we put all of these together, we can create dummy values of any type, and later on we can identify them (but see the limitations section at the end of this post). This is the same technique used by Lazy SmallCheck, which we used for modelling IO computations in the previous part of the guide.

Input and Output

When an input occurs, we use the identifier of the channel (which is passed to the fakeCommIn function below) to form a uniquely-identifiable bottom value, which we make a note of in our state and then return:

bottomPrefix :: String
bottomPrefix = "__CHP.bottom__"

fakeCommIn :: Integer -> CHP a
fakeCommIn n = addSpecT1 $ do
  st <- get
  put $ st { chpNextBottom = succ (chpNextBottom st) }
  return ( error $ bottomPrefix ++ show (chpNextBottom st)
         , Sync $ Right (n, DirInput, chpNextBottom st)
         )

The Sync . Right item holds the identifier of the channel, “n”, the direction of the communication, and the identifier of the bottom value involved. In our corresponding output function, we watch out for these bottom values:

fakeCommOut :: Integer -> a -> CHP ()
fakeCommOut n x = addSpecT1 $ do
  possErr <- lift $ C.try $ C.evaluate x
  case possErr of
    Left (C.ErrorCall s) | bottomPrefix `isPrefixOf` s
      -> return ( (), Sync $ Right
           (n, DirOutput, read $ drop (length bottomPrefix) s))
    -- Wasn't one of our bottoms:
    _ -> return ((), Sync $ Left n)

That is all the core code we need. Our replacement channels just store the channel identifier, and delegate reading and writing on channels to the above two methods.

Example

For an example, we will use a process that reads two values from one pair of channels, and sends them out, swapped, on another pair of channels:

import Control.Concurrent.CHPSpec
import Control.Monad (replicateM)

swap :: (Chanin a, Chanin b) -> (Chanout b, Chanout a) -> CHP ()
swap (inA, inB) (outB, outA)
  = do (a, b) <- readChannel inA <||> readChannel inB
       writeChannel outB b <|*|> writeChannel outA a

p :: CHP ()
p = do leftIn <- oneToOneChannel' $ chanLabel "leftIn"
       leftOut <- oneToOneChannel' $ chanLabel "leftOut"
       rightIn <- oneToOneChannel' $ chanLabel "rightIn"
       rightOut <- oneToOneChannel' $ chanLabel "rightOut"
       swap (reader leftIn :: Chanin Int, reader rightIn :: Chanin Int)
            (writer leftOut, writer rightOut)

main :: IO ()
main = specify True p >>= putStrLn

This program is able to generate the following model, that correctly follows the values through the process:

channel leftIn
channel leftOut
channel rightIn
channel rightOut
main_1=
  (((leftIn?x_1 -> SKIP) ||| (rightIn?x_2 -> SKIP))
   ;
   ((leftOut!x_2 -> SKIP) ||| (rightOut!x_1 -> SKIP)))

Limitations

The technique of identifying bottoms is far from foolproof. In particular, for Int, “1+x” will be identified as being the same as “x”. What we are really identifying is that the bottom value received is used somewhere prominent in the output. So the idea is nice, but the execution is imperfect. Tristan Allwood suggested looking at stable names — these may help to reduce the imperfection, but they do not provide a strong enough guarantee to make the technique water-tight either.

I have now covered most of the model-generation technique. In the next part I will tackle the complex matter of recursive processes, probably followed by a final part showing how the top-level specify method works — at which point I hope to release all of this as a library, alongside the next CHP release (with the mtl dependency removed).

Categories: Uncategorized Tags: ,

Automatic Model Generation, part 3: Choice and IO

April 20, 2010 3 comments

This is the third part in a multi-part series describing a technique to generate formal CSP models of CHP programs by redefining the CHP monad and primitive actions to output the model of the program. In part 1 we saw the redefinition of the monad and the specification type; in part 2 we saw how to pretty-print the model, including the parallel composition alphabets. In this part, we look at choice and IO actions.

In part 1 of this guide, we saw how to define parallel composition. The specifications for each branch were generated, composed into a Par item, and added to the model. Choice is not so straightforward, because it can introduce branching execution paths. Consider this code:

alt [syncBarrier b >> return Nothing, syncBarrier c >> return (Just x)]
    >>= maybe p q

The choice doesn’t return a single value — it returns one of two, and which value is returned will affect the code coming afterwards. So we need a different approach from just putting a choice on the front of the sequential execution of the model. In fact, there is quite a simple solution just by using one of the laws of CHP from the tutorial:

alt [p, q] >>= k = alt [p >>= k, q >>= k]

This rule is usually useful because you can use it to remove the code duplication on the right by transforming to the form on the left. Here, we will use the rule in the opposite way, to move the binding of the multiple results back into the alt and attach it to each branch.

Choice

The alt :: [CHP a] -> CHP a function chooses between several alternatives. We define it by using the Alt specification item:

alt :: [CHP a] -> CHP a
alt [] = stopSpecT $ return Stop
alt ps = altSpecT ps

altSpecT :: Monad m => [CHPSpecT m a] -> CHPSpecT m a
altSpecT ms = CHPSpecT $
  \k -> do xfs <- mapM (flip runSpecT k) ms
           return (error "alt return", \s -> [Alt $ map (($ s) . snd) xfs])

The key aspect of this code is that due to the continuation-passing style of the replacement CHP monad, we have a continuation (“k” in the above code) that represents all actions that would be taken after the alt. We pass this as the continuation when we run each of our branches of the alt (the first line of the do block). The specification-modifying functions from each branch are applied to future specifications, then returned as a specification containing a single Alt item. This can be depicted as follows:

Note that the eventual return values of all of the branches are discarded. This is not as bad as first appears, because the result values are only used if this choice is in a parallel composition and its result is needed (which is surprisingly rare in CHP code) or if the choice is in a recursive process where the result is needed (this again is surprisingly rare, and will be dealt with in a future part of the guide).

IO Actions

The real CHP monad is really a monad transformer on top of the IO monad, and supports IO actions being lifted into the CHP monad. An IO action may perform all sorts of actions, but these fall outside the scope of our modelling. The more significant problem for the modelling is that the IO actions may return a meaningful value (i.e. one other than a unit return). These values can then be used in the CHP program, and in particular they may be used to branch execution. If we simply return an error value in these cases, the code will fail. We can do slightly better than that, to allow partial support for IO actions when modelling.

In the case of an IO computation, we have some code (the code following the IO computation) that takes a value (the return value of the IO computation) and produces an output based on the value (the model of the code following the IO computation). This is, in effect, a testing problem, and we can borrow techniques from software testing to solve it. Haskell has several clever testing libraries: QuickCheck is one of the most well-known, but an even cleverer library is Lazy SmallCheck. Lazy SmallCheck allows for an efficient search of the input space to a function by starting off with undefined values, and partially defining them as necessary. Not only can this be efficient, but a particularly nice aspect is that it allows us to know if the search was complete or not.

The Lazy SmallCheck library doesn’t actually expose enough of its implementation as it currently stands, so I took it and modified it until I was able to produce a fuzz function:

fuzz :: (Serial a) => (a -> StateT s IO b) -> StateT s IO ([b], Bool)

This function takes an IO computation that sits inside a state monad transformer (I needed this, and it was much easier to embed this in the fuzz function than to do it any other way). The input space for this function is searched (with the stateful side-effects of any successful returns being kept) and a list of return values (the models) is produced. The boolean parameter indicates whether the search was complete: that is, whether it examined all possible values in the search space (either by exhaustion of all fully-defined values, or because it explored a complete set of partially defined values).

We can use the fuzz function on the continuation passed to the IO computation in order to explore the continuation’s possible models with different input values to the continuation. This will give us a collection of models, each of which represents a possible behaviour of the program from this point, based on an event (the IO action) that we must treat as unpredictable and out of our control. In CSP we can join these models together as an external choice between processes that are prefixed with non-synchronising events: that is, events that are not featured anywhere else in the model. The external choice between such events indicates that the process’s behaviour will depend on which event is offered by its environment, and that we must assume that any event might be offered (i.e. any input value might be returned). This code achieves that:

liftIO_CHP :: Serial a => IO a -> CHP a
liftIO_CHP _ = CHPSpecT $ \k ->
    do (vals, complete) <- fuzz k
       unless complete $
         liftIO $ putStrLn "Incomplete fuzzing of IO computation"
       nonces <- replicateM (length vals) newEvent
       let firstLabel = "IO_" ++ show (head nonces)
       zipWithM_ labelEvent nonces (map (firstLabel++) suffixes)
       modify $ \st -> st { chpIOEvents = Set.union
                              (Set.fromList nonces) (chpIOEvents st) }
       return (error "liftIO return",
         \s -> [Alt $ zipWith (\n f -> Sync (Left n) : snd f s) nonces vals])
  where
    suffixes = map (:[]) ['A'..'Z'] ++ map show [(0::Integer)..]

As ever, this technique is limited: the search may be incomplete, in which case the model may be incomplete. We print a message in these cases to warn the user of the problem. The search will only be incomplete if the return value has a large flat domain (e.g. integers) and/or the subsequent computation makes full use of the domain. If the return value is ignored or if it is not used to make decisions about the control path, the search will be complete (because the value will never be defined). So if, for example, a String is returned and then passed to a different IO action to write to a file, this will not cause a problem in the modelling because the behaviour in terms of CSP processes is invariant to the return value. The only time this search above comes into play is when the return value is used to affect the execution path, for example if the process acts differently when the String is empty. (This is reminiscent of the different between monads and applicative functors: if CHP was only an applicative functor, IO actions could be completely ignored.)

Example

For an example, I will use a simple simulation example with a tick-tock barrier to divide time into time-steps. The simulation has six site processes wired up in a ring, with each site connected to its neighbours via a pair of barriers, representing incoming and outgoing movements. A site may be full or empty. If it is full (i.e. contains an agent), it makes a random choice as to which direction to send the agent in a time-step (and then ends the time-step). If it is empty, it offers to receive a new agent from either side, or to end the time-step. Here’s the diagram (the star indicates a full site) and the code:

site :: EnrolledBarrier -> (EnrolledBarrier, EnrolledBarrier)
     -> (EnrolledBarrier, EnrolledBarrier) -> Bool -> CHP ()
site = process "site" $ \bar (inL, outL) (inR, outR) occupied ->
  (if occupied
     then do shouldMove <- liftIO_CHP' "shouldMove" $ getStdRandom random
             if shouldMove
               then do moveLeft <- liftIO_CHP' "moveLeft" $ getStdRandom random
                       if moveLeft
                         then syncBarrier outL
                         else syncBarrier outR
                       syncBarrier bar
                       return False
               else syncBarrier bar >> return True
    else alt [syncBarrier inL >> syncBarrier bar >> return True
             ,syncBarrier inR >> syncBarrier bar >> return True
             ,syncBarrier bar >> return False
             ]
  ) >>= site bar (inL, outL) (inR, outR)

main :: IO ()
main = specify True (runSiteRing [True, False, False, True, False, False])
          >>= putStrLn

I’ve omitted a couple of uninteresting wiring functions. You can see that I’m using a version of liftIO_CHP that allows me to supply a label, which helps in reading the model. The program uses choice, and lifted IO actions to get random boolean values. It also has a potential deadlock, which we will find using the FDR model-checker.

One problem with this whole technique at the moment is that each site generates a separate model, even though they are all similar enough to be expressed in one model. So here are the processes that make up the behaviour of the first site — there are similar pairs for the other five sites:

site_2=
  (((IO_shouldMove7A -> tock -> site_2)
    []
    (IO_shouldMove7B
     ->
     ((IO_moveLeft5A -> right_0 -> tock -> site_3)
      []
      (IO_moveLeft5B -> left_5 -> tock -> site_3)))))
site_3=
  (((right_5 -> tock -> site_2)
    []
    (left_0 -> tock -> site_2)
    []
    (tock -> site_3)))

The site_2 process is the full process. It makes a choice between two events: IO_shouldMove7A and IO_shouldMove7B. These are the two possible outcomes of asking for a random boolean. In one case, it synchronises on the tock event and recurses — this must be the won’t-move case. In the other, it chooses between two other events: IO_moveLeft5A and IO_moveLeft5B. These are the decisions as to whether to go right or left: it next either synchronises on its right event or its left event, and then becomes site_3. The site_3 process is the empty process. It will agree to synchronise with its neighbours, end the timestep and become the full process — or to just end the timestep and remain the empty process.

There are six such pairs of processes in the model, one for each site in our model. The only difference between them is the names of the dummy IO events, and the numbers of the right and left events. They are all wired up in the main process:

main_1=
  (((site_2)
    [|{| left_0 , left_5 , right_0 , right_5 , tock |}|]
    (((site_4)
      [|{| left_4 , right_4 , tock |}|]
      (((site_6)
        [|{| left_3 , right_3 , tock |}|]
        (((site_8)
          [|{| left_2 , right_2 , tock |}|]
          (((site_10) [|{| left_1 , right_1 , tock |}|] (site_12)))))))))))

We can add a single line at the end of the model: assert main_1 :[ deadlock free] and feed it to FDR, asking it to produce a trace if it finds a counter-example. FDR does find a counter-example for our assertion: a trace that ends in deadlock. Here it is:

IO_shouldMove7B
IO_moveLeft5A
right_0
IO_shouldMove25A
tock
IO_shouldMove35B
IO_moveLeft33A
right_1
IO_shouldMove25B
IO_moveLeft23B

We can take this trace (a chronological list of events that occur) and relate it to the model. Here’s the short form: the agent in the first site moves right, while the agent in the fourth site stays put, and the time-step ends. Then the agent that is now in the second site moves right, and the agent in the fourth site decides to move left. At this point, one agent is in the third site (which is waiting to end the time-step) and the other is in the fourth site trying to move into the third site. Deadlock!

There are several ways to solve this problem, but I’m not really interested in them — the point here was to show that we took a simple example, we generated our model and fed it straight to the proof-checker which gave us a trace that produces deadlock. Reading the CSP and relating the trace to the model is straightforward, but can be long-winded. What might be nice in future is to have an option to somehow encode the source location in the event name, to then easily point back to the source and say “this bit of the source happened, then that bit happened, then deadlock”. In the next part of the guide I’ll show how we can deal with tracking communications when building the model.

Categories: Uncategorized Tags: ,

Automatic Model Generation, part 2: Pretty-Printing the Model

April 15, 2010 3 comments

In the first part of this guide, I explained part of a technique for generating CSP models from a CHP program — including the introduction of the specification type. In this second part of the guide I take a slight diversion to explain how to pretty-print the specifications. We will see in this part how parallel composition is rendered, which has particular importance in CSP.

After we’ve created the model of our program, we need to print it out to a file, ready to be fed to the proof-checker or other tool. I wanted the model to be as comprehensible as possible after printing, so that users (and I!) can read it. To that end, I used a pretty-printing library: pretty from Hackage, which is imported as PP throughout the code in this post. (After I wrote this code, Marc Fontaine uploaded a library to Hackage produced that supports CSP notation in an AST, but it’s easier for me to keep my original code rather than switch over.)

Sequence

CSP has two forms of sequencing. A semi-colon sequences two processes, whereas an arrow prefixes a process (right-hand side) with an event (left-hand side). My code for printing out a sequence of processes uses the arrow wherever possible, and a semi-colon otherwise:

type ProcEvents = Map.Map String (Set.Set String)

pprintCSP_Seq :: ProcEvents -> Spec' String String -> PP.Doc
pprintCSP_Seq m = PP.parens . PP.sep . p
  where
    p [] = [PP.text "SKIP"]
    p (Sync c : xs) = [name c, PP.text "->"] ++ p xs
    p [x] = [pprintCSP m x]
    p (x:xs) = [pprintCSP m x, PP.text ";"] ++ p xs

The SKIP process does nothing, so it is used for blank lists and at the end of the list. Note that the singleton list [Sync x] will turn into x -> SKIP, whereas [Par xs] will not turn into xs ; SKIP (it will be merely xs). The ProcEvents parameter is used later on for printing out parallel composition and so must be passed around in case there are any nested parallel compositions.

Choice

External choice is handled by the first case of pprintCSP:

pprintCSP :: ProcEvents -> SpecItem' String String -> PP.Doc
pprintCSP m (Alt s) = zeroOneMore m "STOP" (withOp m "[]") s

The helper functions are used to treat differently the case where the list is empty (in which case it uses the second parameter), has one item (in which case it is printed using pprintCSP_Seq), or has multiple items (in which case it is processed using the third parameter, which above joins the items with the external choice operator []):

withOp :: ProcEvents -> String -> [Spec' String String] -> PP.Doc
withOp m op
  = PP.parens . PP.sep . intersperse (PP.text op) . map (pprintCSP_Seq m)

zeroOneMore :: ProcEvents -> String ->
  ([Spec' String String] -> PP.Doc) -> [Spec' String String] -> PP.Doc
zeroOneMore _ z _ [] = PP.text z
zeroOneMore m _ _ [x] = pprintCSP_Seq m x
zeroOneMore _ _ f xs = f xs

Parallel

When you compose two CSP processes in parallel, you must specify the events on which they synchronise. The process (a -> SKIP) [| {|a|} |] (a -> SKIP) will perform the event “a” once, with both sides of the parallel synchronising on event “a” together — that item in the middle is an infix parallel composition operator, parameterised by the singleton set containing “a”. In contrast, the process a -> SKIP [| {| |} |] a -> SKIP (which can be written more simply as: a -> SKIP ||| a -> SKIP) will perform the event “a” twice, with each side of the parallel doing it in turn — this is known as interleaving on an event.

In CHP, there is no support for interleaving; all events are synchronising. However, we cannot just put all events in the synchronising set. The process (a -> SKIP) [| {|a,b|} |] (b -> SKIP) will deadlock — each side tries to synchronise with the other on their respective events, which never happens. The solution is of course to use the intersection of the events that each side engages in:

pprintCSP m (Par xs) = zeroOneMore m "SKIP" joinPar xs
  where
    joinPar :: [Spec' String String] -> PP.Doc
    joinPar [a] = pprintCSP_Seq m a
    joinPar (a:bs) = PP.parens $ PP.sep $ 
      [ pprintCSP_Seq m a
      , pprintParOp (findAllComms eventName m a `Set.intersection`
                       unionAll (map (findAllComms eventName m) bs))
      , joinPar bs
      ]
pprintCSP _ (Call p) = name p
pprintCSP _ x = error $ "pprintCSP: " ++ show x

pprintParOp :: Set.Set String -> PP.Doc
pprintParOp s
  | Set.null s = PP.text "|||"
  | otherwise = surround "[|{|" "|}|]" $ intersperse PP.comma $
      map name (Set.toList s)
  where
    surround a b x = PP.sep $ [PP.text a] ++ x ++ [PP.text b]

The findAllComms does as its name suggests: it forms a set of all the communications performed anywhere in the process, using the supplied Map that we have been passing around to find out what communications are performed by named processes that are called. Note that as we compose parallel items together, we compose the head of the list using the intersection of all communications performed in the head, and all communications performed in the rest of the list (which will be the right-hand side of the parallel composition). So if you have three processes: [a -> b -> SKIP, b -> c -> SKIP, a -> c -> SKIP], they will get composed as:

(((a -> b -> SKIP)
    [|{| a , b |}|]
    ((b -> c -> SKIP) [|{| c |}|] (a -> c -> SKIP))))

The first process synchronises on “a” and “b” with the others, as both feature somewhere later in the list. The second and third process only synchronise together on “c”; the events “a” and “b” are not shared, and thus occur separately, but synchronised with the first process.

Parallel Composition Omission

Basing the synchronisation sets on those events that are potentially performed has a problem. If a process is given a channel-end or barrier, but never uses it at all, the event will not show up in the model for that process, and thus will not be used in the synchronisation set. Let’s say you have a barrier, “b” with two processes enrolled, but one does not use it. Given those two processes, say “a ->SKIP” and “SKIP” they will be composed as “a ->SKIP ||| SKIP“, which will run fine — even though the original program would have deadlocked! The solution to this is to introduce extra code for tracking how many processes are involved in an event, and add dummy processes that have the events in their synchronisation set but never use them, thus reflecting the deadlock in the model. I haven’t yet implemented this just yet, though.

Summary

We have now seen how parallel composition is implemented, including determination of the synchronisation sets. In future parts of the guide, we will examine the remaining parts of CHP: channel communications, choice, and iteration, as well as how all these are put together.

Categories: Uncategorized Tags: ,

Automatic Model Generation, part 1: Parallel

April 13, 2010 11 comments

CHP is based on CSP, a formal process calculus. CSP has a model-checker, FDR (among other tools), that is free for academic use for dealing with CSP. It would be great if we could take our CHP programs and prove them correct using FDR, e.g. prove them deadlock free. To properly translate a CHP program into its CSP model requires full semantics-based processing of the program’s source code, including all the functional parts (including tackling things like writeChannelStrict c undefined). But what if, at least for simple programs, we didn’t need to go to these lengths?

In this multi-part guide, I will introduce a technique to generate CSP models from CHP programs without source code analysis. The programmer would make one small alteration to the imports of a program and then: instead of executing properly, the program would spit out its own CSP model which we could feed to FDR. This technique involves redefining the CHP monad and all the CHP primitives so that the program spits out its model rather than actually executing. The technique is very limited in some regards, and I’ll try to point out the limitations as we go along. But hopefully it will be interesting to show what you can do by redefining your monad (see also: the Concurrent Haskell Debugger, and the Beauty in the Beast paper). Explaining my technique will take several posts — for today, I will focus on the definition of the specification type, the CHP monad — and redefining the runParallel function.

Specification Type

We start by defining a type to represent a CSP model. An introduction to CSP can be found in a previous post, but if you know CHP then the CSP model type has a fairly straightforward correspondence to CHP code. The type of the specification is as follows:

data SpecItem' proc comm
  = Par [Spec' proc comm]
  | Alt [Spec' proc comm]
  | Call proc
  | Sync comm
  | Stop
  | Repeat (Spec' proc comm)
  deriving (Show)

type Spec' proc comm = [SpecItem' proc comm]

The main Spec' list type is a chronological sequence of processes in the model. The types are parameterised by proc (the type that identifies a process) and comm (the type that identifies a communication/synchronisation). We will explain these types in a future part of the guide; for now, we will use the opaque types ProcessId and CommId:

type SpecItem = SpecItem' ProcessId CommId

type Spec = Spec' ProcessId CommId

type SpecMod = Spec -> Spec

finalise :: SpecMod -> Spec
finalise f = f []

The SpecMod type is a function that modifies a specification. We will compose several of these functions while building the model — and at the end, we can use the finalise function to turn a SpecMod function into an actual Spec, by applying the function to the empty sequence.

Redefining the CHP monad

These specification types are used in our redefinition of the CHP monad. The main part of the new CHP monad is a monad transformer CHPSpecT that permits building of specifications on top of an existing (book-keeping) monad. The monad is as follows:

newtype CHPSpecT m r
  = CHPSpecT {runSpecT :: forall b. (r -> m (b, SpecMod)) -> m (b, SpecMod) }

For those of you playing monad transformer bingo at home, CHPSpecT is effectively the unrolled version of forall b. ContT b (WriterT SpecMod), and will later be used on top of StateT. As with ContT, the monad instance itself makes no reference to the underlying monad, so it is surprisingly simple:

instance Monad m => Monad (CHPSpecT m) where
  return x = CHPSpecT ($ x)
  m >>= k  = CHPSpecT $ \c -> runSpecT m $ \a -> runSpecT (k a) c

The road to hell is paved with monad explanations, but here goes. The r -> m (b, SpecMod) item takes a value of type “r” (for red) and gives back an item of type “b” (for blue) along with a specification modifying function. We can envisage that item as follows:

The values pass left to right in the top half (turning red to blue), while the model actually passes in the opposite direction. Our models are effectively built backwards, with each SpecMod function modifying all future specifications to produce a current specification. We can now diagram our CHP monad as follows:

Given a circled value of the aforementioned type r -> m (b, SpecMod) (the lambda is used to indicate that this is an argument), a CHP item will give back something of type m (b, SpecMod); this is drawn on the right as a blue item with a straight left edge (to indicate it has no input), paired with a specification-modifying function. You can pick all sorts of holes in these diagrams, but I hope they will be useful in explaining various uses of this monad as the guide continues.

CHP Monad Helper Functions

The CHP monad itself has no real logic involved; all the logic is actually captured in other functions, two of which I will introduce here, beginning with finSpecT:

finSpecT :: Monad m => CHPSpecT m r -> m (r, Spec)
finSpecT = liftM (second finalise) . flip runSpecT (\x -> return (x, id))

The finSpecT function is used to run the CHPSpecT transformer; it returns a value and a model. Once it has the return value paired with the corresponding model-changing function, it makes the latter into a model by finalising it (applying it to the empty specification). The result of this latter operation can be visualised below; the composite item has an unmodified return value, but has its specification-modifying function applied to the empty specification (the empty list):

The second function is addSpecT1:

addSpecT1 :: forall m r. Monad m => m (r, SpecItem) -> CHPSpecT m r
addSpecT1 m = CHPSpecT $ \k -> m >>= apply k
  where
    apply :: (r -> m (b, SpecMod)) -> (r, SpecItem) -> m (b, SpecMod)
    apply k (x, s) = liftM (second ((s :) .)) $ k x

The addSpecT1 function encapsulates the logic for sequencing; it takes a monadic action that gives a return value paired with a corresponding single specification item, and turns all that into a CHPSpecT item that adds the item to the model. The diagram is below; the addSpecT1 takes a parameter (the outermost lambda) and gives back a CHP item that takes an inner continuation parameter (the lambda inside the box). The value of the outer parameter is passed to the continuation, and the specification item of the outer parameter (a trapezoid named “s”) is adjoined to the front of the result of the specification-modifying function of the continuation:

Parallel Composition

For parallel composition, we take each branch of the parallel composition and finalise it into a specification (using finSpecT), then add the Par constructor and join it on to the front of future specifications:

runParallel :: [CHP a] -> CHP [a]
runParallel = addSpecT1 . liftM (second Par . unzip) . mapM finSpecT

We can again diagram this:

Hmmm, hopefully that makes it clearer! The finalised parts of the parallel specification (which were run sequentially — no need for actual parallelism during model generation) are shown on the left: their return values are formed into a list of values (following the arrows at the top of the diagram) that is passed to the continuation, while their models are put into a list with a Par constructor (following the arrows at the bottom of the diagram), and this is prepended to the sequence from future models.

Example

We can now given an example of the model generated for a program with parallel composition. This example is particularly simple, as I don’t want to use features that I haven’t yet explained. Here is a plain CHP program, that synchronises on various barriers in parallel:

import Control.Concurrent.CHP

syncOn :: [String] -> CHP ()
syncOn = mapM_ syncOnIndiv
  where
    syncOnIndiv :: String -> CHP ()
    syncOnIndiv name = do b <- newBarrierWithLabel name
                          enroll b syncBarrier

p :: CHP ()
p = runParallel_ [syncOn ["a", "b"], syncOn ["c", "d"], syncOn ["e"]]

main :: IO ()
main = runCHP_ p

This program will compile, and can be executed normally. To get a model of this program, rather than executing it, we only need change the top and bottom lines; we import a different module, and change the outermost-call to runCHP_ to specify:

import Control.Concurrent.CHPSpec

... processes as before

main :: IO ()
main = specify True p >>= putStrLn

The program will again compile (against a different library), but this time when it is executed it will output the following model, ready to be read into the FDR model-checker or another formal tool for CSP:

channel a
channel b
channel c
channel d
channel e
main_1=
  (((a -> b -> SKIP) ||| ((c -> d -> SKIP) ||| ((e -> SKIP)))))
main = main_0

In the next part of the guide, we will examine exactly how these models are printed — and after that we will cover model generation for CHP’s other primitives.

Categories: Uncategorized Tags: ,

Removing mtl dependency

April 9, 2010 13 comments

In Haskell, there are a variety of monad transformer libraries. The most popular is the original library, mtl, which is in the Haskell platform. Many people mutter complaints about mtl, including poor design, use of functional dependencies (which are somewhat out of favour these days) and inefficient implementation. Several replacements have been developed — direct replacements such as the transformers and mmtl libraries, and also packages like monadLib.

This issue arose again recently in a reddit discussion, with people wondering when mtl is going to be replaced. CHP used to make a lot of (internal) use of monad transformers from the mtl package. I’ve recently refactored the CHP monad, which removed a large part of that dependency: gone are ErrorT and ReaderT, in comes a continuation-passing style. If I want to, I’m now able to make the next version of CHP not depend on mtl (nor on any other monad transformer library). I’m not sure if this is worth doing — does it actually make anyone’s life easier using CHP if it doesn’t depend on mtl?

The only external change that would result from giving up mtl in the chp package is the MonadIO CHP instance. MonadIO is a useful type-class in mtl for any monad (like CHP) that sits on top of IO. I would stop having an instance of MonadIO CHP in the CHP library, and would instead provide a liftIO_CHP :: IO a -> CHP a function for the same purpose, which could then be used in a MonadIO instance for CHP in your own package (using MonadIO from mtl, or from transformers, or whatever your favourite monad transformer library is). There was a recent proposal to split MonadIO from mtl because it’s so useful (and can be stand-alone) but that hasn’t happened yet. The change from liftIO to the function liftIO_CHP would actually be slightly beneficial for me (due to some other work going on) but could be a pain to users who don’t care about mtl/transformers and who don’t want to have to alter their existing code.

If you are bothered either way — keeping mtl and the MonadIO instance in CHP versus removing them — speak up.

Categories: Uncategorized

Tock — A Compiler for Concurrent Languages, written in Haskell

April 7, 2010 2 comments

Occam and Tock

The CSP calculus has inspired quite a few languages and frameworks; CHP is just one of them. Go is a recent example, but probably the oldest example is the occam programming language. Occam was developed for the Transputer processor, which has since died out. The efforts of research groups at Kent and Southampton made sure that occam survived the death of its original host platform. Today, a much-modified version of the original compiler, now named KRoC, will compile and run occam code on x86 Linux and various other platforms. Its basis, the original compiler, was written around twenty-five years ago. In C — eugh. Nowadays there are better languages to write a compiler in — such as Haskell.

My colleague Adam Sampson began playing with writing an occam compiler in Haskell using modern technologies like nanopass compilation (where the compiler consists of a large number of small and simple passes), Parsec (for, um, parsing) and generic programming (for performing automatic descents of the abstract syntax tree during the passes). At some point I joined in, and between us we wrote Tock, a compiler for occam and other concurrent languages, written in around twenty-thousand lines of Haskell.

Parsing

Occam is a difficult language to compile, for several reasons. Its grammar has ambiguities that can only be resolved with semantic knowledge. For example, c ? x is either an input from channel c into variable x, or it is an input from channel c that must match the tag (think Haskell data-type constructor) x. It is not clear which it is until you have determined what x is, which typically happens after parsing. This is exactly the sort of problem that would occur if Haskell did not require data constructors to start upper-case and variables to start lower-case (without this restriction, what would x be in “let x = y”: variable or constructor?).

Occam is layout-based, using 2-space indents. This requires some pre-processing to insert indent/outdent markers. We use alex to tokenise and insert the markers, then use Parsec-2 to parse a stream of tokens into an abstract syntax tree (AST) after running a pre-processor stage on the token stream. To help resolve the ambiguities, we accumulate a symbol table in the parser, which is not nice — we would like to iron that out.

Generic Programming

We started out using Scrap Your Boilerplate (SYB) as our generic programming method for implementing traversals of the AST in our nanopasses. The advantage of using generic programming for traversals is that you can write the code you’re interested in (e.g. apply this check to all parallel constructs in my AST) without writing the code you’re not interested in (if you find an if statement, here’s how to descend into it to find the parallel constructs). SYB turned out to be too slow for us and we ended up constructing our own generic programming framework, Alloy. It’s available on Hackage, and there is a paper (and a talk) if you’re interested.

Safety Checks

Occam permits some powerful static safety checks. In every framework based on occam’s principles (including CHP) it is not possible to statically ensure that one-to-one channels are not going to be used wrongly in parallel (e.g. that the writing end will not be written to by two processes in parallel). The occam compiler does check this.

The algorithm is actually quite simple to implement in the AST. Wherever you find a PAR construct (which runs its branches in parallel), descend into all the branches and make a set of pairs of channel name and direction (input or output) that are used or passed as parameters to processes. Check all the sets against each other, and if any pair is present in more than one set, issue an error. With generic programming, making the set of pairs is just a generic query, and the whole thing is fairly straightforward. A similar check is performed to ensure that variables obey the Concurrent-Read Exclusive-Write (CREW) rule: that in parallel you either have no-one accessing a variable, only readers, or exactly one writer.

Another safety check involves arrays. Occam permits arrays of variables and arrays of channels, and performs safety checks on both. These checks are particularly tricky when combined with replicated parallel constructs. These are like parallel for loops; you can write PAR i = 0 FOR 10, and the code beneath will be replicated 10 times in parallel with indexes 0 through 9. The old occam compiler statically checked this by only allowing statically-known replication counts, and using brute force to expand out all the code and perform lots of known-constant checks on the array indices. Tock uses the Omega test (a form of Presburger arithmetic) to perform the same checks for dynamic replication counts. I spent a while pouring over the Omega Test paper to understand it (and produced some slides on it to help others understand it) and then implemented it in the compiler. I am still impressed by what it can do. Take this code, for example:

PROC p()
  [100]INT a:
  PAR i = 0 FOR 10
    SEQ
      a[i * 10] := 0
      SEQ j = 1 FOR 10
        a[(i*10) + j] := j
:

This code declares an array of 100 INTs. The PAR i = 0 FOR 10 line runs the code below it 10 times in parallel, each with a different value of i (0 through 9, inclusive). This code is initialising an array of 100 items, by running these 10 parts in parallel, each of which initialises the beginning of a strip (indices 0, 10, 20, etc) to zero, and the rest of the strip (e.g. indices 1 through 9, 11 through 19) to be 1 to 9. In the above code, there is an error arising from incorrect indexing. The tock compiler spots this and gives the following error:

unsafe2.occ:
1:   PROC p()
2:     [100]INT a:
3:     PAR i = 0 FOR 10
Here-------^
4:       SEQ
5:         a[i * 10] := 0
6:         SEQ j = 1 FOR 10
7:           a[(i*10) + j] := j
8:   :

tock: Error: unsafe2.occ:3:7 Indexes of array "a" ("i * 10 + j" and "i * 10")
  could overlap when: 1 <= i' <= 9 , j = 10 , 1 <= j' <= 10

That error message could perhaps do with being more helpful, but the important part is the most specific: when j is equal to 10, there can be an overlap between array indices being written to in parallel. The problem is that j shouldn’t be able to reach 10: the SEQ j = 1 FOR 10 line should be SEQ j = 1 FOR 9. If you are doing parallel imperative programming with arrays, this is a very useful error for the compiler to be able to spot. Once the line is corrected, tock can tell that it is safe, and will compile the file.

Backends

Tock is really a source-to-source compiler. Rather than producing native code, it produces C code (using the CCSP library) or C++ code (using the C++CSP library). This means that the code is as portable as the concurrency libraries themselves (almost every platform has a C/C++ compiler), and means that we get the C compiler to do all the optimisation of straight-line code for us, rather than writing that ourselves. I actually started work on a CHP backend for Tock, which I guess is fairly silly: compiling a low-level imperative language into a high-level functional language is not typically a good idea, but it was interesting. It involved lots of do blocks and let clauses to directly implement the imperative aspects in a sort of single-static assignment form crossed with a hint of continuation-passing. I’d like to finish it one day — but it’s a lot of work for no particular gain.

State of Play and Build System

Tock is just about feature complete. It lives in a public darcs repository if you are interested in it. The build system uses autoconf and automake, which is unusual for a Haskell project — most Haskell programs use cabal. There are several things that Tock needs that I don’t think cabal provides by itself:

  • Two-stage compilation; due to our generics library, we must build one Haskell program and run it in order to generate more Haskell source that is then compiled alongside more source files to form Tock.
  • C Tests; since we compile C using Tock, we must find out details about the C compiler (e.g. size of void* in C, or accepted flags) to give to Tock during the build process.
  • Program availability; there are certain programs we require (e.g. gcc, alex), so we need to test for them during configuration.

That list is shorter than last time I looked at converting Tock to cabal, so perhaps it may now be possible to switch to cabal — I gather we might be able to keep our autoconf script for the tricky parts and use cabal for the rest. Any links on doing so would be appreciated! If we can get Tock cabalised, we can get it on Hackage, which would make it easier for people to download and use.

Almost all of our dependencies are in the Haskell Platform, so it would be nice to be able to say: install the Haskell Platform, install a concurrent run-time (e.g. CCSP), then issue “cabal install tock” and you’re done. What would also be good would be to get people working on the compiler. For example, Adam and I have wondered if we could add a Go frontend to our compiler, and thus bring the safety checks to bear on the Go language (and potentially to compile it down to our fast concurrency libraries). Unfortunately, neither Adam nor I have had time to work on Tock for a while now, but it would be a shame to let it stagnate.

Categories: Uncategorized

Inferring Process Behaviour from Type

March 30, 2010 Leave a comment

One of Haskell’s big selling points is its powerful type system which, among many other things, allows for easy type-parameterisation of functions. For example, the identity function has type a -> a, meaning that it will take something of any type a and return a value of the same type. Think about this function for a moment — it promises to return something of type a, no matter what type that is. The only way it can do so is to return the value given to it — that’s the only thing available that’s guaranteed to be of the right type. So in fact the only function that satisfies the type a -> a is the identity function. (Consider everything in this blog post to have the disclaimer: …as long as no bottom values are involved.)

CHP Process Types

We can extend this type of reasoning to CHP processes. Let’s start with the process type: a -> Chanout a -> CHP (). A process of this type has an output channel carrying type a, and the only value it has available to it of that type is the one given as the first parameter. So all values sent out on the channel must be the value passed as the first parameter. That’s useful to know. Reasonable behaviours of the process might be to do nothing (semi-reasonable, anyway: const (const (return ()))), write the value once (writeValue) or write it repeatedly (repeat).

We can also consider processes with types such as Chanout b -> CHP () or Chanin a -> Chanout b -> CHP (). We can tell that a process with either of these types will never generate any output, because it cannot form any values of the right type. This is a bit like the function with type a -> b in Haskell, which cannot be properly implemented.

Another process we can consider is one with type (a -> b) -> Chanin a -> Chanout b -> CHP (). In this case, any values sent out on the output channel must be values that were read from the input channel that have been transformed by the given function. The process may re-order, drop or delay values that are sent in — for example, it may send on only every third value. Since the process also has access to the IO monad, it may also drop values randomly. But we know that its behaviour cannot be affected by the values being passed through, because the process has no means by which to inspect values of type a or b. Contrast this with a process that has type (a -> Bool) -> Chanin a -> Chanout a -> CHP(); there the process has a function that it can use to make a decision about the given value.

There is another useful fact we can deduce about the process with type (a -> b) -> Chanin a -> Chanout b -> CHP (). If we never send in any values, it won’t have any to output — so we know this process cannot produce any output before it has first read some input. That’s useful information, and we know it just by looking at the type of the process. This reasoning extends to any process with a type like Chanin a -> Chanout a -> CHP (); such processes must consume input before they can produce any output. Similarly, the process with type Chanin a -> Chanin b -> Chanout (a, b) -> CHP () must input at least once from each of its input channels before it is able to output.

Free Theorems

One well-known bit of work along these lines is Wadler’s Free Theorems paper (bottom of the page). The paper uses these ideas to automatically form theorems based on functions. For example, you can deduce that for any two functions f :: [a] -> a and g :: a -> a, it must be the case that f . map g = g . f. Intuitively, the result of f must be one item from the input list, so whether you apply g to that item afterwards, or apply g to all the list elements beforehand, you must get the same answer. Other examples abound in the paper.

So can we apply similar reasoning to processes? With processes, we have two considerations: the values being passed through the processes, and the communication behaviour. Given two processes p, q :: Chanin a -> Chanout a -> CHP (), what can we say about them? We might wonder if composing them in either order (p <=> q vs q <=> p) gives the same results. This is not true, for several reasons.

Let’s start with values. Imagine that p replicates each value three times, and q sends on every third value. The composition p <=> q will act like the identity process; of every three repeats that p sends on, q will drop all but one. But the composition q <=> p will act differently; q will send on every third value, and p will then repeat it three times. So the same number of values will emerge (perhaps this, at least, is always true?), but they will be different.

The communication behaviour can also be different. Consider if p drops all the values it receives without forwarding them, while q refuses to do anything. The composition p <=> q will act as p does, silently dropping all values received — but it will at least accept all the values it is given. The composition q <=> p will refuse to read any values, so attempting to send a value to it will result in deadlock. It seems that attempts to form theorems about the composition of processes does not work as well as it does for functions.

All this reasoning collapses as soon as the processes have more specific values. The process with type Chanin Int -> Chanout Int -> CHP () might do anything, because it can construct Int values itself. It’s generally good practice in Haskell to keep the types as general as possible, and this type-based reasoning is one reason why that’s a good idea.

Categories: Uncategorized

Abstractions for Message-Passing

March 28, 2010 2 comments

I’ve been writing a couple of papers recently based on CHP. The latest is (provisionally) titled “Abstractions for Message-Passing” (naming suggestions are welcome!). It’s a combination of general CHP, process wiring, process composition and behaviours. I’m going to try submitting it to ICFP — I hope it’s a good enough fit, being the application of FP-like ideas to message-passing within an FP language. I’m posting it here, partly so you can read it if interested (it effectively provides a summary of a lot of my recent work in CHP), and partly in the hope that I might get some constructive feedback (small or big). The deadline is this Friday afternoon (2nd April), which I realise is now quite close, but in case anyone has time to read it this week I’d welcome any comments — for example, I can’t help but feel I’m missing more related work. Link to the current draft is here.

Categories: Uncategorized

Criterion’s Confidence Intervals: The Bootstrap

March 24, 2010 4 comments

I’ve been doing some benchmarking with Criterion recently, and wanted to write up the results. However, I didn’t really understand how it was arriving at some of the statistics that it spits out, particularly the confidence intervals. One book on the subject later, and I think I’ve got it. I thought I’d write it down in case anyone else needed to know. (I’m not a statistics expert though, so please do send corrections if necessary.) The mean result from Criterion is straightforward; the benchmark is run (typically 100 times) and the mean is calculated in the standard way (total time divided by count). But Criterion also gives 95% confidence intervals for the mean, and these require some more detailed explanation.

Confidence Intervals

Confidence intervals are a notoriously tricky part of statistics. A confidence interval of 95% means that if you run many separate experiments, and calculate a confidence interval for each of them, 95% of those intervals will contain the true mean. The kicker being that you have no way of knowing whether the interval you have for your one experiment is one of the 95% that contain the mean, or one of the 5% that don’t.

One way to calculate a confidence interval is to make an assumption about the distribution of the data. It is important to realise that this is an assumption. For example, you might be tempted to assume the times were normally distributed — but do you really believe that your times come from a perfectly symmetric bell-curve? Run times are often skew, distributed more exponentially, or simply too noisy to pick a distribution that could have generated them. (You can perform a statistical test for the validity of the normality assumption in your sample, if you’re interested.) If you did pick a distribution and calculate the parameters — e.g. the variance of a normal distribution — you could say where 95% of the values are likely to fall because you know what shape the distribution is. But if you don’t know the shape of the real distribution (from which your noisy sample was drawn), you need to use a different technique; Criterion uses the bootstrap.

The Bootstrap

Imagine for a moment that rather than just one mean, you instead had 1000 means. A straightforward way to form a 95% confidence interval would be to arrange your sample times in order, and pick the 2.5% mark (roughly the 25th highest mean) and the 97.5% mark (roughly the 975th highest mean). This method makes no new assumptions, and would derive the interval just from the data. Of course, you don’t have the time to wait around and record enough samples to form 1000 means. (Some of my benchmarks took in the order of 20 minutes per benchmark to collect the 100 values for a single mean; for 1000 means I would have to wait a long time!)

The bootstrap technique generates a lot of means, by “faking” a lot more samples than you actually recorded and taking the mean of each of them. It takes your single sample of 100, and generates many more samples (in Criterion, 100000 by default) by randomly picking 100 values out. Obviously, if you picked 100 values from a set of 100 without replacement you’d always end up with the starting set; the bootstrap picks with replacement, so it is very likely to feature repeated original values in the new fake sample, and also to omit some of the original values in each new fake. The confidence interval can be formed from these fake samples as described in the previous paragraph.

That is the intuition of the bootstrap method of calculating confidence intervals. In fact, Criterion uses bootstrap with bias correction and acceleration. These two extra measures affect how the confidence intervals are calculated by attempting to adjust for skew and sensitivity in the data. Unfortunately the reference books I have available are not very illuminating on the intuition behind bias and acceleration — so you and I will have to remain in the dark, pending a better reference.

The Code

Often, programmers learn a system best by tinkering and seeing what happens. In that spirit, here is a little code that takes a sample and prints out the bootstrapped mean confidence intervals for it (as Criterion would derive them):

import Control.Monad.ST (runST)
import Data.Array.Vector (toU)
import Statistics.Sample (mean)
import Statistics.Resampling (resample)
import Statistics.Resampling.Bootstrap (bootstrapBCA)
import System.Random.MWC (create)

sample :: [Double]
sample = [1..10] ++ [20]

main :: IO ()
main = print . runST $
       do g <- create
          resamples <- resample g [mean] 100000 sampleU
          return $ bootstrapBCA 0.95 sampleU [mean] resamples
  where
    sampleU = toU sample

You’ll need the Hackage packages statistics, uvector and mwc-random. For the above, the output (to 2 d.p.) is:

[Estimate {estPoint = 6.82, estLowerBound = 4.64, estUpperBound = 11.00, ..}]

You can see that the outlier 20 pulls the upper bound of the confidence interval beyond all the other values. Fiddle with sample yourself and see what happens.

Categories: Uncategorized
Follow

Get every new post delivered to your Inbox.