Archive

Posts Tagged ‘pipeline’

Sticky Platelets in a Pipeline (Part 1)

December 21, 2009 2 comments

This post brings together concepts from several recent posts, including behaviours, conjunction and channel wiring. It is based on the work in the completed TUNA project, which has some stunning videos of blood clotting (and some images if you prefer):

I’m going to cut down the TUNA example greatly, to blog-post size. What we are going to be simulating is sticky objects in a pipeline, for example sticky blood platelets moving through a blood vessel. Our platelets will be represented as data that is passed between active “site” processes; our blood vessel will be a one-dimensional pipeline of site processes. Each process will be connected to its neighbours with a channel that can pass platelets. All the processes will be enrolled on the tick barrier, as is customary in CHP simulations.

We’ll begin the code with a helper function (one that I would like to see in the standard library) that iterates forever with a state value being passed through, and our platelet data type:

foreverFeed :: Monad m => (a -> m a) -> a -> m b
foreverFeed f x = f x >>= foreverFeed f

data Platelet = Platelet Float

The data in the platelet type is a colour value that I will use for visualisation in the next part of this guide. This is set randomly by the platelet generator at the beginning of the pipeline:

plateletGenerator :: Chanout (Maybe Platelet)
                  -> EnrolledBarrier
                  -> CHP ()
plateletGenerator out tick = forever $ on >> off >> off
  where
    on = do platelet <- Just . Platelet <$> liftIO (randomRIO (0, 0.5))
            offerAll [ once (writeChannel out platelet)
                     , endWhen (syncBarrier tick) ]
    off = offerAll [once (writeChannel out Nothing), endWhen (syncBarrier tick)]

The pipeline generates platelets regularly, one every three time-steps (this is coded as the simple on-off-off sequence). When it is performing an “on” time-step, it generates a platelet with a random shade, then uses behaviours to offer to once send the platelet until tick happens (i.e. the frame is over). The next site in the pipeline may not take the new platelet if the site is full and not moving this time-step, so the platelet may get discarded in that case. In the off state, the generator waits for the tick to end the frame, but also offers to tell the site ahead of it that the generator is empty (signified by sending Nothing rather than Just a platelet).

The main logic is in the site process, which also has two states, empty and full:

site :: Chanin (Maybe Platelet)
     -> Chanout (Maybe Platelet)
     -> EnrolledBarrier
     -> CHP ()
site prevIn nextOut tick = foreverFeed (maybe empty full) Nothing
  where
    empty :: CHP (Maybe Platelet)
    full :: Platelet -> CHP (Maybe Platelet)

Each time, if there is Just a platelet returned by the function, the next state will be full, otherwise it will be empty. The initial state is empty (the Nothing value). The empty state consists of three possible behaviours:

    empty = join . fst <$> offer
              (            (once $ readChannel prevIn)
               `alongside` (once $ writeChannel nextOut Nothing)
               `alongside` (endWhen $ syncBarrier tick)
              )

In an empty state, a site will read in a new platelet from the previous site in the pipeline (if available), it will offer to communicate to the next site in the pipeline that it is empty, and it will finish this behaviour when the tick event happens. The value returned is the result of reading from the channel, which will be Nothing if no read occurred or if we read in a Nothing value (and hence the site remains empty) or Just the result of the read if it did happen and was a platelet (in which case the site will become full). It is possible to reduce the amount of communications happening with empty processes, but I want to keep this example simple if I can.

The full state is as follows:

    full platelet
      = do r <- liftIO $ randomRIO (0, (1::Double))
           let
             move = readChannel prevIn <&> writeChannel nextOut (Just platelet)
             probablyMove = if r < 0.05 then stop else fst <$> move

           fromMaybe (Just platelet) . fst <$>
             (offer $
               once probablyMove
               `alongside` endWhen (syncBarrier tick)
             )

We will pick this code apart, bit by bit. It is primarily an offer between the tick to end the frame and another behaviour, called probablyMove. When the site is full, it has a 5% chance of refusing to do anything, meaning that a single platelet will not move in 5% of time-steps. So it starts by generating a random number between 0 and 1. If this is under 0.05 (i.e. a 5% chance), the probablyMove behaviour is stop, meaning it will not move — the site will just wait for the end of the frame in these 5% of cases.

In the other 95% of the time-steps, a move is offered, using conjunction. The site offers to read a value from the previous channel (which may be Just a platelet, or a Nothing value signifying the site was empty) and send on its current platelet, shuffling the platelets along the pipeline. So its overall behaviour is that it will send on its current platelet, if and only if: the previous site is empty, or the previous site is full and willing to send its platelet on (it won’t be willing 5% of the time). So a platelet can only move if there is no-one behind it, or if the platelet behind it moves too.

The implications of this behaviour are that once platelets are in adjoining cells, they only move on together. Thus any platelets that bump together form a notional clot that stays together forever after. This clot is not explicitly programmed and has no representation in the program. It is emergent behaviour that arises out of the local rules of the site process; each site only communicates with the site either side of it, and yet the program logic means that clots that are tens of platelets long could form, and would be unbreakable.

The other neat thing that arises out of the logic comes from the 5% chance. In 5% of time-steps a platelet will not move. (This is what allows the platelets to bump together in the first place.) Since a clot can only move when all its platelets move, a two-platelet clot has a roughly 10% chance of not moving (really: 1 – 0.95^2), and a three-platelet clot has about a 14% chance of not moving (1 – 0.95^3). So big clots will move slower, which means that the platelets behind become more likely to join on. Despite only having a local probability of not moving, we get the behaviour that larger clots are less likely to be able to move.

Enough on the site process; at the end of the pipeline is a platelet printer, that swallows up any platelets and prints out how large each clot was that reached the end of the pipeline:

plateletPrinter :: Chanin (Maybe Platelet) -> EnrolledBarrier -> CHP ()
plateletPrinter input tick
  = foreverFeed plateletPrinter' []
  where
    plateletPrinter' ps
      = do mp <- join . fst <$> offer (once (readChannel input)
                                       `alongside` endWhen (syncBarrier tick))
           let ps' = ps ++ [mp]
               (test, text) = maybe (isNothing, "Blank")
                                    (const (isJust, "Clot"))
                                    (head ps')
               (chunk, rest) = span test ps'
           if null chunk || null rest
             then return ps'
             else do let status = text ++ ", size: " ++ show (length chunk)
                     liftIO $ putStrLn status
                     return rest

And finally we must wire up all of this. Thankfully, our new connectable helper functions make this quite easy, and short:

main :: IO ()
main = runCHP_ $
         pipelineConnectCompleteT (enrollAll newBarrier)
           plateletGenerator (replicate numSites site) plateletPrinter
  where
    numSites = 100

If we compile and run this program, we get a print-out of clot sizes:

Blank, size: 103
Clot, size: 1
Blank, size: 51
Clot, size: 4
Blank, size: 2
Clot, size: 1

That is terribly unexciting, so I’m going to give a sneak video preview of a visualisation that I will go through in my next post. The 1D pipeline of sites is visualised left-to-right, with each tall bar being a platelet (or black for empty). When the bar flashes white, this is in the 5% of cases where the platelet is refusing to move. Hence you can see that when the larger clots form, the white flashes of the various platelets prevent the clot from moving:


Choose from WMV format (suitable for Windows) or MPEG format (suitable for Linux, etc).

Functions into processes, using arrows

September 22, 2009 8 comments

Pointfree notation is often the most elegant way to write a function in Haskell. Put simply, any time you write code such as:

foo x = f (g (h x))

Or, if you are a dollar fan:

foo x = f $ g $ h x

You can rewrite it as:

foo = f . g . h

Consider this example of a function composition from Neil Mitchell that finds the mode (most common element) of a list:

mostCommon :: Ord a => [a] -> a
mostCommon = head . maximumBy (comparing length) . group . sort

A nice composition of four functions. Each function in that pipeline of functions is taking a single input and producing a single output, which is then fed into the next function. This pipeline of pure functions is analogous to a pipeline of communicating processes — each taking a single input and sending on a single output to the next process. So is there an easy way of converting such function pipelines into process pipelines? The answer is yes — by using arrows.

arrow

Even if you are a Haskell programmer, you may not be familiar with arrows. They can be used to express these input and output compositions. We can convert our function pipeline to use arrow notation by just changing the composition operator:

mostCommonArr1 :: Ord a => [a] -> a
mostCommonArr1 = head <<< maximumBy (comparing length) <<< group <<< sort

This is because by good design/happy accident, a function does not need any special annotation to become part of an arrow. If we want to be more general, we must use the arr function to convert pure functions into arrows:

mostCommonArr2 :: Ord a => [a] -> a
mostCommonArr2 = arr head <<< arr (maximumBy (comparing length)) <<< arr group <<< arr sort

A bit more cumbersome perhaps, but we haven’t changed the original too much — the pipeline of the four functions is still visibly there. Now that we have our function pipeline expressed in terms of arrows, changing to a process pipeline is a relatively simple matter. You’ll need to import the Control.Concurrent.CHP.Arrow module, and then re-type the pipeline to be a CHP process with channels, and stick runPipeline on the front:

mostCommonArrProc :: Ord a => Chanin [a] -> Chanout a -> CHP ()
mostCommonArrProc = runPipeline $ arr head <<< arr (maximumBy (comparing length)) <<< arr group <<< arr sort

And that’s it. The function here is now a communicating pipeline of four functions, wrapped up into one. mostCommonArrProc will sit there waiting to be sent a list of items, and once it has been, it will output the most common element of the list. So we’ve re-used our simple pure-function pipeline as a pipeline of communicating processes, with only a little change in notation.

Note: Since base-4, Arrow has become based on Category, which means you can actually express the function using the original dot composition, like so:

mostCommonCatProc :: Ord a => Chanin [a] -> Chanout a -> CHP ()
mostCommonCatProc = runPipeline $ arr head . arr (maximumBy (comparing length)) . arr group . arr sort

I prefer the <<< notation of the arrows, as I think it better expresses a pipeline of processes — and it doesn’t require base-4.

Categories: Uncategorized Tags: ,
Follow

Get every new post delivered to your Inbox.