Haskell API Design: Having your own Monad atop IO

I was recently writing my PhD thesis (which is about CHP), and writing the conclusions got me thinking about weaknesses of CHP. (For newcomers: CHP is an imperative message-passing concurrency library for Haskell.) When you’ve been working on some software for several years, you want to believe it’s fantastic, wonderful, flawless stuff and that everyone should use it. But why might CHP not be the best thing ever? (Other suggestions are welcome, honest!) Some aspects of the API design might fit into this category, and in this post I’ll discuss one of the API choices behind CHP, namely: having a special CHP monad instead of using the IO monad that sits underneath. This may have wider relevance to other libraries that similarly build on top of IO.

I’m going to invert the flow of this post: later on I’ll explain why I created a CHP monad, but first I’ll discuss what problems it causes.

IO vs CHP

The CHP monad is a layer on top of the IO monad (think of it as a few monad transformers on top of IO). What difference does it make to the user to have the CHP monad instead of the IO monad that underlies it?

Lifting IO

CHP being a layer on top of IO means that you can “lift” IO actions into the CHP monad to use them in your CHP processes. But this cannot be done automatically: in Haskell, all actions in a monadic block must be in the same monad. So if you have an IO action like touchFile :: String -> IO () and the standard CHP function readChannel :: Chanin a -> CHP a, you cannot write:

do fileName <- readChannel fileNameChan
   touchFile fileName

The first action is CHP and the second is IO — Haskell can’t reconcile the two. There are two fixes for this. One is to use a package like monadIO, which defines several standard IO actions to have types like: touchFile :: MonadIO m => String -> m (), i.e. to be able to fit into any monad built on top of IO (some argue all IO-based functions should have a type like this). Then you can write the code above. But the monadIO package only supports a few core IO actions, and any time you use another IO-based library (like, say, network) you are stuck. The other fix is to use the liftIO_CHP :: IO a -> CHP a function that exists exactly for this purpose. That works, but it gets rather verbose:

do fileName <- readChannel fileNameChan
   liftIO_CHP $ touchFile fileName

If you have lots of IO and CHP actions interspersed, it clutters up the code and makes it less readable. What’s galling is that liftIO_CHP has no particular semantic effect; it’s only really needed (to my mind) to placate the type-checker.

This lifting problem is not specific to CHP, incidentally: it is a problem for all monad transformer libraries. Adding any monad transformer on top of IO requires lifting functions such as liftIO_CHP.
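To make the overhead concrete, here is a minimal sketch of the same situation with a hand-rolled layer over IO. The App type, logMsg and the touchFile stand-in are all hypothetical names invented for this illustration; this is not how CHP is defined, only an example of any reader-like layer needing an explicit lift.

```haskell
import Control.Monad.IO.Class (MonadIO (..))
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical layer over IO (NOT CHP's real definition): a reader that
-- carries a mutable log of the layer's own actions.
newtype App a = App { runApp :: IORef [String] -> IO a }

instance Functor App where
  fmap f (App g) = App (fmap f . g)

instance Applicative App where
  pure x = App (\_ -> pure x)
  App f <*> App g = App (\r -> f r <*> g r)

instance Monad App where
  App g >>= k = App (\r -> g r >>= \a -> runApp (k a) r)

instance MonadIO App where
  liftIO io = App (const io)

-- An action native to the layer.
logMsg :: String -> App ()
logMsg s = App (\r -> modifyIORef' r (s :))

-- A plain IO action, standing in for touchFile from the text.
touchFile :: String -> IO ()
touchFile name = putStrLn ("touch " ++ name)

process :: App ()
process = do
  logMsg "start"
  liftIO (touchFile "a.txt") -- dropping liftIO here is a type error
  logMsg "done"

-- Run the process and return the log in the order it was written.
runProcess :: IO [String]
runProcess = do
  ref <- newIORef []
  runApp process ref
  reverse <$> readIORef ref
```

The liftIO call changes nothing at run time; it only converts IO () into App () so that every line of the do block lives in the same monad.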

Just a little CHP

Imagine that you have written a program that doesn’t use CHP. For all the imperative I/O parts it probably uses the ubiquitous IO monad. Now you want to add some concurrency. If you use MVars, you just need to create them and use them — writing and reading with MVars is done in the IO monad. The STM library has its own monad, but this is a monad in which you write a small transaction and then execute it in the IO monad, so the rest of your program can happily remain in the IO monad. But if you want to add a little CHP, suddenly you need to be in the CHP monad! You can’t just use runCHP :: CHP a -> IO (Maybe a) at the points where you want to use CHP. Firstly, the library is not built to be used like that (and the API doesn’t support being used like that). Secondly, if you have to tag all the CHP actions in the IO monad with runCHP, you’re not much better off than you were when you had to tag all the IO actions in the CHP monad with liftIO_CHP.
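For comparison, the sketch below shows why MVars and STM slot into an existing IO program so easily. The names bump, worker and runWorkers are made up for this example. It imports the STM primitives from GHC.Conc in base; the stm package's Control.Concurrent.STM provides the same functions.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)

-- A small transaction written in the STM monad.
bump :: TVar Int -> STM ()
bump tv = readTVar tv >>= \n -> writeTVar tv (n + 1)

-- The worker itself stays in plain IO: the transaction is executed with
-- atomically, and the MVar operation is already an IO action.
worker :: TVar Int -> MVar () -> IO ()
worker counter done = do
  atomically (bump counter)
  putMVar done ()

-- Fork n workers, wait for each to signal completion, return the count.
runWorkers :: Int -> IO Int
runWorkers n = do
  counter <- newTVarIO 0
  done <- newEmptyMVar
  mapM_ (\_ -> forkIO (worker counter done)) [1 .. n]
  mapM_ (\_ -> takeMVar done) [1 .. n]
  readTVarIO counter
```

Nothing outside bump has to change type: the surrounding program remains in IO throughout.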

The “correct” way to add some CHP to your program is to adjust the whole program to be inside the CHP monad. You need to either change the types of your functions that used to be in the IO monad to be in the CHP monad (with varying amounts of liftIO_CHP) or you need to wrap them in CHP wrappers. That’s quite an overhead, especially if you would prefer to gradually introduce some CHP into your program. I don’t like to be forcing people to put their entire program inside CHP to get anywhere. (This reminds me a bit of emacs, where questions involving loading and quitting emacs to do other tasks often elicit the response: don’t quit emacs, just do everything from inside emacs.)

So what is the CHP monad for?

So far we’ve explored all the problems of the CHP monad, which may have convinced you that it was a bad design decision. Now I want to explain what the CHP monad does (over and above IO) and where it might be difficult to replace CHP with plain IO.

Poison

CHP has the notion of poison. Channels can be set into a poisoned state, and any attempt to use them thereafter results in a poison exception being thrown. The CHP monad incorporates this using a very standard error-monad(-transformer) mechanism. This could easily be implemented using the standard Haskell exception mechanisms that exist in the IO monad. I guess in general, an error monad transformer on top of IO is somewhat redundant.
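As a rough sketch of how poison might look with ordinary IO exceptions, consider the following. The Poison exception, the Cell type (a stand-in for a channel end) and runGuarded are all assumptions made for illustration; none of them are part of the CHP API.

```haskell
import Control.Exception (Exception, catch, throwIO)
import Control.Monad (when)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical poison exception, thrown with the standard IO mechanism.
data Poison = Poison deriving Show

instance Exception Poison

-- Hypothetical stand-in for a channel: a value plus a poison flag.
data Cell a = Cell { cellValue :: IORef a, cellPoisoned :: IORef Bool }

newCell :: a -> IO (Cell a)
newCell x = Cell <$> newIORef x <*> newIORef False

poisonCell :: Cell a -> IO ()
poisonCell c = writeIORef (cellPoisoned c) True

-- Any use of a poisoned cell throws Poison.
readCell :: Cell a -> IO a
readCell c = do
  bad <- readIORef (cellPoisoned c)
  when bad (throwIO Poison)
  readIORef (cellValue c)

-- Run a block, turning uncaught poison into Nothing.
runGuarded :: IO a -> IO (Maybe a)
runGuarded p = (Just <$> p) `catch` \Poison -> pure Nothing
```

Here runGuarded plays the role that a Maybe-returning run function plays for an error monad: a block that hits poison yields Nothing rather than a result.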

Traces

A nice feature of CHP is that tracing support is built in, and can be turned on or off at run-time. This used to be done using a state-monad(-transformer), but the problem with implementing it that way is that if a process deadlocks, the state is lost. Since the main time you want the trace is when a process deadlocks, this was quite a limitation! So now it works by having a mutable (IORef/TVar) variable for recording the trace, with a reader-monad(-transformer) containing the address of that mutable variable along with an identifier for the current process.

At first, I thought that to replicate this functionality I would need some sort of thread-local storage, which is discussed a little on an old Haskell wiki page. Now thinking about it, I could hold information on whether tracing is turned on in some sort of singleton variable (that is initialised once near the beginning of the program and read-only thereafter), and use ThreadId in lieu of a process identifier. Process identifiers in CHP at the moment have information on their ancestry, but I could easily record (during the runParallel function) a map from ThreadId to ancestry information in the trace.
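The singleton-variable idea could be sketched roughly as follows, using the common unsafePerformIO/NOINLINE idiom for a top-level IORef. All the names here (tracingEnabled, traceLog, recordEvent and so on) are hypothetical, and the ancestry map is left out; this only shows the flag plus ThreadId-tagged events.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Monad (when)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Global flag: meant to be set once near program start, then only read.
{-# NOINLINE tracingEnabled #-}
tracingEnabled :: IORef Bool
tracingEnabled = unsafePerformIO (newIORef False)

-- Global trace, most recent event first. Because it lives in a mutable
-- variable rather than monadic state, it survives a deadlocked process.
{-# NOINLINE traceLog #-}
traceLog :: IORef [(ThreadId, String)]
traceLog = unsafePerformIO (newIORef [])

enableTracing :: IO ()
enableTracing = writeIORef tracingEnabled True

-- Record an event tagged with the calling thread, if tracing is on.
recordEvent :: String -> IO ()
recordEvent label = do
  enabled <- readIORef tracingEnabled
  when enabled $ do
    tid <- myThreadId
    atomicModifyIORef' traceLog (\evs -> ((tid, label) : evs, ()))

-- Read the trace in the order the events were recorded.
readTrace :: IO [(ThreadId, String)]
readTrace = reverse <$> readIORef traceLog
```

With this arrangement no reader layer is needed: any IO action can call recordEvent directly.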

Choice actions

The other functionality that the CHP monad supports is choosing between the leading action of two code blocks. That is, this code:

(readChannel c >>= writeChannel x) <|> (readChannel d >>= writeChannel y)

chooses between reading from channels “c” and “d”. Once a value arrives on one of those channels, the choice is made for definite. After that, the value is sent on either the channel x or the channel y respectively. This is slightly unusual for Haskell, but I’ve found it helps to make writing CHP programs a lot easier. (I discussed this in more detail in the appendix of a recent draft paper.) There is no way to replicate the same functionality in the IO monad.

One alternative to this style of API is to use a style more like CML. In fact I’ve previously released a “sync” library that is a cut-down version of CHP with a CML-style API in the IO monad. So something similar to that may be acceptable; one possibility might be:

choose :: [forall a. (Event a, a -> IO b)] -> IO b

(Hoping I have my forall in the right place there! Is that technically an ImpredicativeType?) Which would mean I could write the above as:

choose [(readChannel c, writeChannel x), (readChannel d, writeChannel y)]

This could be done without the nasty types if I use a special bind-like operator (the CML library calls this exact same function wrap):

(|>=) :: Event a -> (a -> IO b) -> Event b

That, along with sync :: Event a -> IO a would allow me to write something like:

sync $ (readChannel c |>= writeChannel x) <|> (readChannel d |>= writeChannel y)

That might not be too terrible, although a different combination of symbols might suit my bind-like operator better.
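To check that these types fit together, here is a toy model of the event style built on STM. It is an assumption-laden sketch, not the sync library or CML: events are modelled as STM transactions, choice is STM's left-biased orElse (named orEvent here rather than <|> to avoid writing an Alternative instance), and the one-slot Slot type is an asynchronous stand-in for a channel rather than a synchronous CHP channel.

```haskell
import Control.Monad (join)
import GHC.Conc (STM, TVar, atomically, newTVarIO, orElse, readTVar, retry, writeTVar)

-- Toy event: a transaction that commits to one alternative and returns
-- the IO continuation to run after the commit.
newtype Event a = Event (STM (IO a))

-- The bind-like operator from the text (CML's wrap).
(|>=) :: Event a -> (a -> IO b) -> Event b
Event e |>= k = Event (fmap (>>= k) e)

-- Choice between two events; the left one wins if both are ready.
orEvent :: Event a -> Event a -> Event a
orEvent (Event l) (Event r) = Event (l `orElse` r)

-- Commit to an event, then run its continuation in IO.
sync :: Event a -> IO a
sync (Event e) = join (atomically e)

-- Hypothetical one-slot mailbox standing in for a channel.
type Slot a = TVar (Maybe a)

-- Ready only when the slot is full; taking the value empties it.
takeSlot :: Slot a -> Event a
takeSlot tv = Event $ do
  m <- readTVar tv
  case m of
    Nothing -> retry
    Just x -> writeTVar tv Nothing >> pure (pure x)

-- Only slot d holds a value, so the second branch is chosen.
demo :: IO Int
demo = do
  c <- newTVarIO Nothing
  d <- newTVarIO (Just 7)
  sync ((takeSlot c |>= \x -> pure (x + 1)) `orEvent` (takeSlot d |>= \x -> pure (x * 2)))
```

The point of the model is that the leading action of each branch decides the choice inside one transaction, while the code attached with |>= runs afterwards as ordinary IO.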

Summary

The more I think about (and write about) this issue, the more I begin to think that I would be better off removing the CHP monad. Slipping back to the IO monad would allow easier integration with existing Haskell code, and hence easier uptake of the library. But that’s quite a shift in the API — almost every single function in the CHP library would have to change its type. Is that a jump that can be made by shifting from CHP 2.x to 3.x, or is it perhaps better to start again with a new library and build from there? (This brings to mind several recent discussions on haskell-cafe and the resulting wiki page on whether to rename/re-version — although CHP doesn’t have as many users as FGL, so far as I know!) There may actually be a relatively easy way to transition with such a major change; defining:

type CHP a = IO a
liftIO_CHP = id
runCHP p = (Just <$> p) `onPoisonTrap` return Nothing

should probably allow most existing code to keep working without modification, except for choice.

All opinions are welcome on whether to dump the CHP monad for IO, especially if you use the library.

  1. Robert Lee
    June 28, 2010 at 5:09 pm

    I use CHP in a small port knocker, and it works well. I isolate IO functionality in separate CHP processes so I don’t have a problem with the present library’s occasional IO lifting. However, I began writing the port knocker with CHP in mind. I adapted to a CHP idiomatic way of thinking from the start, which is brilliant for server code. In a sense the present CHP idiom renders IO a second class citizen, which isn’t really a bad thing for my purposes.

    For me, CHP style programming has too much potential to ignore in interactive real world code. Making CHP play nice with IO might make it more attractive to those who just want to dip their toes into the water, but I don’t need it.

    • June 29, 2010 at 8:31 am

      Thanks for the information, Robert.

  2. Robert Lee
    June 28, 2010 at 5:45 pm

    Another concurrent thought:

    What about the possibility of creating concurrent CHP domains like:

    main = do
      domChan0 <- domCHP blah blah blah
      domChan1 <- domCHP blah blah blah
      ioStuff <- IO stuff
      reply <- domChanTalk domChans ioStuff
      domChan2 <- domCHP blah blah blah
      domChan3 <- domCHP blah blah blah
      domChanx <- domCHP blah blah blah

    Where domChans have a set of CHP domain functions that are available to communicate between plain IO and the concurrent CHP domains. I don't really have this very well thought out, just a glimmer of an idea.

  3. June 28, 2010 at 9:37 pm

    Hi Neil,

    I don’t know enough about your library to know whether this is madness or not, but did you consider and rule out writing CHP as a mini-interpreter? E.g. all CHP functions actually build an expression containing function values (perhaps pure and IO) which itself is pure, but can then be executed by some top level IO function?

    Again I don’t know enough, but I think this would provide a means to work around some of the monad transformer related issues and allow some nice side benefits such as greater introspection and being able to build different back ends. There would be some performance cost I expect, but (finger in air) it doesn’t feel too excessive.

    Anyway, it would be interesting to know the many ways in which I’m wrong 🙂

    Cheers,
    Sam

    • June 29, 2010 at 8:33 am

      I guess the issue is that I was wondering about moving closer to standard idiomatic Haskell (by using IO instead of CHP) to allow better integration with existing Haskell code. Your suggestion is interesting, but is heading in the opposite direction, more towards CHP as an EDSL or separate language than as a Haskell library.

  4. June 29, 2010 at 6:03 am

    I was under the impression that if CHP had a MonadIO instance, you could simply use liftIO.

    • June 29, 2010 at 8:36 am

      CHP doesn’t have a MonadIO instance in the library itself, as mtl, transformers etc all have their own MonadIO class. See this post — I ended up moving the MonadIO instances out to chp-mtl, etc. But whether the lifting function is liftIO_CHP or liftIO, it’s still an irritating overhead.

      • June 29, 2010 at 8:58 am

        Ah, I see. They should hurry up and move MonadIO to base 🙂

        With regard to your whole post, one thing to think about when moving from CHP to IO is the fact that the CHP monad may enforce invariants that can be broken when you’re in the wild-west of IO. As far as I can tell, this is not relevant for CHP, but we can look at another monad like Orc, where a logic+concurrency piece of functionality is built on top of a concurrency abstraction HIO. In this case, you are expected to use the concurrency primitives that Orc provides and, though Orc has a MonadIO instance, you are not supposed to start manually forking threads.

        In the case where there is interesting code to be written inside of the monad using the domain specific language, I don’t mind having to use liftIO; in particular, I can just move my sequences of IO into their own functions and then lift the entire thing.

  5. June 29, 2010 at 9:09 am

    Edward,

    I agree that MonadIO should be in base. You’re right that if I did move the whole of CHP to IO, I would probably have to advise users not to use forkIO and to use my own variant instead. But I don’t think that’s terminal. Orc does do more book-keeping behind the scenes than CHP with its thread-groups and so on (and support for killing threads). The main motivation behind all this was writing an IO-heavy CHP example recently and getting frustrated when every alternate line was an IO action, and thus had to be lifted — but I couldn’t lift the whole block because the other alternate lines were CHP actions.
