Archive

Archive for the ‘Testing’ Category

Concurrent Testing and Tracing: Useful Output for Test Failures

November 2, 2009 1 comment

Programmers write testcases for programs to check that the programs work correctly. When a test-case fails, a programmer wants to know two things: first, why specifically did the test fail (which assertion did not hold?), and secondly, what is wrong with the program (or the test case!)?

Last Friday I released CHP 1.5.0, which adds better support for testing, tracing, and the combination of the two. Tracing refers in theoretical terms to recording a computation’s history: practically, it means keeping a log of all the CHP synchronisation events (e.g. channel communications) that happened in your program. CHP supports several types of tracing, but for this post I will use the simplest type: a CSP trace, which is just a linear sequence of events that occurred.

Let’s explore the combination of testing and tracing in practice. Previously I used an example involving a four-item pipeline to find the mode (most common element) of a list. Let’s use this example again, but since this is a post about testing, we’ll introduce an error: we will forget to include the sort process at the front of the pipeline.

To use the tracing framework, you have to augment your code slightly to label parts of your code. This involves using arrLabel instead of arr, and *<<<* instead of <<< (see the remarks at the end of this post for why, and for a way to shorten the tedious labelling):

badMode = (arrLabel "group" group) *>>>* (arrLabel "maxByLength" maxByLength)
            *>>>* (arrLabel "head" head)
  where
    maxByLength = maximumBy (comparing length)

We can then join this pipeline onto one that takes a string, and splits it into words — making a pipeline that looks for the most frequent word:

mostFreqWord = (arrLabel "makeLower" makeLower) *>>>* (arrLabel "words" words) *>>>* badMode
  where
    makeLower = map toLower

Finally, we can provide a simple test sentence for the pipeline that should pick out the most frequent word, “the”:

testFreqWord = testCHPInOut (const (== "the")) (runPipelineLabel mostFreqWord)
  "The quick brown fox jumps over the lazy dog"

main :: IO ()
main = runTestTT testFreqWord >> return ()

If we run this output, what do we get? Well, the test case fails, so we are told that, but we are also automatically given the trace of the failing test case:

### Failure:                              
testCHP Failure; trace: < _c4,
  makeLower->words."the quick brown fox jumps over the lazy dog",
  words->group.["the","quick","brown","fox","jumps","over","the","lazy","dog"],
  group->maxByLength.[["the"],["quick"],["brown"],["fox"],["jumps"],["over"],["the"],["lazy"],["dog"]],
  maxByLength->head.["dog"],
  _c5 >
Cases: 1  Tried: 1  Errors: 0  Failures: 1

So how do we read this trace? The trace is delimited by angle brackets, and is a comma-separated sequence of events (I’ll look to tighten up the grammar in future to allow for automated parsing). The _c4 is an unnamed event at the start (this is the channel used in the test framework to inject the data). Then the following events are channel communications of the form channel-name.data-value. So makeLower->words is the channel name, and "the quick brown fox jumps over the lazy dog" is the data value. The channel names have been automatically generated based on the two processes that the channel connects, and the data value has been printed out using show. We can use this trace to immediately see which values were passed between each process pair, and thus track down the problem. You can see from looking at the trace that group is not grouping the two “the”s together, which is because the list is not sorted. We can easily fix this by putting in the missing sort process. The trace with sort added now works as we would expect:

< _c5,
  makeLower->words."the quick brown fox jumps over the lazy dog",
  words->sort.["the","quick","brown","fox","jumps","over","the","lazy","dog"],
  sort->group.["brown","dog","fox","jumps","lazy","over","quick","the","the"],
  group->maxByLength.[["brown"],["dog"],["fox"],["jumps"],["lazy"],["over"],["quick"],["the","the"]],
  maxByLength->head.["the","the"],
  _c6 >

Tracing is useful for tracking down problems and strange behaviours in your system, so integrating it with the test framework made a lot of sense. You can trace any program, not just test-cases, but I will explore that at a future date. You may observe that tracing (at least a CSP trace) is not too far different from adding print (or Debug.Trace.trace) statements around your program. The main differences are:

  • Channels are named when you join together processes, so you don’t have to modify the definition of the original processes to get intelligible trace output (unless you want to trace the internal channels).
  • Tracing can be turned on or off at run-time (by changing the top-level call to runCHP), rather than at compile-time with pre-processor macros — but if tracing is switched off, you do not incur any overhead.
  • The order in the trace is guaranteed to reflect the real ordering of the channel communications, so you can’t end up with an unrepresentative order that usually happens when you put print statements in a concurrent program.

Remarks: There are several side notes to add to this post. Many readers will probably recoil at my introduction of a *>>>* combinator. This has type:

(*>>>*) :: Show b => ProcessPipelineLabel a b -> ProcessPipelineLabel b c
  -> ProcessPipelineLabel a c

The show constraint is the reason that I could not declare an arrow instance — so any fix to this problem had to involve declaring a similar combinator outside the arrow, unless I’ve missed another approach.

The naming of the processes can be simplified with the use of a pre-processor. Here are the same process declarations, named using the C pre-processor:

#define PR(p) (arrLabel #p p)

badMode = PR(group) *>>>* PR(maxByLength) *>>>* PR(head)
  where
    maxByLength = maximumBy (comparing length)

mostFreqWord = PR(makeLower) *>>>* PR(words) *>>>* badMode
  where
    makeLower = map toLower

Unfortunately, GHC with the -cpp flag doesn’t support this stringize operator so you have to run the C pre-processor on the file first as part of your build process before feeding it to GHC, or use the -F flag with a wrapper script.

Showing the data being passed down the channel obviously forces the data, which may potentially affect the behaviour. Whether you show the data in the trace can be turned on and off — and you can also supply a custom function to show your data if, for example, you are passing around infinite lists (you could just print the first three items). I’ve tried to give the gist here rather than all the details, but the CHP documentation explains the different ways you can record a channel communication in a trace.

I will present this example at the fringe session at the CPA 2009 conference this evening here in Eindhoven. If anyone is interested, last year’s paper on tracing is available in my list of publications, with this year’s publication to be added shortly.

Categories: Testing Tags:

Testing Communicating Haskell Processes with QuickCheck and HUnit

October 2, 2009 Leave a comment

In previous posts, I have explained the sort pump and a way to generate prime numbers. It would be good to get some assurance that those programs work correctly; we need to test them. The newest release of CHP (v1.4.0, released yesterday) contains a new Control.Concurrent.CHP.Test module that will help, and this post will explain how to use it.

Probably the most well-known Haskell testing framework is QuickCheck. QuickCheck takes a property for the output of a function. It then generates random inputs, feeding them to the function and checks that the property holds on the output. QuickCheck is a very good fit for testing our sort pump: our inputs can be lists of integers, and the property of the output is that it should be sorted.

An alternative approach to testing is to use something like HUnit which runs a specific prescribed test. HUnit is a good fit for our primes pipeline: we want to test that the numbers coming out are the prime numbers.

We will start by using QuickCheck to test the sort pump. QuickCheck version 1 only supported testing pure functions, which is not enough to test our concurrent programs. Thankfully, QuickCheck 2 supports testing monadic functions (even if the documentation is lacking), so we can use that via the new CHP function: propCHPInOut. As the documentation for the function says:

propCHPInOut :: Show a => (a -> b -> Bool)
             -> (Chanin a -> Chanout b -> CHP ()) -> Gen a -> Property

The first parameter is a pure function that takes both the input to the process and the output the process gave back, and indicates whether this is okay (True = test pass, False = test fail). The second parameter is the process to test, and the third parameter is the thing to use to generate the inputs (passing ‘arbitrary’ is the simplest thing to do).

The first parameter is straightforward — the output should be a sorted version of the input. The second parameter is the sort pump process. The third parameter could just be ‘arbitrary’, the default generator for lists of Ints, but it would be nice to make sure that the list is likely to contain duplicate entries, so we supply a customised generator:

test :: IO ()
test = quickCheck $ propCHPInOut (\input output -> output == sort input)
                                 sorterGrow gen
  where
    gen = do len <- elements [0..30]
             replicateM len $ elements [(-5)..5]

That is the complete code for testing our sort pump — and it passes. The propCHPInOut function (and its HUnit equivalent) is very useful for testing these single-input single-output stateless processes.

The primes pipeline is different as it has no input — it is a generator that just produces output. So we will instead use the more general testCHP function that works with HUnit. testCHP takes an item of CHP Bool and turns it into an HUnit test, so we have to do all the heavy lifting in producing that Bool that represents test success or failure. (In future it might be nice to provide support for HUnit asserts in a CHP program, interfacing nicely with the poison mechanism, but that’s not implemented just yet.) The key thing to be careful about is that if your process exits with poison, that is automatically counted as a test failure. So you need to make sure that if the test passes, it shuts down without passing on the poison — for this we can use onPoisonTrap rather than onPoisonRethrow:

test :: IO ()
test = do runTestTT $ testCHP $
            do c <- oneToOneChannel
               liftM snd $ (primes (writer c) `onPoisonTrap` return ())
                 <||> (do ps <- replicateM 200 $ readChannel (reader c)
                          poison (reader c)
                          return $ ps == take 200 funcPrimes
                      )
          return ()
  where
    funcPrimes = sieve [2..]
    sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p > 0]

This code creates a channel to communicate on, and passes the primes process one end of it. Note that any poison thrown by the primes process will be trapped rather than passed on. At the other end of the channel, we read the first 200 numbers then poison the channel, before checking that these are indeed the first 200 primes (which we generate with the classic functional algorithm). If the primes process unexpectedly poisons the channel, the poison from the second part of the parallel composition will propagate to the top-level and thus fail the test — we don’t need any extra code to check if primes misbehaves by arbitrarily poisoning the channel.

This second test is an example that builds up and tears down the whole infrastructure with our own code, but it still is not too long. Just as with pure functions, QuickCheck is useful for testing random inputs to processes, whereas HUnit is useful for specific tests. It would also be nice to get Lazy SmallCheck working with CHP, but because of the way it throws errors for test failures it may be more difficult (and I haven’t checked if it can do monadic tests). These examples have worked with processes that do not carry any particular state — in future I hope to return to look at testing processes that do keep some state over time.

Categories: Testing
Follow

Get every new post delivered to your Inbox.