Concurrent Testing and Tracing: Useful Output for Test Failures
Programmers write testcases for programs to check that the programs work correctly. When a test-case fails, a programmer wants to know two things: first, why specifically did the test fail (which assertion did not hold?), and secondly, what is wrong with the program (or the test case!)?
Last Friday I released CHP 1.5.0, which adds better support for testing, tracing, and the combination of the two. Tracing refers in theoretical terms to recording a computation’s history: practically, it means keeping a log of all the CHP synchronisation events (e.g. channel communications) that happened in your program. CHP supports several types of tracing, but for this post I will use the simplest type: a CSP trace, which is just a linear sequence of events that occurred.
Let’s explore the combination of testing and tracing in practice. Previously I used an example involving a four-item pipeline to find the mode (most common element) of a list. Let’s use this example again, but since this is a post about testing, we’ll introduce an error: we will forget to include the sort process at the front of the pipeline.
To use the tracing framework, you have to augment your code slightly to label parts of your code. This involves using arrLabel instead of arr, and *<<<* instead of <<< (see the remarks at the end of this post for why, and for a way to shorten the tedious labelling):
badMode = (arrLabel "group" group) *>>>* (arrLabel "maxByLength" maxByLength) *>>>* (arrLabel "head" head) where maxByLength = maximumBy (comparing length)
We can then join this pipeline onto one that takes a string, and splits it into words — making a pipeline that looks for the most frequent word:
mostFreqWord = (arrLabel "makeLower" makeLower) *>>>* (arrLabel "words" words) *>>>* badMode where makeLower = map toLower
Finally, we can provide a simple test sentence for the pipeline that should pick out the most frequent word, “the”:
testFreqWord = testCHPInOut (const (== "the")) (runPipelineLabel mostFreqWord) "The quick brown fox jumps over the lazy dog" main :: IO () main = runTestTT testFreqWord >> return ()
If we run this output, what do we get? Well, the test case fails, so we are told that, but we are also automatically given the trace of the failing test case:
### Failure: testCHP Failure; trace: < _c4, makeLower->words."the quick brown fox jumps over the lazy dog", words->group.["the","quick","brown","fox","jumps","over","the","lazy","dog"], group->maxByLength.[["the"],["quick"],["brown"],["fox"],["jumps"],["over"],["the"],["lazy"],["dog"]], maxByLength->head.["dog"], _c5 > Cases: 1 Tried: 1 Errors: 0 Failures: 1
So how do we read this trace? The trace is delimited by angle brackets, and is a comma-separated sequence of events (I’ll look to tighten up the grammar in future to allow for automated parsing). The _c4 is an unnamed event at the start (this is the channel used in the test framework to inject the data). Then the following events are channel communications of the form channel-name.data-value. So makeLower->words is the channel name, and "the quick brown fox jumps over the lazy dog" is the data value. The channel names have been automatically generated based on the two processes that the channel connects, and the data value has been printed out using show. We can use this trace to immediately see which values were passed between each process pair, and thus track down the problem. You can see from looking at the trace that group is not grouping the two “the”s together, which is because the list is not sorted. We can easily fix this by putting in the missing sort process. The trace with sort added now works as we would expect:
< _c5, makeLower->words."the quick brown fox jumps over the lazy dog", words->sort.["the","quick","brown","fox","jumps","over","the","lazy","dog"], sort->group.["brown","dog","fox","jumps","lazy","over","quick","the","the"], group->maxByLength.[["brown"],["dog"],["fox"],["jumps"],["lazy"],["over"],["quick"],["the","the"]], maxByLength->head.["the","the"], _c6 >
Tracing is useful for tracking down problems and strange behaviours in your system, so integrating it with the test framework made a lot of sense. You can trace any program, not just test-cases, but I will explore that at a future date. You may observe that tracing (at least a CSP trace) is not too far different from adding print (or Debug.Trace.trace) statements around your program. The main differences are:
- Channels are named when you join together processes, so you don’t have to modify the definition of the original processes to get intelligible trace output (unless you want to trace the internal channels).
- Tracing can be turned on or off at run-time (by changing the top-level call to runCHP), rather than at compile-time with pre-processor macros — but if tracing is switched off, you do not incur any overhead.
- The order in the trace is guaranteed to reflect the real ordering of the channel communications, so you can’t end up with an unrepresentative order that usually happens when you put print statements in a concurrent program.
Remarks: There are several side notes to add to this post. Many readers will probably recoil at my introduction of a *>>>* combinator. This has type:
(*>>>*) :: Show b => ProcessPipelineLabel a b -> ProcessPipelineLabel b c -> ProcessPipelineLabel a c
The show constraint is the reason that I could not declare an arrow instance — so any fix to this problem had to involve declaring a similar combinator outside the arrow, unless I’ve missed another approach.
The naming of the processes can be simplified with the use of a pre-processor. Here are the same process declarations, named using the C pre-processor:
#define PR(p) (arrLabel #p p) badMode = PR(group) *>>>* PR(maxByLength) *>>>* PR(head) where maxByLength = maximumBy (comparing length) mostFreqWord = PR(makeLower) *>>>* PR(words) *>>>* badMode where makeLower = map toLower
Unfortunately, GHC with the -cpp flag doesn’t support this stringize operator so you have to run the C pre-processor on the file first as part of your build process before feeding it to GHC, or use the -F flag with a wrapper script.
Showing the data being passed down the channel obviously forces the data, which may potentially affect the behaviour. Whether you show the data in the trace can be turned on and off — and you can also supply a custom function to show your data if, for example, you are passing around infinite lists (you could just print the first three items). I’ve tried to give the gist here rather than all the details, but the CHP documentation explains the different ways you can record a channel communication in a trace.
I will present this example at the fringe session at the CPA 2009 conference this evening here in Eindhoven. If anyone is interested, last year’s paper on tracing is available in my list of publications, with this year’s publication to be added shortly.