Concurrent Pearl: The Sort Pump
Sorting is a staple of many Computer Science courses: you may recall bubble sort, insertion sort, merge sort and quicksort. You may also know their time complexity: bubble sort is O(N^2), meaning the time it takes is proportional to the square of the number of items involved. In contrast, quicksort is O(N*log N) on average (though O(N^2) in the worst case), which, crucially, is better. These are all sequential algorithms, though. What happens when we want a concurrent algorithm?
There is a parallel version of quicksort, where as you divide and conquer your list, you fork off the two divisions into separate threads, and there are probably other pure parallel sorting strategies too. With CHP we can approach the problem differently, using a concurrent sort pump. I want to explain the sort pump because I like its design, not because it is blazingly fast. We’ll begin with a simple component process:
    highest :: Ord a => Chanin a -> Chanout a -> CHP ()
    highest input output = readChannel input >>= highest'
      where
        highest' x = do
          y <- readChannel input
          let (low, high) = if x < y then (x, y) else (y, x)
          writeChannel output low
          highest' high
This process starts by reading in a value to hold. Then it repeatedly reads in a new value, keeping the highest one and sending on the lowest one. So if you feed it a stream of values, it will keep hold of the highest value it has seen, and will pass on all lower values. Now let’s consider what happens if you connect two of these processes together. The first process will keep the highest value, sending on all others. The second process will keep the highest value of these others — i.e. the second highest value of all of them. Connect three together, and the third will hold the third highest value. Connect N of these together, pass in N values, and the Mth process will hold the Mth highest value. Voila, a sort pump!
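To check this reasoning without running any processes, here is a minimal pure sketch that models each stage as a function over a finite list rather than a channel. The names `stage` and `heldValues` are hypothetical helpers made up for this illustration; they are not part of CHP, and the model ignores concurrency entirely.

```haskell
import Data.List (mapAccumL)

-- One stage of the pump, modelled on a finite list instead of a channel.
-- Returns the value the stage ends up holding and the values it sent on.
-- (Hypothetical helper: an empty input means the stage never held anything.)
stage :: Ord a => [a] -> Maybe (a, [a])
stage []       = Nothing
stage (x : xs) = Just (mapAccumL step x xs)
  where
    -- Keep the larger value as the new held value, send on the smaller one.
    step held y = (max held y, min held y)

-- Chain stages until nothing is left to pass on, collecting what each holds.
-- The Mth element of the result is the value held by the Mth stage.
heldValues :: Ord a => [a] -> [a]
heldValues xs = case stage xs of
  Nothing           -> []
  Just (held, rest) -> held : heldValues rest
```

For example, `heldValues [3, 1, 4, 1, 5]` evaluates to `[5, 4, 3, 1, 1]`: the first stage holds the highest value, the second holds the second highest, and so on.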
There is a problem with our sort pump though — it sorts the values, but we have no way of getting the result out! So we need to augment our component to permit the “flushing” of the pump:
    data FlushOrNormal a = Flush [a] | Normal a

    highestFlush :: Ord a => Chanin (FlushOrNormal a) -> Chanout (FlushOrNormal a) -> CHP ()
    highestFlush input output = empty
      where
        empty = do
          x <- readChannel input
          case x of
            Flush _ -> writeChannel output x >> empty
            Normal y -> full y

        full cur = do
          x <- readChannel input
          case x of
            Flush xs -> writeChannel output (Flush (cur:xs)) >> empty
            Normal new ->
              let (low, high) = if cur < new then (cur, new) else (new, cur)
              in writeChannel output (Normal low) >> full high
If the component receives a flush signal, it will prepend its currently held value to the list and send on the flush message. Because the first process has the highest value, and all processes prepend, the list ends up in ascending order. If a normal (non-flush) value is received, the component behaves as before. We can then wrap a pipeline of these processes in a tidy process that takes in a list of items, sorts them and sends them out again:
    sorterFixedCapacity :: Ord a => Int -> Chanin [a] -> Chanout [a] -> CHP ()
    sorterFixedCapacity n input output = do
      pumpIn <- oneToOneChannel
      pumpOut <- oneToOneChannel
      pipeline (replicate n highestFlush) (reader pumpIn) (writer pumpOut)
        <|*|> (forever $ do
          xs <- readChannel input
          mapM_ (writeChannel (writer pumpIn) . Normal) xs
          -- An empty Flush starts the flush; each stage prepends its held value.
          writeChannel (writer pumpIn) $ Flush []
          Flush xs' <- readChannel (reader pumpOut)
          writeChannel output xs')
This sort pump is fixed capacity: if you send the above process a list that is longer than its stated capacity, it won’t work correctly. I will not go into the solutions in detail here, but there are several: we could set up and tear down the list of processes each time if we wished, or we could make the pump expand automatically.
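To experiment with the flushing pipeline and its capacity limit without CHP, here is a pure sketch in which a lazy list stands in for each channel. `stageFlush` and `pumpSort` are hypothetical names made up for this illustration. The model only captures the data flow: it reports an over-long list as `Nothing`, which says nothing about how the real concurrent processes would behave in that case.

```haskell
-- Messages travelling through the pump, as in the post.
data FlushOrNormal a = Flush [a] | Normal a

-- Pure stand-in for highestFlush: a list plays the role of a channel.
stageFlush :: Ord a => [FlushOrNormal a] -> [FlushOrNormal a]
stageFlush = empty
  where
    -- Holding nothing: pass a flush straight on, or start holding a value.
    empty []                = []
    empty (Flush xs : rest) = Flush xs : empty rest
    empty (Normal y : rest) = full y rest

    -- Holding cur: prepend it on a flush, otherwise keep the larger value.
    full _   []                  = []
    full cur (Flush xs : rest)   = Flush (cur : xs) : empty rest
    full cur (Normal new : rest) = Normal (min cur new) : full (max cur new) rest

-- Pure stand-in for sorterFixedCapacity: n identical stages fed one list.
-- Gives Nothing when a Normal value falls out of the far end, which
-- happens when the list is longer than the capacity n.
pumpSort :: Ord a => Int -> [a] -> Maybe [a]
pumpSort n xs = case stages (map Normal xs ++ [Flush []]) of
    [Flush sorted] -> Just sorted
    _              -> Nothing
  where
    stages = foldr (.) id (replicate n stageFlush)
```

For example, `pumpSort 5 [3, 1, 4, 1, 5]` gives `Just [1, 1, 3, 4, 5]`, while `pumpSort 3 [3, 1, 4, 1, 5]` gives `Nothing` because two values have nowhere to settle. Spare capacity is harmless in this model: stages that never receive a value simply pass the flush along.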
Finally, let’s think about the sort pump’s time complexity. Each value will pass through half the pipeline on average (not counting the flush), so N/2 communications per item. This happens to N items, so N*N/2 communications (plus N for the flush, but that falls away) [Correction: this should be N*(N+1)/2 -- see the comments below]. Each component will see on average N/2 items, and will thus perform N/2 comparisons, so N*N/2 comparisons [Correction: actually, this should be N*(N-1)/2, I think -- see the comments]. Our time complexity is therefore O(N^2) overall, but importantly if you had N processors (and very fast communications), each process would run O(N) operations in parallel, and thus you would have an O(N) sorting algorithm. Although, as the corrections indicate, some of this can be a bit harder to calculate than it first appears!
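The corrected totals can be reached by counting per stage rather than per item. This is a sketch under two assumptions: exactly N values are fed into N stages, and only normal (non-flush) messages received by a stage count as communications. Stage k then receives every value not retained by the k-1 stages before it, and compares every value it receives except the first, which it simply stores.

```latex
\begin{align*}
\text{values received by stage } k &= N - k + 1 \\
\text{comparisons in stage } k &= N - k \\
\text{total communications} &= \sum_{k=1}^{N} (N - k + 1) = \frac{N(N+1)}{2} \\
\text{total comparisons} &= \sum_{k=1}^{N} (N - k) = \frac{N(N-1)}{2}
\end{align*}
```

The busiest stage is the first, with N receives and N-1 comparisons, which is where the O(N) figure for N processors comes from.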
While our theoretical bound is good, in practice the sort pump only stands a chance of good performance if you have very few items that are very expensive to compare (so that the computation, i.e. the comparisons, outweighs the communications). The sort pump is really a concurrent version of the bubble sort, and shares many of its attributes: small, built on simple principles, but slow.