The octopus, the boids and GHC 6.12.1rc1
I had been eagerly awaiting the release candidate of the latest GHC (6.12). Last night, the GHC team delivered. Straight away, I downloaded and installed it on octopus, the 8-core machine I use for benchmarking. In the last part of the boids guide I did some benchmarking on octopus to see what speed-up I got with the boids. This was the result (click on the image for full-size; WordPress's thumbnails are quite poor):
Since then, I have optimised the CHP library and installed GHC 6.12.1rc1. The result is a better graph: the program takes less time sequentially and has a better speed-up profile, including the removal of that nasty upswing at the end of the first graph (click for full-size):
Edit: the graph now includes the GHC 6.10 time (including CHP optimisations) and GHC 6.12.1-rc1 time (same CHP optimisations), so you can get an idea of the differences. Apologies for not overlaying it on one graph, but I have mislaid the source figures for the original graph. I should benchmark all this properly and put the error bars on, etc.
I’m pleased with the effect that the optimisations and the GHC upgrade have had. The thing I really want to play with next is Satnam Singh’s ThreadScope, which visualises the thread usage with GHC 6.12, and will hopefully allow me to play with profiling-directed optimisation of CHP programs.
Since you said you improved CHP’s code, it would be nice to include in the second graph the improved code using GHC 6.10.
Thanks!
Done — now you can see the 6.10->6.12.1-rc1 change fairly clearly. It’s not massive, but from my point of view it came for free 🙂
Now we can see that the performance loss at 8 cores really was a GHC glitch, not your fault ;).
Hi Neil
I’m just curious, how did you come up with the ideal values in the graphs?
I took the simple approach of taking the 1-core time, and dividing it by N, the number of cores. So this is the ideal speed-up scaling for my program — but of course a different implementation might be faster than my ideal speed-up curve.
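For anyone who wants to reproduce the ideal curve, here is a minimal sketch of that calculation in Haskell. The function names (`idealTime`, `idealCurve`, `speedup`) are made up for illustration and are not part of CHP or the benchmark code, and the 12.0-second figure in the usage note is an invented input, not one of the measurements from the graphs.

```haskell
-- Hypothetical helpers for illustration; none of these names come from CHP.

-- | Ideal time on n cores: the measured 1-core time divided by n.
idealTime :: Double -> Int -> Double
idealTime t1 n = t1 / fromIntegral n

-- | Ideal curve for 1 up to maxCores cores, as (cores, ideal time) pairs.
idealCurve :: Double -> Int -> [(Int, Double)]
idealCurve t1 maxCores = [ (n, idealTime t1 n) | n <- [1 .. maxCores] ]

-- | Measured speed-up on some number of cores, relative to the 1-core time.
speedup :: Double -> Double -> Double
speedup t1 tn = t1 / tn
```

In GHCi, with an invented 1-core time of 12.0 seconds, `idealCurve 12.0 4` gives `[(1,12.0),(2,6.0),(3,4.0),(4,3.0)]`. Comparing `speedup t1 tn` against the core count then shows how far a measured run falls short of that ideal.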