Measuring Functions in Criterion

I’ve recently spent a fair amount of time with Haskell’s Criterion library for measuring performance. I found myself somewhat disappointed by the performance of Criterion itself, and so took a whack at speeding it up. After several issues, I am now more familiar with both Criterion and Haskell. Haskell is often criticized as being hard to reason about, so I thought I would write up my experiences.

As Criterion is one of the sources of truth about the performance of Haskell code, the changes I made have to be verified through other means. I’ll try to provide evidence from both the dumped compilation results and examples.

The Measurement Loop

At the highest level, Criterion measures functions by running them in batches. It will run the function an increasing number of times until it can make a “valid” measurement. At the moment, the batch must run for 30ms to be considered statistically significant. Thus, a function that takes 1s must run once and a function that takes 1ms must run thirty times.

The batching and measurement both come with their own overhead, so care must be taken not to include those in the final result. Keeping that goal in mind, here is a simplified version of the main loop found in runBenchmark:

With that, it becomes much easier to talk about particular optimizations. I made two major changes of interest. When I first started looking into the performance issues, I noticed right away that the main issue was garbage collection. The two solutions to that are reducing allocation and removing memory leaks. In my case, reducing allocation did not help performance much, but it did help correctness. There also was not a memory leak, but there was a large working set. So I had to reduce the impact of garbage collection.

  1. Reducing Allocation
  2. Reducing GC
  3. Results
Next