Using Heap View

There are some patterns in Haskell code that often lead to space leaks. Here is one you might run into very quickly when learning Haskell.

-- | Sum the numbers from 1 to i
sumRange :: Int -> Int -> IO Int
sumRange acc 0 = return acc
sumRange acc i = sumRange (acc + i) (i - 1)

(I’ve made the return type IO Int so that I can use getClosureData. You could theoretically wrap it in an unsafePerformIO like the trace function, but for now I’ll just do this.)

Running sumRange 0 1000000 on my computer (with a mostly unoptimized GHC), I get these performance stats:

$ ./test +RTS -s
    121,344,040 bytes allocated in the heap
     57,823,520 bytes maximum residency

  Total time    0.419s

  Productivity  34.8% of total elapsed

Looking at our program, these numbers are not good. We shouldn’t be building any big structures of data. In fact, we should have roughly two Ints worth of memory at any given time. One for acc, one for i.

At this point, we have a very specific problem: there is too much data being held on to. The “maximum residency” tells us that the garbage collector found 58MB of live data on the heap when it ran. That doesn’t seem right. If we lower i to 100, then only 44KB are used. So our program memory grows with i even though it should be roughly constant.

We’ve profiled and found a memory leak, so it is time to get rid of it. At this point, there are basically two recommendations. One, look at core. You can dump the intermediate stages of GHC compilation, which look a little closer to imperative programs. This is often useful, but somewhat heavy-handed. It also can become quite hard to look through in larger projects. Two, add strictness annotations everywhere and hope for the best. This is in fact the solution to our problem, but it would be nice to have a target for our !-gun. In this case, there are only two options (!acc and !i) so we could just compile-profile-loop until we found the solution.

With heap-view however, we have another option. Let’s look at the values on the heap during each loop. For the sake of screen space, we’ll limit the loop to only 3 iterations. Here is the updated code and output, again trimmed for clarity:

import GHC.Exts.Heap

-- | Sum the numbers from 1 to i
sumRange :: Int -> Int -> IO Int
sumRange acc 0 = return acc
sumRange acc i = do
    ca <- getClosureData acc
    putStr "acc: "
    print ca
    ci <- getClosureData i
    putStr "  i: "
    print ci
    sumRange (acc + i) (i - 1)

$ ./test
  i: ConstrClosure { info = StgInfoTable {tipe = CONSTR_0_1}
                   , ptrArgs = [], dataArgs = [3], name = "I#"}
acc: ConstrClosure { info = StgInfoTable {tipe = CONSTR_0_1}
                   , ptrArgs = [], dataArgs = [0], name = "I#"}
  i: ConstrClosure { info = StgInfoTable {tipe = CONSTR_0_1}
                   , ptrArgs = [], dataArgs = [2], name = "I#"}
acc: ThunkClosure  { info = StgInfoTable {tipe = THUNK_2_0 }
                   , ptrArgs = [0x00c66518/1,0x00c66548/1], dataArgs = []}
  i: ConstrClosure { info = StgInfoTable {tipe = CONSTR_0_1}
                   , ptrArgs = [], dataArgs = [1], name = "I#"}
acc: ThunkClosure  { info = StgInfoTable {tipe = THUNK_2_0 }
                   , ptrArgs = [0x00420000fe50,0x00420000fe70/1], dataArgs = []}

If we look at i, we see that at every iteration of the loop, it is just our Int constructor from last time. So i stays constant in space usage. Looking at acc though, it is a thunk. Every time we go through the loop, we are simply adding another thunk to the chain. In memory, acc looks something like this (assuming we start with acc == 0):

+---+---------+---+
| + | old_acc | 0 |
+---+---------+---+
         | +---+-----------+---+
         ->| + | older_acc | 1 |
           +---+-----------+---+
                    |
                   ...
                    | +---+---+---+
                    ->| + | 0 | i |
                      +---+---+---+

Or, in a perhaps more understandable format:

(...(((0 + i) + (i-1)) + (i-2)) + ... + 1)

It seems like making acc strict should solve our problem.

{-# LANGUAGE BangPatterns #-}

-- | Sum the numbers from 1 to i
sumRange :: Int -> Int -> IO Int
sumRange !acc 0 = return acc
sumRange acc i = sumRange (acc + i) (i - 1)

Running our program with 1 million again, our profile looks much better.

$ ./test +RTS -s
     88,051,752 bytes allocated in the heap
         44,576 bytes maximum residency

  Total time    0.141s

  Productivity  99.3% of total elapsed

If you include the closure data output, you’ll see that both i and acc are now always Ints. A single ! made our code four times faster! This is definitely not an ideal situation, but it is the state of things right now.

For me, heap-view sits at a nice place between guessing and reading core. If you have a hypothesis, this makes it reasonable to test it without changing build flags. More than anything, I think it removes a lot of the guess-work from exploring Haskell’s evaluation. You can crawl all over the heap if you want to. I hope people try it out and find interesting use-cases for it.