Benchmark: unpacked values in containers

Published: 2012-06-08T21:59Z
License: CC-BY

Inspired by a discussion on the ghc mailing list, I wondered how much performance can be gained by specializing and unboxing certain data types. In particular, I looked at Data.Map. Suppose that you have a map from ints to ints. First of all, you should be using Data.IntMap instead, but that is besides the point.

If you know that the keys and values are always strict integers, then the data type could be specialized from

data Map k a
    = Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)
    | Tip

to

data MapIntInt
    = Tip
    | Bin {-# UNPACK #-} !Size {-# UNPACK #-} !Int {-# UNPACK #-} !Int
          !(MapIntInt) !(MapIntInt)

It would be great if this could be generated automatically by the compiler. But as was pointed out, that is really hard to do, because the size of the constructors would change, depending on the type arguments. So generic functions become impossible. It would also require multiple different info tables for the garbage collector, among other problems.

So, it's probably easier to do this specialization manually. I was thinking of using template haskell, in combination with type families. This would allow you to write something like

deriveSpecializedUnboxedType [d|type UnboxedMapIntInt = Map !Int !Int |]

but before going there, let's first see whether this is worth the effort at all.

So, I did the specialization by hand for Map Int Int, and ran the containers benchmarks. Here is a representative part of the results,

click for full the criterion report. The horribly hacky code is available on github.

In this graph

As you can see, specializing and unboxing gives a modest performance improvement. There is probably also an improvement in memory usage, but this benchmark doesn't directly measure that. Switching to a better data structure, i.e. patricia tries instead of balanced trees helps a lot more for some benchmarks, such as delete, but very little for others such as map.

Overall, it seems like specialization can definitely be worth it; in some cases improving performance by 40%. And it never has a negative impact, at least in this benchmark. Real life might be different though, especially if there are also Maps with other types of keys and values around.

Note also that this benchmark was compiled for a 32-bit architecture. On 64-bit, pointers and hence boxed values have more overhead.

Comments

Eugene KirpichovDate: 2012-06-09T02:22Zx

Actually I once tried to do this, but in my case it gave a slight performance drop. I don't remember the exact circumstances though. I think it was maybe a map from bytestrings to ints.

wren ng thorntonDate: 2012-06-09T04:37Zx

One big place where this would help a lot IME is with tuples, especially pairs. While the individual changes are small, in aggregate it can make a huge difference. GHC already does a good deal of shenanigans with tuples in order to try to unpack them into registers and such, but that doesn't work when the strictness analyzer fails (which is reparable) or when you actually do need a manifest tuple to store somewhere. Of course, dealing with tuples is a lot easier since they're so simple. In the code where I really care about such things I use a type class for PairLike things. The class is small and well-defined, unlike for most containers, and instances are trivial to write, unlike any interesting container.

Reply

(optional)
(optional, will not be revealed)
What greek letter is usually used for anonymous functions?
Use > code for code blocks, @code@ for inline code. Some html is also allowed.