Hash array mapped trie

From Wikipedia, the free encyclopedia
Jump to: navigation, search

A hash array mapped trie[1] (HAMT) is an implementation of an associative array that combines the characteristics of a hash table and an array mapped trie.[1] It is a refined version of the more general notion of a hash tree.

Operation[edit]

A HAMT is an array mapped trie where the keys are first hashed in order to ensure an even distribution of keys and a constant key length.

In a typical implementation of HAMT's array mapped trie, each node contains a table with some fixed number N of slots with each slot containing either a nil pointer or a pointer to another node. N is commonly 32. As allocating space for N pointers for each node would be expensive, each node instead contains a bitmap which is N bits long where each bit indicates the presence of a non-nil pointer. This is followed by an array of pointers equal in length to the number of ones in the bitmap, (its Hamming weight).

Advantages of HAMTs[edit]

The hash array mapped trie achieves almost hash table-like speed while using memory much more economically. Also, a hash table may have to be periodically resized, an expensive operation, whereas HAMTs grow dynamically. Generally, HAMT performance is improved by a larger root table with some multiple of N slots; some HAMT variants allow the root to grow lazily[1] with negligible impact on performance.

Problems with HAMTs[edit]

Implementation of a HAMT involves the use of the population count function, which counts the number of ones in the binary representation of a number. This operation is available in many instruction set architectures, but it is available in only some high-level languages. Although population count can be implemented in software in O(1) time using a series of shift and add instructions, doing so may perform the operation an order of magnitude slower.

This is of course not so much a problem with HAMTs but rather an indication that HAMT performance can be improved still further where hardware support is available.

Implementations[edit]

The programming languages Clojure[2] and Scala use a persistent variant of hash array mapped tries for their native hash map type. The Haskell library unordered-containers uses the same to implement persistent map and set data types.[3] A Javascript HAMT library [4] based on the Clojure implementation is also available. The Rubinius[5] implementation of Ruby includes a HAMT, mostly written in Ruby but with 3[6] primitives. There is a C++ HAMT[7] implementation that optionally uses POPCNT CPU instruction.

The concurrent lock-free version[8] of the hash trie called Ctrie is a mutable thread-safe implementation which ensures progress. The data-structure has been proven to be correct[9] - Ctrie operations have been shown to have the atomicity, linearizability and lock-freedom properties.

References[edit]