Hash array mapped trie
A hash array mapped trie (HAMT) is an implementation of an associative array that combines the characteristics of a hash table and an array mapped trie. It is a refined version of the more general notion of a hash tree.
A HAMT is an array mapped trie where the keys are first hashed in order to ensure an even distribution of keys and a constant key length.
In a typical implementation of HAMT's array mapped trie, each node contains a table with some fixed number N of slots with each slot containing either a nil pointer or a pointer to another node. N is commonly 32. As allocating space for N pointers for each node would be expensive, each node instead contains a bitmap which is N bits long where each bit indicates the presence of a non-nil pointer. This is followed by an array of pointers equal in length to the number of ones in the bitmap, (its Hamming weight).
Advantages of HAMTs
The hash array mapped trie achieves almost hash table-like speed while using memory much more economically. Also, a hash table may have to be periodically resized, an expensive operation, whereas HAMTs grow dynamically. Generally, HAMT performance is improved by a larger root table with some multiple of N slots; some HAMT variants allow the root to grow lazily with negligible impact on performance.
Problems with HAMTs
Implementation of a HAMT involves the use of the population count function, which counts the number of ones in the binary representation of a number. This operation is available in many instruction set architectures, but it is available in only some high-level languages. Although population count can be implemented in software in O(1) time using a series of shift and add instructions, doing so may perform the operation an order of magnitude slower.
This is of course not so much a problem with HAMTs but rather an indication that HAMT performance can be improved still further where hardware support is available.
The concurrent lock-free version of the hash trie called Ctrie is a mutable thread-safe implementation which ensures progress. The data-structure has been proven to be correct - Ctrie operations have been shown to have the atomicity, linearizability and lock-freedom properties.
- Phil Bagwell (2000). Ideal Hash Trees (Report). Infoscience Department, École Polytechnique Fédérale de Lausanne. http://infoscience.epfl.ch/record/64398/files/idealhashtrees.pdf.
- Java source file of Clojure's hash map type.
- Johan Tibell, A. Announcing unordered-containers 0.2
- Ruby source file of Rubinius's HAMT
- Hash Array Mapped Trie (C++ Templates)
- Prokopec, A. Implementation of Concurrent Hash Tries on GitHub
- Prokopec, A. et al. (2011) Cache-Aware Lock-Free Concurrent Hash Tries. Technical Report, 2011.