MurmurHash

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by A.Ou (talk | contribs) at 23:39, 13 November 2010 (Added citation). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

MurmurHash is a non-cryptographic hash function suitable for general hash-based lookup.[1][2][3] It was created by Austin Appleby in 2008,[4][5] and exists in a number of variants, all of which have been released into the public domain.

Variants

The current version is MurmurHash2, which yields a 32-bit hash value and is optimized for Intel processors. Slower versions of MurmurHash2 are available for big-endian and aligned-only machines. The MurmurHash2A variant adds the Merkle-Damgard construction so that it can be called incrementally. There are two variants which generate 64-bit values; MurmurHash64A, which is optimized for 64-bit processors, and MurmurHash64B, for 32-bit ones. MurmurHash1 is obsolete.

Implementations

The canonical implementations are in C++, but there are efficient ports for a variety of popular languages, including Python,[6] C#,[7] Perl,[8] Ruby[9], Haskell,[10] Java,[11] and JavaScript.[12]

It has been adopted into a number of open-source projects, most notably libmemcached[13] (the C driver for Memcached), maatkit,[14] and Hadoop.[1]

Collision vulnerabilities

Discovered in November 2008, one pathological input sequence of 2^32 values causes MurmurHash2 to suffer from a rate of hash collision as high as 97.6%.[15][16]

References

See also