MurmurHash

From Wikipedia, the free encyclopedia
Jump to: navigation, search

MurmurHash is a non-cryptographic hash function suitable for general hash-based lookup.[1][2][3] It was created by Austin Appleby in 2008,[4][5] and exists in a number of variants,[6] all of which have been released into the public domain. When compared to other popular hash functions, MurmurHash performed well in a random distribution of regular keys.[7]

Contents

Variants [edit]

The current version is MurmurHash3,[8][9] which yields a 32-bit or 128-bit hash value.

The older MurmurHash2[10] yields a 32-bit or 64-bit value. Slower versions of MurmurHash2 are available for big-endian and aligned-only machines. The MurmurHash2A variant adds the Merkle–Damgård construction so that it can be called incrementally. There are two variants which generate 64-bit values; MurmurHash64A, which is optimized for 64-bit processors, and MurmurHash64B, for 32-bit ones. MurmurHash2-160 generates the 160-bit hash, and MurmurHash1 is obsolete.

Implementations [edit]

The canonical implementation is in C++, but there are efficient ports for a variety of popular languages, including Python,[11] C,[12] C#,[9][13] Perl,[14] Ruby,[15] PHP,[16] Haskell,[17] Scala,[18] Java,[19][20] and JavaScript.[21][22]

It has been adopted into a number of open-source projects, most notably libstdc++ (ver 4.6), Perl,[23] nginx (ver 1.0.1),[24] Rubinius,[25] libmemcached (the C driver for Memcached),[26] maatkit,[27] Hadoop,[1] Kyoto Cabinet,[28] and RaptorDB.[29]

Algorithm [edit]

Murmur3_32(key, len, seed)
    c1 \gets 0xcc9e2d51
    c2 \gets 0x1b873593
    r1 \gets 15
    r2 \gets 13
    m \gets 5
    n \gets 0xe6546b64
 
    hash \gets seed

    for each fourByteChunk of key
        k \gets fourByteChunk

        k \gets k * c1
        k \gets (k << r1) OR (k >> (32-r1))
        k \gets k * c2

        hash \gets hash XOR k
        hash \gets (hash << r2) OR (hash >> (32-r2))
        hash \gets hash * m + n

    with any remainingBytesInKey
        remainingBytes \gets SwapEndianOrderOf(remainingBytesInKey)
        remainingBytes \gets remainingBytes * c1
        remainingBytes \gets (remainingBytes << r1) OR (remainingBytes >> (32 - r1))
        remainingBytes \gets remainingBytes * c2

        hash \gets hash XOR remainingBytes
 
    hash \gets hash XOR len

    hash \gets hash XOR (hash >> 16)
    hash \gets hash * 0x85ebca6b
    hash \gets hash XOR (hash >> 13)
    hash \gets hash * 0xc2b2ae35
    hash \gets hash XOR (hash >> 16)

References [edit]

  1. ^ a b "Hadoop in Java". Hbase.apache.org. 24 July 2011. Retrieved 13 January 2012. 
  2. ^ Chouza et al.
  3. ^ "Couceiro et al." (PDF) (in (Portuguese)). Retrieved 13 January 2012. 
  4. ^ "MurmurHash on GooglePages". Murmurhash.googlepages.com. Retrieved 13 January 2012. 
  5. ^ Tanjent (tanjent) wrote,3 March 2008 13:31:00. "MurmurHash first announcement". Tanjent.livejournal.com. Retrieved 13 January 2012. 
  6. ^ "MurmurHash2-160". Simonhf.wordpress.com. 25 September 2010. Retrieved 13 January 2012. 
  7. ^ "Which hashing algorithm is best for uniqueness and speed". stackexchange.com. 
  8. ^ "MurmurHash3 on smhasher". 
  9. ^ a b Horvath, Adam (Aug 10, 2012). "MurMurHash3, an ultra fast hash algorithm for C# / .NET". 
  10. ^ "MurmurHash2 on smhasher". 
  11. ^ "pyfasthash in Python". Google. Retrieved 13 January 2012. 
  12. ^ "C implementation in qLibc by Seungyoung Kim". 
  13. ^ Landman, Davy. "Davy Landman in C#". Landman-code.blogspot.com. Retrieved 13 January 2012. 
  14. ^ "Toru Maesaka in Perl". Search.cpan.org. Retrieved 13 January 2012. 
  15. ^ Bruce Williams <http://codefluency.com>, for Ruby Central <http://rubycentral.org> (3 May 2009). "Ruby". Rubyforge.org. Retrieved 13 January 2012. 
  16. ^ "Murmurhash3 PHP extension". Murmur.vaizard.org. Retrieved 13 January 2012. 
  17. ^ "Haskell". Hackage.haskell.org. Retrieved 13 January 2012. 
  18. ^ "Scala standard library implementation". 14 December 2012. 
  19. ^ MurmurHash3 in Java, part of Guava
  20. ^ Derek Young in Java, public domain
  21. ^ raycmorgan (owner). "Javascript implementation by Ray Morgan". Gist.github.com. Retrieved 13 January 2012. 
  22. ^ garycourt. "MurmurHash.js by Gary Court". Github.com. Retrieved 13 January 2012. 
  23. ^ "perl5176delta". Retrieved 31 December 2012. 
  24. ^ "nginx". Retrieved 13 January 2012. 
  25. ^ "Rubinius". Retrieved 29 February 2012. 
  26. ^ libmemcached
  27. ^ "maatkit". Google. 24 March 2009. Retrieved 13 January 2012. 
  28. ^ "Kyoto Cabinet specification". Fallabs.com. 4 March 2011. Retrieved 13 January 2012. 
  29. ^ Gholam, Mehdi (13 November 2011). "RaptorDB CodeProject page". Codeproject.com. Retrieved 13 January 2012. 

See also [edit]