Jump to content

Linear probing: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
replace unmotivated manipulation of binomials with an explanation of what functions you need to use to get it to work
Patmorin (talk | contribs)
Added a link to a book section
Line 55: Line 55:
==External links==
==External links==
*[http://www.siam.org/meetings/alenex05/papers/13gheileman.pdf How Caching Affects Hashing] by Gregory L. Heileman and Wenbin Luo 2005.
*[http://www.siam.org/meetings/alenex05/papers/13gheileman.pdf How Caching Affects Hashing] by Gregory L. Heileman and Wenbin Luo 2005.
*[http://opendatastructures.org/versions/edition-0.1c/ods-java/node32.html Open Data Structures - Section 5.2 - LinearHashTable: Linear Probing]


[[Category:Search algorithms]]
[[Category:Search algorithms]]

Revision as of 14:04, 2 March 2012

Linear probing is a scheme in computer programming for resolving hash collisions of values of hash functions by sequentially searching the hash table for a free location.[1] This is accomplished using two values - one as a starting value and one as an interval between successive values in modular arithmetic. The second value, which is the same for all keys and known as the stepsize, is repeatedly added to the starting value until a free space is found, or the entire table is traversed.

   newLocation = (startingValue + stepSize) % arraySize

This algorithm, which is used in open-addressed hash tables, provides good memory caching (if stepsize is equal to one), through good locality of reference, but also results in clustering, an unfortunately high probability that where there has been one collision there will be more. The performance of linear probing is also more sensitive to input distribution when compared to double hashing.

Given an ordinary hash function H(x), a linear probing function (H(x, i)) would be:

Here H(x) is the starting value, n the size of the hash table, and the stepsize is i in this case.

Dictionary operation in constant time

Using linear probing, dictionary operation can be implemented in constant time. In other words, insert, remove and find operations can be implemented in O(1), as long as the load factor of the hash table is a constant strictly less than one.[2] This analysis makes the (unrealistic) assumption that the hash function is completely random, but can be extended also to 5-independent hash functions.[3] Weaker properties, such as universal hashing, are not strong enough to ensure the constant-time operation of linear probing,[4] but one practical method of hash function generation, tabulation hashing, again leads to a guaranteed constant expected time performance despite not being 5-independent.[5]

See also

References

  1. ^ Dale, Nell (2003). C++ Plus Data Structures. Sudbury, MA: Jones and Bartlett Computer Science. ISBN 0-7637-0481-4.
  2. ^ Knuth, Donald (1963), Notes on "Open" Addressing.
  3. ^ Pagh, Anna; Pagh, Rasmus; Ružić, Milan (2009), "Linear probing with constant independence", SIAM Journal on Computing, 39 (3): 1107–1120, doi:10.1137/070702278, MR 2538852.
  4. ^ Pătraşcu, Mihai; Thorup, Mikkel (2010), "On the k-independence required by linear probing and minwise independence", Automata, Languages and Programming, 37th International Colloquium, ICALP 2010, Bordeaux, France, July 6-10, 2010, Proceedings, Part I (PDF), Lecture Notes in Computer Science, vol. 6198, Springer, pp. 715–726, doi:10.1007/978-3-642-14165-2_60.
  5. ^ Pătraşcu, Mihai; Thorup, Mikkel (2011), "The power of simple tabulation hashing", Proceedings of the 43rd annual ACM Symposium on Theory of Computing (STOC '11), pp. 1–10, arXiv:1011.5200, doi:10.1145/1993636.1993638.