Static Hashing

From Wikipedia, the free encyclopedia
Jump to: navigation, search

Static Hashing is another form of the hashing problem which allows users to perform lookups on a finalized dictionary set (all objects in the dictionary are final and not changing).

Usage [1][edit]

Application[edit]

Since static hashing requires that the database, its objects and reference remain the same its applications are limited. Databases which contain information which changes rarely are also eligible as it would only require a full rehash of the entire database on rare occasion. Examples of this include sets of words and definitions of specific languages, sets of significant data for an organization's personnel, etc.

Perfect Hashing[edit]

Perfect hashing is a model of hashing in which any set of n elements can be stored in a hash table of equal size and can have lookups done in constant time. It was specifically discovered and discussed by Fredman, Komlos and Szemeredi (1984) and has therefore been nicknamed "FKS Hashing".[2]

FKS Hashing[edit]

FKS Hashing makes use of a hash table with two levels in which the top level contains n buckets which each contain their own hash table. FKS hashing requires that if collisions occur they must do so only on the top level.

Implementation[edit]

The top level contains a randomly created hash function, h(x), which fits within the constraints of a Carter and Wegman hash function - seen in Universal hashing. Having done so the top level shall contain n buckets labeled k1, k2, k2, ..., kn. Following this pattern, all of the buckets hold a hash table of size si and a respective hash function, hi(x). The hash function will be decided by setting si to ki2 and randomly going through functions until there are no collisions. This can be done in constant time.

Performance[edit]

Because there are "n choose 2" pairs of elements, of which have a probability of collision equal to 1/n, FKS hashing can expect to have strictly less than n/2 collisions. Based on this fact and that each h(x) was selected so that the number of collisions would be at most n/2, the size of each table on the lower level will be no greater than 2n.

References[edit]

  1. ^ Daniel Roche (2013). SI486D: Randomness in Computing, Hashing Unit. United States Naval Academy, Computer Science Department. 
  2. ^ Michael Fredman, Janos Komlos, Endre Szemeredi (1984). Storing a Sparse Table with O(1) Worst Case Access Time. Journal of the ACM (Volume 31, Issue 3).