Talk:Locality-sensitive hashing

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Robotics (Rated Start-class, Mid-importance)
WikiProject icon Locality-sensitive hashing is within the scope of WikiProject Robotics, which aims to build a comprehensive and detailed guide to Robotics on Wikipedia. If you would like to participate, you can choose to edit this article, or visit the project page (Talk), where you can join the project and see a list of open tasks.
Start-Class article Start  This article has been rated as Start-Class on the project's quality scale.
 Mid  This article has been rated as Mid-importance on the project's importance scale.

Reference from "Locality preserving hashing"[edit]

What is the difference between "Locality-sensitive hashing" and "Locality preserving hashing"? The article of the last one refers to this article, but there is no detailed explanation of the motivation behind the "sensitive" and "preserving" notation. — Preceding unsigned comment added by (talk) 08:50, 30 July 2015 (UTC)

I added a reference to this article that goes into a lot of detail about two specific algorithms, LSH and LPH.
I agree that the difference in terminology (if any) is unclear. What (if anything) is the difference between "sensitive" and "preserving" to a hash function? --DavidCary (talk) 16:52, 22 August 2016 (UTC)

Suggestion to remove Nilsimsa Hash section[edit]

The Nilsimsa Hash does not really fit into the LSH definition(s) (Indyk's or Charikar's). Consequently, there is no way to plug into the common framework of LSH and obtain good index-size and query performance guarantees --- one of the strengths of LSH approaches. Hence, I suggest this section should be removed, or at least not in the current place (in the same level with random projection, simhash, and p-stable based lsh).

Is there a name for the more general category that includes Nilsima hash, LSH, LPH, TLSH, etc.? --DavidCary (talk) 16:52, 22 August 2016 (UTC)


Just made the page. There are some variations among definitions of LSH - I am using Charikar's. Flamholz 19:40, 6 June 2007 (UTC)

Charikar's definition is too narrow, though a bit easy to understand by beginners.

Do you have a reference to a "better" definition? Please add that reference to the article. Thank you. --DavidCary (talk) 16:52, 22 August 2016 (UTC)

Definition of an LSH[edit]

I don't think the current definition really makes sense, although maybe it could be modified a little to work.

Specifically: for a metric phi(x,y), we have phi(x,y)->0 (intuitively) as x->y. But if Pr[h(x)=h(y)] -> 0 as x->y, that's bad! I mean, that is just about the opposite of a locality-sensitive hash.

One fix might be to say Pr[h(a) = h(b)] = 1 - phi(a,b) instead.

Although it would also be nice to allow for general boolean combinations of hashes, such as simultaneously hashing to many different values, and calling it a hit if some combination thereof actually collide. —Preceding unsigned comment added by PhiloMath (talkcontribs) 07:14, 5 December 2007 (UTC)

I totally agree. The definition as it stands is wrong. The phi(a,b) is a similarity not a distance or metric (mathematics). Another error is that the Jaccard_index which is a similarity but is currently is referred to as the "Jaccard distance". Notice that the correct definition looks like a probabilistic version of Injective_mapping. cmobarry (talk) 17:26, 20 December 2007 (UTC)

Added another variant of LSH definition[edit]

We added the Indyk-Motwani definition of the LSH family, plus an LSH family for the Hamming space (by bit sampling), as well as the LSH algorithm for the nearest neighbor search (approximate). Alex and Piotr. (talk) 02:13, 7 February 2008 (UTC)

Hey guys, i have a question.[edit]

in the last section:

LSH Algorithm for the Nearest Neighbor Search

... it is being claimed that : ....

query time: ;

i am trying to figure out from where does the comes from... why is the probability for colision is  ?

can someone please shed some light on this? —Preceding unsigned comment added by Caligola0 (talkcontribs) 18:01, 18 June 2009 (UTC)

I agree -- what is the meaning of , , and ? --DavidCary (talk) 16:52, 22 August 2016 (UTC)

Relation to Vector Quantization?[edit]

hi - could someone clarify the relation to Vector Quantization please? --mcld (talk) 09:21, 8 April 2010 (UTC)


I suggest merging locality-preserving hashing into locality-sensitive hashing. There seems to be enough WP:OVERLAP that a single article can cover both, and clarify the distinction (if any) between them. --DavidCary (talk) 15:50, 22 August 2016 (UTC)

Random projection[edit]

How is "closely related" to for small ? It is surely not a Taylor expansion, or anything of that sort. How is this even a relevant comment at this point?