Talk:Nearest neighbor search
WikiProject Robotics (Rated Start-class, Mid-importance)
The nearest-neighbors-searching community has an unfortunate habit of stating that the "dimension is constant", probably because the asymptotic results then look much better than they actually are. For example, the all-nearest-neighbors algorithm by Vaidya (which is cited here) is actually an awfully slow algorithm for its task: it becomes unusable even in fairly low dimensions such as 4 or 5. This is because there is a hidden factor exponential in the dimension d. The fastest way I know to do all-nearest-neighbors search is simply to perform repeated searches against a spatial subdivision structure such as a BBD-tree or a sliding-midpoint kd-tree. I find the "dimension is constant" habit highly misleading and I'd rather not see it in the Wikipedia articles. I therefore suggest taking a more explicit route and including those exponential factors. There are similar problems with the kd-tree article, but I'll comment there separately.
This seems like a great point. I have seen similar issues in other algorithmic topics. The most informative solution would seem to be to express the asymptotic running times in terms of both the dimension and the set size. Can somebody go through the references and fill in the missing d's?
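For concreteness, the repeated-search approach mentioned above can be sketched as follows. This is a minimal, illustrative sketch only: it uses a plain median-split kd-tree rather than the BBD-tree or sliding-midpoint variants named in the comment, assumes points are distinct tuples, and makes no claim about the hidden dimension-dependent constants under discussion.

```python
import math

def build_kdtree(points, depth=0):
    # Minimal median-split kd-tree: node = (point, left subtree, right subtree).
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, target, exclude, depth=0, best=None):
    # Standard kd-tree NN query; `exclude` skips the query point itself
    # (this simple comparison assumes all points are distinct).
    if node is None:
        return best
    point, left, right = node
    axis = depth % len(target)
    if point != exclude:
        d = math.dist(point, target)
        if best is None or d < best[1]:
            best = (point, d)
    near, far = (left, right) if target[axis] < point[axis] else (right, left)
    best = nearest(near, target, exclude, depth + 1, best)
    # Descend the far side only if the splitting plane could hide a closer point.
    if best is None or abs(target[axis] - point[axis]) < best[1]:
        best = nearest(far, target, exclude, depth + 1, best)
    return best

def all_nearest_neighbors(points):
    # All-nearest-neighbors by one NN query per point against one shared tree.
    tree = build_kdtree(points)
    return {p: nearest(tree, p, exclude=p)[0] for p in points}
```

The point of the sketch is the overall pattern (build one structure, query it N times), not the per-query cost, which is exactly where the exponential-in-d factors hide.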
More on all-nearest neighbors
"As a simple example: when we find the distance from point X to point Y, that also tells us the distance from point Y to point X, so the same calculation can be reused in two different queries.". This is not true: the is-a-nearest-neighbor-of relation is not symmetric, as a small 1D example already shows: *--*-* (ascii art, stars are points).
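The asymmetry in that ascii example can be checked directly. Reading *--*-* as points at coordinates 0, 3 and 5 (my reading of the art, an assumption), a brute-force check shows that 3 is the nearest neighbor of 0, while the nearest neighbor of 3 is 5, not 0:

```python
points = [0, 3, 5]  # assumed coordinates for the ascii art *--*-*

def nn(x):
    # Nearest neighbor of x among the other points (brute force).
    return min((p for p in points if p != x), key=lambda p: abs(p - x))

# nn(0) is 3, but nn(3) is 5 -- the relation is not symmetric.
```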
- The information can be reused -- you can use it as an upper bound in your next NN search. User A1 (talk) 16:37, 7 August 2010 (UTC)
- Rewording may be in order to prevent misunderstanding. It is true that point A's nearest neighbor, point B, might itself have a nearest neighbor distinct from point A. However, both point A and point B must consider (or rule out) the distance between them when finding their nearest neighbors; accordingly, it's possible to reduce the total number of distance computations by half by exploiting the metric's symmetry. If the distance computation is expensive compared to the other work (floating-point comparisons, iteration, etc.), which it usually is, this will cut the running time roughly in half. On the other hand, the algorithm mentioned above is considerably better than this. Rriegs (talk) 13:31, 14 March 2013 (UTC)
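The halving argument above can be sketched directly: iterate over each unordered pair once, and credit the single distance computation to both endpoints. A minimal sketch, assuming points are distinct tuples:

```python
import math
import itertools

def all_nn_symmetric(points):
    # Track (best distance, best neighbor) per point. Each pairwise
    # distance is computed exactly once and used for both endpoints,
    # halving the number of distance evaluations versus naive all-pairs.
    best = {p: (math.inf, None) for p in points}
    for a, b in itertools.combinations(points, 2):
        d = math.dist(a, b)
        if d < best[a][0]:
            best[a] = (d, b)
        if d < best[b][0]:
            best[b] = (d, a)
    return {p: nb for p, (_, nb) in best.items()}
```

This is still Theta(N^2) pairs, of course, which is why the spatial-structure approaches discussed above remain considerably better; the symmetry trick only changes the constant.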
Dynamic nearest-neighbor tracking
I'm interested in tracking nearest neighbors as points move. I don't know much about this topic, but it seems like there should be an O(N)f(d) algorithm for tracking the nearest neighbors of all points, when there is a lower bound on the separation between points and each point moves only a small amount relative to this bound per time-step, by looking at the nearest neighbors of the old nearest neighbors. (Memory would scale like O(N) for this method.) This would also avoid the tree-backtracking issues of the spatial-partitioning method. Can anybody supply a reference in the main body of the page?
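One time-step of the neighbors-of-old-neighbors idea might be sketched as below. This is purely an illustration of the heuristic as described, not a correct algorithm: the tiny candidate set can miss the true nearest neighbor, e.g. when a previously distant point moves in. The `points`/`old_nn` dictionaries and the `refresh_nn` name are my own illustrative choices.

```python
import math

def refresh_nn(points, old_nn):
    # points: {id: coordinates after the small move}
    # old_nn: {id: id of nearest neighbor before the move}
    # Re-check only each point's old neighbor and that neighbor's old
    # neighbor -- O(1) candidates per point, O(N) per time-step overall.
    # Heuristic sketch only: it can miss the true neighbor.
    new_nn = {}
    for i in points:
        cands = {old_nn[i], old_nn[old_nn[i]]} - {i}
        new_nn[i] = min(cands, key=lambda j: math.dist(points[i], points[j]))
    return new_nn
```

Whether the separation and per-step movement bounds in the question suffice to make some version of this provably correct is exactly what a reference would need to settle.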