Closest pair of points problem

From Wikipedia, the free encyclopedia

Revision as of 17:33, 9 February 2007

Closest pair of points shown in red.

The closest pair of points problem or closest pair problem is a problem of computational geometry: given n points in a d-dimensional metric space, find a pair of points with the smallest distance between them. Its two-dimensional version, for points in the plane[1], was among the first geometric problems treated at the origins of the systematic study of the computational complexity of geometric algorithms.

A naive algorithm that computes the distances between all pairs of points and selects the minimum requires O(dn²) time. It turns out that the problem may be solved in O(n log n) time (assuming the dimension d of the space to be constant). In the rest of the text, we assume that d is a constant, unless otherwise specified. In the algebraic decision tree model of computation, the O(n log n) algorithm is optimal. The optimality follows from the observation that the element uniqueness problem (with a lower bound of Ω(n log n) on time complexity) is reducible to the closest pair problem: after solving the closest pair problem, checking whether the minimal distance is 0 answers the question of whether there are two coinciding points.
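The reduction can be made concrete: to decide element uniqueness, run any closest pair solver and test whether the minimal distance is zero. A minimal sketch (the helper names are illustrative, and the brute-force solver stands in for any closest pair algorithm):

```python
import math
from itertools import combinations

def closest_pair_distance(points):
    """Brute-force closest-pair distance; any closest-pair solver works here."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))

def has_duplicate(points):
    """Element uniqueness via closest pair: duplicates exist iff min distance is 0."""
    return closest_pair_distance(points) == 0.0
```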

In the computational model which assumes that the floor function is computable in constant time, the problem can be solved in O(n log log n) time[2]. If we allow randomization to be used together with the floor function, the problem can be solved in O(n) time[3].
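These floor-based algorithms rest on grid bucketing: given a distance estimate d, hashing each point to the cell (⌊x/d⌋, ⌊y/d⌋) lets one decide in expected linear time whether any pair lies at distance less than d, since such a pair must fall in the same cell or in adjacent cells. A minimal sketch of this building block only (not the full randomized algorithm; the function name is illustrative):

```python
import math
from collections import defaultdict

def pair_closer_than(points, d):
    """Return True if some pair of 2-D points is at distance < d,
    using floor-based grid hashing with cells of side d. Two points
    closer than d must share a cell or sit in adjacent cells, so it
    suffices to scan the 3x3 block of cells around each point."""
    grid = defaultdict(list)
    for p in points:
        cx, cy = math.floor(p[0] / d), math.floor(p[1] / d)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for q in grid[(cx + dx, cy + dy)]:
                    if math.dist(p, q) < d:
                        return True
        grid[(cx, cy)].append(p)  # insert only after checking, so each pair is tested once
    return False
```

When d is close to the true minimal distance, each cell holds O(1) points, making the scan expected linear time overall.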

Brute-force algorithm

The closest pair of points can easily be computed in O(n²) time. To do that, one could compute the distances between all the n(n−1)/2 pairs of points, then pick the pair with the smallest distance, as illustrated below.

minDist = dist(P[0],P[1])
closestPair = (P[0],P[1])
for p in P:
 for q in P:
  if p ≠ q and dist(p,q) < minDist:
   minDist = dist(p,q)
   closestPair = (p,q)
return closestPair
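The pseudocode above translates directly into runnable Python (iterating over each unordered pair once instead of all ordered pairs):

```python
import math

def brute_force_closest_pair(P):
    """O(n^2) closest pair: try every pair, keep the minimum."""
    minDist = math.dist(P[0], P[1])
    closestPair = (P[0], P[1])
    for i, p in enumerate(P):
        for q in P[i + 1:]:              # each unordered pair once
            if math.dist(p, q) < minDist:
                minDist = math.dist(p, q)
                closestPair = (p, q)
    return closestPair
```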

Planar case

The problem can be solved in O(n log n) time using the recursive divide and conquer approach, e.g., as follows[1]:

  1. Sort points along the x-coordinate
  2. Split the set of points into two equal-sized subsets by a vertical line
  3. Solve the problem recursively in the left and right subsets. This will give the left-side and right-side minimal distances dLmin and dRmin, respectively.
  4. Find the minimal distance among the pair of points in which one point lies on the left of the dividing vertical and the second point lies to the right.
  5. The final answer is the minimum among dLmin, dRmin, and the minimal left-right distance found in step 4.
Divide-and-conquer: sparse box observation

It turns out that step 4 may be accomplished in linear time. Again, a naive approach would require the calculation of distances for all left-right pairs, i.e., in quadratic time. The key observation is based on the following sparsity property of the point set. We already know that the closest pair of points is no further apart than dist = min(dLmin, dRmin). Therefore for each point p to the left of the dividing line we have to compare the distances to the points that lie in the rectangle of dimensions (dist, 2 · dist) to the right of the dividing line, as shown in the figure. What is more, this rectangle can contain at most 6 points with pairwise distances at least dRmin. Therefore it suffices to perform at most 6n left-right distance computations in step 4.
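Putting steps 1–5 and the sparse-box observation together, the planar algorithm can be sketched as follows (a minimal implementation returning only the minimal distance; scanning the middle strip in y-order means each point is compared with a constant number of successors):

```python
import math

def closest_pair(points):
    """Planar closest pair via divide and conquer; returns the minimal distance."""
    def solve(px):                               # px is sorted by x (step 1)
        n = len(px)
        if n <= 3:                               # base case: brute force
            return min(math.dist(p, q)
                       for i, p in enumerate(px) for q in px[i + 1:])
        mid = n // 2                             # step 2: split by a vertical line
        x_mid = px[mid][0]
        d = min(solve(px[:mid]), solve(px[mid:]))  # step 3: recurse on both halves

        # Step 4: points within d of the dividing line, scanned in y-order.
        strip = sorted((p for p in px if abs(p[0] - x_mid) < d),
                       key=lambda p: p[1])
        for i, p in enumerate(strip):
            for q in strip[i + 1:i + 8]:         # sparse box: O(1) candidates per point
                if q[1] - p[1] >= d:
                    break
                d = min(d, math.dist(p, q))
        return d                                 # step 5: overall minimum

    return solve(sorted(points))
```

Re-sorting the strip inside each call makes this version O(n log² n); carrying a y-sorted copy of the points through the recursion removes the inner sort and yields the O(n log n) bound.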

As the closest pair of points defines an edge in the Delaunay triangulation and corresponds to two adjacent cells in the Voronoi diagram, the closest pair can be determined in linear time when we are given one of these two structures. Computing either the Delaunay triangulation or the Voronoi diagram takes O(n log n) time. These approaches are not efficient for dimension d > 2, while the divide-and-conquer algorithm can be generalized to take O(n log n) time for any constant value of d.

Dynamic closest-pair problem

The dynamic version for the closest-pair problem is stated as follows:

  • Given a dynamic set of objects, find algorithms and data structures for efficient recalculation of the closest pair of objects each time the objects are inserted or deleted.

If the bounding box for all points is known in advance and a constant-time floor function is available, then a data structure using expected O(n) space has been suggested that supports expected O(log n)-time insertions and deletions and constant query time. When modified for the algebraic decision tree model, insertions and deletions would require O(log² n) expected time.[4]

References

  1. ^ a b M. I. Shamos and D. Hoey. "Closest-point problems." In Proc. 16th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 151–162, 1975
  2. ^ S. Fortune and J. E. Hopcroft. "A note on Rabin's nearest-neighbor algorithm." Information Processing Letters, 8(1), pp. 20–23, 1979
  3. ^ S. Khuller and Y. Matias. "A simple randomized sieve algorithm for the closest-pair problem." Inf. Comput., 118(1), pp. 34–37, 1995
  4. ^ Mordecai Golin, Rajeev Raman, Christian Schwarz, Michiel Smid, "Randomized Data Structures for the Dynamic Closest-Pair Problem", SIAM J. Comput., vol. 26, no. 4, 1998, preliminary version reported at the 4th Annu. ACM-SIAM Symp. on Discrete Algorithms, pp. 301–310 (1993)
  • Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Pages 957–961 of section 33.4: Finding the closest pair of points.
  • Jon Kleinberg; Éva Tardos (2006). Algorithm Design. Addison Wesley.