Farthest-first traversal

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search
The first five steps of the farthest-first traversal of a planar point set. The first point is chosen arbitrarily and each successive point is as far as possible from all previously chosen points.

In computational geometry, the farthest-first traversal of a bounded metric space is a sequence of points in the space, where the first point is selected arbitrarily and each successive point is as far as possible from the set of previously-selected points. The same concept can also be applied to a finite set of geometric points, by restricting the selected points to belong to the set or equivalently by considering the finite metric space generated by these points. For a finite metric space or finite set of geometric points, the resulting sequence forms a permutation of the points, known as the greedy permutation.

Farthest-point traversals have many applications, including the approximation of the traveling salesman problem and the metric k-center problem. They may be constructed in polynomial time, or (for low-dimensional Euclidean spaces) approximated in near-linear time.

Properties[edit]

Fix a number k, and consider the subset formed by the first k points of the farthest-first traversal of any metric space. Let r be the distance between the final point of the prefix and the set of previously-selected points. Then this subset has the following two properties:

  • All pairs of the selected points are at distance at least r from each other, and
  • All points of the whole metric space are at distance at most r from the subset.

In other words, each prefix of the farthest-first traversal forms a Delone set.[1]

Applications[edit]

The first use of the farthest-first traversal was by Rosenkrantz, Stearns & Lewis (1977) in connection with heuristics for the travelling salesman problem. In the farthest-insertion heuristic, discussed by Rosenkrantz et al., a tour is built up incrementally, by adding one point at a time in the ordering given by a farthest-first traversal. To add each point to the traveling salesman tour of the previous points, this heuristic considers all possible ways of breaking one edge of the tour and replacing it by two edges through the new point, and chooses the cheapest of these replacements. Although Rosenkrantz et al. prove only a logarithmic approximation ratio for this method, they show that in practice it often works better than other insertion methods with better provable approximation ratios.[2]

Later, the same sequence of points was popularized by Gonzalez (1985), who used it as part of a greedy approximation algorithm for the problem of finding k clusters that minimize the maximum diameter of a cluster. The same algorithm applies also, with the same approximation quality, to the metric k-center problem. This problem is one of several formulations of cluster analysis and facility location, in which the goal is to partition a given set of points into k different clusters, each with a chosen center point, such that the maximum distance from any point to the center of its cluster is minimized. For instance, this problem can be used to model the placement of fire stations within a city, in order to ensure that every address within the city can be reached quickly by a fire truck. Gonzalez described a clustering heuristic that selects as centers the first k points of a farthest-first traversal, and then assigns each of the input points to its nearest center. If r is the distance from the set of k selected centers to the next point at position k + 1 in the traversal, then this clustering achieves a distance of r. However, the subset of k centers together with the next point are all at distance at least r from each other, and any k-clustering would put two of these points into one cluster, so there is no clustering with distance better than r/2. Thus, Gonzalez's heuristic gives an approximation ratio of 2 for this problem.[1] This heuristic, and the name "farthest-first traversal", are often incorrectly attributed to a different paper from the same time, Hochbaum & Shmoys (1985). However, Hochbaum and Shmoys used graph-theoretic techniques, not the farthest-first traversal, to obtain a different approximation algorithm for the metric k-center with the same approximation ratio as Gonzalez's heuristic.[3] For both the min-max diameter clustering problem and the metric k-center problem, these approximations are optimal: the existence of a polynomial-time heuristic with any constant approximation ratio less than 2 would imply that P = NP.[1][3]

As well as for clustering, the farthest-first traversal can also be used in another type of facility location problem, the max-min facility dispersion problem, in which the goal is to choose the locations of k different facilities so that they are as far apart from each other as possible. More precisely, the goal in this problem is to choose k points from a given metric space or a given set of candidate points, in such a way as to maximize the minimum pairwise distance between the selected points. Again, this can be approximated by choosing the first k points of a farthest-first traversal. If r denotes the distance of the kth point from all previous points, then every point of the metric space or the candidate set is within distance r of the first k − 1 points. By the pigeonhole principle, some two points of the optimal solution (whatever it is) must both be within distance r of the same point among these first k − 1 chosen points, and (by the triangle inequality) within distance 2r of each other. Therefore, the heuristic solution given by the farthest-first traversal is within a factor of two of optimal.[4][5][6]

Other applications of the farthest-first traversal include color quantization (clustering the colors in an image to a smaller set of representative colors),[7] progressive scanning of images (choosing an order to display the pixels of an image so that prefixes of the ordering produce good lower-resolution versions of the whole image rather than filling in the image from top to bottom),[8] point selection in the probabilistic roadmap method for motion planning,[9] simplification of point clouds,[10] generating masks for halftone images,[11][12] hierarchical clustering,[13] finding the similarities between polygon meshes of similar surfaces,[14] underwater robot task planning,[15] fault detection in sensor networks,[16] modeling phylogenetic diversity,[17] matching vehicles in a heterogenous fleet to customer delivery requests,[18] uniform distribution of geodetic observatories on the Earth's surface[19] or of other types of sensor network,[20] generation of virtual point lights in the instant radiosity computer graphics rendering method,[21] and geometric range searching data structures.[22]

Algorithms[edit]

The farthest-first traversal of a finite point set may be computed by a greedy algorithm that maintains the distance of each point from the previously selected points, performing the following steps:

  • Initialize the sequence of selected points to the empty sequence, and the distances of each point to the selected points to infinity.
  • While not all points have been selected, repeat the following steps:
    • Scan the list of not-yet-selected points to find a point p that has the maximum distance from the selected points.
    • Remove p from the not-yet-selected points and add it to the end of the sequence of selected points.
    • For each remaining not-yet-selected point q, replace the distance stored for q by the minimum of its old value and the distance from p to q.

For a set of n points, this algorithm takes O(n2) steps and O(n2) distance computations. A faster approximation algorithm, given by Har-Peled & Mendel (2006), applies to any subset of points in a metric space with bounded doubling dimension, a class of spaces that include the Euclidean spaces of bounded dimension. Their algorithm finds a sequence of points in which each successive point has distance within a 1 − ε factor of the farthest distance from the previously-selected point, where ε can be chosen to be any positive number. It takes time O(n log n).[23]

For selecting points from a continuous space such as the Euclidean plane, rather than a finite set of candidate points, these methods will not work directly, because there would be an infinite number of distances to maintain. Instead, each new point should be selected as the center of the largest empty circle defined by the previously-selected point set.[8] This center will always lie on a vertex of the Voronoi diagram of the already selected points, or at a point where an edge of the Voronoi diagram crosses the domain boundary. In this formulation the method for constructing farthest-first traversals has also been called incremental Voronoi insertion.[24] It is similar to Ruppert's algorithm for finite element mesh generation, but differs in the choice of which Voronoi vertex to insert at each step.[25]

See also[edit]

  • Lloyd's algorithm, a different method for generating evenly spaced points in geometric spaces

References[edit]

  1. ^ a b c Gonzalez, T. F. (1985), "Clustering to minimize the maximum intercluster distance", Theoretical Computer Science, 38 (2–3): 293–306, doi:10.1016/0304-3975(85)90224-5, MR 0807927.
  2. ^ Rosenkrantz, D. J.; Stearns, R. E.; Lewis, P. M., II (1977), "An analysis of several heuristics for the traveling salesman problem", SIAM Journal on Computing, 6 (3): 563–581, doi:10.1137/0206041, MR 0459617.
  3. ^ a b Hochbaum, Dorit S.; Shmoys, David B. (1985), "A best possible heuristic for the k-center problem", Mathematics of Operations Research, 10 (2): 180–184, doi:10.1287/moor.10.2.180, MR 0793876.
  4. ^ White, Douglas J. (1991), "The maximal-dispersion problem", IMA Journal of Mathematics Applied in Business and Industry, 3 (2): 131–140 (1992), doi:10.1093/imaman/3.2.131, MR 1154657. White credits the use of the farthest-first traversal as a heuristic for this problem to Steuer, R. E. (1986), Multiple-Criteria Optimization: Theory, Computation, and Applications, New York: Wiley.
  5. ^ Tamir, Arie (1991), "Obnoxious facility location on graphs", SIAM Journal on Discrete Mathematics, 4 (4): 550–567, doi:10.1137/0404048, MR 1129392.
  6. ^ Ravi, S. S.; Rosenkrantz, D. J.; Tayi, G. K. (1994), "Heuristic and special case algorithms for dispersion problems", Operations Research, 42 (2): 299–310, doi:10.1287/opre.42.2.299, JSTOR 171673.
  7. ^ Xiang, Z. (1997), "Color image quantization by minimizing the maximum intercluster distance", ACM Transactions on Graphics, 16 (3): 260–276, doi:10.1145/256157.256159.
  8. ^ a b Eldar, Y.; Lindenbaum, M.; Porat, M.; Zeevi, Y. Y. (1997), "The farthest point strategy for progressive image sampling", IEEE Transactions on Image Processing, 6 (9): 1305–1315, Bibcode:1997ITIP....6.1305E, doi:10.1109/83.623193.
  9. ^ Mazer, E.; Ahuactzin, J. M.; Bessiere, P. (1998), "The Ariadne's clew algorithm", Journal of Artificial Intelligence Research, 9: 295–316, doi:10.1613/jair.468.
  10. ^ Moenning, C.; Dodgson, N. A. (2003), "A new point cloud simplification algorithm", 3rd IASTED International Conference on Visualization, Imaging, and Image Processing.
  11. ^ Gotsman, Craig; Allebach, Jan P. (1996), "Bounds and algorithms for dither screens" (PDF), in Rogowitz, Bernice E.; Allebach, Jan P., Human Vision and Electronic Imaging, Proc. SPIE, 2657, pp. 483–492, doi:10.1117/12.238746.
  12. ^ Shahidi, R.; Moloney, C.; Ramponi, G. (2004), "Design of farthest-point masks for image halftoning", EURASIP Journal on Applied Signal Processing, 12: 1886–1898, Bibcode:2004EJASP2004...45S, doi:10.1155/S1110865704403217.
  13. ^ Dasgupta, S.; Long, P. M. (2005), "Performance guarantees for hierarchical clustering", Journal of Computer and System Sciences, 70 (4): 555–569, doi:10.1016/j.jcss.2004.10.006, MR 2136964.
  14. ^ Lipman, Y.; Funkhouser, T. (2009), "Möbius voting for surface correspondence", Proc. ACM SIGGRAPH, pp. 72:1–72:12, doi:10.1145/1576246.1531378.
  15. ^ Girdhar, Y.; Giguère, P.; Dudek, G. (2012), "Autonomous adaptive underwater exploration using online topic modelling" (PDF), Proc. Int. Symp. Experimental Robotics.
  16. ^ Altinisik, U.; Yildirim, M.; Erkan, K. (2012), "Isolating non-predefined sensor faults by using farthest first traversal algorithm", Ind. Eng. Chem. Res., 51 (32): 10641–10648, doi:10.1021/ie201850k.
  17. ^ Bordewich, Magnus; Rodrigo, Allen; Semple, Charles (2008), "Selecting taxa to save or sequence: Desirable criteria and a greedy solution", Systematic Biology, 57 (6): 825–834, doi:10.1080/10635150802552831.
  18. ^ Fisher, Marshall L.; Jaikumar, Ramchandran (1981), "A generalized assignment heuristic for vehicle routing", Networks, 11 (2): 109–124, doi:10.1002/net.3230110205, MR 0618209. As cited by Gheysens, Filip; Golden, Bruce; Assad, Arjang (1986), "A new heuristic for determining fleet size and composition", in Gallo, Giorgio; Sandi, Claudio, Netflow at Pisa, Mathematical Programming Studies, 26, Springer, pp. 233–236, doi:10.1007/bfb0121103.
  19. ^ Hase, Hayo (2000), "New method for the selection of additional sites for the homogenisation of an inhomogeneous cospherical point distribution", in Rummel, Reinhard; Drewes, Hermann; Bosch, Wolfgang; et al., Towards an Integrated Global Geodetic Observing System (IGGOS): IAG Section II Symposium Munich, October 5-9, 1998, Posters – Session B, International Association of Geodesy Symposia, Springer, pp. 180–183, doi:10.1007/978-3-642-59745-9_35.
  20. ^ Vieira, Luiz Filipe M.; Vieira, Marcos Augusto M.; Ruiz, Linnyer Beatrys; Loureiro, Antonio A. F.; Silva, Diógenes Cecílio; Fernandes, Antônio Otávio (2004), "Efficient Incremental Sensor Network Deployment Algorithm" (PDF), Proc. Brazilian Symp. Computer Networks, pp. 3–14.
  21. ^ Laine, Samuli; Saransaari, Hannu; Kontkanen, Janne; Lehtinen, Jaakko; Aila, Timo (2007), "Incremental instant radiosity for real-time indirect illumination", Proceedings of the 18th Eurographics Conference on Rendering Techniques (EGSR'07), Aire-la-Ville, Switzerland, Switzerland: Eurographics Association, pp. 277–286, doi:10.2312/EGWR/EGSR07/277-286, ISBN 978-3-905673-52-4.
  22. ^ Abbar, S.; Amer-Yahia, S.; Indyk, P.; Mahabadi, S.; Varadarajan, K. R. (2013), "Diverse near neighbor problem", Proceedings of the 29th Annual Symposium on Computational Geometry, pp. 207–214, doi:10.1145/2462356.2462401.
  23. ^ Har-Peled, S.; Mendel, M. (2006), "Fast construction of nets in low-dimensional metrics, and their applications", SIAM Journal on Computing, 35 (5): 1148–1184, arXiv:cs/0409057, doi:10.1137/S0097539704446281, MR 2217141.
  24. ^ Teramoto, Sachio; Asano, Tetsuo; Katoh, Naoki; Doerr, Benjamin, "Inserting points uniformly at every instance", IEICE Transactions on Information and Systems, E89-D (8): 2348–2356, Bibcode:2006IEITI..89.2348T, doi:10.1093/ietisy/e89-d.8.2348.
  25. ^ Ruppert, Jim (1995), "A Delaunay refinement algorithm for quality 2-dimensional mesh generation", Journal of Algorithms, 18 (3): 548–585, doi:10.1006/jagm.1995.1021.