Euclidean minimum spanning tree

An EMST of 25 random points in the plane

The Euclidean minimum spanning tree or EMST is a minimum spanning tree of a set of ${\displaystyle n}$ points in the plane or higher-dimensional Euclidean space. It connects the points by a system of line segments, so that any two points can reach each other along a path through the line segments, and it selects line segments that minimize the sum of the Euclidean distances between directly-connected pairs of points.

In the plane, the Euclidean minimum spanning tree is a subgraph of the Delaunay triangulation. Using this fact, the Euclidean minimum spanning tree for a given set of planar points may be found in time ${\displaystyle O(n\log n)}$ (expressed in Big O notation), using algorithms based on comparisons of simple combinations of input coordinates. Faster randomized algorithms are known in models of computation allowing more powerful operations such as integer rounding.

In higher dimensions (${\displaystyle d\geq 3}$), finding an optimal algorithm remains an open problem.

Lower bound

An asymptotic lower bound of ${\displaystyle \Omega (n\log n)}$ of the EMST problem can be established in restricted models of computation, such as the algebraic decision tree and algebraic computation tree models, in which the algorithm has access to the input points only through certain restricted primitives that perform simple algebraic computations on their coordinates: in these models, the closest pair of points problem requires ${\displaystyle \Omega (n\log n)}$ time, but the closest pair is necessarily an edge of the EMST, so the EMST also requires this much time.[1] However, if the input points have integer coordinates and bitwise operations and table indexing operations are permitted using those coordinates, then faster algorithms are possible.[2]

Algorithms for computing EMSTs in two dimensions

The simplest algorithm to find an EMST in two dimensions, given ${\displaystyle n}$ points, is to actually construct the complete graph on ${\displaystyle n}$ vertices, which has ${\displaystyle n(n-1)/2}$ edges, compute each edge weight by finding the distance between each pair of points, and then run a standard minimum spanning tree algorithm (such as Prim's algorithm or Kruskal's algorithm) on it. This solution uses time and space that are both ${\displaystyle O(n^{2})}$.

A faster approach to finding the EMST in a plane is to note that it is a subgraph of every Delaunay triangulation of the ${\displaystyle n}$ points, a much-reduced set of edges:

1. Compute the Delaunay triangulation in ${\displaystyle O(n\log n)}$ time and ${\displaystyle O(n)}$ space. Because the Delaunay triangulation is a planar graph, and there are no more than three times as many edges as vertices in any planar graph, this generates only ${\displaystyle O(n)}$ edges.
2. Label each edge with its length.
3. Run a graph minimum spanning tree algorithm on this graph to find a minimum spanning tree. Since there are ${\displaystyle O(n)}$ edges, this requires ${\displaystyle O(n\log n)}$ time using any of the standard minimum spanning tree algorithms such as Borůvka's algorithm, Prim's algorithm, or Kruskal's algorithm.

The final result is an algorithm taking ${\displaystyle O(n\log n)}$ time and ${\displaystyle O(n)}$ space.

If the input coordinates are integers and can be used as array indices, faster algorithms are possible: the Delaunay triangulation can be constructed by a randomized algorithm in ${\displaystyle O(n\log \log n)}$ expected time.[2] Additionally, since the Delaunay triangulation is a planar graph, its minimum spanning tree can be found in linear time by a variant of Borůvka's algorithm that removes all but the cheapest edge between each pair of components after each stage of the algorithm.[3][4] Therefore, the total expected time for this algorithm is ${\displaystyle O(n\log \log n)}$.[2]

Higher dimensions

The problem can also be generalized to ${\displaystyle n}$ points in the ${\displaystyle d}$-dimensional space ${\displaystyle \mathbb {R} ^{d}}$. In higher dimensions, the connectivity determined by the Delaunay triangulation (which, likewise, partitions the convex hull into ${\displaystyle d}$-dimensional simplices) contains the minimum spanning tree; however, the triangulation might contain the complete graph.[5] Therefore, finding the Euclidean minimum spanning tree as a spanning tree of the complete graph or as a spanning tree of the Delaunay triangulation both take ${\displaystyle O(dn^{2})}$ time. For three dimensions it is possible to find the minimum spanning tree in time ${\displaystyle O{\bigl (}(n\log n)^{4/3}{\bigr )}}$, and in any dimension greater than three it is possible to solve it in a time that is faster than the quadratic time bound for the complete graph and Delaunay triangulation algorithms.[5] For uniformly random point sets it is possible to compute minimum spanning trees as quickly as sorting.[6] Using a well-separated pair decomposition, it is possible to produce a ${\displaystyle (1+\varepsilon )}$-approximation in ${\displaystyle O(n\log n)}$ time.[7]

Subtree of Delaunay triangulation

All edges of an EMST are edges of a relative neighborhood graph,[8][9][10] which in turn are edges of a Gabriel graph, which are edges in a Delaunay triangulation of the points,[11][12] as can be proven via the equivalent contrapositive statement: every edge not in a Delaunay triangulation is also not in any EMST. The proof is based on two properties of minimum spanning trees and Delaunay triangulations:

1. (the cycle property of minimum spanning trees): For any cycle ${\displaystyle C}$ in the graph, if the weight of an edge ${\displaystyle e}$ of ${\displaystyle C}$ is larger than the weights of other edges of ${\displaystyle C}$, then this edge cannot belong to an MST.
2. (a property of Delaunay triangulations): If there is a circle with two of the input points on its boundary which contains no other input points, the line between those two points is an edge of every Delaunay triangulation.

Consider an edge ${\displaystyle e}$ between two input points ${\displaystyle o}$ and ${\displaystyle q}$ which is not an edge of a Delaunay triangulation. Property 2 implies that the circle ${\displaystyle C}$ with ${\displaystyle e}$ as its diameter must contain some other point ${\displaystyle r}$ inside. But then ${\displaystyle r}$ is closer to both ${\displaystyle p}$ and ${\displaystyle q}$ than they are to each other, and so the edge from ${\displaystyle p}$ to ${\displaystyle q}$ is the longest edge in the cycle of points ${\displaystyle p\to q\to r\to p}$, and by property 1 ${\displaystyle e}$ is not in any EMST.

Expected size

The expected size of the EMST for large numbers of points has been determined by J. Michael Steele.[13] If ${\displaystyle f}$ is the density of the probability function for picking points, then for large ${\displaystyle n}$ and ${\displaystyle d\neq 1}$ the size of the EMST is approximately

${\displaystyle c(d)n^{\frac {d-1}{d}}\int _{\mathbb {R} ^{d}}f(x)^{\frac {d-1}{d}}dx}$

where ${\displaystyle c(d)}$ is a constant depending only on the dimension ${\displaystyle d}$. The exact value of the constants are unknown but can be estimated from empirical evidence.

Applications

An obvious application of Euclidean minimum spanning trees is to find the cheapest network of wires or pipes to connect a set of places, assuming the links cost a fixed amount per unit length. However, while these give an absolute lower bound on the amount of connection needed, most such networks prefer a k-connected graph to a tree, so that failure of an any individual link will not split the network into parts.

Another application of EMSTs is a constant-factor approximation algorithm for approximately solving the Euclidean traveling salesman problem, the version of the traveling salesman problem on a set of points in the plane with edges labelled by their length. This realistic variation of the problem can be solved within a factor of 2 by computing the EMST, doing a walk along its boundary which outlines the entire tree, and then removing all but one occurrence of each vertex from this walk.

Planar realization

The realization problem for Euclidean minimum spanning trees is stated as follows: Given a tree ${\displaystyle T=(V,E)}$, find a location ${\displaystyle D(u)}$ for each vertex ${\displaystyle u\in V}$ so that ${\displaystyle T}$ is a minimum spanning tree of ${\displaystyle \{D(u):u\in V\}}$, or determine that no such locations exist. Testing of the existence of a realization in the plane is NP-hard.[14]

References

1. ^ Yao, A. C.-C. (1989), "Lower bounds for algebraic computation trees with integer inputs", Proc. 30th Annual Symposium on Foundations of Computer Science (FOCS 1989), pp. 308–313, doi:10.1109/SFCS.1989.63495
2. ^ a b c Buchin, Kevin; Mulzer, Wolfgang (2009), "Delaunay triangulations in O(sort(n)) time and more" (PDF), Proc. 50th IEEE Symposium on Foundations of Computer Science, pp. 139–148, doi:10.1109/FOCS.2009.53
3. ^ Eppstein, David (1999), "Spanning trees and spanners" (PDF), in Sack, J.-R.; Urrutia, J. (eds.), Handbook of Computational Geometry, Elsevier, pp. 425–461
4. ^ Mareš, Martin (2004), "Two linear time algorithms for MST on minor closed graph classes" (PDF), Archivum Mathematicum, 40 (3): 315–320
5. ^ a b Agarwal, P. K.; Edelsbrunner, H.; Schwarzkopf, O.; Welzl, E. (1991), "Euclidean minimum spanning trees and bichromatic closest pairs", Discrete and Computational Geometry, Springer, 6 (1): 407–422, doi:10.1007/BF02574698
6. ^ Chatterjee, S.; Connor, M.; Kumar, P. (2010), "Geometric minimum spanning trees with GeoFilterKruskal", in Festa, Paola (ed.), Symposium on Experimental Algorithms, Lecture Notes in Computer Science, 6049, Springer-Verlag, pp. 486–500, doi:10.1007/978-3-642-13193-6_41
7. ^ Smid, Michiel (May 2018), "The well-separated pair decomposition and its applications" (PDF), Handbook of Approximation Algorithms and Metaheuristics (2nd ed.), Chapman and Hall/CRC, doi:10.1201/9781351235426
8. ^ Jaromczyk, J. W.; Toussaint, G. T. (1992), "Relative neighborhood graphs and their relatives", Proceedings of the IEEE, 80 (9): 1502–1517, doi:10.1109/5.163414
9. ^ Toussaint, G. T. (1980), "Comment: Algorithms for computing relative neighborhood graph", Electronics Letters, 16 (22): 860, doi:10.1049/el:19800611; reply by Urquhart, pp. 860–861
10. ^ Toussaint, G. T. (1980), "The relative neighborhood graph of a finite planar set", Pattern Recognition, 12 (4): 261–268, doi:10.1016/0031-3203(80)90066-7
11. ^ Pless, Robert (Spring 2003), "Lecture 17: Voronoi Diagrams and Delauney Triangulations", Computational Geometry Class Page, Washington University, archived from the original on 2006-09-12
12. ^ Sedgewick, Robert; Wayne, Kevin (Spring 2007), "Minimum Spanning Tree lecture notes" (PDF), Computer Science 226: Algorithms & Data Structures, Princeton University
13. ^ Steele, J. Michael (1988), "Growth rates of Euclidean minimum spanning trees with power weighted edges" (PDF), The Annals of Probability, 16 (4): 1767–1787, doi:10.1214/aop/1176991596
14. ^ Eades, Peter; Whitesides, Sue (1994), "The realization problem for Euclidean minimum spanning trees is NP-hard", Proc. 10th ACM Symposium on Computational Geometry, pp. 49–56, doi:10.1145/177424.177507