Prim's algorithm

In computer science, Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a connected, weighted, undirected graph. That is, it finds a subset of the edges that forms a tree including every vertex, such that the total weight of the edges in the tree is minimized. The algorithm was developed in 1930 by the Czech mathematician Vojtěch Jarník, independently rediscovered by the computer scientist Robert C. Prim in 1957, and rediscovered again by Edsger Dijkstra in 1959. It is therefore also sometimes called the DJP algorithm, the Jarník algorithm, or the Prim–Jarník algorithm.

Other algorithms for this problem include Kruskal's algorithm and Borůvka's algorithm. These algorithms find the minimum spanning forest in a possibly disconnected graph. Prim's algorithm can also find a minimum spanning forest if it is run separately on each connected component of the graph.
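The per-component idea can be sketched in Python as follows. This is only an illustration: the function name `prim_forest` and the adjacency-dictionary input format are assumptions made for this example, and the heap of candidate edges is one of several possible ways to pick the next edge.

```python
import heapq

def prim_forest(adj):
    """Return (edges, total_weight) of a minimum spanning forest.

    adj maps each vertex to a list of (neighbor, weight) pairs; every
    undirected edge is assumed to appear in both endpoint lists, and
    vertices are assumed to be mutually comparable (for heap tie-breaks).
    """
    visited = set()
    forest = []
    total = 0
    for root in adj:                      # restart Prim in each unvisited component
        if root in visited:
            continue
        visited.add(root)
        heap = [(w, root, v) for v, w in adj[root]]
        heapq.heapify(heap)
        while heap:
            w, u, v = heapq.heappop(heap)
            if v in visited:              # stale entry: v joined a tree earlier
                continue
            visited.add(v)
            forest.append((u, v, w))
            total += w
            for x, wx in adj[v]:
                if x not in visited:
                    heapq.heappush(heap, (wx, v, x))
    return forest, total
```

An isolated vertex simply contributes a tree with no edges.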

Description

Informal

1. Initialize a tree with a single vertex, chosen arbitrarily from the graph.
2. Grow the tree by one edge: of the edges that connect the tree to vertices not yet in the tree, find the minimum-weight edge, and transfer it to the tree.
3. Repeat step 2 until all vertices are in the tree.

Technical

If a graph is empty then we are done immediately. Thus, we assume otherwise.

The algorithm starts with a tree consisting of a single vertex, and continuously increases its size one edge at a time, until it spans all vertices.

Input: A non-empty connected weighted graph with vertices V and edges E (the weights can be negative).
Initialize: V_new = {x}, where x is an arbitrary node (starting point) from V, E_new = {}
Repeat until V_new = V:
- Choose an edge {u, v} with minimal weight such that u is in V_new and v is not (if there are multiple edges with the same weight, any of them may be picked)
- Add v to V_new, and {u, v} to E_new
Output: V_new and E_new describe a minimum spanning tree
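The procedure above can be transcribed almost literally into Python. The sketch below is deliberately naive: it rescans every edge in each round, so it runs in O(|V||E|) time. The function name `prim_naive` and the edge-list input format are assumptions made for this example.

```python
def prim_naive(vertices, edges):
    """Direct transcription of the V_new / E_new procedure.

    vertices: iterable of hashable vertices of a connected graph.
    edges: list of (u, v, weight) tuples for undirected edges.
    """
    vertices = set(vertices)
    if not vertices:
        return set(), []
    start = next(iter(vertices))          # arbitrary starting vertex x
    v_new = {start}
    e_new = []
    while v_new != vertices:
        best = None
        for u, v, w in edges:
            # Keep only edges with exactly one endpoint inside the tree.
            if (u in v_new) == (v in v_new):
                continue
            if best is None or w < best[2]:
                best = (u, v, w)
        if best is None:
            raise ValueError("graph is not connected")
        u, v, w = best
        v_new.add(v if u in v_new else u)  # add the endpoint outside the tree
        e_new.append(best)
    return v_new, e_new
```

Negative weights need no special handling, since only comparisons between weights are used.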

MST-PRIM(G, w, r) {
    for each u ∈ G.V
        u.key = ∞
        u.parent = NIL
    r.key = 0
    Q = G.V
    while (Q ≠ ∅)
        u = Extract-Min(Q)
        for each v ∈ G.Adj[u]
            if (v ∈ Q) and w(u,v) < v.key
                v.parent = u
                v.key = w(u,v)
}

^[1]
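For readers who prefer running code, the MST-PRIM pseudocode above might be rendered in Python roughly as follows. Python's standard library has no priority queue with a decrease-key operation, so this sketch replaces Extract-Min with a linear scan over Q; that substitution, the function name `mst_prim`, and the dictionary-of-dictionaries graph format are all assumptions of this example.

```python
import math

def mst_prim(adj, r):
    """Python rendering of MST-PRIM(G, w, r).

    adj maps each vertex u to a dict {v: w(u, v)}; the graph is assumed
    to be undirected and connected. Returns the parent and key maps.
    """
    key = {u: math.inf for u in adj}
    parent = {u: None for u in adj}
    key[r] = 0
    q = set(adj)                          # Q = G.V
    while q:
        u = min(q, key=key.__getitem__)   # Extract-Min by linear scan
        q.remove(u)
        for v, w in adj[u].items():
            if v in q and w < key[v]:
                parent[v] = u
                key[v] = w
    return parent, key
```

The tree edges are the pairs (v, parent[v]) for every vertex v other than r, and the sum of the final keys is the weight of the tree.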

To implement Prim's algorithm efficiently, we need a fast way to select a new edge to add to the tree formed by the edges in A. In the pseudocode above, the connected graph G and the root r of the minimum spanning tree to be grown are inputs to the algorithm. During execution, all vertices that are not in the tree reside in a min-priority queue Q based on a key attribute. For each vertex v, the attribute v.key is the minimum weight of any edge connecting v to a vertex in the tree; by convention, v.key = ∞ if there is no such edge. The attribute v.parent names the parent of v in the tree. The algorithm implicitly maintains the set A from GENERIC-MST as

A = { (v, v.parent) : v ∈ V - {r} - Q }.

When the algorithm terminates, the min-priority queue Q is empty; the minimum spanning tree A for G is thus

A = { (v, v.parent) : v ∈ V - {r} }.

Lines 1–5 set the key of each vertex to ∞ (except for the root r, whose key is set to 0 so that it will be the first vertex processed), set the parent of each vertex to NIL, and initialize the min-priority queue Q to contain all the vertices. The algorithm maintains the following three-part loop invariant:
^[2]

Prior to each iteration of the while loop of lines 6–11,

1. A = { (v, v.parent) : v ∈ V - {r} - Q }.

2. The vertices already placed into the minimum spanning tree are those in V−Q.

3. For all vertices v ∈ Q, if v.parent ≠ NIL, then v.key < ∞ and v.key is the weight of a light edge (v, v.parent) connecting v to some vertex already placed into the minimum spanning tree.

Line 7 identifies a vertex u ∈ Q incident on a light edge that crosses the cut (V−Q, Q) (with the exception of the first iteration, in which u = r due to line 4). Removing u from the set Q adds it to the set V−Q of vertices in the tree, thus adding (u, u.parent) to A. The for loop of lines 8–11 updates the key and parent attributes of every vertex v adjacent to u but not in the tree, thereby maintaining the third part of the loop invariant.
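One way to make the loop invariant concrete is to assert it before every iteration. The sketch below does this for parts 2 and 3 and returns the set A at termination. The function name `mst_prim_checked`, the linear-scan Extract-Min, and the symmetric dictionary-of-dictionaries graph format are assumptions of this example, and the graph is assumed to be connected.

```python
import math

def mst_prim_checked(adj, r):
    """MST-PRIM with the loop invariant asserted before every iteration.

    adj maps each vertex u to a dict {v: w(u, v)} and is assumed to be
    symmetric (undirected graph). Extract-Min is a plain linear scan.
    """
    key = {u: math.inf for u in adj}
    parent = {u: None for u in adj}
    key[r] = 0
    q = set(adj)
    while q:
        tree = set(adj) - q               # part 2: vertices already in the tree
        for v in q:                       # part 3: keys describe light edges
            crossing = [w for x, w in adj[v].items() if x in tree]
            if parent[v] is None:
                assert not crossing       # no edge from v into the tree yet
            else:
                assert parent[v] in tree
                assert key[v] == adj[v][parent[v]] == min(crossing)
        u = min(q, key=key.__getitem__)
        q.remove(u)
        for v, w in adj[u].items():
            if v in q and w < key[v]:
                parent[v] = u
                key[v] = w
    # Part 1 at termination: A = {(v, v.parent) : v in V - {r}}.
    return {(v, parent[v]) for v in adj if v != r}
```

If an assertion ever failed, the key or parent bookkeeping would have drifted from the cut (V−Q, Q) described above.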

Time complexity

Prim's algorithm has many applications, such as maze generation, in which the algorithm is applied to a randomly weighted grid graph.
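As an illustration of the maze application, the sketch below runs Prim's algorithm on a grid graph whose edge weights are drawn at random. The function name `prim_maze`, the `seed` parameter, and the representation of the maze as a list of passages between adjacent cells are assumptions made for this example.

```python
import heapq
import random

def prim_maze(rows, cols, seed=None):
    """Return a list of passages ((r1, c1), (r2, c2)) forming a spanning
    tree of the rows x cols grid, built by Prim's algorithm on random weights."""
    rng = random.Random(seed)
    weight = {}                           # random weight per undirected grid edge

    def w(a, b):
        edge = (min(a, b), max(a, b))
        if edge not in weight:
            weight[edge] = rng.random()
        return weight[edge]

    def neighbors(cell):
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                yield (nr, nc)

    start = (0, 0)
    visited = {start}
    heap = [(w(start, n), start, n) for n in neighbors(start)]
    heapq.heapify(heap)
    passages = []
    while heap:
        _, a, b = heapq.heappop(heap)
        if b in visited:                  # wall stays: b is already reachable
            continue
        visited.add(b)
        passages.append((a, b))           # carve a passage between a and b
        for n in neighbors(b):
            if n not in visited:
                heapq.heappush(heap, (w(b, n), b, n))
    return passages
```

Because the result is a spanning tree of the grid, every cell is reachable and there is exactly one path between any two cells.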

Minimum edge weight data structure	Time complexity (total)
adjacency matrix, searching	O(|V|²)
binary heap and adjacency list	O((|V| + |E|) log |V|) = O(|E| log |V|)
Fibonacci heap and adjacency list	O(|E| + |V| log |V|)

A simple implementation of Prim's algorithm, using an adjacency matrix or an adjacency list graph representation and linearly searching an array of weights to find the minimum-weight edge to add, requires O(|V|²) running time. However, this running time can be greatly improved by using heaps to implement finding minimum-weight edges in the algorithm's inner loop.
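A minimal sketch of the simple array-based variant is given below. It assumes a symmetric weight matrix in which a missing edge is represented by infinity; the function name `prim_matrix` and that encoding are choices made for this example only.

```python
import math

def prim_matrix(weights):
    """O(|V|^2) Prim on an adjacency matrix.

    weights[i][j] is the weight of edge {i, j}, or math.inf if absent.
    Returns the tree edges as (parent, vertex, weight) triples.
    """
    n = len(weights)
    if n == 0:
        return []
    in_tree = [False] * n
    key = [math.inf] * n                  # cheapest known edge into the tree
    parent = [None] * n
    key[0] = 0                            # vertex 0 is the arbitrary root
    edges = []
    for _ in range(n):
        # Linear search for the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=key.__getitem__)
        if key[u] == math.inf:
            raise ValueError("graph is not connected")
        in_tree[u] = True
        if parent[u] is not None:
            edges.append((parent[u], u, key[u]))
        for v in range(n):                # update keys along row u
            if not in_tree[v] and weights[u][v] < key[v]:
                key[v] = weights[u][v]
                parent[v] = u
    return edges
```

Both the search for u and the update along row u take O(|V|) time and are repeated |V| times, which gives the quadratic bound.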

A first improved version uses a heap to store all edges of the input graph, ordered by their weight. This leads to an O(|E| log |E|) worst-case running time. But storing vertices instead of edges can improve it still further. The heap should order the vertices by the smallest edge-weight that connects them to any vertex in the partially constructed minimum spanning tree (MST) (or infinity if no such edge exists). Every time a vertex v is chosen and added to the MST, a decrease-key operation is performed on all vertices w outside the partial MST such that v is connected to w, setting the key to the minimum of its previous value and the edge cost of (v,w).
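The vertex-keyed variant might look as follows in Python. Since `heapq` offers no decrease-key operation, this sketch pushes a fresh (key, vertex) entry whenever a key improves and skips stale entries when they are popped; that lazy-deletion substitute, the function name `prim_heap`, and the adjacency-list input format are assumptions of this example.

```python
import heapq
import math

def prim_heap(adj, root):
    """Heap-based Prim keyed on vertices.

    adj maps each vertex to a list of (neighbor, weight) pairs for an
    undirected, connected graph. Returns (parent map, total weight).
    """
    key = {v: math.inf for v in adj}
    parent = {v: None for v in adj}
    key[root] = 0
    in_tree = set()
    heap = [(0, root)]
    total = 0
    while heap:
        k, u = heapq.heappop(heap)
        if u in in_tree or k > key[u]:
            continue                      # stale entry left by an earlier push
        in_tree.add(u)
        total += k
        for v, w in adj[u]:
            if v not in in_tree and w < key[v]:
                key[v] = w                # stands in for decrease-key
                parent[v] = u
                heapq.heappush(heap, (w, v))
    return parent, total
```

With lazy deletion the heap can hold up to O(|E|) entries, so each operation costs O(log |E|), which is still O(log |V|) because |E| is at most |V|².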

Using a simple binary heap data structure, Prim's algorithm can now be shown to run in time O(|E| log |V|) where |E| is the number of edges and |V| is the number of vertices. Using a more sophisticated Fibonacci heap, this can be brought down to O(|E| + |V| log |V|), which is asymptotically faster when the graph is dense enough that |E| is ω(|V|).
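To see where the two bounds separate, it may help to substitute a dense graph with |E| = Θ(|V|²) into both expressions:

```latex
\begin{align*}
\text{binary heap:} \quad & O(|E| \log |V|) = O(|V|^2 \log |V|) \\
\text{Fibonacci heap:} \quad & O(|E| + |V| \log |V|) = O(|V|^2 + |V| \log |V|) = O(|V|^2)
\end{align*}
```

For a sparse graph with |E| = Θ(|V|), both expressions reduce to O(|V| log |V|), so the Fibonacci heap offers no asymptotic gain there.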

Proof of correctness

Let P be a connected, weighted graph. At every iteration of Prim's algorithm, an edge must be found that connects a vertex in a subgraph to a vertex outside the subgraph. Since P is connected, there will always be a path to every vertex. The output Y of Prim's algorithm is a tree, because the edge and vertex added to tree Y are connected. Let Y₁ be a minimum spanning tree of graph P. If Y₁ = Y, then Y is a minimum spanning tree. Otherwise, let e be the first edge added during the construction of tree Y that is not in tree Y₁, and let V be the set of vertices connected by the edges added before edge e. Then one endpoint of edge e is in set V and the other is not. Since tree Y₁ is a spanning tree of graph P, there is a path in tree Y₁ joining the two endpoints. As one travels along the path, one must encounter an edge f joining a vertex in set V to one that is not in set V. At the iteration when edge e was added to tree Y, edge f could also have been added, and it would have been added instead of edge e if its weight were less than that of e. Since edge f was not added, we conclude that

w(f) ≥ w(e).

Let tree Y₂ be the graph obtained by removing edge f from tree Y₁ and adding edge e to it. It is easy to show that tree Y₂ is connected, has the same number of edges as tree Y₁, and has a total edge weight no larger than that of tree Y₁; therefore it is also a minimum spanning tree of graph P, and it contains edge e and all the edges added before it during the construction of set V. Repeating the steps above, we eventually obtain a minimum spanning tree of graph P that is identical to tree Y. This shows that Y is a minimum spanning tree.
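The weight comparison behind the exchange step can be written out explicitly. Here w(T) denotes the total edge weight of a tree T, a notation introduced only for this illustration:

```latex
\begin{align*}
w(Y_2) &= w(Y_1) - w(f) + w(e) \\
       &\le w(Y_1) && \text{since } w(f) \ge w(e).
\end{align*}
```

Because Y₁ is a minimum spanning tree, w(Y₂) ≥ w(Y₁) holds as well, so the two weights are in fact equal.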

Explanation for time complexity

In the method that uses a binary heap, the main loop together with its adjacency scans performs O(|V| + |E|) steps in total (similar to breadth-first search): each vertex is extracted once and each edge is examined at most twice. Each step involves a heap operation that takes O(log |V|) time. The overall time complexity is therefore O((|V| + |E|) log |V|) = O(|E| log |V|), since |V| = O(|E|) for a connected graph.
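The count can be spelled out term by term. The breakdown below assumes the vertex-keyed binary heap described earlier, with one Extract-Min per vertex and at most one decrease-key per examined edge endpoint:

```latex
\begin{align*}
T(|V|, |E|) &= \underbrace{|V| \cdot O(\log |V|)}_{\text{Extract-Min}}
             + \underbrace{2|E| \cdot O(\log |V|)}_{\text{decrease-key}} \\
            &= O\big((|V| + |E|) \log |V|\big) \\
            &= O(|E| \log |V|) && \text{since } |E| \ge |V| - 1 \text{ in a connected graph.}
\end{align*}
```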

References

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Third Edition. MIT Press, 2009. ISBN 0-262-03384-4. Section 23.2: The algorithms of Kruskal and Prim, pp. 631–638.
[2] Cormen, Leiserson, Rivest, and Stein. Introduction to Algorithms, Third Edition. Section 23.2, pp. 631–638.

V. Jarník: O jistém problému minimálním [About a certain minimal problem], Práce Moravské Přírodovědecké Společnosti, 6, 1930, pp. 57–63. (in Czech)
R. C. Prim: Shortest connection networks and some generalizations. In: Bell System Technical Journal, 36 (1957), pp. 1389–1401
D. Cheriton and R. E. Tarjan: Finding minimum spanning trees. In: SIAM Journal on Computing, 5 (Dec. 1976), pp. 724–741
