Held–Karp algorithm

The Held–Karp algorithm, also called the Bellman–Held–Karp algorithm, is a dynamic programming algorithm proposed in 1962 independently by Bellman[1] and by Held and Karp[2] to solve the traveling salesman problem (TSP), in which the input is a distance matrix between a set of cities, and the goal is to find a minimum-length tour that visits each city exactly once before returning to the starting point. It finds the exact solution to this problem, and to several related problems including the Hamiltonian circuit problem, in exponential time.

Algorithm description and motivation

Number the cities <math>1, 2, \ldots, n</math>, with <math>1</math> designated arbitrarily as a "starting" city (since the solution to TSP is a cycle, the choice of starting city doesn't matter). The Held–Karp algorithm begins by calculating, for each set of cities <math>S \subseteq \{2, \ldots, n\}</math> and every city <math>e \neq 1</math> not contained in <math>S</math>, the shortest one-way path from <math>1</math> to <math>e</math> that passes through every city in <math>S</math> in some order (but not through any other cities). Denote this distance <math>g(S, e)</math>, and write <math>d(u, v)</math> for the length of the direct edge from <math>u</math> to <math>v</math>. We'll compute values of <math>g(S, e)</math> starting with the smallest sets <math>S</math> and finishing with the largest.

When <math>S</math> has two or fewer elements, then calculating <math>g(S, e)</math> requires looking at one or two possibilities. For example, <math>g(\emptyset, e)</math> is simply <math>d(1, e)</math>, and <math>g(\{2\}, 3)</math> is just the length of <math>1 \rightarrow 2 \rightarrow 3</math>. Likewise, <math>g(\{2, 3\}, 4)</math> is the length of either <math>1 \rightarrow 2 \rightarrow 3 \rightarrow 4</math> or <math>1 \rightarrow 3 \rightarrow 2 \rightarrow 4</math>, whichever is shorter.

Once <math>S</math> contains three or more cities, the number of paths through <math>S</math> rises quickly, but only a few such paths need to be examined to find the shortest. For instance, if <math>1 \rightarrow 2 \rightarrow 3 \rightarrow 4</math> is shorter than <math>1 \rightarrow 3 \rightarrow 2 \rightarrow 4</math>, then <math>1 \rightarrow 2 \rightarrow 3 \rightarrow 4 \rightarrow 5</math> must be shorter than <math>1 \rightarrow 3 \rightarrow 2 \rightarrow 4 \rightarrow 5</math>, and the length of <math>1 \rightarrow 3 \rightarrow 2 \rightarrow 4 \rightarrow 5</math> is not a possible value of <math>g(\{2, 3, 4\}, 5)</math>. Similarly, if the shortest path from <math>1</math> through <math>\{2, 3, 4\}</math> to <math>5</math> is <math>1 \rightarrow 4 \rightarrow 3 \rightarrow 2 \rightarrow 5</math>, and the shortest path from <math>1</math> through <math>\{2, 3, 4, 5\}</math> to <math>6</math> ends with the edge <math>5 \rightarrow 6</math>, then the whole path from <math>1</math> to <math>6</math> must be <math>1 \rightarrow 4 \rightarrow 3 \rightarrow 2 \rightarrow 5 \rightarrow 6</math>, and not any of the other five paths created by visiting <math>\{2, 3, 4\}</math> in a different order.

More generally, suppose <math>S = \{s_1, \ldots, s_k\}</math> is a set of <math>k</math> cities. For every integer <math>1 \leq i \leq k</math>, write <math>S_i = S \setminus \{s_i\}</math> for the set created by removing <math>s_i</math> from <math>S</math>. Then if the shortest path from <math>1</math> through <math>S</math> to <math>e</math> has <math>s_i</math> as its second-to-last city, then removing the final edge from this path must give the shortest path from <math>1</math> to <math>s_i</math> through <math>S_i</math>. This means there are only <math>k</math> possible shortest paths from <math>1</math> to <math>e</math> through <math>S</math>, one for each possible second-to-last city <math>s_i</math> with distance <math>g(S_i, s_i) + d(s_i, e)</math>, and <math>g(S, e) = \min_{1 \leq i \leq k} \bigl[ g(S_i, s_i) + d(s_i, e) \bigr]</math>.

This stage of the algorithm finishes when <math>g(\{2, \ldots, i-1, i+1, \ldots, n\}, i)</math> is known for every integer <math>2 \leq i \leq n</math>, giving the shortest distance from city <math>1</math> to city <math>i</math> that passes through every other city. The much shorter second stage adds these distances to the edge lengths <math>d(i, 1)</math> to give <math>n-1</math> possible shortest cycles, and then finds the shortest.

The shortest path itself (and not just its length), finally, may be reconstructed by storing alongside <math>g(S, e)</math> the label of the second-to-last city on the path from <math>1</math> to <math>e</math> through <math>S</math>, raising space requirements by only a constant factor.
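
The description above translates almost directly into a short dynamic program. The following Python sketch is a minimal illustration of the recurrence for <math>g(S, e)</math> together with the stored second-to-last cities used to reconstruct the tour; the names held_karp and parent are chosen here for illustration and are not part of the original formulation.

from itertools import combinations

def held_karp(d):
    # d[u][v] is the length of the edge from city u to city v, with the
    # cities numbered 0..n-1 and city 0 playing the role of city 1 above.
    n = len(d)
    # g[(S, e)] = length of the shortest path from 0 to e through exactly S;
    # parent[(S, e)] = second-to-last city on that path.
    g = {(frozenset(), e): d[0][e] for e in range(1, n)}
    parent = {}

    for size in range(1, n - 1):
        for subset in combinations(range(1, n), size):
            S = frozenset(subset)
            for e in range(1, n):
                if e in S:
                    continue
                # Try every possible second-to-last city m in S.
                best, best_prev = min((g[(S - {m}, m)] + d[m][e], m) for m in S)
                g[(S, e)], parent[(S, e)] = best, best_prev

    # Second stage: close the cycle back to city 0.
    full = frozenset(range(1, n))
    length, last = min((g[(full - {e}, e)] + d[e][0], e) for e in range(1, n))

    # Follow the stored second-to-last cities to recover the tour.
    path, S = [last], full - {last}
    while S:
        last = parent[(S, last)]
        path.append(last)
        S = S - {last}
    path.append(0)
    path.reverse()
    return length, path + [0]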

Algorithmic complexity

The Held–Karp algorithm has exponential time complexity <math>\Theta(2^n n^2)</math>, significantly better than the superexponential performance <math>\Theta(n!)</math> of a brute-force algorithm. Held–Karp, however, requires <math>\Theta(2^n n)</math> space to hold all computed values of the function <math>g(S, e)</math>, while brute force needs only <math>\Theta(n^2)</math> space to store the graph itself.

Time

Computing one value of <math>g(S, e)</math> for a <math>k</math>-element subset <math>S</math> of <math>\{2, \ldots, n\}</math> requires finding the shortest of <math>k</math> possible paths, each found by adding a known value of <math>g</math> and an edge length from the original graph; that is, it requires time proportional to <math>k</math>. There are <math>\binom{n-1}{k}</math> <math>k</math>-element subsets of <math>\{2, \ldots, n\}</math>; and each subset gives <math>n-k-1</math> possible values of <math>e</math>. Computing all values of <math>g(S, e)</math> where <math>|S| = k</math> thus requires time <math>k(n-k-1)\binom{n-1}{k} = (n-1)(n-2)\binom{n-3}{k-1}</math>, for a total time across all subset sizes of <math>(n-1)(n-2)\sum_{k=1}^{n-2}\binom{n-3}{k-1} = (n-1)(n-2)2^{n-3} = \Theta(n^2 2^n)</math>. The second stage of the algorithm, finding a complete cycle from <math>n-1</math> candidates, takes <math>\Theta(n)</math> time and does not affect asymptotic performance.
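
As a quick numerical check of this counting argument, the short Python snippet below (an illustrative aside, not part of the original analysis) tallies the additions performed in the first stage for small <math>n</math> and compares the totals with the closed form <math>(n-1)(n-2)2^{n-3}</math>.

from math import comb

def first_stage_additions(n):
    # For each subset size k there are comb(n-1, k) subsets, each giving
    # n-k-1 choices of the end city e, and each g(S, e) costs k additions.
    return sum(k * (n - k - 1) * comb(n - 1, k) for k in range(1, n - 1))

for n in range(3, 10):
    assert first_stage_additions(n) == (n - 1) * (n - 2) * 2 ** (n - 3)
print("closed form confirmed for n = 3..9")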

Space

Storing all values of <math>g(S, e)</math> for subsets of size <math>k</math> requires keeping <math>(n-k-1)\binom{n-1}{k} = (n-1)\binom{n-2}{k}</math> values. A complete table of values of <math>g(S, e)</math> thus requires space <math>\sum_{k=0}^{n-2}(n-1)\binom{n-2}{k} = (n-1)2^{n-2} = \Theta(n 2^n)</math>.

If only the length of the shortest cycle is needed, not the cycle itself, then space complexity can be improved somewhat by noting that calculating <math>g(S, e)</math> for a <math>S</math> of size <math>k</math> requires only values of <math>g</math> for subsets of size <math>k-1</math>. Keeping only the values of <math>g(S, e)</math> where <math>S</math> has size either <math>k-1</math> or <math>k</math> reduces the algorithm's maximum space requirements, attained when <math>k = \lfloor n/2 \rfloor</math>, to <math>\Theta(\sqrt{n}\, 2^n)</math>.
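
A minimal sketch of this saving, assuming only the optimal length is needed: the illustrative Python function below (the name held_karp_length is not from the article) keeps the <math>g</math>-values for just two consecutive subset sizes, discarding each layer as soon as the next one has been built.

from itertools import combinations

def held_karp_length(d):
    # d[u][v] is the distance from city u to city v, cities numbered 0..n-1.
    n = len(d)
    # Layer for |S| = 0: g(empty set, e) = d(0, e).
    prev = {(frozenset(), e): d[0][e] for e in range(1, n)}

    for size in range(1, n - 1):
        curr = {}
        for subset in combinations(range(1, n), size):
            S = frozenset(subset)
            for e in range(1, n):
                if e in S:
                    continue
                curr[(S, e)] = min(prev[(S - {m}, m)] + d[m][e] for m in S)
        prev = curr  # the layer for subsets of size |S| - 1 is no longer needed

    full = frozenset(range(1, n))
    return min(prev[(full - {e}, e)] + d[e][0] for e in range(1, n))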

Example[3]

Distance matrix:

<math>C = \begin{pmatrix} 0 & 2 & 9 & 10 \\ 1 & 0 & 6 & 4 \\ 15 & 7 & 0 & 8 \\ 6 & 3 & 12 & 0 \end{pmatrix}</math>

Functions description:

  • g(x, S) - the minimum cost of a path that starts at vertex 1, passes through every vertex in the set S exactly once, and ends at vertex x
  • cxy - the cost of the edge from vertex y to vertex x (the entry in row x, column y of the distance matrix)
  • p(x, S) - the second-to-last vertex on the optimal path from 1 to x through the set S; used for reconstructing the TSP tour at the end


k = 0, null set:

Set ∅:

       g(2, ∅) = c21 = 1 
       g(3, ∅) = c31 = 15
       g(4, ∅) = c41 = 6

k = 1, consider sets of 1 element:

Set {2}:

       g(3,{2}) = c32 + g(2, ∅ ) = c32 + c21 = 7 + 1 = 8       p(3,{2}) = 2
       g(4,{2}) = c42 + g(2, ∅ ) = c42 + c21 = 3 + 1 = 4       p(4,{2}) = 2

Set {3}:

       g(2,{3}) = c23 + g(3, ∅ ) = c23 + c31 = 6 + 15 = 21     p(2,{3}) = 3
       g(4,{3}) = c43 + g(3, ∅ ) = c43 + c31 = 12 + 15 = 27    p(4,{3}) = 3

Set {4}:

       g(2,{4}) = c24 + g(4, ∅ ) = c24 + c41 = 4 + 6 = 10      p(2,{4}) = 4
       g(3,{4}) = c34 + g(4, ∅ ) = c34 + c41 = 8 + 6 = 14      p(3,{4}) = 4

k = 2, consider sets of 2 elements:

Set {2,3}:

         g(4,{2,3}) = min {c42 + g(2,{3}), c43 + g(3,{2})} = min {3+21, 12+8}= min {24, 20}= 20
         p(4,{2,3}) = 3

Set {2,4}:

         g(3,{2,4}) = min {c32 + g(2,{4}), c34 + g(4,{2})} = min {7+10, 8+4}= min {17, 12} = 12
         p(3,{2,4}) = 4

Set {3,4}:

          g(2,{3,4}) = min {c23 + g(3,{4}), c24 + g(4,{3})} = min {6+14, 4+27}= min {20, 31}= 20
          p(2,{3,4}) = 3

Length of an optimal tour:

 f = g(1,{2,3,4}) = min { c12 + g(2,{3,4}), c13 + g(3,{2,4}), c14 + g(4,{2,3}) }
                  = min {2 + 20, 9 + 12, 10 + 20} = min {22, 21, 30} = 21

Predecessor of node 1: p(1,{2,3,4}) = 3

Predecessor of node 3: p(3, {2,4}) = 4

Predecessor of node 4: p(4, {2}) = 2

Predecessor of node 2: p(2, {}) = 1

Hence, an optimal TSP tour is given by 1 → 2 → 4 → 3 → 1.
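
These hand computations can be checked mechanically. The short Python snippet below is an illustrative aside (not part of the cited example): it encodes the distance matrix above with cxy as the entry in row x, column y, recomputes the g- and p-tables, and confirms the optimal length 21 and the tour 1 → 2 → 4 → 3 → 1.

from itertools import combinations

# c[x][y] is the entry in row x, column y of the matrix above,
# read as the cost of the edge that ends at x coming from y.
c = {1: {1: 0, 2: 2, 3: 9, 4: 10},
     2: {1: 1, 2: 0, 3: 6, 4: 4},
     3: {1: 15, 2: 7, 3: 0, 4: 8},
     4: {1: 6, 2: 3, 3: 12, 4: 0}}

g, p = {}, {}
for x in (2, 3, 4):                      # k = 0: g(x, {}) = cx1
    g[(x, frozenset())] = c[x][1]

for size in (1, 2):                      # k = 1 and k = 2
    for subset in combinations((2, 3, 4), size):
        S = frozenset(subset)
        for x in (2, 3, 4):
            if x in S:
                continue
            g[(x, S)], p[(x, S)] = min((c[x][m] + g[(m, S - {m})], m) for m in S)

full = frozenset({2, 3, 4})
f, last = min((c[1][x] + g[(x, full - {x})], x) for x in (2, 3, 4))
assert f == 21 and g[(4, frozenset({2, 3}))] == 20

tour, S = [1, last], full - {last}       # walk the p-table backwards
while S:
    last = p[(last, S)]
    tour.append(last)
    S = S - {last}
tour.append(1)
print(f, tour[::-1])                     # 21 [1, 2, 4, 3, 1]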

Pseudocode[4]

function algorithm TSP (G, n) is
    for k := 2 to n do
        C({k}, k) := d_{1,k}
    end for

    for s := 2 to n−1 do
        for all S ⊆ {2, . . . , n}, |S| = s do
            for all k ∈ S do
                C(S, k) := min_{m ≠ k, m ∈ S} [C(S\{k}, m) + d_{m,k}]
            end for
        end for
    end for

    opt := min_{k ≠ 1} [C({2, 3, . . . , n}, k) + d_{k,1}]
    return (opt)
end function
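
The pseudocode can be made runnable by representing each subset S as a bitmask. The Python sketch below is one possible translation (the bitmask encoding and the name tsp_dp are implementation choices, not part of the cited pseudocode); like the pseudocode, it returns only the optimal tour length.

from itertools import combinations

def tsp_dp(d):
    # d is an n x n matrix, 0-indexed, with d[i][j] the distance from
    # city i+1 to city j+1; city 1 (index 0) is the fixed starting city.
    n = len(d)
    C = {}

    # C({k}, k) := d_{1,k}
    for k in range(1, n):
        C[(1 << k, k)] = d[0][k]

    # for s := 2 to n-1: every subset S of {2, ..., n} with |S| = s
    for s in range(2, n):
        for subset in combinations(range(1, n), s):
            S = 0
            for k in subset:
                S |= 1 << k
            for k in subset:
                # C(S, k) := min over m in S, m != k of C(S \ {k}, m) + d_{m,k}
                C[(S, k)] = min(C[(S & ~(1 << k), m)] + d[m][k]
                                for m in subset if m != k)

    # opt := min over k != 1 of C({2, ..., n}, k) + d_{k,1}
    full = (1 << n) - 2  # bits 1 .. n-1 set
    return min(C[(full, k)] + d[k][0] for k in range(1, n))

Applied to the worked example above, with the matrix transposed so that d[i][j] is the cost of travelling from city i+1 to city j+1 (matching the cxy convention used there), this returns 21.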

Precise algorithm for solving the TSP

Besides dynamic programming, linear programming and branch and bound are exact algorithms that can solve the TSP. Linear programming applies the cutting-plane method of integer programming: the LP formed by the two constraints in the model is solved, and a cutting plane is then sought by adding inequality constraints so that the solution gradually converges to an optimal one. Finding a good cutting plane often depends on experience, so this method is seldom regarded as a general one.

Branch and bound is a widely used search algorithm, although it is not well suited to large-scale problems. It controls the search process through an effective restrictive bound, so that the branch of the state-space tree containing an optimal solution can be found as soon as possible. The key point of this algorithm is the choice of the restrictive bound; different restrictive bounds may produce different branch-and-bound algorithms.

Approximate algorithm for solving the TSP

Because the applicability of exact algorithms is very limited, approximate or heuristic algorithms are often used instead. The result of such an algorithm can be assessed by C / C* ≤ ε, where C is the total travelling distance produced by the approximate algorithm, C* is the optimal travelling distance, and ε is the upper limit on the ratio of the approximate tour length to the optimal tour length in the worst case. The value of ε is greater than 1.0; the closer it is to 1.0, the better the algorithm. These algorithms include: the interpolation algorithm, the nearest neighbour algorithm, the Clark & Wright algorithm, the double spanning tree algorithm, the Christofides algorithm, hybrid algorithms, and probabilistic algorithms.
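
As one concrete illustration from this list, the Python sketch below implements the nearest neighbour heuristic (an illustrative implementation, not drawn from a cited source): starting from city 1 it repeatedly travels to the closest unvisited city and finally returns to the start, producing a tour whose length C is in general above the optimum C*.

def nearest_neighbour_tour(d):
    # d[i][j] is the distance from city i to city j, cities numbered 0..n-1.
    # Returns (tour, length); the tour starts and ends at city 0.
    n = len(d)
    unvisited = set(range(1, n))
    tour, length, current = [0], 0, 0
    while unvisited:
        nxt = min(unvisited, key=lambda j: d[current][j])
        length += d[current][nxt]
        tour.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    length += d[current][0]  # return to the starting city
    tour.append(0)
    return tour, length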

References

  1. ^ 'Dynamic programming treatment of the travelling salesman problem', Richard Bellman, Journal of the Association for Computing Machinery, 9. 1962.
  2. ^ 'A dynamic programming approach to sequencing problems', Michael Held and Richard M. Karp, Journal of the Society for Industrial and Applied Mathematics, 10 (1). 1962.
  3. ^ http://www.mafy.lut.fi/study/DiscreteOpt/tspdp.pdf
  4. ^ http://www.lsi.upc.edu/~mjserna/docencia/algofib/P07/dynprog.pdf[permanent dead link]