Maximum flow problem

Flow network for the problem: Each human ($r_i$) is willing to adopt a cat ($w_{i1}$) and/or a dog ($w_{i2}$). However, each pet ($p_i$) has a preference for only a subset of the humans. Find a matching of pets to humans such that the maximum number of pets are each adopted by one of their preferred humans.

In optimization theory, maximum flow problems involve finding a feasible flow through a flow network that obtains the maximum possible flow rate.

The maximum flow problem can be seen as a special case of more complex network flow problems, such as the circulation problem. The maximum value of an s-t flow (i.e., flow from source s to sink t) is equal to the minimum capacity of an s-t cut (i.e., cut severing s from t) in the network, as stated in the max-flow min-cut theorem.

History

The maximum flow problem was first formulated in 1954 by T. E. Harris and F. S. Ross as a simplified model of Soviet railway traffic flow.[1][2][3]

In 1955, Lester R. Ford, Jr. and Delbert R. Fulkerson created the first known algorithm, the Ford–Fulkerson algorithm.[4][5] In their 1955 paper,[4] Ford and Fulkerson wrote that the problem of Harris and Ross is formulated as follows (see [1], p. 5):

Consider a rail network connecting two cities by way of a number of intermediate cities, where each link of the network has a number assigned to it representing its capacity. Assuming a steady state condition, find a maximal flow from one given city to the other.

In their book Flows in Networks,[5] in 1962, Ford and Fulkerson wrote:

It was posed to the authors in the spring of 1955 by T. E. Harris, who, in conjunction with General F. S. Ross (Ret.), had formulated a simplified model of railway traffic flow, and pinpointed this particular problem as the central one suggested by the model [11].

where [11] refers to the 1955 secret report Fundamentals of a Method for Evaluating Rail Net Capacities by Harris and Ross[3] (see [1], p. 5).

Over the years, various improved solutions to the maximum flow problem were discovered, notably the shortest augmenting path algorithm of Edmonds and Karp and independently Dinitz; the blocking flow algorithm of Dinitz; the push-relabel algorithm of Goldberg and Tarjan; and the binary blocking flow algorithm of Goldberg and Rao. The algorithms of Sherman[6] and Kelner, Lee, Orecchia and Sidford,[7][8] respectively, find an approximately optimal maximum flow but only work in undirected graphs.

In 2013 James B. Orlin published a paper describing an $O(|V||E|)$ algorithm.[9]

Definition

A flow network, with source s and sink t. The numbers next to the edges are the capacities.

First we establish some notation:

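  • Let $N = (V, E)$ be a network with $s, t \in V$ being the source and the sink of $N$ respectively.
  • If $g$ is a function on the edges of $N$ then its value on $(u, v) \in E$ is denoted by $g_{uv}$ or $g(u, v)$.

Definition. The capacity of an edge is the maximum amount of flow that can pass through an edge. Formally it is a map $c : E \to \mathbb{R}^+$.

Definition. A flow is a map $f : E \to \mathbb{R}^+$ that satisfies the following:

  • Capacity constraint. The flow of an edge cannot exceed its capacity, in other words: $f_{uv} \le c_{uv}$ for all $(u, v) \in E$.
  • Conservation of flows. The sum of the flows entering a node must equal the sum of the flows exiting that node, except for the source and the sink. Or: $\forall v \in V \setminus \{s, t\}: \sum_{u : (u, v) \in E} f_{uv} = \sum_{u : (v, u) \in E} f_{vu}$.

Remark. Flows are skew symmetric: $f_{uv} = -f_{vu}$ for all $(u, v) \in E$.

Definition. The value of flow is the amount of flow passing from the source to the sink. Formally for a flow $f : E \to \mathbb{R}^+$ it is given by: $|f| = \sum_{v : (s, v) \in E} f_{sv}$.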

Definition. The maximum flow problem is to route as much flow as possible from the source to the sink, in other words find the flow with maximum value.
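The definitions above can be checked mechanically. The following is a minimal sketch, assuming that capacities and flows are stored as Python dictionaries keyed by directed edges (u, v); the function names is_feasible_flow and flow_value are illustrative and not part of any library.

```python
def is_feasible_flow(capacity, flow, s, t):
    """Check the capacity and conservation constraints for `flow`.

    `capacity` and `flow` map directed edges (u, v) to non-negative numbers.
    """
    # Capacity constraint: 0 <= f_uv <= c_uv, and no flow on non-existent edges.
    for edge, f in flow.items():
        if edge not in capacity or f < 0 or f > capacity[edge]:
            return False
    # Conservation: inflow equals outflow at every node except s and t.
    balance = {}
    for (u, v), f in flow.items():
        balance[u] = balance.get(u, 0) - f
        balance[v] = balance.get(v, 0) + f
    return all(b == 0 for node, b in balance.items() if node not in (s, t))


def flow_value(flow, s):
    """Value |f|: total flow on the edges leaving the source."""
    return sum(f for (u, _), f in flow.items() if u == s)


capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
print(is_feasible_flow(capacity, flow, "s", "t"), flow_value(flow, "s"))  # True 5
```

In this small example the flow saturates both edges leaving s, so no flow of larger value can exist.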

Note that several maximum flows may exist, and if arbitrary real (or even arbitrary rational) values of flow are permitted (instead of just integers), there is either exactly one maximum flow, or infinitely many, since every convex combination of two maximum flows is again a maximum flow. In other words, if we send $x$ units of flow on edge $e$ in one maximum flow, and $y > x$ units of flow on $e$ in another maximum flow, then for each $\Delta \in [0, y - x]$ we can send $x + \Delta$ units on $e$ and route the flow on the remaining edges accordingly, to obtain another maximum flow. If flow values can be any real or rational numbers, then there are infinitely many such $\Delta$ values for each pair $x, y$.

Algorithms

The following table lists algorithms for solving the maximum flow problem. In the complexity column, $V$ and $E$ denote the numbers of vertices and edges of the network.

| Method | Complexity | Description |
|---|---|---|
| Linear programming | | Constraints given by the definition of a legal flow. See the linear programming formulation of the maximum flow problem. |
| Ford–Fulkerson algorithm | $O(E \, \lvert f_{\max} \rvert)$ | As long as there is an open path through the residual graph, send the minimum of the residual capacities on the path. The algorithm is only guaranteed to terminate if all weights are rational, in which case the amount added to the flow in each step is at least the greatest common divisor of the weights. Otherwise it is possible that the algorithm will not converge to the maximum value. However, if the algorithm terminates, it is guaranteed to find the maximum value. |
| Edmonds–Karp algorithm | $O(VE^2)$ | A specialization of Ford–Fulkerson, finding augmenting paths with breadth-first search. |
| Dinic's algorithm | $O(V^2 E)$ | In each phase the algorithm builds a layered graph with breadth-first search on the residual graph. The maximum flow in a layered graph can be calculated in $O(VE)$ time, and the maximum number of phases is $V - 1$. In networks with unit capacities, Dinic's algorithm terminates in $O(\min\{V^{2/3}, E^{1/2}\} E)$ time. |
| MKM (Malhotra, Kumar, Maheshwari) algorithm[10] | $O(V^3)$ | Only works on acyclic networks. Refer to the original paper. |
| Dinic's algorithm with dynamic trees | $O(VE \log V)$ | The dynamic trees data structure speeds up the maximum flow computation in the layered graph to $O(VE \log V)$. |
| General push–relabel algorithm[11] | $O(V^2 E)$ | The push–relabel algorithm maintains a preflow, i.e. a flow function with the possibility of excess in the vertices. The algorithm runs while there is a vertex with positive excess, i.e. an active vertex in the graph. The push operation increases the flow on a residual edge, and a height function on the vertices controls through which residual edges flow can be pushed. The height function is changed by the relabel operation. The proper definitions of these operations guarantee that the resulting flow function is a maximum flow. |
| Push–relabel algorithm with FIFO vertex selection rule[11] | $O(V^3)$ | Push–relabel algorithm variant which selects active vertices in first-in, first-out order, and performs push operations while the excess is positive and there are admissible residual edges from this vertex. |
| Push–relabel algorithm with dynamic trees[11] | $O(VE \log(V^2/E))$ | The algorithm builds limited size trees on the residual graph with respect to the height function. These trees provide multilevel push operations, i.e. pushing along an entire saturating path instead of a single edge. |
| KRT (King, Rao, Tarjan)'s algorithm[12] | $O(VE \log_{E/(V \log V)} V)$ | |
| Binary blocking flow algorithm[13] | $O(E \cdot \min(V^{2/3}, E^{1/2}) \cdot \log(V^2/E) \cdot \log U)$ | The value $U$ corresponds to the maximum capacity of the network. |
| James B. Orlin's + KRT (King, Rao, Tarjan)'s algorithm[9] | $O(VE)$ | Orlin's algorithm solves max-flow in $O(VE)$ time for $E \le O(V^{16/15 - \epsilon})$ while KRT solves it in $O(VE)$ for $E > V^{1 + \epsilon}$. |

For a more extensive list, see Goldberg & Tarjan (1988).
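As an illustration of the augmenting-path family, here is a minimal sketch of the Edmonds–Karp variant (breadth-first search on the residual graph). The nested-dictionary representation capacity[u][v] and the function name max_flow are assumptions made for this example; it is not tuned for large networks.

```python
from collections import deque


def max_flow(capacity, s, t):
    """Edmonds-Karp: repeatedly augment along a shortest residual path."""
    # Residual capacities; every edge gets a reverse edge that starts at 0.
    residual = {s: {}, t: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for a path from s to t with spare capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total  # no augmenting path left: the flow is maximum
        # Bottleneck = minimum residual capacity along the path found.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck along the path and update reverse edges.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


network = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
print(max_flow(network, "s", "t"))  # 5
```

Choosing breadth-first search is what distinguishes this from generic Ford–Fulkerson: the path choice no longer depends on the capacity values, which is the reason for the capacity-independent bound in the table.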

Integral flow theorem

The integral flow theorem states that

If each edge in a flow network has integral capacity, then there exists an integral maximum flow.

Application

Multi-source multi-sink maximum flow problem

Fig. 4.1.1. Transformation of a multi-source multi-sink maximum flow problem into a single-source single-sink maximum flow problem

Given a network $N = (V, E)$ with a set of sources $S = \{s_1, \ldots, s_n\}$ and a set of sinks $T = \{t_1, \ldots, t_m\}$ instead of only one source and one sink, we are to find the maximum flow across $N$. We can transform the multi-source multi-sink problem into a maximum flow problem by adding a consolidated source connecting to each vertex in $S$ and a consolidated sink connected by each vertex in $T$ (also known as supersource and supersink) with infinite capacity on each edge (see Fig. 4.1.1).
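The transformation only touches the network, not the solver. A minimal sketch, assuming a nested-dictionary capacity map capacity[u][v] and the hypothetical labels "S*" and "T*" for the supersource and supersink:

```python
import math


def add_super_terminals(capacity, sources, sinks, super_source="S*", super_sink="T*"):
    """Return a copy of `capacity` with a supersource and a supersink attached."""
    network = {u: dict(nbrs) for u, nbrs in capacity.items()}
    # The supersource feeds every original source through an uncapacitated edge.
    network[super_source] = {s: math.inf for s in sources}
    # Every original sink drains into the supersink through an uncapacitated edge.
    for t in sinks:
        network.setdefault(t, {})[super_sink] = math.inf
    network.setdefault(super_sink, {})
    return network


original = {"s1": {"a": 4}, "s2": {"a": 3}, "a": {"t1": 5, "t2": 1}}
expanded = add_super_terminals(original, ["s1", "s2"], ["t1", "t2"])
print(sorted(expanded["S*"]))  # ['s1', 's2']
```

Any single-source single-sink routine can then be run from the supersource to the supersink on the expanded network.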

Minimum path cover in directed acyclic graph

Given a directed acyclic graph $G = (V, E)$, we are to find the minimum number of vertex-disjoint paths to cover each vertex in $V$. We can construct a bipartite graph $G' = (V_{out} \cup V_{in}, E')$ from $G$, where

  1. $V_{out} = \{v_{out} \mid v \in V \text{ and } v \text{ has outgoing edge(s)}\}$.
  2. $V_{in} = \{v_{in} \mid v \in V \text{ and } v \text{ has incoming edge(s)}\}$.
  3. $E' = \{(u_{out}, v_{in}) \in V_{out} \times V_{in} \mid (u, v) \in E\}$.

Then it can be shown, via Kőnig's theorem, that $G'$ has a matching $M$ of size $m$ if and only if $G$ has a vertex-disjoint path cover $C$ of size $n - m$, where $n$ is the number of vertices in $G$. Therefore, the problem can be solved by finding the maximum cardinality matching in $G'$ instead.

Intuitively, if two vertices $u_{out}, v_{in}$ are matched in $M$, then the edge $(u, v)$ is contained in $C$. To see that $C$ is vertex-disjoint, consider the following:

  1. Each vertex $v_{out}$ in $G'$ can either be non-matched in $M$, in which case there are no edges leaving $v$ in $C$; or it can be matched, in which case there is exactly one edge leaving $v$ in $C$. In either case, no more than one edge leaves any vertex $v$ in $C$.
  2. Similarly for each vertex $v_{in}$ in $G'$: if it is matched, there is a single incoming edge into $v$ in $C$; otherwise $v$ has no incoming edges in $C$.

Thus no vertex has two incoming or two outgoing edges in $C$, which means all paths in $C$ are vertex-disjoint.
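The reduction can be sketched in a few lines. The example below assumes vertices are numbered 0 to n-1 and computes the maximum matching between out-copies and in-copies with a simple augmenting-path search (a swapped-in matching routine rather than a general max-flow solver); the name min_path_cover is illustrative.

```python
def min_path_cover(n, edges):
    """Minimum number of vertex-disjoint paths covering vertices 0..n-1 of a DAG."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    # match_in[v] = u means the pair (u_out, v_in) is in the matching M.
    match_in = [None] * n

    def try_augment(u, seen):
        # Search for an augmenting path starting at the out-copy of u.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_in[v] is None or try_augment(match_in[v], seen):
                match_in[v] = u
                return True
        return False

    matching_size = sum(try_augment(u, set()) for u in range(n))
    # Each matched pair merges two paths into one, so the cover has n - m paths.
    return n - matching_size


# Diamond-shaped DAG: 0 -> 1 -> 2 and 0 -> 3 -> 2.
print(min_path_cover(4, [(0, 1), (1, 2), (0, 3), (3, 2)]))  # 2
```

In the diamond example one path can take 0, 1, 2, but vertex 3 then has to be covered by a path of its own.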

Maximum cardinality bipartite matching

Fig. 4.3.1. Transformation of a maximum bipartite matching problem into a maximum flow problem

Given a bipartite graph $G = (X \cup Y, E)$, we are to find a maximum cardinality matching in $G$, that is, a matching that contains the largest possible number of edges. This problem can be transformed into a maximum flow problem by constructing a network $N = (X \cup Y \cup \{s, t\}, E')$, where

  1. $E'$ contains the edges in $G$ directed from $X$ to $Y$.
  2. $(s, x) \in E'$ for each $x \in X$ and $(y, t) \in E'$ for each $y \in Y$.
  3. $c(e) = 1$ for each $e \in E'$ (see Fig. 4.3.1).

Then the value of the maximum flow in $N$ is equal to the size of the maximum matching in $G$.
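The construction can be sketched as follows, assuming the vertex labels in X, Y and the terminals "s" and "t" are all distinct. The helper unit_max_flow is a small depth-first Ford–Fulkerson written for this illustration; it pushes one unit per augmenting path, which suffices for integer capacities.

```python
def matching_network(left, right, edges, s="s", t="t"):
    """Unit-capacity network N built from the bipartite graph (left, right, edges)."""
    capacity = {(x, y): 1 for x, y in edges}      # edges directed from X to Y
    capacity.update({(s, x): 1 for x in left})    # source to every x in X
    capacity.update({(y, t): 1 for y in right})   # every y in Y to the sink
    return capacity


def unit_max_flow(capacity, s, t):
    """Ford-Fulkerson with depth-first augmenting paths, one unit per path."""
    residual = dict(capacity)
    for u, v in capacity:
        residual.setdefault((v, u), 0)  # reverse edges start with no capacity
    neighbours = {}
    for u, v in residual:
        neighbours.setdefault(u, []).append(v)

    def augment(u, seen):
        if u == t:
            return True
        seen.add(u)
        for v in neighbours.get(u, []):
            if v not in seen and residual[(u, v)] > 0 and augment(v, seen):
                residual[(u, v)] -= 1
                residual[(v, u)] += 1
                return True
        return False

    value = 0
    while augment(s, set()):
        value += 1
    return value


net = matching_network(["x1", "x2"], ["y1", "y2"], [("x1", "y1"), ("x2", "y1")])
print(unit_max_flow(net, "s", "t"))  # 1
```

Here both x1 and x2 compete for y1, so only one of them can be matched, and the flow value reflects that.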

Maximum flow with vertex capacities

Fig. 4.4.1. Transformation of a maximum flow problem with vertex capacities constraint into the original maximum flow problem by node splitting

Let $N = (V, E)$ be a network. Suppose there is capacity at each node in addition to edge capacity, that is, a mapping $c : V \to \mathbb{R}^+$, such that the flow $f$ has to satisfy not only the capacity constraint and the conservation of flows, but also the vertex capacity constraint $\sum_{i \in V} f_{iv} \le c(v)$ for all $v \in V \setminus \{s, t\}$.

In other words, the amount of flow passing through a vertex cannot exceed its capacity. To find the maximum flow across $N$, we can transform the problem into the maximum flow problem in the original sense by expanding $N$. First, each $v \in V$ is replaced by $v_{in}$ and $v_{out}$, where $v_{in}$ is connected by edges going into $v$ and $v_{out}$ is connected to edges coming out from $v$; then assign capacity $c(v)$ to the edge connecting $v_{in}$ and $v_{out}$ (see Fig. 4.4.1). In this expanded network, the vertex capacity constraint is removed and therefore the problem can be treated as the original maximum flow problem.
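Node splitting is a purely structural rewrite. A minimal sketch, assuming edge capacities are keyed by (u, v) pairs and that vertices without an entry in vertex_capacity (such as s and t) are left unsplit; the tuple labels (v, "in") and (v, "out") are a naming convention chosen for this example.

```python
def split_vertices(edge_capacity, vertex_capacity):
    """Replace every capacitated vertex v by an edge v_in -> v_out of capacity c(v)."""
    def v_in(v):
        return (v, "in") if v in vertex_capacity else v

    def v_out(v):
        return (v, "out") if v in vertex_capacity else v

    expanded = {}
    # Edges entering v now end at v_in; edges leaving v now start at v_out.
    for (u, v), c in edge_capacity.items():
        expanded[(v_out(u), v_in(v))] = c
    # The internal edge carries the vertex capacity.
    for v, c in vertex_capacity.items():
        expanded[((v, "in"), (v, "out"))] = c
    return expanded


expanded = split_vertices({("s", "a"): 5, ("a", "t"): 5}, {"a": 2})
print(expanded[(("a", "in"), ("a", "out"))])  # 2
```

In the example both edges allow 5 units, but every s-t path must cross the internal edge of a, so at most 2 units can pass.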

Maximum number of paths from s to t

Given a directed graph $G = (V, E)$ and two vertices $s$ and $t$, we are to find the maximum number of paths from $s$ to $t$. This problem has several variants:

1. The paths must be edge-disjoint. This problem can be transformed to a maximum flow problem by constructing a network $N = (V, E)$ from $G$, with $s$ and $t$ being the source and the sink of $N$ respectively, and assigning each edge a capacity of $1$. In this network, the maximum flow is $k$ iff there are $k$ edge-disjoint paths.

2. The paths must be independent, i.e., vertex-disjoint (except for $s$ and $t$). We can construct a network $N = (V, E)$ from $G$ with vertex capacities, where the capacities of all vertices and all edges are $1$. Then the value of the maximum flow is equal to the maximum number of independent paths from $s$ to $t$.

3. In addition to the paths being edge-disjoint and/or vertex-disjoint, the paths also have a length constraint: we count only paths whose length is exactly $k$, or at most $k$. Most variants of this problem are NP-complete, except for small values of $k$.[14]

Closure problem

A closure of a directed graph is a set of vertices C, such that no edges leave C. The closure problem is the task of finding the maximum-weight or minimum-weight closure in a vertex-weighted directed graph. It may be solved in polynomial time using a reduction to the maximum flow problem.

Real world applications

Baseball elimination

Construction of network flow for baseball elimination problem

In the baseball elimination problem there are n teams competing in a league. At a specific stage of the league season, $w_i$ is the number of wins and $r_i$ is the number of games left to play for team i, and $r_{ij}$ is the number of games left against team j. A team is eliminated if it has no chance to finish the season in first place. The task of the baseball elimination problem is to determine which teams are eliminated at each point during the season. Schwartz[15] proposed a method which reduces this problem to maximum network flow. In this method a network is created to determine whether team k is eliminated.

Let G = (V, E) be a network with $s, t \in V$ being the source and the sink respectively. For each pair of teams i < j other than k, one adds a game node {i, j} to V and connects it from s by an edge with capacity $r_{ij}$, which represents the number of games still to be played between these two teams. We also add a team node for each team and connect each game node {i, j} to the two team nodes i and j to ensure one of them wins. One does not need to restrict the flow value on these edges. Finally, an edge is made from each team node i to the sink node t with capacity $w_k + r_k - w_i$, to prevent team i from winning more than $w_k + r_k$ games. Let S be the set of all teams participating in the league and let $r(S - \{k\}) = \sum_{i, j \in S - \{k\},\ i < j} r_{ij}$. In this method it is claimed that team k is not eliminated if and only if a flow of value $r(S - \{k\})$ exists in network G. In the mentioned article it is proved that this flow value is the maximum flow value from s to t.
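The network described above can be assembled as in the following sketch. It assumes teams are numbered 0 to n-1, that remaining[i][j] holds $r_{ij}$ in a symmetric matrix with a zero diagonal, and that $r_k$ is the sum of row k; the node labels and the function name elimination_network are illustrative. Only the construction is shown, so the result still has to be passed to a max-flow routine.

```python
import math


def elimination_network(wins, remaining, k):
    """Build the network used to decide whether team k is eliminated.

    Returns (capacity, required_flow), where capacity maps directed edges
    (u, v) to capacities and required_flow is r(S - {k}).
    """
    n = len(wins)
    best_k = wins[k] + sum(remaining[k])  # w_k + r_k: most wins k can reach
    others = [i for i in range(n) if i != k]
    capacity = {}
    required_flow = 0
    for a, i in enumerate(others):
        for j in others[a + 1:]:
            game = ("game", i, j)
            capacity[("s", game)] = remaining[i][j]       # games left between i and j
            capacity[(game, ("team", i))] = math.inf      # each game is won by i ...
            capacity[(game, ("team", j))] = math.inf      # ... or by j
            required_flow += remaining[i][j]
    for i in others:
        # A negative value here means team i already has more wins than k can reach.
        capacity[(("team", i), "t")] = best_k - wins[i]
    return capacity, required_flow


wins = [83, 80, 78, 77]
remaining = [[0, 1, 6, 1], [1, 0, 0, 2], [6, 0, 0, 0], [1, 2, 0, 0]]
capacity, required = elimination_network(wins, remaining, k=1)
print(required)  # 7
```

Team k is not eliminated exactly when the maximum flow from "s" to "t" in this network reaches required_flow, provided no team-to-sink capacity is negative.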

Airline scheduling

In the airline industry a major problem is the scheduling of the flight crews. The airline scheduling problem can be considered as an application of extended maximum network flow. The input of this problem is a set of flights F which contains the information about where and when each flight departs and arrives. In one version of airline scheduling the goal is to produce a feasible schedule with at most k crews.

In order to solve this problem one uses a variation of the circulation problem called bounded circulation which is the generalization of network flow problems, with the added constraint of a lower bound on edge flows.

Let G = (V, E) be a network with $s, t \in V$ as the source and the sink nodes. For the source and destination of every flight i, one adds two nodes to V, node $s_i$ as the source and node $d_i$ as the destination node of flight i. One also adds the following edges to E:

  1. An edge with capacity [0, 1] between s and each $s_i$.
  2. An edge with capacity [0, 1] between each $d_i$ and t.
  3. An edge with capacity [1, 1] between each pair of $s_i$ and $d_i$.
  4. An edge with capacity [0, 1] between each $d_i$ and $s_j$, if source $s_j$ is reachable with a reasonable amount of time and cost from the destination of flight i.
  5. An edge with capacity [0, $\infty$] between s and t.

In the mentioned method, it is claimed and proved that finding a flow of value k in G between s and t is equivalent to finding a feasible schedule for flight set F with at most k crews.[16]

Another version of airline scheduling is finding the minimum number of crews needed to perform all the flights. In order to find an answer to this problem, a bipartite graph $G' = (A \cup B, E)$ is created where each flight has a copy in set A and set B. If the same plane can perform flight j after flight i, $i \in A$ is connected to $j \in B$. A matching in G' induces a schedule for F, and a maximum bipartite matching in this graph produces an airline schedule with the minimum number of crews.[16] As mentioned in the Application section of this article, maximum cardinality bipartite matching is an application of the maximum flow problem.

Circulation–demand problem

There are some factories that produce goods and some villages where the goods have to be delivered. They are connected by a network of roads, with each road having a capacity c for the maximum amount of goods that can flow through it. The problem is to find whether there is a circulation that satisfies the demand. This problem can be transformed into a maximum-flow problem.

  1. Add a source node s and add edges from it to every factory node $f_i$ with capacity $p_i$, where $p_i$ is the production rate of factory $f_i$.
  2. Add a sink node t and add edges from all villages $v_i$ to t with capacity $d_i$, where $d_i$ is the demand rate of village $v_i$.

Let G = (V, E) be this new network. There exists a circulation that satisfies the demand if and only if:

Maximum flow value(G) $= \sum_{i} d_i$, the total demand of all villages.

If there exists a circulation, looking at the max-flow solution would give the answer as to how many goods have to be sent on a particular road for satisfying the demands.
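The two construction steps can be sketched as below, assuming roads, production rates and demands are given as dictionaries; the names demand_network, roads, production and demand are hypothetical. The function returns the extended network together with the total demand that the maximum flow has to reach.

```python
def demand_network(roads, production, demand, s="s", t="t"):
    """Attach a source to the factories and a sink to the villages.

    roads[(u, v)] is a road capacity, production[f] is a factory's
    production rate and demand[v] is a village's demand rate.
    Returns (capacity, total_demand).
    """
    capacity = dict(roads)
    for factory, p in production.items():
        capacity[(s, factory)] = p   # step 1: source -> factory, capacity p_i
    for village, d in demand.items():
        capacity[(village, t)] = d   # step 2: village -> sink, capacity d_i
    return capacity, sum(demand.values())


roads = {("f1", "v1"): 4, ("f1", "v2"): 2, ("f2", "v2"): 3}
capacity, total = demand_network(roads, {"f1": 5, "f2": 3}, {"v1": 4, "v2": 4})
print(total)  # 8
```

The demands can be met exactly when a max-flow routine run on capacity from s to t returns total.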

The problem can be extended by adding a lower bound on the flow on some edges.[17]


Image segmentation

Source image of size 8x8.
Network built from the bitmap. The source is on the left, the sink on the right. The darker an edge is, the bigger its capacity. $a_i$ is high when the pixel is green, $b_i$ when the pixel is not green. The penalties $p_{ij}$ are all equal.[18]

In their book, Kleinberg and Tardos present an algorithm for segmenting an image, that is, for finding the background and the foreground in an image.[19] More precisely, the algorithm takes a bitmap as an input modelled as follows: $a_i \ge 0$ is the likelihood that pixel i belongs to the foreground, $b_i \ge 0$ is the likelihood that pixel i belongs to the background, and $p_{ij}$ is the penalty if two adjacent pixels i and j are placed one in the foreground and the other in the background. The goal is to find a partition (A, B) of the set of pixels that maximizes the following quantity

$q(A, B) = \sum_{i \in A} a_i + \sum_{i \in B} b_i - \sum_{i, j \text{ adjacent},\ |A \cap \{i, j\}| = 1} p_{ij}$.

Indeed, for pixels in A (considered as the foreground), we gain $a_i$; for all pixels in B (considered as the background), we gain $b_i$. On the border, between two adjacent pixels i and j, we lose $p_{ij}$. It is equivalent to minimize the quantity

$q'(A, B) = \sum_{i \in A} b_i + \sum_{i \in B} a_i + \sum_{i, j \text{ adjacent},\ |A \cap \{i, j\}| = 1} p_{ij}$

because

$q(A, B) = \sum_{i \in A \cup B} a_i + \sum_{i \in A \cup B} b_i - q'(A, B)$.
Minimum cut displayed on the network (triangles VS circles).

We now construct the network whose nodes are the pixels, plus a source and a sink (see the network figure above). We connect the source to pixel i by an edge of weight $a_i$. We connect the pixel i to the sink by an edge of weight $b_i$. We connect pixel i to pixel j with weight $p_{ij}$. Now, it remains to compute a minimum cut in that network (or equivalently a maximum flow). The last figure shows a minimum cut.
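The correspondence between cuts and segmentations can be checked on a tiny example. The sketch below assumes pixels are labelled by dictionary keys and penalties are keyed by adjacent pairs (i, j); it evaluates the cut whose source side is the foreground and finds the best one by brute force instead of a max-flow computation, which is only sensible for a handful of pixels.

```python
from itertools import combinations


def cut_capacity(a, b, p, foreground):
    """Capacity of the cut whose source side is {source} plus `foreground`.

    This equals q'(A, B) for A = foreground and B = all other pixels.
    """
    A = set(foreground)
    cut = sum(b[i] for i in a if i in A)        # cut edges pixel -> sink
    cut += sum(a[i] for i in a if i not in A)   # cut edges source -> pixel
    cut += sum(w for (i, j), w in p.items() if (i in A) != (j in A))
    return cut


def objective(a, b, p, foreground):
    """q(A, B): likelihood gained minus the penalties on the border."""
    A = set(foreground)
    gain = sum(a[i] if i in A else b[i] for i in a)
    return gain - sum(w for (i, j), w in p.items() if (i in A) != (j in A))


def best_segmentation(a, b, p):
    """Foreground set of a minimum cut, found by trying every subset."""
    pixels = list(a)
    subsets = (set(c) for r in range(len(pixels) + 1)
               for c in combinations(pixels, r))
    return min(subsets, key=lambda A: cut_capacity(a, b, p, A))


a = {0: 9, 1: 8, 2: 1}        # foreground likelihoods
b = {0: 1, 1: 2, 2: 9}        # background likelihoods
p = {(0, 1): 1, (1, 2): 1}    # penalties between adjacent pixels
print(best_segmentation(a, b, p))  # {0, 1}
```

For the three-pixel strip the minimum cut has capacity 5, and the corresponding objective value is 30 - 5 = 25, in line with the identity relating q and q'.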

Extensions

1. In the minimum-cost flow problem, each edge (u,v) also has a cost-coefficient $a_{uv}$ in addition to its capacity. If the flow through the edge is $f_{uv}$, then the cost incurred on that edge is $a_{uv} f_{uv}$, and the total cost is the sum of these costs over all edges. It is required to find a flow of a given size d, with the smallest cost. In most variants, the cost-coefficients may be either positive or negative. There are various polynomial-time algorithms for this problem.

2. The maximum-flow problem can be augmented by disjunctive constraints: a negative disjunctive constraint says that a certain pair of edges cannot simultaneously have a nonzero flow; a positive disjunctive constraint says that, in a certain pair of edges, at least one must have a nonzero flow. With negative constraints, the problem becomes strongly NP-hard even for simple networks. With positive constraints, the problem is polynomial if fractional flows are allowed, but may be strongly NP-hard when the flows must be integral.[20]


References

  1. Schrijver, A. (2002). "On the history of the transportation and maximum flow problems". Mathematical Programming. 91 (3): 437–445. CiteSeerX 10.1.1.23.5134. doi:10.1007/s101070100259. S2CID 10210675.
  2. Gass, Saul I.; Assad, Arjang A. (2005). "Mathematical, algorithmic and professional developments of operations research from 1951 to 1956". An Annotated Timeline of Operations Research. International Series in Operations Research & Management Science. 75. pp. 79–110. doi:10.1007/0-387-25837-X_5. ISBN 978-1-4020-8116-3.
  3. Harris, T. E.; Ross, F. S. (1955). "Fundamentals of a Method for Evaluating Rail Net Capacities" (PDF). Research Memorandum.
  4. Ford, L. R.; Fulkerson, D. R. (1956). "Maximal flow through a network". Canadian Journal of Mathematics. 8: 399–404. doi:10.4153/CJM-1956-045-5.
  5. Ford, L. R., Jr.; Fulkerson, D. R. (1962). Flows in Networks. Princeton University Press.
  6. Sherman, Jonah (2013). "Nearly Maximum Flows in Nearly Linear Time". Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science. pp. 263–269. arXiv:1304.2077. doi:10.1109/FOCS.2013.36. ISBN 978-0-7695-5135-7. S2CID 14681906.
  7. Kelner, J. A.; Lee, Y. T.; Orecchia, L.; Sidford, A. (2014). "An Almost-Linear-Time Algorithm for Approximate Max Flow in Undirected Graphs, and its Multicommodity Generalizations" (PDF). Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms. p. 217. arXiv:1304.2338. doi:10.1137/1.9781611973402.16. ISBN 978-1-61197-338-9. S2CID 10733914. Archived from the original (PDF) on 2016-03-03.
  8. Knight, Helen (7 January 2014). "New algorithm can dramatically streamline solutions to the 'max flow' problem". MIT News. Retrieved 8 January 2014.
  9. Orlin, James B. (2013). "Max flows in O(nm) time, or better". Proceedings of the Forty-fifth Annual ACM Symposium on Theory of Computing (STOC '13). pp. 765–774. CiteSeerX 10.1.1.259.5759. doi:10.1145/2488608.2488705. ISBN 9781450320290. S2CID 207205207.
  10. Malhotra, V. M.; Kumar, M. Pramodh; Maheshwari, S. N. (1978). "An algorithm for finding maximum flows in networks" (PDF). Information Processing Letters. 7 (6): 277–278. doi:10.1016/0020-0190(78)90016-9.
  11. Goldberg, A. V.; Tarjan, R. E. (1988). "A new approach to the maximum-flow problem". Journal of the ACM. 35 (4): 921. doi:10.1145/48014.61051. S2CID 52152408.
  12. King, V.; Rao, S.; Tarjan, R. (1994). "A Faster Deterministic Maximum Flow Algorithm". Journal of Algorithms. 17 (3): 447–474. doi:10.1006/jagm.1994.1044. S2CID 15493.
  13. Goldberg, A. V.; Rao, S. (1998). "Beyond the flow decomposition barrier". Journal of the ACM. 45 (5): 783. doi:10.1145/290179.290181. S2CID 96030.
  14. Itai, A.; Perl, Y.; Shiloach, Y. (1982). "The complexity of finding maximum disjoint paths with length constraints". Networks. 12 (3): 277–286. doi:10.1002/net.3230120306. ISSN 1097-0037.
  15. Schwartz, B. L. (1966). "Possible Winners in Partially Completed Tournaments". SIAM Review. 8 (3): 302–308. doi:10.1137/1008062. JSTOR 2028206.
  16. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "26. Maximum Flow". Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill. pp. 643–668. ISBN 978-0-262-03293-3.
  17. Kingsford, Carl. "Max-flow extensions: circulations with demands" (PDF).
  18. "Project imagesegmentationwithmaxflow, that contains the source code to produce these illustrations". GitLab. Archived from the original on 2019-12-22. Retrieved 2019-12-22.
  19. "Algorithm Design". www.pearson.com. Retrieved 2019-12-21.
  20. Schauer, Joachim; Pferschy, Ulrich (2013-07-01). "The maximum flow problem with disjunctive constraints". Journal of Combinatorial Optimization. 26 (1): 109–119. CiteSeerX 10.1.1.414.4496. doi:10.1007/s10878-011-9438-7. ISSN 1382-6905. S2CID 6598669.

Further reading