Assignment problem

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

The assignment problem is a fundamental combinatorial optimization problem. It consists of finding, in a weighted bipartite graph, a matching in which the sum of weights of the edges is as large as possible. A common variant consists of finding a minimum-weight perfect matching.

It is a specialization of the maximum weight matching problem for bipartite graphs.

In its most general form, the problem is as follows:

The problem instance has a number of agents and a number of tasks. Any agent can be assigned to perform any task, incurring some cost that may vary depending on the agent-task assignment. It is required to perform all tasks by assigning exactly one agent to each task and exactly one task to each agent in such a way that the total cost of the assignment is minimized.

If the numbers of agents and tasks are equal, and the total cost of the assignment for all tasks is equal to the sum of the costs for each agent (or the sum of the costs for each task, which is the same thing in this case), then the problem is called the linear assignment problem. Commonly, when speaking of the assignment problem without any additional qualification, then the linear assignment problem is meant.

Algorithms and generalizations[edit]

A naive solution for the assignment problem is to check all the assignments and calculate the cost of each one. This may be very inefficient since, with n agents and n tasks, there are n! (factorial of n) different assignments.

Many algorithms have been developed for solving the assignment problem in time bounded by a polynomial of n. One of the first such algorithms was the Hungarian algorithm, developed by Munkres.[1] Other algorithms include adaptations of the primal simplex algorithm, and the auction algorithm.

The assignment problem is a special case of the transportation problem, which is a special case of the minimum cost flow problem, which in turn is a special case of a linear program. While it is possible to solve any of these problems using the simplex algorithm, each specialization has more efficient algorithms designed to take advantage of its special structure.

When a number of agents and tasks is very large, a parallel algorithm with randomization can be applied.

The problem of finding minimum weight maximum matching can be converted to finding a minimum weight perfect matching. A bipartite graph can be extended to a complete bipartite graph by adding artificial edges with large weights. These weights should exceed the weights of all existing matchings to prevent appearance of artificial edges in the possible solution. As shown by Mulmuley, Vazirani and Vazirani,[2] the problem of minimum weight perfect matching is converted to finding minors in the adjacency matrix of a graph. Using the isolation lemma, a minimum weight perfect matching in a graph can be found with probability at least ½. For a graph with n vertices, it requires time.

Example[edit]

Suppose that a taxi firm has three taxis (the agents) available, and three customers (the tasks) wishing to be picked up as soon as possible. The firm prides itself on speedy pickups, so for each taxi the "cost" of picking up a particular customer will depend on the time taken for the taxi to reach the pickup point. The solution to the assignment problem will be whichever combination of taxis and customers results in the least total cost.

However, the assignment problem can be made rather more flexible than it first appears. In the above example, suppose that there are four taxis available, but still only three customers. Then a fourth dummy task can be invented, perhaps called "sitting still doing nothing", with a cost of 0 for the taxi assigned to it. The assignment problem can then be solved in the usual way and still give the best solution to the problem.

Similar adjustments can be done in order to allow more tasks than agents, tasks to which multiple agents must be assigned (for instance, a group of more customers than will fit in one taxi), or maximizing profit rather than minimizing cost.

Formal definition[edit]

The formal definition of the assignment problem (or linear assignment problem) is

Given two sets, A and T, of equal size, together with a weight function C : A × TR. Find a bijection f : AT such that the cost function:

is minimized.

Usually the weight function is viewed as a square real-valued matrix C, so that the cost function is written down as:

The problem is "linear" because the cost function to be optimized as well as all the constraints contain only linear terms.

Solution by linear programming[edit]

The assignment problem can be solved by presenting it as a linear program. For convenience we will present the maximization problem. Each edge (i,j), where i is in A and j is in T, has a weight . For each edge (i,j) we have a variable . The variable is 1 if the edge is contained in the matching and 0 otherwise, so we set the domain constraints:

The total weight of the matching is: . The goal is to find a maximum-weight perfect matching.

To guarantee that the variables indeed represent a perfect matching, we add constraints saying that each vertex is adjacent to exactly one edge in the matching, i.e, .

All in all we have the following LP:

This is an integer linear program. However, we can solve it without the integrality constraints (i.e., drop the last constraint), using standard methods for solving continuous linear programs. While this formulation allows also fractional variable values, in this special case, the LP always has an optimal solution where the variables take integer values. This is because the constraint matrix of the fractional LP is totally unimodular - it satisfies the four conditions of Hoffman and Gale.

This can also be proved directly.[3]:31-37 Let x be an optimal solution of the fractional LP, w(x) be its total weight, and k(x) be the number of non-integral variables. If k(x)=0 we are done. Otherwise, there is a fractional variable, say . Because the sum of variables adjacent to j2 is 1, which in an integer, there must be another variable adjacent to j2 with a fractional value, say . By similar considerations on i3, there must be another variable adjacent to i3 with a fractional value, say . By similar considerations we move from one vertex to another, collecting edges with fractional values. Since the graph is finite, at some point we must have a cycle. Without loss of generality we can assume that the cycle ends at vertex i1, so the last fractional variable in the cycle is . So the number of edges in the cycle is 2m - it must be even since the graph is bipartite.

Suppose we add a certain constant e to all even variables in the cycle, and remove the same constant e from all odd variables in the cycle. For any such e, the sum of variables near each vertex remains the same (1), so the vertex constraints are still satisfied. Moreover, if e is sufficiently small, all variables remain between 0 and 1, so the domain constraints are still satisfied too. It is easy to find a largest e that maintains the domain constraints: it is either the smallest difference between an odd variable and 0, or the smallest difference between an even variable and 1. Now, we have one less fractional variable, so k(x) decreases by 1. The objective value remains the same, since otherwise we could increase it by selecting e to be positive or negative, in contradiction to the assumption that it is maximal.

By repeating the cycle-removal process we arrive, after at most n steps, at a solution in which all variables are integral.

See also[edit]

References and further reading[edit]

  1. ^ Munkres, James (1957). "Algorithms for the Assignment and Transportation Problems". Journal of the Society for Industrial and Applied Mathematics. 5 (1): 32–38. CiteSeerX 10.1.1.228.3911. doi:10.1137/0105003. JSTOR 2098689.
  2. ^ Mulmuley, Ketan; Vazirani, Umesh; Vazirani, Vijay (1987). "Matching is as easy as matrix inversion". Combinatorica. 7 (1): 105–113. doi:10.1007/BF02579206.
  3. ^ Bernd Gärtner and Jiří Matoušek (2006). Understanding and Using Linear Programming. Berlin: Springer. ISBN 3-540-30697-8.