Knapsack problem

From Wikipedia, the free encyclopedia

Example of a one-dimensional (single-constraint) knapsack problem: which boxes should be chosen to maximize the amount of money while keeping the overall weight at or below 15 kg? A multiply constrained problem could consider both the weight and the volume of the boxes. Modeling the shapes and sizes would instead constitute a packing problem.
(Solution: if any number of each box is available, then three yellow boxes and three grey boxes; if only the shown boxes are available, then all but the green box.)

The knapsack problem or rucksack problem is a problem in combinatorial optimization: given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most useful items.

The problem often arises in resource allocation with financial constraints. A similar problem also appears in combinatorics, complexity theory, cryptography and applied mathematics.

The decision problem form of the knapsack problem is the question "can a value of at least V be achieved without exceeding the weight W?"

Definition

In the following, we have n kinds of items, 1 through n. Each kind of item j has a value p_j and a weight w_j. We usually assume that all values and weights are nonnegative. The maximum weight that we can carry in the bag is W.

The most common formulation of the problem is the 0-1 knapsack problem, which restricts the number x_j of copies of each kind of item to zero or one. Mathematically, the 0-1 knapsack problem can be formulated as:

maximize \qquad \sum_{j=1}^n p_j\,x_j
subject to \qquad \sum_{j=1}^n w_j\,x_j \ \le \ W, \quad \quad x_j \ \in \ \{0,1\}

The bounded knapsack problem restricts the number x_j of copies of each kind of item to a maximum integer value b_j. Mathematically, the bounded knapsack problem can be formulated as:

maximize \qquad \sum_{j=1}^n p_j\,x_j
subject to \qquad \sum_{j=1}^n w_j\,x_j \ \le \ W, \quad \quad x_j \ \in \ \{0,1,\ldots,b_j\}

The unbounded knapsack problem places no upper bound on the number of copies of each kind of item.
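For comparison with the 0-1 and bounded formulations, the unbounded variant can be written in the same notation. This is only a sketch: the objective and the weight constraint are unchanged, and the sole difference is that x_j ranges over all nonnegative integers.

```latex
\text{maximize} \qquad \sum_{j=1}^n p_j\,x_j
\text{subject to} \qquad \sum_{j=1}^n w_j\,x_j \le W, \quad x_j \in \{0, 1, 2, \ldots\}
```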

Of particular interest is the special case of the problem with these properties:

  • it is a decision problem,
  • it is a 0-1 problem,
  • for each kind of item, the weight equals the value: w_j = p_j.

Notice that in this special case, the problem is equivalent to this: given a set of nonnegative integers, does any subset of it add up to exactly W? Or, if negative weights are allowed and W is chosen to be zero, the problem is: given a set of integers, does any subset add up to exactly 0? This special case is called the subset sum problem. In the field of cryptography the term knapsack problem is often used to refer specifically to the subset sum problem.
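The subset sum question for nonnegative integers can be illustrated with a short Python sketch. The function name `subset_sum` and its parameters are illustrative assumptions, not part of any library. It tracks the set of sums reachable so far, so its running time grows with the target value rather than with the input length alone.

```python
def subset_sum(numbers, target):
    """Return True if some subset of the nonnegative integers in
    `numbers` adds up to exactly `target` (hypothetical helper)."""
    reachable = {0}  # the empty subset sums to 0
    for n in numbers:
        # Extend every known sum by n, discarding sums above the target.
        reachable |= {s + n for s in reachable if s + n <= target}
    return target in reachable


if __name__ == "__main__":
    print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # 4 + 5 = 9
    print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # no subset reaches 30
```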

If multiple knapsacks are allowed, the problem is better thought of as the bin packing problem.

Computational complexity

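The knapsack problem is interesting from the perspective of computer science because it is NP-complete, yet it can be solved in pseudo-polynomial time by dynamic programming when the weights are integers.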

The subset sum version of the knapsack problem is commonly known as one of Karp's 21 NP-complete problems.

There have been attempts to use subset sum as the basis for public key cryptography systems, such as Merkle-Hellman. These attempts typically used some group other than the integers. Merkle-Hellman and several similar algorithms were later broken, because the particular subset sum problems they produced were in fact solvable by polynomial-time algorithms.

One theme in research literature is to identify what the "hard" instances of the knapsack problem look like[1][2], or viewed another way, to identify what properties of instances in practice might make them more amenable than their worst-case NP-complete behaviour suggests.

Several algorithms are freely available to solve knapsack problems, based on dynamic programming[3], branch and bound[4], or hybrids of both approaches[5][6][7][8].

Dynamic programming solution

Unbounded knapsack problem

If all weights (w_1, ..., w_n and W) are nonnegative integers, the knapsack problem can be solved in pseudo-polynomial time using dynamic programming. The following describes a dynamic programming solution for the unbounded knapsack problem.

To simplify things, assume all weights are strictly positive (w_j > 0). We wish to maximize total value subject to the constraint that total weight is less than or equal to W. Then for each Y ≤ W, define A(Y) to be the maximum value that can be attained with total weight less than or equal to Y. A(W) then is the solution to the problem.

Observe that A(Y) has the following properties:

  • A(0) = 0 (the sum of zero items, i.e., the summation of the empty set)
  • A(Y) = max { p_j + A(Y − w_j)  |  w_j ≤ Y }

where p_j is the value of the jth kind of item.

Here the maximum of the empty set is taken to be zero. Tabulating the results from A(0) up through A(W) gives the solution. Since the calculation of each A(Y) involves examining n items, and there are W values of A(Y) to calculate, the running time of the dynamic programming solution is O(nW). Dividing w_1, ..., w_n and W by their greatest common divisor is a simple way to reduce the running time.

The O(nW) complexity does not contradict the fact that the knapsack problem is NP-complete, since W, unlike n, is not polynomial in the length of the input to the problem. The length of the input to the problem is proportional to the number, log W, of bits in W, not to W itself.
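The recurrence for A(Y) can be tabulated directly. The following Python sketch assumes strictly positive integer weights, as in the text; the function name `unbounded_knapsack` and the list `best` (which plays the role of A) are illustrative choices, not an established API.

```python
def unbounded_knapsack(values, weights, capacity):
    """Maximum total value with total weight <= capacity when any number
    of copies of each item may be used. Assumes positive integer weights."""
    best = [0] * (capacity + 1)  # best[y] corresponds to A(Y); best[0] = 0
    for y in range(1, capacity + 1):
        # A(Y) = max { p_j + A(Y - w_j) | w_j <= Y }, with max of the
        # empty set taken to be zero.
        candidates = [p + best[y - w]
                      for p, w in zip(values, weights) if w <= y]
        best[y] = max(candidates, default=0)
    return best[capacity]


if __name__ == "__main__":
    # Ten copies of the item with value 30 and weight 10 fill the sack.
    print(unbounded_knapsack([10, 30, 20], [5, 10, 15], 100))
```

The two nested passes (over Y and over the n items) make the O(nW) running time visible in the code.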

0-1 knapsack problem

A similar dynamic programming solution for the 0-1 knapsack problem also runs in pseudo-polynomial time. As above, assume w_1, ..., w_n and W are strictly positive integers. Define A(j, Y) to be the maximum value that can be attained with weight less than or equal to Y using items up to j.

We can define A(j,Y) recursively as follows:

  • A(0, Y) = 0
  • A(j, 0) = 0
  • A(j, Y) = A(j − 1, Y)  if w_j > Y
  • A(j, Y) = max { A(j − 1, Y),  p_j + A(j − 1, Y − w_j) }  if w_j ≤ Y.

The solution can then be found by calculating A(n, W). To do this efficiently we can use a table to store previous computations. This solution will therefore run in O(nW) time and O(nW) space, though with some slight modifications we can reduce the space complexity to O(W).
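One possible form of the O(W)-space modification is sketched below in Python. It keeps a single row of the table and scans capacities from high to low, so that each item is counted at most once. The name `knapsack_01` is an assumption made for illustration, and positive integer weights are assumed as in the text.

```python
def knapsack_01(values, weights, capacity):
    """Maximum total value with total weight <= capacity when each item
    may be used at most once. Assumes positive integer weights."""
    # After processing item j, best[y] holds A(j, Y).
    best = [0] * (capacity + 1)
    for p, w in zip(values, weights):
        # Iterate downwards so best[y - w] still refers to A(j - 1, Y - w_j).
        for y in range(capacity, w - 1, -1):
            best[y] = max(best[y], p + best[y - w])
    return best[capacity]


if __name__ == "__main__":
    # Taking the items of weight 20 and 30 gives value 100 + 120.
    print(knapsack_01([60, 100, 120], [10, 20, 30], 50))
```

Scanning upwards instead would let an item be reused within the same pass, which would turn the routine into a solver for the unbounded problem.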

Greedy approximation algorithm

George Dantzig proposed (1957) a greedy approximation algorithm to solve the unbounded knapsack problem. His version sorts the items in decreasing order of value per unit of weight, p_j/w_j. It then proceeds to insert them into the sack, starting with as many copies as possible of the first kind of item until there is no longer space in the sack for more. Provided that there is an unlimited supply of each kind of item, if A is the maximum value of items that fit into the sack, then the greedy algorithm is guaranteed to achieve a value of at least A/2. However, for the bounded problem, where the supply of each kind of item is limited, the algorithm may be much further from optimal.
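The greedy procedure described above can be sketched in a few lines of Python. This is an illustrative reading of the description rather than Dantzig's original formulation; the function name `greedy_unbounded` is hypothetical, and positive integer weights are assumed.

```python
def greedy_unbounded(values, weights, capacity):
    """Greedy value for the unbounded problem: take items in decreasing
    order of value per unit of weight, as many copies as still fit."""
    order = sorted(range(len(values)),
                   key=lambda j: values[j] / weights[j],
                   reverse=True)
    total = 0
    remaining = capacity
    for j in order:
        copies = remaining // weights[j]  # as many copies as fit
        total += copies * values[j]
        remaining -= copies * weights[j]
    return total


if __name__ == "__main__":
    # Greedy takes one item of weight 6 (value 10); the optimum is two
    # items of weight 5 (value 14), so greedy is within a factor of 2.
    print(greedy_unbounded([10, 7], [6, 5], 10))
```

In this small instance the greedy value 10 is suboptimal but still at least half of the optimal value 14, consistent with the A/2 guarantee.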

Dominance relations for the unbounded knapsack problem

Some relations between items make many of the items useless for building an optimal solution. These relations are known as dominance relations. When an item i is known to be dominated by a set of items J, it can be discarded from the set of items used to build an optimal solution. Dominance relations between items allow the size of the search space to be reduced significantly. All the dominance relations enumerated below can be derived from the following inequalities: \sum_{j \in J} w_j\,x_j \le \alpha\,w_i and \sum_{j \in J} p_j\,x_j \ge \alpha\,p_i for some x \in Z_+^n,

where \alpha \in Z_+, J \subseteq N and i \notin J.

Collective Dominance

The i-th item is collectively dominated by J, written as i \ll J, iff \sum_{j \in J} w_j\,x_j \le w_i and \sum_{j \in J} p_j\,x_j \ge p_i for some x \in Z_+^n, i.e. \alpha = 1. Verifying this dominance is computationally hard, so it can be used in a dynamic programming approach only.

Threshold Dominance

The i-th item is threshold dominated by J, written as i \prec\prec J, iff the above inequalities hold for some \alpha \geq 1. This generalizes collective dominance by using, instead of the single item i, a compound item made of α copies of item i. The smallest such α defines the threshold of item i, written t_i = (α − 1) w_i.

Multiple Dominance

The item i is multiply dominated by a single item j, written as i \ll_m j, iff w_j\,x_j \le w_i and p_j\,x_j \ge p_i for some x_j \in Z_+, i.e. J = \{j\}, \alpha = 1, x_j = \lfloor w_i / w_j \rfloor. This dominance can be used efficiently in preprocessing because it is relatively easy to detect.
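A preprocessing pass based on multiple dominance might look like the following Python sketch. The function names are hypothetical, weights are assumed to be positive integers, and the tie-breaking rule for items that dominate each other (keep the one with the lower index) is an assumption added here so that identical items are not all discarded.

```python
def multiply_dominates(j, i, values, weights):
    """True if item j multiply dominates item i, using
    x_j = floor(w_i / w_j) copies of j."""
    copies = weights[i] // weights[j]
    # copies * w_j <= w_i holds by construction; check the value side.
    return copies * values[j] >= values[i]


def remove_multiply_dominated(values, weights):
    """Return the indices of the items that survive the dominance test."""
    n = len(values)
    kept = []
    for i in range(n):
        dominated = False
        for j in range(n):
            if j == i or not multiply_dominates(j, i, values, weights):
                continue
            # Mutual dominance: drop only the item with the higher index.
            if multiply_dominates(i, j, values, weights) and j > i:
                continue
            dominated = True
            break
        if not dominated:
            kept.append(i)
    return kept


if __name__ == "__main__":
    # Item 2 (value 4, weight 5) is dominated by item 0 (value 10, weight 5).
    print(remove_multiply_dominated([10, 25, 4], [5, 11, 5]))
```

This pairwise check costs O(n^2) comparisons, which is cheap compared with the O(nW) dynamic program when W is large.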


Modular Dominance

Let b be the best item, i.e. \frac{p_b}{w_b} \ge \frac{p_j}{w_j} for all j. The item i is modularly dominated by a single item j, written as i \ll_\equiv j, iff w_j = w_i + t\,w_b with t \le 0, and p_j - t\,p_b \ge p_i, i.e. J = \{b, j\}, \alpha = 1, x_b = -t, x_j = 1.

References

  1. D. Pisinger, Where are the hard knapsack problems?, Technical Report 2003/08, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark, 2003.
  2. L. Caccetta, A. Kulanoot, Computational Aspects of Hard Knapsack Problems, Nonlinear Analysis, 47:5547-5558, 2001.
  3. Rumen Andonov, Vincent Poirriez, Sanjay Rajopadhye, Unbounded Knapsack Problem: dynamic programming revisited, European Journal of Operational Research, 123(2):168-181, 2000. http://dx.doi.org/10.1016/S0377-2217(99)00265-9
  4. S. Martello, P. Toth, Knapsack Problems: Algorithms and Computer Implementation, John Wiley and Sons, 1990.
  5. S. Martello, D. Pisinger, P. Toth, Dynamic programming and strong bounds for the 0-1 knapsack problem, Manag. Sci., 45:414-424, 1999.
  6. Vincent Poirriez, Nicola Yanev, Rumen Andonov, A Hybrid Algorithm for the Unbounded Knapsack Problem, Discrete Optimization, 2009. http://dx.doi.org/10.1016/j.disopt.2008.09.004
  7. G. Plateau, M. Elkihel, A hybrid algorithm for the 0-1 knapsack problem, Methods of Oper. Res., 49:277-293, 1985.
  8. S. Martello, P. Toth, A mixture of dynamic programming and branch-and-bound for the subset-sum problem, Manag. Sci., 30:765-771.
