Cobham's thesis

From Wikipedia, the free encyclopedia
  (Redirected from Tractable problem)
Jump to: navigation, search
The graph shows time (average of 100 instances in ms using a 933 MHz Pentium III) vs.problem size for knapsack problems for a state-of-the-art specialized algorithm. Quadratic fit suggests that empirical algorithmic complexity for instances with 50–10,000 variables is O((log n)2). The data comes from [1]

Cobham's thesis, also known as Cobham–Edmonds thesis (named after Alan Cobham and Jack Edmonds),[2][3][4] asserts that computational problems can be feasibly computed on some computational device only if they can be computed in polynomial time; that is, if they lie in the complexity class P.[5]

Formally, to say that a problem can be solved in polynomial time is to say that there exists an algorithm that, given an n-bit instance of the problem as input, can produce a solution in time O(nc), where c is a constant that depends on the problem but not the particular instance of the problem.

Alan Cobham's 1965 paper entitled "The intrinsic computational difficulty of functions"[6] is one of the earliest mentions of the concept of the complexity class P, consisting of problems decidable in polynomial time. Cobham theorized that this complexity class was a good way to describe the set of feasibly computable problems. Any problem that cannot be contained in P is not feasible, but if a real-world problem can be solved by an algorithm existing in P, generally such an algorithm will eventually be discovered.

The class P is a useful object of study because it is not sensitive to the details of the model of computation: for example, a change from a single-tape Turing machine to a multi-tape machine can lead to a quadratic speedup, but any algorithm that runs in polynomial time under one model also does so on the other.

In similar spirit, NC complexity class can be thought to capture problems "effectively solvable" on a parallel computer.

Reasoning[edit]

The thesis is widely considered to be a good rule of thumb for real-life problems. Typical input lengths that users and programmers are interested in are approximately between 100 and 1,000,000. Consider an input length of n=100 and a polynomial algorithm whose running time is n2. This is a typical running time for a polynomial algorithm. (See the "Objections" section for a discussion of atypical running times.) The number of steps that it will require, for n=100, is 1002=10000. A typical CPU will be able to do approximately 109 operations per second (this is extremely simplified). So this algorithm will finish on the order of (10000 ÷109) = .00001 seconds. A running time of .00001 seconds is reasonable, and that's why this is called a practical algorithm. The same algorithm with an input length of 1,000,000 will take on the order of 17 minutes, which is also a reasonable time for most (non-real-time) applications.

Meanwhile, an algorithm that runs in exponential time might have a running time of 2n. The number of operations that it will require, for n=100, is 2100. It will take (2100 ÷ 109) ≈ 1.3×1021 seconds, which is (1.3×1021 ÷ 31556926) ≈ 4.1×1013 years, longer than the age of the universe. The largest problem this algorithm could solve in a day would have n=46, which seems very small.

Mathematically speaking, for big enough inputs, any polynomial time algorithm will beat any exponential time algorithm, and by arbitrarily large amounts. The only question is how big the input must be for this crossover to occur.

Objections[edit]

There are many lines of objection to Cobham's thesis. The thesis essentially states that "P" means "easy, fast, and practical," while "not in P" means "hard, slow, and impractical." But this is not always true. To begin with, it abstracts away some important variables that influence the runtime in practice:

  • It ignores constant factors and lower-order terms.
  • It ignores the size of the exponent. The time hierarchy theorem proves the existence of problems in P requiring arbitrarily large exponents.
  • It ignores the typical size of the input.

All three are related, and are general complaints about analysis of algorithms, but they particularly apply to Cobham's thesis since it makes an explicit claim about practicality. Under Cobham's thesis, we are to call a problem for which the best algorithm takes 10100n instructions feasible, and a problem with an algorithm that takes 20.00001 n infeasible—even though we could never solve an instance of size n=1 with the former algorithm, whereas we could solve an instance of the latter problem of size n=106 without difficulty. As we saw, it takes a day on a typical modern machine to process 2n operations when n=46; this may be the size of inputs we have, and the amount of time we have to solve a typical problem, making the 2n-time algorithm feasible in practice on the inputs we have. Conversely, in fields where practical problems have millions of variables (such as Operations Research or Electronic Design Automation), even O(n3) algorithms are often impractical.

References[edit]

  1. ^ Pisinger, D. 2003. "Where are the hard knapsack problems?" Technical Report 2003/08, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
  2. ^ Oded Goldreich (2008), Computational complexity: a conceptual perspective, Cambridge University Press, p. 128, ISBN 978-0-521-88473-0 
  3. ^ Dexter Kozen (2006), Theory of computation, Birkhäuser, p. 4, ISBN 978-1-84628-297-3 
  4. ^ Egon Börger (1989), Computability, complexity, logic, Elsevier, p. 225, ISBN 978-0-444-87406-1 
  5. ^ Steven Homer and Alan L. Selman (1992), "Complexity Theory", in Alan Kent and James G. Williams, Encyclopedia of Computer Science and Technology 26, CRC Press 
  6. ^ Alan Cobham (1965), "The intrinsic computational difficulty of functions", Proc. Logic, Methodology, and Philosophy of Science II, North Holland