Computational complexity

In computer science, the computational complexity, or simply complexity, of an algorithm is the amount of resources required for running it. The computational complexity of a problem is the minimum of the complexities of all possible algorithms for this problem (including algorithms that are not yet known).

As the amount of needed resources varies with the input, the complexity is generally expressed as a function n → f(n), where n is the size of the input and f(n) is either the worst-case complexity, that is, the maximum of the amount of resources needed over all inputs of size n, or the average-case complexity, that is, the average of the amount of resources over all inputs of size n.

When the nature of the resources is not explicitly given, the resource considered is usually the time needed for running the algorithm, and one talks of time complexity. However, this depends on the computer that is used, so the time is generally expressed as the number of elementary operations that are needed; these operations are assumed to take constant time on a given computer and to change only by a constant factor when the computer is changed.

Another resource that is often considered is the size of the memory that is needed, and one then talks of space complexity.

The study of the complexity of explicitly given algorithms is called analysis of algorithms, while the study of the complexity of problems is called computational complexity theory. Clearly, both areas are strongly related, as the complexity of an algorithm is always an upper bound on the complexity of the problem solved by this algorithm.

Resources

Time

The resource that is most commonly considered is time, and one then talks of time complexity. When "complexity" is used without qualification, it generally means time complexity.

The usual units of time are not used in complexity theory, because they are too dependent on the choice of a specific computer and on the evolution of the technology. Therefore, instead of real time, one generally counts the elementary operations that are performed during the computation. These operations are assumed to take constant time on a given machine, and are often called steps.
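
As an illustration, the following minimal sketch (in Python; the function name and the choice of what counts as one step are illustrative assumptions) counts the elementary operations performed by a simple summation loop:

def summation_steps(values):
    """Sum a list while counting the elementary operations performed.

    What counts as one "step" is a modelling choice; here each addition
    is taken as one elementary operation, so the step count grows
    linearly with the input size n, independently of the machine.
    """
    total = 0
    steps = 0
    for v in values:
        total += v   # one elementary operation (one addition)
        steps += 1
    return total, steps

print(summation_steps(list(range(10))))  # (45, 10): exactly n steps for n = 10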

Space

Another important resource is the size of computer memory that is needed for running algorithms.

Others

The number of arithmetic operations is another resource that is commonly used. In this case, one talks of arithmetic complexity. If one knows an upper bound on the size of the binary representation of the numbers that occur during a computation, the time complexity is generally the arithmetic complexity multiplied by a constant factor.

For many algorithms the size of the integers that are used during a computation is not bounded, and it is not realistic to consider that arithmetic operations take a constant time. Therefore the time complexity, generally called bit complexity in this context, may be much larger than the arithmetic complexity. For example, the arithmetic complexity of the computation of the determinant of an n×n integer matrix is O(n^3) for the usual algorithms (Gaussian elimination). The bit complexity of the same algorithms is exponential in n, because the size of the coefficients may grow exponentially during the computation. On the other hand, if these algorithms are coupled with multi-modular arithmetic, the bit complexity may be reduced to Õ(n^4) (soft O notation).
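
The growth of the coefficients can be observed directly. The following sketch runs a division-free variant of Gaussian elimination on a random integer matrix and records the largest bit length of an entry after each elimination stage; the particular variant, the function name and the test matrix are illustrative choices, not prescribed by the text:

import random

def division_free_elimination(matrix):
    """Division-free Gaussian elimination over the integers.

    Each step replaces a[i][j] by a[i][j]*pivot - a[i][k]*a[k][j], so the
    entries stay integers, but their bit length can roughly double at every
    stage: the arithmetic complexity is O(n^3) operations, while the bit
    complexity is much larger because the operands keep growing.
    """
    a = [row[:] for row in matrix]
    n = len(a)
    max_bits_per_stage = []
    for k in range(n - 1):
        pivot = a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k]
            for j in range(k, n):
                a[i][j] = a[i][j] * pivot - factor * a[k][j]
        max_bits_per_stage.append(max(abs(x).bit_length() for row in a for x in row))
    return max_bits_per_stage

random.seed(0)
m = [[random.randint(1, 9) for _ in range(8)] for _ in range(8)]
print(division_free_elimination(m))  # bit lengths roughly double from stage to stage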

In sorting and searching, the resource that is generally considered is the number of comparisons between entries. This is generally a good measure of the time complexity if data are suitably organized.
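
For example, here is a minimal sketch of binary search that counts the entry comparisons it performs (the function name and the counting convention are illustrative assumptions):

def binary_search(sorted_items, target):
    """Return (index of target or None, number of entry comparisons made).

    Each loop iteration performs at most two entry comparisons, so the
    count grows proportionally to log2(n) for a sorted list of n entries,
    a machine-independent measure of the work done.
    """
    lo, hi = 0, len(sorted_items) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_items[mid] == target:
            return mid, comparisons
        comparisons += 1
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, comparisons

# A few dozen comparisons on a sorted list of a million entries,
# against up to a million for a linear scan.
print(binary_search(list(range(1_000_000)), 765_432))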

Complexity as a function of input size

For clarity, only time complexity is considered in this section, but everything applies (with slight modifications) to the complexity with respect to other resources.

It is impossible to count the number of steps of an algorithm on all possible inputs. As the complexity generally increases with the size of the input, it is commonly expressed as a function of the size n (in bits) of the input. However, the complexity of an algorithm may vary dramatically for different inputs of the same size. Therefore, several complexity functions are commonly used.

The worst-case complexity is the maximum of the complexity over all inputs of size n, and the average-case complexity is the average of the complexity over all inputs of size n (this makes sense, as the number of possible inputs of a given size is finite). Generally, when "complexity" is used without further specification, it is the worst-case time complexity that is considered.
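
Because the number of inputs of a given size is finite, both quantities can, for small sizes, be obtained by exhaustive enumeration. The following sketch does this for the number of comparisons made by insertion sort on all permutations of six elements; the algorithm and the names are illustrative choices:

from itertools import permutations

def insertion_sort_comparisons(seq):
    """Sort a copy of seq by insertion sort and return the number of
    entry comparisons performed."""
    a = list(seq)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1            # one comparison of entries
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 6
counts = [insertion_sort_comparisons(p) for p in permutations(range(n))]
print("worst case  :", max(counts))                 # 15, i.e. n(n-1)/2
print("average case:", sum(counts) / len(counts))   # about 11, averaged over all 720 inputs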

Asymptotic complexity

It is generally very difficult to compute exactly the worst-case and the average-case complexity, and the exact values have little meaning in any case, as any change of computer or of model of computation would change them somewhat. Moreover, the resource use is not critical for small values of n, so that, for small n, the ease of implementation is generally more interesting than a low complexity.

For these reasons, one generally focuses on the behavior of the complexity for large n, that is, on its asymptotic behavior as n tends to infinity. Therefore, the complexity is generally expressed by using big O notation.

For example, the usual algorithm for integer multiplication has a complexity of O(n^2); this means that there is a constant c_u such that the multiplication of two integers of at most n digits may be done in a time less than c_u n^2. This bound is sharp in the sense that the worst-case complexity and the average-case complexity are Ω(n^2), which means that there is a constant c_l such that these complexities are larger than c_l n^2. The radix does not appear in these complexities, as changing the radix changes only the constants c_u and c_l.
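
A minimal sketch of the schoolbook method, counting the elementary digit products it performs (the representation of numbers as digit lists and all names are illustrative assumptions):

def schoolbook_multiply(x_digits, y_digits, base=10):
    """Multiply two numbers given as digit lists (least significant digit first).

    The two nested loops perform one digit product for every pair of digits,
    so multiplying two n-digit numbers costs n*n elementary digit operations,
    whatever the base: this quadratic growth is what the O(n^2) bound states.
    """
    result = [0] * (len(x_digits) + len(y_digits))
    digit_products = 0
    for i, xd in enumerate(x_digits):
        carry = 0
        for j, yd in enumerate(y_digits):
            digit_products += 1                    # one elementary digit product
            total = result[i + j] + xd * yd + carry
            result[i + j] = total % base
            carry = total // base
        result[i + len(y_digits)] += carry
    return result, digit_products

# 1234 * 5678 = 7006652, computed with 4 * 4 = 16 digit products.
print(schoolbook_multiply([4, 3, 2, 1], [8, 7, 6, 5]))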

Use in algorithm design

Evaluating the complexity of an algorithm is an important part of algorithm design, as this gives useful information on the performance that may be expected.

It is a common misconception that evaluating the complexity of algorithms has become less important because of the exponential growth (Moore's law) of the power of modern computers. This is wrong because the increased power allows working with larger input data (big data). For example, when one wants to sort alphabetically a list of a few hundred entries, such as the bibliography of a book, any algorithm should work well in less than a second. On the other hand, for a list of a million entries (the phone numbers of a large town, for example), the elementary algorithms that require O(n^2) comparisons would have to do a trillion comparisons, which would need more than ten days at the speed of a million comparisons per second. By contrast, quicksort and merge sort require only about n log2 n comparisons (as average-case complexity for the former, as worst-case complexity for the latter). For n = 1,000,000, this gives approximately 30,000,000 comparisons, which may be done in a few seconds on a powerful computer.
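
A minimal sketch of the arithmetic behind this example, using the idealized counts n^2 and n·log2 n (the rounded figure of 30,000,000 quoted above is of the same order, quicksort's average case being a constant factor above n·log2 n):

import math

n = 1_000_000
rate = 1_000_000                   # comparisons per second, as assumed above

quadratic = n * n                  # elementary O(n^2) sorting algorithms
n_log_n = n * math.log2(n)         # merge sort worst case, up to a constant

print(f"n^2 comparisons       : {quadratic:,}")       # 1,000,000,000,000 (a trillion)
print(f"n*log2(n) comparisons : {round(n_log_n):,}")  # about 20,000,000
print(f"n^2 at that rate      : {quadratic / rate / 86_400:.1f} days")   # about 11.6 days
print(f"n*log2(n) at that rate: {n_log_n / rate:.1f} seconds")           # about 20 seconds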

Thus the evaluation of the complexity may allow eliminating many inefficient algorithms before any implementation. It may also be used for tuning complex algorithms without testing all variants. By identifying the most costly steps of a complex algorithm, the study of complexity also allows focusing on these steps the effort for improving the efficiency of an implementation.
