# Proof complexity

In theoretical computer science, and specifically computational complexity theory, proof complexity is the field aiming to understand and analyse the computational resources that are required to prove or refute statements. Research in proof complexity is predominantly concerned with proving proof-length lower and upper bounds in various propositional proof systems. Systematic study of proof complexity began with the work of Cook and Reckhow (1979), who provided the basic definition of a propositional proof system from the perspective of computational complexity. Specifically, Cook and Reckhow observed that proving proof-size lower bounds on stronger and stronger propositional proof systems can be viewed as a step towards separating NP from coNP (and thus P from NP), since the existence of a propositional proof system that admits polynomial-size proofs for all tautologies implies that NP = coNP.

Contemporary proof complexity research draws ideas and methods from many areas in computational complexity, algorithms and mathematics. Since many important algorithms and algorithmic techniques can be cast as proof-search algorithms for certain proof systems, proving lower bounds on proof sizes in these systems implies run-time lower bounds on the corresponding algorithms.

Mathematical logic can also serve as a framework to study propositional proof sizes. Specifically, weak fragments of Peano arithmetic, which go under the name of bounded arithmetic theories, serve as uniform versions of propositional proofs, corresponding to different propositional proof systems.

## Polynomiality of proofs

Different propositional proof systems for theorem proving in propositional logic, such as the sequent calculus, the cutting-plane method, resolution, the DPLL algorithm, etc., produce different proofs when applied to the same formula. Proof complexity measures the efficiency of a method in terms of the size of the proofs it produces.

Two points make the study of proof complexity non-trivial:

1. the size of a proof depends on the formula that is to be proved inconsistent;
2. proof methods are generally families of algorithms, since some of their steps are not uniquely specified. For example, resolution is based on iteratively choosing a pair of clauses containing opposite literals and deriving a new clause that is a consequence of them; since several such pairs may be available at each step, the algorithm must choose one, and these choices affect the length of the resulting proof.
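The resolution rule itself fits in a few lines of code. The sketch below is a minimal illustration in Python, assuming the common SAT-solver convention of representing a clause as a frozen set of signed integers:

```python
def resolve(c1, c2, lit):
    """Resolve clause c1 (containing lit) with c2 (containing -lit).
    Clauses are frozensets of nonzero ints; -v negates variable v."""
    assert lit in c1 and -lit in c2
    return (c1 - {lit}) | (c2 - {-lit})

# Resolving (x1 OR x2) with (NOT x1 OR x3) on x1 yields (x2 OR x3):
assert resolve(frozenset({1, 2}), frozenset({-1, 3}), 1) == frozenset({2, 3})
```

A refutation repeatedly applies this step until the empty clause is derived; the choice of which pair to resolve at each step is exactly the nondeterminism mentioned in point 2.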

The first point is taken into account by comparing the size of a proof of a formula with the size of the formula. This comparison is made using the usual assumptions of computational complexity: first, a polynomial proof size/formula size ratio means that the proof is of size similar to that of the formula; second, this ratio is studied in the asymptotic case as the size of the formula increases.

The second point is taken into account by considering, for each formula, the shortest possible proof the considered method can produce.
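For resolution, "the shortest possible proof exists" can be made concrete, if exponentially slowly, by brute force: saturate the clause set under the resolution rule and check whether the empty clause ever appears. The sketch below only decides existence of a refutation; it is an illustration, not a practical prover, and again assumes clauses as frozen sets of signed integers:

```python
from itertools import product

def resolution_closure(clauses):
    """Saturate a clause set under the resolution rule; return True iff
    the empty clause (i.e. a refutation) is derivable. Clauses are
    frozensets of nonzero ints, where -v negates variable v."""
    clauses = set(clauses)
    while frozenset() not in clauses:
        new = set()
        for c1, c2 in product(clauses, repeat=2):
            for lit in c1:
                if -lit in c2:
                    r = (c1 - {lit}) | (c2 - {-lit})
                    # discard tautologies (clauses containing both v and -v)
                    if not any(-l in r for l in r):
                        new.add(r)
        if new <= clauses:  # fixed point reached without a refutation
            return False
        clauses |= new
    return True

# (x OR y), (NOT x), (NOT y) is unsatisfiable, so resolution refutes it:
assert resolution_closure({frozenset({1, 2}), frozenset({-1}), frozenset({-2})})
# a satisfiable clause set admits no refutation:
assert not resolution_closure({frozenset({1}), frozenset({2})})
```

Every refutation the method can produce lives inside this closure, which is why the definition can quantify over the shortest one.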

The question of polynomiality of proofs asks whether a method can always produce a proof of size polynomial in the size of the formula. If such a method exists, then NP would be equal to coNP: this is why the question of polynomiality of proofs is considered important in computational complexity. For some methods, families of formulae whose shortest proofs are superpolynomial in the formula size have been exhibited. For other methods, the question is open.

## Proof size comparison

A second question about proof complexity is whether a method is more efficient than another. Since the proof size depends on the formula, it is possible that one method can produce a short proof of a formula and only long proofs of another formula, while a second method can have exactly the opposite behavior. The assumptions of measuring the size of the proofs relative to the size of the formula and considering only the shortest proofs are also used in this context.

When comparing two proof methods, two outcomes are possible:

1. for every proof of a formula produced using the first method, there is a proof of comparable size of the same formula produced by the second method;
2. there exists a family of formulae on which the first method can produce short proofs while all proofs obtained by the second method are substantially (e.g., superpolynomially) larger.

Several proofs of the second kind involve contradictory formulae expressing the negation of the pigeonhole principle, namely that ${\displaystyle n+1}$ pigeons can fit into ${\displaystyle n}$ holes with no hole containing two or more pigeons.
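The negated pigeonhole principle is easy to write down as a CNF family. The sketch below generates its clauses; the variable encoding `var(i, j)` is one arbitrary choice among many. The formula grows polynomially in ${\displaystyle n}$, whereas Haken (1985) proved that its resolution refutations require exponential size, which is the prototypical separation of the second kind:

```python
from itertools import combinations

def php_clauses(n):
    """CNF encoding of the (false) claim that n+1 pigeons fit into n
    holes with no hole holding two pigeons. Literals are signed ints."""
    def var(i, j):          # pigeon i in 0..n, hole j in 0..n-1
        return i * n + j + 1
    clauses = []
    for i in range(n + 1):  # every pigeon sits in some hole
        clauses.append([var(i, j) for j in range(n)])
    for j in range(n):      # no hole holds two distinct pigeons
        for i1, i2 in combinations(range(n + 1), 2):
            clauses.append([-var(i1, j), -var(i2, j)])
    return clauses

# (n+1) + n*(n+1)*n/2 clauses in total, polynomial in n:
assert len(php_clauses(2)) == 9
```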

## Automatizability

A proof method is automatizable if a proof of a formula can always be generated in time polynomial (or sub-exponential) in the size of the shortest proof of that formula. Some methods, but not all, are automatizable. Automatizability results do not contradict the assumption that the polynomial hierarchy does not collapse; a collapse would follow if generating a proof in time polynomial in the size of the formula were always possible.

## Interpolation

Consider a tautology of the form ${\displaystyle A(x,y)\rightarrow B(y,z)}$. The tautology is true for every choice of ${\displaystyle y}$, and after fixing ${\displaystyle y}$ the evaluations of ${\displaystyle A}$ and ${\displaystyle B}$ are independent, because they are defined on disjoint sets of variables. This means that it is possible to define an interpolant circuit ${\displaystyle C(y)}$ such that both ${\displaystyle A(x,y)\rightarrow C(y)}$ and ${\displaystyle C(y)\rightarrow B(y,z)}$ hold. The interpolant circuit decides whether ${\displaystyle A(x,y)}$ is false or ${\displaystyle B(y,z)}$ is true, by considering only ${\displaystyle y}$. The nature of the interpolant circuit can be arbitrary. Nevertheless, it is possible to use a proof of the initial tautology ${\displaystyle A(x,y)\rightarrow B(y,z)}$ as a hint on how to construct ${\displaystyle C}$. A proof system (e.g. resolution) is said to have efficient interpolation if the interpolant ${\displaystyle C(y)}$ is efficiently computable from any proof of the tautology ${\displaystyle A(x,y)\rightarrow B(y,z)}$ in that system. The efficiency is measured with respect to the length of the proof: it is easier to compute interpolants for longer proofs, so this property appears to be anti-monotone in the strength of the proof system.
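A toy instance makes the definition concrete. Assuming the hypothetical choice ${\displaystyle A(x,y)=x\wedge y}$ and ${\displaystyle B(y,z)=y\vee z}$ (picked for illustration only), the circuit ${\displaystyle C(y)=y}$ is an interpolant, which the brute-force check below verifies over all assignments:

```python
from itertools import product

# Hypothetical toy instance: A(x, y) = x AND y, B(y, z) = y OR z,
# so A -> B is a tautology; C depends only on the shared variable y.
A = lambda x, y: x and y
B = lambda y, z: y or z
C = lambda y: y  # candidate interpolant

for x, y, z in product([False, True], repeat=3):
    assert (not A(x, y)) or B(y, z)  # A -> B is a tautology
    assert (not A(x, y)) or C(y)     # A -> C
    assert (not C(y)) or B(y, z)     # C -> B
```

Here exhaustive enumeration suffices; efficient interpolation instead extracts such a ${\displaystyle C}$ from a proof, in time polynomial in the proof's length.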

Interpolation is a weak form of automatization: a way to deduce the existence of small circuits from the existence of small proofs. In particular, the following three statements cannot be simultaneously true: (a) ${\displaystyle A(x,y)\rightarrow B(y,z)}$ has a short proof in some proof system; (b) that proof system has efficient interpolation; (c) the interpolant circuit solves a computationally hard problem. Indeed, (a) and (b) together imply that there is a small interpolant circuit, contradicting (c). This relation makes it possible to turn proof-length upper bounds into lower bounds on computations and, dually, to turn efficient interpolation algorithms into lower bounds on proof length.

## Non-classical logics

The idea of comparing the size of proofs can be used for any automated reasoning procedure that generates a proof. Some research has been done about the size of proofs for propositional non-classical logics, in particular, intuitionistic, modal, and non-monotonic logics.