# Proof complexity

In computer science, proof complexity is a measure of efficiency of automated theorem proving methods that is based on the size of the proofs they produce. The methods for proving contradiction in propositional logic are the most analyzed. The two main issues considered in proof complexity are whether a proof method can produce a polynomial proof of every inconsistent formula, and whether the proofs produced by one method are always of size similar to those produced by another method.

## Polynomiality of proofs

Different propositional proof system for theorem proving in propositional logic, such as the sequent calculus, the cutting-plane method, resolution, the DPLL algorithm, etc. produce different proofs when applied to the same formula. Proof complexity measures the efficiency of a method in terms of the size of the proofs it produces.

Two points make the study of proof complexity non-trivial:

1. the size of a proof depends on the formula that is to be proved inconsistent;
2. proof methods are generally families of algorithms, as some of their steps are not univocally specified; for example, resolution is based on iteratively choosing a pair of clauses containing opposite literals and producing a new clause that is a consequence of them; since several such pairs may be available at each step, the algorithm has to choose one; these choices affect the proof length.

The first point is taken into account by comparing the size of a proof of a formula with the size of the formula. This comparison is made using the usual assumptions of computational complexity: first, a polynomial proof size/formula size ratio means that the proof is of size similar to that of the formula; second, this ratio is studied in the asymptotic case as the size of the formula increases.

The second point is taken into account by considering, for each formula, the shortest possible proof the considered method can produce.

The question of polynomiality of proofs is whether a method can always produce a proof of size polynomial in the size of the formula. If such a method exists, then NP would be equal to coNP: this is why the question of polynomiality of proofs is considered important in computational complexity. For some methods, the existence of formulae whose shortest proofs are always superpolynomial has been proved. For other methods, it is an open question.

## Proof size comparison

A second question about proof complexity is whether a method is more efficient than another. Since the proof size depends on the formula, it is possible that one method can produce a short proof of a formula and only long proofs of another formula, while a second method can have exactly the opposite behavior. The assumptions of measuring the size of the proofs relative to the size of the formula and considering only the shortest proofs are also used in this context.

When comparing two proof methods, two outcomes are possible:

1. for every proof of a formula produced using the first method, there is a proof of comparable size of the same formula produced by the second method;
2. there exists a formula such that the first method can produce a short proof while all proofs obtained by the second method are consistently larger.

Several proofs of the second kind involve contradictory formulae expressing the negation of the pigeonhole principle, namely that $n+1$ pigeons can fit $n$ holes with no hole containing two or more pigeons.

## Automatizability

A proof method is automatizable if one of the shorter proofs of a formula can always be generated in time polynomial (or sub-exponential) in the size of the proof. Some methods, but not all, are automatizable. Automatizability results are not in contrast with the assumption that the polynomial hierarchy does not collapses, which would happen if generating a proof in time polynomial in the size of the formula were always possible.

## Interpolation

Consider a tautology of the form $A(x,y) \rightarrow B(y,z)$. The tautology is true for every choice of $y$, and after fixing $y$ the evaluation of $A$ and $B$ are independent because are defined on disjoint sets of variables. This means that it is possible to define an interpolant circuit $C(y)$, such that both $A(x,y) \rightarrow C(y)$ and $C(y) \rightarrow B(y,z)$ hold. The interpolant circuit decides either if $A(x,y)$ is false or if $B(y,z)$ is true, by only considering $y$. The nature of the interpolant circuit can be arbitrary. Nevertheless it is possible to use a proof of the initial tautology $A(x,y) \rightarrow B(y,z)$ as a hint on how to construct $C$. Some proof systems (e.g. resolution) are said to have efficient interpolation because the interpolant $C(y)$ is efficiently computable from any proof of the tautology $A(x,y) \rightarrow B(y,z)$ in such proof system. The efficiency is measured with respect to the length of the proof: it is easier to compute interpolants for longer proofs, so this property seems to be anti-monotone in the strength of the proof system.

Interpolation is a weak form of automatization: a way to deduce the existence of small circuits from the existence of small proofs. In particular the following three statements cannot be simultaneously true: (a) $A(x,y) \rightarrow B(y,z)$ has a short proof in a some proof system; (b) such proof system has efficient interpolation; (c) the interpolant circuit solves a computationally hard problem. It is clear that (a) and (b) imply that there is a small interpolant circuit, which is in contradiction with (c). Such relation allows to turn proof length upper bounds into lower bounds on computations, and dually to turn efficient interpolation algorithms into lower bounds on proof length.

## Non-classical logics

The idea of comparing the size of proofs can be used for any automated reasoning procedure that generates a proof. Some research has been done about the size of proofs for propositional non-classical logics, in particular, intuitionistic, modal, and non-monotonic logics.