# Schaefer's dichotomy theorem

In computational complexity theory, a branch of computer science, Schaefer's dichotomy theorem states necessary and sufficient conditions under which a finite set S of relations over the Boolean domain yields polynomial-time or NP-complete problems when the relations of S are used to constrain some of the propositional variables.[1] It is called a dichotomy theorem because the complexity of the problem defined by S is either in P or NP-complete as opposed to one of the classes of intermediate complexity that is known to exist (assuming P ≠ NP) by Ladner's theorem.

Special cases of Schaefer's dichotomy theorem include the NP-completeness of SAT (the Boolean satisfiability problem) and its two popular variants 1-in-3 SAT and Not-All-Equal 3SAT (often denoted by NAE-3SAT). In fact, for these two variants of SAT, Schaefer's dichotomy theorem shows that their monotone versions (where negations of variables are not allowed) are also NP-complete.

## Original presentation

Schaefer defines a decision problem that he calls the Generalized Satisfiability problem for S (denoted by SAT(S)). An instance of the problem is an S-formula, i.e. a conjunction of constraints of the form $R(x_{i_1}, \ldots , x_{i_n})$ where R is a relation in S and the $x_{i_j}$ are propositional variables. The problem is to determine whether the given formula is satisfiable, in other words if the variables can be assigned values such that they satisfy all the constraints.

Schaefer identifies six classes of sets of Boolean relations for which SAT(S) is in P and proves that all other sets of relations generate an NP-complete problem. A finite set of relations S over the Boolean domain defines a polynomial time computable satisfiability problem if any one of the following conditions holds:

1. all relations which are not constantly false are true when all its arguments are true;
2. all relations which are not constantly false are true when all its arguments are false;
3. all relations are equivalent to a conjunction of binary clauses;
4. all relations are equivalent to a conjunction of Horn clauses;
5. all relations are equivalent to a conjunction of dual-Horn clauses;
6. all relations are equivalent to a conjunction of affine formulae. [2]

Otherwise, the problem SAT(S) is NP-complete.

## Modern presentation

A modern, streamlined presentation of Schaefer's theorem is given in an expository paper by Hubie Chen.[3][4] In modern terms, the problem SAT(S) is viewed as a constraint satisfaction problem over the Boolean domain. In this area, it is standard to denote the set of relations by Γ and the decision problem defined by Γ as CSP(Γ).

This modern understanding uses algebra, in particular, universal algebra. For Schaefer's dichotomy theorem, the most important concept in universal algebra is that of a polymorphism. An operation $f : D^m \to D$ is a polymorphism of a relation $R \subseteq D^k$ if, for any choice of m tuples $(t_{11}, \dotsc, t_{1k}), \dotsc, (t_{m1}, \dotsc, t_{mk})$ from R, it holds that the tuple obtained from these m tuples by applying f coordinate-wise, i.e. $(f(t_{11}, \dotsc, t_{m1}), \dotsc, f(t_{1k}, \dotsc, t_{mk}))$, is in R. That is, an operation f is a polymorphism of R if R is closed under f: applying f to any tuples in R yields another tuple inside R. A set of relations Γ is said to have a polymorphism f if every relation in Γ has f as a polymorphism. This definition allows for the algebraic formulation of Schaefer's dichotomy theorem.

Let Γ be a finite constraint language over the Boolean domain. The problem CSP(Γ) is decidable in polynomial-time if Γ has one of the following six operations as a polymorphism:

1. the constant unary operation 0;
2. the constant unary operation 1;
3. the binary AND operation ∧;
4. the binary OR operation ∨;
5. the ternary majority operation $\operatorname{Majority}(x,y,z) = (x \wedge y) \vee (x \wedge z) \vee (y \wedge z);$
6. the ternary minority operation $\operatorname{Minority}(x,y,z) = x \oplus y \oplus z.$

Otherwise, the problem CSP(Γ) is NP-complete.

In this formulation, it is easy to check if any of the tractability conditions hold.

## Properties of Polymorphisms

Given a set Γ of relations, there is a surprisingly close connection between its polymorphisms and the computational complexity of CSP(Γ).

A relation R is called primitive positive definable, or short pp-definable, from a set Γ of relations if R(v1, ... , vk) ⇔ ∃x1 ... xm. C holds for some conjunction C of constraints from Γ and equations over the variables {v1,...,vk, x1,...,xm}. For example, if Γ consists of the ternary relation nae(x,y,z) holding if x,y,z are not all equal, and R(x,y,z) is xyz, then R can be pp-defined by R(x,y,z) ⇔ ∃a. nae(0,x,a) ∧ nae(y,za); this reduction has been used to prove that NAE-3SAT is NP-complete. The set of all relations which are pp-definable from Γ is denoted by ≪Γ≫. If Γ' ⊆ ≪Γ≫ for some finite constraint sets Γ and Γ', then CSP(Γ') reduces to CSP(Γ).[5]

Given a set Γ of relations, Pol(Γ) denotes the set of polymorphisms of Γ. Conversely, if O is a set of operations, then Inv(O) denotes the set of relations having all operations in O as a polymorphism. Pol and Inv together build a Galois connection. For any finite set Γ of relations over a finite domain, ≪Γ≫ = Inv(Pol(Γ)) holds, that is, the set of relations pp-definable from Γ can be derived from the polymorphisms of Γ.[6] Moreover, if Pol(Γ) ⊆ Pol(Γ') for two finite relation sets Γ and Γ', then Γ' ⊆ ≪Γ≫ and CSP(Γ') reduces to CSP(Γ). As a consequence, two relation sets having the same polymorphisms lead to the same computational complexity.[7]

## Generalizations

The analysis was later fine-tuned: CSP(Γ) is either solvable in co-NLOGTIME, L-complete, NL-complete, ⊕L-complete, P-complete or NP-complete and given Γ, one can decide in polynomial time which of these cases holds.[8]

Schaefer's dichotomy theorem was recently generalized to a larger class of relations.[9]

## Related work

If the problem is to count the number of solutions, which is denoted by #CSP(Γ), then a similar result by Creignou and Hermann holds.[10] Let Γ be a finite constraint language over the Boolean domain. The problem #CSP(Γ) is computable in polynomial time if Γ has a Mal'tsev operation as a polymorphism. Otherwise, the problem #CSP(Γ) is #P-complete. A Mal'tsev operation m is a ternary operation that satisfies $m(x,y,y) = m(y,y,x) = x.$ An example of a Mal'tsev operation is the Minority operation given in the modern, algebraic formulation of Schaefer's dichotomy theorem above. Thus, when Γ has the Minority operation as a polymorphism, it is not only possible to decide CSP(Γ) in polynomial time, but to compute #CSP(Γ) in polynomial time. Other examples of Mal'tsev operations include $x - y + z$ and $x y^{-1} z.$

For larger domains, even for a domain of size three, the existence of a Maltsev polymorphism for Γ is no longer a sufficient condition for the tractability of #CSP(Γ). However, the absence of a Mal'tsev polymorphism for Γ still implies the #P-hardness of #CSP(Γ).