Exact cover

Given a universe U of elements and a collection ${\mathcal {S}}$ of sets, an exact cover is a subcollection ${\mathcal {S}}^{*}$ of ${\mathcal {S}}$ such that every element in U is contained in exactly one set in ${\mathcal {S}}^{*}$ .

In other words, an exact cover is a subcollection ${\mathcal {S}}^{*}$ of ${\mathcal {S}}$ that is a partition of U: The sets in ${\mathcal {S}}^{*}$ are pairwise disjoint and their union is U.

Given a universe U of elements and a collection ${\mathcal {S}}$ of sets, the exact cover problem is to find an exact cover ${\mathcal {S}}^{*}$ or else determine there is none.

The exact cover problem is an example of a constraint satisfaction problem.

Example 1

Let U = {0, 1, 2, 3, 4, ...} be the universe of natural numbers and let ${\mathcal {S}}$ = {E, O, P} be a collection of three sets of natural numbers:

the even numbers E = {0, 2, 4, 6, 8, ...};
the odd numbers O = {1, 3, 5, 7, 9, ...}; and
the prime numbers P = {2, 3, 5, 7, ...}.

Then the subcollection ${\mathcal {S}}^{*}$ = {E, O} is an exact cover, since each natural number is either even or odd, but not both.

However, {E, P} is not an exact cover, since some natural numbers are contained in no set: For example, 1 is neither even nor prime. Moreover, some natural numbers are contained in more than one set: For example, 2 is both even and prime.

Example 2

Let U = {1, 2, 3, 4, 5, 6, 7} be a universe of 7 elements and let ${\mathcal {S}}$ = {A, B, C, D, E, F} be a collection of 6 sets:

A = {1, 4, 7};
B = {1, 4};
C = {4, 5, 7};
D = {3, 5, 6};
E = {2, 3, 6, 7}; and
F = {2, 7}.

Then the subcollection ${\mathcal {S}}^{*}$ = {B, D, F} is an exact cover, since every element is contained in exactly one of the sets B = {1, 4}, D = {3, 5, 6}, or F = {2, 7}.

Moreover, {B, D, F} is the only exact cover, as the following argument demonstrates: An exact cover must cover 1 and A and B are the only sets containing 1, thus an exact cover must contain A or B, but not both. If an exact cover contains A, then it doesn't contain B, C, E, or F, as each of these sets have at least one element in common with A. Then D is the only set left to be part of an exact cover, but the collection {A, D} doesn't cover the element 2. In conclusion, there is no exact cover containing A. On the other hand, if an exact cover contains B, then it doesn't include A or C, as each of these sets have at least one element in common with B. Then D is the only remaining set that contains 5, so D must be part of the exact cover. If an exact cover contains D, then it doesn't contain E, as D and E have the elements 3 and 6 in common. Then F is the only set left to be part of the exact cover, and the collection {B, D, F} is indeed an exact cover. See the example in the article on Knuth's Algorithm X for a matrix-based version of this argument.

Matrix representation

If ${\mathcal {S}}$ = {S₁, S₂, ..., S_m} is a finite collection of m sets and U = {x₁, x₂, ..., x_n} is a finite universe of n elements, then the exact cover problem can be represented as an m×n matrix containing only 0s and 1s. The i-th row of the matrix represents the i-th set S_i and the j-th column of the matrix represents the j-th element x_j. The entry of the matrix in the i-th row and j-th column is 1 if the set S_i contains the element x_j; the entry is 0 otherwise.

Given such a matrix, an exact cover is a selection of rows such that every column has a 1 in exactly one of the selected rows.

In other words, an exact cover is a selection of rows that sum to a row containing all 1s.

Example 3

If in Example 2 above we represent the 6 sets in ${\mathcal {S}}$ = {A, B, C, D, E, F} as rows and the 7 elements in U = {1, 2, 3, 4, 5, 6, 7} as columns, then the exact cover problem is represented by the 6×7 matrix:

	1	2	3	4	5	6	7
A	1	0	0	1	0	0	1
B	1	0	0	1	0	0	0
C	0	0	0	1	1	0	1
D	0	0	1	0	1	1	0
E	0	1	1	0	0	1	1
F	0	1	0	0	0	0	1

As in Example 2, the selection ${\mathcal {S}}^{*}$ = {B, D, F} of rows is a solution to this exact cover problem:

	1	2	3	4	5	6	7
B	1	0	0	1	0	0	0
D	0	0	1	0	1	1	0
F	0	1	0	0	0	0	1

Generalizations

Although the standard exact cover problem involves a collection ${\mathcal {S}}$ of sets containing elements in a universe U, the structure of the problem does not depend on the presence of sets containing elements. The same structure arises whenever there are two sets with some binary relation between them.

Given a set X, a set Y, and a binary relation R between X and Y, a (generalized) exact cover is a subset X* of X such that every element of Y is R^-1-related to exactly one element of X*. Here R^-1 is the converse of R.

In general, R^-1 restricted to Y×X* is a function, i.e., every element Y is R^-1-related to exactly than one element of X*: the unique element of X* that "covers" the particular element of Y.

Moreover, if R is left-total, i.e., if every element of X is R-related to at least one element of Y, then R^-1 restricted to Y×X* is a surjective function. (The condition in the generalized problem that R is left-total corresponds to the condition in the standard problem that every set in ${\mathcal {S}}$ is non-empty. Obviously, it makes no difference whether or not an empty set is part of a cover.)

In the matrix representation of the generalized problem, the relation R is represented by a matrix, the elements of X are represented by rows of the matrix, and the elements of Y are represented by columns of the matrix. Then, as in the standard problem, a (generalized) exact cover is a selection of rows such that every column has a 1 in exactly one of the selected rows.

The standard problem is a special case of the generalized problem in which the set X is a collection ${\mathcal {S}}$ of sets, the set Y is a universe U of elements, and the binary relation R is the relation "contains" between sets and elements. Then R^-1 restricted to Y×X* is a function from elements to sets specifying the unique set in the cover that contains the specified element. Moreover, if R is left-total, i.e., if no set is empty, then R^-1 restricted to Y×X* is a surjective function, i.e., every set in the cover contains at least one element.

Example 4

The matrix in Example 3 above can be reinterpreted to represent the binary relation "is contained in" between the set X = {1, 2, 3, 4, 5, 6} of 6 elements and the collection Y = {A, B, C, D, E, F, G} of 7 sets:

A = {1, 2}
B = {5, 6}
C = {4, 5}
D = {1, 2, 3}
E = {3, 4}
F = {4, 5}
G = {1, 3, 5, 6}

The same matrix representation as in Example 3 applies here, but it is labeled differently:

	A	B	C	D	E	F	G
1	1	0	0	1	0	0	1
2	1	0	0	1	0	0	0
3	0	0	0	1	1	0	1
4	0	0	1	0	1	1	0
5	0	1	1	0	0	1	1
6	0	1	0	0	0	0	1

As in Example 3, a solution is to select the 2nd, 4th and 6th rows of the matrix:

	A	B	C	D	E	F	G
2	1	0	0	1	0	0	0
4	0	0	1	0	1	1	0
6	0	1	0	0	0	0	1

But unlike in Example 3, the interpretation here is that the solution is the set of elements X* = {2, 4, 6} such that every set in Y contains exactly one element in X*:

A = {1, 2}
B = {5, 6}
C = {4, 5}
D = {1, 2, 3}
E = {3, 4}
F = {4, 5}
G = {1, 3, 5, 6}

In other words, the solution is an exact hitting set. Indeed, the exact cover and the exact hitting set problems are dual to each other, i.e., essentially the same.

Finding solutions

Knuth's Algorithm X is a recursive, nondeterministic, depth-first, brute-force algorithm that finds all solutions to the exact cover problem.

Dancing Links, commonly known as DLX, is the technique suggested by Knuth to efficiently implement his Algorithm X on a computer. Dancing Links uses the matrix representation of the problem. Dancing Links implements the matrix as a series of doubly-linked lists of the 1s of the matrix: each 1 element has a link to the next 1 above, below, to the left, and to the right of itself.

The exact cover problem is known to be NP-complete.

Noteworthy examples

Many problems can be reduced to exact cover problems, which then can be solved with techniques like Dancing Links.

For instance, the problem of tiling a board with pentominoes, the eight queens problem, and solving Sudoku can all be viewed as exact cover problems and solved with Dancing Links.

Pentamino tiling

(This subsection should be fleshed out.)

The problem of tiling a board with pentominoes is an example of an exact cover problem.

For example, see Solving Pentomino Puzzles with Backtracking.

Eight queens problem

(This subsection should be fleshed out.)

The eight queens problem is an example of an exact cover problem.

Sudoku

The problem in Sudoku is to assign numbers (or digits, values, symbols) to cells (or squares) in a grid so as to satisfy certain constraints.

In the standard 9×9 Sudoku variant, there are four kinds of constraints:

Row-Column: Each intersection of a row and column, i.e, each cell, must contain exactly one number.

Row-Number: Each row must contain each number exactly once

Column-Number: Each column must contain each number exactly once.

Box-Number: Each box must contain each number exactly once.

While the first constraint might seem trivial, it is nevertheless needed to ensure there is only one number per cell.

Solving Sudoku is an exact cover problem.

More precisely, solving Sudoku is an exact hitting set problem, which is equivalent to an exact cover problem (as in Example 4 above), when viewed as a problem to select possibilities such that every constraint set contains (i.e., is hit by) exactly one selected possibility. In the notation above for the (generalized) exact cover problem, X is the set of possibilities, Y is a set of contraint sets, and R is the binary relation "is contained in."

Each possible assignment of a particular number to a particular cell is a possibility (or candidate). When Sudoku is played with pencil and paper, possibilities are often called pencil marks.

In the standard 9×9 Sudoku variant, in which each of 9×9 cells is assigned one of 9 numbers, there are 9×9×9=729 possibilities. Using obvious notation for rows, columns and numbers, the possibilities can be labeled

R1C1#1, R1C1#2, …, R9C9#9.

The fact that each kind of constraint involves exactly one of something is what makes Sudoku an exact hitting set problem. The constraints can be represented by constraint sets. The problem is to select possibilities such that every constraint set contains (i.e., is hit by) exactly one selected possibility.

In the standard 9×9 Sudoku variant, there are four kinds of constraints sets corresponding to the four kinds of constraints:

Row-Column: A row-column constraint set contains all the possibilities for the intersection of a particular row and column, i.e., for a cell. For example, the constraint set for row 1 and column 1, which can be labeled R1C1, contains the 9 possibilities for row 1 and column 1 but different numbers:

R1C1 = { R1C1#1, R1C1#2, R1C1#3, R1C1#4, R1C1#5, R1C1#6, R1C1#7, R1C1#8, R1C1#9 }.

Row-Number: A row-number constraint set contains all the possibilities for a particular row and number. For example, the constraint set for row 1 and number 1, which can be labeled R1#1, contains the 9 possibilities for row 1 and number 1 but different columns:

R1#1 = { R1C1#1, R1C2#1, R1C3#1, R1C4#1, R1C5#1, R1C6#1, R1C7#1, R1C8#1, R1C9#1 }.

Column-Number: A column-number constraint set contains all the possibilities for a particular column and number. For example, the constraint set for column 1 and number 1, which can be labeled C1#1, contains the 9 possibilities for column 1 and number 1 but different rows:

C1#1 = { R1C1#1, R2C1#1, R3C1#1, R4C1#1, R5C1#1, R6C1#1, R7C1#1, R8C1#1, R9C1#1 }.

Box-Number: A box-number constraint set contains all the possibilities for a particular box and number. For example, the constraint set for box 1 (in the upper lefthand corner) and number 1, which can be labeled B1#1, contains the 9 possibilities for the cells in box 1 and number 1:

B1#1 = { R1C1#1, R1C2#1, R1C3#1, R2C1#1, R2C2#1, R2C3#1, R3C1#1, R3C2#1, R3C3#1 }.

Since there are 9 rows, 9 columns, 9 boxes and 9 numbers, there are 9×9=81 row-column constraint sets, 9×9=81 row-number constraint sets, 9×9=81 column-number constraint sets, and 9×9=81 box-number constraint sets: 81+81+81+81=324 constraint sets in all.

In brief, the standard 9×9 Suduko variant is an exact hitting set problem with 729 possibilities and 324 constraint sets. Thus the problem can be represented by a 729×324 matrix.

Although it is difficult to present the entire 729×324 matrix, the general nature of the matrix can be seen from several snapshots:

**Row-Column Constraints**
	R1 C1	R1 C2	…
R1C1#1	1	0	…
R1C1#2	1	0	…
R1C1#3	1	0	…
R1C1#4	1	0	…
R1C1#5	1	0	…
R1C1#6	1	0	…
R1C1#7	1	0	…
R1C1#8	1	0	…
R1C1#9	1	0	…
R1C2#1	0	1	…
R1C2#2	0	1	…
R1C2#3	0	1	…
R1C2#4	0	1	…
R1C2#5	0	1	…
R1C2#6	0	1	…
R1C2#7	0	1	…
R1C2#8	0	1	…
R1C2#9	0	1	…
…	…	…	…

**Row-Number Constraints**
	R1 #1	R1 #2	…
R1C1#1	1	0	…
R1C1#2	0	1	…
…	…	…	…
R1C2#1	1	0	…
R1C2#2	0	1	…
…	…	…	…
R1C3#1	1	0	…
R1C3#2	0	1	…
…	…	…	…
R1C4#1	1	0	…
R1C4#2	0	1	…
…	…	…	…
R1C5#1	1	0	…
R1C5#2	0	1	…
…	…	…	…
R1C6#1	1	0	…
R1C6#2	0	1	…
…	…	…	…
R1C7#1	1	0	…
R1C7#2	0	1	…
…	…	…	…
R1C8#1	1	0	…
R1C8#2	0	1	…
…	…	…	…
R1C9#1	1	0	…
R1C9#2	0	1	…
…	…	…	…

**Column-Number Constraints**
	C1 #1	C1 #2	…
R1C1#1	1	0	…
R1C1#2	0	1	…
…	…	…	…
R2C1#1	1	0	…
R2C1#2	0	1	…
…	…	…	…
R3C1#1	1	0	…
R3C1#2	0	1	…
…	…	…	…
R4C1#1	1	0	…
R4C1#2	0	1	…
…	…	…	…
R5C1#1	1	0	…
R5C1#2	0	1	…
…	…	…	…
R6C1#1	1	0	…
R6C1#2	0	1	…
…	…	…	…
R7C1#1	1	0	…
R7C1#2	0	1	…
…	…	…	…
R8C1#1	1	0	…
R8C1#2	0	1	…
…	…	…	…
R9C1#1	1	0	…
R9C1#2	0	1	…
…	…	…	…

**Box-Number Constraints**
	B1 #1	B1 #2	…
R1C1#1	1	0	…
R1C1#2	0	1	…
…	…	…	…
R1C2#1	1	0	…
R1C2#2	0	1	…
…	…	…	…
R1C3#1	1	0	…
R1C3#2	0	1	…
…	…	…	…
R2C1#1	1	0	…
R2C1#2	0	1	…
…	…	…	…
R2C2#1	1	0	…
R2C2#2	0	1	…
…	…	…	…
R2C3#1	1	0	…
R2C3#2	0	1	…
…	…	…	…
R3C1#1	1	0	…
R3C1#2	0	1	…
…	…	…	…
R3C2#1	1	0	…
R3C2#2	0	1	…
…	…	…	…
R3C3#1	1	0	…
R3C3#2	0	1	…
…	…	…	…

The complete 729×324 matrix is available from Bob Hanson.

Note that the set of possibilities RxCy#z can be arranged as a 9×9×9 cube in a 3-dimensional space with coordinates x, y, and z. Then each row Rx, column Cy, or number #z is a 9×9×1 "slice" of possibilities; each box Bw is a 9x3×3 "tube" of possibilities; each row-column constraint set RxCy, row-number constraint set Rx#z, or column-number constraint set Cy#z is a 9x1×1 "strip" of possibilities; each box-number constraint set Bw#z is a 3x3×1 "square" of possibilities; and each possibility RxCy#z is a 1x1×1 "cubie" consisting of a single possibility. Moreover, each constraint set or possibility is the intersection of the component sets. For example, R1C2#3 = R1 · C2 · #3, where "·" denotes set intersection.

Although other Sudoku variations have different numbers of rows, columns, numbers and/or different kinds of constraints, they all involve possibilities and constraint sets, and thus can be seen as exact hitting set problems.

External links

Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman. ISBN 0-7167-1045-5. A3.1: SP2: EXACT COVER BY 3-SETS (X3C), p. 221.
Karl Dahlke. "NP Complete, Exact Cover". Math Reference Project. Retrieved 2006-06-27.