Mathematics of Sudoku
Sudoku puzzles can be studied mathematically to answer questions such as "How many filled Sudoku grids are there?", "What is the minimal number of clues in a valid puzzle?" and "In what ways can Sudoku grids be symmetric?" through the use of combinatorics and group theory.
The main results are that for the classical Sudoku the number of filled grids is 6,670,903,752,021,072,936,960 (6.67×1021), which reduces to 5,472,730,538 essentially different groups under the validity preserving transformations. There are 26 types of symmetry, but they can only be found in about 0.005% of all filled grids. A puzzle with a unique solution must have at least 17 clues, and there is a solvable puzzle with at most 21 clues for every solved grid. The largest minimal puzzle found so far has 40 clues.
Similar results are known for variants and smaller grids. No exact results are known for Sudokus larger than the classical 9×9 grid, although there are estimates which are believed to be fairly accurate.
The analysis of Sudoku falls into two main areas:
- analyzing the properties of completed grids
- analyzing the properties of completed puzzles.
Initial analysis was largely focused on enumerating solutions, with results first appearing in 2004.
There are many Sudoku variants, partially characterized by size (N), and the shape of their regions. Unless noted, discussion in this article assumes classic Sudoku, i.e. N=9 (a 9×9 grid and 3×3 regions). A rectangular Sudoku uses rectangular regions of row-column dimension R×C. Other variants include those with irregularly-shaped regions or with additional constraints (hypercube) or different constraint types (Samunamupure).
Regions are also called blocks or boxes. A band is a part of the grid that encapsulates 3 rows and 3 boxes, and a stack is a part of the grid that encapsulates 3 columns and 3 boxes. A puzzle is a partially completed grid, and the initial values are givens or clues. A proper puzzle has a unique solution. A minimal puzzle is a proper puzzle from which no clue can be removed without introducing additional solutions. See Glossary of Sudoku for other terminology.
Solving Sudokus from the viewpoint of a player has been explored in Denis Berthier's book "The Hidden Logic of Sudoku" (2007) which considers strategies such as "hidden xy-chains".
A puzzle can be expressed as a graph coloring problem. The aim is to construct a 9-coloring of a particular graph, given a partial 9-coloring. The Sudoku graph has 81 vertices, one vertex for each cell. The vertices are labeled with ordered pairs (x, y), where x and y are integers between 1 and 9. In this case, two distinct vertices labeled by (x, y) and (x′, y′) are joined by an edge if and only if:
- x = x′ (same column) or,
- y = y′ (same row) or,
- ⌈ x/3 ⌉ = ⌈ x′/3 ⌉ and ⌈ y/3 ⌉ = ⌈ y′/3 ⌉ (same 3×3 cell)
The puzzle is then completed by assigning an integer between 1 and 9 to each vertex, in such a way that vertices that are joined by an edge do not have the same integer assigned to them.
Sudokus from group tables
As in the case of Latin squares the (addition- or) multiplication tables (Cayley tables) of finite groups can be used to construct Sudokus and related tables of numbers. Namely, one has to take subgroups and quotient groups into account:
Take for example the group of pairs, adding each component separately modulo some . By omitting one of the components, we suddenly find ourselves in (and this mapping is obviously compatible with the respective additions, i.e. it is a group homomorphism). One also says that the latter is a quotient group of the former, because some once different elements become equal in the new group. However, it is also a subgroup, because we can simply fill the missing component with to get back to .
Under this view, we write down the example, Grid 1, for .
Each Sudoku region looks the same on the second component (namely like the subgroup ), because these are added regardless of the first one. On the other hand, the first components are equal in each block, and if we imagine each block as one cell, these first components show the same pattern (namely the quotient group ). As outlined in the article of Latin squares, this is a Latin square of order .
Now, to yield a Sudoku, let us permute the rows (or equivalently the columns) in such a way, that each block is redistributed exactly once into each block – for example order them . This of course preserves the Latin square property. Furthermore, in each block the lines have distinct first component by construction and each line in a block has distinct entries via the second component, because the blocks' second components originally formed a Latin square of order (from the subgroup ). Thus we arrive at a Sudoku (rename the pairs to numbers 1...9 if you wish). With the example and the row permutation above, we arrive at Grid 2.
For this method to work, one generally does not need a product of two equally-sized groups. A so-called short exact sequence of finite groups of appropriate size already does the job. Try for example the group with quotient- and subgroup . It seems clear (already from enumeration arguments), that not all Sudokus can be generated this way.
A Sudoku whose regions are not (necessarily) square or rectangular is known as a Jigsaw Sudoku. In particular, an N×N square where N is prime can only be tiled with irregular N-ominoes. For small values of N the number of ways to tile the square (excluding symmetries) has been computed (sequence A172477 in the OEIS). For N ≥ 4 some of these tilings are not compatible with any Latin square; i.e. all Sudoku puzzles on such a tiling have no solution.
The answer to the question 'How many Sudoku grids are there?' depends on the definition of when similar solutions are considered different.
For the enumeration of all possible solutions, two solutions are considered distinct if any of their corresponding (81) cell values differ. Symmetry relations between similar solutions are ignored., e.g. the rotations of a solution are considered distinct. Symmetries play a significant role in the enumeration strategy, but not in the count of all possible solutions.
The first known solution to complete enumeration was posted by QSCGZ (Guenter Stertenbrink) to the rec.puzzles newsgroup in 2003, obtaining 6,670,903,752,021,072,936,960 (6.67×1021) distinct solutions.
In a 2005 study, Felgenhauer and Jarvis analyzed the permutations of the top band used in valid solutions. Once the Band1 symmetries and equivalence classes for the partial grid solutions were identified, the completions of the lower two bands were constructed and counted for each equivalence class. Summing completions over the equivalence classes, weighted by class size, gives the total number of solutions as 6,670,903,752,021,072,936,960, confirming the value obtained by QSCGZ. The value was subsequently confirmed numerous times independently. A second enumeration technique based on band generation was later developed that is significantly less computationally intensive. This subsequent technique resulted in roughly needing 1/97 of the number of computation cycles as the original techniques, but was significantly more complicated to set up.
Essentially different solutions
The sudoku symmetry group
The precise structure of the sudoku symmetry group can be expressed succinctly using the wreath product (≀). The possible row (or column) permutations form a group isomorphic to S3 ≀ S3 of order 3!4 = 1,296. The whole rearrangement group is formed by letting the transposition operation (isomorphic to C2) act on two copies of that group, one for the row permutations and one for the column permutations. This is S3 ≀ S3 ≀ C2, a group of order 1,2962 × 2 = 3,359,232. Finally, the relabelling operations commute with the rearrangement operations, so the full sudoku (VPT) group is (S3 ≀ S3 ≀ C2) × S9 of order 1,218,998,108,160.
Fixed points and Burnside's lemma
The set of equivalent grids which can be reached using these operations (excluding relabeling) forms an orbit of grids under the action of the rearrangement group. The number of essentially different solutions is then the number of orbits, which can be computed using Burnside's lemma. The Burnside fixed points are grids that either do not change under the rearrangement operation or only differ by relabeling. To simplify the calculation the elements of the rearrangement group are sorted into conjugacy classes, whose elements all have the same number of fixed points. It turns out only 27 of the 275 conjugacy classes of the rearrangement group have fixed points; these conjugacy classes represent the different types of symmetry (self-similarity or automorphism) that can be found in completed sudoku grids. Using this technique, Ed Russell and Frazer Jarvis were the first to compute the number of essentially different sudoku solutions as 5,472,730,538.
Minimum number of givens
Ordinary Sudokus (proper puzzles) have a unique solution. A minimal Sudoku is a Sudoku from which no clue can be removed leaving it a proper Sudoku. Different minimal Sudokus can have a different number of clues. This section discusses the minimum number of givens for proper puzzles.
Many Sudokus have been found with 17 clues, although finding them is not a trivial task. A paper by Gary McGuire, Bastian Tugemann, and Gilles Civario, released on 1 January 2012, explains how it was proved through an exhaustive computer search that the minimum number of clues in any proper Sudoku is 17.
A Sudoku with 24 clues, dihedral symmetry (symmetry on both orthogonal axis, 90° rotational symmetry, 180° rotational symmetry, and diagonal symmetry) is known to exist. It is not known if this number of clues is minimal for this class of Sudoku. The fewest clues in a Sudoku with two-way diagonal symmetry is believed to be 18, and in at least one case such a Sudoku also exhibits automorphism.
Sudokus of other sizes
Number of minimal puzzles
The number of minimal Sudokus (Sudokus in which no clue can be deleted without losing uniqueness of the solution) is not precisely known. However, statistical techniques combined with a generator ('Unbiased Statistics of a CSP – A Controlled-Bias Generator'), show that there are approximately (with 0.065% relative error):
- 3.10 × 1037 minimal puzzles,
- 2.55 × 1025 non-essentially-equivalent minimal puzzles.
- Combinatorial explosion (with summary of grid count of Sudoku compared to Latin squares)
- Dancing Links
- Glossary of Sudoku
- Sudoku code
- Lin, Keh Ying (2004), "Number of Sudokus", Journal of Recreational Mathematics, 33 (2): 120–24.
- Berthier, Denis (November 2007). The Hidden Logic of Sudoku (Second, revised and extended ed.). Lulu.com. ISBN 978-1847534729.
- Yato, Takayuki; Seta, Takahiro (2003). "Complexity and Completeness of Finding Another Solution and Its Application to Puzzles". IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences. E86-A (5): 1052–1060.
- Lewis, R. A Guide to Graph Colouring: Algorithms and Applications. Springer International Publishers, 2015.
- de Ruiter, Johan (15 March 2010). "On Jigsaw Sudoku Puzzles and Related Topics (Bachelor Thesis)" (PDF). Leiden Institute of Advanced Computer Science (LIACS).
- QSCGZ (Guenter Stertenbrink) (21 September 2003). "combinatorial question on 9x9 (rec.puzzles)". Google Discussiegroepen. Retrieved 20 October 2013.
- Jarvis, Frazer (2 February 2008). "Sudoku enumeration problems". Frazer Jarvis's home page. University of Sheffield. Retrieved 20 October 2013.
- Felgenhauer, Bertram; Jarvis, Frazer (20 June 2005), Enumerating possible Sudoku grids (PDF).
- G. McGuire, B. Tugemann, G. Civario. "There is no 16-Clue Sudoku: Solving the Sudoku Minimum Number of Clues Problem". Arxiv.org.
- Ed Russell and Frazer Jarvis (7 September 2005). "There are 5472730538 essentially different Sudoku grids... and the Sudoku symmetry group". Frazer Jarvis's home page. University of Sheffield. Retrieved 20 October 2013.
- Ed Russell, Frazer Jarvis (25 January 2006). "Mathematics of Sudoku II" (PDF).
- "Symmetrical 17 Clue Puzzle" Symmetrical 17 Clue Puzzle.
- "Raphael – 18 Clue Symmetrical" Raphael – an 18 clue Sudoku with orthogonal symmetry.
- "Total symmetry" Total symmetry – a 24 clue Sudoku with total symmetry.
- "Tourmaline – 19 Clue Two-Way Symmetry" Tourmaline – a 19 clue Sudoku with two-way orthogonal symmetry.
- "プログラミングパズル雑談コーナー". .ic-net.or.jp. Archived from the original on 12 October 2016. Retrieved 20 October 2013.
- "Minimum Sudoku". Csse.uwa.edu.au. Archived from the original on 26 November 2006. Retrieved 20 October 2013.
- Yirka, Bob (6 January 2012). "Mathematicians Use Computer to Solve Minimum Sudoku Solution Problem". PhysOrg. Retrieved 6 January 2012.
- McGuire, Gary (1 January 2012). "There is no 16-Clue Sudoku: Solving the Sudoku Minimum Number of Clues Problem". arXiv:1201.0749.
- Taalman, Laura (2007), "Taking Sudoku seriously", Math Horizons, 15 (1): 5–9, doi:10.1080/10724117.2007.11974720, JSTOR 25678701, S2CID 126371771. See in particular Figure 7, p. 7.
-  Minimal number of clues for Sudokus
- Berthier, Denis (4 December 2009). "Unbiased Statistics of a CSP – A Controlled-Bias Generator". Retrieved 4 December 2009.
- Rosenhouse, Jason; Taalman, Laura (2011). Taking Sudoku Seriously: The math behind the world's most popular pencil puzzle. Oxford University Press.
- V. Elser's difference-map algorithm also solves Sudoku
- Sudoku Puzzle — an Exercise in Constraint Programming and Visual Prolog 7 by Carsten Kehler Holst (in Visual Prolog)
- Sudoku Squares and Chromatic Polynomials by Herzberg and Murty, treats Sudoku puzzles as vertex coloring problems in graph theory.