Endgame tablebase

In chess, an endgame tablebase, or simply a tablebase, is a database of all possible endgame positions with small groups of material. Each position is conclusively determined as a win, loss, or draw for the player whose turn it is to move. If the result is decisive, the tablebase gives the number of moves until checkmate is forced.

A tablebase acts as an oracle, providing perfect play for both White and Black. If the position leads to a decisive result, the winning side always plays the move that produces the quickest victory, while the losing opponent delays his loss for as long as possible.

Chess tablebases contain at least three pieces — the two kings, plus a pawn, rook, or queen — because this minimum is necessary for the possibility of producing checkmate. The maximum number of pieces is limited by the constraints of computer technology. As of 2006, all endgames with up to six pieces (including the two kings) have been completely analyzed. The tablebases for all positions with up to 6 pieces, except 5 pieces against 1, are available for free download and can also be queried on the web (see the External Links section below). Some seven-piece tablebases have been produced, and research on other seven-piece tablebases is in progress. A full set of seven-piece tablebases is not realistically expected until 2015 or later.

Definition and Etymology

The most common usage of the word "tablebase" occurs in the context of chess endgames. The word is not found in dictionaries as of 2006, although it has permanently entered the lexicon of computer chess.

According to Guy Haworth, the word "tablebase" was first used in connection with chess endgames in 1995 in an article in the ICCA Journal (EG 137:151, see page 3 at this pdf link: [1]).

What distinguishes a tablebase from other databases is the totality of the information it contains. Theoretically, it is possible to add new data to most databases; but a tablebase, by definition, contains a complete set of information.

There have been several other names for endgame tablebases:

Endgame Database. This is the oldest name, and it has been used since 1985 if not earlier. Examples of its usage appear in article titles and references in the International Computer Chess Association (ICCA) Journal, later renamed the International Computer Games Association (ICGA) Journal ([2]). "Endgame Database" was also used in How Computers Play Chess (see reference below).
Endgame Table (EGT). This term was preferred by Guy Haworth in a short communication in the magazine EG (see above), and he has used it consistently. It was used three article titles in 2000 and 2001 in the ICGA journal (see the pdf document at this link).
Oracle Database (ODB). This term was preferred in 2000 by John Roycroft and six other composers in an article in EG (137:9) (see page 9 at this link: [3]). In a list of definitions, he wrote, "odb - otherwise known as total information database or tablebase."

History of tablebases

The concept of a solved game is not unique to chess. Some simple games, such as Tic Tac Toe and Connect Four, have been solved entirely as wins or draws with perfect play. Other games, such as chess (from the starting position) and Go, cannot be solved currently because their game complexity is too vast to allow calculation for all possible positions. Instead, these more complex games can be partially solved by reducing the size of the board, or the number of pieces, or both.

Computer chess programs have existed since the 1940s. Indeed, Alan Turing used a primitive algorithm to evaluate and select chess moves by adding values for material and mobility into a calculator, without writing a formal computer program Template:Ref harvard. Conventional chess-playing computers look at the existing position on the board, and evaluate the results of each possible move by analyzing forward into the future. The computer evaluates a chess move using different criteria than humans would use. However, humans and computers share the basic common denominator that they both think forward from the position that is on the board at that particular moment.

With improvements to computer hardware and software in the 1970s, it became possible for computers to address simple endgame positions using retrograde analysis. This method is completely opposite the standard method of evaluation. Instead of thinking forward from the position on the board, the computer thinks backward from the terminal position (win or draw) that it seeks to achieve. This requires the computer to generate every possible position with a given set of material, and to link all moves in a decisive position to the mate result.

A German programmer named Thomas Strohlein is considered to have initiated the database technique in 1970, according to Tim Krabbe ^[1] and others. Ken Thompson developed four- and five-piece tablebases. Lewis Stiller published a thesis with research on some six-piece tablebase endgames in 1995 [4]. Nalimov tablebases, which are a popular format, are named for Eugene Nalimov. Recent contributors to the field are Eiko Bleicher and the collaboration of Marc Bourzutschky and Yakov Konoval.

Generating tablebases

Step 1: Generating all possible positions

David Levy, How Computers Play Chess

	a	b	c	d	e	f	g	h
8									8
7									7
6									6
5									5
4									4
3									3
2									2
1									1
	a	b	c	d	e	f	g	h

The ten unique squares (with symmetry)

First it is necessary to generate all the possible positions, with a given material, without evaluating them. For the endgame of king and queen versus king, there are about 40,000 unique legal positions. The number is given by David Levy, and it comes from a symmetry argument.

The Black king can be placed on any of ten squares: a1, b1, c1, d1, b2, c2, d2, c3, d3, and d4 (see diagram; also see algebraic notation). On any other square, its position can be considered equivalent by symmetry of rotation or reflection. Thus, there is no difference whether a Black king in a corner resides on a1, a8, h8, or h1. Multiply this number of 10 by at most 64 squares for placing the White king and then by at most 64 squares for the White queen. The product 10×64×64 = 40,960. Several hundred of these positions are illegal, impossible, or merely symmetrical reflections of each other, so the actual number is somewhat smaller Template:Ref harvard.

For each position, the tablebase evaluates the situation separately for White-to-move (wtm) and Black-to-move (btm). The vast majority of positions are wins for the player with the queen, with checkmate forced in not more than ten moves. Some are draws because of stalemate or the unavoidable loss of the queen.

Each additional piece added to a pawnless endgame multiplies the number of unique positions by about a factor of sixty - the approximate number of squares not already occupied by other pieces. Also, endgames with one or more pawns increase the complexity because the aforementioned symmetry argument no longer works. Since the pawn can move only forward and not sideways, and since pawns can only exist on 48 of the 64 squares, there is no vertical symmetry. Thus (Ka1, Pb4) and (Ka8, Pb5) are two very different positions, even though (Ka1, Qb4) and (Ka8, Qb5) are essentially identical. However, there still is horizontal symmetry: (Ka1, Pb4) is symmetrically equivalent to (Kh1, Pg4).

Step 2: Evaluating positions using retrograde analysis

Figure 1

	a	b	c	d	e	f	g	h
8									8
7									7
6									6
5									5
4									4
3									3
2									2
1									1
	a	b	c	d	e	f	g	h

White to move: mate in 3 ply (Kc6)

Figure 2

	a	b	c	d	e	f	g	h
8									8
7									7
6									6
5									5
4									4
3									3
2									2
1									1
	a	b	c	d	e	f	g	h

Black to move: mate in 2 ply (Kd8 or Kb8)

Krabbe explains the process of generating a tablebase as follows:

"The idea is that a database is made with all possible positions with a given material [note: as in the preceding section]. Then a subdatabase is made of all positions where Black is mate. Then one where White can give mate. Then one where Black cannot stop White giving mate next move. Then one where White can always reach a position where Black cannot stop him from giving mate next move. And so on, always a ply further away from mate until all positions that are thus connected to mate have been found. Then all of these positions are linked back to mate by the shortest path through the database. That means that, apart from 'equi-optimal' moves, all the moves in such a path are perfect: White's move always leads to the quickest mate, Black's move always leads to the slowest mate."

It is worthwhile to illustrate the idea of retrograde analysis with reference to the diagram at the upper left (Figure 1). White mates in two moves with 1. Kc6 (see algebraic notation), leading to the position at the upper right (Figure 2). Then if 1...Kb8 2. Qb7, and if 1...Kd8 2. Qd7, with checkmate resulting in both cases.

Figure 3

	a	b	c	d	e	f	g	h
8									8
7									7
6									6
5									5
4									4
3									3
2									2
1									1
	a	b	c	d	e	f	g	h

White to move: mate in 1 ply (Qd7)

The position at left (Figure 3), before White's second move, is defined as "mate in one ply." Then the position after White's first move (Figure 2) is "mate in two ply," regardless of how Black plays. Finally, the initial position (Figure 1) is "mate in three ply" (i.e., two moves) because after 1. Kc6 the position in Figure 2 is already known as a "mate in two ply." This process, which links a current position to another position that could have existed one ply earlier, can continue indefinitely.

Each position is thus evaluated as a win or loss in a certain number of moves. At the end of the retrograde analysis, positions which are not included as wins are necessarily draws.

Step 3: Verification

After the tablebase has been generated, and every position has been evaluated, the result must be verified independently. The verification process should use a different algorithm than the original generation. The purpose is to check the self-consistency of the tablebase results.

For example, in Figure 1 above, the verification program sees the evaluation "mate in three ply (Kc6)." It then looks at the position in Figure 2, after Kc6, and sees the evaluation "mate in two ply." These two evaluations are consistent with each other. If the evaluation of Figure 2 were "draw," that would be inconsistent with Figure 1, and would thus indicate a mistake.

Captures, pawn promotion, and special moves

A 4-man tablebase must rely on three-piece tablebases that could result if one of the pieces is captured. Similarly, a tablebase containing a pawn must be able to rely on other tablebases that deal with the new set of material after pawn promotion to a queen or other piece. The analysis process is made somewhat more difficult since the previous move from a given position might have been a capture or a pawn promotion.

Tablebases assume that castling is not possible because, in practical endgames, this assumption is almost always correct. (However, castling is allowed by convention in composed problems and studies.)

There is another reason to exclude castling: it is not apparent whether castling is possible in a given position, because the king or rook may have moved earlier in the game. Thus, for each position with king and rook on their original squares, it would be necessary to make separate evaluations for states in which castling is or is not possible.

The same concern exists for the special en passant capture, since the possibility of en passant depends on the opponent's previous move. However, practical applications of en passant occur frequently in pawn endgames, so tablebases account for the possibility of en passant for positions where both sides have at least one pawn.

Metrics: Distance to Conversion and Distance to Mate

A tablebase must be calibrated to a metric of optimality - in other words, it must define at what point a player has "won" the game. There are two intuitive choices for this definition:

Checkmate is the only way to win. This metric is called distance to mate (DTM).
Checkmate or a decisive gain of material are both ways to win. This metric is called distance to conversion (DTC).
Guy Haworth has referred to other metrics, such as DTZ and DTR, in order to account for the fifty move rule. These other metrics are not in common usage.

Generally, DTC requires fewer moves than DTM only in endgames where the stronger player stands to gain by capturing material. In three-man endgames, and in the unusual endgame of two knights versus one pawn, DTC = DTM.

Applications of Tablebases

Practical Play (Humans)

In correspondence chess, a player may consult a chess computer for assistance, provided that the etiquette of the competition allows this. A six-piece tablebase was used to analyze the endgame that occurred in the correspondence game Kasparov versus The World.

It should be noted that standard tablebases do not account for the fifty-move rule, which dictates that a game is drawn if either player claims correctly that there has been no pawn move and no piece capture in the last fifty moves. (The purpose of this rule is to prevent deadlocked games from continuing endlessly.) In games where the fifty-move rule is in effect, it is necessary to evaluate whether the theoretical victory can be consummated without overstepping the 50-move rule. Some tablebase wins last more than 200 moves without a piece capture or a pawn move. This is because tablebases are of theoretical interest, and have applications beyond practical play.

Practical Play (Computers)

Other things being equal, the knowledge contained in tablebases gives chess computers a tremendous advantage in the endgame, save the caveat in the preceding paragraph. Not only can computers play perfectly within an endgame, but also they can simplify to a winning tablebase position from a more complicated endgame.

However, some computer chess experts have observed other practical drawbacks. A computer in a difficult position might avoid the losing side of a tablebase ending, even if it is almost impossible for the opponent to win the ending without himself knowing the tablebase. The adverse effect could be a premature resignation, or an inferior line of play that loses with less resistance than a tablebase might offer.

Another drawback is that tablebases require a lot of memory to store the many thousands of positions. The Nalimov tablebases, which use state-of-the-art compression techniques, require 7.05 GB of hard disk space for all five-piece endings. The six-piece endings require approximately 1.2 terabytes. Nalimov seven-piece tablebases require more hard drive storage capacity and RAM to operate than will be practical to use for the foreseeable future, but other tablebase formats that are far less resource-hungry are in development.

Some computers play better overall if their memory is devoted instead to the ordinary evalution function. Also, modern computers analyze far enough ahead conventionally to handle the elementary endgames without needing tablebases. It is only for more sophisticated positions that the tablebase will improve significantly over the ordinary computer.

Endgame Theory

Tablebases have answered longstanding questions about whether certain combinations of material are wins or draws. For example KBBKN (two White bishops versus one Black knight) was long believed a draw, although John Roycroft discussed the question in multiple issues of his endgame magazine EG. Tablebase proved that, surprisingly, the player with the bishops can eventually force the capture of the knight, after which the checkmate is elementary.

Lewis Stiller, 1991

	a	b	c	d	e	f	g	h
8									8
7									7
6									6
5									5
4									4
3									3
2									2
1									1
	a	b	c	d	e	f	g	h

White mates in 262 moves

The theory of the endgame of KRBKNN (rook plus bishop versus two knights) and other pawnless endings was considered speculatively by Freidrich Ludwig Amelung, according to Lewis Stiller's thesis (p. 81). Stiller generated a tablebase for this endgame and found that it is a draw for most positions, but that many are wins.

An extreme case among all six-men endgames is the diagram position (KRNKNN), where White wins starting with 1. Ke6! Stiller's tablebase operated on the DTC metric (see above), sohe published the position as a win in 246 moves. Later, DTM tablebases revealed that it was mate in 262 moves.

Another example of an endgame tablebase adding to theory is the entirely impractical KNNNNKQ (four knights versus queen). Again this is drawn in most positions, but in a particular position, the knights can win in 85 moves (^[2]). This set of material had already appeared in the published work of the renowned endgame composer Alexei Troitsky. Troitsky also devoted significant effort to elucidating the unusual but practically relevant ending of KNNKP (two knights versus a pawn). If the pawn can be stopped no farther than the Troitsky line, the knights can win, as the tablebases confirm.

The current record for longest tablebase win is 517 moves DTC in KQNKRBN, as published by Bourzutschky and Konoval in May 2006 (http://www.xs4all.nl/~timkr/chess2/diarytxt.htm see item 316). Bourzutschky wrote, "This was a big surprise for us and is a great tribute to the complexity of chess."

In August 2006, Bourzutschky posted an update about his research into 7-men endgames. He released initial results for KQQPKQQ, KRRPKRR, and KBBPKNN.[5]

Endgame Studies

Since many composed endgame studies deal with positions that exist in tablebases, their soundness can be checked using the tablebases. Some studies have been cooked, i.e. proven unsound, by the tablebases. That can be either because the composer's solution does not work, or else because there is an equally effective alternative that the composer did not consider.

Harold van der Heijden, 2001

	a	b	c	d	e	f	g	h
8									8
7									7
6									6
5									5
4									4
3									3
2									2
1									1
	a	b	c	d	e	f	g	h

White to play and draw

While tablebases have cooked some studies, they have assisted in the creation of other studies. Composers can search tablebases for interesting positions, such as zugzwang, using a method called data mining.

There has been some controversy whether to allow endgame studies composed with tablebase assistance into composing tourneys. In multiple issues of EG, Roycroft has argued that a tablebase composition is not really human, so a human should not be rewarded for its composition. Mark Dvoretsky, in an 2006 article on chesscafe.com, took a more permissive stance.

Dvoretsky was referring to a study by Harold van der Heijden, published in 2001, which reached the position at right after three moves ^[3]. The drawing move for White is 4. Kb4!! (and not 4. Kb5), for reasons beyond the scope of this article.

Dvoretsky comments

"Here, we should touch on one delicate question. I am sure that this unique endgame position was discovered with the help of [Ken] Thompson’s famous computer database. Is this a ‘flaw,’ diminishing the composer’s achievement?

"Yes, the computer database is an instrument, available to anyone nowadays. Out of it, no doubt, we could probably extract yet more unique positions – there are some chess composers who do so regularly. The standard for evaluation here should be the result achieved. Thus: miracles, based upon complex computer analysis rather than on their content of sharp ideas, are probably of interest only
to certain aesthetes."

The "miracles" Dvoretsky refers to are presumably positions such as Stiller's 262-move win, which is not amenable to human understanding.

Philosophical Implications

The commentator Krabbe has written, in reference to Stiller's long wins:

"A grandmaster wouldn't be better at these endgames than someone who had learned chess yesterday. It's a sort of chess that has nothing to do with chess, a chess that we could never have imagined without computers. The Stiller moves are awesome, almost scary, because you know they are the truth, God's Algorithm - it's like being revealed the Meaning of Life, but you don't understand one word."

The metaphor draws on the omniscience of God, who is presumed to know all information, even when humans can know only limited information.

Notes

References

^{Levy and Newborn 1991} David Levy and Monty Newborn (1991). How Computers Play Chess. Computer Science Press. ISBN 0-7167-8121-2.

External links

Guide to the use of Computer Chess Endgame Tablebases by Aaron Tay
Downloading tablebases
- Distribution site for tablebases up to six pieces
- 3-4-5 pieces on Robert Hyatt's FTP site
Querying tablebases on the web
- Web query server for Nalimov tablebases by Eiko Bleicher (up to 6 pieces)
- A mirror of Bleicher's tablebase server (up to 6 pieces)
- Web query server for Nalimov tablebases by Lokasoft (up to 5 pieces)
Maximal positions - i.e. longest distance-to-mate positions for endgames with 3-4-5 pieces
More maximal positions, including 6-men tablebases, compiled by Kirill Kryukov
John Tamplin's Endgame Tablebase - Contains distance-to-conversion and distance-to-mate web query server. Also contains a list of all known positions of mutual zugzwang for 3-4-5 piece endgames, and for 6-piece endgames without pawns.

[1] ttp://www.xs4all.nl/~timkr/chess/perfect.htm

[2] ttp://web.quick.cz/EVCOMP/tablebase.htm#7menfirst

[3] ttp://www.chesscafe.com/text/dvoretsky70.pdf (see pages 3-5)

[1]

[2]

[3]