# Vigenère cipher

The Vigenère cipher (French pronunciation: ​[viʒnɛːʁ]) is a method of encrypting alphabetic text by using a series of interwoven Caesar ciphers, based on the letters of a keyword. It employs a form of polyalphabetic substitution.

First described by Giovan Battista Bellaso in 1553, the cipher is easy to understand and implement, but it resisted all attempts to break it until 1863, three centuries later. This earned it the description le chiffre indéchiffrable (French for 'the indecipherable cipher'). Many people have tried to implement encryption schemes that are essentially Vigenère ciphers. In 1863, Friedrich Kasiski was the first to publish a general method of deciphering Vigenère ciphers.

In the 19th century the scheme was misattributed to Blaise de Vigenère (1523–1596), and so acquired its present name.

## History

The first well-documented description of a polyalphabetic cipher was by Leon Battista Alberti around 1467 and used a metal cipher disk to switch between cipher alphabets. Alberti's system only switched alphabets after several words, and switches were indicated by writing the letter of the corresponding alphabet in the ciphertext. Later, Johannes Trithemius, in his work Polygraphiae (which was completed in manuscript form in 1508 but first published in 1518), invented the tabula recta, a critical component of the Vigenère cipher. The Trithemius cipher, however, provided a progressive, rather rigid and predictable system for switching between cipher alphabets.[note 1]

In 1586 Blaise de Vigenère published a type of polyalphabetic cipher called an autokey cipher – because its key is based on the original plaintext – before the court of Henry III of France.. The cipher now known as the Vigenère cipher, however, is that originally described by Giovan Battista Bellaso in his 1553 book La cifra del Sig. Giovan Battista Bellaso. He built upon the tabula recta of Trithemius but added a repeating "countersign" (a key) to switch cipher alphabets every letter. Whereas Alberti and Trithemius used a fixed pattern of substitutions, Bellaso's scheme meant the pattern of substitutions could be easily changed, simply by selecting a new key. Keys were typically single words or short phrases, known to both parties in advance, or transmitted "out of band" along with the message. Bellaso's method thus required strong security for only the key. As it is relatively easy to secure a short key phrase, such as by a previous private conversation, Bellaso's system was considerably more secure.[citation needed]

In the 19th century, the invention of Bellaso's cipher was misattributed to Vigenère. David Kahn, in his book, The Codebreakers lamented this misattribution, saying that history had "ignored this important contribution and instead named a regressive and elementary cipher for him [Vigenère] though he had nothing to do with it".

The Vigenère cipher gained a reputation for being exceptionally strong. Noted author and mathematician Charles Lutwidge Dodgson (Lewis Carroll) called the Vigenère cipher unbreakable in his 1868 piece "The Alphabet Cipher" in a children's magazine. In 1917, Scientific American described the Vigenère cipher as "impossible of translation". That reputation was not deserved. Charles Babbage is known to have broken a variant of the cipher as early as 1854 but did not publish his work. Kasiski entirely broke the cipher and published the technique in the 19th century, but even in the 16th century, some skilled cryptanalysts could occasionally break the cipher.

The Vigenère cipher is simple enough to be a field cipher if it is used in conjunction with cipher disks. The Confederate States of America, for example, used a brass cipher disk to implement the Vigenère cipher during the American Civil War. The Confederacy's messages were far from secret, and the Union regularly cracked its messages. Throughout the war, the Confederate leadership primarily relied upon three key phrases: "Manchester Bluff", "Complete Victory" and, as the war came to a close, "Come Retribution".

A Vernam cipher whose key is as long as the message becomes a one-time pad, a theoretically unbreakable cipher. Gilbert Vernam tried to repair the broken cipher (creating the Vernam–Vigenère cipher in 1918), but the technology he used was so cumbersome as to be impracticable.

## Description

In a Caesar cipher, each letter of the alphabet is shifted along some number of places. For example, in a Caesar cipher of shift 3, A would become D, B would become E, Y would become B and so on. The Vigenère cipher has several Caesar ciphers in sequence with different shift values.

To encrypt, a table of alphabets can be used, termed a tabula recta, Vigenère square or Vigenère table. It has the alphabet written out 26 times in different rows, each alphabet shifted cyclically to the left compared to the previous alphabet, corresponding to the 26 possible Caesar ciphers. At different points in the encryption process, the cipher uses a different alphabet from one of the rows. The alphabet used at each point depends on a repeating keyword.[citation needed]

For example, suppose that the plaintext to be encrypted is

ATTACKATDAWN.

The person sending the message chooses a keyword and repeats it until it matches the length of the plaintext, for example, the keyword "LEMON":

LEMONLEMONLE

Each row starts with a key letter. The rest of the row holds the letters A to Z (in shifted order). Although there are 26 key rows shown, a code will use only as many keys (different alphabets) as there are unique letters in the key string, here just 5 keys: {L, E, M, O, N}. For successive letters of the message, successive letters of the key string will be taken and each message letter enciphered by using its corresponding key row. The next letter of the key is chosen, and that row is gone along to find the column heading that matches the message character. The letter at the intersection of [key-row, msg-col] is the enciphered letter.

For example, the first letter of the plaintext, A, is paired with L, the first letter of the key. Therefore, row L and column A of the Vigenère square are used, namely L. Similarly, for the second letter of the plaintext, the second letter of the key is used. The letter at row E and column T is X. The rest of the plaintext is enciphered in a similar fashion:

 Plaintext: ATTACKATDAWN Key: LEMONLEMONLE Ciphertext: LXFOPVEFRNHR

Decryption is performed by going to the row in the table corresponding to the key, finding the position of the ciphertext letter in that row and then using the column's label as the plaintext. For example, in row L (from LEMON), the ciphertext L appears in column A, which is the first plaintext letter. Next, in row E (from LEMON), the ciphertext X is located in column T. Thus T is the second plaintext letter.

## Algebraic description

Vigenère can also be described algebraically. If the letters AZ are taken to be the numbers 0–25 ($A\,{\widehat {=}}\,0$ , $B\,{\widehat {=}}\,1$ , etc.), and addition is performed modulo 26, Vigenère encryption $E$ using the key $K$ can be written as

$C_{i}=E_{K}(M_{i})=(M_{i}+K_{i}){\bmod {2}}6$ and decryption $D$ using the key $K$ as

$M_{i}=D_{K}(C_{i})=(C_{i}-K_{i}+26){\bmod {2}}6,$ in which $M=M_{1}\dots M_{n}$ is the message, $C=C_{1}\dots C_{n}$ is the ciphertext and $K=K_{1}\dots K_{n}$ is the key obtained by repeating the keyword $\lceil n/m\rceil$ times in which $m$ is the keyword length.

Thus, by using the previous example, to encrypt $A\,{\widehat {=}}\,0$ with key letter $L\,{\widehat {=}}\,11$ the calculation would result in $11\,{\widehat {=}}\,L$ .

$11=(0+11){\bmod {2}}6$ Therefore, to decrypt $R\,{\widehat {=}}\,17$ with key letter $E\,{\widehat {=}}\,4$ , the calculation would result in $13\,{\widehat {=}}\,N$ .

$13=(17-4){\bmod {2}}6$ In general, if $\Sigma$ is the alphabet of length $\ell$ , and $m$ is the length of key, Vigenère encryption and decryption can be written:

$C_{i}=E_{K}(M_{i})=(M_{i}+K_{(i{\bmod {m}})}){\bmod {\ell }},$ $M_{i}=D_{K}(C_{i})=(C_{i}-K_{(i{\bmod {m}})}){\bmod {\ell }}.$ $M_{i}$ denotes the offset of the i-th character of the plaintext $M$ in the alphabet $\Sigma$ . For example, by taking the 26 English characters as the alphabet $\Sigma =(A,B,C,\ldots ,X,Y,Z)$ , the offset of A is 0, the offset of B is 1 etc. $C_{i}$ and $K_{i}$ are similar.

## Cryptanalysis

The idea behind the Vigenère cipher, like all other polyalphabetic ciphers, is to disguise the plaintext letter frequency to interfere with a straightforward application of frequency analysis. For instance, if P is the most frequent letter in a ciphertext whose plaintext is in English, one might suspect that P corresponds to E since E is the most frequently used letter in English. However, by using the Vigenère cipher, E can be enciphered as different ciphertext letters at different points in the message, which defeats simple frequency analysis.

The primary weakness of the Vigenère cipher is the repeating nature of its key. If a cryptanalyst correctly guesses the key's length, the cipher text can be treated as interwoven Caesar ciphers, which can easily be broken individually. The Kasiski examination and Friedman test can help to determine the key length (see below: § Kasiski examination and § Friedman test).

### Kasiski examination

In 1863, Friedrich Kasiski was the first to publish a successful general attack on the Vigenère cipher. Earlier attacks relied on knowledge of the plaintext or the use of a recognizable word as a key. Kasiski's method had no such dependencies. Although Kasiski was the first to publish an account of the attack, it is clear that others had been aware of it. In 1854, Charles Babbage was goaded into breaking the Vigenère cipher when John Hall Brock Thwaites submitted a "new" cipher to the Journal of the Society of the Arts. When Babbage showed that Thwaites' cipher was essentially just another recreation of the Vigenère cipher, Thwaites presented a challenge to Babbage: given an original text (from Shakespeare's The Tempest : Act 1, Scene 2) and its enciphered version, he was to find the key words that Thwaites had used to encipher the original text. Babbage soon found the key words: "two" and "combined". Babbage then enciphered the same passage from Shakespeare using different key words and challenged Thwaites to find Babbage's key words. Babbage never explained the method that he used. Studies of Babbage's notes reveal that he had used the method later published by Kasiski and suggest that he had been using the method as early as 1846.

The Kasiski examination, also called the Kasiski test, takes advantage of the fact that repeated words are, by chance, sometimes encrypted using the same key letters, leading to repeated groups in the ciphertext. For example, consider the following encryption using the keyword ABCD:

Key:        ABCDABCDABCDABCDABCDABCDABCD
Plaintext:  CRYPTOISSHORTFORCRYPTOGRAPHY
Ciphertext: CSASTPKVSIQUTGQUCSASTPIUAQJB


There is an easily noticed repetition in the ciphertext, and so the Kasiski test will be effective.

The distance between the repetitions of CSASTP is 16. If it is assumed that the repeated segments represent the same plaintext segments, that implies that the key is 16, 8, 4, 2, or 1 characters long. (All factors of the distance are possible key lengths; a key of length one is just a simple Caesar cipher, and its cryptanalysis is much easier.) Since key lengths 2 and 1 are unrealistically short, one needs to try only lengths 16, 8 or 4. Longer messages make the test more accurate because they usually contain more repeated ciphertext segments. The following ciphertext has two segments that are repeated:

Ciphertext: VHVSSPQUCEMRVBVBBBVHVSURQGIBDUGRNICJQUCERVUAXSSR


The distance between the repetitions of VHVS is 18. If it is assumed that the repeated segments represent the same plaintext segments, that implies that the key is 18, 9, 6, 3, 2 or 1 character long. The distance between the repetitions of QUCE is 30 characters. That means that the key length could be 30, 15, 10, 6, 5, 3, 2 or 1 character long. By taking the intersection of those sets, one could safely conclude that the most likely key length is 6 since 3, 2, and 1 are unrealistically short.

### Friedman test

The Friedman test (sometimes known as the kappa test) was invented during the 1920s by William F. Friedman, who used the index of coincidence, which measures the unevenness of the cipher letter frequencies to break the cipher. By knowing the probability $\kappa _{p}$ that any two randomly chosen source language letters are the same (around 0.067 for monocase English) and the probability of a coincidence for a uniform random selection from the alphabet $\kappa _{r}$ (1/26 = 0.0385 for English), the key length can be estimated as the following:

${\kappa _{p}-\kappa _{r}} \over {\kappa _{o}-\kappa _{r}}$ from the observed coincidence rate

$\kappa _{o}={\frac {\sum _{i=1}^{c}n_{i}(n_{i}-1)}{N(N-1)}}$ in which c is the size of the alphabet (26 for English), N is the length of the text and n1 to nc are the observed ciphertext letter frequencies, as integers.

That is, however, only an approximation; its accuracy increases with the size of the text. It would, in practice, be necessary to try various key lengths that are close to the estimate. A better approach for repeating-key ciphers is to copy the ciphertext into rows of a matrix with as many columns as an assumed key length and then to compute the average index of coincidence with each column considered separately. When that is done for each possible key length, the highest average I.C. then corresponds to the most-likely key length. Such tests may be supplemented by information from the Kasiski examination.

### Frequency analysis

Once the length of the key is known, the ciphertext can be rewritten into that many columns, with each column corresponding to a single letter of the key. Each column consists of plaintext that has been encrypted by a single Caesar cipher. The Caesar key (shift) is just the letter of the Vigenère key that was used for that column. Using methods similar to those used to break the Caesar cipher, the letters in the ciphertext can be discovered.

An improvement to the Kasiski examination, known as Kerckhoffs' method, matches each column's letter frequencies to shifted plaintext frequencies to discover the key letter (Caesar shift) for that column. Once every letter in the key is known, all the cryptanalyst has to do is to decrypt the ciphertext and reveal the plaintext. Kerckhoffs' method is not applicable if the Vigenère table has been scrambled, rather than using normal alphabetic sequences, but Kasiski examination and coincidence tests can still be used to determine key length.

### Key elimination

The Vigenère cipher, with normal alphabets, essentially uses modulo arithmetic, which is commutative. Therefore, if the key length is known (or guessed), subtracting the cipher text from itself, offset by the key length, will produce the plain text encrypted with itself. If any "probable word" in the plain text is known or can be guessed, its self-encryption can be recognized, which allows recovery of the key by subtracting the known plaintext from the cipher text. Key elimination is especially useful against short messages.

## Variants

The running key variant of the Vigenère cipher was also considered unbreakable at one time. This version uses as the key a block of text as long as the plaintext. Since the key is as long as the message, the Friedman and Kasiski tests no longer work, as the key is not repeated.

If multiple keys are used, the effective key length is the least common multiple of the lengths of the individual keys. For example, using the two keys GO and CAT, whose lengths are 2 and 3, one obtains an effective key length of 6 (the least common multiple of 2 and 3). This can be understood as the point where both keys line up.

 Plaintext: ATTACKATDAWN Key 1: GOGOGOGOGOGO Key 2: CATCATCATCAT Ciphertext: IHSQIRIHCQCU

Encrypting twice, first with the key GO and then with the key CAT is the same as encrypting once with a key produced by encrypting one key with the other.

 Plaintext: GOGOGO Key: CATCAT Ciphertext: IOZQGH

This is proven by encrypting ATTACKATDAWN with IOZQGH, to produce the same ciphertext as in the original example.

 Plaintext: ATTACKATDAWN Key: IOZQGHIOZQGH Ciphertext: IHSQIRIHCQCU

If key lengths are relatively prime, the effective key length grows exponentially as the individual key lengths are increased. This is especially true if each key length is individually prime. For example, the effective length of keys 2, 3, and 5 characters is 30, but that of keys of 7, 11, and 13 characters is 1,001. If this effective key length is longer than the ciphertext, it achieves the same immunity to the Friedman and Kasiski tests as the running key variant.

If one uses a key that is truly random, is at least as long as the encrypted message, and is used only once, the Vigenère cipher is theoretically unbreakable. However, in that case, the key, not the cipher, provides cryptographic strength, and such systems are properly referred to collectively as one-time pad systems, irrespective of the ciphers employed.

Vigenère actually invented a stronger cipher, an autokey cipher. The name "Vigenère cipher" became associated with a simpler polyalphabetic cipher instead. In fact, the two ciphers were often confused, and both were sometimes called le chiffre indéchiffrable. Babbage actually broke the much-stronger autokey cipher, but Kasiski is generally credited with the first published solution to the fixed-key polyalphabetic ciphers.

A simple variant is to encrypt by using the Vigenère decryption method and to decrypt by using Vigenère encryption. That method is sometimes referred to as "Variant Beaufort". It is different from the Beaufort cipher, created by Francis Beaufort, which is similar to Vigenère but uses a slightly modified enciphering mechanism and tableau. The Beaufort cipher is a reciprocal cipher.

Despite the Vigenère cipher's apparent strength, it never became widely used throughout Europe. The Gronsfeld cipher is a variant created by Count Gronsfeld (Josse Maximilaan van Gronsveld né van Bronckhorst); it is identical to the Vigenère cipher except that it uses just 10 different cipher alphabets, corresponding to the digits 0 to 9). A Gronsfeld key of 0123 is the same as a Vigenere key of ABCD. The Gronsfeld cipher is strengthened because its key is not a word, but it is weakened because it has just 10 cipher alphabets. It is Gronsfeld's cipher that became widely used throughout Germany and Europe, despite its weaknesses.