Vigenère cipher

The Vigenère cipher is named for Blaise de Vigenère (pictured). Although Giovan Battista Bellaso had invented the cipher earlier, Vigenère developed a stronger autokey cipher.
A reproduction of the Confederacy's cipher disk used in the American Civil War on display in the National Cryptologic Museum

The Vigenère cipher (French pronunciation: ​[viʒnɛːʁ]) is a method of encrypting alphabetic text by using a series of interwoven Caesar ciphers, based on the letters of a keyword. It is a form of polyalphabetic substitution.[1][2]

The cipher is easy to understand and implement, but it resisted all attempts to break it for three centuries, which earned it the description le chiffre indéchiffrable (French for 'the indecipherable cipher'). Many people have tried to implement encryption schemes that are essentially Vigenère ciphers.[3] In 1863, Friedrich Kasiski was the first to publish a general method of deciphering Vigenère ciphers.

The Vigenère cipher was originally described by Giovan Battista Bellaso in his 1553 book La cifra del. Sig. Giovan Battista Bellaso, but the scheme was later misattributed to Blaise de Vigenère (1523 – 1596) in the 19th century and so acquired its present name.[citation needed]

History

The first well-documented description of a polyalphabetic cipher was formulated by Leon Battista Alberti around 1467 and used a metal cipher disc to switch between cipher alphabets. Alberti's system only switched alphabets after several words, and switches were indicated by writing the letter of the corresponding alphabet in the ciphertext. Later, in 1508, Johannes Trithemius, in his work Poligraphia, invented the tabula recta, a critical component of the Vigenère cipher. The Trithemius cipher, however, only provided a progressive, rigid and predictable system for switching between cipher alphabets.[citation needed]

What is now known as the Vigenère cipher was originally described by Giovan Battista Bellaso in his 1553 book La cifra del Sig. Giovan Battista Bellaso.[4] He built upon the tabula recta of Trithemius but added a repeating "countersign" (a key) to switch cipher alphabets every letter. Whereas Alberti and Trithemius used a fixed pattern of substitutions, Bellaso's scheme meant the pattern of substitutions could be easily changed, simply by selecting a new key. Keys were typically single words or short phrases, known to both parties in advance, or transmitted "out of band" along with the message. Bellaso's method thus required strong security for only the key. As it is relatively easy to secure a short key phrase, such as by a previous private conversation, Bellaso's system was considerably more secure.[citation needed]

Blaise de Vigenère published his description of a similar but stronger autokey cipher before the court of Henry III of France, in 1586.[5] Later, in the 19th century, the invention of Bellaso's cipher was misattributed to Vigenère. David Kahn, in his book, The Codebreakers lamented the misattribution by saying that history had "ignored this important contribution and instead named a regressive and elementary cipher for him [Vigenère] though he had nothing to do with it".[6]

The Vigenère cipher gained a reputation for being exceptionally strong. Noted author and mathematician Charles Lutwidge Dodgson (Lewis Carroll) called the Vigenère cipher unbreakable in his 1868 piece "The Alphabet Cipher" in a children's magazine. In 1917, Scientific American described the Vigenère cipher as "impossible of translation".[7][8] That reputation was not deserved. Charles Babbage is known to have broken a variant of the cipher as early as 1854 but failed to publish his work.[9] Kasiski entirely broke the cipher and published the technique in the 19th century, but even earlier, some skilled cryptanalysts could occasionally break the cipher in the 16th century.[6]

Cryptographic slide rule used as a calculation aid by the Swiss Army between 1914 and 1940.

The Vigenère cipher is simple enough to be a field cipher if it is used in conjunction with cipher disks.[10] The Confederate States of America, for example, used a brass cipher disk to implement the Vigenère cipher during the American Civil War. The Confederacy's messages were far from secret, and the Union regularly cracked its messages. Throughout the war, the Confederate leadership primarily relied upon three key phrases: "Manchester Bluff", "Complete Victory" and, as the war came to a close, "Come Retribution".[11]

Gilbert Vernam tried to repair the broken cipher (creating the Vernam–Vigenère cipher in 1918), but no matter what he did, the cipher was still vulnerable to cryptanalysis. Vernam's work, however, eventually led to the one-time pad, a theoretically-unbreakable cipher.[12]

Description

The Vigenère square or Vigenère table, also known as the tabula recta, can be used for encryption and decryption.

In a Caesar cipher, each letter of the alphabet is shifted along some number of places. For example, in a Caesar cipher of shift 3, A would become D, B would become E, Y would become B and so on. The Vigenère cipher has several Caesar ciphers in sequence with different shift values.

To encrypt, a table of alphabets can be used, termed a tabula recta, Vigenère square or Vigenère table. It has the alphabet written out 26 times in different rows, each alphabet shifted cyclically to the left compared to the previous alphabet, corresponding to the 26 possible Caesar ciphers. At different points in the encryption process, the cipher uses a different alphabet from one of the rows. The alphabet used at each point depends on a repeating keyword.[citation needed]

For example, suppose that the plaintext to be encrypted is

ATTACKATDAWN.

The person sending the message chooses a keyword and repeats it until it matches the length of the plaintext, for example, the keyword "LEMON":

LEMONLEMONLE

Each row starts with a key letter. The rest of the row holds the letters A to Z (in shifted order). Although there are 26 key rows shown, a code will use only as many keys (different alphabets) as there are unique letters in the key string, here just 5 keys: {L, E, M, O, N}. For successive letters of the message, successive letters of the key string will be taken and each message letter enciphered by using its corresponding key row. The next letter of the key is chosen, and that row is gone along to find the column heading that matches the message character. The letter at the intersection of [key-row, msg-col] is the enciphered letter.

For example, the first letter of the plaintext, A, is paired with L, the first letter of the key. Therefore, row L and column A of the Vigenère square are used, namely L. Similarly, for the second letter of the plaintext, the second letter of the key is used. The letter at row E and column T is X. The rest of the plaintext is enciphered in a similar fashion:

 Plaintext: ATTACKATDAWN Key: LEMONLEMONLE Ciphertext: LXFOPVEFRNHR

Decryption is performed by going to the row in the table corresponding to the key, finding the position of the ciphertext letter in that row and then using the column's label as the plaintext. For example, in row L (from LEMON), the ciphertext L appears in column A, which is the first plaintext letter. Next, row E (from LEMON) is gone to, the ciphertext X is located that is found in column T. Thus T is the second plaintext letter.

Algebraic description

Vigenère can also be described algebraically. If the letters AZ are taken to be the numbers 0–25 (${\displaystyle A{\widehat {=}}0}$, ${\displaystyle B{\widehat {=}}1}$, etc.), and addition is performed modulo 26, Vigenère encryption ${\displaystyle E}$ using the key ${\displaystyle K}$ can be written as

${\displaystyle C_{i}=E_{K}(M_{i})=(M_{i}+K_{i})\mod {26}}$

and decryption ${\displaystyle D}$ using the key ${\displaystyle K}$ as

${\displaystyle M_{i}=D_{K}(C_{i})=(C_{i}-K_{i})\mod {26}}$,

in which ${\displaystyle M=M_{1}\dots M_{n}}$ is the message, ${\displaystyle C=C_{1}\dots C_{n}}$ is the ciphertext and ${\displaystyle K=K_{1}\dots K_{n}}$ is the key obtained by repeating the keyword ${\displaystyle \lceil n/m\rceil }$ times in which ${\displaystyle m}$ is the keyword length.

Thus, by using the previous example, to encrypt ${\displaystyle A{\widehat {=}}0}$ with key letter ${\displaystyle L{\widehat {=}}11}$ the calculation would result in ${\displaystyle 11{\widehat {=}}L}$.

${\displaystyle 11=(0+11)\mod {26}}$

Therefore, to decrypt ${\displaystyle R{\widehat {=}}17}$ with key letter ${\displaystyle E{\widehat {=}}4}$, the calculation would result in ${\displaystyle 13{\widehat {=}}N}$.

${\displaystyle 13=(17-4)\mod {26}}$

In general, if ${\displaystyle \Sigma }$ is the alphabet of length ${\displaystyle \ell }$, and ${\displaystyle m}$ is the length of key, Vigenère encryption and decryption can be written:

${\displaystyle C_{i}=E_{K}(M_{i})=(M_{i}+K_{(i\mod m)})\mod \ell ,}$
${\displaystyle M_{i}=D_{K}(C_{i})=(C_{i}-K_{(i\mod m)})\mod \ell .}$

It should be noted that ${\displaystyle M_{i}}$ denotes the offset of the i-th character of the plaintext ${\displaystyle M}$ in the alphabet ${\displaystyle \Sigma }$. For example, by taking the 26 English characters as the alphabet ${\displaystyle \Sigma =(A,B,C,\cdots ,X,Y,Z)}$, the offset of A is 0, the offset of B is 1 etc. ${\displaystyle C_{i}}$ and ${\displaystyle K_{i}}$ are similar.

Cryptanalysis

The idea behind the Vigenère cipher, like all other polyalphabetic ciphers, is to disguise the plaintext letter frequency to interfere with a straightforward application of frequency analysis. For instance, if P is the most frequent letter in a ciphertext whose plaintext is in English, one might suspect that P corresponds to E since E is the most frequently used letter in English. However, by using the Vigenère cipher, E can be enciphered as different ciphertext letters at different points in the message, which defeats simple frequency analysis.

The primary weakness of the Vigenère cipher is the repeating nature of its key. If a cryptanalyst correctly guesses the key's length, the cipher text can be treated as interwoven Caesar ciphers, which can easily be broken individually. The Kasiski examination and Friedman test can help to determine the key length.

Kasiski examination

In 1863, Friedrich Kasiski was the first to publish a successful general attack on the Vigenère cipher.[13] Earlier attacks relied on knowledge of the plaintext or the use of a recognizable word as a key. Kasiski's method had no such dependencies. Although Kasiski was the first to publish an account of the attack, it is clear that others had been aware of it. In 1854, Charles Babbage was goaded into breaking the Vigenère cipher when John Hall Brock Thwaites submitted a "new" cipher to the Journal of the Society of the Arts.[14][15] When Babbage showed that Thwaites' cipher was essentially just another recreation of the Vigenère cipher, Thwaites presented a challenge to Babbage: given an original text (from Shakespeare's The Tempest : Act 1, Scene 2) and its enciphered version, he was to find the key words that Thwaites had used to encipher the original text. Babbage soon found the key words: "two" and "combined". Babbage then enciphered the same passage from Shakespeare using different key words and challenged Thwaites to find Babbage's key words.[16] Babbage never explained the method that he used. Studies of Babbage's notes reveal that he had used the method later published by Kasiski and suggest that he had been using the method as early as 1846.[17]

The Kasiski examination, also called the Kasiski test, takes advantage of the fact that repeated words are, by chance, sometimes encrypted using the same key letters, leading to repeated groups in the ciphertext. For example, consider the following encryption using the keyword ABCD:

Key:        ABCDABCDABCDABCDABCDABCDABCD
Plaintext:  CRYPTOISSHORTFORCRYPTOGRAPHY
Ciphertext: CSASTPKVSIQUTGQUCSASTPIUAQJB


There is an easily-noticed repetition in the ciphertext and so the Kasiski test will be effective.

The distance between the repetitions of CSASTP is 16. If it is assumed that the repeated segments represent the same plaintext segments, that implies that the key is 16, 8, 4, 2, or 1 characters long. (All factors of the distance are possible key lengths; a key of length one is just a simple Caesar cipher, and its cryptanalysis is much easier.) Since key lengths 2 and 1 are unrealistically short, one needs to try only lengths 16, 8 or 4. Longer messages make the test more accurate because they usually contain more repeated ciphertext segments. The following ciphertext has two segments that are repeated:

Ciphertext: VHVSSPQUCEMRVBVBBBVHVSURQGIBDUGRNICJQUCERVUAXSSR


The distance between the repetitions of VHVS is 18. If it is assumed that the repeated segments represent the same plaintext segments, that implies that the key is 18, 9, 6, 3, 2 or 1 character long. The distance between the repetitions of QUCE is 30 characters. That means that the key length could be 30, 15, 10, 6, 5, 3, 2 or 1 character long. By taking the intersection of those sets, one could safely conclude that the most likely key length is 6 since 3, 2, and 1 are unrealistically short.

Friedman test

The Friedman test (sometimes known as the kappa test) was invented during the 1920s by William F. Friedman, who used the index of coincidence, which measures the unevenness of the cipher letter frequencies to break the cipher. By knowing the probability ${\displaystyle \kappa _{p}}$ that any two randomly-chosen source language letters are the same (around 0.067 for monocase English) and the probability of a coincidence for a uniform random selection from the alphabet ${\displaystyle \kappa _{r}}$ (1/26 = 0.0385 for English), the key length can be estimated as the following:

${\displaystyle {\kappa _{p}-\kappa _{r}} \over {\kappa _{o}-\kappa _{r}}}$

from the observed coincidence rate

${\displaystyle \kappa _{o}={\frac {\sum _{i=1}^{c}n_{i}(n_{i}-1)}{N(N-1)}}}$

in which c is the size of the alphabet (26 for English), N is the length of the text and n1 to nc are the observed ciphertext letter frequencies, as integers.

That is, however, only an approximation ; its accuracy increases with the size of the text. It would, in practice, be necessary to try various key lengths that are close to the estimate.[18] A better approach for repeating-key ciphers is to copy the ciphertext into rows of a matrix with as many columns as an assumed key length and then to compute the average index of coincidence with each column considered separately. When that is done for each possible key length, the highest average I.C. then corresponds to the most-likely key length.[19] Such tests may be supplemented by information from the Kasiski examination.

Frequency analysis

Once the length of the key is known, the ciphertext can be rewritten into that many columns, with each column corresponding to a single letter of the key. Each column consists of plaintext that has been encrypted by a single Caesar cipher. The Caesar key (shift) is just the letter of the Vigenère key that was used for that column. Using methods similar to those used to break the Caesar cipher, the letters in the ciphertext can be discovered.

An improvement to the Kasiski examination, known as Kerckhoffs' method, matches each column's letter frequencies to shifted plaintext frequencies to discover the key letter (Caesar shift) for that column. Once every letter in the key is known, all the cryptanalyst has to do is to decrypt the ciphertext and reveal the plaintext.[20] Kerckhoffs' method is not applicable if the Vigenère table has been scrambled, rather than using normal alphabetic sequences, but Kasiski examination and coincidence tests can still be used to determine key length.

Key elimination

The Vigenère cipher, with normal alphabets, essentially uses modulo arithmetic, which is commutative. Therefore, if the key length is known (or guessed), subtracting the cipher text from itself, offset by the key length, will produce the plain text encrypted with itself. If any "probable word" in the plain text is known or can be guessed, its self-encryption can be recognized, which allows recovery of the key by subtracting the known plaintext from the cipher text. Key elimination is especially useful against short messages.

Variants

Confederate cipher wheel, captured at the surrender of Mobile, Alabama, in May 1865 – National Cryptologic Museum

The running key variant of the Vigenère cipher was also considered unbreakable at one time. This version uses as the key a block of text as long as the plaintext. Since the key is as long as the message, the Friedman and Kasiski tests no longer work, as the key is not repeated. In 1920, Friedman was the first to discover this variant's weaknesses: the cryptanalyst has statistical information about the key (assuming that the block of text is in a known language) and that information will be reflected in the ciphertext.

If it uses a key that is truly random, is at least as long as the encrypted message, and is used only once, the Vigenère cipher is theoretically unbreakable. However, in that case, the key, not the cipher, provides cryptographic strength, and such systems are properly referred to collectively as one-time pad systems, irrespective of the ciphers employed.

Vigenère actually invented a stronger cipher, an autokey cipher. The name "Vigenère cipher" became associated with a simpler polyalphabetic cipher instead. In fact, the two ciphers were often confused, and both were sometimes called le chiffre indéchiffrable. Babbage actually broke the much-stronger autokey cipher, but Kasiski is generally credited with the first published solution to the fixed-key polyalphabetic ciphers.

A simple variant is to encrypt by using the Vigenère decryption method and to decrypt by using Vigenère encryption. That method is sometimes referred to as "Variant Beaufort". It is different from the Beaufort cipher, created by Francis Beaufort, which is similar to Vigenère but uses a slightly-modified enciphering mechanism and tableau. The Beaufort cipher is a reciprocal cipher.

Despite the Vigenère cipher's apparent strength, it never became widely used throughout Europe. The Gronsfeld cipher is a variant created by Count Gronsfeld; it is identical to the Vigenère cipher except that it uses just 10 different cipher alphabets, corresponding to the digits 0 to 9).. The Gronsfeld cipher is strengthened because its key is not a word, but it is weakened because it has just 10 cipher alphabets. It is Gronsfeld's cipher that became widely used throughout Germany and Europe, despite its weaknesses.

References

Citations

1. ^ Bruen, Aiden A. & Forcinito, Mario A. (2011). Cryptography, Information Theory, and Error-Correction: A Handbook for the 21st Century. John Wiley & Sons. p. 21. ISBN 978-1-118-03138-4.
2. ^ Martin, Keith M. (2012). Everyday Cryptography. Oxford University Press. p. 142. ISBN 978-0-19-162588-6.
3. ^ Smith, Laurence D. (1943). "Substitution Ciphers". Cryptography the Science of Secret Writing: The Science of Secret Writing. Dover Publications. p. 81. ISBN 0-486-20247-X.
4. ^ Bellaso, Giovan Battista (1553). La Cifra del Sig. Giovan Battista Belaso … (in Italian). Venice, (Italy). Available at: Museo Galileo (Florence (Firenze), Italy)
5. ^ Vigenère, Blaise de (1586). Traicté des Chiffres, ou Secretes Manieres d'Escrire [Treatise on ciphers, or secret ways of writing] (in French). Paris, France: Abel l'Angelier.
6. ^ a b David, Kahn (1999). "On the Origin of a Species". The Codebreakers: The Story of Secret Writing. Simon & Schuster. ISBN 0-684-83130-9.
7. ^ (Anon.) (27 January 1917). "A new cipher code". Scientific American Supplement. 83 (2143): 61.
8. ^ Knudsen, Lars R. (1998). "Block Ciphers—a survey". In Bart Preneel and Vincent Rijmen. State of the Art in Applied Cryptography: Course on Computer Security and Industrial Cryptograph Leuven Belgium, June 1997 Revised Lectures. Berlin ; London: Springer. p. 29. ISBN 3-540-65474-7.
9. ^ Singh, Simon (1999). "Chapter 2: Le Chiffre Indéchiffrable". The Code Book. Anchor Books, Random House. pp. 63–78. ISBN 0-385-49532-3.
10. ^ Codes, Ciphers, & Codebreaking (The Rise Of Field Ciphers)
11. ^ David, Kahn (1999). "Crises of the Union". The Codebreakers: The Story of Secret Writing. Simon & Schuster. pp. 217–221. ISBN 0-684-83130-9.
12. ^ Stanislaw Jarecki, "Crypto Overview, Perfect Secrecy, One-time Pad", University of California, September 28, 2004, Retrieved November 20, 2016
13. ^ Kasiski, F. W. (1863). Die Geheimschriften und die Dechiffrir-Kunst [Cryptograms and the art of deciphering] (in German). Berlin, (Germany): E.S. Mittler und Sohn.
14. ^ See:
15. ^ Thwaites filed for a patent for his "new" cipher system:
16. ^ See:
17. ^ Franksen, O. I. (1985) Mr. Babbage's Secret: The Tale of a Cipher—and APL. Prentice Hall.
18. ^ Henk C.A. van Tilborg, ed. (2005). Encyclopedia of Cryptography and Security (First ed.). Springer. p. 115. ISBN 0-387-23473-X.
19. ^ Mountjoy, Marjorie (1963). "The Bar Statistics". NSA Technical Journal. VII (2,4). Published in two parts.
20. ^ "Lab exercise: Vigenere, RSA, DES, and Authentication Protocols" (PDF). CS 415: Computer and Network Security. Archived from the original (PDF) on 2011-07-23. Retrieved 2006-11-10.

Sources

• Beutelspacher, Albrecht (1994). "Chapter 2". Cryptology. translation from German by J. Chris Fisher. Washington, DC: Mathematical Association of America. pp. 27–41. ISBN 0-883-85504-6.
• Singh, Simon (1999). "Chapter 2: Le Chiffre Indéchiffrable". The Code Book. Anchor Book, Random House. ISBN 0-385-49532-3.
• Gaines, Helen Fouche (1939). "The Gronsfeld, Porta and Beaufort Ciphers". Cryptanalysis a Study of Ciphers and Their Solutions. Dover Publications. pp. 117–126. ISBN 0-486-20097-3.
• Mendelsohn, Charles J (1940). "Blaise De Vigenere and The 'Chiffre Carre'". Proceedings of the American Philosophical Society. 82 (2).