Cryptanalysis of the Lorenz cipher

From Wikipedia, the free encyclopedia
Timeline of key events
Time                  Event
September 1939        War breaks out in Europe.
Second half of 1940   First non-Morse transmissions intercepted.
June 1941             First experimental SZ40 Tunny link started with alphabetic indicator.
August 1941           Two long messages in depth yielded almost 4000 characters of key.
January 1942          Tunny diagnosed from key.
                      August 1941 traffic read.
July 1942             Turingery method of wheel breaking.
                      Testery established.
                      First reading of up-to-date traffic.
October 1942          Experimental link closed.
                      First two of eventual 26 links started with QEP indicator system.
November 1942         "1 + 2 break in" invented by Bill Tutte.
February 1943         More complex SZ42A introduced.
June 1943             Heath Robinson delivered.
                      Newmanry established.
November 1943         Colossus I working at Dollis Hill prior to delivery to Bletchley Park.
February 1944         First use of Colossus I for a real job.
June 1944             D-day.
                      Colossus II working at Bletchley Park.
                      SZ42B introduced.
May 1945              Victory in Europe.
                      Ten Colossi in use.

Cryptanalysis of the Lorenz cipher was the process that enabled the British to read high-level German army messages during World War II. The British Government Code and Cypher School (GC&CS) at Bletchley Park decrypted many communications between the Oberkommando der Wehrmacht (OKW, German High Command) in Berlin and their army commands throughout occupied Europe, some of which were signed "Adolf Hitler, Fuhrer".[1] These were intercepted non-Morse radio transmissions that had been enciphered by the Lorenz SZ teletypewriter rotor stream cipher attachments. Decrypts of this traffic became an important source of "Ultra" intelligence.[2]

For its high-level secret messages, the German armed services enciphered each character using various online Geheimschreiber (secret writer) stream cipher machines at both ends of a telegraph link using the 5-bit International Telegraphy Alphabet No. 2 (ITA2). These machines were the Lorenz SZ (SZ for Schlüsselzusatz, meaning "cipher attachment") machine for the army, the Siemens and Halske T52 for the air force and the Siemens T43, which was little used and never broken by the Allies.[3][4]

Bletchley Park decrypts of messages enciphered with the Enigma machines revealed that the Germans called one of their wireless teleprinter transmission systems "Sägefisch" (sawfish), which led British cryptographers to refer to encrypted German teleprinter traffic as "Fish".[5] "Tunny" was the name given to the first non-Morse link, and it was subsequently used for the Lorenz SZ machines and the traffic enciphered by them.[6]

As with the entirely separate cryptanalysis of the Enigma, it was German operational shortcomings that allowed the initial diagnosis of the system, and a way into decryption.[7] Unlike Enigma, no physical machine reached Allied hands until the very end of the war in Europe, long after wholesale decryption had been established.[8] Initially, operator errors produced a number of pairs of transmissions sent with the same keys, giving a "depth", which often allowed manual decryption to be achieved. One long depth also allowed the complete logical structure of the machine to be worked out, a quite remarkable cryptanalytical feat on which the subsequent comprehensive decrypting of Tunny messages relied.[9]

When depths became less frequent, decryption was achieved by a combination of manual and automated methods. The first machine to automate part of the decrypting process was called "Heath Robinson" and it was followed by several other "Robinsons". These were, however, slow and unreliable, and were supplemented by the much faster and more flexible "Colossus", the world's first electronic, programmable digital computer, ten of which were in use by the end of the war.[10][11]

Albert W. Small, an American cryptographer from the US Signal Corps who was seconded to Bletchley Park and worked on Tunny, said in his December 1944 report back to Arlington Hall that:

Daily solutions of Fish messages at GC&CS reflect a background of British mathematical genius, superb engineering ability, and solid common sense. Each of these has been a necessary factor. Each could have been overemphasised or underemphasised to the detriment of the solutions; a remarkable fact is that the fusion of the elements has been apparently in perfect proportion. The result is an outstanding contribution to cryptanalytic science.[12]

The German Tunny machines

See also: Lorenz cipher
The Lorenz SZ machines had 12 wheels each with a different number of cams (or "pins").
Wheel number 1 2 3 4 5 6 7 8 9 10 11 12
BP wheel name[13] \psi1 \psi2 \psi3 \psi4 \psi5 \mu37 \mu61 \chi1 \chi2 \chi3 \chi4 \chi5
Number of cams (pins) 43 47 51 53 59 37 61 41 31 29 26 23

The Lorenz SZ cipher attachments implemented a Vernam stream cipher, using a complex array of twelve wheels that delivered what should have been a cryptographically secure pseudorandom number as a key stream. The key stream was combined with the plaintext to produce the ciphertext at the transmitting end using the exclusive or (XOR) function. At the receiving end, an identically configured machine produced the same key stream, which was combined with the ciphertext to produce the plaintext, i.e. the system implemented a symmetric-key algorithm.

The right hand five wheels, the chi (\chi) wheels, changed the five impulses (bits) of the incoming character, advancing one position every time. The left hand five, the psi (\psi) wheels, further changed the result of the chi transform, but they did not always move on with each new character.

The central two mu (\mu) or "motor" wheels determined whether or not the psi wheels rotated with a new character.[14][15] The SZ42A and SZ42B machines had a more complex arrangement for advancing the psi wheels than the original SZ40.

Each wheel had a number of cams that could be set in one of two positions. The numbers of cams on the set of wheels were co-prime with each other, giving an extremely long period before the key sequence repeated. The process of working out which of the 501 cams were in the raised position was called "wheel breaking" at Bletchley Park.[16] Deriving the start positions of the wheels for a particular transmission was termed "wheel setting" or simply "setting". The fact that the psi wheels all moved together, but not with every input character, was a major weakness of the machines that led to cryptanalytical success.
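The wheel arrangement described above can be sketched as a toy key-stream generator. This is only an illustration under stated assumptions: the class name ToyTunny, the helper names and the random cam patterns are hypothetical, and the exact motor rule used here (\mu61 steps with every character, \mu37 steps when the active \mu61 cam is raised, and the psi wheels step together when the active \mu37 cam is raised) is an assumed simplification rather than something stated in the text.

```python
import random

CHI_SIZES = (41, 31, 29, 26, 23)
PSI_SIZES = (43, 47, 51, 53, 59)
MU61_SIZE, MU37_SIZE = 61, 37


def random_wheel(size, rng):
    """Hypothetical cam pattern: 1 = raised cam (x), 0 = lowered cam."""
    return [rng.randint(0, 1) for _ in range(size)]


class ToyTunny:
    """Toy key-stream generator loosely modelled on the twelve-wheel layout."""

    def __init__(self, seed):
        rng = random.Random(seed)
        self.chi = [random_wheel(n, rng) for n in CHI_SIZES]
        self.psi = [random_wheel(n, rng) for n in PSI_SIZES]
        self.mu61 = random_wheel(MU61_SIZE, rng)
        self.mu37 = random_wheel(MU37_SIZE, rng)
        self.chi_pos = [0] * 5
        self.psi_pos = [0] * 5
        self.mu61_pos = 0
        self.mu37_pos = 0
        self.psi_steps = 0  # how many times the psi wheels have moved

    @staticmethod
    def _read(wheels, positions):
        """Pack the five active cams into one 5-bit value (impulse 1 first)."""
        value = 0
        for wheel, pos in zip(wheels, positions):
            value = (value << 1) | wheel[pos]
        return value

    def next_key(self):
        """Return the next 5-bit key character, then step the wheels."""
        key = self._read(self.chi, self.chi_pos) ^ self._read(self.psi, self.psi_pos)
        # Assumed stepping rule (see lead-in): read the motor cams first,
        # then move the wheels.
        psi_moves = self.mu37[self.mu37_pos]
        mu37_moves = self.mu61[self.mu61_pos]
        self.chi_pos = [(p + 1) % n for p, n in zip(self.chi_pos, CHI_SIZES)]
        self.mu61_pos = (self.mu61_pos + 1) % MU61_SIZE
        if mu37_moves:
            self.mu37_pos = (self.mu37_pos + 1) % MU37_SIZE
        if psi_moves:
            self.psi_pos = [(p + 1) % n for p, n in zip(self.psi_pos, PSI_SIZES)]
            self.psi_steps += 1
        return key


def crypt(machine, chars):
    """XOR each 5-bit character with the next key character."""
    return [c ^ machine.next_key() for c in chars]


plaintext = [5, 16, 9, 9, 3, 0, 27, 31]  # arbitrary 5-bit values
ciphertext = crypt(ToyTunny(seed=42), plaintext)
recovered = crypt(ToyTunny(seed=42), ciphertext)  # identical machine, same start
```

Because the chi wheels step with every character while the psi wheels step only when the motor allows it, psi_steps never exceeds the number of characters processed, which is the irregularity the text identifies as a weakness.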

Secure Telegraphy

Electro-mechanical telegraphy was developed in the 1830s and 1840s, well before telephony, and was in worldwide use by the time of the Second World War. An extensive system of cables was used within and between countries, with a standard voltage of −80 V indicating a "mark" and +80 V indicating a "space".[17] Where cable transmission was impracticable or inconvenient, such as for mobile German Army units, radio transmission was used.

Teleprinters at each end of the circuit consisted of a keyboard and printing mechanism, and very often a five-hole perforated paper tape reading and punching mechanism. When used online, pressing an alphabet key at the transmitting end caused the relevant character to be printed at the receiving end. Commonly, however, the communication system involved the transmitting operator preparing a set of messages offline by punching them onto paper tape, and then going online only for the transmission of the messages recorded on the tape. Typically this would be at some ten characters per second, and so occupy the line or radio channel for a shorter time than for online typing.

The characters of the message were represented by the codes of the International Telegraphy Alphabet No. 2 (ITA2). The transmission medium, either wire or radio, used asynchronous serial communication with each character signaled by a start (space) impulse, 5 data impulses and 1½ stop (mark) impulses. At Bletchley Park mark impulses were signified by x and space impulses by •.[18] For example, the letter "H" would be coded as ••x•x.[19]

Binary teleprinter code (ITA2) as used at Bletchley Park, arranged in reflection order whereby each row differs from its neighbours by only one bit.
Pattern of impulses (Mark = x, Space = •)   Binary   Letter shift   Figure shift   BP 'shiftless' interpretation
••••• 00000 null null /
••x•• 00100 space space 9
••x•x 00101 H # H
••••x 00001 T 5 T
•••xx 00011 O 9 O
••xxx 00111 M . M
••xx• 00110 N , N
•••x• 00010 CR CR 3
•x•x• 01010 R 4 R
•xxx• 01110 C  : C
•xxxx 01111 V  ; V
•x•xx 01011 G & G
•x••x 01001 L ) L
•xx•x 01101 P 0 P
•xx•• 01100 I 8 I
•x••• 01000 LF LF 4
xx••• 11000 A - A
xxx•• 11100 U 7 U
xxx•x 11101 Q 1 Q
xx••x 11001 W 2 W
xx•xx 11011 FIGS FIGS + or 5
xxxxx 11111 LTRS LTRS - or 8
xxxx• 11110 K ( K
xx•x• 11010 J ' J
x••x• 10010 D $ D
x•xx• 10110 F  ! F
x•xxx 10111 X / X
x••xx 10011 B  ? B
x•••x 10001 Z " Z
x•x•x 10101 Y 6 Y
x•x•• 10100 S ' S
x•••• 10000 E 3 E
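The "reflection order" property stated in the table caption can be checked mechanically. The sketch below copies the Binary column in the order listed and measures the number of differing bits between neighbouring rows; the names REFLECTION_ORDER, hamming and to_bp are illustrative only.

```python
# Binary column of the table above, in the order the rows are listed.
REFLECTION_ORDER = """
00000 00100 00101 00001 00011 00111 00110 00010
01010 01110 01111 01011 01001 01101 01100 01000
11000 11100 11101 11001 11011 11111 11110 11010
10010 10110 10111 10011 10001 10101 10100 10000
""".split()


def hamming(a, b):
    """Number of impulse positions in which two 5-bit codes differ."""
    return sum(x != y for x, y in zip(a, b))


def to_bp(code):
    """Render a binary code in Bletchley Park notation (x = mark, • = space)."""
    return code.replace("0", "•").replace("1", "x")


# Distance between each row and the next one in the table.
neighbour_distances = [
    hamming(a, b) for a, b in zip(REFLECTION_ORDER, REFLECTION_ORDER[1:])
]
```

Every entry of neighbour_distances is 1, and to_bp("00101") gives ••x•x, the pattern quoted for the letter "H".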

The figure shift (FIGS) and letter shift (LTRS) characters determined how the receiving end interpreted the string of characters up to the next shift character. Because of the danger of a shift character being corrupted, some operators would type a pair of shift characters when changing from letters to numbers or vice versa. So they would type 55M88 to represent a full stop.[20] Such doubling of characters was very helpful for the statistical cryptanalysis used at Bletchley Park. After encipherment, shift characters had no special meaning.

Unlike Morse-coded signals, a human listener could not interpret a radio telegraph message. A standard teleprinter, however, would produce the text of the message. The Lorenz cipher attachment changed the plaintext of the message into ciphertext that was uninterpretable to those without an identical machine identically set up. This was the challenge faced by the Bletchley Park codebreakers.

Interception

Intercepting Tunny transmissions presented substantial problems. As the transmitters were directional, most of the signals were quite weak at receivers in Britain. Furthermore, there were some 25 different frequencies used for these transmissions, and the frequency would sometimes be changed part way through. After the initial discovery of the non-Morse signals in 1940, a Y-station was set up on a hill at the Ivy Farm Communications Centre at Knockholt in Kent, specifically to intercept this traffic.[21] The centre was headed by Harold Kenworthy, had 30 receiving sets and employed some 600 staff. It became fully operational early in 1943.

A length of tape, 12mm (0.5in) wide, produced by an undulator similar to those used during the Second World War for intercepted 'Tunny' wireless telegraphic traffic at Knockholt, for translation into ITA2 characters to be sent to Bletchley Park

Because a single missed or corrupted character could make decryption impossible, the greatest accuracy was required.[22] The undulator technology used to record the impulses had originally been developed for high-speed Morse. It produced a visible record of the impulses on narrow paper tape. This was then read by people employed as "slip readers" who interpreted the peaks and troughs as the marks and spaces of ITA2 characters. Perforated paper tape was then produced for telegraphic transmission to Bletchley Park where it was punched out.[23]

The Vernam cipher

Main article: Gilbert Vernam

This cipher uses the Boolean "exclusive or" (XOR) function, symbolised by ⊕[24] and verbalised as "A or B but not both". This is represented by the following truth table, where x represents "true" and • represents "false".

INPUT OUTPUT
A B A ⊕ B
• • •
• x x
x • x
x x •

Other names for this function are: Not equal (NEQ), and modulo 2 addition (without "carry") and subtraction (without "borrow"). Note that modulo 2 addition and subtraction are identical. Some descriptions of Tunny decryption refer to addition and some to differencing, i.e. subtraction, but they mean the same thing.

A desirable feature of a machine cipher is that the same machine with the same settings can be used either for enciphering or for deciphering. The Vernam cipher achieves this reciprocity, as combining the stream of plaintext characters with the key stream produces the ciphertext, and combining the same key with the ciphertext regenerates the plaintext.

Symbolically:

Plaintext ⊕ Key = Ciphertext

and

Ciphertext ⊕ Key = Plaintext
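The reciprocity of the Vernam cipher, and the equivalence of XOR with modulo 2 addition and subtraction, can be illustrated with a few lines of code. This is a minimal sketch: the function name vernam and the five key values are made up for the example, while the plaintext codes are those of H, E, L, L, O from the ITA2 table.

```python
def vernam(stream, key):
    """Combine two equal-length streams of 5-bit characters with XOR."""
    return [s ^ k for s, k in zip(stream, key)]


# H, E, L, L, O using the binary codes from the ITA2 table.
plaintext = [0b00101, 0b10000, 0b01001, 0b01001, 0b00011]
# Hypothetical key characters standing in for a key tape.
key = [0b11011, 0b00110, 0b10101, 0b01000, 0b11111]

ciphertext = vernam(plaintext, key)   # Plaintext XOR Key
recovered = vernam(ciphertext, key)   # Ciphertext XOR Key

# For single bits, XOR, addition mod 2 and subtraction mod 2 all agree.
xor_table = [(a, b, a ^ b, (a + b) % 2, (a - b) % 2)
             for a in (0, 1) for b in (0, 1)]
```

The same function serves for both enciphering and deciphering, which is the reciprocity property described above.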

Vernam's original idea was to use conventional telegraphy practice, with a paper tape of the plaintext combined with a paper tape of the key at the transmitting end, and an identical key tape combined with the ciphertext signal at the receiving end. Each pair of key tapes would have been unique (a one-time tape), but generating and distributing such tapes presented considerable practical difficulties. In the 1920s four men in different countries invented rotor Vernam cipher machines to produce a key stream to act instead of a key tape.[25] Lorenz SZ40/42 was one of these.[26]

Security Features

A typical distribution of letters in English language text. Inadequate encipherment may not sufficiently mask the non-uniform nature of the distribution. This property was exploited in cryptanalysis of the Lorenz cipher by weakening part of the key.

A monoalphabetic substitution cipher such as the Caesar cipher can easily be broken, given a reasonable amount of ciphertext. This is achieved by frequency analysis of the different letters of the ciphertext, and comparing the result with the known distribution of letters in the language of the plaintext. With a polyalphabetic cipher, however, such as the Lorenz cipher, there is a different substitution alphabet for each successive character. So a frequency analysis shows an approximately uniform distribution, such as that obtained from a (pseudo) random number generator. By trying multiple putative chi-component partial key streams against the ciphertext, the Bletchley Park cryptanalysts were able to detect some of the underlying non-uniformity and so identify which partial key stream was likely to be correct.

The total number of cams on the twelve wheels of the SZ machines was 501. Each cam could either be in a raised position, in which case it contributed x to the logic of the system, or in the lowered position, in which case it generated •.[26] The total possible number of patterns of raised cams was 2^501, which is an astronomically large number.[27] In practice, however, about half of the cams on each wheel were in the raised position. Later, the Germans realized that if the number of raised cams was not very close to half and there were runs of xs and •s, a cryptographic weakness existed.[28][29] Indeed this weakness was one of the two factors that led to the system being diagnosed.

The pattern of raised and lowered cams was changed daily on the motor wheels (\mu37 and \mu61). The psi wheel patterns were changed quarterly until October 1942 when the frequency was increased to monthly, and then to daily on 1 August 1944, when the chi wheel patterns were also changed from their original monthly frequency to daily.[30]

The number of start positions of the wheels was 43×47×51×53×59×37×61×41×31×29×26×23, which is approximately 1.6×10^19, far too large a number for cryptanalysts to try an exhaustive "brute-force attack". As the numbers of positions of the wheels are co-prime with each other, this number is also the period before the key repeated. Sometimes the Lorenz operators disobeyed instructions and two messages were transmitted with the same start positions, a phenomenon termed a "depth". The method by which the transmitting operator told the receiving operator the wheel settings that he had chosen for the message which he was about to transmit was termed the "indicator" at Bletchley Park.
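The sizes quoted in this section follow directly from the wheel lengths. The short sketch below recomputes them; the variable names are illustrative only.

```python
import math
from itertools import combinations

# Cams per wheel, in the order psi1..psi5, mu37, mu61, chi1..chi5.
WHEEL_SIZES = (43, 47, 51, 53, 59, 37, 61, 41, 31, 29, 26, 23)

total_cams = sum(WHEEL_SIZES)                 # number of settable cams
cam_patterns = 2 ** total_cams                # every raised/lowered combination
start_positions = math.prod(WHEEL_SIZES)      # every combination of wheel starts

# The product equals the least common multiple only if no two wheel
# sizes share a factor, so check every pair.
pairwise_coprime = all(
    math.gcd(a, b) == 1 for a, b in combinations(WHEEL_SIZES, 2)
)

print(f"cams: {total_cams}")
print(f"cam patterns: a {len(str(cam_patterns))}-digit number")
print(f"start positions: {start_positions:.2e}")
print(f"pairwise co-prime: {pairwise_coprime}")
```

Note that 51 and 26 are not prime themselves, but neither shares a factor with any other wheel size, which is all that co-primality requires.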

In August 1942, the stereotyped starts to the messages, which were useful to cryptanalysts, were replaced by some irrelevant text, which made identifying the true message somewhat harder. This new material was dubbed quatsch (German for "nonsense") at Bletchley Park.[31]

During the phase of the experimental transmissions, the indicator consisted of twelve German forenames, the initial letters of which indicated the position to which the operators turned the twelve wheels. As well as showing when two transmissions were fully in depth, it also allowed the identification of partial depths where two indicators differed only in one or two wheel positions. From October 1942 the indicator system changed to the sending operator transmitting the unenciphered letters QEP[32] followed by a two digit number. This number was taken serially from a code book that had been issued to both operators and gave, for each QEP number, the settings of the twelve wheels. The books were replaced when they had been used up, but between replacements, complete depths could be identified by the re-use of a QEP number on a particular Tunny link.[33]

Diagnosis

The first step in breaking a new cipher is to diagnose the logic of the processes of encryption and decryption. In the case of a machine cipher such as Tunny, this entailed establishing the logical structure and hence functioning of the machine. This was achieved without the benefit of seeing a machine, which only happened in 1945, shortly before the Allied victory in Europe.[34]

During the experimental period of Tunny transmissions when the twelve-letter indicator system was in use, John Tiltman, Bletchley Park's veteran and remarkably gifted cryptanalyst, studied the Tunny ciphertexts and identified that they used a Vernam cipher.

When two transmissions (a and b) use the same key, i.e. they are in depth, combining them eliminates the effect of the key.[35] Let us call the two ciphertexts Za and Zb, the key K and the two plaintexts Pa and Pb. We then have:

Za ⊕ Zb = Pa ⊕ Pb

If the two plaintexts can be worked out, the key can be recovered from either ciphertext-plaintext pair e.g.:

Za ⊕ Pa = K or Zb ⊕ Pb = K
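The two identities above can be checked on made-up data. In this sketch the plaintexts and key are random 5-bit values chosen purely for illustration (they are not the August 1941 messages), and xor_streams is a hypothetical helper name.

```python
import random


def xor_streams(a, b):
    """Character-by-character XOR of two streams of 5-bit values."""
    return [x ^ y for x, y in zip(a, b)]


rng = random.Random(0)
pa = [rng.randrange(32) for _ in range(20)]   # hypothetical plaintext a
pb = [rng.randrange(32) for _ in range(20)]   # hypothetical plaintext b
key = [rng.randrange(32) for _ in range(20)]  # the same key used twice: a depth

za = xor_streams(pa, key)
zb = xor_streams(pb, key)

# Combining the two ciphertexts cancels the key completely.
combined = xor_streams(za, zb)

# Once a plaintext is known, the key falls out of either pair.
key_from_a = xor_streams(za, pa)
key_from_b = xor_streams(zb, pb)
```

Here combined equals Pa ⊕ Pb, which is why a cryptanalyst who can guess a stretch of one plaintext immediately obtains the corresponding stretch of the other.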

On 31 August 1941, two long messages were received that had the same indicator HQIBPEXEZMUG. The first seven characters of these two ciphertexts were the same, but the second message was shorter. The first 15 characters of the two messages were as follows:

Za JSH5N ZYZY5 GLFRG
Zb JSH5N ZYMFS /885I
Za ⊕ Zb ///// //FOU GFL4M

John Tiltman tried various likely pieces of plaintext, i.e. "cribs", against the Za ⊕ Zb string and found that the first plaintext message started with the German word SPRUCHNUMMER (message number). In the second plaintext, the operator had used the common abbreviation NR for NUMMER. There were more abbreviations in the second message, and the punctuation sometimes differed. This allowed Tiltman to work out, over ten days, the plaintext of both messages, as a sequence of plaintext characters discovered in Pa could then be tried against Pb and vice versa.[36] In turn, this yielded almost 4000 characters of key.[37]

Members of the Research Section worked on this key to try to derive a mathematical description of the key generating process, but without success. Bill Tutte joined the section in October 1941 and was given the task. He had read chemistry and mathematics at Trinity College, Cambridge before being recruited to Bletchley Park. At his training course, he had been taught the Kasiski examination technique of writing out a key on squared paper with a new row after a defined number of characters that was suspected of being the frequency of repetition of the key. If this number was correct, the columns of the matrix would show more repetitions of sequences of characters than chance alone would produce.

Tutte knew that the Tunny indicators used 25 letters (excluding J) for 11 of the positions, but only 23 letters for the other. He therefore tried Kasiski's technique on the first two impulses of the key characters using a repetition of 25 × 23 = 575. Tutte did not observe a large number of repetitions in the columns with this period, but he did observe the phenomenon on a diagonal. He therefore tried again with 574, which showed up repeats in the columns. Recognising that the prime factors of this number are 2, 7 and 41, he tried again with a period of 41 and "got a rectangle of dots and crosses that was replete with repetitions".[38]
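The idea of writing a stream out in rows of a trial width and looking for repeats down the columns can be sketched on a toy bit stream. The assumptions here are important: the stream is a hypothetical 41-cam pattern with 10% of its bits flipped at random, which merely stands in for the disturbance caused by the other key component and does not model the real Tunny key or the psi wheels. The function name column_repeat_rate is also illustrative.

```python
import random


def column_repeat_rate(bits, width):
    """Fraction of positions whose bit equals the bit one row below
    when the stream is written out in rows of the given width."""
    pairs = list(zip(bits, bits[width:]))
    return sum(a == b for a, b in pairs) / len(pairs)


rng = random.Random(41)
wheel = [rng.randint(0, 1) for _ in range(41)]  # hypothetical cam pattern

# Toy stream: the repeating wheel pattern with 10% of bits flipped.
stream = [wheel[i % 41] ^ (rng.random() < 0.10) for i in range(4000)]

# Try a range of row widths and keep the one with the most column repeats.
rates = {w: column_repeat_rate(stream, w) for w in range(30, 60)}
best_width = max(rates, key=rates.get)
```

In this toy case the width of 41 stands out clearly from the other widths, in the same spirit as the rectangle "replete with repetitions" that Tutte obtained after reducing 574 = 2 × 7 × 41 to its factor 41.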

It was clear, however, that the first impulse of the key was more complicated than that produced by a single wheel of 41 positions. Tutte called this component of the key \chi1 (chi). He figured that there was another component, which was XOR-ed with this, that did not always change with each new character, and that this was the product of a wheel that he called \psi1 (psi). The same applied for each of the five impulses—indicated here by subscripts. So for a single character, the key K consisted of two components:

K = \chi ⊕ \psi

For a stream of characters, the psi component of the key stream did not change with each new character and is referred to as the extended psi, symbolised by \psi':

K = \chi ⊕ \psi'

Tutte's derivation of the \psi component was made possible by the fact that dots were more likely than not to be followed by dots, and crosses more likely than not to be followed by crosses. This was a product of a weakness in the German key setting, which they later stopped. Once Tutte had made this breakthrough, the rest of the Research Section joined in to study the other impulses, and it was established that the five \psi wheels all moved together under the control of two \mu (mu or "motor") wheels.

Diagnosing the functioning of the Tunny machine in this way was a truly remarkable cryptanalytical achievement.

Turingery

See also: Turingery

In July 1942 Turing spent a few weeks in the Research Section.[39] He had become interested in the problem of breaking Tunny from the keys that had been obtained from depths.[40] In July, he developed a method of deriving the cam settings from a length of key. It became known as "Turingery"[41] or "Turing's Method"[42] (playfully dubbed "Turingismus" by Peter Ericsson, Peter Hilton and Donald Michie[40]) and introduced the important method of "differencing", on which much of the rest of the breaking of Tunny messages in the absence of depths was based.

Differencing

The search was on for a process that would manipulate the ciphertext or key to produce a frequency distribution of characters that departed from the uniformity that the enciphering process aimed to achieve. Turing worked out that the XOR combination of the values of successive characters in a stream of ciphertext or key emphasised any departures from a uniform distribution. The resultant stream was called the difference (symbolised by the Greek letter "delta" Δ) because XOR is the same as modulo 2 subtraction. So, for a stream of characters S, the difference ΔS was obtained as follows, where underline indicates the succeeding character:

ΔS = S ⊕ \underline{S}

The stream S may be ciphertext Z, plaintext P, key K or either of its two components \chi and \psi. The relationship amongst these elements still applies when they are differenced. For example, as well as:

K = \chi ⊕ \psi

It is the case that:

ΔK = Δ\chi ⊕ Δ\psi

Similarly for the ciphertext, plaintext and key components:

ΔZ = ΔP ⊕ Δ\chi ⊕ Δ\psi

So:

ΔP = ΔZ ⊕ Δ\chi ⊕ Δ\psi
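That the relationships survive differencing can be confirmed numerically. The sketch below uses random 5-bit streams as stand-ins for the plaintext, the \chi stream and the extended \psi stream; all names and values are illustrative and say nothing about real Tunny traffic.

```python
import random


def delta(stream):
    """Difference a stream 'at one': XOR each character with its successor."""
    return [a ^ b for a, b in zip(stream, stream[1:])]


rng = random.Random(7)
n = 50
plain = [rng.randrange(32) for _ in range(n)]    # stand-in for P
chi = [rng.randrange(32) for _ in range(n)]      # stand-in for the chi stream
psi_ext = [rng.randrange(32) for _ in range(n)]  # stand-in for extended psi

key = [c ^ s for c, s in zip(chi, psi_ext)]      # K = chi XOR psi'
cipher = [p ^ k for p, k in zip(plain, key)]     # Z = P XOR K

delta_z = delta(cipher)
# Delta-P XOR Delta-chi XOR Delta-psi', built from the separate streams.
delta_parts = [
    dp ^ dc ^ ds
    for dp, dc, ds in zip(delta(plain), delta(chi), delta(psi_ext))
]
```

The two lists are identical because XOR is associative and commutative, so differencing each component and then combining gives the same result as combining first and differencing afterwards.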

The reason that differencing provided a way into Tunny was that, although the frequency distribution of characters in the ciphertext could not be distinguished from a random stream, the same was not true for a version of the ciphertext from which the chi element of the key had been removed. This is because, where the plaintext contained a repeated character and the psi wheels did not move on, the differenced psi character (Δ\psi) would be the null character ('/' at Bletchley Park). When XOR-ed with any character, this character has no effect, so in these circumstances, ΔK = Δ\chi. The ciphertext modified by the removal of the chi component of the key was called the de-chi D at Bletchley Park,[43] and the process of removing it was known as "de-chi-ing". Similarly, the removal of the psi component was known as "de-psi-ing" (or "deep sighing" when it was particularly difficult).[44]

So the delta de-chi ΔD was:

ΔD = ΔZ ⊕ Δ\chi

Repeated characters in the plaintext were more frequent both because of the characteristics of German (EE, TT, LL and SS are relatively common),[45] and because telegraphists frequently repeated the figures-shift and letters-shift characters[46] as their loss in an ordinary telegraph transmission could lead to gibberish.[47]
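The effect of doubled characters on the differenced plaintext can be seen with the Bletchley Park 'shiftless' alphabet. In the sketch below, BP_ALPHABET is an assumption derived from the ITA2 table in this article: it lists the BP symbols in order of their binary value with the first impulse as the most significant bit, taking 5 for FIGS and 8 for LTRS. The function name delta_text is illustrative.

```python
# BP 'shiftless' symbols indexed by 5-bit value (impulse 1 = most
# significant bit), read off the ITA2 table; 5 = FIGS, 8 = LTRS.
BP_ALPHABET = "/T3O9HNM4LRGIPCVEZDBSYFXAWJ5UQK8"


def delta_text(text):
    """Difference a BP-notation string: XOR each character with the next."""
    codes = [BP_ALPHABET.index(ch) for ch in text]
    return "".join(BP_ALPHABET[a ^ b] for a, b in zip(codes, codes[1:]))


# The doubled-shift full stop quoted earlier in the article.
print(delta_text("55M88"))  # -> /UA/
```

Each doubled character contributes a null ('/') to the differenced stream, which is why doubled shifts and common double letters push the frequency of '/' in ΔP well above what a uniform stream would give.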

To quote the General Report on Tunny:

Turingery introduced the principle that the key differenced at one, now called ΔΚ, could yield information unobtainable from ordinary key. This Δ principle was to be the fundamental basis of nearly all statistical methods of wheel-breaking and setting.[41]

As well as applying differencing to the full 5-bit characters of the ITA2 code, it was also applied to the individual impulses (bits).[48] So, for the first impulse, that was enciphered by wheels \chi1 and \psi1, differenced at one:

ΔK1 = K1 ⊕ \underline{K1}

And for the second impulse:

ΔK2 = K2 ⊕ \underline{K2}

And so on.

It is also worth noting that the periodicity of the chi and psi wheels for each impulse (41 and 43 respectively for the first impulse) is reflected in its pattern of ΔK. However, given that the psi wheels did not advance for every input character, as did the chi wheels, it was not simply a repetition of the pattern every 41 × 43 = 1763 characters for ΔK1, but a more complex sequence.

Turing's method

See also: Turingery

Turing's method of deriving the cam settings of the wheels from a length of key obtained from a depth involved an iterative process. Given that the delta psi character was the null character '/' half of the time on average, an assumption that ΔK = Δ\chi had a 50% chance of being correct. The process started by treating a particular ΔK character as being the Δ\chi for that position. The resulting putative bit pattern of x and • for each chi wheel was recorded on a sheet of paper that contained as many columns as there were characters in the key, and five rows representing the five impulses of the Δ\chi. Given the knowledge of the periodicity of each of the wheels from Tutte's work, this allowed the propagation of these values at the appropriate positions in the rest of the key.

A set of five sheets, one for each of the chi wheels, was also prepared. These contained a set of columns corresponding in number to the cams for the appropriate chi wheel, and were referred to as a 'cage'. So the \chi3 cage had 29 such columns.[49] Successive 'guesses' of Δ\chi values then produced further putative cam state values. These might either agree or disagree with previous assumptions, and a count of agreements and disagreements was made on these sheets. Where disagreements substantially outweighed agreements, the assumption was made that the Δ\psi character was not the null character '/', so the relevant assumption was discounted. Progressively, all the cam settings of the chi wheels were deduced, and from them, the psi and motor wheel cam settings.

As experience of the method developed, improvements were made that allowed it to be used with much shorter lengths of key than the original 500 or so characters.[41]

Testery

See also: Testery

The Testery was the section at Bletchley Park that performed the bulk of the work involved in decrypting Tunny messages.[50] By July 1942, the volume of traffic was building up considerably. A new section was therefore set up, led by Ralph Tester—hence the name. The staff consisted mainly of ex-members of the Research Section,[51] and included Peter Ericsson, Peter Hilton, Denis Oswald and Jerry Roberts.[52] The Testery's methods were almost entirely manual, both before and after the introduction of automated methods in the Newmanry to supplement and speed up their work.[50][53]

The first phase of the work of the Testery ran from July to October, with the predominant method of decryption being based on depths and partial depths.[54] After ten days, however, the stereotyped opening of the messages was replaced by nonsensical quatsch, making decryption more difficult. The period was nevertheless productive, although each decryption took considerable time, until in September a depth was received that allowed Turing's method of wheel breaking, "Turingery", to be used, and current traffic started to be read. Extensive data about the statistical characteristics of the language of the messages was compiled, and the collection of cribs extended.[41]

In late October 1942 the original, experimental Tunny link was closed and two new links (Codfish and Octopus) were opened. With these and subsequent links, the 12-letter indicator system of specifying the message key was replaced by the QEP system. This meant that only full depths could be recognised—from identical QEP numbers—which led to a considerable reduction in traffic decrypted.

Once the Newmanry became operational in June 1943, the nature of the work performed in the Testery changed, with decrypts and wheel breaking no longer relying on depths.

British Tunny

A rebuilt British Tunny at the National Museum of Computing, Bletchley Park. It emulated the functions of the Lorenz SZ40/42, producing printed cleartext from a ciphertext input tape.

The so-called "British Tunny Machine" was a device that exactly replicated the functions of the SZ40/42 machines.[23] It was used to produce the German cleartext from a ciphertext tape, after the cam settings had been determined.[55] The functional design was produced at Bletchley Park where ten Testery Tunnies were in use by the end of the war. It was designed and built in Tommy Flowers' laboratory at the General Post Office Research Station at Dollis Hill by Gil Hayward, "Doc" Coombs, Bill Chandler and Sid Broadhurst.[56] It was mainly built from standard British telephone exchange electro-mechanical equipment such as relays and uniselectors. Input and output were by means of a teleprinter with paper tape reading and punching.[57] These machines were used in both the Testery and later the Newmanry. Dorothy Du Boisson, who was a machine operator and a member of the Women's Royal Naval Service (Wren), described plugging up the settings as being like operating an old fashioned telephone exchange and said that she received electric shocks in the process.[58]

When Flowers was invited by Hayward to try the first British Tunny machine at Dollis Hill by typing in the standard test phrase: "Now is the time for all good men to come to the aid of the party", he much appreciated that the rotor functions had been set up to provide the following Wordsworthian output:[59]

Input NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THE PARTY
Output I WANDERED LONELY AS A CLOUD THAT FLOATS ON HIGH OER VALES AND H

Additional features were added to the British Tunnies to simplify their operation. Further refinements were made for the versions used in the Newmanry.[60]

Newmanry

The Newmanry was a section set up under Max Newman in December 1942 to look into the possibility of assisting the work of the Testery by automating parts of the processes of decrypting Tunny messages. Newman had been working with Gerry Morgan, head of the Research Section, on ways of breaking Tunny when Bill Tutte approached them in November 1942 with the idea of what became known as the "1+2 break in".[61] This was recognised as being feasible, but only if automated.

Newman produced a functional specification of what was to become "Heath Robinson".[61] He recruited the Post Office Research Station at Dollis Hill, and Dr C.E. Wynn-Williams at the Telecommunications Research Establishment (TRE) at Malvern to implement his idea. Work on the engineering design started in January 1943 and the first machine was delivered in June. The staff at that time consisted of Newman, Donald Michie, Jack Good, two engineers and 16 Wrens. By the end of the war the Newmanry contained three Robinson machines, ten Colossus Computers and a number of British Tunnies.[62] The staff were 26 cryptographers, 28 engineers and 275 Wrens.[63]

The automation of these processes required the processing of large quantities of punched paper tape such as those on which the enciphered messages were received. Absolute accuracy of these tapes and their transcription was essential, as a single character in error could invalidate or corrupt a huge amount of work. Jack Good introduced the maxim "If it's not checked it's wrong".[64]

Tutte's "1+2 break in"

The essence of this method was to find the initial settings of the chi component of the key by exhaustively trying all positions of its combination with the ciphertext, and looking for evidence of the non-uniformity that reflected the characteristics of the original plaintext.[65][66] The wheel breaking process had to have successfully produced the current cam settings to allow the relevant sequence of characters of the chi wheels to be generated. It was totally impracticable to generate the 22 million characters from all five of the chi wheels, so it was initially limited to 41 × 31 = 1271 from the first two.
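The size of the search can be checked with a few lines of Python. This is only an illustrative sketch: the cam counts 29, 26 and 23 for the third, fourth and fifth chi wheels are not given in this section and are assumed here from the standard description of the SZ machines, and the names CHI_WHEEL_LENGTHS and start_positions are hypothetical.

```python
from math import prod

# Cam counts of the five chi wheels. 41 and 31 appear in the text above;
# 29, 26 and 23 are assumed from the usual description of the SZ40/42.
CHI_WHEEL_LENGTHS = (41, 31, 29, 26, 23)


def start_positions(lengths):
    """Number of distinct joint start positions for the given wheels."""
    return prod(lengths)


first_two = start_positions(CHI_WHEEL_LENGTHS[:2])  # the 1+2 search space
all_five = start_positions(CHI_WHEEL_LENGTHS)       # the full chi search space
print(first_two, all_five)
```

Restricting the search to the first two wheels shrinks it by a factor of 29 × 26 × 23 = 17,342, which is what made an exhaustive trial practicable.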

Given that for each of the five impulses $i$:

$Z_i = \chi_i \oplus \psi_i \oplus P_i$

and hence

$P_i = Z_i \oplus \chi_i \oplus \psi_i$

for the first two impulses:

$(P_1 \oplus P_2) = (Z_1 \oplus Z_2) \oplus (\chi_1 \oplus \chi_2) \oplus (\psi_1 \oplus \psi_2)$

Calculating a putative $P_1 \oplus P_2$ in this way for each starting point of the $\chi_1 \oplus \chi_2$ sequence would yield xs and •s with, in the long run, a greater proportion of •s when the correct starting point had been used. Tutte knew, however, that using the differenced ($\Delta$) values amplified this effect[67] because any repeated characters in the plaintext would always generate •, and similarly $\Delta\psi_1 \oplus \Delta\psi_2$ would generate • whenever the psi wheels did not move on, and about half of the time when they did, some 70% overall.

Tutte analyzed a decrypted ciphertext with the differenced version of the above function:

$(\Delta Z_1 \oplus \Delta Z_2) \oplus (\Delta\chi_1 \oplus \Delta\chi_2) \oplus (\Delta\psi_1 \oplus \Delta\psi_2)$

and found that it generated • some 55% of the time.[68] Given the nature of the contribution of the psi wheels, the alignment of chi-stream with the ciphertext that gave the highest count of •s from $(\Delta Z_1 \oplus \Delta Z_2 \oplus \Delta\chi_1 \oplus \Delta\chi_2)$ was the one that was most likely to be correct.[69] This technique could be applied to any pair of impulses and so provided the basis of an automated approach to obtaining the de-chi (D) of a ciphertext, from which the psi component could be removed by manual methods.
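The counting procedure can be illustrated with a small simulation. The following Python sketch is not a reconstruction of any wartime procedure or machine: the chi wheel patterns are random, the combined plaintext and psi contribution is replaced by a synthetic stream that is a dot (0) with an assumed probability of 0.55, and all names and parameters (wheel_stream, one_plus_two, simulate, p_dot, true_start) are hypothetical. Each stream is packed into a Python integer so that the count for all 1271 settings reduces to an XOR and a bit count.

```python
import random

CHI1_LEN, CHI2_LEN = 41, 31  # cam counts of the first two chi wheels


def delta(bits):
    """Difference a cyclic wheel pattern: each bit XOR its successor."""
    return [b ^ bits[(i + 1) % len(bits)] for i, b in enumerate(bits)]


def wheel_stream(pattern, start, n):
    """Pack n successive bits of a rotating wheel, from `start`, into an int."""
    digits = "".join(str(pattern[(start + t) % len(pattern)]) for t in range(n))
    return int(digits, 2)


def one_plus_two(dz12, d_chi1, d_chi2, n):
    """Dot (zero) counts of dz12 ^ d_chi1 ^ d_chi2 for every start setting."""
    streams1 = [wheel_stream(d_chi1, s, n) for s in range(CHI1_LEN)]
    streams2 = [wheel_stream(d_chi2, s, n) for s in range(CHI2_LEN)]
    counts = {}
    for s1, a in enumerate(streams1):
        for s2, b in enumerate(streams2):
            crosses = bin(dz12 ^ a ^ b).count("1")
            counts[(s1, s2)] = n - crosses
    return counts


def simulate(n=20000, p_dot=0.55, true_start=(36, 21), seed=1):
    """Plant a start position (0-based, hypothetical) and try to recover it."""
    rng = random.Random(seed)
    d_chi1 = delta([rng.randint(0, 1) for _ in range(CHI1_LEN)])
    d_chi2 = delta([rng.randint(0, 1) for _ in range(CHI2_LEN)])
    # Synthetic stand-in for (delta-P1 ^ delta-P2) ^ (delta-psi1 ^ delta-psi2):
    # a dot with probability p_dot, which is an assumed figure.
    d12 = int("".join("0" if rng.random() < p_dot else "1" for _ in range(n)), 2)
    dz12 = (d12
            ^ wheel_stream(d_chi1, true_start[0], n)
            ^ wheel_stream(d_chi2, true_start[1], n))
    counts = one_plus_two(dz12, d_chi1, d_chi2, n)
    best = max(counts, key=counts.get)
    return best, counts[best]
```

Under these assumptions, simulate() returns the setting with the highest dot count together with that count. The planted start position should stand well clear of the 1270 incorrect settings, whose counts cluster around half the text length.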

Robinsons

Heath Robinson was the first machine produced to automate Tutte's 1+2 method. It was given the name by the Wrens who operated it, after cartoonist William Heath Robinson, who drew immensely complicated mechanical devices for simple tasks, similar to Rube Goldberg in the USA.

The functional specification of the machine was produced by Max Newman. The main engineering design was the work of Frank Morrell[70] at the Post Office Research Station at Dollis Hill in North London, with his colleague Tommy Flowers designing the "Combining Unit".[61] Dr C. E. Wynn-Williams from the Telecommunications Research Establishment at Malvern produced the high-speed electronic valve and relay counters.[61] Construction started in January 1943,[71] the prototype machine was delivered to Bletchley Park in June and was first used to help read current encrypted traffic soon afterwards.[72] The main parts of the machine were:

  • a tape transport and reading mechanism (dubbed the "bedstead" because of its resemblance to an upended metal bed frame) that ran the looped key and message tapes at between 1000 and 2000 characters per second;
  • a combining unit that implemented the logic of Tutte's method;
  • a counting unit that counted the number of •s, and if it exceeded a pre-set total, displayed or printed it.

The prototype machine was effective despite a number of serious shortcomings.[73] Most of these were progressively overcome in the development of what became known as "Old Robinson".[74] A later development was a machine called "Super Robinson".[75]

Colossus

A Mark 2 Colossus computer. The Wren operators are (left to right) Dorothy Du Boisson and Elsie Booker. The slanted control panel on the left was used to set the pin patterns on the Lorenz. The "bedstead" paper tape transport is on the right.
In 1994, a team led by Tony Sale (right) began a reconstruction of a Mark 2 Colossus at Bletchley Park. Here, in 2006, Sale and Phil Hayes supervise the breaking of an enciphered message with the completed machine.

Tommy Flowers' experience with Heath Robinson, and his previous, unique experience of thermionic valves (vacuum tubes), led him to realize that a better machine could be produced using electronics. Instead of the key stream being read from a punched paper tape, an electronically generated key stream could allow much faster and more flexible processing. Flowers' suggestion that this could be achieved with a machine that was entirely electronic and would contain between one and two thousand valves was treated with incredulity at both the Telecommunications Research Establishment and at Bletchley Park, as it was thought that it would be "too unreliable to do useful work". He did, however, have the support of the Controller of Research at Dollis Hill, W Gordon Radley,[76] and he implemented these ideas, producing Colossus, the world's first electronic, digital, computing machine that was at all programmable, in the remarkably short time of ten months.[77] In this he was assisted by his colleagues at the Post Office Research Station Dollis Hill: Sidney Broadhurst, William Chandler, Allen Coombs and Harry Fensom.

The main parts of the machine were:

  • a tape transport and reading mechanism (the "bedstead") that ran the message tape in a loop at 5000 characters per second;
  • a unit that generated the key stream electronically;
  • a combining unit that implemented the logic of Tutte's method;
  • a counting unit that counted the number of •s, and if it exceeded a pre-set total, printed it out.

The prototype Mark 1 Colossus (Colossus I), with its 1500 valves, was shown to be working at Dollis Hill in December 1943 and was operational at Bletchley Park by February 1944. This processed the message at 5000 characters per second using the impulse from reading the tape's sprocket holes to act as the clock signal. It quickly became evident that this was a huge leap forward in cryptanalysis of Tunny. Further Colossus machines were ordered and the orders for more Robinsons cancelled.

An improved Mark 2 Colossus (Colossus II) contained 2400 valves and first worked at Bletchley Park on 1 June 1944, just in time for the D-day Normandy landings. This processed the message at an effective speed of 25,000 characters per second by the use of circuitry invented by Flowers that would now be called a shift register. Donald Michie worked out a method of using Colossus to assist in wheel breaking as well as for wheel setting.[78] This was then implemented in special hardware on later Colossi.

A total of ten Colossus computers were in use by the end of the war.

Special machines

As well as the commercially produced teleprinters and re-perforators, a number of other machines were built to assist in the preparation and checking of tapes in the Newmanry and Testery. The approximate complement as at May 1945 was as follows.[79]

Machines used in deciphering Tunny as at May 1945
Name Function Testery Newmanry
Super Robinson Used for crib runs in which two tapes were compared in all positions. Contained some valves. 2
Colossus Mk.2 Counted a condition involving a message tape and an electronically-generated key character stream imitating the various Tunny wheels in different relative positions ("stepping").[80] Contained some 2,400 valves. 10
Dragons Used for setting short cribs by "crib-dragging" (hence the name).[81][82] 2
Aquarius A machine under development at the war's end for the "go-backs" of the SZ42B, which stored the contents of the message tape in a large bank of capacitors that acted as an electronic memory.[83] 1
Proteus A machine for utilising depths that was under construction at the war's end but was not completed.
Decoding Machines Translated from ciphertext typed in, to plaintext printed out. Some of the later ones were speeded up with the use of a few valves.[84] A number of modified machines were produced for the Newmanry.  ? 13
Tunnies See British Tunny above 3  ?
Miles A set of increasingly complex machines (A, B, C, D) that read two tapes and combined them in a variety of ways to produce an output tape. 3
Garbo Similar to Junior, but with a Delta'ing facility. 3
Juniors For printing tapes via a plug panel to change characters as necessary. 4
Insert machines Similar to Angel, but with a device for making corrections by hand. 2
Angels Copied tapes. 4
Hand perforators Generated tape from a keyboard. 2
Hand counters Measured text length. 6
Stickers (hot) Bostick and benzene were used for sticking tapes to make a loop. The tape to be stuck was inserted between two electrically heated plates and the benzene evaporated. 3
Stickers (cold) Stuck tapes without heating. 6

Steps in Wheel Setting

Working out the start position of the chi ($\chi$) wheels required first that their cam settings had been determined by "wheel breaking". Initially, this was achieved by two messages having been sent in depth.

The number of start positions for the first two wheels, $\chi_1$ and $\chi_2$, was 41 × 31 = 1271. The first step was to try all of these start positions against the message tape. This was Tutte's "1+2 break in", which involved computing $(\Delta Z_1 \oplus \Delta Z_2 \oplus \Delta\chi_1 \oplus \Delta\chi_2)$, which gives a putative $(\Delta D_1 \oplus \Delta D_2)$, and counting the number of times this gave •. Both Heath Robinson, which was developed into what became known as "Old Robinson", and Colossus were designed to automate this process. Statistical theory allowed the derivation of measures of how far any count was from the random situation expected with an incorrect starting point for the chi wheels. For this step, the measure of deviation from randomness was called sigma. Starting points whose count fell short of the "set total", a threshold 2.5 × sigma above the count expected at random, were not printed out.[85] In the ideal case there was a single large value for sigma that identified the start positions of $\chi_1$ and $\chi_2$. An example of the output from such a run on a Mark 2 Colossus with its five counters (a, b, c, d and e) is given below.

Output table abridged from Small's "The Special Fish Report".[86] The set total threshold was 4912.
$\chi_1$ $\chi_2$ Counter letter Count Operator's notes on the output
06 11 a 4021
06 13 a 4948
02 16 e 4977
05 18 b 4926
02 20 e 4954
05 22 b 4914
03 25 d 4925
02 26 e 5015 ← 4.6 sigma
19 26 c 4928
25 19 b 4930
25 21 b 5038 ← 5.1 sigma
29 18 c 4946
36 13 a 4955
35 18 b 4926
36 21 a 5384 ← 12.2 sigma ch $\chi_1$ $\chi_2$ ! !
36 25 a 4965
36 29 a 5013
38 08 d 4933
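The sigma figures in the operator's notes can be approximated with a simple model. The Python sketch below assumes that, at an incorrect setting, each character is equally likely to give a dot or a cross, so that the count is binomially distributed with mean n/2 and standard deviation √n/2 for a text of n characters. Neither this model nor the text length is taken from Small's report: the value 9579 used in the note that follows is a hypothetical length chosen only because it puts the 2.5-sigma threshold close to the 4912 quoted above, and the function names are likewise illustrative.

```python
import math


def random_mean_and_sigma(text_length, p_dot=0.5):
    """Mean and standard deviation of the dot count at a wrong setting,
    under the assumed binomial model."""
    mean = text_length * p_dot
    sigma = math.sqrt(text_length * p_dot * (1.0 - p_dot))
    return mean, sigma


def sigma_units(count, text_length, p_dot=0.5):
    """How many sigma a count lies above the random expectation."""
    mean, sigma = random_mean_and_sigma(text_length, p_dot)
    return (count - mean) / sigma


def set_total(text_length, threshold_sigmas=2.5, p_dot=0.5):
    """Count below which a setting would not be printed out."""
    mean, sigma = random_mean_and_sigma(text_length, p_dot)
    return mean + threshold_sigmas * sigma
```

Under these assumptions set_total(9579) comes out near 4912 and sigma_units(5384, 9579) at a little over 12, in line with the operator's annotation on the row with first-wheel setting 36 and second-wheel setting 21.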

Having identified possible $\chi_1$, $\chi_2$ start positions, the next step was to try to find the start positions for the other chi wheels. In the example given above, there is a single setting of $\chi_1$ = 36 and $\chi_2$ = 21 whose sigma value makes it stand out from the rest. This was not always the case and there were many different actions that could be taken. Small enumerates 36 different further runs that might be done.[87] At first the choice was made by the cryptanalyst sitting at the typewriter output, and calling out instructions to the Wren operators. Max Newman devised a decision tree and then set Jack Good and Donald Michie the task of devising others. These were used by the Wrens without recourse to the cryptanalysts if certain criteria were met.[88]

In the above example from Small, the next run was with the first two chi wheels set to the start positions found and three separate parallel explorations of the remaining three chi wheels.

Output table adapted from Small's "The Special Fish Report".[89] The set total threshold was 2728. Counters a, b and c count against $\chi_3$, $\chi_4$ and $\chi_5$ respectively, as the operator's notes indicate; a "-" marks a wheel not covered by that counter.
$\chi_1$  $\chi_2$  $\chi_3$  $\chi_4$  $\chi_5$  Counter letter  Count  Operator's notes on the output
36        21        01        -         -         a               2938   ← 6.8 rho ! $\chi_3$ !
36        21        -         01        -         b               2763
36        21        -         -         01        c               2803
36        21        -         02        -         b               2733
36        21        -         -         04        c               3003   ← 8.6 rho ! $\chi_5$ !
36        21        06        -         -         a               2740
36        21        -         -         07        c               2750
36        21        -         09        -         b               2811
36        21        11        -         -         a               2751
36        21        -         -         12        c               2759
36        21        -         -         14        c               2733
36        21        16        -         -         a               2743
36        21        -         19        -         b               3093   ← 11.1 rho ! $\chi_4$ !
36        21        20        -         -         a               2785
36        21        -         22        -         b               2823
36        21        24        -         -         a               2740
36        21        -         25        -         b               2796
36        21        -         01        -         b               2763
36        21        -         -         07        c               2750

Once the probable start positions for the chi wheels had been derived, they had to be verified before the de-chi (D) message was passed to the Testery. This involved performing a count of the frequency of the characters in $\Delta D$. Small describes this frequency count as being the "acid test",[90] and says that practically every cryptanalyst and Wren in the Newmanry and Testery knew the contents of the following table by heart.

Relative frequency count of characters in $\Delta D$.[91]
Char. Count Char. Count Char. Count Char. Count
/ 1.28 R 0.92 A 0.96 D 0.89
9 1.10 C 0.90 U 1.24 F 1.00
H 1.02 V 0.94 Q 1.01 X 0.87
T 0.99 G 1.00 W 0.89 B 0.82
O 1.04 L 0.92 5 1.43 Z 0.89
M 1.00 P 0.96 8 1.12 Y 0.97
N 1.00 I 0.96 K 0.89 S 1.04
3 1.13 4 0.90 J 1.03 E 0.89
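A frequency count of this kind is straightforward to sketch. The Python below is illustrative only: it assumes the de-chi is held as a sequence of integers 0 to 31, one per 5-bit character, and that the table's figures are ratios to the count expected if all 32 characters were equally likely (an assumption consistent with the 32 table entries summing to 32.00). The names delta_stream and relative_frequencies are hypothetical, and no mapping from integers to teleprinter symbols is attempted.

```python
from collections import Counter


def delta_stream(chars):
    """Difference a character stream: each 5-bit value XOR its successor."""
    return [a ^ b for a, b in zip(chars, chars[1:])]


def relative_frequencies(chars):
    """Ratio of each character's count to the uniform expectation.

    Returns a list of 32 values indexed by the 5-bit code; a value of 1.0
    means the character occurs exactly as often as it would by chance.
    """
    counts = Counter(chars)
    expected = len(chars) / 32
    return [counts[code] / expected for code in range(32)]
```

A candidate de-chi would be differenced with delta_stream and its relative_frequencies compared by eye, or by some chosen distance measure, against the reference table; a wrong chi setting would be expected to give values all close to 1.0.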

If the derived start points of the chi wheels passed this test, the de-chi-ed message was passed to the Testery where manual methods were used to derive the psi and motor settings. As Small remarked, the work in the Newmanry took a great amount of statistical science, whereas that in the Testery took much knowledge of language and was of great interest as an art. Jerry Roberts makes the point that this Testery work was a greater load on staff than the automated processes in the Newmanry.[50]

References and Notes

  1. ^ McKay 2010, p. 263 quoting Jerry Roberts
  2. ^ Hinsley 1993, p. 8
  3. ^ Weierud 2006, p. 307
  4. ^ Good, Michie & Timms 1945, p. 5 in 1. Introduction: German Tunny
  5. ^ Hinsley 1993, pp. 141–142
  6. ^ Gannon 2007, p. 189
  7. ^ Copeland 2006, p. 45
  8. ^ Good 1993, pp. 162, 163
  9. ^ Tutte 1998, pp. 5, 6
  10. ^ Flowers 2006, p. 81
  11. ^ All but two of the Colossus computers, which were taken to GCHQ, were destroyed in 1945, and the whole project was kept strictly secret until the 1970s. Thus Colossus did not feature in many early descriptions of the development of electronic computers. Gannon 2006, p. 431
  12. ^ Small 1944, p. 1
  13. ^ Good, Michie & Timms 1945, p. 6 in 1. Introduction: German Tunny
  14. ^ Gannon 2007, pp. 150, 151
  15. ^ Good 1993, p. 153
  16. ^ Good, Michie & Timms 1945, p. 16 in 1. Introduction: Cryptographic Aspects
  17. ^ Hayward 1993, p. 176
  18. ^ In more recent terminology, each impulse would be termed a "bit" with a mark being binary 1 and a space being binary 0. Punched paper tape had a hole for a mark and no hole for a space.
  19. ^ Copeland 2006, pp. 348, 349
  20. ^ Roberts 2006, p. 256
  21. ^ Gannon 2007, p. 125
  22. ^ Good, Michie & Timms 1945, p. 281 in 3. Organisation: Knockholt
  23. ^ a b Ward 2011
  24. ^ Klein, p. 2
  25. ^ Klein, p. 3
  26. ^ a b Good, Michie & Timms 1945, p. 10 in 1. Introduction: German Tunny
  27. ^ Churchhouse 2002, pp. 158, 159
  28. ^ Good, Michie & Timms 1945, p. 11 in 1. Introduction: German Tunny
  29. ^ This statement is an over-simplification. The real constraint is much more complex, that ab=½. See also: Good, Michie & Timms 1945, p. 17 in 1. Introduction: Some Historical Notes and Good, Michie & Timms 1945, p. 306 in 4. Early Methods and History: Early Hand Methods for further details.
  30. ^ Copeland 2006, p. 48
  31. ^ Edgerley 2006, pp. 273, 274
  32. ^ Initially QKP (see Good, Michie & Timms 1945, p. 28 in 3. Organisation: Expansion and Growth) or QSN (see Good, Michie & Timms 1945, p. 320 in 4. Early Methods and History: Hand statistical Methods).
  33. ^ Copeland 2006, pp. 44–47
  34. ^ Sale, Tony, The Lorenz Cipher and how Bletchley Park broke it, retrieved 21 October 2010 
  35. ^ Tutte 2006, p. 353
  36. ^ Copeland 2010
  37. ^ Tutte 1998, p. 4
  38. ^ Tutte 2006, p. 357
  39. ^ Tutte 2006, pp. 359, 360
  40. ^ a b Copeland 2006, p. 380
  41. ^ a b c d Good, Michie & Timms 1945, p. 313 in 4. Early Methods and History: Testery Methods 1942-1944
  42. ^ Government Code and Cypher School 1944, p. 89
  43. ^ Small 1944, p. 2 refers to the de-chi as being "pseudo plain"
  44. ^ Tutte 2006, p. 365
  45. ^ Singh, Simon, The Black Chamber, retrieved 28 April 2012 
  46. ^ Newman c. 1944, p. 387
  47. ^ Carter 2004, p. 3
  48. ^ The five impulses or bits of the coded characters are sometimes referred to as five levels.
  49. ^ Copeland 2006, p. 385 which reproduces a $\chi_3$ cage from the General Report on Tunny
  50. ^ a b c Roberts 2009
  51. ^ Good, Michie & Timms 1945, p. 28 in 1. Introduction: Organisation
  52. ^ Roberts 2006, p. 250
  53. ^ Good, Michie & Timms 1945, p. 29 in 1. Introduction: Organisation
  54. ^ Unlike a full depth, when all twelve letters of the indicator were the same, a partial depth occurred when one or two of the indicator letters differed.
  55. ^ Hayward 1993, pp. 175–192
  56. ^ Hayward 2006, p. 291
  57. ^ Currie 2006, pp. 265–266
  58. ^ Copeland 2006, p. 162 quoting Dorothy Du Boisson
  59. ^ Hayward 2006, p. 292
  60. ^ Good, Michie & Timms 1945, pp. 376–379 in 56. Copying Machines
  61. ^ a b c d Good, Michie & Timms 1945, p. 33 in 1. Introduction: Some historical notes
  62. ^ Good, Michie & Timms 1945, p. 276 in 3. Organisation: Mr Newman's Section
  63. ^ Copeland 2006, p. 158
  64. ^ Good 2006, p. 215
  65. ^ Good, Michie & Timms 1945, pp. 321–322 in 44. Hand Statistical Methods: Setting - Statistical Methods
  66. ^ Budiansky 2006, pp. 58–59
  67. ^ For this reason Tutte's 1 + 2 method is sometimes called the "double delta" method.
  68. ^ Tutte 2006, p. 364
  69. ^ Carter 2004, pp. 4–6
  70. ^ Bletchley Park National Code Centre: November 1943, retrieved 21 November 2012 
  71. ^ Copeland 2006, p. 65
  72. ^ Good, Michie & Timms 1945, p. 290 in 3. Organisation: Machine Setting Organisation
  73. ^ Good, Michie & Timms 1945, p. 328 in 52. Development of Robinson and Colossus
  74. ^ Good, Michie & Timms 1945, p. 354 in 54. Robinson: Introduction
  75. ^ Good, Michie & Timms 1945, pp. 354–362 in 54. Robinson
  76. ^ Fensom 2006, pp. 300–301
  77. ^ Flowers 2006, p. 80
  78. ^ Good & Michie 1992
  79. ^ Good, Michie & Timms 1945, pp. 25–27, 367–379 in 1. Introduction: Machines and 52. Development of Robinson and Colossus: Copying Machines
  80. ^ Good, Michie & Timms 1945, p. 333 in 53. Colossus
  81. ^ Hayward 2006, pp. 291–292
  82. ^ Michie 2006, p. 236
  83. ^ Fensom 2006, pp. 301–302
  84. ^ Good, Michie & Timms 1945, p. 326 in 5. Specialise Counting Machines
  85. ^ Small 1944, p. 9
  86. ^ Small 1944, p. 19
  87. ^ Small 1944, p. 7
  88. ^ Good 2006, p. 218
  89. ^ Small 1944, p. 20
  90. ^ Small 1944, p. 15
  91. ^ Adapted from Small 1944, p. 5

Bibliography