# Chosen-plaintext attack

A chosen-plaintext attack (CPA) is an attack model for cryptanalysis which presumes that the attacker can obtain the ciphertexts for arbitrary plaintexts.[1] The goal of the attack is to gain information that reduces the security of the encryption scheme.

Modern ciphers aim to provide semantic security, also known as ciphertext indistinguishability under chosen-plaintext attack, and are therefore by design generally immune to chosen-plaintext attacks if correctly implemented.

## Introduction

In a chosen-plaintext attack the adversary can (possibly adaptively) ask for the ciphertexts of arbitrary plaintext messages. This is formalized by allowing the adversary to interact with an encryption oracle, viewed as a black box. The attacker’s goal is to reveal all or part of the secret encryption key.

It may seem infeasible in practice that an attacker could obtain ciphertexts for given plaintexts. However, modern cryptography is implemented in software or hardware and is used for a diverse range of applications; in many cases, a chosen-plaintext attack is quite feasible (see also the In practice section). Chosen-plaintext attacks become extremely important in the context of public-key cryptography, where the encryption key is public and so attackers can encrypt any plaintext they choose.

## Different forms

There are two forms of chosen-plaintext attacks:

• Batch chosen-plaintext attack, where the adversary chooses all of the plaintexts before seeing any of the corresponding ciphertexts. This is often the meaning intended by "chosen-plaintext attack" when this is not qualified.
• Adaptive chosen-plaintext attack (CPA2), where the adversary can request the ciphertexts of additional plaintexts after seeing the ciphertexts for some plaintexts.

## General method of an attack

A general batch chosen-plaintext attack is carried out as follows:

1. The attacker may choose n plaintexts. (This parameter n is specified as part of the attack model; it may or may not be bounded.)
2. The attacker then sends these n plaintexts to the encryption oracle.
3. The encryption oracle then encrypts the attacker's plaintexts and sends the resulting ciphertexts back to the attacker.
4. The attacker receives n ciphertexts back from the oracle, in such a way that the attacker knows which ciphertext corresponds to each plaintext.
5. Based on the plaintext–ciphertext pairs, the attacker can attempt to extract the key used by the oracle to encrypt the plaintexts. Since the attacker in this type of attack is free to craft the plaintexts to suit their needs, the attack complexity may be reduced.
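The steps above can be sketched in code. The following is a minimal illustration only: the per-byte affine cipher, the `AffineOracle` class, and the `batch_attack` function are hypothetical and chosen because the key falls out of n = 2 chosen plaintexts. Real ciphers are not expected to yield to so few queries.

```python
import secrets


class AffineOracle:
    """Toy encryption oracle (illustrative assumption, not a real cipher).

    Each plaintext byte m is encrypted as (a*m + b) mod 256 under a secret key (a, b).
    """

    def __init__(self):
        self._a = secrets.randbelow(128) * 2 + 1  # odd, hence invertible mod 256
        self._b = secrets.randbelow(256)

    def encrypt_batch(self, plaintexts):
        # Steps 3-4: ciphertexts come back in the same order as the plaintexts,
        # so the attacker knows which ciphertext belongs to which plaintext.
        return [bytes((self._a * m + self._b) % 256 for m in p) for p in plaintexts]

    def check_key(self, a, b):
        # Only used to confirm that the attack succeeded.
        return (a, b) == (self._a, self._b)


def batch_attack(oracle):
    # Step 1: choose all plaintexts up front (n = 2).
    chosen = [b"\x00", b"\x01"]
    # Steps 2-4: submit the whole batch and receive the matching ciphertexts.
    c0, c1 = oracle.encrypt_batch(chosen)
    # Step 5: E(0) = b and E(1) = a + b (mod 256), which reveals the key.
    b = c0[0]
    a = (c1[0] - b) % 256
    return a, b


oracle = AffineOracle()
print(oracle.check_key(*batch_attack(oracle)))
```

The plaintexts 0 and 1 are crafted so that each ciphertext isolates one part of the key, which is the sense in which choosing the plaintexts can reduce the attack complexity.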

Consider the following extension of the above situation. After the last step,

1. The adversary outputs two plaintexts $m_0$ and $m_1$.
2. A bit $b$ is chosen uniformly at random, $b \leftarrow \{0,1\}$.
3. The adversary receives the encryption of $m_b$, attempts to "guess" which plaintext it received, and outputs a bit $b'$.

A cipher has indistinguishable encryptions under a chosen-plaintext attack if, after running the above experiment with n=1, the adversary cannot guess correctly (b=b') with probability non-negligibly better than 1/2.[2]
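The experiment can be sketched as a small game. This is an illustration under stated assumptions: `ind_cpa_game`, `choose`, `guess`, and the toy cipher `deterministic_encrypt` (XOR with a fixed key) are hypothetical names introduced here. The sketch shows why a deterministic cipher cannot have indistinguishable encryptions: after a single oracle query the adversary can recognise the challenge ciphertext.

```python
import secrets


def deterministic_encrypt(key, message):
    # Toy deterministic cipher: the same message always gives the same ciphertext.
    return bytes(k ^ m for k, m in zip(key, message))


def ind_cpa_game(adversary_choose, adversary_guess, length=16):
    """Run one round of the experiment; return True if the adversary guesses b."""
    key = secrets.token_bytes(length)

    def oracle(message):
        return deterministic_encrypt(key, message)

    # The adversary may query the oracle, then outputs two plaintexts m0 and m1.
    m0, m1, state = adversary_choose(oracle, length)
    # A bit b is chosen uniformly at random and m_b is encrypted.
    b = secrets.randbelow(2)
    challenge = oracle((m0, m1)[b])
    # The adversary outputs its guess b'.
    return adversary_guess(state, challenge) == b


def choose(oracle, length):
    m0 = bytes(length)            # all-zero message
    m1 = bytes([0xFF]) * length   # all-ones message
    # One oracle query (n = 1): remember what m0 encrypts to.
    return m0, m1, oracle(m0)


def guess(known_c0, challenge):
    # Determinism means the challenge equals the remembered ciphertext iff b = 0.
    return 0 if challenge == known_c0 else 1


wins = sum(ind_cpa_game(choose, guess) for _ in range(100))
print(wins)  # the adversary wins every round against this toy cipher
```

A scheme that aims to meet the definition therefore has to randomise encryption (or keep state), so that encrypting the same plaintext twice does not produce the same ciphertext.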

## Examples

The following examples demonstrate how some ciphers that meet other security definitions may be broken with a chosen-plaintext attack.

### Caesar cipher

The following attack on the Caesar cipher allows full recovery of the secret key:

1. Suppose the adversary sends the message: Attack at dawn,
2. and the oracle returns Nggnpx ng qnja.
3. The adversary can then recover the key by comparing the plaintext with the ciphertext. The substitutions A → N, T → G and so on each correspond to a shift of 13 letters, so the adversary determines that 13 was the key used in the Caesar cipher.
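The recovery step can be written out as a short sketch. The helper names `caesar_encrypt` and `recover_shift` are hypothetical, and the code assumes the cipher shifts only ASCII letters and leaves other characters unchanged.

```python
def caesar_encrypt(text, shift):
    """Shift each ASCII letter by `shift` positions, preserving case."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)


def recover_shift(plaintext, ciphertext):
    """Recover the key from a single chosen plaintext and its ciphertext."""
    for p, c in zip(plaintext, ciphertext):
        if p.isascii() and p.isalpha():
            # One letter pair is enough: the key is the distance between them.
            return (ord(c) - ord(p)) % 26
    raise ValueError("the chosen plaintext contains no letters")


secret_key = 13  # known only to the oracle in the real attack
chosen = "Attack at dawn"
returned = caesar_encrypt(chosen, secret_key)
print(returned)                         # Nggnpx ng qnja
print(recover_shift(chosen, returned))  # 13
```

A single letter pair already determines the key, so the rest of the chosen message is redundant for this cipher.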

With more complex encryption schemes the analysis becomes more resource-intensive, but the core concept remains the same.

### One-time pad

The following attack on the one-time pad allows full recovery of the secret key. Suppose the message length and key length are equal to n.

1. The adversary sends a string consisting of n zeroes to the oracle.
2. The oracle returns the bitwise exclusive-or of the key with the string of zeroes.
3. The string returned by the oracle is the secret key.
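A minimal sketch of these three steps, working on bytes rather than individual bits, is given below. The names `otp_encrypt` and `recover_key` are hypothetical. Because a one-time pad key is intended to be used for a single message, the recovered key is only useful to the attacker if the same pad is used again.

```python
import secrets


def otp_encrypt(key, message):
    """XOR the message with a key of the same length."""
    if len(key) != len(message):
        raise ValueError("key and message must have the same length")
    return bytes(k ^ m for k, m in zip(key, message))


def recover_key(oracle, n):
    # Step 1: send n zero bytes. Step 2: the oracle returns key XOR 0...0.
    # Step 3: since x XOR 0 = x, the returned string is the key itself.
    return oracle(bytes(n))


n = 16
key = secrets.token_bytes(n)  # secret held by the oracle


def oracle(message):
    return otp_encrypt(key, message)


print(recover_key(oracle, n) == key)  # True
```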

## In practice

In World War II, US Navy cryptanalysts discovered that Japan was planning to attack a location referred to as "AF". They believed that "AF" might be Midway Island, because other locations in the Hawaiian Islands had codewords that began with "A". To prove their hypothesis that "AF" corresponded to "Midway Island", they asked the US forces at Midway to send a plaintext message about low supplies. The Japanese intercepted the message and immediately reported to their superiors that "AF" was low on water, confirming the Navy's hypothesis and allowing them to position their force to win the battle.[2][3]

Also during World War II, Allied codebreakers at Bletchley Park would sometimes ask the Royal Air Force to lay mines at a position that did not have any abbreviations or alternatives in the German naval system's grid reference. The hope was that the Germans, seeing the mines, would use an Enigma machine to encrypt a warning message about the mines and an "all clear" message after they were removed, giving the Allies enough information about the message to break the German naval Enigma. This process of planting a known plaintext was called gardening.[4] Allied codebreakers also helped craft messages sent by double agent Juan Pujol García, whose encrypted radio reports were received in Madrid, manually decrypted, and then re-encrypted with an Enigma machine for transmission to Berlin.[5] Having supplied the original text, the codebreakers were helped in decrypting the code used on the second leg.[6]

Today, chosen-plaintext attacks (CPAs) are often used to break symmetric ciphers. A symmetric cipher is considered CPA-secure only if it is not vulnerable to chosen-plaintext attacks. It is therefore important for implementors of symmetric ciphers to understand how an attacker would attempt to break their cipher and to make the relevant improvements.

For some chosen-plaintext attacks, only a small part of the plaintext may need to be chosen by the attacker; such attacks are known as plaintext injection attacks.

## Relation to other attacks

A chosen-plaintext attack is more powerful than a known-plaintext attack, because the attacker can directly target specific terms or patterns without having to wait for these to appear naturally, allowing faster gathering of data relevant to cryptanalysis. Therefore, any cipher that prevents chosen-plaintext attacks is also secure against known-plaintext and ciphertext-only attacks.

However, a chosen-plaintext attack is less powerful than a chosen-ciphertext attack, where the attacker can obtain the plaintexts of arbitrary ciphertexts. A CCA attacker can sometimes break a CPA-secure system.[2] For example, the ElGamal cipher is secure against chosen-plaintext attacks, but vulnerable to chosen-ciphertext attacks because it is unconditionally malleable.
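The malleability of ElGamal can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the tiny prime `P = 467`, the base `G = 2`, and the function names `keygen`, `encrypt`, `decrypt`, and `malleate`. The parameters are far too small to be secure and serve only to make the arithmetic visible.

```python
import secrets

P = 467  # small prime modulus, toy value only
G = 2    # base for exponentiation, toy value only


def keygen():
    x = secrets.randbelow(P - 2) + 1  # private key in 1..P-2
    return x, pow(G, x, P)            # (private key, public key)


def encrypt(public_key, m):
    y = secrets.randbelow(P - 2) + 1  # fresh randomness for every encryption
    return pow(G, y, P), (m * pow(public_key, y, P)) % P


def decrypt(private_key, ciphertext):
    c1, c2 = ciphertext
    shared = pow(c1, private_key, P)
    return (c2 * pow(shared, -1, P)) % P  # modular inverse needs Python 3.8+


def malleate(ciphertext, factor):
    # Multiplying the second component by `factor` multiplies the hidden
    # plaintext by `factor`, without any knowledge of the key.
    c1, c2 = ciphertext
    return c1, (c2 * factor) % P


private_key, public_key = keygen()
ciphertext = encrypt(public_key, 100)
forged = malleate(ciphertext, 2)
print(decrypt(private_key, ciphertext))  # 100
print(decrypt(private_key, forged))      # 200
```

In a chosen-ciphertext setting the attacker would submit `forged` to a decryption oracle, receive 200, and divide by 2 modulo `P` to learn the original plaintext, even though `forged` differs from the ciphertext being attacked.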

## References

1. Ross Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems. The first edition (2001): http://www.cl.cam.ac.uk/~rja14/book.html
2. Katz, Jonathan; Lindell, Yehuda (2007). Introduction to Modern Cryptography: Principles and Protocols. Boca Raton: Chapman and Hall/CRC. OCLC 893721520.
3. Weadon, Patrick D. "How Cryptology enabled the United States to turn the tide in the Pacific War". www.navy.mil. US Navy. Archived from the original on 2015-01-31. Retrieved 2015-02-19.
4. Morris, Christopher (1993), "Navy Ultra's Poor Relations", in Hinsley, F.H.; Stripp, Alan (eds.), Codebreakers: The inside story of Bletchley Park, Oxford: Oxford University Press, p. 235, ISBN 978-0-19-280132-6
5. Kelly, Jon (27 January 2011). "The piece of paper that fooled Hitler". BBC. Retrieved 1 January 2012. The Nazis believed Pujol, whom they code named Alaric Arabel, was one of their prize assets
6. Seaman (2004). "The first code which Garbo was given by the Germans for his wireless communications turned out to be the identical code which was currently in use in the German circuits"