Onion routing

From Wikipedia, the free encyclopedia
Jump to: navigation, search

Onion routing (OR) is a technique for anonymous communication over a computer network. Messages are repeatedly encrypted and then sent through several network nodes called onion routers. Like someone peeling an onion, each onion router removes a layer of encryption to uncover routing instructions, and sends the message to the next router where this is repeated. This prevents these intermediary nodes from knowing the origin, destination, and contents of the message.[1]

Onion routing was developed by Michael G. Reed, Paul F. Syverson, and David M. Goldschlag[2] and patented by the United States Navy in US Patent No. 6266704 (1998). As of 2009, Tor is the predominant technology that employs onion routing.

Onions[edit]

Routing onions[edit]

Example "onion"

A routing onion (or just onion) is a data structure formed by 'wrapping' a plaintext message with successive layers of encryption, such that each layer can be 'unwrapped' (decrypted) like the layer of an onion by one intermediary in a succession of intermediaries, with the original plaintext message only being viewable by at most:[3]

  1. the sender
  2. the last intermediary (the exit node)
  3. the recipient

If there is end-to-end encryption between the sender and the recipient, then not even the last intermediary can view the original message; this is similar to a game of 'pass the parcel'. An intermediary is traditionally called a node or router.

Circuit establishment and sending data[edit]

To create and transmit an onion, the following steps are taken:[3]

  1. The originator picks nodes from a list provided by a special node called the directory node (traffic between the originator and the directory node may also be encrypted or otherwise anonymized or decentralized); the chosen nodes are ordered to provide a path through which the message may be transmitted; this ordering of the nodes is called a chain or a circuit. No node within the circuit, except for the exit node, can infer where in the chain it is located, and no node can tell whether the node before it is the originator or how many nodes are in the circuit.
  2. Using asymmetric key cryptography, the originator uses the public key (obtained from the directory) of the first node in the circuit, known as the entry node, to send it an encrypted message, called a create cell, containing:
    1. A circuit ID. The circuit ID is random and different for each connection in the chain.
    2. A request for the receiving node (i.e. the entry node in this case) to establish a circuit with the originator.
    3. The originator's half of a Diffie-Hellman handshake (to establish a shared secret).
  3. The entry node, which just received one half of the handshake, replies to the originator, in unencrypted plaintext:
    1. The entry node's half of the Diffie-Hellman handshake.
    2. A hash of the shared secret, so that the originator can verify that he/she and the entry node share the same secret.
  4. Now the entry node and originator use their shared secret for encrypting all their correspondence in symmetric encryption (this is significantly more efficient than using asymmetric encryption). The shared secret is referred to as a session key.
  5. A relay cell, as opposed to a command cell like the create cell used in the first step, is not interpreted by the receiving node, but relayed to another node. Using the already established encrypted link, the originator sends the entry node a relay extend cell, which is like any relay cell, only that it contains a create cell intended for the next node (known as the relay node) in the chain, encrypted using the relay node's public key and relayed to it by the entry node, containing the following:
    1. A circuit ID. Once again, it is arbitrary, and is not necessarily the same for this connection as it is for the previous.
    2. A request from the entry node to the relay node to establish a circuit.
    3. The originator's half of a Diffie-Hellman handshake. Note that the relay node (second node in this case) cannot tell whether or not the entry node (first node) is or isn't the originator. Likewise, the entry node cannot distinguish the originator from another node in the chain.
  6. The relay node, similar to the first step, replies with its half of the handshake in plain text along with a hash of the shared secret.
  7. As the entry node - relay node circuit has been established, the entry node replies to the originator with a relay extended cell, telling it that the chain has been extended, and containing the hash of the shared secret along with the relay node's half of the handshake. The originator and the relay node now share a secret key.
  8. To extend the chain further, the originator sends the entry node a relay cell which contains a relay cell that only the relay node can decrypt, instructing the relay node to extend the chain further. The process can be repeated as above to as many nodes as possible. In Tor, for example, chains are limited to 3 nodes: the entry node, the relay node, and the exit node.

When the chain is complete, the originator can send data over the Internet anonymously. For example, if the originator wishes to open a website, the originator's onion proxy (typically running a SOCKS proxy) forwards the request from the originator's browser to the originator's local onion router (which controls the circuits). The onion router creates the following cell:

  • {RELAY C1:
  • [RELAY C2:
  • (RELAY C3:
  • ~Send HTTP request to IP-of-webpage~)]}

Where curly brackets indicate content encrypted with the entry node's shared key, square brackets content encrypted with the relay node's key, and regular brackets content encrypted with the exit node's key.

Upon receiving the cell, the entry node only sees the following:

  • RELAY C1:
  • ENCRYPTED CONTENT

The entry node knows that relay requests for circuit ID 1 (C1) should be relayed to circuit ID 2 (C2), since it received a request from the originator to extend the circuit earlier. For this reason, there is no need for the originator to know the circuit IDs, it is enough for it to tell the entry node which circuit it refers to. The entry node takes the payload and sends a relay cell to the relay node.

Upon receiving the relayed cell from the entry node, the relay node sees the following:

  • RELAY C2:
  • ENCRYPTED CONTENT

The relay node follows the same protocol as the entry node and relays the payload to the exit node. The exit node sees this:

  • RELAY C3:
  • Send HTTP request to IP-of-webpage

The exit node proceeds to sending an HTTP request to the website.

Receiving data[edit]

Continuing from the above example: the website's server responds to the exit node with the contents of the web page as follows:[3]

  • HTML FILE OF WEBPAGE

The exit node uses its session key (the secret shared between it and the sender) to encrypt the content it received, and sends the following cell to the relay node:

  • RELAY C3
  • ENCRYPTED CONTENT

The relay node knows which shared secret key to use, since it refers to circuit ID #3, and uses that key to encrypt the message again (so that no adversary watching the traffic can know the structure of the chain). It also knows that any relay request from circuit #3 should be relayed to circuit #2. It relays the following cell to the entry node:

  • RELAY C2
  • ENCRYPTED CONTENT

The entry node takes the encrypted payload, and sends the following cell to the originator:

  • RELAY C1
  • ENCRYPTED CONTENT

The sender receives a cell that is encrypted, from inside the cell (onion) to the outside: the exit node shared key, the relay node shared key, and the entry node shared key. The sender's onion router must decrypt the cell three times with three different keys in order to see the page. The number of layers of encryption between each hop is constant, however, no node can tell whether encrypted data contains more encrypted data. The purpose of layered encryption is not to make the encryption stronger per se, but rather to facilitate the execution of perfect forward secrecy. That is, if the exit node encrypted the page with the sender key, and the nodes after it would only pass on that encrypted data without encrypting it in more layers with their respective keys, an adversary would only have to compromise the exit node's shared key to be able to intercept the sender's incoming traffic. However, since the nodes add more layers of encryption as the cell is passed on, compromising the exit node's shared key will only reveal the contents of the webpage but not the IP of the sender.

Weaknesses[edit]

Timing analysis
An adversary could determine whether a node is communicating with a web by correlating when messages are sent by a server and when messages are received by a node. Tor, and any other low latency network, is vulnerable to such an attack.[4] A node can defeat this attack by sending dummy messages whenever it is not sending or receiving real messages. This counter-measure is not currently part of the Tor threat model [5] as it is considered infeasible to protect against this type of attack.[6]
Intersection attacks
Nodes periodically fail or leave the network; any chain that remains functioning cannot have been routed through either the nodes that left or the nodes that recently joined the network, increasing the chances of a successful traffic analysis.[5]
Predecessor attacks
A compromised node can keep track of a session as it occurs over multiple chain reformations (chains are periodically torn down and rebuilt). If the same session is observed over the course of enough reformations, the compromised node tends to connect with the particular sender more frequently than any [other] node, increasing the chances of a successful traffic analysis.[7]
Exit node sniffing
An exit node (the last node in a chain) has complete access to the content being transmitted from the sender to the recipient; Dan Egerstad, a Swedish researcher, used such an attack to collect the passwords of over 100 email accounts related to foreign embassies.[8] However, if the message is encrypted by SSL, the exit node cannot read the information, just as any encrypted link over the regular internet.

See also[edit]

Further reading[edit]

References[edit]

  1. ^ Goldschlag D., Reed M., Syverson P. (1999.) Onion Routing for Anonymous and Private Internet Connections, Onion Router.
  2. ^ Reed M. G., Sylverson P. F., Goldschlag D. M. (1998.) "Anonymous connections and onion routing", IEEE Journal on Selected Areas in Communications, 16(4):482-494.
  3. ^ a b c Roger Dingledine; Nick Mathewson; Paul Syverson. "Tor: The Second-Generation Onion Router". Retrieved 26 February 2011. 
  4. ^ Shmatikov, Wang; Ming-Hsiu Vitaly (2006). "Timing analysis in low-latency mix networks: attacks and defenses". Proceedings of the 11th European conference on Research in Computer Security. ESORICS'06: 18–33. doi:10.1007/11863908_2. Retrieved 24 October 2012. 
  5. ^ a b Dingledine, Roger. "Tor: The Second-Generation Onion Router". Tor Project. Retrieved 24 October 2012. 
  6. ^ arma. "One cell is enough to break Tor's anonymity". Tor Project. Retrieved 24 October 2012. 
  7. ^ Wright, Matthew. K.; Adler, Micah; Levine, Brian Neil; Shields, Clay (November 2004). "The Predecessor Attack: An Analysis of a Threat to Anonymous Communications Systems". ACM Transactions on Information and System Security (TISSEC) 7 (4): 489–522. doi:10.1145/1042031.1042032. 
  8. ^ Bangeman, Eric (2007-08-30). "Security researcher stumbles across embassy e-mail log-ins". Arstechnica.com. Retrieved 2010-03-17. 

External links[edit]