Network coding

From Wikipedia, the free encyclopedia
Jump to: navigation, search

Network coding is a technique where, instead of simply relaying the packets of information they receive, the nodes of a network will take several packets and combine them together for transmission. This can be used to attain the maximum possible information flow in a network. Network coding is a field of information theory and coding theory.

Contents

[edit] A brief history

A network is represented by a directed graph \mathcal{G}=(V, E, C). V is the set of nodes or vertices, E is the set of directed links (or edges), and C gives the capacity of each link of E. Let t(s, t) be the maximum possible throughput from node s to node t. It has long been known that t(s, t) is upper bounded by the minimum capacity of all cuts, which is the sum of the capacities of the edges on a cut, between these two nodes.

Karl Menger proved that there is always a set of edge-disjoint paths achieving the upper bound in a unicast scenario, known as the max-flow min-cut theorem. Later, the Ford-Fulkerson algorithm was proposed to find such paths in a polynomial time. Then, Edmonds proved in the paper "Edge-Disjoint Branchings" the upper bound in the broadcast scenario is also achievable, and proposed a polynomial time algorithm.

However, the situation in the multicast scenario is more complicated, and in fact, such an upper bound can't be reached using traditional routing ideas. Ahlswede, et al. proved that it can be achieved if additional computing tasks (incoming packets are combined into one or several outgoing packets) can be done in the intermediate nodes.[1]

In 2003, Li, et al. proved that linear coding is enough to achieve the upper bound in multicast problems [2] In 2005, Randall Dougherty, Chris Freiling, and Ken Zeger showed that the linear coding is not sufficient in general (multisource, multisink with arbitrary demands), even for more general versions of linearity such as convolutional coding, filter-bank coding, etc.[3]

[edit] Linear network coding

In a Linear Network coding problem, a group of nodes P are involved in moving the data from S source nodes to K sink nodes. Each node generates a new packet, which is a linear combination of the earlier received packets on the link, by coefficients in the finite field.

A message generated so X_k is related to the received messages M_i by the relation:

X_k = \sum_{i=1}^{S}g_k^i\cdot M_i

Each node forwards the computed value X_k along with all the coefficients used in the k^{th} level, g_k^i.

The values g_k^i are the coefficients from the Galois field GF(2^s). Since the operations are computed in the finite field, the result of the operation is also of the same length, as the size of each vector M.

Each node produces a similar output, as computed above. This yields a linear problem of the type X = G M, where with the knowledge of the X , G we need to compute M. Each of the receivers in K, try to solve this linear equation, and for which at least N \ge S packets must be received. The received packets are continually used in the Gaussian elimination method to reduce G matrix into the row-echelon form. Finally the resulting values of X = G_{echelon}M are solved to obtain M.

[edit] Example

Butterfly Network.

In the butterfly network, there are two sources (at the top of the picture), each having knowledge of some value A and B. There are two destination nodes (at the bottom), which each want to know both A and B. Each edge can carry only a single value (we can think of an edge transmitting a bit in each time slot).

If we only used routing, then the central line would be able to carry A or B, but not both. Suppose we send A through the center; then the left destination would receive A twice and not know B at all. Sending B poses a similar problem for the right destination. We say that routing is insufficient because no routing scheme can transmit both A and B simultaneously to both destinations.

Using a simple code, as shown, we do get both A and B to both destinations simultaneously by sending the sum of the symbols through the center (in other words, we encode A and B using the formula "A+B"). The left destination receives A and A+B, and can find B by subtracting the two values. This is a linear code because the encoding and decoding schemes are linear operations.

[edit] Throughput

At the middle of the butterfly network, 3 messages are being transmitted (A, A+B, B). However 4 messages are being received at the endpoints at the bottom (receive 4 messages for the cost of 3 messages, Note that a message storage in the middle center router could store messages A and B and still provide all 4 messages to the endpoints (receive 4 messages for the cost of 2 messages, a 100% improvement).

[edit] Random network coding

Random Network Coding relies on using random codes at nodes for multicast or in cast networks. It was originally proposed in - T. Ho, R. Koetter, M. Medard, D. R. Karger and M. Effros, "The Benefits of Coding over Routing in a Randomized Setting" 2003 IEEE International Symposium on Information Theory. In random network coding, interior network nodes independently choose random linear mappings from inputs to outputs. The effect of the network is that of a transfer matrix from sources to receivers. To recover symbols at the receivers, we require sufficient degrees of freedom – an invertible matrix in the coefficients of all nodes. Receiver nodes can decode if they receive as many independent linear combinations as the number of source processes.

[edit] Applications

Network coding is seen to be useful in the following areas:

  • Alternative to forward error correction and ARQ in traditional and wireless networks. eg: Multi-user ARQ[4]
  • Robust and resilient to network attacks like snooping, eavesdropping or replay attacks.[5]
  • Digital file distribution and P2P file sharing. eg: Avalanche from Microsoft
  • Throughput increase in wireless mesh networks. eg : COPE,[6] Coding-aware routing[7][8]
  • Bidirectional low energy transmission in wireless sensor networks.
  • Many-to-many broadcast network capacity augmentations.
  • Buffer and Delay reduction in spatial sensor networks: Spatial buffer multiplexing [9]
  • Reduce the number of packet retransmission for a single-hop wireless multicast transmission, and hence improve network bandwidth.[10]
  • Packet collision.[11]

[edit] Network coding and Peer-to-Peer Networks

Network coding may be used in a peer-to-peer network to reduce the amount of routing information required by peers to achieve near optimal throughput[citation needed]. In large networks this can be a significant advantage, since otherwise the amount of routing overhead would scale with the size of the network. Unlike scenarios such as the butterfly network example above, network coding does not increase the maximum achievable throughput of a peer-to-peer network.[12]

However there are several difficulties when utilising network coding in peer-to-peer networks.

  • A peer may need to spend a large amount of time and resources decoding received data.
  • It can be difficult to ensure the uniqueness of the coefficients when there are many pieces in the transferred data.
  • The topology of a peer-to-peer network is constantly changing through the addition and removal of peers.

[edit] See also

[edit] References

[edit] External links

Personal tools
Namespaces

Variants
Actions
Navigation
Interaction
Toolbox
Print/export
Languages