Linear network coding
Network coding is a field of research founded in a series of papers from the late 1990s to the early 2000s. However, the concept of network coding, in particular linear network coding, appeared much earlier. In a 1978 paper, a scheme for improving the throughput of a two-way communication through a satellite was proposed. In this scheme, two users trying to communicate with each other transmit their data streams to a satellite, which combines the two streams by summing them modulo 2 and then broadcasts the combined stream. Each of the two users, upon receiving the broadcast stream, can decode the other stream by using the information of their own stream.
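The satellite scheme can be illustrated with a minimal sketch. The payloads and the helper name `xor_bytes` are hypothetical and chosen only for illustration; summing modulo 2 is implemented as a bitwise XOR over equal-length byte streams.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Sum two equal-length streams modulo 2 (bitwise XOR)."""
    if len(a) != len(b):
        raise ValueError("streams must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical payloads for the two users (same length by construction).
stream_user1 = b"hello from 1"
stream_user2 = b"reply from 2"

# The satellite combines both uplink streams and broadcasts a single stream.
broadcast = xor_bytes(stream_user1, stream_user2)

# Each user cancels its own stream to recover the other user's stream.
decoded_by_user1 = xor_bytes(broadcast, stream_user1)
decoded_by_user2 = xor_bytes(broadcast, stream_user2)
```

The downlink carries one combined stream instead of two separate ones, which is where the throughput gain comes from.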
The 2000 paper by Ahlswede et al. gave the butterfly network example (discussed below) that illustrates how linear network coding can outperform routing. This example is equivalent to the scheme for satellite communication described above. The same paper gave an optimal coding scheme for a network with one source node and three destination nodes. This is the first example illustrating the optimality of convolutional network coding (a more general form of linear network coding) over a cyclic network.
Linear network coding may be used to improve a network's throughput, efficiency and scalability, as well as resilience to attacks and eavesdropping. Instead of simply relaying the packets of information they receive, the nodes of a network take several packets and combine them for transmission. This may be used to attain the maximum possible information flow in a network.
It has been proven that linear coding is enough to achieve the upper bound in multicast problems with one source. However, linear coding is not sufficient in general (e.g. multisource, multisink with arbitrary demands), even for more general versions of linearity such as convolutional coding and filter-bank coding. Finding optimal coding solutions for general network problems with arbitrary demands remains an open problem.
Encoding and decoding
In a linear network coding problem, a group of nodes are involved in moving the data from source nodes to sink nodes. Each node generates new packets which are linear combinations of earlier received packets, multiplying them by coefficients chosen from a finite field, typically of size $2^s$.
Each node $p_k$, with indegree $\mathrm{InDeg}(p_k) = S$, generates a message $X_k$ from the linear combination of received messages $\{M_i\}_{i=1}^{S}$ by the relation:
$$X_k = \sum_{i=1}^{S} g_k^i \cdot M_i$$
where the values $g_k^i$ are the coefficients selected from $GF(2^s)$. Note that, since operations are computed in a finite field, the generated message is of the same length as the original messages. Each node forwards the computed value $X_k$ along with the coefficients $g_k^i$ used in the $k$-th level.
Sink nodes receive these network coded messages, and collect them in a matrix. The original messages can be recovered by performing Gaussian elimination on the matrix. In reduced row echelon form, decoded packets correspond to the rows of the form $e_i = [0 \ldots 0\,1\,0 \ldots 0]$.
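The encoding and decoding steps described above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the field is taken to be GF(2^8) with the reduction polynomial 0x11B, the function names `gf_mul`, `gf_inv`, `encode` and `decode` are hypothetical, and the coefficient matrix is a fixed Vandermonde-style matrix chosen so that it is guaranteed to be invertible.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B, assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= 0x11B           # reduce back into 8 bits
        b >>= 1
    return result

def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254, since a^255 = 1 for every a != 0."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def encode(messages, coeffs):
    """Return sum_i coeffs[i] * messages[i], computed bytewise in GF(2^8)."""
    out = [0] * len(messages[0])
    for g, msg in zip(coeffs, messages):
        for j, byte in enumerate(msg):
            out[j] ^= gf_mul(g, byte)
    return out

def decode(coeff_rows, payloads):
    """Gauss-Jordan elimination on the rows [coefficients | payload].
    Returns the original messages, or None if the rows are rank deficient."""
    n = len(coeff_rows[0])
    rows = [list(c) + list(p) for c, p in zip(coeff_rows, payloads)]
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, v) for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [v ^ gf_mul(factor, p)
                           for v, p in zip(rows[r], rows[col])]
    # The coefficient part is now the identity; the payload part is the data.
    return [bytes(rows[i][n:]) for i in range(n)]

# Hypothetical source messages and an invertible coefficient matrix.
messages = [b"abcd", b"efgh", b"ijkl"]
coefficients = [[1, 1, 1], [1, 2, 4], [1, 3, 5]]
coded = [encode(messages, row) for row in coefficients]
decoded = decode(coefficients, coded)
```

Each coded payload has the same length as an original message, and a sink holding fewer independent rows than there are original messages cannot complete the elimination (here `decode` returns `None`).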
A brief history
A network is represented by a directed graph $\mathcal{G} = (V, E, C)$. $V$ is the set of nodes or vertices, $E$ is the set of directed links (or edges), and $C$ gives the capacity of each link of $E$. Let $T(s, t)$ be the maximum possible throughput from node $s$ to node $t$. By the max-flow min-cut theorem, $T(s, t)$ is upper bounded by the minimum capacity of all cuts between these two nodes, where the capacity of a cut is the sum of the capacities of the edges on it.
Karl Menger proved that there is always a set of edge-disjoint paths achieving the upper bound in a unicast scenario, known as the max-flow min-cut theorem. Later, the Ford–Fulkerson algorithm was proposed to find such paths in polynomial time. Then, Edmonds proved in the paper "Edge-Disjoint Branchings" that the upper bound in the broadcast scenario is also achievable, and proposed a polynomial time algorithm.
However, the situation in the multicast scenario is more complicated, and in fact such an upper bound cannot be reached using traditional routing ideas. Ahlswede et al. proved that it can be achieved if additional computing tasks (combining incoming packets into one or several outgoing packets) can be done in the intermediate nodes.
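The max-flow bound discussed above can be computed with a short sketch. The code below uses the Edmonds–Karp variant of Ford–Fulkerson (breadth-first augmenting paths). The graph, the node names and the function name `max_flow` are hypothetical: a nine-link unit-capacity network with one source `s` and two sinks `t1` and `t2`.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest paths found by BFS.
    `capacity` maps (u, v) to the capacity of the directed link u -> v."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = dict(capacity)
    neighbors = {}
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)          # reverse residual link
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow                          # no augmenting path is left
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck

# Hypothetical unit-capacity network with one source and two sinks.
links = {
    ("s", "a"): 1, ("s", "b"): 1,
    ("a", "t1"): 1, ("a", "c"): 1,
    ("b", "c"): 1, ("b", "t2"): 1,
    ("c", "d"): 1, ("d", "t1"): 1, ("d", "t2"): 1,
}
bound_t1 = max_flow(links, "s", "t1")
bound_t2 = max_flow(links, "s", "t2")
multicast_bound = min(bound_t1, bound_t2)
```

In this example both per-sink bounds evaluate to 2, so the multicast upper bound is 2 units per time slot. Whether routing alone can reach that bound for both sinks at once is a separate question, since both sinks depend on the single link from `c` to `d`.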
The butterfly network example
The butterfly network is often used to illustrate how linear network coding can outperform routing. Two source nodes (at the top of the picture) have information A and B that must be transmitted to the two destination nodes (at the bottom), which each want to know both A and B. Each edge can carry only a single value (we can think of an edge transmitting a bit in each time slot).
If only routing were allowed, then the central link would be only able to carry A or B, but not both. Suppose we send A through the center; then the left destination would receive A twice and not know B at all. Sending B poses a similar problem for the right destination. We say that routing is insufficient because no routing scheme can transmit both A and B simultaneously to both destinations.
Using a simple code, as shown, A and B can be transmitted to both destinations simultaneously by sending the sum of the symbols through the center – in other words, we encode A and B using the formula "A+B". The left destination receives A and A + B, and can calculate B by subtracting the two values. Similarly, the right destination will receive B and A + B, and will also be able to determine both A and B.
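The argument can be checked exhaustively for single-bit symbols. In the sketch below the function names `butterfly` and `recover` are hypothetical, and the network is reduced to what each destination observes: its direct link plus the value carried by the central link.

```python
from itertools import product

def butterfly(a: int, b: int, center: str):
    """Return what the (left, right) destinations receive in one time slot.
    `center` selects what the central link carries: "A", "B" or "A+B"."""
    middle = {"A": a, "B": b, "A+B": a ^ b}[center]
    left = (a, middle)     # direct link from the A source plus the central link
    right = (b, middle)    # direct link from the B source plus the central link
    return left, right

def recover(own: int, mixed: int) -> int:
    """Recover the missing bit; over GF(2), subtraction is the same as XOR."""
    return own ^ mixed

# With coding, both destinations recover both bits for every input pair.
coding_ok = all(
    recover(*butterfly(a, b, "A+B")[0]) == b
    and recover(*butterfly(a, b, "A+B")[1]) == a
    for a, b in product((0, 1), repeat=2)
)

# With plain routing of A through the center, the left destination sees the
# same pair for B = 0 and B = 1, so it cannot tell the two cases apart.
routing_ambiguous = butterfly(1, 0, "A")[0] == butterfly(1, 1, "A")[0]
```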
Random Linear Network Coding
Random linear network coding is a simple yet powerful encoding scheme, which in broadcast transmission schemes allows close to optimal throughput using a decentralized algorithm. Nodes transmit random linear combinations of the packets they receive, with coefficients chosen from a Galois field. If the field size is sufficiently large, the probability that the receiver(s) will obtain linearly independent combinations (and therefore obtain innovative information) approaches 1. Although random linear network coding has excellent throughput performance, a receiver that obtains an insufficient number of packets is extremely unlikely to recover any of the original packets. This can be addressed by sending additional random linear combinations until the receiver obtains the appropriate number of packets.
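The effect of the field size can be quantified in a simplified setting. The sketch below assumes a receiver that collects exactly as many coded packets as there are original packets, each coded directly with independent, uniformly random coefficients; the probability that such a square matrix over a field with q elements is invertible is the product of (1 - q^-i) for i from 1 to the matrix size. The function name `prob_full_rank` is hypothetical, and real networks with recoding at intermediate nodes may behave differently.

```python
def prob_full_rank(field_size: int, generation_size: int) -> float:
    """Probability that `generation_size` uniformly random coefficient vectors
    of that length over a field with `field_size` elements are independent."""
    p = 1.0
    for i in range(1, generation_size + 1):
        p *= 1.0 - field_size ** (-i)
    return p

# Compare a binary field with GF(2^8) for a hypothetical generation of 16.
p_binary = prob_full_rank(2, 16)
p_gf256 = prob_full_rank(256, 16)
```

Under these assumptions the binary field leaves well under a one-in-three chance of decoding from exactly 16 packets, while GF(2^8) succeeds more than 99% of the time.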
Linear network coding is still a relatively new subject. Based on previous studies, there are three important open issues in random linear network coding (RLNC):
- High decoding computational complexity due to the use of Gauss–Jordan elimination
- High transmission overhead due to attaching large coefficient vectors to encoded blocks
- Linear dependency among coefficient vectors, which can reduce the number of innovative encoded blocks
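The transmission overhead in the second item can be estimated with simple arithmetic. The sketch assumes one coefficient per original block in the coefficient vector and no compression of that vector; the function name `coefficient_overhead` and the example figures are hypothetical.

```python
def coefficient_overhead(generation_size: int, field_size: int,
                         payload_bytes: int) -> float:
    """Fraction of each coded packet occupied by the coefficient vector."""
    bits_per_coeff = (field_size - 1).bit_length()   # 8 bits for GF(2^8)
    header_bytes = -(-generation_size * bits_per_coeff // 8)  # ceiling division
    return header_bytes / (header_bytes + payload_bytes)

# Hypothetical figures: 64 blocks per generation, GF(2^8), 1400-byte payload.
overhead = coefficient_overhead(64, 256, 1400)
```

With these figures the coefficient vector takes 64 of 1464 bytes, roughly 4.4% of each packet. The fraction grows with the generation size and shrinks with larger payloads.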
Wireless Network Coding
The broadcast nature of wireless (coupled with network topology) determines the nature of interference. Simultaneous transmissions in a wireless network typically result in all of the packets being lost (i.e., collision, see Multiple Access with Collision Avoidance for Wireless). A wireless network therefore requires a scheduler (as part of the MAC functionality) to minimize such interference. Hence any gains from network coding are strongly impacted by the underlying scheduler and will deviate from the gains seen in wired networks. Further, wireless links are typically half-duplex due to hardware constraints; i.e., a node cannot simultaneously transmit and receive due to the lack of sufficient isolation between the two paths.
Although network coding was originally proposed for use at the network layer (see OSI model), in wireless networks it has been widely used at either the MAC layer or the PHY layer. It has been shown that network coding in wireless mesh networks needs careful design to exploit the advantages of packet mixing; otherwise those advantages cannot be realized. A variety of other factors also influence throughput performance, such as the media access layer protocol and congestion control algorithms. It is not evident how network coding can co-exist with existing congestion and flow control algorithms on the Internet without undermining them.
Since linear network coding is a relatively new subject, its adoption in industry is still pending. Unlike other coding techniques, linear network coding is not applicable to every system because its usage scenarios are narrow and specific. Theorists are trying to connect it to real-world applications. In fact, it was found that the BitTorrent approach is far superior to network coding.
It is envisaged that network coding is useful in the following areas:
- Alternative to forward error correction and ARQ in traditional and wireless networks with packet loss, e.g. Coded TCP, Multi-user ARQ
- Robustness and resilience to network attacks such as snooping, eavesdropping, replay or data corruption attacks
- Digital file distribution and P2P file sharing, e.g. Avalanche from Microsoft
- Distributed storage
- Throughput increase in wireless mesh networks, e.g. COPE, CORE, Coding-aware routing, B.A.T.M.A.N.
- Buffer and delay reduction in spatial sensor networks: spatial buffer multiplexing
- Reduction of the number of packet retransmissions for single-hop wireless multicast transmission, and hence improved network bandwidth
- Distributed file sharing
- Low-complexity video streaming to mobile devices
Maturity & Issues
Since this area is relatively new and the mathematical treatment of this subject is currently limited to a handful of people, network coding has yet to find its way to commercialization in products and services. It is unclear at this stage whether the subject will prevail or remain merely a good mathematical exercise.
Researchers have pointed out that special care is needed to explore how network coding can co-exist with existing routing, media access, congestion and flow control algorithms and with TCP. Otherwise, network coding may offer little advantage while increasing computational complexity and memory requirements.
- Celebiler, M.; G. Stette (1978). "On Increasing the Down-Link Capacity of a Regenerative Satellite Repeater in Point-to-Point Communications". Proceedings of the IEEE. 66 (1).
- Ahlswede, Rudolf; N. Cai; S.-Y. R. Li; R. W. Yeung (2000). "Network Information Flow". IEEE Transactions on Information Theory. 46 (4): 1204–1216. doi:10.1109/18.850663.
- S. Li, R. Yeung, and N. Cai, "Linear Network Coding"(PDF), in IEEE Transactions on Information Theory, Vol 49, No. 2, pp. 371–381, 2003
- R. Dougherty, C. Freiling, and K. Zeger, "Insufficiency of Linear Coding in Network Information Flow" (PDF), in IEEE Transactions on Information Theory, Vol. 51, No. 8, pp. 2745-2759, August 2005 ( erratum)
- Chou, Philip A.; Wu, Yunnan; Jain, Kamal (October 2003), "Practical network coding", Allerton Conference on Communication, Control, and Computing. Quote: "Any receiver can then recover the source vectors using Gaussian elimination on the vectors in its h (or more) received packets."
- T. Ho, R. Koetter, M. Medard, D. R. Karger and M. Effros, "The Benefits of Coding over Routing in a Randomized Setting" in 2003 IEEE International Symposium on Information Theory. doi:10.1109/ISIT.2003.1228459
- M. H. Firooz, Z. Chen, S. Roy and H. Liu, "Wireless Network Coding via Modified 802.11 MAC/PHY: Design and Implementation on SDR" in IEEE Journal on Selected Areas in Communications, 2013.
- XORs in The Air: Practical Wireless Network Coding
- "How Practical is Network Coding? by Mea Wang, Baochun Li".
- "Archived copy" (PDF). Archived from the original (PDF) on 2007-11-08. Retrieved 2007-06-16.
- http://home.eng.iastate.edu/~yuzhen/publications/ZhenYu_INFOCOM_2008.pdf
- "Archived copy" (PDF). Archived from the original (PDF) on 2013-09-19. Retrieved 2013-04-18.
- "CORE: COPE with MORE in Wireless Meshed Networks". 2013 IEEE 77th Vehicular Technology Conference (VTC Spring). doi:10.1109/VTCSpring.2013.6692495.
- "Archived copy" (PDF). Archived from the original (PDF) on 2008-10-11. Retrieved 2007-05-10.
- "NetworkCoding - batman-adv - Open Mesh". www.open-mesh.org. Retrieved 2015-10-28.
- Looking at Large Networks: Coding vs. Queueing
- Data dissemination in wireless networks with network coding
- Band Codes for Energy-Efficient Network Coding With Application to P2P Mobile Streaming
- Fragouli, C.; Le Boudec, J. & Widmer, J. "Network coding: An instant primer" in Computer Communication Review, 2006.
- Ali Farzamnia, Sharifah K. Syed-Yusof, Norsheila Fisa, "Multicasting Multiple Description Coding Using p-Cycle Network Coding", KSII Transactions on Internet and Information Systems, Vol 7, No 12, 2013.
- Network Coding Homepage
- A network coding bibliography
- Raymond W. Yeung, Information Theory and Network Coding, Springer 2008, http://iest2.ie.cuhk.edu.hk/~whyeung/book2/
- Raymond W. Yeung et al., Network Coding Theory, now Publishers, 2005, http://iest2.ie.cuhk.edu.hk/~whyeung/netcode/monograph.html
- Christina Fragouli et al., Network Coding: An Instant Primer, ACM SIGCOMM 2006, http://infoscience.epfl.ch/getfile.py?mode=best&recid=58339.
- Avalanche Filesystem, http://research.microsoft.com/en-us/projects/avalanche/default.aspx
- Random Network Coding, https://web.archive.org/web/20060618083034/http://www.mit.edu/~medard/coding1.htm
- Digital Fountain Codes, http://www.icsi.berkeley.edu/~luby/
- Coding-Aware Routing, https://web.archive.org/web/20081011124616/http://arena.cse.sc.edu/papers/rocx.secon06.pdf
- MIT offers a course: Introduction to Network Coding
- Network coding: Networking's next revolution?
- Coding-aware protocol design for wireless networks: http://scholarcommons.sc.edu/etd/230/