Linear code

In mathematics and information theory, a linear code is an important type of block code used in error correction and detection schemes. Linear codes allow for more efficient encoding and decoding algorithms than other codes (cf. syndrome decoding).

Linear codes are applied in methods of transmitting symbols (e.g., bits) on a communications channel so that, if errors occur in the communication, some errors can be detected by the recipient of a message block. The codewords in a linear block code are blocks of symbols which are encoded using more symbols than the original value to be sent. A linear code of length n transmits blocks containing n symbols. For example, the (7,4) Hamming code is a binary linear code which represents 4-bit values each using 7-bit values. In this way, the recipient can detect errors of up to 2 bits per block.[1] As there are sixteen (16) distinct 4-bit values expressed in binary, the size of the (7,4) Hamming code is sixteen.

Formal definition

A linear code of length n and rank k is a linear subspace C with dimension k of the vector space F_q^n, where F_q is the finite field with q elements. Such a code with parameter q is called a q-ary code (e.g., when q = 5, the code is a 5-ary code). If q = 2 or q = 3, the code is described as a binary code or a ternary code, respectively.

Remark: We want to give F_q^n the usual standard basis because each coordinate represents a "bit" which is transmitted across a "noisy channel" with some small probability of transmission error (see the binary symmetric channel (BSC) for more details). If some other basis is used, then this BSC model cannot be used and the Hamming metric (defined below) does not measure the number of errors in transmission, as we want it to.

Properties

As a linear subspace of F_q^n, the entire code C (which may be very large) may be represented as the span of a minimal set of codewords (known as a basis in linear algebra). These basis codewords are often collated in the rows of a matrix G known as a generating matrix for the code C. When G has the block matrix form G = (Ik | A), where Ik denotes the k × k identity matrix and A is some k × (n − k) matrix, then we say G is in standard form.
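As an illustration, here is a minimal sketch (assuming Python with NumPy; the matrix shown is one common standard-form generating matrix for the (7,4) Hamming code mentioned above) of how a message block is encoded by multiplying it with G over GF(2):

    import numpy as np

    # One common generating matrix in standard form G = (Ik | A)
    # for the binary (7,4) Hamming code.
    G = np.array([
        [1, 0, 0, 0, 1, 1, 0],
        [0, 1, 0, 0, 1, 0, 1],
        [0, 0, 1, 0, 0, 1, 1],
        [0, 0, 0, 1, 1, 1, 1],
    ])

    def encode(message, G):
        """Encode a k-bit message into an n-bit codeword (arithmetic over GF(2))."""
        return np.mod(np.array(message) @ G, 2)

    msg = [1, 0, 1, 1]
    print(encode(msg, G))   # -> [1 0 1 1 0 1 0]

Each codeword consists of the original 4 message bits followed by 3 parity bits, matching the (7,4) parameters described above.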

A matrix H whose kernel is C is called a check matrix of C (or sometimes a parity check matrix). If C is a code with a generating matrix G in standard form, G = (Ik | A), then H = (−A^t | In−k), where A^t denotes the transpose of A, is a check matrix for C.
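Continuing the sketch above (same assumptions), a check matrix can be assembled from the standard-form generating matrix, and a received word can be tested by computing its syndrome; over GF(2) the sign of −A^t does not matter:

    # Split G = (Ik | A) and build H = (-A^t | In-k); over GF(2), -A^t = A^t.
    k, n = G.shape
    A = G[:, k:]
    H = np.hstack([A.T, np.eye(n - k, dtype=int)])

    # Every codeword c satisfies H c = 0 (mod 2); a non-zero syndrome signals an error.
    codeword = encode(msg, G)
    print(np.mod(H @ codeword, 2))   # -> [0 0 0], no error detected
    received = codeword.copy()
    received[2] ^= 1                 # flip one bit in transit
    print(np.mod(H @ received, 2))   # non-zero syndrome, error detected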

The subspace definition also guarantees that the minimum Hamming distance d between a given codeword c0 and the other codewords c ≠ c0 is constant (it does not depend on c0). Since the difference c − c0 of two codewords in C is also a codeword (i.e., an element of the subspace C), and d(c, c0) = d(c − c0, 0), we see that

    d = min { d(c, c0) : c in C, c ≠ c0 } = min { d(c, 0) : c in C, c ≠ 0 },

that is, the minimum distance of C equals the minimum Hamming weight of its non-zero codewords.
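As an illustration (a sketch under the same NumPy assumptions as the earlier snippets), the minimum distance of a small linear code can be found by enumerating all q^k codewords and taking the minimum weight among the non-zero ones:

    from itertools import product

    def minimum_distance(G):
        """Minimum distance of the binary code generated by G, computed as the
        minimum Hamming weight over all non-zero codewords."""
        k, _ = G.shape
        weights = []
        for message in product([0, 1], repeat=k):
            codeword = np.mod(np.array(message) @ G, 2)
            if codeword.any():              # skip the zero codeword
                weights.append(int(codeword.sum()))
        return min(weights)

    print(minimum_distance(G))   # -> 3 for the (7,4) Hamming code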

Nearest Neighbor Algorithm

The parameter d is closely related to the error correcting ability of the code. The following construction/algorithm illustrates this (called the nearest neighbor decoding algorithm):

Input: A "received vector" v in F_q^n.

Output: A codeword c in C closest to v.

   * Enumerate the elements of the ball of (Hamming) radius t about the received word v, denoted Bt(v). Initially set c = "fail".
   * For each w in Bt(v), check whether w is in C. If so, put c = w and proceed to the final step; otherwise, discard w and move to the next element.
   * Return c.

Note: "fail is not returned unless t> (d-1)/2.. We say that a linear C is t-error correcting if there is at most one codeword in Bt(w), for each w in .

Codes in general are often denoted by the letter C. A linear code of length n, of rank k (i.e., having k codewords in its basis and k rows in its generating matrix), and of minimum Hamming weight d is referred to as an [n, k, d] code.

Remark. This [n, k, d] notation should not be confused with the (n, M, d) notation used to denote a non-linear code of length n, size M (i.e., having M codewords), and minimum Hamming distance d.

Singleton bound

Lemma (Singleton bound): Every linear [n, k, d] code C satisfies k + d ≤ n + 1.

A code C whose parameters satisfy k+d=n+1 is called maximum distance separable or MDS. Such codes, when they exist, are in some sense best possible.
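For instance, the (7,4) Hamming code from the earlier sketches has k + d = 4 + 3 = 7 ≤ 8 = n + 1, so it satisfies the bound without attaining equality (it is not MDS). A one-line check under the same assumptions:

    n, k, d = 7, 4, minimum_distance(G)
    print(k + d <= n + 1)   # -> True; equality would make the code MDS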

If C1 and C2 are two codes of length n and if there is a permutation p in the symmetric group Sn for which (c1, ..., cn) is in C1 if and only if (cp(1), ..., cp(n)) is in C2, then we say C1 and C2 are permutation equivalent. More generally, if there is a monomial matrix which sends C1 isomorphically to C2, then we say C1 and C2 are equivalent.
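A small sketch (same NumPy assumptions; the permutation chosen here is arbitrary) of producing a permutation-equivalent code by reordering the coordinates of every codeword:

    # Applying a fixed coordinate permutation to every codeword yields a
    # permutation-equivalent code with the same length, size and minimum distance.
    p = [6, 0, 1, 2, 3, 4, 5]    # an arbitrary permutation of {0, ..., 6}
    C1 = [np.mod(np.array(m) @ G, 2) for m in product([0, 1], repeat=G.shape[0])]
    C2 = [c[p] for c in C1]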

Lemma: Any linear code is permutation equivalent to a code which is in standard form.

Examples

Some examples of linear codes include:

   * Hamming codes, such as the (7,4) code discussed above
   * Reed-Solomon codes, used for example on compact discs (see Uses below)

Uses

Binary linear codes (see the formal definition above) are ubiquitous in electronic devices and digital storage media. For example, the Reed-Solomon code is used to store digital data on a compact disc.

Footnotes

  1. ^ Thomas M. Cover and Joy A. Thomas (1991). Elements of Information Theory. John Wiley & Sons, Inc. pp. 210–211. ISBN 0-471-06259-6.