# Shannon–Fano–Elias coding


In information theory, Shannon–Fano–Elias coding is a precursor to arithmetic coding, in which probabilities are used to determine codewords.[1]

## Algorithm description

Given a discrete random variable X of ordered values to be encoded, let ${\displaystyle p(x)}$ be the probability for any x in X. Define a function

${\displaystyle {\bar {F}}(x)=\sum _{x_{i}<x}p(x_{i})+{\frac {1}{2}}p(x)}$

Algorithm:

For each x in X:

1. Let Z be the binary expansion of ${\displaystyle {\bar {F}}(x)}$.
2. Choose the length of the encoding of x, ${\displaystyle L(x)}$, to be the integer ${\displaystyle \left\lceil \log _{2}{\frac {1}{p(x)}}\right\rceil +1}$.
3. Choose the encoding of x, ${\displaystyle code(x)}$, to be the first ${\displaystyle L(x)}$ most significant bits after the binary point of Z.
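The steps above can be sketched in Python. Exact rational arithmetic (`Fraction`) is used because floating-point cumulative sums can flip a truncated bit; the function name `sfe_code` and the input convention (an ordered mapping of symbols to probabilities) are illustrative, not part of the original presentation:

```python
from fractions import Fraction
import math

def sfe_code(probs):
    """Shannon-Fano-Elias encoder (a sketch).

    `probs` maps each symbol, in the fixed source order, to its
    probability as a Fraction.
    """
    codes = {}
    cum = Fraction(0)                  # sum of probabilities of earlier symbols
    for sym, p in probs.items():
        fbar = cum + p / 2             # modified CDF, F-bar(sym)
        length = math.ceil(math.log2(1 / p)) + 1   # L(sym)
        bits, frac = "", fbar
        for _ in range(length):        # first L(sym) bits of the expansion
            frac *= 2
            bits += "1" if frac >= 1 else "0"
            frac -= int(frac)
        codes[sym] = bits
        cum += p
    return codes
```

Applied to the distribution of the next section, this reproduces the codewords derived there by hand.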

### Example

Let X = {A, B, C, D}, with probabilities p = {1/3, 1/4, 1/6, 1/4}.

For A
${\displaystyle {\bar {F}}(A)={\frac {1}{2}}p(A)={\frac {1}{2}}\cdot {\frac {1}{3}}=0.1666...}$
In binary, Z(A) = 0.0010101010...
L(A) = ${\displaystyle \left\lceil \log _{2}{\frac {1}{\frac {1}{3}}}\right\rceil +1}$ = 3
code(A) is 001
For B
${\displaystyle {\bar {F}}(B)=p(A)+{\frac {1}{2}}p(B)={\frac {1}{3}}+{\frac {1}{2}}\cdot {\frac {1}{4}}=0.4583333...}$
In binary, Z(B) = 0.01110101010101...
L(B) = ${\displaystyle \left\lceil \log _{2}{\frac {1}{\frac {1}{4}}}\right\rceil +1}$ = 3
code(B) is 011
For C
${\displaystyle {\bar {F}}(C)=p(A)+p(B)+{\frac {1}{2}}p(C)={\frac {1}{3}}+{\frac {1}{4}}+{\frac {1}{2}}\cdot {\frac {1}{6}}=0.66666...}$
In binary, Z(C) = 0.101010101010...
L(C) = ${\displaystyle \left\lceil \log _{2}{\frac {1}{\frac {1}{6}}}\right\rceil +1}$ = 4
code(C) is 1010
For D
${\displaystyle {\bar {F}}(D)=p(A)+p(B)+p(C)+{\frac {1}{2}}p(D)={\frac {1}{3}}+{\frac {1}{4}}+{\frac {1}{6}}+{\frac {1}{2}}\cdot {\frac {1}{4}}=0.875}$
In binary, Z(D) = 0.111
L(D) = ${\displaystyle \left\lceil \log _{2}{\frac {1}{\frac {1}{4}}}\right\rceil +1}$ = 3
code(D) is 111
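Because the resulting code is prefix-free, the example codewords can be decoded greedily from a concatenated bitstream. A minimal sketch (the helper name `sfe_decode` and the inverted code table are illustrative, not from the source):

```python
# Inverted code table from the worked example: codeword -> symbol.
table = {"001": "A", "011": "B", "1010": "C", "111": "D"}

def sfe_decode(bitstream, table):
    """Decode a prefix-free bitstream by greedy codeword matching."""
    out, buf = [], ""
    for bit in bitstream:
        buf += bit
        if buf in table:          # prefix property: the first match is correct
            out.append(table[buf])
            buf = ""
    return "".join(out)           # any trailing partial codeword is dropped

print(sfe_decode("011001111", table))  # "011"+"001"+"111" -> "BAD"
```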

## Algorithm analysis

### Prefix code

Shannon–Fano–Elias coding produces a binary prefix code, allowing for direct decoding.

Let bcode(x) be the rational number formed by placing a binary point before a binary code. For example, if code(C) = 1010 then bcode(C) = 0.1010 (in binary). For all x, if no y ≠ x exists such that

${\displaystyle bcode(x)\leq bcode(y)<bcode(x)+2^{-L(x)}}$

then all the codes form a prefix code.

By comparing ${\displaystyle {\bar {F}}}$ to the CDF of X, this property may be demonstrated graphically for Shannon–Fano–Elias coding.

By definition of L it follows that

${\displaystyle 2^{-L(x)}\leq {\frac {1}{2}}p(x)}$
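This bound is a one-line consequence of the ceiling in the definition of ${\displaystyle L(x)}$:

```latex
2^{-L(x)} = 2^{-\left\lceil \log_2 \frac{1}{p(x)} \right\rceil - 1}
          \leq 2^{-\log_2 \frac{1}{p(x)} - 1}
          = \frac{1}{2}\,p(x)
```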

Because the bits after the first L(y) are truncated from ${\displaystyle {\bar {F}}(y)}$ to form code(y), it follows that

${\displaystyle {\bar {F}}(y)-bcode(y)\leq 2^{-L(y)}}$

thus, for every y that follows x in the ordering, bcode(y) must be no less than the CDF of X evaluated at x.

Combining these bounds gives ${\displaystyle bcode(y)-bcode(x)\geq {\frac {1}{2}}p(x)\geq 2^{-L(x)}}$, therefore the prefix property holds.
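As a concrete check, no codeword from the example is a prefix of another. A small sketch (`is_prefix_free` is an illustrative helper, not part of the original presentation):

```python
def is_prefix_free(codewords):
    """Return True when no codeword is a prefix of a different codeword."""
    for i, a in enumerate(codewords):
        for j, b in enumerate(codewords):
            if i != j and b.startswith(a):
                return False
    return True

print(is_prefix_free(["001", "011", "1010", "111"]))  # True
```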

### Code length

The average code length is ${\displaystyle LC(X)=\sum _{x\in X}p(x)L(x)=\sum _{x\in X}p(x)\left(\left\lceil \log _{2}{\frac {1}{p(x)}}\right\rceil +1\right)}$.
Thus for H(X), the entropy of the random variable X,

${\displaystyle H(X)+1\leq LC(X)<H(X)+2}$

Shannon–Fano–Elias coding therefore uses between 1 and 2 more bits per symbol than the entropy of X, so the code is not used in practice.
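For the example distribution, both quantities can be computed directly and checked against the bound (a sketch; the variable names are illustrative):

```python
import math

# Probabilities and code lengths from the worked example.
p = {"A": 1/3, "B": 1/4, "C": 1/6, "D": 1/4}
L = {"A": 3, "B": 3, "C": 4, "D": 3}

avg_len = sum(p[x] * L[x] for x in p)                    # LC(X)
entropy = -sum(px * math.log2(px) for px in p.values())  # H(X)

print(f"H(X) = {entropy:.3f} bits, LC(X) = {avg_len:.3f} bits")
assert entropy + 1 <= avg_len < entropy + 2              # H(X)+1 <= LC(X) < H(X)+2
```

Here LC(X) = 19/6 ≈ 3.17 bits against an entropy of roughly 1.96 bits, illustrating the 1-to-2-bit overhead.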

## References

1. ^ T. M. Cover and Joy A. Thomas (2006). Elements of information theory (2nd ed.). John Wiley and Sons. pp. 127–128. ISBN 978-0-471-24195-9.