# Berlekamp–Welch algorithm

The Berlekamp–Welch algorithm, also known as the Welch–Berlekamp algorithm, is named for Elwyn R. Berlekamp and Lloyd R. Welch. The algorithm efficiently corrects errors in BCH codes and Reed–Solomon codes (which are a subset of BCH codes). Unlike many other decoding algorithms, such as the Berlekamp–Massey algorithm, which uses syndrome decoding and the dual of the code, the Berlekamp–Welch algorithm decodes Reed–Solomon codes using only the generator matrix, without syndromes.

## History on decoding Reed–Solomon codes

1. In 1960, Peterson came up with an algorithm for decoding BCH codes.[1][2] His algorithm solves the important second stage of the generalized BCH decoding procedure and is used to calculate the error locator polynomial coefficients that in turn provide the error locator polynomial. This is crucial to the decoding of BCH codes.
2. In 1963, Gorenstein and Zierler observed that BCH codes and Reed–Solomon codes have a common generalization and that the decoding algorithm extends to the more general situation.
3. In 1968–69, Elwyn Berlekamp invented an algorithm for decoding BCH codes. James Massey recognized its application to linear feedback shift registers and simplified the algorithm.[3][4] Massey termed the algorithm the LFSR Synthesis Algorithm (Berlekamp Iterative Algorithm), but it is now known as the Berlekamp–Massey algorithm.
4. In 1986, the Welch–Berlekamp algorithm was developed to solve the decoding equation of Reed–Solomon codes, using a fast method to solve a certain polynomial equation. The Berlekamp–Welch algorithm has a running time of ${\displaystyle {\mathcal {O}}(n^{3})}$, where ${\displaystyle n}$ is the block length. The following sections follow Gemmell and Sudan's exposition of the Berlekamp–Welch algorithm.[5]

## Error locator polynomial of Reed–Solomon codes

In the problem of decoding Reed–Solomon codes, the inputs are pairwise distinct evaluation points ${\displaystyle \alpha _{1},\cdots ,\alpha _{n}}$ where ${\displaystyle \alpha _{i}\in \mathbb {F} }$, the dimension ${\displaystyle k}$ and distance ${\displaystyle d=n-k+1}$ of the code, and a received word ${\displaystyle y=(y_{1},\cdots ,y_{n})\in \mathbb {F} ^{n}.}$ Our goal is to describe an algorithm that can correct ${\displaystyle e<{\tfrac {1}{2}}(n-k+1)}$ errors in polynomial time. To do so we have to find ${\displaystyle P\in \mathbb {F} [X]}$ such that ${\displaystyle \deg(P)<k}$ and the number of indices for which ${\displaystyle P(\alpha _{i})\neq y_{i}}$ is less than or equal to ${\displaystyle e.}$ We can assume that there exists a polynomial ${\displaystyle P}$ such that

${\displaystyle \Delta (y,(P(\alpha _{i}))_{i=1}^{n})\leqslant e<{\tfrac {d}{2}}={\tfrac {1}{2}}(n-k+1).}$

Note that the coefficients of ${\displaystyle P}$ are the encoded information. To solve this, we use an indicator for those indices where an error may have occurred. Thus we define an error locator polynomial, ${\displaystyle E\in \mathbb {F} [X],}$ by:

${\displaystyle E(X)=\prod _{1\leqslant i\leqslant n \atop y_{i}\neq P(\alpha _{i})}(X-\alpha _{i})}$

Note that ${\displaystyle \deg(E)\leqslant {\tfrac {1}{2}}(n-k).}$ We can also claim that ${\displaystyle y_{i}E(\alpha _{i})=P(\alpha _{i})E(\alpha _{i})}$ holds for all ${\displaystyle 1\leqslant i\leqslant n}$. If ${\displaystyle y_{i}\neq P(\alpha _{i})}$, both sides of the equation vanish because ${\displaystyle E(\alpha _{i})=0}$; if ${\displaystyle y_{i}=P(\alpha _{i})}$, the two sides are identical.
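The identity can be checked numerically on a small instance. The following is a minimal sketch, assuming arithmetic modulo the prime 7, six evaluation points, and a message polynomial chosen only for illustration; the helper names `poly_eval` and `error_locator` are hypothetical and not part of the original exposition.

```python
# Toy field GF(7); all names and values here are illustrative assumptions.
P_MOD = 7

def poly_eval(coeffs, x, p=P_MOD):
    """Evaluate a polynomial (coefficients in ascending order) at x modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def error_locator(alphas, ys, message, p=P_MOD):
    """Return E(X), the product of (X - alpha_i) over positions with y_i != P(alpha_i)."""
    e_poly = [1]
    for a, y in zip(alphas, ys):
        if poly_eval(message, a, p) != y % p:
            shifted = [0] + e_poly                         # X * E(X)
            scaled = [(-a * c) % p for c in e_poly] + [0]  # -a * E(X)
            e_poly = [(s + t) % p for s, t in zip(shifted, scaled)]
    return e_poly

alphas = [1, 2, 3, 4, 5, 6]
message = [2, 3]                                  # P(X) = 2 + 3X, so k = 2
received = [poly_eval(message, a) for a in alphas]
received[1] = (received[1] + 1) % P_MOD           # corrupt the value at alpha = 2
received[4] = (received[4] + 3) % P_MOD           # corrupt the value at alpha = 5

E = error_locator(alphas, received, message)      # (X - 2)(X - 5) modulo 7
identity_holds = all(
    (y * poly_eval(E, a)) % P_MOD
    == (poly_eval(message, a) * poly_eval(E, a)) % P_MOD
    for a, y in zip(alphas, received)
)
print(E, identity_holds)
```

Here the locator is built with knowledge of the message, which a real decoder does not have; the sketch only illustrates why the identity holds at every position.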

However, since ${\displaystyle E}$ and ${\displaystyle P}$ are both unknown, the main task of the decoding algorithm is to find ${\displaystyle P}$. To do this we use a seemingly useless yet very powerful method and define another polynomial ${\displaystyle Q=PE.}$ The motivation is that the ${\displaystyle n}$ equations in ${\displaystyle e+k}$ unknowns that we need to solve are quadratic, because they contain products of the unknown coefficients of ${\displaystyle P}$ and ${\displaystyle E}$. By treating each product of two unknowns as a single new unknown, we increase the number of unknowns but make the equations linear. This method is called linearization[6] and is a very powerful tool.

Thus ${\displaystyle Q\in \mathbb {F} [X]}$ has the following properties:

1. ${\displaystyle \deg(Q)\leqslant {\tfrac {1}{2}}(n-k)+k-1}$
2. ${\displaystyle Q(\alpha _{i})=E(\alpha _{i})y_{i},\qquad 1\leqslant i\leqslant n}$

This helps because if we now manage to find ${\displaystyle Q}$ and ${\displaystyle E}$, we can easily find ${\displaystyle P}$ using ${\displaystyle P={\tfrac {Q}{E}}}$. The main purpose of the Berlekamp–Welch algorithm is to find ${\displaystyle P}$ using the degree-bounded polynomials ${\displaystyle Q}$ and ${\displaystyle E}$ and the properties of ${\displaystyle E}$ and ${\displaystyle Q}$.
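To make the linearization concrete, the condition on ${\displaystyle Q}$ can be written out coefficient by coefficient. The sketch below assumes that ${\displaystyle E}$ has degree exactly ${\displaystyle e}$ with leading coefficient 1; the coefficient names ${\displaystyle q_{j}}$ and ${\displaystyle e_{j}}$ are introduced here only for illustration.

```latex
E(X) = X^{e} + \sum_{j=0}^{e-1} e_j X^{j}, \qquad
Q(X) = \sum_{j=0}^{e+k-1} q_j X^{j}

% Substituting into Q(\alpha_i) = y_i E(\alpha_i) gives, for each i = 1, \dots, n:
\sum_{j=0}^{e+k-1} q_j \alpha_i^{j} \;-\; y_i \sum_{j=0}^{e-1} e_j \alpha_i^{j} \;=\; y_i \alpha_i^{e}
```

Every term on the left is a known field element multiplied by a single unknown, so the ${\displaystyle n}$ conditions form a linear system in the ${\displaystyle 2e+k}$ unknowns ${\displaystyle q_{0},\cdots ,q_{e+k-1},e_{0},\cdots ,e_{e-1}}$.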

Computing ${\displaystyle E}$ is as hard as finding the final solution ${\displaystyle P}$: once ${\displaystyle E}$ is computed, we can easily recover ${\displaystyle P}$ using erasure decoding for Reed–Solomon codes. In some cases, even the polynomial ${\displaystyle Q}$ is as hard to find as ${\displaystyle E}$. For example, given ${\displaystyle Q}$ and ${\displaystyle y}$ (such that ${\displaystyle y_{i}\neq 0}$ for ${\displaystyle 1\leqslant i\leqslant n}$), we can find the error locations by checking the positions where ${\displaystyle Q(\alpha _{i})=0}$. Thus the algorithm works on the principle that while the polynomials ${\displaystyle E}$ and ${\displaystyle Q}$ are each hard to find individually, computing them together is much easier.

## The Berlekamp–Welch decoder and algorithm

The Welch–Berlekamp decoder for Reed–Solomon codes consists of the Welch–Berlekamp algorithm augmented by some additional steps that prepare the received word for the algorithm and interpret the result of the algorithm.

The inputs given to the Berlekamp–Welch decoder are the block length ${\displaystyle n,}$ the number of errors ${\displaystyle e}$ such that ${\displaystyle e<{\tfrac {1}{2}}(n-k+1),}$ and the received word ${\displaystyle (y_{i},\alpha _{i})_{i=1}^{n}}$, satisfying the condition that there exists at most one polynomial ${\displaystyle P}$ with ${\displaystyle \deg(P)\leqslant k-1}$ and ${\displaystyle \Delta (y,P(\alpha _{i})_{i})\leqslant e}$.

The output of the decoder is either the polynomial ${\displaystyle P}$, or in some cases, a failure. This decoder functions in two steps as follows:

1. The first step is the interpolation step, in which the decoder computes a nonzero polynomial ${\displaystyle E}$ of degree ${\displaystyle e}$, normalized so that the coefficient of ${\displaystyle X^{e}}$ is 1[7], and another polynomial ${\displaystyle Q}$ with ${\displaystyle \deg(Q)\leqslant e+k-1,}$ such that the condition ${\displaystyle y_{i}E(\alpha _{i})=Q(\alpha _{i})}$ holds for all ${\displaystyle 1\leqslant i\leqslant n.}$ If polynomials satisfying this condition cannot be computed, the decoder outputs a failure.
2. If ${\displaystyle E\mid Q}$, then ${\displaystyle P}$ is defined as ${\displaystyle {\tfrac {Q}{E}}.}$ If ${\displaystyle \Delta (y,(P(\alpha _{i}))_{i})\leqslant e,}$ then the decoder outputs ${\displaystyle P.}$ If either condition fails, i.e. if ${\displaystyle E\nmid Q}$ or the resulting ${\displaystyle P}$ disagrees with ${\displaystyle y}$ in more than ${\displaystyle e}$ positions, the decoder returns a failure.
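The two steps above can be sketched in code. This is a minimal illustration rather than a reference implementation: it assumes a prime field GF(p) represented by integers modulo `p`, polynomials stored as ascending coefficient lists, and Python 3.8 or later for modular inverses via `pow(x, -1, p)`. All function names are hypothetical.

```python
def poly_eval(coeffs, x, p):
    """Evaluate a polynomial (ascending coefficients) at x modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def solve_mod(A, b, p):
    """Gauss-Jordan elimination over GF(p); returns one solution or None."""
    rows, cols = len(A), len(A[0])
    M = [[v % p for v in row] + [rhs % p] for row, rhs in zip(A, b)]
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if M[i][c]), None)
        if pr is None:
            continue                      # no pivot here: free variable
        M[r], M[pr] = M[pr], M[r]
        inv = pow(M[r][c], -1, p)
        M[r] = [(v * inv) % p for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(vi - f * vr) % p for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    if any(M[i][cols] for i in range(r, rows)):
        return None                       # inconsistent system
    x = [0] * cols                        # free variables are set to 0
    for i, c in enumerate(pivots):
        x[c] = M[i][cols]
    return x

def poly_divmod(num, den, p):
    """Long division over GF(p); den must have a nonzero leading coefficient."""
    num = [c % p for c in num]
    quot = [0] * max(len(num) - len(den) + 1, 1)
    inv_lead = pow(den[-1], -1, p)
    for shift in range(len(num) - len(den), -1, -1):
        factor = (num[shift + len(den) - 1] * inv_lead) % p
        quot[shift] = factor
        for j, d in enumerate(den):
            num[shift + j] = (num[shift + j] - factor * d) % p
    return quot, num                      # num now holds the remainder

def berlekamp_welch(alphas, ys, k, e, p):
    """Return the coefficients of P, or None on failure."""
    # Step 1: unknowns q_0..q_{e+k-1}, then e_0..e_{e-1}; E is monic of degree e.
    A, b = [], []
    for a, y in zip(alphas, ys):
        row = [pow(a, j, p) for j in range(e + k)]
        row += [(-y * pow(a, j, p)) % p for j in range(e)]
        A.append(row)
        b.append((y * pow(a, e, p)) % p)
    sol = solve_mod(A, b, p)
    if sol is None:
        return None
    Q, E = sol[:e + k], sol[e + k:] + [1]
    # Step 2: divide and check the number of disagreements.
    P, rem = poly_divmod(Q, E, p)
    if any(rem):
        return None
    errors = sum(1 for a, y in zip(alphas, ys) if poly_eval(P, a, p) != y % p)
    return P if errors <= e else None

# Toy run over GF(7): message 2 + 3X, n = 6, k = 2, two corrupted positions.
p = 7
alphas = [1, 2, 3, 4, 5, 6]
message = [2, 3]
received = [poly_eval(message, a, p) for a in alphas]
received[1] = (received[1] + 1) % p
received[4] = (received[4] + 3) % p
decoded = berlekamp_welch(alphas, received, k=2, e=2, p=p)
print(decoded)
```

Fixing the leading coefficient of `E` to 1 moves the term `y * a**e` to the right-hand side, which is why the system has `2e + k` unknowns in this sketch. Free variables are set to zero when the system is underdetermined, which is one arbitrary choice among the valid solutions.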

Whenever the algorithm does not output a failure, it outputs a ${\displaystyle P}$ that is the correct and desired polynomial. To prove that the algorithm always outputs the desired polynomial, we need to prove a few claims made while describing the algorithm.

Claim 1. There exists a pair of polynomials, ${\displaystyle (E,Q),}$ that satisfies Step 1 of the BW algorithm and ${\displaystyle {\tfrac {Q}{E}}=P.}$

Let ${\displaystyle E}$ be the error-locating polynomial for ${\displaystyle P}$ :

${\displaystyle E(X)=X^{e-\Delta (y,P(\alpha _{i})_{i})}\prod _{1\leqslant i\leqslant n \atop y_{i}\neq P(\alpha _{i})}\left(X-\alpha _{i}\right)}$

Notice that ${\displaystyle E}$ has the following properties by definition:

${\displaystyle \deg(E)=e,\qquad y_{i}\neq P(\alpha _{i})\Rightarrow E(\alpha _{i})=0.}$

Now define ${\displaystyle Q=PE}$ and note that:

${\displaystyle \deg(Q)\leqslant \deg(P)+\deg(E)\leqslant e+k-1.}$

We can now claim that ${\displaystyle y_{i}E(\alpha _{i})=Q(\alpha _{i})}$ from the first step of the BW algorithm holds. If ${\displaystyle E(\alpha _{i})=0,}$ then ${\displaystyle Q(\alpha _{i})=P(\alpha _{i})E(\alpha _{i})=y_{i}E(\alpha _{i})=0}$. For ${\displaystyle E(\alpha _{i})\neq 0}$ we have ${\displaystyle P(\alpha _{i})=y_{i}}$ and therefore ${\displaystyle Q(\alpha _{i})=P(\alpha _{i})E(\alpha _{i})=y_{i}E(\alpha _{i}),}$ just as we claimed.

The above claim only shows that there exists a pair of polynomials ${\displaystyle E}$ and ${\displaystyle Q}$ such that ${\displaystyle P={\tfrac {Q}{E}}.}$ It does not guarantee that the algorithm actually outputs such a pair. The next claim establishes this and, together with the first claim, proves the correctness of the algorithm.

Claim 2. If ${\displaystyle (E_{1},Q_{1}),(E_{2},Q_{2})}$ are two distinct solutions that satisfy the first step of the Berlekamp–Welch algorithm, then we have ${\displaystyle {\tfrac {Q_{1}}{E_{1}}}={\tfrac {Q_{2}}{E_{2}}}.}$

First note that

${\displaystyle \deg(Q_{1}E_{2}),\deg(Q_{2}E_{1})\leqslant 2e+k-1.}$

Then we define:

${\displaystyle R:=Q_{1}E_{2}-Q_{2}E_{1}}$

Note that ${\displaystyle \deg(R)\leqslant 2e+k-1.}$ From step 1 of the Berlekamp Welch algorithm we also know that ${\displaystyle y_{i}E_{1}(\alpha _{i})=Q_{1}(\alpha _{i})}$ and ${\displaystyle y_{i}E_{2}(\alpha _{i})=Q_{2}(\alpha _{i}).}$ Now for all ${\displaystyle i\in \{1,\cdots ,n\}}$ we calculate:

${\displaystyle {\begin{aligned}R(\alpha _{i})&=Q_{1}(\alpha _{i})E_{2}(\alpha _{i})-Q_{2}(\alpha _{i})E_{1}(\alpha _{i})\\&=y_{i}E_{1}(\alpha _{i})E_{2}(\alpha _{i})-y_{i}E_{2}(\alpha _{i})E_{1}(\alpha _{i})\\&=0\end{aligned}}}$

Thus ${\displaystyle R}$ has ${\displaystyle n}$ roots. On the other hand,

${\displaystyle \deg(R(X))\leqslant 2e+k-1<2\cdot {\tfrac {1}{2}}(n-k+1)+k-1=n.}$

Therefore, ${\displaystyle R}$ is the zero polynomial which means that ${\displaystyle Q_{1}E_{2}}$ and ${\displaystyle Q_{2}E_{1}}$ are identical. Since ${\displaystyle E_{1},E_{2}}$ are non-zero we can write: ${\displaystyle {\tfrac {Q_{1}}{E_{1}}}={\tfrac {Q_{2}}{E_{2}}}}$ as per our initial claim.
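Claim 2 can be illustrated on a small case in which Step 1 has more than one solution. The sketch below assumes the field GF(7), a single error with ${\displaystyle e=2}$, and two hand-picked error locators that both vanish at the corrupted position; the names `poly_mul` and `satisfies_step1` are hypothetical helpers.

```python
# Toy field GF(7); all names and values here are illustrative assumptions.
P_MOD = 7

def poly_eval(coeffs, x, p=P_MOD):
    """Evaluate a polynomial (ascending coefficients) at x modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def poly_mul(f, g, p=P_MOD):
    """Multiply two polynomials with ascending coefficients modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def satisfies_step1(E, Q, alphas, ys, p=P_MOD):
    """Check y_i * E(alpha_i) == Q(alpha_i) at every evaluation point."""
    return all((y * poly_eval(E, a, p)) % p == poly_eval(Q, a, p)
               for a, y in zip(alphas, ys))

alphas = [1, 2, 3, 4, 5, 6]
message = [2, 3]                                  # P(X) = 2 + 3X, k = 2
received = [poly_eval(message, a) for a in alphas]
received[1] = (received[1] + 1) % P_MOD           # single error at alpha = 2

# Two different monic degree-2 locators, both with a root at alpha = 2.
E1 = poly_mul([0, 1], [5, 1])                     # X * (X - 2)
E2 = poly_mul([5, 1], [4, 1])                     # (X - 2) * (X - 3)
Q1 = poly_mul(message, E1)
Q2 = poly_mul(message, E2)

both_valid = (satisfies_step1(E1, Q1, alphas, received)
              and satisfies_step1(E2, Q2, alphas, received))
cross_equal = poly_mul(Q1, E2) == poly_mul(Q2, E1)  # R = Q1*E2 - Q2*E1 is zero
print(both_valid, cross_equal)
```

Both pairs here are constructed from the known message, so the equality of the cross products is expected; the point of the sketch is only that distinct valid pairs exist and yield the same quotient.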

Thus, based on the above claims, whenever the Berlekamp–Welch algorithm outputs a polynomial ${\displaystyle P(X)}$, that polynomial is correct.

We can now show that the algorithm can be implemented with a running time of ${\displaystyle O(n^{3})}$. In Step 1 of the algorithm, the polynomials ${\displaystyle Q}$ and ${\displaystyle E}$ have ${\displaystyle e+k}$ and ${\displaystyle e+1}$ unknown coefficients respectively, and each constraint ${\displaystyle y_{i}E(\alpha _{i})=Q(\alpha _{i})}$, for ${\displaystyle 1\leqslant i\leqslant n}$, is a linear equation in these unknowns. We therefore get a system of ${\displaystyle n}$ linear equations in ${\displaystyle 2e+k+1}$ unknowns (one of which is fixed once the coefficient of ${\displaystyle X^{e}}$ in ${\displaystyle E}$ is set to 1). By Claim 1, this system has a solution with ${\displaystyle \deg(E)=e.}$ It can be solved in ${\displaystyle O(n^{3})}$ time by, say, Gaussian elimination. Finally, Step 2 of the algorithm can also be implemented in ${\displaystyle O(n^{3})}$ time by polynomial long division. Hence the Berlekamp–Welch algorithm can be used to uniquely decode any ${\displaystyle [n,k]_{q}}$ Reed–Solomon code in ${\displaystyle O(n^{3})}$ time for fewer than ${\displaystyle {\tfrac {1}{2}}(n-k+1)}$ errors.

## Example

The error locator polynomial serves to "neutralize" errors in the received values by making Q zero at those points, so that the system of linear equations is not affected by the inaccuracy in the input.

Consider a simple example where a redundant set of points is used to represent the line ${\displaystyle y=5-x}$, and one of the points is incorrect. The points that the algorithm gets as an input are ${\displaystyle (1,4),(2,3),(3,4),(4,1)}$, where ${\displaystyle (3,4)}$ is the defective point. The algorithm must solve the following system of equations:

${\displaystyle {\begin{aligned}Q(1)&=4\cdot E(1)\\Q(2)&=3\cdot E(2)\\Q(3)&=4\cdot E(3)\\Q(4)&=1\cdot E(4)\end{aligned}}}$

Given a solution pair ${\displaystyle (Q,E)}$ to this system of equations, it is evident that at any of the points ${\displaystyle x=1,2,3,4}$ one of the following must be true:

${\displaystyle Q(\alpha _{i})=E(\alpha _{i})=0,\quad {\text{or}}\quad P(\alpha _{i})={\frac {Q(\alpha _{i})}{E(\alpha _{i})}}=y_{i}.}$

Since ${\displaystyle E}$ has degree one, the former can hold at only one point. Therefore, ${\displaystyle P(\alpha _{i})=y_{i}}$ at the three other points.

Letting ${\displaystyle E(x)=x+e_{0}}$ and ${\displaystyle Q=q_{0}+q_{1}x+q_{2}x^{2}}$ we can rewrite the system:

${\displaystyle {\begin{cases}q_{0}+q_{1}+q_{2}-4e_{0}-4=0\\q_{0}+2q_{1}+4q_{2}-3e_{0}-6=0\\q_{0}+3q_{1}+9q_{2}-4e_{0}-12=0\\q_{0}+4q_{1}+16q_{2}-e_{0}-4=0\end{cases}}}$

This system can be solved through Gaussian elimination, and gives the values:

${\displaystyle q_{0}=-15,q_{1}=8,q_{2}=-1,e_{0}=-3}$

Thus:

${\displaystyle Q=-x^{2}+8x-15,E=x-3,\quad {\text{and}}\quad {\frac {Q}{E}}=P=5-x.}$

The polynomial ${\displaystyle 5-x}$ agrees with three of the four given points. Since at most one error was assumed, it is taken to be the original polynomial.
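The arithmetic of this example can be reproduced with exact rational numbers. The following is a minimal sketch, assuming the unknowns are ordered as ${\displaystyle q_{0},q_{1},q_{2},e_{0}}$ and that the system has a unique solution; `solve_exact` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction

def solve_exact(A, b):
    """Gauss-Jordan elimination in exact arithmetic; assumes a unique solution."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        pr = next(i for i in range(c, n) if M[i][c] != 0)   # find a pivot row
        M[c], M[pr] = M[pr], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for i in range(n):
            if i != c and M[i][c] != 0:
                f = M[i][c]
                M[i] = [vi - f * vc for vi, vc in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]

points = [(1, 4), (2, 3), (3, 4), (4, 1)]

# With E(x) = x + e0 and Q(x) = q0 + q1 x + q2 x^2, the condition
# Q(x) = y * E(x) becomes q0 + q1 x + q2 x^2 - y e0 = y x.
A = [[1, x, x * x, -y] for x, y in points]
b = [y * x for x, y in points]
q0, q1, q2, e0 = solve_exact(A, b)

# Divide Q by the monic linear polynomial E(x) = x + e0.
p1 = q2
p0 = q1 - e0 * p1
remainder = q0 - e0 * p0

print(q0, q1, q2, e0)     # coefficients of Q and E
print(p0, p1, remainder)  # P(x) = p0 + p1 x, and the division remainder
```

The remainder being zero confirms that ${\displaystyle E}$ divides ${\displaystyle Q}$, and the quotient coefficients give ${\displaystyle P}$.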