Binary Reed–Solomon encoding

Binary Reed–Solomon (BRS) coding, a member of the Reed–Solomon (RS) family of codes, is an erasure-coding method that can recover data lost to node failures in a distributed storage environment. It is a maximum distance separable (MDS) code. Its encoding and decoding rates are reported to exceed those of conventional RS coding and of optimized Cauchy Reed–Solomon (CRS) coding.

Background

RS coding is a fault-tolerant encoding method for distributed storage environments. Suppose we wish to distribute data across $k$ individual devices for improved storage capacity or bandwidth, for example in a hardware RAID setup. Such a configuration risks significant data loss in the event of device failure. Reed–Solomon encoding produces a storage coding system that is robust to the simultaneous failure of any subset of $m$ nodes. To achieve this, we add $m$ additional nodes to the system, for a total of $n = k + m$ storage nodes.

The traditional RS encoding method uses a Vandermonde matrix as the coding matrix and its inverse as the decoding matrix. Traditional RS encoding and decoding operations are all carried out over a large finite field.

Because BRS encoding and decoding use only shift and XOR operations, they are much faster than traditional RS coding. The BRS coding algorithm was proposed by the Advanced Network Technology Laboratory of Peking University, which also released an open-source implementation. In tests in a real environment, BRS encoding and decoding were faster than CRS. Using BRS coding in the design and implementation of a distributed storage system gives the system fault-tolerant regeneration capability.

Principle

BRS encoding principle

Traditional Reed–Solomon codes are constructed over finite fields, whereas the BRS code is built from shift and XOR operations. BRS encoding is based on the Vandermonde matrix; the encoding steps are as follows:

1. Divide the original data equally into $k$ blocks of $L$ bits each, denoted

$S=(s_{0},s_{1},...,s_{k-1})$

where $s_{i}=s_{i,0}s_{i,1}...s_{i,L-1}$ , $i=0,1,2,...,k-1$ .

2. Build the parity (calibration) data blocks $M$, of which there are $n-k$ in total:

$M=(m_{0},m_{1},...,m_{n-k-1})$

where $m_{i}=\sum _{j=0}^{k-1}s_{j}(r_{j}^{i})$ , $i=0,1,...,n-k-1$ .

All additions here are XOR operations, and $r_{j}^{i}$ is the number of "0" bits added to the front of the original data block $s_{j}$ before the blocks are combined to form the parity data block $m_{i}$. The values $r_{j}^{i}$ are given by:

$(r_{0}^{a},r_{1}^{a},...,r_{k-1}^{a})=(0,a,2a,...,(k-1)a)$

where $a=0,1,...,n-k-1$.

3. Store the data on the nodes: nodes $N_{i}$ $(i=0,1,...,n-1)$ store $s_{0},s_{1},...,s_{k-1},m_{0},m_{1},...,m_{n-k-1}$, respectively.
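The steps above can be sketched in Python. This is a minimal illustration, not the published implementation: blocks are modelled as lists of bits, the helper names `shift`, `xor_blocks` and `brs_encode` are hypothetical, and shorter blocks are assumed to be padded with trailing zeros before the XOR.

```python
def shift(block, r):
    """Return s(r): the block with r zero bits added to the front."""
    return [0] * r + list(block)


def xor_blocks(blocks):
    """Bitwise XOR of bit lists; shorter lists count as zero-padded at the end."""
    width = max(len(b) for b in blocks)
    out = [0] * width
    for b in blocks:
        for t, bit in enumerate(b):
            out[t] ^= bit
    return out


def brs_encode(data_blocks, n):
    """Compute the n - k parity blocks m_0, ..., m_{n-k-1}.

    data_blocks is a list of k bit lists, each of length L.
    Parity block m_a uses the shifts (0, a, 2a, ..., (k-1)a).
    """
    k = len(data_blocks)
    parity = []
    for a in range(n - k):
        shifted = [shift(s_j, j * a) for j, s_j in enumerate(data_blocks)]
        parity.append(xor_blocks(shifted))
    return parity


# Example with n = 6, k = 3, L = 6 (bit values chosen arbitrarily).
s = [
    [1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
]
m = brs_encode(s, n=6)
print([len(block) for block in m])  # [6, 8, 10], i.e. L + a*(k-1) bits
```

A production encoder would operate on machine words rather than individual bits; the bit lists here only keep the shift-and-XOR structure visible.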

BRS encoding example

Take $n=6$, $k=3$. The shift vectors $(r_{0}^{a},r_{1}^{a},r_{2}^{a})$ are then $ID_{0}=(0,0,0)$, $ID_{1}=(0,1,2)$, $ID_{2}=(0,2,4)$. The original data blocks are $s_{i}=s_{i,0}s_{i,1}...s_{i,L-1}$, where $i=0,1,...,k-1$. The parity data blocks are $m_{i}=m_{i,0}m_{i,1}...m_{i,L+i\times (k-1)-1}$, where $i=0,1,...,n-k-1$.

The parity data blocks are computed as follows, where $\oplus$ denotes bitwise XOR (the index ranges shown correspond to $L=6$):

$m_{0}=s_{0}(0)\oplus s_{1}(0)\oplus s_{2}(0)$ , so $m_{0}=(m_{0,0}m_{0,1}...m_{0,5})$

$m_{1}=s_{0}(0)\oplus s_{1}(1)\oplus s_{2}(2)$ , so $m_{1}=(m_{1,0}m_{1,1}...m_{1,7})$

$m_{2}=s_{0}(0)\oplus s_{1}(2)\oplus s_{2}(4)$ , so $m_{2}=(m_{2,0}m_{2,1}...m_{2,9})$
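As a worked illustration of one of these blocks, the bits of $m_{1}$ can be written out individually. This sketch assumes $L=6$, as implied by the index ranges above, and treats positions not covered by a shifted block as 0:

$m_{1,0}=s_{0,0}$

$m_{1,1}=s_{0,1}\oplus s_{1,0}$

$m_{1,t}=s_{0,t}\oplus s_{1,t-1}\oplus s_{2,t-2}$ for $t=2,3,4,5$

$m_{1,6}=s_{1,5}\oplus s_{2,4}$

$m_{1,7}=s_{2,5}$

The first and last bits of $m_{1}$ each expose a single original bit, which is what allows bit-by-bit decoding to get started.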

BRS decoding principle

In the construction of the BRS code, the original data is divided into $k$ blocks, $S=(s_{0},s_{1},...,s_{k-1})$. Encoding produces $n-k$ parity data blocks, $M=(m_{0},m_{1},...,m_{n-k-1})$.

Decoding has a necessary condition: the number of undamaged parity data blocks must be greater than or equal to the number of missing original data blocks; otherwise the data cannot be repaired.
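The condition can be checked mechanically. The following sketch assumes the node layout from the encoding steps (nodes $0,...,k-1$ hold original data, nodes $k,...,n-1$ hold parity data); the function name `meets_repair_condition` is hypothetical.

```python
def meets_repair_condition(n, k, lost_nodes):
    """Check the necessary condition for repair.

    Nodes 0..k-1 are assumed to hold original data blocks and
    nodes k..n-1 parity blocks, as in the encoding steps.
    """
    lost = set(lost_nodes)
    lost_data = sum(1 for i in lost if i < k)
    surviving_parity = sum(1 for i in range(k, n) if i not in lost)
    return surviving_parity >= lost_data


print(meets_repair_condition(6, 3, {1, 2}))        # True: 3 parity blocks left, 2 data blocks lost
print(meets_repair_condition(6, 3, {0, 1, 3, 4}))  # False: 1 parity block left, 2 data blocks lost
```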

The decoding process can be analyzed as follows.

Without loss of generality, let $n=6$, $k=3$. Then

$m_{0}=s_{0}+s_{1}+s_{2}$

$m_{1}=s_{0}+xs_{1}+x^{2}s_{2}$

$m_{2}=s_{0}+x^{2}s_{1}+x^{4}s_{2}$
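One way to read the $x$ notation is to identify each block with a polynomial whose coefficients are its bits. The symbols $s_{j}(x)$ and $m_{a}(x)$ are introduced here only for illustration:

$s_{j}(x)=\sum _{t=0}^{L-1}s_{j,t}x^{t}$

Multiplying by $x^{r}$ moves every bit $r$ positions later, which is the same as adding $r$ zero bits to the front of the block. With coefficients added by XOR, the parity blocks are then

$m_{a}(x)=\sum _{j=0}^{k-1}x^{ja}s_{j}(x)$ , $a=0,1,...,n-k-1$ .

The coefficient matrix with entries $x^{ja}$ has the Vandermonde form, which is the sense in which BRS encoding is based on the Vandermonde matrix.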

Suppose $s_{0}$ is intact and $s_{1},s_{2}$ are missing. Choose $m_{1}$ and $m_{2}$ for the repair, and let

$m_{1}^{*}=m_{1}+s_{0}$

$m_{2}^{*}=m_{2}+s_{0}$

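Because $m_{1}$, $m_{2}$ and $s_{0}$ are known, $m_{1}^{*}$ and $m_{2}^{*}$ are also known. Since $m_{1}^{*}=xs_{1}+x^{2}s_{2}$ and $m_{2}^{*}=x^{2}s_{1}+x^{4}s_{2}$, comparing the bits at position $i$ gives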

$s_{1,i-2}=m_{2,i}^{*}+s_{2,i-4}$

$s_{2,i-2}=m_{1,i}^{*}+s_{1,i-1}$

Using these iterative formulas, each iteration determines two bit values, one bit of $s_{1}$ and one bit of $s_{2}$; bits whose index falls outside $0,1,...,L-1$ are taken to be 0. Since each original data block is $L$ bits long, repeating the iteration $L$ times recovers every unknown bit of the original data blocks. Other combinations of missing blocks are decoded by analogous reasoning.
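The iteration can be sketched in Python for this specific case ($k=3$, $s_{1}$ and $s_{2}$ missing, $m_{1}$ and $m_{2}$ available). This is an illustrative sketch under stated assumptions, not the published implementation: blocks are bit lists, the helper names are hypothetical, and the two recurrences are applied with staggered indices so that every bit on the right-hand side is already known when it is needed.

```python
def shifted_xor(blocks, shifts):
    """XOR the blocks after adding shifts[j] zero bits to the front of block j."""
    width = max(len(b) + r for b, r in zip(blocks, shifts))
    out = [0] * width
    for b, r in zip(blocks, shifts):
        for t, bit in enumerate(b):
            out[t + r] ^= bit
    return out


def decode_s1_s2(s0, m1, m2):
    """Recover s1 and s2 from s0, m1 and m2 for k = 3."""
    L = len(s0)
    # m1* = m1 + s0 and m2* = m2 + s0 cancel the known block s0.
    m1s = shifted_xor([m1, s0], [0, 0])
    m2s = shifted_xor([m2, s0], [0, 0])
    s1 = [0] * L
    s2 = [0] * L

    def bit(block, t):
        # Bits outside 0..L-1 are treated as 0.
        return block[t] if 0 <= t < L else 0

    # Start: m2*[2] = s1[0] + s2[-2] = s1[0].
    s1[0] = m2s[2]
    for t in range(L):
        if t + 1 < L:
            # s1[i-2] = m2*[i] + s2[i-4] with i = t + 3
            s1[t + 1] = m2s[t + 3] ^ bit(s2, t - 1)
        # s2[i-2] = m1*[i] + s1[i-1] with i = t + 2
        s2[t] = m1s[t + 2] ^ bit(s1, t + 1)
    return s1, s2


# Round trip with arbitrarily chosen bits (L = 6).
s0 = [1, 0, 1, 1, 0, 0]
s1 = [0, 1, 1, 0, 1, 0]
s2 = [1, 1, 0, 0, 0, 1]
m1 = shifted_xor([s0, s1, s2], [0, 1, 2])
m2 = shifted_xor([s0, s1, s2], [0, 2, 4])
assert decode_s1_s2(s0, m1, m2) == (s1, s2)
```

Each pass through the loop yields one bit of $s_{1}$ and one bit of $s_{2}$, so the loop runs $L$ times, matching the count given above.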

Performance

Some experiments show that, on a single-core processor, the BRS encoding rate is about 6 times the RS encoding rate and about 1.5 times the CRS encoding rate. This satisfies the target of improving encoding speed by at least 200% relative to RS encoding.

Under the same conditions, and for varying numbers of erasures, the BRS decoding rate is about 4 times that of RS coding and about 1.3 times that of CRS coding. This satisfies the target of improving decoding speed by 100% relative to RS coding.

Applications

Distributed systems are now in widespread use. Storing data with erasure codes at the lowest layer of a distributed storage system increases the system's fault tolerance. Compared with the traditional replication strategy, erasure coding can improve system reliability exponentially at the same level of redundancy.

BRS encoding can be applied to distributed storage systems; for example, it can serve as the underlying data encoding when using HDFS. Because of its performance advantages and the similarity of the encoding methods, BRS encoding can replace CRS encoding in distributed systems.

Usage

An open-source implementation of BRS encoding, written in C, is available on GitHub. In the design and implementation of a distributed storage system, BRS encoding can be used to store data and to provide the system's fault tolerance.
