Talk:Reed–Solomon error correction


able to correct twice as many erasures as errors?[edit]

"A Reed–Solomon code (like any linear code) is able to correct twice as many erasures as errors" I think it is true for Reed-Solomon but not for any linear code. This property is true for perfect codes ( Perfect codes are codes that reach the Hamming bound). I propose to change linear code into perfect codes. --Cunchem 15:23, 2 April 2009 (UTC)

Erasures[edit]

I wrote the text that said that a Reed-Solomon code is twice as powerful at erasure correction than at error correction. This text was correct, and I've reverted the page.

In this context, the word 'erasure' refers to a symbol in a particular location that is known (or is very likely) to be in error, but the actual error value isn't known. (This occurs because Reed-Solomon codes are non-binary; if we were talking about a single bit known to be in error, then it would be obvious how to fix it!)

With a Reed-Solomon decoder equipped for erasure correction, you can tell it which specific symbols you believe to be in error, and the decoder will compute the corresponding error values; i.e., it will correct the erasures. Why do this instead of just letting the decoder figure out which symbols are in error? Because the decoder can correct more errors this way. E.g., the (255,223) code heavily used in deep space communications has 32 parity symbols, so it can correct up to 16 symbol errors if the locations of those errors are not known. But if you know where two of those errors are, then the code can correct them *plus* 15 more errors in unknown locations, for a total of 17 corrections.

User: Karn
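A quick way to see the tradeoff Karn describes is to tabulate the decoding condition 2E + S ≤ n − k. Below is a minimal Python sketch; the (255,223) parameters are the ones mentioned above, and the loop values are just illustrative picks:

n, k = 255, 223          # the deep-space code mentioned above
parity = n - k           # 32 check symbols

# a pattern of E unknown-location errors plus S erasures is correctable
# whenever 2*E + S <= n - k
for erasures in (0, 2, 32):
    max_errors = (parity - erasures) // 2
    print(erasures, "erasures ->", max_errors, "errors,",
          max_errors + erasures, "corrupted symbols in total")
# prints 0 -> 16 (16 total), 2 -> 15 (17 total), 32 -> 0 (32 total),
# matching the figures in the comment above.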

Just a minor point: of course the concept of "erasure" is also meaningful for binary codes. This is because "erasure" does not mean that the location is known to be an error; it means that the location is known to be meaningless. Perhaps you lost the 3rd bit of your transmission because you accidentally - but knowingly - set it to 0. In this case, you don't know if there's an error at that location - but you know it has to be reconstructed. Selinger (talk) 02:13, 15 March 2008 (UTC)
I like in general the idea of explaining erasure as an error at a known location. However it is true that the usual meaning of error is that the symbol received is definitely not the symbol sent, hence in general gives some information about the symbol sent. In the case of a binary alphabet it gives all the information about the symbol. An erasure indicates only the position of the symbol but nothing about its value. But I'm not sure how to reword the definition without losing its simplicity. Ozga (talk) 16:43, 26 July 2008 (UTC)

Shortened DVD[edit]

I am new to Reed–Solomon. Recently I have been required to implement a Reed–Solomon code on FPGAs. This will be used in our DVD project. However, I have some questions puzzling me greatly. In DVD error correction coding, are RS(208,192,17) and RS(182,172,11) shortened or punctured Reed–Solomon codes? I have no way to differentiate the two. Meanwhile, I am eager to communicate and discuss ECC, especially Reed–Solomon codes, with others. My email is skycanny@hotmail.com

Regards!

Short Bursts?![edit]

From the article: "Viterbi decoders tend to produce errors in short bursts."

That's not good I guess. May someone comment on this statement? --Abdull 07:19, 16 October 2005 (UTC)

I added a link to the Viterbi decoder article from that sentence. The statement is true, but should probably be discussed in one of the Viterbi articles, and then cited from here. The following statement in the article that 'Correcting these burst errors is a job best done by short or simplified Reed-Solomon codes.' seems to deserve a citation or a bit more discussion. Edurant 13:25, 10 October 2006 (UTC)

Cross-interleaved coding[edit]

Is CIRC also used on DVDs?

Clarification on the Values Stored in Data Symbols[edit]

I came here to learn about Reed-Solomon codes, and I found the article to be very helpful, but there is one thing I do not understand: Are the data symbols coefficients of the polynomial or coordinates of the plotted points?

The article says "The coefficients of the polynomial are the data symbols of the block." That seems clear enough, but if it's true, then I simply don't understand Reed-Solomon codes at all, and I'd welcome a lot of clarification.

If that sentence is not true, then I speculate all symbols are Y-coordinates of the plotted points (the parity symbols being coordinates of the oversampled points). That interpretation makes more sense to me. Is it correct?

68.6.61.250 04:54, 25 December 2005 (UTC)

I rewrote the start of that paragraph. Read it again and see if it is clearer. (I could tell you the answer, but if you still can't figure it out from the text, then we need to fix the text further :) - grubber 05:14, 25 December 2005 (UTC)


Explanation of author does not make sense[edit]

Try this from me --

A message stream of k symbols is treated as a degree k-1 polynomial over GF(2^s) (GF(2^s) = the Galois field, with n = 2^s - 1), with each symbol being a coefficient of that polynomial. This is referred to as the 'message polynomial' (let's call it m(x), with degree k-1). There is another fixed polynomial with coefficients from GF(2^s) called a 'generator polynomial' of degree n-k; let's call it g(x). A code polynomial of degree n-1 (let's call it c(x)) is formed by the following algebra: c(x) = x^(n-k)*m(x) + r(x), where r(x) is the remainder formed by dividing x^(n-k)*m(x) by g(x). It is clear now that c(x) is divisible by g(x). This is true for any message stream. The coefficients of c(x) are now called code symbols. Thus, a stream of k symbols (message) is turned into a stream of n symbols (code). This block of n symbols, which contains only k useful message symbols, is transmitted over the channel. The so-called parity symbols are the n-k coefficients of r(x). The fact that any code polynomial is divisible by a fixed polynomial (= g(x)) is known to both sides, sender and receiver. The receiver, knowing this property, has the ability to extract the k message symbols from the received n-symbol code block (even if up to (n-k)/2 symbols have been corrupted).

-rajesh p.
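As an illustration of the encoding described above, here is a minimal Python sketch. It uses the prime field GF(929) and the (7,3) code with generator roots 3^1..3^4 that appear in examples further down this page, rather than GF(2^s), purely to keep the arithmetic simple; over GF(2^s) the subtraction below becomes XOR, which is why the text can write c(x) = x^(n-k)*m(x) + r(x). Polynomials are coefficient lists, lowest order first, and the message values are arbitrary.

P = 929

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def poly_rem(a, b):                        # remainder of a divided by b over GF(P)
    a = a[:]
    inv = pow(b[-1], P - 2, P)             # inverse of the leading coefficient
    for shift in range(len(a) - len(b), -1, -1):
        q = (a[shift + len(b) - 1] * inv) % P
        for i, y in enumerate(b):
            a[shift + i] = (a[shift + i] - q * y) % P
    return a[:len(b) - 1]

def poly_eval(a, x):                       # Horner's rule, low-order-first list
    acc = 0
    for c in reversed(a):
        acc = (acc * x + c) % P
    return acc

n, k = 7, 3
g = [1]
for j in range(1, n - k + 1):              # g(x) = (x - 3)(x - 3^2)(x - 3^3)(x - 3^4)
    g = poly_mul(g, [(-pow(3, j, P)) % P, 1])

m = [3, 2, 1]                              # message polynomial m(x)
shifted = [0] * (n - k) + m                # x^(n-k) * m(x)
r = poly_rem(shifted, g)
c = [(s - ri) % P for s, ri in zip(shifted, r + [0] * k)]   # c(x) = x^(n-k)m(x) - r(x)

assert all(poly_eval(c, pow(3, j, P)) == 0 for j in range(1, n - k + 1))
print("codeword:", c)                      # last k symbols are the message itself

The assert just re-checks the property pointed out above: every code polynomial is divisible by g(x), so it vanishes at the generator's roots.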

What you are describing is a polynomial code. Reed-Solomon codes can be defined either as a polynomial code (from a generator polynomial as you suggest, or from given roots), or as the graphs of polynomials. The definition in terms of graphs is mathematically simpler (it doesn't require primitive roots of unity) and therefore probably more appealing for an encyclopedia. Most non-specialists can easily appreciate that low-degree polynomials have few roots, and this is the only mathematical fact really needed. However, the alternative definition in terms of a generator polynomial is important for the error correction algorithm, which is the real value of Reed-Solomon codes. I have therefore added a section on the alternative definition. Selinger (talk) 04:28, 15 March 2008 (UTC)

Need More Info:[edit]

We need to clean up the references to their initial paper.

--ReptileLawyer 15:11, 18 April 2006 (UTC)

Lack of detail[edit]

Compared to Hamming code, almost no detail is given on how to actually compute a Reed-Solomon code. Perhaps this should be considered for future addition. Shawn K. Quinn 13:16, 27 January 2007 (UTC)

I have added a more formal definition of a RS code. More details would be welcome! - grubber 19:59, 30 January 2007 (UTC)
Your formal definition wasn't complete. To be a Reed-Solomon code, the set \{x_1,\ldots,x_n\} must be all non-zero elements of the field. I have added this to the definition. The more general class of codes where the points are arbitrary has the correct minimum distance, but the actual algorithm for finding the error locations relies on the additional property. Also, codes in the more general class are not called "Reed-Solomon" codes according to my references (does anyone know if they have a name? It's not polynomial codes because this means something else.) Selinger (talk) 01:52, 15 March 2008 (UTC)

Detectable Errors[edit]

RS is able to correct (n - k) / 2 errors.

How many errors in a segment can it detect with 100% accuracy?

(The algorithm is able to detect errors to a certain level even if it is not able to correct them; it can correct up to (n - k) / 2. Where is the limit?) —The preceding unsigned comment was added by 85.5.235.196 (talk) 10:55, 16 March 2007 (UTC).


Reed-Solomon codes are Maximum Distance Separable (MDS) codes, and are guaranteed to have a minimum distance of n - k + 1. So one can detect up to n - k errors with certainty.
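To make that concrete with the (255,223) code mentioned earlier on this page (a worked instance of the statement above, not a new claim):

d_min = n − k + 1 = 255 − 223 + 1 = 33, so up to n − k = 32 errors are always detected, while only ⌊(n − k)/2⌋ = 16 can be corrected.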


maximum burst error length[edit]

The Reed–Solomon error correction article currently claims "The result is a CIRC that can completely correct error bursts up to 4000 bits, or about 2.5 mm on the disc surface." which doesn't exactly match the cross-interleaved Reed-Solomon coding article which claims "CIRC corrects error bursts up to 3,500 bits in sequence (2.4 mm in length as seen on CD surface)".

What is the correct number? --75.37.227.177 19:52, 10 August 2007 (UTC)

I think this should be a simple calculation:

If I understand correctly, the CIRC is physically read off the CD (after EFM decoding) as large blocks, each large block being a stream of 7168 consecutive bits.

Each of the 28 substrings of 256 consecutive bits is decoded as an "inner" RS code.

In a maximum-correctable-burst, several of those substrings are completely wiped out. One byte of each of those 28 substrings goes to a 28-byte "outer" RS code, which is decoded into 24 data bytes and corrects up to 4 byte errors.

So I get 4 × 256 = 1024 consecutive bits that can be corrected in a CIRC large block. The maximum correctable burst occurs when the last 1024 bits are destroyed in one CIRC large block and the first 1024 bits are destroyed in the next CIRC large block, giving 2048 bits.

Where did I go wrong in this calculation? --75.37.227.177 19:52, 10 August 2007 (UTC)

This is about 4 years old and the conflict is not resolved. I assume there's a 4-way interleave on the outer code, allowing 4096-bit bursts to be corrected, of which 3072 are data bits. So I'm not sure where the 4000 and 3500 numbers originate from. Rcgldr (talk) 01:06, 14 October 2011 (UTC)

Erasure correction[edit]

The article claims that a Reed-Solomon code can correct twice as many erasures as errors - this would be (n-k), a conclusion supported by the code having distance (n-k+1)

But it goes on to state "2E + S < n − k ... where E is the number of errors and S is the number of erasures in the block."

If there are no errors, this implies it cannot in fact correct (n-k) erasures. Shouldn't the < in that equation in fact be a "less than or equal" symbol? James mcl (talk) 20:49, 15 May 2008 (UTC)

Graphs[edit]

As I'm an utter programming layman(though one with a good intuition for math/logic), the concept was totally unclear until I hit the last section. Perhaps it could be moved up? —Preceding unsigned comment added by 67.101.43.231 (talk) 02:36, 17 May 2008 (UTC)

A burst in the time domain isn't transformed to a sine wave; instead it's transformed to an infinite number of frequencies if the burst's rise time is infinitely short. Look at this picture: http://members.optushome.com.au/walshjj/transform_pairs.gif So I think Figures 3 and 4 are wrong —Preceding unsigned comment added by 62.167.25.67 (talk) 15:35, 22 July 2008 (UTC)

A single-sided impulse in one domain transforms to a complex exponential in the other (i.e. \delta(t+T) \leftrightarrow e^{j\omega T}). The real projection of this is indeed a cosinusoid. Oli Filth(talk) 15:43, 22 July 2008 (UTC)


While a single-sided impulse in time is a complex exponential in the frequency domain, the sketch shows a square wave in the time domain. This is a sinc function in the frequency domain. It is wrong. — Preceding unsigned comment added by 64.129.125.134 (talk) 15:51, 10 August 2011 (UTC)

Article should include a summary of the algorithm[edit]

I think the article should include a summary of the algorithm for encoding and then decoding. If I understand the algorithm correctly, then this should be a valid summary:

Encoding:

1. Take the cleartext word you want to transmit, perhaps "123". The cleartext word must be k symbols (letters/numbers/characters, in this case digits) long.

2. Use the individual symbols from that cleartext word as the coefficients of a polynomial of degree k-1. So: p(x) = 1x^2 + 2x + 3

3. The code word (encoded word) will be n symbols long. In pseudocode:

For i in (1,2,3...n):
$next_symbol = p(i) = 1(i)^2 + 2(i) + 3
Write $next_symbol at the end of the code word

4. Transmit or store the code word.

Decoding:

1. Receive or read the code word.

2. "Graph" the code word by making a scatter plot.

3. Look at the graph, and see if it is smooth, low-frequency, and trustworthy.

If the code word was "12221000122210001", then your graph will look like a nice low frequency wave. You can trust there are no errors.

If the code word was "92221000122210001", then you will notice a big spike at the "9" which is way to high frequency to actually belong. The "9" must be an error.

4. Erase any stray points (like the "9" above), and then "connect the dots" with the remaining points.

5. You just drew a nice smooth curve. Use math tricks to find the polynomial that gives that curve. If you find the polynomial is p(x) = 1x^2 + 2x + 3, then the cleartext word must have been "123", the coefficients.

Do people agree that a cleaned-up (and if necessary, corrected) version of this should go in the article?Fluoborate (talk) 06:24, 12 June 2009 (UTC)
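To make the proposal easier to evaluate, here is a minimal Python sketch of the same encode/interpolate idea. The prime field GF(929), the code length n = 7, and the use of Lagrange interpolation in place of the "math tricks" are assumptions made for the sketch; real decoders use the algebraic algorithms discussed elsewhere on this page rather than literally inspecting a graph.

P = 929
k, n = 3, 7
message = [1, 2, 3]                       # "123" -> p(x) = 1x^2 + 2x + 3

def p_of(x):                              # evaluate p at x (Horner, high order first)
    acc = 0
    for c in message:
        acc = (acc * x + c) % P
    return acc

codeword = [p_of(i) for i in range(1, n + 1)]      # step 3 of the encoding above
print("codeword:", codeword)

def mul_linear(poly, r):                  # multiply poly (low order first) by (x - r)
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - r * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out

def interpolate(points):                  # Lagrange interpolation through k points
    result = [0] * k
    for i, (xi, yi) in enumerate(points):
        num, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = mul_linear(num, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, P - 2, P) % P
        for d, c in enumerate(num):
            result[d] = (result[d] + scale * c) % P
    return result                         # coefficients, low order first

# decoding step 5 with no errors: any k received points give back p(x)
recovered = interpolate(list(zip(range(1, k + 1), codeword[:k])))
assert recovered == [3, 2, 1]             # 3 + 2x + 1x^2, i.e. the word "123" again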

"cleartext" is not appropriate in this case, I think we should use "source data" or "source word" instead. The encoding part seems ok to me, but the step 3 confuses me: what is a smooth low frequency and trustworthy curves ? :) and what do you mean by connecting the dots? If I'm right the smoothness of the curves can be replaced by the fact that the curves correspond to a ploynomial of degree at most "n-1". If not, we are sure that at least one symbol is in error. The correction step will be to find the closest polynomial of degree at most "n-1" to this set of points. And the "math tricks" that you are speaking of is something like polynomial interpolation if I'm right. IMO it is interesting to have a simple presentation of the algorithm like this one, but we should also have a more mathematical and rigorous one. In addition, several approach are possible to encode RS codes, for exemple depending on the channel considered (BEC, or AWGN/BSC ...), and we should specify the type of error corrected by the algorithm. Cunchem (talk) 09:10, 12 June 2009 (UTC)

I'm a degreed EE, and have a decent math background. I remain amazed that people writing technical material refuse to provide simple explanations that would greatly clarify the mathematics. Consider CDs. They use Eight To Fourteen (ETF) encoding, where each eight-bit data byte is converted to 14 bits. That is, the 256 possible data values are mapped into a "space" of 16,384 values, which is more-resistant to corruption. The 14 bits are the coefficients of the polynomial. See how simple it is?

Of course, technical writers wouldn't dare give a simple explanation, because then they wouldn't look smart -- and the readers wouldn't be intimidated. WilliamSommerwerck (talk) 16:13, 4 June 2011 (UTC)

I know this is old, but perhaps William will read this some day. coding ... decoding - an example of coding and decoding is included in the article here: Example. They use Eight To Fourteen (ETF) encoding, where each eight-bit data byte is converted to 14 bits. ... The 14 bits are the coefficients of the polynomial. - The 14 bit pattern is used for a modulation scheme, Compact_Disc_Data_structure. Other than the last output and first input stages, everything else in the drive operates with 8 bit values, so the RS encoding and decoding would be working with 8 bit values. Rcgldr (talk) 14:37, 15 October 2011 (UTC)

Oli Filth[edit]

Provide an example of codeword + applied error(s) for an MDS code such that when decoded the resulting codeword is not equal to the original and where none of the undecodable criteria are met during the decoding phase. Until then that statement is invalid and will remain removed from the article. —Preceding unsigned comment added by NoNewsToday (talkcontribs) 21:24, 20 April 2010 (UTC)

This was a completely unqualified revert by User:NoNewsToday. An undetectable error (I suppose that is what you mean by decodable criteria) occurs whenever the error turns a codeword into a word that resides within the minimum distance of another codeword. The simplest case is that the error will turn one codeword into another valid codeword. The probability that an error turns it into the sphere of another codeword is already marginal (specifically it is at most 10^-5 for a standard (255,223) RS code for any given bit error probability). The probability that it turns it exactly into another codeword is even lower by an order of magnitude.
Nageh (talk) 21:39, 20 April 2010 (UTC) (Edit: Nageh (talk) 21:48, 20 April 2010 (UTC))
Perhaps I've missed some crucial point, but otherwise, of course there are error polynomials that will cause one valid codeword to be received as another.
Incidentally, please don't misuse the word "spamming". Oli Filth(talk|contribs) 21:41, 20 April 2010 (UTC)

Please provide either an example (codeword and error index and magnitude) or citation (legitimate source ie: Lin&Costello et al) to this statement. Otherwise it will be reverted, I'll give guys 24hours. —Preceding unsigned comment added by NoNewsToday (talkcontribs) 00:02, 21 April 2010 (UTC)

Your request for a single error index and magnitude shows that you misunderstand the issue. If there were only one error and n-k >= 2, then that error would be corrected. The problem is there can be as many error indices and magnitudes as there are symbols. Every symbol in the message could have been corrupted.
For your example request, let C0(x) and C1(x) be arbitrary but distinct codewords. Transmit codeword C1(x). Let the transmission error E(x) = C0(x) - C1(x). (E(x) will have at least n-k nonzero coefficients, so the error is not guaranteed to be recoverable.) The received codeword will be R(x) = C1(x) + E(x) = C1(x) + C0(x) - C1(x) = C0(x). The receiver, seeing a legit codeword, will presume incorrectly that C0(x) was sent.Glrx (talk) 01:12, 21 April 2010 (UTC)
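A small self-contained illustration of the point, using evaluation-style RS codewords over GF(929); the (7,3) parameters and the particular messages are assumptions made only for this sketch:

P, n, k = 929, 7, 3

def encode(msg):                # codeword = evaluations of a degree < k polynomial
    def ev(x):
        acc = 0
        for c in msg:
            acc = (acc * x + c) % P
        return acc
    return [ev(x) for x in range(1, n + 1)]

c1 = encode([1, 2, 3])          # transmitted codeword C1
c0 = encode([5, 0, 7])          # some other valid codeword C0
e  = [(a - b) % P for a, b in zip(c0, c1)]        # error pattern E = C0 - C1
received = [(x + y) % P for x, y in zip(c1, e)]   # R = C1 + E

assert received == c0           # the receiver sees a perfectly valid codeword
print("corrupted symbols:", sum(v != 0 for v in e))   # weight >= n - k + 1 = 5 here

No decoder can flag this received word, because it is itself a codeword; the only protection is that such error patterns must corrupt at least n − k + 1 symbols.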

Glrx, there are two places in common and sane implementations of RS decoders that can detect and trivially reject codewords, for the scenario of a decoder RS(N,K) decoding errors <= K/2.

The first is the syndrome calculation - if its result is all 0 (zero) then there are no errors; the other is the root calculation of lambda via Chien search: given a non-all-zero syndrome, if no roots are found or if the number of roots is greater than K/2. If a decoder does not adhere to these simple logical constraints, it's not a valid decoder. Claiming the code can sometimes be erroneous is hence wrong.

Your explanation is that of an error-prone decoding process that has some serious logical flaws as to how the process occurs - I would suggest you read-up on the issue.

As I said before, the comment is not referenced, is in no way substantiated - and is nothing more than an assumption, so under Wiki rules it's not even permissible. I would like to take this to a wiki moderator, as it seems just under those conditions it should be rightfully removed.

So let's all be a little sane and err on the side of caution and remove this reference until someone can rightfully substantiate or provide a valid citation for it. A substantial proof, as I've explained before, would be for a code RS(N,K) a codeword C and an error vector e where |e| <= K/2 that decode using a sane decoder to C' where C is not equal to C' - if there is such a scenario please provide it, as it would come contrary to some fundamental concepts in algebraic geometry - which would most definitely give the provider of such a scenario a big kudos boost in the math world. —Preceding unsigned comment added by NoNewsToday (talkcontribs) 20:40, 21 April 2010 (UTC)

And where in the whole world did you read anything about the assumption that the number of symbol errors must be < k/2? And do you think this is a civil discussion in any way from your side when you simply revert again? And do you actually have any theoretical knowledge of error correction coding? I think the answer is 'no' to all these questions.
Here is a ref for you: Kasami T.,Lin S., "On the probability of undetected error for the maximum distance separable codes." IEEE Transactions on Communications, 32(9):998–1006, 1984.
If you don't understand it I advise you to start Lin&Costello right from the beginning again.
Nageh (talk) 21:04, 21 April 2010 (UTC) (Sorry for being impolite.)
PS: I have already tried to explain to you above when such errors can happen. You have completely ignored this. I leave it to you as an exercise to compute the probability that on a binary symmetric channel with a certain bit/symbol error probability there are (a) more than k/2 errors, and (b) an error turns a code word into another codeword, and (c) an error turns a code word into the minimum-distance sphere of another codeword (in increasing order of difficulty). Good luck with your homework! Nageh (talk) 21:08, 21 April 2010 (UTC)
PPS: I would have accepted removal/rewrite of the footnote based on a statement like "The footnote is not clear or confusing because... " or "I don't understand why this is the case" but not because you just say that is an "invalid assumption". Nageh (talk) 21:11, 21 April 2010 (UTC)

The point I'm trying to make is that the comment should explicitly state that it concerns the scenario where the codeword has more errors than what the code can correct - and yes, I agree that if there are more errors it's possible to either detect none or even correct the codeword incorrectly. My position is that if the number of errors <= K/2 there is no way such an ambiguous state can occur. So as a final solution either the statement needs to be more explicit by making clear what the scenario is (perhaps something similar to the statement in the CRC article), or not have it at all.

PPSS: this has a good discussion on the issue: http://ipnpr.jpl.nasa.gov/progress_report/42-84/84F.PDF if you like we can work together on some kind of wording on this page.

—Preceding unsigned comment added by NoNewsToday (talkcontribs) 21:20, 21 April 2010 (UTC)

The article (as far as I can see) states no such assumption on the number of symbol errors; consequently the footnote is neither ambiguous nor incorrect. Oli Filth(talk|contribs) 21:30, 21 April 2010 (UTC)

The only situation where the comment is correct is if the number of errors is greater than K/2. Either change the reference or remove it. Pretty simple.

This has already been elucidated; there is no need to state that, as there is no such assumption in the article.
Please also note that you're dangerously close to violating the 3-revert rule; please stop making the same change until you've gained consensus with the three editors who've been reverting you. Oli Filth(talk|contribs) 21:39, 21 April 2010 (UTC)

I'd like to reference removed until we can come to some valid consensus

That's not how it works. Incidentally, I've now reported you for violating WP:3RR. Oli Filth(talk|contribs) 21:55, 21 April 2010 (UTC)
Thanks, I was just about to do the same. Nageh (talk) 21:57, 21 April 2010 (UTC)
I believe we came to a valid consensus: three to keep the note and one to remove it. On the subject, I would give Nageh's opinion great weight. He's made a lot of recent contributions to this article, and those edits demonstrate a clear and deep understanding of the subject. Furthermore, you now admit a correctly functioning decoder could calculate the wrong codeword if there were more than (n-k)/2 errors. Given your understanding of the note's viewpoint and its correctness, I fail to understand why you would revert it.Glrx (talk) 22:59, 21 April 2010 (UTC)

A last comment to NoNewsToday before I'm off. Mentioning that a decoding error for a code that can reliably correct up to k/2 errors requires more than k/2 errors is a pleonasm, so leaving out the additional statement introduces no ambiguity whatsoever, especially if no such assumptions are stated anywhere in the article. Bye, Nageh (talk) 22:10, 21 April 2010 (UTC)

MOS, MOSMATH, TeX, etc.[edit]

It almost seems as if someone worked really hard at being unfamiliar with WP:MOS, WP:MOSMATH, and standard TeX conventions. Lots of stuff like

Reed-Solomon

instead of

Reed–Solomon

and

(a,...,b) \,

instead of

(a,\ldots,b)\,

and

(1, ..., a)

instead of

(1, ..., a)

and

n-k+1

instead of

n − k + 1

and so on. Probably more cleanup is needed. Michael Hardy (talk) 21:38, 27 April 2011 (UTC)

You would do a lot better spending time on making these articles easily understood and genuinely educational. Too many of them are written by idiots who have no idea of how to communicate unfamiliar concepts. WilliamSommerwerck (talk) 16:16, 4 June 2011 (UTC)

You, too, would do a lot better spending time on making these articles easily understood instead of calling other well-meaning editors idiots! Nageh (talk) 16:24, 4 June 2011 (UTC)
Criticism accepted. I'll see what I can do. WilliamSommerwerck (talk) 22:02, 4 June 2011 (UTC)

Euclidean Division Algorithm Decoder[edit]

This alternative method is simpler to implement than the Berlekamp–Massey algorithm. The algorithm is described in several books, web sites, and also in NASA Tech Brief article MSC-21834 (which seems to be missing its title page and author), a tutorial on Reed–Solomon error correction coding. I found what appears to be an updated version of the same pdf file here; the author is William Geisel, and a web search for his name will get a few hits on the tutorial:

nasa_archive_19900019023.pdf

Rcgldr (talk) 11:38, 11 October 2011 (UTC)

Go ahead and add it. Berlekamp–Massey is really pretty simple, but the operation of the Euclidean algorithm may be easier to understand. Bobmath (talk) 12:48, 11 October 2011 (UTC)
It might be better to simply post a reference to the tutorial at the NASA site for the near term, although that's based on binary type fields (the examples use 4 bit coefficients). I'd probably use the decimal-based example for the Berlekamp–Massey algorithm in the wiki article, but I'll need to create a program to make sure the results are OK, and I'm not sure how to get the pretty formatting with a fixed-size font I'd want to show the division steps, probably as a pair of shift registers. Rcgldr (talk) 19:50, 11 October 2011 (UTC)
Euclidean section added. Kept it simple. Euclidean_decoder Rcgldr (talk) 05:45, 15 October 2011 (UTC)
  • Thanks to Bobmath (page does not exist?) for cleaning up the Euclidean section. Rcgldr (talk) 15:19, 15 October 2011 (UTC)
I'm not done yet :) Bobmath (talk) 15:39, 15 October 2011 (UTC)
OK, it looks better now. The while statement improves the readability since it handles the all zero syndrome case. It also matches the code I actually use.
Moved the definition for t = number of parities to the start of the algorithm description. Rcgldr (talk) 16:57, 15 October 2011 (UTC)
  • I didn't mention the failure handling, but I don't think that's needed for the article. The first check after the loop: if (degree_of(Si)) != (-1+degree_of(Ai)) then it's failed. If it's less, then Ω(x) has leading zeroes, which I've never seen in a valid case. If it's more, then there's a conflict in the syndromes. The second check: if (number of valid roots of Λ(x)) != degree_of(Λ(x)), then there are too many errors. The final check is regenerating syndromes or re-encoding, but I haven't seen this fail after getting past the first two checks. Mis-correction can occur if the garbled message is close enough to a valid message (in this case, the number of errors after mis-correction will exceed the number of parities). One mystery to me is why Si needs to be negated to produce Ω(x) for odd error cases (the (-1)^deg(Ai) term), but it beats having to do a full Ω(x) calculation. I and most of the people I've worked with generally prefer the Euclidean algorithm.

Rcgldr (talk) 16:57, 15 October 2011 (UTC)

  • sample output of Euclidean algorithm, using two shift registers. Each register holds two values, the current remainder on the left, and a reversed polynomial (built up using the quotient values from the remainder divisions) on the right, that ends up as the error locator polynomial.
dividend=>   001 000 000 000 000|001 <= reversed polynomial
divisor=>    000 925 762 637 732|000 

dividend=>   925 762 637 732|533 232 
divisor=>    000 683 676 024|001 000 

dividend=>   683 676 024|544 704 608 <= A(i) = reversed locator polynomial 
divisor=>    000 673 596|533 232 000 

remainder=>  673 596|533 232 000 000

A(i):        544 704 608
lambda(x):   001 821 329
omega(x):    546 732

Rcgldr (talk) 16:38, 15 October 2011 (UTC)
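For comparison with the register trace above, here is a minimal Python sketch of the same extended Euclidean (Sugiyama) step over GF(929), working directly on coefficient lists (lowest order first) instead of shift registers; only the Λ/Ω result is reproduced, not the register layout.

P = 929

def trim(a):
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def deg(a):
    d = len(a) - 1
    while d > 0 and a[d] == 0:
        d -= 1
    return d

def sub(a, b):
    m = max(len(a), len(b))
    return trim([((a[i] if i < len(a) else 0) - (b[i] if i < len(b) else 0)) % P
                 for i in range(m)])

def mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return trim(out)

def divmod_poly(a, b):                     # polynomial division over GF(P)
    a, q = a[:], [0] * max(1, len(a) - len(b) + 1)
    inv = pow(b[-1], P - 2, P)
    for s in range(len(a) - len(b), -1, -1):
        c = (a[s + len(b) - 1] * inv) % P
        q[s] = c
        for i, y in enumerate(b):
            a[s + i] = (a[s + i] - c * y) % P
    return trim(q), trim(a)

def euclid_decoder(syndromes, t):
    r_old, r_new = [0] * (2 * t) + [1], trim(syndromes[:])   # x^(2t) and S(x)
    a_old, a_new = [0], [1]                                  # cofactors of S(x)
    while deg(r_new) >= t:                                   # stop when deg(remainder) < t
        q, r = divmod_poly(r_old, r_new)
        r_old, r_new = r_new, r
        a_old, a_new = a_new, sub(a_old, mul(q, a_new))
    inv = pow(a_new[0], P - 2, P)                            # normalize so Lambda(0) = 1
    return [c * inv % P for c in a_new], [c * inv % P for c in r_new]

Lam, Omega = euclid_decoder([732, 637, 762, 925], 2)         # S1..S4 from the trace
print(Lam, Omega)   # [1, 821, 329] and [732, 546]:
                    # Lambda(x) = 1 + 821x + 329x^2, Omega(x) = 732 + 546x,
                    # agreeing with the lambda(x)/omega(x) lines above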

Explanation of syndrome calculation[edit]

For the Berlekamp-Massey algorithm example, the syndrome values are listed in the table, but how they are calculated is not explained. Here's an example of how syndrome 0 is calculated. Note all coefficients are modulo 929. I'm not sure if this somewhat lengthy section should be added to the main article:

syndrome 0 = r(x) modulo (x - 3) = r(x) modulo (x + 926) = 732

Using long hand polynomial division to calculate syndrome 0:

             003 011 156 924 176 086
        ----------------------------
001 926 |003 002 123 456 191 487 474
         003 920
          --------
             011 123
             011 896
             -------
                 156 456
                 156 461
                 -------
                     924 191
                     924 015
                     -------
                         176 487
                         176 401
                         -------
                             086 474
                             086 671
                             -------
                                 732

Rcgldr (talk) 09:47, 13 October 2011 (UTC)
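The long division above and a direct evaluation give the same value, by the polynomial remainder theorem: r(x) mod (x − 3) = r(3). A short Python check over GF(929), using the received word from the example (the evaluation points 3^1..3^4 are the generator roots assumed there):

P = 929
received = [3, 2, 123, 456, 191, 487, 474]   # r(x), highest-order coefficient first

def eval_at(poly, x):                        # Horner's rule mod P
    acc = 0
    for c in poly:
        acc = (acc * x + c) % P
    return acc

print(eval_at(received, 3))                  # 732, the same value the long division produces
print([eval_at(received, pow(3, j, P)) for j in range(1, 5)])   # all four syndromes at once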

Detailed discussion of the Berlekamp-Massey algorithm should go to the respective article. Basics such as a trivial polynomial (long) division have no place at either. Nageh (talk) 09:57, 13 October 2011 (UTC)
Thanks for pointing out the gap in the example. You actually don't need to use long division to calculate the syndromes, just evaluate the polynomial r(x). I've added a couple of lines to show that. Bobmath (talk) 14:38, 13 October 2011 (UTC)
Thanks for the update. For anyone that would actually be reading these articles, the long division example isn't needed, but is there any wiki article to show that evaluating r(x) with either the syndromes or generator polynomial is equivalent to grinding it out with long division? Rcgldr (talk) 20:24, 13 October 2011 (UTC)
Polynomial remainder theorem. Bobmath (talk) 04:09, 14 October 2011 (UTC)

error value calculations with multi-layer codes[edit]

In the case of a multi-layer code like CIRC, it's usually faster to generate a matrix for the outer code correction. Consider the inner code to be the rows of a matrix, and the outer code to be the columns. The rows are handled first, and then correction on the columns of all rows can be treated as erasure cases, multiplying the inverted locator matrix with each column's syndromes to do the correction. If an error occurs in a column, then the failing row (once determined) can be added to the row erasure list, a new inverted locator matrix generated, and correction continued. Expansion of terms from the Forney algorithm can be used to avoid having to actually invert a matrix. For a 3 error case:


\begin{bmatrix}
X_1^1 & X_2^1 & X_3^1 \\
X_1^2 & X_2^2 & X_3^2 \\
X_1^3 & X_2^3 & X_3^3 \\
\end{bmatrix}
\begin{bmatrix}
Y_1 \\ Y_2 \\ Y_3
\end{bmatrix}
= 
\begin{bmatrix}
S_1 \\ S_2 \\ S_3
\end{bmatrix}


\begin{bmatrix}
Y_1 \\ Y_2 \\ Y_3
\end{bmatrix}
= 
\begin{bmatrix}
-(1 + \Lambda_1 X_1^{-1} + \Lambda_2 X_1^{-2}) / \Lambda'(X_1) & -(X_1^{-1} + \Lambda_1 X_1^{-2}) / \Lambda'(X_1) & -(X_1^{-2}) / \Lambda'(X_1) \\
-(1 + \Lambda_1 X_2^{-1} + \Lambda_2 X_2^{-2}) / \Lambda'(X_2) & -(X_2^{-1} + \Lambda_1 X_2^{-2}) / \Lambda'(X_2) & -(X_2^{-2}) / \Lambda'(X_2) \\
-(1 + \Lambda_1 X_3^{-1} + \Lambda_2 X_3^{-2}) / \Lambda'(X_3) & -(X_3^{-1} + \Lambda_1 X_3^{-2}) / \Lambda'(X_3) & -(X_3^{-2}) / \Lambda'(X_3) \\
\end{bmatrix}
\begin{bmatrix}
S_1 \\ S_2 \\ S_3
\end{bmatrix}

Rcgldr (talk) 23:57, 26 October 2011 (UTC)
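A minimal sketch of the matrix view just described: with e known locators X_i, the syndromes satisfy S_j = Σ_i Y_i·X_i^j, so the error values Y_i follow from solving that small linear system. The sketch below does this by Gaussian elimination over GF(929) rather than by expanding the Forney terms shown above; the field, locators and error values are made-up assumptions used only to exercise the code.

P = 929

def solve(A, b):                      # Gauss-Jordan elimination mod P
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], P - 2, P)
        M[col] = [v * inv % P for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(v - f * w) % P for v, w in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

X = [3, 27, 81]                       # known (erasure) locators X_i
Y = [111, 222, 333]                   # error values the decoder should recover
S = [sum(y * pow(x, j, P) for x, y in zip(X, Y)) % P for j in range(1, 4)]   # S_1..S_3

A = [[pow(x, j, P) for x in X] for j in range(1, 4)]
assert solve(A, S) == Y               # the "inverted locator matrix times syndromes" step

Because the locator matrix depends only on the erasure locations, it (or its inverse) can be computed once per set of failing rows and reused for every column, which is the speed-up described above.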

Berlekamp Massey decoder[edit]

I shortened the description on this section to just reference Berlekamp–Massey algorithm. That reference, along with the example should be good enough for the Reed Solomon article.


The Berlekamp–Massey algorithm article, has a decent description for the binary case, but just a code fragment for the more general case. Perhaps that article could be updated with something like the text below. Since I previously wrote a basic description here, I'm leaving it on this discussion page for now:

The Berlekamp–Massey algorithm calculates a discrepancy based on a subset of v+1 syndromes, S_j ... S_(j+v), and a set of v coefficients, where v is the current number of assumed errors and is initially set to zero:

discrepancy = S_(j+v) + Λ_1 S_(j+v-1) + Λ_2 S_(j+v-2) + ... + Λ_(v-1) S_(j+1) + Λ_v S_j

This discrepancy, as well as some internal variables, is used to decide if v needs to be increased and/or if the coefficients need to be updated on each iteration. j is initially set to zero and advanced as needed during iteration so that all syndromes are used in the algorithm. When the algorithm completes, there will be v coefficients, and {1, Λ_1, Λ_2, ..., Λ_v} will be the coefficients of the error locator polynomial, Λ(x).

Rcgldr (talk) 07:06, 16 October 2011 (UTC)
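For reference alongside the description above, here is a minimal Python sketch of that iteration over the prime field GF(929) used in the article's example. The indexing convention (a flat list of syndromes processed from the first one upward) is a simplification; the returned list is {1, Λ_1, ..., Λ_v}, lowest order first.

P = 929

def berlekamp_massey(S):
    C, B = [1], [1]              # current and previous connection polynomials
    L, m, b = 0, 1, 1            # L is the current number of assumed errors v
    for n in range(len(S)):
        # discrepancy = S_n + Lambda_1*S_(n-1) + ... + Lambda_L*S_(n-L)
        d = S[n]
        for i in range(1, L + 1):
            d = (d + C[i] * S[n - i]) % P
        if d == 0:
            m += 1
            continue
        T = C[:]
        coef = d * pow(b, P - 2, P) % P
        if len(B) + m > len(C):
            C += [0] * (len(B) + m - len(C))
        for i, Bi in enumerate(B):           # C(x) -= (d/b) * x^m * B(x)
            C[i + m] = (C[i + m] - coef * Bi) % P
        if 2 * L <= n:                       # the register length had to grow
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C, L

Lam, v = berlekamp_massey([732, 637, 762, 925])
print(Lam, v)    # [1, 821, 329], 2 -> Lambda(x) = 1 + 821x + 329x^2, two errors,
                 # the same error locator the Euclidean example produces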

If anyone is interested, I added an algorithm description to Berlekamp–Massey algorithm

Rcgldr (talk) 00:27, 19 October 2011 (UTC)

  • Article statement - The Berlekamp Massey algorithm is the most popular procedure for finding the error locator polynomial. - Based on my experience as a firmware programmer for computer peripherals, I have the impression that the Euclidean algorithm is most popular. Going by the Wiki standard, is there any reference that can cite the popularity of any single algorithm? Also, does most popular mean it's used in the most devices or used in the most number of implementations of RS codes for different device types? Rcgldr (talk) 17:38, 19 October 2011 (UTC)
As far as I know Berlekamp–Massey is more suitable for hardware implementation than Euclidean, as it lends itself to implementation with linear feedback shift registers. Since RS is predominantly used in electronic consumer products, that statement should be true. I assume that should be verifiable via Google as I don't have a reference at hand right now (and no time to get involved in editing this article at the moment). Nageh (talk) 18:38, 19 October 2011 (UTC)
If you look at the sample output shown in the Euclidean Division Algorithm Decoder section above, you can see the shift register implementation typically used in hardware. There are two shift registers, fixed in size (2 + # parities), with an index for each register used to separate each register into its R(x)/A(x) or S(x)/B(x) polynomials. In addition, you get the error evaluator polynomial for free (it's (-1)^(degree A(x)) S(x) / low-order term of A(x)). I first started seeing the Euclid method around 1989. In the meantime, I updated the intro for Berlekamp–Massey with a proper description of the algorithm and just mentioned it's an alternate iterative procedure. Both algorithms are popular, but I've only personally seen one instance of Berlekamp–Massey used, versus several instances of Euclid used, while your experience is obviously different. I'm not sure it's important to pick which one is most popular. Rcgldr (talk) 19:36, 19 October 2011 (UTC)
You are right, I thought Berlekamp–Massey would be most popular but based on a quick search it seems that both are about equally popular. Regarding which one is more suitable for hardware implementation, here is a paper that states that Berlekamp–Massey is thought to be more efficient while Euclidean is simpler: [1] It also shows that both algorithms are ultimately equivalent via some reformulations. I agree that both algorithms can be implemented using linear shift registers, so the difference is probably not that big in the end. Nageh (talk) 20:21, 19 October 2011 (UTC)
The hardware guys hate having to turn chips, so they often prefer simpler. Each algorithm needs some checks to handle failure cases caused by too many errors. In the case of most failures, Λ(x) will still be unique, and all algorithms produce the same Λ(x), so in that sense, they are equivalent. If too many syndromes are zero (an e error case can result in e-1 zeroed syndromes), then there are multiple Λ(x)'s that will satisfy S_i + Λ_1 S_(i-1) + ... = 0, but none of them will be valid, and the different algorithms will produce different Λ(x)'s (or fail to produce one), but in these cases, failure will be detected, and the produced Λ(x)'s ignored. Berlekamp–Massey iterates once per number of syndromes. Euclid only iterates once per number of errors, but there's more data involved in each iteration. I'm not sure of the tradeoffs in a mostly hardware implementation. There's also a wiki article for Berlekamp–Welch_algorithm, which I'm not familiar with. It's been a while since I've worked with RS codes. Rcgldr (talk) 23:51, 19 October 2011 (UTC)
  • Article statement - A linear feedback shift register is constructed which produces the nth syndrome S_n as output after reading S_1...S_(n−1) as input. The feedback coefficients of this shift register are the coefficients of the error locator polynomial. - That's not how it's described in the wiki article or any text (or actual implementation) that I'm aware of. The normal implementation is to calculate a discrepancy, d = Λ_1 S_i + ... + Λ_e S_(i+e), where e is the number of errors. I added an algorithm description that matches the coding examples in this article: Berlekamp_Massey_algorithm. I can cite references: Error Correcting Codes - Peterson and Weldon, second edition, page 290, equation 9.31; nasa_archive_19900019023.pdf - William A Geisel, page 66. Rcgldr (talk) 17:38, 19 October 2011 (UTC)
I had Gill's lecture notes (from the reference section) in mind when I wrote that, but I may not have chosen the best wording. I felt that something should be added because there's a conceptual jump from polynomials to shift registers that was pretty jarring. I'm not sure whether the present wording smooths that out. Bobmath (talk) 19:45, 19 October 2011 (UTC)
Glrx has added the section on the Peterson decoder, I have only added some material on soft decoding and list decoding at the end of the decoding section. Much of the text in between needs review and corrections, so feel free to go ahead if you feel you can improve it. Nageh (talk) 18:38, 19 October 2011 (UTC)
Peterson decoder section seems ok to me. As mentioned, I updated the intro description for Berlekamp–Massey. There's also an iterative procedure for finding the error evaluator polynomial (Ω(x)) which isn't mentioned at all on the RS page. A similar procedure is more often used to generate a matrix that produces error values given a set of syndromes, to be used in multi-layer RS implementations where a set of columns in an RS matrix all share the same erasure locations, which are the rows of the RS matrix. Rcgldr (talk) 20:01, 19 October 2011 (UTC)

Euclidean decoder[edit]

  • Article statement - Ai(0) is the constant term of Ai. Changed from low order term to constant term. constant term could imply a coefficient that doesn't change during iterations. I would prefer low order term, but the example clarifies what is meant by Ai(0), which should be good enough. Rcgldr (talk) 18:00, 19 October 2011 (UTC)
"Low order" isn't a standard mathematical term, AFAIK, but I guess the meaning should be clear. The Ais don't change, anyway. Bobmath (talk) 19:25, 19 October 2011 (UTC)
I did a web search for polynomial lower order term and got quite a few hits. I also see this used for describing binary numbers, the low order bit or low order byte. Also some web pages refer to the coefficients of polynomials as constants, which gets confusing. The wiki article for Polynomial refers to terms of a polynomial by degree, and calls the last term constant, so either seems ok. I don't want to call it term of zero degree. Rcgldr (talk) 20:41, 19 October 2011 (UTC)
Please avoid order, the order of a polynomial (or a group member) is something completely different. Constant term is good and unambiguous. I guess trailing term would be too, though it's not so common. Nageh (talk) 21:11, 19 October 2011 (UTC)
In that case constant term seems the best. trailing term or leading term could be confusing since Ai is shown reversed in the example. I show Ai reversed since that's typically how it's implemented in hardware or software. Rcgldr (talk) 21:23, 19 October 2011 (UTC)
A friend of mine complained about the usage of constant term so I changed it to constant (least significant) term in an attempt to make everyone happy. I have an old handout taken from an ECC book, and on page 299 it states "Ai ... must be divided by its low-order coefficient", which is why I used that wording in the first place. There's a reference to a book, Error Correction Coding by Clark and Cain (Plenum Press), but I don't have that book. Rcgldr (talk) 00:42, 20 October 2011 (UTC)

Alternate Generator Polynomials[edit]

The examples in the article use a generator polynomial where the first consecutive root is α :

(X-α) (X-α^2) (X-α^3) ... (X-α^t)

If the first consecutive root of a generator polynomial isn't α, then the method used to calculate Ω(x) in the Euclidean example would need to be modified. I'm not sure if the Forney method would also need to be modified. Methods based on the error equations Error_locators_and_values should not require any modifications.

A generator polynomial may have a first consecutive root of 1: (X-1) (X-α) ... (X-α^(t-1)), or a generator polynomial may be reciprocal: (X-α^((N/2)-(t/2))) ... (X-α^((N/2)+(t/2))) = 1 X^t + g_1 X^(t-1) + g_2 X^(t-2) + ... + g_2 X^2 + g_1 X + 1.

Rcgldr (talk) 02:44, 26 October 2011 (UTC)

This is addressed in the Forney algorithm article using the parameter c. Also see the BCH code article. Bobmath (talk) 03:41, 26 October 2011 (UTC)
Thanks. I was wondering if alternate generator polynomials and the related the c parameter should be noted somewhere on the RS wiki page, but since there's a link to the Forney algorithm, that is probably good enough. Rcgldr (talk) 05:48, 26 October 2011 (UTC)
Isn't a Reed–Solomon code always a narrow-sense BCH code, so c must be one? See BCH code#Special cases. If that's the case, then other values for c (such as 0) don't belong here. Glrx (talk) 21:07, 26 October 2011 (UTC)
DAT drives use generator polynomials with c = 0. The ancient floppy tape drives (QIC-40) used a (32,29) code with c = -1 (reciprocal polynomial with 3 parities). Some other computer peripherals use reciprocal polynomials, c = (N/2)-(n/2), N = number of symbols, n = number of parities (N and n are even numbers). These were all considered to be RS codes by the resident ECC experts at the time. The Forney algorithm article already explains how to deal with c ≠ 1, so an RS article reference about this could just link to the Forney article. Rcgldr (talk) 02:00, 27 October 2011 (UTC)

Other algorithms used for RS codes.[edit]

The RS article leaves out some other algorithms, some of which would take a while to explain. Here's a list:

  • erasure and error handling - The algorithm used to modify syndromes given a set of known erasure (error) locations. The result is yet another set of linear equations that can be solved with various algorithms. The modified syndromes are then used to calculate any unknown error locations.
I think the term Forney syndrome is used for this method of erasure processing. It would be good to work this information into the article. I have a page I've been working on that shows some examples: v:Reed–Solomon codes for coders. It's pretty rough, but there might be something useful in there. Bobmath (talk) 04:20, 26 October 2011 (UTC)
  • hardware inversion of a finite field binary polynomial based number using a sub-field. - For example, an 8 bit field is normally implemented as finite field math modulo some 9 bit primitive polynomial. A compatible 8 bit field can also be implemented as finite field math modulo a 3 nibble (4 bit) polynomial 1x^2 + ax + b. The coefficients of the 3 nibble polynomial are themselves finite field elements modulo one of three 5 bit primitive polynomials (hex 13, 19, or 1f). For each 5 bit polynomial there will be 4 polynomials 1x^2 + ax + b where the fields are compatible, meaning that addition (exclusive or), multiplication, and division of numbers with the same logarithm produce answers of numbers with the same logarithm. Any compatible nibble based polynomial can be used. The process involves mapping the 8 bit number to the nibble based 8 bit number, doing the math to invert it, then mapping back. A 1/x table is still needed, but it only has 16 four bit entries. The point of all of this is that for the 8 bit case, this procedure ends up at around 400 gates with a single propagation delay through the stages of gates, versus the number of gates required for a 256 byte lookup table. Which sub-field is chosen can slightly affect the number of gates required. Note the lookup table takes more gates, but there are fewer gate stages and the propagation time is less than the sub-field method.
  • multi-layer type RS code algorithms - Used in cd-roms, dvds, magnetic tape. Treated as a matrix, where the inner layer rows typically have a weak RS code, while the outer layer columns have a strong RS code. The outer layer can be optimized by handling corrections mostly as erasures (each row would be tagged as an erasure location for the columns).
  • parities to syndromes, syndromes to parities - Parities or re-encoded parities (received message re-encoded) can be converted into syndromes via a matrix multiply. I'm not sure this saves much in hardware or software, just that it is possible. Syndromes to parities: the original message has zeroes appended to it, then those zeroes are considered erasures, and the message is corrected to add the parities. This eliminates the need for the generator polynomial in hardware or software.

Rcgldr (talk) 01:20, 21 October 2011 (UTC)

Your other points are interesting, but some of your terminology seems strange. Compatible doesn't seem to be a standard term for what you're doing, for example, and I thought the term nibble died 20 years ago. Also, I don't think the article needs to include every possible implementation method, because that will start to get confusing and unwieldy. Bobmath (talk) 04:20, 26 October 2011 (UTC)
nibble is still used when describing things such as Binary-coded_decimal. Instead of compatible, I've also seen equivalent used. I'm not sure if there is a standard term for this, as I haven't seen it in any text books, although most experts in RS coding are aware of this method for inverting numbers. The two most common mappings would be GF(2^8) to GF(16^2) or GF(2^10) to GF(32^2). As a bit of trivia, for a GF(2^8) primitive polynomial b_8 X^8 + b_7 X^7 + ... + b_1 X + 1, the product of the 4 compatible (out of 64 possible) GF(16^2) (4 bit coefficient) primitive polynomials (1 X^2 + c_1 X + c_2) ... (1 X^2 + c_7 X + c_8) = b_8 X^8 + b_7 X^7 + ... + b_1 X + 1, except that the b's are now 4 bit coefficients, but will only have the values 0000 or 0001. This method is used in some computer peripherals (hard drives, tape drives, ...). Rcgldr (talk) 05:33, 26 October 2011 (UTC)
BCD is another thing that I thought had died 20 years ago. (They're no longer supporting it in the x86-64 instruction set, it seems.) It sounds like what you're describing is a field isomorphism (a subfield is something completely different). My abstract algebra isn't terribly strong -- does an isomorphic field with the right properties always exist, or only in certain cases? Bobmath (talk) 06:13, 26 October 2011 (UTC)
Off topic, but BCD is still around, mostly because the accounting and banking industry use math based on BCD to avoid any issues related to binary :: decimal conversion issues (required by laws and/or industry standards). BCD is how Cobol's packed decimal would be stored on most computers. Rcgldr (talk) 08:22, 26 October 2011 (UTC)
The idea Rcgldr is trying to describe is constructing the finite field GF(p^m) not as a polynomial ring over GF(p) but over GF(p^l) for some l|m. (Inversions in GF(p^l) are precomputed and stored in lookup tables, but polynomials are significantly shorter, reducing computation efforts.)
Re isomorphism: All finite fields of a given size (order) are isomorphic to each other. That's why there is this simple notation of GF(size).
Re nibble: Aside from assembler programmers, network folks use it, too, at least occasionally. You will still find it in a few more recent IETF documents. Nageh (talk) 07:11, 26 October 2011 (UTC)
All finite fields of a given size (order) are isomorphic to each other. - This is why I didn't use the term isomorphic. The conditions required to take advantage of mapping for inversion require that all math operations produce the same results when expressed as powers of each field's primitive element α. Multiplication and division aren't an issue since α^i · α^j = α^(i+j) and α^i / α^j = α^(i-j) will hold true for any field with a primitive element α. The issue is addition and subtraction (or exclusive or for binary based fields), where α^i + α^j = α^k and α^k - α^j = α^i must hold true for the same i, j, and k in all compatible or equivalent fields. For each of the 3 possible GF(2^4) defining polynomials, hex 13, 19, or 1F, there are 64 possible fields that could be used to map a specific GF(2^8) to a specific GF(16^2), but only 4 of those fields will meet the addition and subtraction (in this case exclusive or) requirements. Again, the point of this is that it takes a fraction of the gates (although a longer propagation delay) to implement inversion via this type of mapping. Rcgldr (talk) 08:33, 26 October 2011 (UTC)
So you require that the binary representation of all field members is (essentially) the same for both fields? (It could be bit-inverted, or... what are the other two cases, most-significant bit flipped I guess.) I don't think there is a name for what you describe, or at least I'm not aware of it. Nageh (talk) 09:47, 26 October 2011 (UTC)
The binary representations won't be the same, but if you generated sets of 256 by 256 byte lookup matrices for each compatible field, each with four matrices: addition, subtraction, multiplication, and division, with all entries in each matrix being the logarithm base α for each field (except for log(0), although this could be set to hex ff, since it's not used elsewhere), those logarithm matrices would be identical. I added a section to your talk page. Can I post a direct link to a document from my web site here as an example? Rcgldr (talk) 10:06, 26 October 2011 (UTC)
There is no problem linking external content on a talk page if it fits within the context of a discussion. I have to think it through which mapping is required to fulfill the particular requirements you specified, but gotta leave now. Nageh (talk) 10:18, 26 October 2011 (UTC)
Just to make sure I get you right... you do require that α has the same binary representation in the isomorphism, right? Nageh (talk) 10:36, 26 October 2011 (UTC)
No, just that math based on powers of α is the same for both fields. For a GF(2^8) field based on the 9 bit polynomial 11D hex, α is 1X + 0 = hex 2. I could choose a GF(16^2) field where α = 1X + 1 = hex 11, as long as it met the requirement that given i and j, k is the same for both fields for addition (exclusive or): α^i + α^j = α^k (multiplication and division will always be compatible), but choosing a field with α = 1X + 1 would consume more hardware than choosing a field where α = 1X + 0 = hex 10. Rcgldr (talk) 10:58, 26 October 2011 (UTC)

Other algorithms used for RS codes - hardware inversion[edit]

Here's an example table in powers of α order for two compatible fields.
GF(2^8) is modulo hex 11D, GF(16^2) is modulo hex 142 (1 X^2 + 4 X + 2), GF(16) = GF(2^4) is modulo hex 13:
  i     GF(2^8) α^i    GF(16^2) α^i
  0         01             01
  1         02             10
  2         04             42
  3         08             18
  4         10             c2
  8         1d             99
  19        03             11
  28        18             da
  68        0d             5b
  253       47             1d
  254       8e             92
As an example of compatibility, note that in both fields α^0 + α^1 = α^19, α^3 + α^4 = α^28, and α^4 + α^8 = α^68.
An old Word 6.0 / 95 document that explains this with a couple of examples: m8to44.doc . All of this can be implemented in hardware, including circuits for the 8 bit mapping, the GF(2^4) math (a + b, a × b, 1/a, a^2), and mapping back to the 8 bit field, as a series of stages of about 340 gates (what I was told by the resident RS expert at the time), with a single propagation delay. Rcgldr (talk) 13:30, 26 October 2011 (UTC)
Ok, I got utterly confused for a moment, so let's get things straight. I was initially assuming that α would be the same for both fields, but obviously the alpha in the other group (let's call it β) is a different generator. Let m be the isomorphism that maps elements from one field to the other such that m(α) = β. You want the following to hold:
If α^j * α^k = α^l then β^j * β^k = β^l, as well as
If α^j + α^k = α^l then β^j + β^k = β^l.
Let's focus on the second line. By the isomorphism, β^j + β^k = m(α)^j + m(α)^k = m(α^j) + m(α^k) = m(α^j + α^k) = m(α^l) = m(α)^l = β^l. So this is all about finding the right generator β. Can you give a polynomial for which you think this isomorphism does not hold? I suspect your other polynomials might not be irreducible. Nageh (talk) 20:27, 26 October 2011 (UTC)
Well, there is another explanation, which is that addition in GF(16^2) does not match that of GF(2^8) (i.e., xor). While it must hold that β^j + β^j = 0 for any j, I'd be interested to see such an addition table. So coming back to what you were trying to describe: what you are looking for is constructing the finite field GF(p^m) not as a polynomial ring over GF(p) but over GF(p^l) for some l|m, and where addition can be carried out modulo 2 (xor). Nageh (talk) 20:47, 26 October 2011 (UTC)
For GF(2^8) modulo hex 11d, GF(16) modulo hex 13, try GF(16^2) modulo hex 119 (1 X^2 + 1 X + 9):
List of the 60 non-compatible modulo values with α = 1 x + 0:
119 11b 11d 11e 122 129 12d 12e 133 139
13b 13e 144 14b 14d 14e 155 159 15b 15d
162 163 164 165 16b 16d 172 173 174 175
179 17e 183 184 18d 192 193 19b 19e 1a2
1a4 1a9 1b4 1b5 1bd 1be 1c3 1c5 1ce 1d4
1d5 1d9 1db 1e2 1e3 1e9 1ed 1f2 1f5 1fb
List of the 4 compatible modulo values with α = 1 x + 0: 125 134 142 153
Getting back to that bit of trivia using GF(2^8) modulo hex 11d, GF(2^8) = 1 x^8 + 1 x^4 + 1 x^3 + 1 x^2 + 1
In GF(16) modulo hex 13, the product of the 4 compatible polynomials
(1 x^2 + 2 x + 5) (1 x^2 + 3 x + 4) (1 x^2 + 4 x + 2) (1 x^2 + 5 x + 3) = 1 x^8 + 1 x^4 + 1 x^3 + 1 x^2 + 1
Rcgldr (talk) 02:20, 27 October 2011 (UTC)
isomorphism with addition (exclusive or) - At least part of the reason for this requirement is that the mapping from GF(2^8) to GF(16^2) and back is to be done with 8 bit by 8 bit matrices (otherwise there's no point if the mapping uses a 256 byte lookup table). This means that every symbol in each field can be generated by summing some combination of the symbols in the 8 bit by 8 bit mapping matrices. For this strategy to work, if α^z = α^a + α^b + ... + α^h in GF(2^8), then β^z = β^a + β^b + ... + β^h in GF(16^2).
Note that there are 4 other compatible fields with α = 1 x + 1: modulo hex 126, 136, 147, 157, but these would involve more hardware. Generally the goal is to minimize the number of non-zero bits in the mapping polynomials. Also, that trivia bit doesn't apply in this case: the product of those 4 polynomials translates into hex 11b, not 11d. Rcgldr (talk) 14:04, 27 October 2011 (UTC)
All of this and what you say in your document makes sense. But I'm stumbling over the isomorphism that you describe. How do you find it? By trial and error? Or do you compute the modulo polynomial for GF(16^2) somehow? And for those non-compatible cases, is it because the generator does not match x (i.e., hex 10) or because addition in those fields does not match xor? Nageh (talk) 21:14, 27 October 2011 (UTC)
by trial and error. I select a GF(16) (GF(2^4)) field, usually modulo hex 13. Then I start with an array of 256 bytes, hex 00 to hex ff, and consider each byte to be a pair of nibbles a and b, each byte representing a potential GF(16^2) polynomial 1x^2 + ax + b. I then eliminate all non-prime pairs by calculating (x-c)(x-d) for all 256 combinations of nibbles c and d, removing the entries matching (x-c)(x-d) from the array. Next I set β to 1x + 0 (hex 10), and test each entry to see if it takes 255 cycles to: set e = 1; loop e = e*β modulo GF(16^2); while e ≠ 1. If it doesn't take 255 cycles, I try the next entry in the array. If it does take 255 cycles, then 1x^2 + ax + b is primitive with β = 1x + 0 (hex 10). I then test mapping by creating exponential tables for GF(2^8) (this is only done once) and GF(16^2), then creating the mapping matrix from the first 8 entries in the GF(16^2) exponential table. I then iterate e = hex 08 to hex ff, f = α^e in GF(2^8); map f from GF(2^8) to g in GF(16^2); then check if g = β^e in GF(16^2). If this holds true for all e from hex 08 to hex ff, then that instance of 1x^2 + ax + b is mappable or compatible with the GF(2^8) polynomial I'm trying to map. If it fails at this point, it's probably because of setting β = 1x+0 or xor; I'm not sure if there are other mapping issues. The last step is to actually test the inversion algorithm as explained in m8to44.doc to confirm it works. (A sketch of this search follows below.)
Using the trivia trick, in the case of GF(2^8) = 1 x^8 + 1 x^4 + 1 x^3 + 1 x^2 + 1, I could try doing a polynomial divide of the GF(16) version of 1 x^8 + 1 x^4 + 1 x^3 + 1 x^2 + 1 by a potential GF(16) 1x^2 + ax + b to see if it's one of the 4 factors with α = 1x + 0 (hex 10). The same method should work with any GF(2^8) with α = 1x + 0 (hex 2) (I've confirmed this for GF(2^8) hex 11d and 187). Actually testing the inversion algorithm would still be used to confirm it works. Rcgldr (talk) 00:41, 28 October 2011 (UTC)
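A rough Python sketch of the search described above, assuming GF(16) modulo hex 13, GF(2^8) modulo hex 11d with α = hex 02, and β = hex 10; instead of building the 8 bit by 8 bit mapping matrix, it checks additive compatibility directly on the exponential tables, which amounts to the same condition:
 def gf256_mul(x, y, poly=0x11D):        # GF(2^8) multiply, modulo hex 11d
     p = 0
     for _ in range(8):
         if y & 1:
             p ^= x
         y >>= 1
         x <<= 1
         if x & 0x100:
             x ^= poly
     return p

 def gf16_mul(x, y, poly=0x13):          # GF(16) multiply, modulo hex 13
     p = 0
     for i in range(4):
         if (y >> i) & 1:
             p ^= x << i
     for i in range(7, 3, -1):
         if (p >> i) & 1:
             p ^= poly << (i - 4)
     return p

 def gf162_mul(u, v, a, b):              # GF(16)[x]/(x^2 + a x + b); byte = (hi nibble) x + (lo nibble)
     u1, u0, v1, v0 = u >> 4, u & 0xF, v >> 4, v & 0xF
     hi  = gf16_mul(u1, v1)
     mid = gf16_mul(u1, v0) ^ gf16_mul(u0, v1)
     lo  = gf16_mul(u0, v0)
     return ((mid ^ gf16_mul(hi, a)) << 4) | (lo ^ gf16_mul(hi, b))   # reduce via x^2 = a x + b

 exp8 = [1]                              # exponential table for GF(2^8), alpha = hex 02
 for _ in range(254):
     exp8.append(gf256_mul(exp8[-1], 0x02))
 log8 = {v: i for i, v in enumerate(exp8)}

 compatible = []
 for a in range(16):
     for b in range(16):
         e, order = 0x10, 1              # primitivity: beta = hex 10 must take 255 cycles to return to 1
         while e != 1 and order < 256:
             e = gf162_mul(e, 0x10, a, b)
             order += 1
         if order != 255:
             continue
         exp16 = [1]
         for _ in range(254):
             exp16.append(gf162_mul(exp16[-1], 0x10, a, b))
         # compatibility: alpha^i + alpha^j = alpha^k must imply beta^i + beta^j = beta^k
         if all(exp16[log8[exp8[i] ^ exp8[j]]] == exp16[i] ^ exp16[j]
                for i in range(255) for j in range(i + 1, 255)):
             compatible.append(0x100 | (a << 4) | b)
 print([hex(p) for p in compatible])     # should report 0x125, 0x134, 0x142, 0x153 per the list above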
I updated my test program to test all 3 GF(16) (GF(2^4)) fields (hex 13: α=2, 19: α=2, 1f: α=3) with all 16 GF(2^8) with α = 1x+0 (hex 2), producing 4 GF(16^2) with β = 1x+0 (hex 10) for each of the 48 combinations, and the trivia rule held: the product of the 4 compatible GF(16^2)'s ends up with the same polynomial coefficients as the GF(2^8). It seems that none of the 14 GF(2^8) with α ≠ 1x+0 are mappable, even for β = or ≠ 1x+0 (unless my program has a bug). I used a second program to verify the inversion algorithm worked for all 192 (48*4) cases. Rcgldr (talk) 12:55, 29 October 2011 (UTC)
There must be a more elegant approach than trial-and-error, but I'll think about it when I find enough time. If you cannot find any compatible setting with β ≠ 1x+0 (and assuming the modulo polynomial is irreducible), this means that addition simply does not match exclusive-or, and indeed what you're looking for is an isomorphism where addition can be carried out modulo 2. Minor detail: you do not need to loop 255 times to test for a generator, it suffices to test that α^85, α^51, and α^15 are all ≠ 1 (the exponents 255/3, 255/5, and 255/17). So, to conclude this obviously interesting discussion, am I right to assume that this is some insider trick that you came up with? In that case our sourcing policies obviously do not permit adding such material to the article. Or is it referencable to secondary sources? In that case we might think about adding a section on implementation aspects, though honestly I think they are too specific for this otherwise quite general article (which is still lacking in much more basic aspects). Nageh (talk) 19:47, 29 October 2011 (UTC)
insider trick? - No, I've seen this method used at multiple companies: hard drive (GF(2^10) -> GF(32^2)), tape drive (GF(2^8) -> GF(16^2)), chip set makers (both versions), ... . This method dates back to at least 1995 (used in some DAT (tape) drives). I don't know who the inventor is. If there was a common source that distributed information about this and other methods in my local area, it might have been CMRR (UC San Diego - there was a professor Jack Keil Wolf (current status unknown)), and/or hawaii.edu (University of Hawaii - there was a professor E J Weldon Jr (retired)). I assume other sources are aware of this and other methods, but most of my external sources were people from CMRR or U of Hawaii.
If you cannot find any compatible setting with β ≠ 1x+0. The issue is with the 14 GF(2^8) where α ≠ 1x+0: none of them are mappable. All 16 GF(2^8) with α = 1x+0 can be mapped with β = 1x+0, and 10 of them can also be mapped with β ≠ 1x+0, but I see no reason to use β ≠ 1x+0. Every binary (xor) type RS implementation I've seen only uses GF()s where α = 1x+0. Rcgldr (talk) 22:19, 29 October 2011 (UTC)
The issue is any of the 14 GF(2^8) where α ≠ 1x+0, none of them are mappable - These GF()s are mappable after all. For each of the 14 GF(2^8) where α ≠ 1x+0, there are 8 out of 128 α (generators) that will map to any specific one of the 16 GF(16^2) with β = 1x+0. Rcgldr (talk) 17:42, 4 November 2011 (UTC)
unindent - This method of using sub-fields for inversion is apparently well known by the experts working with the encryption scheme known as AES, which includes inversion of symbols in GF(2^8) = 1 x^8 + 1 x^4 + 1 x^3 + 1 x + 1 (hex 11b) with α = 1x+1. One implementation of AES GF() inversion combined with the encryption mapping is known as the greedy algorithm. A nested sub-field scheme is used going down to 2 bit symbols: using my notation from above, GF(2^8) is mapped to GF(16^2), and each GF(16) 4 bit symbol is mapped to GF(4^2) (2 bit symbols), where 1/a = a^2. I found a web site with a pdf file by David Canright that optimizes the greedy algorithm by grinding through a large number of possible mapping implementations to find a minimal gate count: A_Very_CompactS-box_for_AES.pdf. My program failed to map any GF(2^8) with α ≠ 1x+0. The apparent issue may be that my program restricts the mapping matrix to the first few powers of β, or perhaps it's just a bug. If and when I get the time, I'll fix my program. It seems there's already been a lot of research into optimized inversion for AES, so I'll search for an answer rather than try to re-invent a solution on my own. Rcgldr (talk) 10:18, 30 October 2011 (UTC)
I also did a web search for Galois field inversion, and was surprised by multiple patents that seem to be based on the obvious method of exponentiating a number a in GF(2^n): 1/a = a^m where m = 2^n - 2. In the case of n = 8, m = 254 and 1/a = a^2 a^4 a^8 a^16 a^32 a^64 a^128. The wiki article mentions this method as well as a Euclidean method: Finite_field_arithmetic. Rcgldr (talk) 09:07, 30 October 2011 (UTC)
Check chapter 6 of this thesis. I am no longer surprised by the incredible number of utterly trivial patents. It just shows how broken the patent system is. Nageh (talk) 12:14, 30 October 2011 (UTC)
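A minimal Python sketch of the exponentiation method mentioned above, assuming the GF(2^8) field modulo hex 11d used elsewhere on this page: 1/a = a^254, computed with 7 squarings and 6 multiplies:
 def gf256_mul(a, b, poly=0x11D):        # GF(2^8) multiply, modulo hex 11d
     p = 0
     for _ in range(8):
         if b & 1:
             p ^= a
         b >>= 1
         a <<= 1
         if a & 0x100:
             a ^= poly
     return p

 def gf256_inv(a):
     sq = gf256_mul(a, a)                # a^2
     result = sq
     for _ in range(6):
         sq = gf256_mul(sq, sq)          # a^4, a^8, ..., a^128
         result = gf256_mul(result, sq)
     return result                       # a^(2+4+...+128) = a^254 = 1/a

 assert all(gf256_mul(a, gf256_inv(a)) == 1 for a in range(1, 256))
 print("1/a = a^254 verified for all 255 nonzero values")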

Other algorithms used for RS codes - hardware inversion - GF(2^8) where α ≠ 1x+0[edit]

From what I've read, finding a sub-field and mapping matrix for a GF(2^8) where α ≠ 1x+0 involves more trial and error. For example, there are 128 possible α (generators) for GF(2^8) = 1 x^8 + 1 x^4 + 1 x^3 + 1 x + 1 (hex 11b). One document I read chose GF(16^2) = 1 x^2 + 1 x + 9, with β = 1x+0. Then, in what is described as a brute force search through the 128 α for GF(2^8), one is searched for so that the mapping function satisfies M(a) + M(b) = M(a + b), where + is xor. For each trial case of α and β, two matrices are created: one is the powers 0 to 7 of α, the other is the powers 0 to 7 of β, with the symbols stored as columns in the two matrices left to right. Probably any set of 8 (sequential?) powers would work; I also tested reversing the matrices and using powers 1 to 8, and both variations resulted in an identical mapping matrix. The matrix mapping equation to be solved is

|map| |powers of α 0 to 7| = |powers of β 0 to 7|

|map| = |powers of β 0 to 7| |powers of α 0 to 7|^-1.

As mentioned, each of the 128 α are tried until M(a) + M(b) = M(a + b), where + is xor. In the document I read, success was found with α = 1 x^7 + 1 x^6 + 1 x^5 + 1 x^4 + 1 x^3 + 1 x^2 + 1 x + 1 (hex ff). I confirmed that this mapping matrix worked for inversion. I'm not sure how GF(16^2) or β was chosen. Rcgldr (talk) 07:17, 31 October 2011 (UTC)
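A Python sketch of the matrix construction described above. Since the GF(16) modulus for the hex 11b / α = hex ff case isn't stated here, the sketch uses the compatible pair from earlier in this discussion (GF(2^8) modulo hex 11d with α = hex 02, GF(16) modulo hex 13, GF(16^2) modulo hex 142 with β = hex 10); the construction |map| = |powers of β| |powers of α|^-1 is the same either way:
 def gf256_mul(a, b, poly=0x11D):        # GF(2^8) multiply, modulo hex 11d
     p = 0
     for _ in range(8):
         if b & 1:
             p ^= a
         b >>= 1
         a <<= 1
         if a & 0x100:
             a ^= poly
     return p

 def gf16_mul(a, b, poly=0x13):          # GF(16) multiply, modulo hex 13
     p = 0
     for i in range(4):
         if (b >> i) & 1:
             p ^= a << i
     for i in range(7, 3, -1):
         if (p >> i) & 1:
             p ^= poly << (i - 4)
     return p

 def gf162_mul(u, v, a=4, b=2):          # GF(16)[x]/(x^2 + 4x + 2), byte = (hi nibble) x + (lo nibble)
     u1, u0, v1, v0 = u >> 4, u & 0xF, v >> 4, v & 0xF
     hi, mid, lo = gf16_mul(u1, v1), gf16_mul(u1, v0) ^ gf16_mul(u0, v1), gf16_mul(u0, v0)
     return ((mid ^ gf16_mul(hi, a)) << 4) | (lo ^ gf16_mul(hi, b))

 def powers(mul, g, n=8):                # [g^0, g^1, ..., g^(n-1)]
     t = [1]
     for _ in range(n - 1):
         t.append(mul(t[-1], g))
     return t

 def cols_to_matrix(cols):               # bit r of column value c -> matrix[r][c]
     return [[(cols[c] >> r) & 1 for c in range(8)] for r in range(8)]

 def gf2_inv(m):                         # invert an 8x8 bit matrix over GF(2) (Gauss-Jordan)
     n = len(m)
     aug = [m[r][:] + [1 if c == r else 0 for c in range(n)] for r in range(n)]
     for col in range(n):
         piv = next(r for r in range(col, n) if aug[r][col])
         aug[col], aug[piv] = aug[piv], aug[col]
         for r in range(n):
             if r != col and aug[r][col]:
                 aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
     return [row[n:] for row in aug]

 def matmul(x, y):                       # 8x8 bit matrix product over GF(2)
     return [[sum(x[r][k] & y[k][c] for k in range(8)) & 1 for c in range(8)] for r in range(8)]

 def apply(m, v):                        # matrix times a byte treated as a bit column
     out = 0
     for r in range(8):
         bit = 0
         for c in range(8):
             bit ^= m[r][c] & ((v >> c) & 1)
         out |= bit << r
     return out

 A = cols_to_matrix(powers(gf256_mul, 0x02))   # powers 0 to 7 of alpha as columns
 B = cols_to_matrix(powers(gf162_mul, 0x10))   # powers 0 to 7 of beta as columns
 M = matmul(B, gf2_inv(A))                     # |map| = |powers of beta| |powers of alpha|^-1

 ea, eb = 1, 1                                 # verify the map sends alpha^i to beta^i for all 255 powers
 for _ in range(255):
     assert apply(M, ea) == eb
     ea, eb = gf256_mul(ea, 0x02), gf162_mul(eb, 0x10)
 print("mapping matrix verified")
With α = hex 02 and this bit ordering, the matrix of powers of α is the identity, so the map's columns end up being simply the powers of β (compare the comment below about the α = 1x+0 case).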

I'm not sure how GF(16^2) or β was chosen. - The choice is arbitrary. Turns out that for any GF(2^8) where α ≠ 1x+0, there are 8 out of 128 α (generators), each of which will be compatible with a specific GF(16^2) with β = 1x+0, so the combination of α, GF(16), and GF(16^2) that reduces the amount of hardware is chosen. Rcgldr (talk) 17:44, 4 November 2011 (UTC)

In the cases where α = 1x+0, the matrices can be reversed so that the columns of the GF(2^8) matrix are powers 7 to 0 of α, which is the identity matrix. So the columns of the mapping matrix for α = 1x+0 will always be powers 7 to 0 of β. Rcgldr (talk) 12:07, 31 October 2011 (UTC)

Reed Solomon article intro - Berlekamp Massey implied to be only efficient decoding algorithm[edit]

From the intro section: where encoding symbols are derived from the coefficients of a polynomial constructed by multiplying p(x) with a cyclic generator polynomial. This gives rise to an efficient decoding algorithm, which was discovered by Elwyn Berlekamp and James Massey, and is known as the Berlekamp–Massey decoding algorithm. Seems the real breakthrough was discovering the relationship between the error locator polynomial and the syndromes, as explained in the Peterson_decoder section, where the Berlekamp method is mentioned as an improvement: Peterson (1960) developed a practical decoder based on syndrome decoding. ... Berlekamp (below) would improve on that decoder. So there's a conflict in the article. Seems like the intro section should be revised. The Euclidean based algorithm is yet another improved method, but it is not mentioned in the intro section. Rcgldr (talk) 08:19, 1 November 2011 (UTC)

The lead section is right, actually. Peterson's decoder does not scale well for larger numbers of errors. Also, it was originally designed only for binary BCH codes. It was Gorenstein and Zierler who extended Peterson's algorithm for the decoding of non-binary BCH/RS codes to what is now known as the Peterson–Gorenstein–Zierler algorithm. So it is the body of the article that primarily needs improvement. When it comes to notable algorithms, Chien search and Forney's algorithm also need more attention. The Euclidean algorithm is less an improvement than an alternative to Berlekamp-Massey, but obviously needs more attention, too. Nageh (talk) 09:15, 1 November 2011 (UTC)
I second Nageh's characterization. The notion of "practical" is a moving target. The RS count-the-subsets decoder was impractical. The PGZ decoder is practical in the sense that it could be implemented, but it has an awkward error location polynomial step that examines several systems of linear equations (matrix ops are O(t^3), so the PGZ number-of-errors step is O(t^4)?). The hardware to do that for many check symbols would be expensive. In contrast, the BM decoder's error location polynomial step is simpler (O(t^2)?). Glrx (talk) 17:06, 2 November 2011 (UTC)
Now that I've re-read both sections, the issue is "practical" versus "efficient". I think the linear RS method qualifies as "practical", assuming it is/was used in some devices. On page 105 of the cited RS tutorial (1990) 1990019023.pdf there is a claim that for maximum error cases of 3 (possibly 4) or fewer, an optimized linear method (the math worked out in advance for the determinants, eliminating xors of duplicate terms) would require less hardware than the iterative methods. The real breakthrough would seem to be the discovery of the relationship between the error locator polynomial and the syndromes, which the mentioned RS decoder algorithms (linear, BM, Euclid) are based on. The main exception would be single burst decoders that re-encode the received message and cycle the re-calculated parities forwards or backwards until the most significant half of the parities is zero, in which case the least significant half is the error value. The first section should also include Euclid as well as BM as an efficient algorithm. The Euclid method requires two shift registers of the same size as BM's main shift register, but BM needs a "previous" copy of Λ(x). Euclid iterates t = #errors times, while BM iterates N = #syndromes times. Euclid produces Ω(x), but BM and the linear method require a second iteration to produce Ω(x). So it's a trade-off in size versus speed. Rcgldr (talk) 04:03, 3 November 2011 (UTC)
unindent - I changed the title of this discussion section to BM implied to be only efficient algorithm. Perhaps the statement in question could be changed to This gives rise to efficient decoding algorithms (described below). This would avoid having to clarify the pros and cons of each algorithm in the intro section. Rcgldr (talk) 04:16, 3 November 2011 (UTC)

article section 4.2 - Peterson Decoder - 4.2.3 - Error locator polynomial[edit]

Quote from the end of the article: However, this method doesn't tell you the number of errors, v.

To determine the number of errors with the Peterson decoder, assume the max error case (# syndromes / 2). If this is greater than the actual number of errors, the matrix will be redundant and not invertible (or check the determinant, which will be zero). In this case the procedure is to decrease the number of errors by 1, and repeat this process until the actual number of errors (which could be zero) is determined.

Section 4.2.3 of the article should be updated with similar wording to explain how the number of errors can be determined using the Peterson decoder. Rcgldr (talk) 18:36, 23 January 2013 (UTC)
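For illustration, a small Python sketch of this procedure, assuming the GF(2^8) field modulo hex 11d used elsewhere on this page, 6 syndromes computed at α^0 .. α^5, and a hypothetical 2-error pattern (the error values and locations below are made up for the demo):
 def gf_mul(a, b, poly=0x11D):           # GF(2^8) multiply, modulo hex 11d
     p = 0
     for _ in range(8):
         if b & 1:
             p ^= a
         b >>= 1
         a <<= 1
         if a & 0x100:
             a ^= poly
     return p

 def gf_inv(a):                          # brute force inverse, fine for a sketch
     return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

 def is_singular(m):                     # Gaussian elimination over GF(2^8)
     m = [row[:] for row in m]
     n = len(m)
     for col in range(n):
         pivot = next((r for r in range(col, n) if m[r][col]), None)
         if pivot is None:
             return True
         m[col], m[pivot] = m[pivot], m[col]
         inv = gf_inv(m[col][col])
         for r in range(col + 1, n):
             factor = gf_mul(m[r][col], inv)
             m[r] = [m[r][c] ^ gf_mul(factor, m[col][c]) for c in range(n)]
     return False

 def error_count(synd, v_max):           # decrease v until the syndrome matrix is invertible
     for v in range(v_max, 0, -1):
         a = [[synd[i + j] for j in range(v)] for i in range(v)]
         if not is_singular(a):
             return v
     return 0                            # all-zero syndromes: no errors

 alpha_pow = [1]
 for _ in range(254):
     alpha_pow.append(gf_mul(alpha_pow[-1], 2))

 errors = [(0x37, 5), (0xC4, 11)]        # hypothetical (value, location) pairs
 synd = [0] * 6                          # S_j = sum of value * (alpha^location)^j, j = 0..5
 for j in range(6):
     for val, loc in errors:
         synd[j] ^= gf_mul(val, alpha_pow[(j * loc) % 255])

 print(error_count(synd, 3))             # prints 2: the 3x3 syndrome matrix is singular, the 2x2 is not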

Done. Glrx (talk) 16:42, 24 January 2013 (UTC)
Thanks. Rcgldr (talk) 15:34, 2 February 2013 (UTC)

First paragraph - encoded message generated by division (modulo), versus multiplying?[edit]

From the first paragraph of the article:

Instead, RS codes are viewed as cyclic BCH codes, where encoding symbols are derived from the coefficients of a polynomial constructed by multiplying p(x) with a cyclic generator polynomial.

I realize that a large matrix multiply can be performed to encode data, but doesn't the BCH method normally involve multiplying the message p(x) by x^t (where t is the number of parity symbols to append to p(x)), then dividing this by the generator polynomial g(x), and appending the remainder to the message:

 s(x) = (p(x) \times x^t) - ( (p(x) \times x^t) \bmod g(x) )

Rcgldr (talk) 04:59, 26 February 2013 (UTC)

Both methods can be used. The only important issue is that g(x) divide the transmitted polynomial. I believe it is more common for the sender to divide so that the transmitted message is in the clear. Some text that kept making that point has been removed. Glrx (talk) 02:09, 27 February 2013 (UTC)
Note - I edited this talk section so that it matches the names used in the rest of the article. Also, a large matrix multiply could be used for either encode method. The encode matrix would be smaller in the sender-divides case.
I didn't read far enough. The sender divides method is covered in the later remarks section:
In this case, the t check symbols are created by computing the remainder, s_r(x):
s_r(x) = ( p(x) \times x^t ) \bmod g(x)
and that remainder is used to make an evenly divisible codeword:
s(x) = p(x) \times x^t - s_r(x)
So the two methods are s(x) = p(x) g(x) or s(x) = ( p(x) x^t ) - ( (p(x) x^t) mod g(x) ). Perhaps the first paragraph of the article should also mention the second method.
Rcgldr (talk) 14:49, 28 February 2013 (UTC)
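A small Python sketch of the "sender divides" method spelled out above, assuming the GF(2^8) field modulo hex 11d used elsewhere on this page, t = 4 check symbols, a generator with roots α^0 .. α^(t-1) (the choice of first root varies between implementations), and a made-up 5 symbol message:
 def gf_mul(a, b, poly=0x11D):           # GF(2^8) multiply, modulo hex 11d
     p = 0
     for _ in range(8):
         if b & 1:
             p ^= a
         b >>= 1
         a <<= 1
         if a & 0x100:
             a ^= poly
     return p

 def poly_mul(f, g):                     # polynomial product, coefficients most significant first
     out = [0] * (len(f) + len(g) - 1)
     for i, fi in enumerate(f):
         for j, gj in enumerate(g):
             out[i + j] ^= gf_mul(fi, gj)
     return out

 def poly_mod(f, g):                     # remainder of f(x) divided by monic g(x)
     f = f[:]
     for i in range(len(f) - len(g) + 1):
         q = f[i]
         if q:
             for j, gj in enumerate(g):
                 f[i + j] ^= gf_mul(q, gj)
     return f[-(len(g) - 1):]

 t = 4
 alpha_pow = [1]
 for _ in range(t):
     alpha_pow.append(gf_mul(alpha_pow[-1], 2))

 g = [1]
 for i in range(t):
     g = poly_mul(g, [1, alpha_pow[i]])  # (x - alpha^i); minus is xor here, same as plus

 p = [0x12, 0x34, 0x56, 0x78, 0x9A]      # hypothetical message symbols, most significant first
 shifted = p + [0] * t                   # p(x) * x^t
 s = p + poly_mod(shifted, g)            # s(x) = p(x) x^t - ((p(x) x^t) mod g(x)); message stays in the clear

 assert all(c == 0 for c in poly_mod(s, g))   # s(x) is an exact multiple of g(x)
 print("codeword:", [hex(c) for c in s])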
  • I moved that part of the "remarks" section to a new subsection systematic encoding procedure. There are many different encoding methods, each with their advantages and disadvantages. I think the lead section of the article should stay on a higher level and not dive into the details of any particular method. ylloh (talk) 17:08, 28 February 2013 (UTC)
In the new systematic encoding section:
multiplying p(x) by x^t to make room for the t+1 check symbols
but I think that should be
multiplying p(x) by x^t to make room for the t check symbols
If the lead section is to stay on a high level, it shouldn't mention a specific construction method such as multiplying p(x) by g(x). The wording could be something like: the encoded message s(x) is based on p(x) and g(x) and is constructed so that it is an exact multiple of g(x).
Rcgldr (talk) 10:19, 1 March 2013 (UTC)
Thanks, I corrected the +/-1 error. Your wording describes the two methods that we have in the BCH view well, but does not capture Reed & Solomon's original view. Maybe we should add a description of both views (but not the different encoding methods)? ylloh (talk) 14:37, 1 March 2013 (UTC)

unindent - I'm thinking that the lead section should just provide a basic description of current implementations of RS codes, and review the history in later sections. Rcgldr (talk) 20:15, 1 March 2013 (UTC)

There's an inconsistency in the ordering of coefficients. The convention I've normally used considers the coefficients from most significant to least significant, p(x) = p_(k-1) x^(k-1) + p_(k-2) x^(k-2) + ... + p_0. This is the convention used in the examples for the Berlekamp–Massey decoder and Euclidean decoder, with the Euclidean decoder clarifying the ordering by stating that R and S are forward, A and B are reversed. From the earlier section, Systematic encoding procedure - the message as an initial sequence of values, it is stated that the first k entries of the codeword are exactly equal to x. In the section you added, Systematic encoding procedure, first is changed to last: the k last coefficients are identical to the coefficients of p(x). Rcgldr (talk) 20:15, 1 March 2013 (UTC)

In theoretical computer science, the "classical" view of Reed-Solomon codes is still used today because it is somewhat more intuitive and elegant, so I wouldn't call it "history". I don't have a strong opinion on whether or not it should be mentioned in the lead.
You're right, there's now some inconsistency between the definition and the examples. I think the examples should always briefly mention which encoder is being discussed (in this case, a systematic BCH encoder with q,n,k set in some way). If there is a consensus in the relevant coding literature whether to write polynomials from left to right or from right to left, that consensus variant should be used in the definition sections. However, I suspect that one does not care about it and usually writes it from the smallest monomial to the largest because this leads to the cleanest notation. Maybe we can use the cleanest notation at first and then add the remark that the codewords in the systematic BCH code are reversed when it is used in practical applications (so that codewords always start with the message)? ylloh (talk) 21:27, 1 March 2013 (UTC)
My only issue with the lead section is that it mentions one specific method, s(x) = p(x) g(x), but not the other, s(x) = ( p(x) x^t ) - ( (p(x) x^t) mod g(x) ), and this is in reference to BCH codes; I had the impression that even "classic" BCH codes use s(x) = ( p(x) x^t ) - ( (p(x) x^t) mod g(x) ). Using s(x) = p(x) g(x) is more complicated for both encoder and decoder. The encoder involves more data and/or a larger encoding matrix. The decoder would have to do the equivalent of first correcting a received s(x), then dividing the corrected s(x) by g(x) to produce p(x). Rcgldr (talk) 23:29, 1 March 2013 (UTC)
In the articles and text books I've seen, when polynomials are written out as a series of terms, they are usually written with the most significant term first (especially for examples of polynomial math such as long hand multiplication or division), even though when a polynomial is written as a sum (sigma) or product (pi) series, the series index normally goes from 0 to n-1 or from 1 to n. In the cases where it isn't clear what the order of the terms of a polynomial is, instead of using "first ... " or "last ... ", the terms "most significant ... " or "least significant ... " could be used. Rcgldr (talk) 17:14, 6 March 2013 (UTC)
Yes, the monomial order still needs some streamlining. Also, e.g., in the beginning the message is called "x" (conforming with other articles such as Hamming code or Hadamard code) and the variable is called "a" (conforming with \alpha, I suppose), and in later parts of the article, the variable is called x. I think a is not a good choice, but I'd prefer to call the message x. Maybe we can use capital "X" or "y" for the variable? What do you think? ylloh (talk) 18:19, 6 March 2013 (UTC)
polynomial term order - This is already fixed. The ordering of terms is now consistent within the article.
Throughout most of the article, the original message is considered to be the coefficients of a polynomial p(x). The conflict occurs in the Reed & Solomon's original view section, where x is the message represented as a set of symbols, and where p(...) is described as a generator polynomial that produces a scalar to be evaluated at n points to produce a codeword C(x), but then defines p_x(a) as a summation of the products of scalars x_i \cdot a^{i-1}. Rcgldr (talk) 23:11, 6 March 2013 (UTC)

Isn't this a very badly written article?[edit]

The article as written is of a highly technical nature and shows no empathy for a reader of an encyclopedia. — Preceding unsigned comment added by 130.76.96.111 (talk) 23:10, 1 March 2013 (UTC)

Perhaps there should be a comment that one of the external links is for a tutorial, Tutorial on Reed–Solomon Error Correction Coding, which is currently the last external link in the article. Rcgldr (talk) 23:34, 1 March 2013 (UTC)
It would be helpful to point to specific parts of the article, argue why you think they are badly written, and, if you can, suggest improvements. Thanks! ylloh (talk) 18:13, 6 March 2013 (UTC)

Trivial typo, not sure but anyway I'm afraid to edit article page 11:27, 24 July 2013 (UTC) 80.221.245.225 (talk)[edit]

Near the top of the article:

"Furthermore, RS codes are suitable as multiple-burst bit-error correcting codes"

Shouldn't that be "multiple-bit burst errors" ?

Or maybe it's right the way it is.

80.221.245.225 (talk) 11:27, 24 July 2013 (UTC)

RS codes can handle multiple bursts; the error symbols may be scattered throughout the message. Some ECC codes are for single bursts within a block. Glrx (talk) 20:23, 25 July 2013 (UTC)
Those two sentences could be removed, since there's previous text explaining that an RS code can correct up to t/2 symbols. Perhaps that other sentence could include the wording any combination of up to t/2 symbols. Rcgldr (talk) 12:40, 19 November 2013 (UTC)
Correcting error bursts is important, and that is what the lede is trying to get across. My memory is dim, but I think Fire codes were used to correct single burst errors on disk drives. RS codes replaced Fire codes because they can correct multiple random and burst errors. Glrx (talk) 01:09, 22 November 2013 (UTC)
My issue is that the term bursts is typically bit oriented, while RS codes are symbol oriented. Perhaps the article statement could be changed to multiple burst / multiple bit error correcting code. It doesn't make it clear that an n+2 bit burst could affect 3 n-bit symbols, or that interleaving could be used to reduce the impact of a multi-bit burst error. Rcgldr (talk) 03:37, 22 November 2013 (UTC)
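A tiny Python illustration of that point, assuming an m-bit symbol size and a contiguous burst of b bits: the worst case number of symbols touched is 1 + ceil((b - 1) / m), so an (n+2)-bit burst can touch 3 n-bit symbols.
 from math import ceil

 def max_symbols_hit(burst_bits, symbol_bits):
     # worst case: the burst starts on the last bit of a symbol
     return 1 + ceil((burst_bits - 1) / symbol_bits)

 print(max_symbols_hit(10, 8))           # a 10-bit burst (n+2 for n = 8) can touch 3 symbols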

Singleton bound?[edit]

So the Singleton bound says delta + R ≤ 1. But if delta = 1 - k/n + 1/n, and R = k/n, then the formula reads: 1 + 1/n ≤ 1, which is only valid if n < 0. That doesn't seem right? 194.39.218.10 (talk) 06:28, 30 September 2013 (UTC)