6b/8b encoding

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 125.24.166.68 (talk) at 17:17, 25 March 2020 (Coding rules). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In telecommunications, 6b/8b is a line code that expands 6-bit codes to 8-bit symbols in order to maintain DC balance in a communications system.[1]

Each 8-bit output symbol contains 4 zero bits and 4 one bits, so the code can, like a parity bit, detect all single-bit errors.
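The error-detection property can be illustrated with a short Python sketch. The helper name `is_balanced` is hypothetical, and the check covers only the bit count; it does not test for the two excluded patterns described later in the article.

```python
def is_balanced(symbol):
    """Return True if the 8-bit value has exactly four 1 bits."""
    return 0 <= symbol <= 0xFF and bin(symbol).count("1") == 4


symbol = 0b10101010  # a balanced symbol: four ones, four zeros
assert is_balanced(symbol)

# Flipping any single bit changes the number of ones to 3 or 5,
# so every single-bit error fails the balance check.
corrupted = [symbol ^ (1 << i) for i in range(8)]
all_detected = not any(is_balanced(c) for c in corrupted)
print(all_detected)  # True
```

As with a parity bit, some multi-bit errors go unnoticed: flipping one 1 bit and one 0 bit leaves the symbol balanced.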

The number of 8-bit patterns with 4 bits set is the binomial coefficient $\binom{8}{4} = 70$. Excluding the patterns 11110000 and 00001111 leaves 68 coded patterns: 64 data codes plus 4 control codes.
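The count can be checked by brute force. The following Python sketch enumerates all 8-bit values; the variable names are illustrative only.

```python
from math import comb

# All 8-bit patterns with exactly four 1 bits.
balanced = [s for s in range(256) if bin(s).count("1") == 4]

# The two patterns the code excludes.
excluded = {0b11110000, 0b00001111}
usable = [s for s in balanced if s not in excluded]

print(len(balanced), comb(8, 4), len(usable))  # 70 70 68
```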

Coding rules

The 64 possible 6-bit input codes can be classified according to their disparity, the number of 1 bits minus the number of 0 bits:

Ones  Zeros  Disparity  Number of codes
0     6      −6         1
1     5      −4         6
2     4      −2         15
3     3      0          20
4     2      +2         15
5     1      +4         6
6     0      +6         1
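The classification above can be reproduced with a few lines of Python. This is a sketch; the function name `disparity` is an assumption made for illustration.

```python
from collections import Counter


def disparity(code, width=6):
    """Number of 1 bits minus number of 0 bits in a width-bit code."""
    ones = bin(code).count("1")
    return ones - (width - ones)


# Tally all 64 six-bit codes by disparity.
counts = Counter(disparity(c) for c in range(64))
for d in sorted(counts):
    print(f"{d:+d}: {counts[d]}")
```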

The 6-bit input codes are mapped to 8-bit output symbols as follows:

  • The 20 6-bit codes with disparity 0 are prefixed with 10
    Example: 000111 → 10000111
    Example: 101010 → 10101010
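  • The 14 6-bit codes with disparity +2 other than 001111 are prefixed with 00 (001111 is handled separately because the prefix would produce the excluded pattern 00001111)
    Example: 010111 → 00010111
  • The 14 6-bit codes with disparity −2 other than 110000 are prefixed with 11 (110000 is handled separately because the prefix would produce the excluded pattern 11110000)
    Example: 101000 → 11101000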
  • The remaining 20 codes (12 with disparity ±4, 2 with disparity ±6, 001111, 110000, and the 4 control codes) are assigned to the 20 balanced symbols beginning with 01 as follows:
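Type     Input     Output    Type     Input     Output    Pattern
−6       000000    01011001  +6       111111    01100110  01_xx__x
−4       000001    01110001  +4       111110    01001110  01xx____
         000010    01110010           111101    01001101
         000100    01100101           111011    01011010  01x____x
         001000    01101001           110111    01010110
         010000    01010011           101111    01101100  01____xx
         100000    01100011           011111    01011100
−2       110000    01110100  +2       001111    01001011  01___x__
Control  K 000111  01000111  Control  K 111000  01111000
         K 010101  01010101           K 101010  01101010

In the Pattern column, x marks the bits that are set to 1 in the left-hand output in addition to the 1 bits of the input itself. Each right-hand input is the complement of the left-hand input, and the low six bits of its output are the complement of the low six bits of the left-hand output.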
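The coding rules can be sketched as a small table-driven encoder and decoder in Python. The special-case values are copied from the table above; the names `SPECIAL`, `CONTROL`, `PREFIX`, `encode` and `decode` are hypothetical and chosen only for this illustration, not taken from any standard implementation.

```python
# Explicit assignments from the table (6-bit input -> 8-bit symbol).
SPECIAL = {
    0b000000: 0b01011001, 0b111111: 0b01100110,
    0b000001: 0b01110001, 0b111110: 0b01001110,
    0b000010: 0b01110010, 0b111101: 0b01001101,
    0b000100: 0b01100101, 0b111011: 0b01011010,
    0b001000: 0b01101001, 0b110111: 0b01010110,
    0b010000: 0b01010011, 0b101111: 0b01101100,
    0b100000: 0b01100011, 0b011111: 0b01011100,
    0b110000: 0b01110100, 0b001111: 0b01001011,
}
# The four control codes (K) from the same table.
CONTROL = {
    0b000111: 0b01000111, 0b111000: 0b01111000,
    0b010101: 0b01010101, 0b101010: 0b01101010,
}
# Two-bit prefixes for all other data codes, keyed by disparity.
PREFIX = {0: 0b10, +2: 0b00, -2: 0b11}


def encode(code, control=False):
    """Map a 6-bit code to its 8-bit symbol."""
    if not 0 <= code < 64:
        raise ValueError("input must be a 6-bit value")
    if control:
        if code not in CONTROL:
            raise ValueError("not a control code")
        return CONTROL[code]
    if code in SPECIAL:
        return SPECIAL[code]
    ones = bin(code).count("1")
    disparity = ones - (6 - ones)  # only 0, +2 or -2 can reach this point
    return (PREFIX[disparity] << 6) | code


# Reverse lookup: symbol -> (code, is_control).
DECODE = {encode(c): (c, False) for c in range(64)}
DECODE.update({sym: (c, True) for c, sym in CONTROL.items()})


def decode(symbol):
    """Return (code, is_control); raise ValueError for an invalid symbol."""
    try:
        return DECODE[symbol]
    except KeyError:
        raise ValueError("invalid 6b/8b symbol") from None


print(format(encode(0b010111), "08b"))  # 00010111
```

A lookup table is used for decoding because it also rejects every 8-bit value that is not one of the 68 valid symbols, which is how single-bit errors show up at the receiver.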

Because every symbol contains exactly four 1 bits and four 0 bits, no data symbol contains more than four consecutive identical bits, and because the patterns 11110000 and 00001111 are excluded, no data symbol begins or ends with more than three identical bits. Thus, the longest run of identical bits that can be produced is 6: three at the end of one symbol followed by three at the start of the next. (That is, this is a (0,5) RLL code, with a worst-case running disparity between −3 and +3.)
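These bounds can be confirmed exhaustively. The Python sketch below checks all 68 valid patterns, a superset of the 64 data symbols, and every ordered pair of them; pairs are sufficient because no run can span an entire symbol. The helper names are illustrative.

```python
from itertools import groupby, product

# All balanced 8-bit patterns except the two excluded ones, as bit strings.
symbols = [format(s, "08b") for s in range(256)
           if bin(s).count("1") == 4 and s not in (0b11110000, 0b00001111)]


def longest_run(bits):
    """Length of the longest run of identical characters in a bit string."""
    return max(len(list(group)) for _, group in groupby(bits))


def disparity_extremes(bits):
    """Lowest and highest running disparity reached while reading bits."""
    total = low = high = 0
    for bit in bits:
        total += 1 if bit == "1" else -1
        low, high = min(low, total), max(high, total)
    return low, high


max_in_symbol = max(longest_run(s) for s in symbols)
max_across = max(longest_run(a + b) for a, b in product(symbols, repeat=2))
low = min(disparity_extremes(s)[0] for s in symbols)
high = max(disparity_extremes(s)[1] for s in symbols)

print(max_in_symbol, max_across, low, high)  # 4 6 -3 3
```

Since every symbol is balanced, the running disparity returns to zero at each symbol boundary, so the per-symbol extremes are also the extremes of the whole stream.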

Any occurrence of 6 consecutive identical bits constitutes a comma sequence (also called a sync mark or syncword); it identifies the symbol boundaries precisely. Those 6 bits straddle the inter-symbol boundary, with exactly 3 of them at the end of one symbol and 3 at the start of the next.
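A receiver could use this property to recover symbol alignment. The following Python sketch assumes the received bits are available as a string of "0" and "1" characters; the function name `find_symbol_boundaries` is hypothetical.

```python
import re


def find_symbol_boundaries(bits):
    """Return the bit offsets of symbol boundaries revealed by comma sequences.

    A run of six identical bits has three bits on each side of a boundary,
    so the boundary lies three bits after the start of the run.
    """
    return [m.start() + 3 for m in re.finditer(r"0{6}|1{6}", bits)]


# Two symbols from the examples above: 11101000 followed by 00010111.
stream = "11101000" + "00010111"
print(find_symbol_boundaries(stream))  # [8]

# The same property holds when reception starts in the middle of a symbol:
# here only the last five bits of the first symbol were received.
partial = "01000" + "00010111" + "10101010"
offset = find_symbol_boundaries(partial)[0] % 8  # alignment within a byte
```

A stream that happens to contain no run of six identical bits gives an empty list, so a receiver relying on this method alone has to wait until such a run occurs.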

References

  1. ^ Kees A. Schouhamer Immink (November 2004). Codes for Mass Data Storage Systems (Second fully revised ed.). Eindhoven, The Netherlands: Shannon Foundation Publishers. ISBN 90-74249-27-2. Retrieved 2015-08-23.