Introduction to BCH Codes

The Bose-Chaudhuri-Hocquenghem (BCH) codes are a widely used class of multi-bit error correcting codes. A simple BCH code is developed here which can achieve two-bit error correction on 7-bit codewords. An (n,k) BCH code with Hamming distance D=5 is generally labeled BCH(n,k,5).


Code Generation with Galois Field GF(23)

A Galois field GF(2m) can be used to construct generator and check matrices for single bit error correction based on primitive polynomials. The BCH codes essentially combine multiple polynomials into a single generating polynomial with predictable distance properties and the ability to correct multiple bit errors.

The non-zero elements of the Galois field GF(8) may be generated (not in order) as the set {a0,a1,a2,...,a7} modulus a primitive polynomial P(x) where a is called a primitive root of the field. The table below shows the non-zero elements of a GF(8) defined as {an} with P(x) = x3 + x + 1. While a=2 is a specific root of this polynomial, a general expression in terms of an provides that if P(a)=0 then a3 + a + 1 = 0 and a3 = a + 1, such that successive terms of an defining all the elements of GF(8) can be derived as shown below.


GF(8) defined as {an} with P(x) = x3 + x + 1

a0 = 1 = 001 = 1 = a7 = a14 = a21 = a28 ...
a1 = a = 010 = 2 = a8 = a15 = a22 = a29 ...
a2 = a2 = 100 = 4 = a9 = a16 = a23 = a30 ...
a3 = a+1 = 011 = 3 = a10 = a17 = a24 = a31 ...
a4 = a2+a = 110 = 6 = a11 = a18 = a25 = a32 ...
a5 = a2+a+1 = 111 = 7 = a12 = a19 = a26 = a33 ...
a6 = a2+1 = 101 = 5 = a13 = a20 = a27 = a34 ...

From the table, it can be confirmed that a is a GF(8) root of P(x), since P(a) = a3 + a + 1 = 011 + 010 + 001 = 0.

Observe that consecutive powers of an repeat periodically and the value of an for any n can be found in this way. These terms can serve to develop an error correcting code.

The i-th power of each of the elements aN can be described as aNi. From the values found above, these powers can be summarized as the octal values shown here:

i a6i a5i a4i a3i a2i a1i a0i
1 5 7 6 3 4 2 1
2 7 3 2 5 6 4 1
3 6 2 7 4 5 3 1
4 3 5 4 7 2 6 1
5 4 6 5 2 3 7 1
6 2 4 3 6 7 5 1
7 1 1 1 1 1 1 1
  A B C D E F G
Columns are labeled A..G for reference.

Rows in the table repeat periodically for i > 7. The row for i = 7 is included for completeness.


Error Control Check Matrix

Single-Bit Error Correction (D=3)

A (3 x 7) check matrix H for single-bit error correction is defined from the above table (with i=1) as shown below. In practice any of the rows (1 to 6) could be used.

A B C D E F G H = 5 7 6 3 4 2 1 1 1 1 0 1 0 0 H = 0 1 1 1 0 1 0 1 1 0 1 0 0 1

This check matrix describes a (7,4) code in which 7-bit codewords lead to 3-bit syndromes. The table below shows the unique 3-bit syndrome (in octal) for all possible single bit errors as E(x) = xn.

Table 1 E(x)=0E(x)=x0E(x)=x1E(x)=x2E(x)=x3E(x)=x4E(x)=x5E(x)=x6
Error  00000000000001000001000001000001000001000001000001000000
LABEL  GFEDCBA
Syndrome 01243675

Single bit error correction is possible because every 3-bit column is unique. For example, the single bit error E(x) = x4 = (0010000) returns the syndrome 6 from column C and no other single bit error will return this same value.

Double-bit error correction is not possible because different two-bit errors do not give a unique result. The two bit error E(x) = x6 + x3 returns a non-zero syndrome (detection!); however, it is the same syndrome as a single bit error E(x) = x4 since the bits in column C alone and the sum of the bits in columns (A+D) both equal 6. Also, the two-bit error (B+G) is 7 + 1 = 6. Consequently, there is no way to distinguish between different two-bit patterns and therefore no way to correct two bad bits.

Overall, this matrix gives a minimum Hamming distance D=3. Using a different row (i<7) in the table simply changes the column ordering and would not change the Hamming distance, although the specific syndromes would be different for each single bit error.


Double-Bit Error Correction (D=5)

To achieve two bit error correction, D=5 is necessary. A code with this property can be constructed from the above (D=3) single bit error corrrecting matrix by extending the matrix (with another single bit error correcting matrix) to give a double bit error correcting check matrix.

The challenge is always to determine how to extend this matrix to assure that all one- and two-bit errors will give a unique syndrome. Note that in every row (i) the different column ordering gives a different syndrome for any specific error. The combination of two rows from the table is key to this next step.

Using the same GF(8) as the basis for this new matrix, another row may be incorporated as shown below. This additional row will add three rows to the binary matrix. This new generating matrix describes a (7,1) code. Choose i=3 for this new row.

The (6 x 7) check matrix H for double-bit error correction is defined with i=(1,3) shown below.

A B C D E F G H = 5 7 6 3 4 2 1 6 2 7 4 5 3 1 1 1 1 0 1 0 0 H = 0 1 1 1 0 1 0 1 1 0 1 0 0 1 1 0 1 1 1 0 0 1 1 1 0 0 1 0 0 0 1 0 1 1 1

Single bit error correction is still possible because every 6-bit column is unique. This result is assured from the original three rows and does not depend on the new rows. The single bit error E(x) = x4 now returns the 6-bit syndrome 67 (octal) from column C and no other single bit error will return this same value.

Double-bit error correction is now possible because different two-bit errors will give a unique result. The two bit error E(x) = x6 + x3 or (A,D) returns 34 + 56 = 62. Also the two-bit error (B,G) is now 72 + 11 = 63, another different result. Consequently, it is now possible to distinguish between different two-bit error patterns and therefore two bad bits can be corrected by identifying their unique syndrome. This syndrome is like a 'signature' for the error pattern. The table below shows the unique 6-bit syndrome (in octal) for all possible one and two bit errors.

Table 2 E(x)=0E(x)=x0E(x)=x1E(x)=x2E(x)=x3E(x)=x4E(x)=x5E(x)=x6
Error  00000000000001000001000001000001000001000001000001000000
   GFEDCBA
0000000 0011234534677256
0000001G11--325425766347
0000010F2332--6617445175
0000100E455466--71223713
0001000D34251771--534662
0010000C6776442253--1531
0100000B726351374615--24
1000000A56477513623124--

The check matrix confirms the performance of this code in identifying and correcting two bit errors. While this choice of two rows from the table lead to enhanced code performance, not all row pairs will be as successful, as shown below.


A Check Matrix that Does not Work

The (6 x 7) check matrix H for double-bit error correction is defined from the above table (with i=1 and i=2) as shown below. However, this choice of rows fails.

A B C D E F G H = 5 7 6 3 4 2 1 7 3 2 5 6 4 1 1 1 1 0 1 0 0 H = 0 1 1 1 0 1 0 1 1 0 1 0 0 1 1 0 0 1 1 1 0 1 1 1 0 1 0 0 1 1 0 1 0 0 1

At first glance, this new matrix seems as good as the previous example; however, in this case, many different errors will lead to the same syndrome. The table below shows the 6-bit syndrome (in octal) for all possible one and two bit errors.

Table 3 E(x)=0E(x)=x0E(x)=x1E(x)=x2E(x)=x3E(x)=x4E(x)=x5E(x)=x6
Error  00000000000001000001000001000001000001000001000001000000
   GFEDCBA
0000000 0011244635627357
0000001G11--355724736246
0000010F2435--6211465773
0000100E465762--73243511
0001000D35241173--574662
0010000C6273462457--1135
0100000B736257354611--24
1000000A57467311623524--

For example, the errors E(x) = (0000001), (00001010), (0110000), (1000100) all give the same syndrome (11) as shown highlighted. It is clear that this new matrix cannot be used to correct two-bit errors..

A Better Choice of Rows

Examining the row table for only the rows i=1 and 1=2 shows that by definition each element in row 2 is the square of the corresponding element in row 1. For example, the first element in row 1 is a6 and the first element in row 2 is a12. Similarly, it can be confirmed for these same terms that (7) = (5)2. (Confirm)

i a6i a5i a4i a3i a2i a1i a0i
1 5 7 6 3 4 2 1
2 7 3 2 5 6 4 1

It happens that in this extended field, squaring a term is a linear operation (in other words, x2 + y2 = (x+y)2 ). For example:

From row (i=1), Let x=5 and y=3   
From row (i=2),  7=52 and 5=32, also 2=62 for reference. 

Let LHS = x2 + y2 
        = 52 + 32 
        = 7 + 5 
        = 2

Let RHS = (x+y)2 
        = (5+3)2 
        = (6)2 
        =  2 

LHS=RHS  

This demonstrates that x2 + y2 = (x+y)2.

Now consider a 6-bit syndrome AB in octal from Table 3. Let E(x)= xN + xM be a 2-bit error leading to AB. The digit A comes from the sum of terms M and N in row (i=1) and the digit B comes from the sum of terms M and N in row (i=2). It is expected that AB will be unique for all (M,N).

From the squaring property, the sum of two elements in row 2 is the square of the sum of the same two elements in row 1.

In other words, for every 2-digit syndrome AB in Table 3, the digit B is simply the square of the digit A. Moreover, as each element in row 2 is the square of the corresponding element in row 1, all the syndromes in Table 3 are found by knowing A and by reading by column (AB) is one of (11,24,35,46,57,62,73). Nothing is contributed by the presence of the second digit.

By extension, no other even-numbered row will contribute any additional information necessary to give a unique syndrome in a new check matrix. The rows i=(2,4) and i=(3,6) and i=(6,5) describe other pairs of rows related by the square that do not contribute to improving the Hamming distance of the new codewords.

Recall that the choice of rows i=(1,3) was successful in the first example. These rows are not related by the square. The above discussion shows that starting with row (i=1) the rows (i=2,4,5,6) can be eliminated and the only choice for a useful second row is (i=3).


Three-Bit Error Correction (D=7)

The above examples show how more powerful codes can be derived by systematically assembling check matrices.

The algorithm for building an error control code with a desired Hamming distance (D) using this matrix approach shall be:

  1. Start with (i=1);
  2. Choose as many rows as necessary (i=2,3,4...D-1) to achieve the distance (D).
  3. Discard the rows that are redundant, and;
  4. Build the check matrix.

Examples with P(x) = x3 + x + 1

For (D=3) start with (i=1), choose additional row (i=2) then keep only row (i=1).

For (D=5) start with (i=1), choose additional rows (i=2,3,4) then keep only rows (i=1,3) as the others are redundant.

For (D=7) start with (i=1), choose additional rows (i=2,3,4,5,6) then keep only rows (i=1,3) as the others are redundant.

It is interesting to note that D=7 was already achieved in the first example on this page where the two rows (i=1,3) gave a (7,1) code describing the trivial case of two codewords (0000000) and (1111111) which actually have distance (D=7). Use of a higher degree polynomial P(x) will lead to more realistic results.


Conclusions

A BCH Code has been constructed which can correct two-bit errors by combining two single-bit error correcting polynomials. BCH codes can be created based on a target Hamming distance for the resulting codewords.

In general, BCH codes are created from a generating polynomials from which the generating and check matrices can be derived as required. The above procedure is formalized and a BCH generating polynomial is developed in Online BCH Code Generator.