Introduction to BCH Codes

The Bose-Chaudhuri-Hocquenghem (BCH) codes are a widely used class of multi-bit error correcting codes. A simple BCH code is developed here which can achieve two-bit error correction on 7-bit codewords. An (n,k) BCH code with Hamming distance D=5 is generally labeled BCH(n,k,5).

Code Generation with Galois Field GF(2³)

A Galois field GF(2^m) can be used to construct generator and check matrices for single bit error correction based on primitive polynomials. The BCH codes essentially combine multiple polynomials into a single generating polynomial with predictable distance properties and the ability to correct multiple bit errors.

The non-zero elements of the Galois field GF(8) may be generated (not in order) as the set {a⁰,a¹,a²,...,a⁷} modulus a primitive polynomial P(x) where a is called a primitive root of the field. The table below shows the non-zero elements of a GF(8) defined as {aⁿ} with P(x) = x³ + x + 1. While a=2 is a specific root of this polynomial, a general expression in terms of aⁿ provides that if P(a)=0 then a³ + a + 1 = 0 and a³ = a + 1, such that successive terms of aⁿ defining all the elements of GF(8) can be derived as shown below.

GF(8) defined as {aⁿ} with P(x) = x³ + x + 1

a⁰

= 1

= 001

= 1

= a⁷

= a¹⁴

= a²¹

= a²⁸

...

a¹

= a

= 010

= 2

= a⁸

= a¹⁵

= a²²

= a²⁹

...

a²

= a²

= 100

= 4

= a⁹

= a¹⁶

= a²³

= a³⁰

...

a³

= a+1

= 011

= 3

= a¹⁰

= a¹⁷

= a²⁴

= a³¹

...

a⁴

= a²+a

= 110

= 6

= a¹¹

= a¹⁸

= a²⁵

= a³²

...

a⁵

= a²+a+1

= 111

= 7

= a¹²

= a¹⁹

= a²⁶

= a³³

...

a⁶

= a²+1

= 101

= 5

= a¹³

= a²⁰

= a²⁷

= a³⁴

...

From the table, it can be confirmed that a is a GF(8) root of P(x), since P(a) = a³ + a + 1 = 011 + 010 + 001 = 0.

Observe that consecutive powers of aⁿ repeat periodically and the value of aⁿ for any n can be found in this way. These terms can serve to develop an error correcting code.

The i-th power of each of the elements a^N can be described as a^Ni. From the values found above, these powers can be summarized as the octal values shown here:

i	a⁶ⁱ	a⁵ⁱ	a⁴ⁱ	a³ⁱ	a²ⁱ	a¹ⁱ	a⁰ⁱ
1	5	7	6	3	4	2	1
2	7	3	2	5	6	4	1
3	6	2	7	4	5	3	1
4	3	5	4	7	2	6	1
5	4	6	5	2	3	7	1
6	2	4	3	6	7	5	1
7	1	1	1	1	1	1	1
	A	B	C	D	E	F	G

Columns are labeled A..G for reference.

Rows in the table repeat periodically for i > 7. The row for i = 7 is included for completeness.

Error Control Check Matrix

Single-Bit Error Correction (D=3)

A (3 x 7) check matrix H for single-bit error correction is defined from the above table (with i=1) as shown below. In practice any of the rows (1 to 6) could be used.


    A B C D E F G 
H = 5 7 6 3 4 2 1 
 
    1 1 1 0 1 0 0 
H = 0 1 1 1 0 1 0 
    1 1 0 1 0 0 1

This check matrix describes a (7,4) code in which 7-bit codewords lead to 3-bit syndromes. The table below shows the unique 3-bit syndrome (in octal) for all possible single bit errors as E(x) = xⁿ.

Table 1	E(x)=0	E(x)=x⁰	E(x)=x¹	E(x)=x²	E(x)=x³	E(x)=x⁴	E(x)=x⁵	E(x)=x⁶
Error	0000000	0000001	0000010	0000100	0001000	0010000	0100000	1000000
LABEL		G	F	E	D	C	B	A
Syndrome	0	1	2	4	3	6	7	5

Single bit error correction is possible because every 3-bit column is unique. For example, the single bit error E(x) = x⁴ = (0010000) returns the syndrome 6 from column C and no other single bit error will return this same value.

Double-bit error correction is not possible because different two-bit errors do not give a unique result. The two bit error E(x) = x⁶ + x³ returns a non-zero syndrome (detection!); however, it is the same syndrome as a single bit error E(x) = x⁴ since the bits in column C alone and the sum of the bits in columns (A+D) both equal 6. Also, the two-bit error (B+G) is 7 + 1 = 6. Consequently, there is no way to distinguish between different two-bit patterns and therefore no way to correct two bad bits.

Overall, this matrix gives a minimum Hamming distance D=3. Using a different row (i<7) in the table simply changes the column ordering and would not change the Hamming distance, although the specific syndromes would be different for each single bit error.

Double-Bit Error Correction (D=5)

To achieve two bit error correction, D=5 is necessary. A code with this property can be constructed from the above (D=3) single bit error corrrecting matrix by extending the matrix (with another single bit error correcting matrix) to give a double bit error correcting check matrix.

The challenge is always to determine how to extend this matrix to assure that all one- and two-bit errors will give a unique syndrome. Note that in every row (i) the different column ordering gives a different syndrome for any specific error. The combination of two rows from the table is key to this next step.

Using the same GF(8) as the basis for this new matrix, another row may be incorporated as shown below. This additional row will add three rows to the binary matrix. This new generating matrix describes a (7,1) code. Choose i=3 for this new row.

The (6 x 7) check matrix H for double-bit error correction is defined with i=(1,3) shown below.


    A B C D E F G 
H = 5 7 6 3 4 2 1 
    6 2 7 4 5 3 1 
 
    1 1 1 0 1 0 0 
H = 0 1 1 1 0 1 0 
    1 1 0 1 0 0 1 
    1 0 1 1 1 0 0 
    1 1 1 0 0 1 0 
    0 0 1 0 1 1 1

Single bit error correction is still possible because every 6-bit column is unique. This result is assured from the original three rows and does not depend on the new rows. The single bit error E(x) = x⁴ now returns the 6-bit syndrome 67 (octal) from column C and no other single bit error will return this same value.

Double-bit error correction is now possible because different two-bit errors will give a unique result. The two bit error E(x) = x⁶ + x³ or (A,D) returns 34 + 56 = 62. Also the two-bit error (B,G) is now 72 + 11 = 63, another different result. Consequently, it is now possible to distinguish between different two-bit error patterns and therefore two bad bits can be corrected by identifying their unique syndrome. This syndrome is like a 'signature' for the error pattern. The table below shows the unique 6-bit syndrome (in octal) for all possible one and two bit errors.

Table 2		E(x)=0	E(x)=x⁰	E(x)=x¹	E(x)=x²	E(x)=x³	E(x)=x⁴	E(x)=x⁵	E(x)=x⁶
Error		0000000	0000001	0000010	0000100	0001000	0010000	0100000	1000000
			G	F	E	D	C	B	A
0000000		00	11	23	45	34	67	72	56
0000001	G	11	--	32	54	25	76	63	47
0000010	F	23	32	--	66	17	44	51	75
0000100	E	45	54	66	--	71	22	37	13
0001000	D	34	25	17	71	--	53	46	62
0010000	C	67	76	44	22	53	--	15	31
0100000	B	72	63	51	37	46	15	--	24
1000000	A	56	47	75	13	62	31	24	--

The check matrix confirms the performance of this code in identifying and correcting two bit errors. While this choice of two rows from the table lead to enhanced code performance, not all row pairs will be as successful, as shown below.

A Check Matrix that Does not Work

The (6 x 7) check matrix H for double-bit error correction is defined from the above table (with i=1 and i=2) as shown below. However, this choice of rows fails.


    A B C D E F G 
H = 5 7 6 3 4 2 1 
    7 3 2 5 6 4 1 
 
    1 1 1 0 1 0 0 
H = 0 1 1 1 0 1 0 
    1 1 0 1 0 0 1 
    1 0 0 1 1 1 0 
    1 1 1 0 1 0 0 
    1 1 0 1 0 0 1

At first glance, this new matrix seems as good as the previous example; however, in this case, many different errors will lead to the same syndrome. The table below shows the 6-bit syndrome (in octal) for all possible one and two bit errors.

Table 3		E(x)=0	E(x)=x⁰	E(x)=x¹	E(x)=x²	E(x)=x³	E(x)=x⁴	E(x)=x⁵	E(x)=x⁶
Error		0000000	0000001	0000010	0000100	0001000	0010000	0100000	1000000
			G	F	E	D	C	B	A
0000000		00	11	24	46	35	62	73	57
0000001	G	11	--	35	57	24	73	62	46
0000010	F	24	35	--	62	11	46	57	73
0000100	E	46	57	62	--	73	24	35	11
0001000	D	35	24	11	73	--	57	46	62
0010000	C	62	73	46	24	57	--	11	35
0100000	B	73	62	57	35	46	11	--	24
1000000	A	57	46	73	11	62	35	24	--

For example, the errors E(x) = (0000001), (00001010), (0110000), (1000100) all give the same syndrome (11) as shown highlighted. It is clear that this new matrix cannot be used to correct two-bit errors..

A Better Choice of Rows

Examining the row table for only the rows i=1 and 1=2 shows that by definition each element in row 2 is the square of the corresponding element in row 1. For example, the first element in row 1 is a⁶ and the first element in row 2 is a¹². Similarly, it can be confirmed for these same terms that (7) = (5)². (Confirm)

i	a⁶ⁱ	a⁵ⁱ	a⁴ⁱ	a³ⁱ	a²ⁱ	a¹ⁱ	a⁰ⁱ
1	5	7	6	3	4	2	1
2	7	3	2	5	6	4	1

It happens that in this extended field, squaring a term is a linear operation (in other words, x² + y² = (x+y)² ). For example:

From row (i=1), Let x=5 and y=3   
From row (i=2),  7=5² and 5=3², also 2=6² for reference. 

Let LHS = x² + y² 
        = 5² + 3² 
        = 7 + 5 
        = 2

Let RHS = (x+y)² 
        = (5+3)² 
        = (6)² 
        =  2 

LHS=RHS  This demonstrates that x² + y² = (x+y)².

Now consider a 6-bit syndrome AB in octal from Table 3. Let E(x)= x^N + x^M be a 2-bit error leading to AB. The digit A comes from the sum of terms M and N in row (i=1) and the digit B comes from the sum of terms M and N in row (i=2). It is expected that AB will be unique for all (M,N).

From the squaring property, the sum of two elements in row 2 is the square of the sum of the same two elements in row 1.

In other words, for every 2-digit syndrome AB in Table 3, the digit B is simply the square of the digit A. Moreover, as each element in row 2 is the square of the corresponding element in row 1, all the syndromes in Table 3 are found by knowing A and by reading by column (AB) is one of (11,24,35,46,57,62,73). Nothing is contributed by the presence of the second digit.

By extension, no other even-numbered row will contribute any additional information necessary to give a unique syndrome in a new check matrix. The rows i=(2,4) and i=(3,6) and i=(6,5) describe other pairs of rows related by the square that do not contribute to improving the Hamming distance of the new codewords.

Recall that the choice of rows i=(1,3) was successful in the first example. These rows are not related by the square. The above discussion shows that starting with row (i=1) the rows (i=2,4,5,6) can be eliminated and the only choice for a useful second row is (i=3).

Three-Bit Error Correction (D=7)

The above examples show how more powerful codes can be derived by systematically assembling check matrices.

The algorithm for building an error control code with a desired Hamming distance (D) using this matrix approach shall be:

Start with (i=1);
Choose as many rows as necessary (i=2,3,4...D-1) to achieve the distance (D).
Discard the rows that are redundant, and;
Build the check matrix.

Examples with P(x) = x³ + x + 1

For (D=3) start with (i=1), choose additional row (i=2) then keep only row (i=1).

For (D=5) start with (i=1), choose additional rows (i=2,3,4) then keep only rows (i=1,3) as the others are redundant.

For (D=7) start with (i=1), choose additional rows (i=2,3,4,5,6) then keep only rows (i=1,3) as the others are redundant.

It is interesting to note that D=7 was already achieved in the first example on this page where the two rows (i=1,3) gave a (7,1) code describing the trivial case of two codewords (0000000) and (1111111) which actually have distance (D=7). Use of a higher degree polynomial P(x) will lead to more realistic results.

Conclusions

A BCH Code has been constructed which can correct two-bit errors by combining two single-bit error correcting polynomials. BCH codes can be created based on a target Hamming distance for the resulting codewords.

In general, BCH codes are created from a generating polynomials from which the generating and check matrices can be derived as required. The above procedure is formalized and a BCH generating polynomial is developed in Online BCH Code Generator.