The Hamming Code with Matrices

The Hamming Code has been introduced as a binary block code allowing correction of single bit errors. The examples described a linear (7,4) code in which 7-bit codewords carry 4 data bits and 3 error control bits.

The approach will now be generalized and studied in matrix form, where it can be extended to more powerful codes.

The Hamming Code in Matrix Form

Consider the 7-bit Hamming code built on a message having four data bits (D) which are to be transmitted as a 7-bit codeword by adding three error control bits. This would be called a (7,4) code. The three bits to be added are three even Parity bits (P), where the parity of each is computed on different subsets of the message bits.

Let the message be the four element vector M = [ D7 D6 D5 D3 ]

By inspection, define a (4 × 7) generator matrix G in which the contents of each column correspond to the data bit enclosed by the seven regions of the Venn diagram.

             7  6  5  4  3  2  1  
          [  1  0  0  1  0  1  1  ] ⇐ D7 
      G = [  0  1  0  1  0  1  0  ] ⇐ D6
          [  0  0  1  1  0  0  1  ] ⇐ D5
          [  0  0  0  0  1  1  1  ] ⇐ D3

The product C = M × G, accomplishes the codeword generation given only the four input bits M, and where modulus 2 arithmetic is assumed.

      C = M × G = [ D7 D6 D5 D7+D6+D5 D3 D7+D6+D3 D7+D5+D3 ] = [ D7 D6 D5 P4 D3 P2 P1 ]

A received codeword can be checked by confirming the even parity of the bits found within each circle of the Venn diagran. Define a (3 × 7) matrix H called the check matrix in which each row reflects the composition of the three parity check bits.

            D7 D6 D5 P4 D3 P2 P1  
          [  1  1  1  1  0  0  0  ] ⇐ P4
      H = [  1  1  0  0  1  1  0  ] ⇐ P2
          [  1  0  1  0  1  0  1  ] ⇐ P1

Observe that the product P = HG^T returns a zero matrix (3 × 4), where G^T is the transpose of H.

And the matrix product S = HC^T equals a zero matrix (3 × 1) for any valid codeword C, as shown below.

          ---------- H ------------     --C^T--
          [  1  1  1  1  0  0  0  ]     [ D7 ]      [ D7 + D6 + D5 + P4 ]     [ 0 ]
    S =   [  1  1  0  0  1  1  0  ]  x  [ D6 ]   =  [ D7 + D6 + D3 + P2 ]  =  [ 0 ]
          [  1  0  1  0  1  0  1  ]     [ D5 ]      [ D7 + D5 + D3 + P1 ]     [ 0 ]
                                        [ P4 ]
                                        [ D3 ]
                                        [ P2 ]
                                        [ P1 ]

The zero result indicates no errors, since the modulus 2 addition effects a parity check on the bits defined in each row. The result (S) is called the syndrome. For example, given the valid codeword C = [ 1 0 1 0 0 1 0 ],

          ---------- H -----------     --C^T--
          [  1  1  1  1  0  0  0 ]     [ 1  ]      [ 1 + 0 + 1 + 0 + 0 + 0 + 0 ]     [ 0 ]
    S =   [  1  1  0  0  1  1  0 ]  x  [ 0  ]   =  [ 1 + 0 + 0 + 0 + 0 + 1 + 0 ]  =  [ 0 ]
          [  1  0  1  0  1  0  1 ]     [ 1  ]      [ 1 + 0 + 1 + 0 + 0 + 0 + 0 ]     [ 0 ]
                                       [ 0  ]
                                       [ 0  ]
                                       [ 1  ]
                                       [ 0  ]

The Arrival of Errors

The presence of error(s) can be modeled by an error vector such as E = [ 0 1 0 0 0 0 0 ] where the sum B = E + C describes the bit error(s) added to the codeword. For the E shown, the resulting word B would have a single bit error compared to the original codeword C. In polyomial form, the expression E(x)=x^N compactly describes a single bit error.

For example, given the codeword C = [ 1 0 1 0 0 1 0 ], then B = C + E = [ 1 1 1 0 0 1 0 ] where a single bit is made incorrect. Of course, when checking this word for errors it it is not known where the bad bit is located, or even if there are any bad bits.

To check for errors, simply compute the product HB^T, but first observe the details of this operation:

   S = HB^T = H(E + C)^T = H(E^T + C^T) =  HE^T + HC^T = HE^T

Since the term HC^T = 0 by definition.

In conclusion, the syndrome (S) depends only on the error for whatever valid codeword is transmitted.

It is sufficient to examine the error vector E to study the error control aspects of this coding approach.

The Detection of Errors

Single Bit Errors

Let E = [ 0 1 0 0 0 0 0 ], then:

          ---------- H -----------     --E^T--
          [  1  1  1  1  0  0  0 ]     [ 0  ]      [ 1 ]
    S =   [  1  1  0  0  1  1  0 ]  x  [ 1  ]   =  [ 1 ]
          [  1  0  1  0  1  0  1 ]     [ 0  ]      [ 0 ]
                                       [ 0  ]
                                       [ 0  ]
                                       [ 0  ]
                                       [ 0  ]

The non-zero syndrome indicates that an error has occurred. The single bit error syndrome matches the column location of the bad bit. Correction is trivial. Note that no two columns in H have the same value and any single bit error can be corrected in this way.

Two-Bit Errors

Let E = [ 0 1 0 1 0 0 0 ], then:

          ---------- H -----------     --E^T--
          [  1  1  1  1  0  0  0 ]     [ 0  ]      [ 1 + 1 ]     [ 0 ]
    S =   [  1  1  0  0  1  1  0 ]  x  [ 1  ]   =  [ 1 + 0 ]  =  [ 1 ]
          [  1  0  1  0  1  0  1 ]     [ 0  ]      [ 0 + 0 ]     [ 0 ]
                                       [ 1  ]
                                       [ 0  ]
                                       [ 0  ]
                                       [ 0  ]

The non-zero syndrome indicates that some error has occurred. The error is not correctable as other pairs of errors (or a single bit error) could lead to this same result. Observe that the resulting syndrome is the sum of the syndromes for each single bit error. Any two-bit error is guaranteed to be detected since no two columns in H would add to give zero (the columns would have to be identical).

Multi-Bit Errors

A non-zero syndrome indicates that an error has occurred. A multi-bit error is not correctable as other combinations of errors could lead to this same result. In every case, the resulting syndrome is the sum of the syndromes for each single bit error. Unfortunately, some of these sums will equal zero and the errors will go undetected.

A Systematic Code

Before moving to a more general method of error control using matrices, consider the Hamming code with the bits rearranged such that the parity bits appear separately from the data bits. To this end, columns 3 and 4 of the check matrix H are swapped as shown below. None of the above error control arguments is affected by this change.

Specifically, in the new check matrix H, the rightmost most columns form a (3 × 3) identity matrix.

            7  6  5  3  4  2  1 
          [ 1  1  1  0  1  0  0 ]
      H = [ 1  1  0  1  0  1  0 ]
          [ 1  0  1  1  0  0  1 ]

Similarly, in the new generating matrix G the leftmost columns form a (4 × 4) identity matrix; consequently, the input data bits emerge unchanged and grouped together (followed by the parity bits).

          [ 1  0  0  0  1  1  1 ]
      G = [ 0  1  0  0  1  1  0 ]
          [ 0  0  1  0  1  0  1 ]
          [ 0  0  0  1  0  1  1 ]

Reading across the final three bits in each row of this matrix (G) reveals the four 3-bit terms (7,6,5,3) from the leftmost columns of the check matrix (H); in this way, converting between G and H is readily accomplished.

Example: Let the four message bits be M = [ 1 0 1 0 ], then the codeword C = iG as:

                           [ 1  0  0  0  1  1  1 ]
      C = MG = [1 0 1 0] x [ 0  1  0  0  1  1  0 ]  = [1 0 1 0 0 1 0]
                           [ 0  0  1  0  1  0  1 ]
                           [ 0  0  0  1  0  1  1 ]

It can be seen that the above codewords contains the four input data bits followed by the computed parity bits. A generator matrix that incorporates an identiy matrix in this way ensures the arrangement of codeword bits separated from parity bits defining a systematic block code.

To confirm that this is a valid codeword, use the check matrix as HC^T to obtain a zero result.

Conclusions

The Hamming code concepts can be described in matrix form, where a generating matrix (G) creates valid codewords from information bits, and a check matrix (H) computes syndromes for error checking.

When a valid codeword is multiplied by the check matrix, the result (syndrome) is zero. Any non-zero syndrome indicates a bit error. It has been shown that this syndrome depends entirely on an error vector E(x). Each column in the check matrix is different, meaning that all single bit errors of the form E(x)=x^N in a codeword produce a unique syndrome and no two-bit errors can produce a zero syndrome. This implies that all single bit errors can be corrected or that all two-bit errors can be detected.

The matrix approach on this page follows from the description of the (7,4) Hamming code as overlapping parity circles. In practice, error control codes like the Hamming code and the associated matrices will be generated mathematically to have some specific targeted performance. From this perspective, the Hamming code will be redefined as a systematic cyclic code having distance D=3.

The associated matrix operations may be accomplished in MATLAB as seen in Hamming Code with MATLAB.