Hamming Codes and C Implementation Using the Octave GNU Tool
1. Abbreviation
2. Introduction
It is necessary often to ensure reliable data transmission or storage in Digital Signal Processing applications. Data will be understood messages (code words) containing zeros (0) and ones (1), that is binary data. Additional parity bits create differences between the data (code words). To estimate the difference between the codes, Hamming introduced a code distance. For example, there are two messages:
These two code words differ in four positions (see italics). The number of differences between binary messages is called the Hamming code distance. A theorem is known that is used to create self-correcting codes. If there is a set of code words with a Hamming distance: d = 2t +1. Then this code can correct t-errors that occur during data transmission or storage. The proof of this theorem can be seen in [2]. For example, Hamming codes have a code distance of 3, which a single error is corrected, and extended Hamming codes have a code distance of 4, which a single error is corrected and a double error is detected.
Addition and subtraction operations with XOR (addition modulo 2) are introduced between code words:
as well as multiplication and division with multiplication modulo 2 operation:
Thus a finite field or Galois field of binary codes of length n is obtained, which is denoted by:
and uses Modular Arithmetic modulo 2.
Next we will use the following notation for the code: (n,k,d), where
The parity bits are set using the equations:
Codes (13) are called linear codes. Hamming codes are linear. It is convenient to describe linear codes in matrix form. In this case a generator matrix G with dimension (k) x (n) and a parity-check matrix H with dimension (n-k) x (n) or (m) x (n) are set.
The generator matrix generates code words from information codes:
The syndrome is calculated using the parity-check matrix. If the code word x does not contain errors, then the syndrome must have a null row:
It is easily proved (see the [2]):
where the null matrix has dimension (k) x (m) or (k) x (n-k).
Let the parity-check matrix H be given in a standard/systematic form:
It can be proved (see the [2]) that in this case the generator matrix in the standard/systematic form will take the form:
Now back to the Hamming code. The number of parity bits m = (n – k) is selected according to the inequality:
In this case there are enough syndrome values to identify errors in a code word of length n, while a zero syndrome means that there is no error in the code word. At the same time not only the information bits are corrected, but also the check parity bits. If inequality (20) turns into equality, then such a code is called perfect code. Perfect codes also have an important geometric meaning. See the [1], [2]
For example:
Hamming Code (7,4,3) is perfect code: n = 7, k = 4, m = n – k = 7 – 4 = 3, since (20) goes into equality:
Hamming Code (15,11,3) is perfect code: n = 15, k = 11, m = n – k = 15 – 11 = 4, since (20) goes into equality:
Hamming Code (21,16,3) is NOT perfect code: n = 21, k = 16, m = n – k = 21 – 16 = 5, since (20) does not go into equality:
Extended Hamming Code (8,4,4) is NOT perfect code: n = 8, k = 4, m = n – k = 8 – 4 = 4, since (20) does not go into equality:
Extended Hamming Code (16,11,4) is NOT perfect code: n = 16, k = 11, m = n – k = 16 – 11 = 5, since (20) does not go into equality:
Now let’s build the Hamming code (7,4,3) with the parity-check matrix:
In this case the columns of this matrix form seven syndromes that correspond to errors in one of the seven bits of the code word. Rearrange the columns of the matrix (21) to obtain a standard/systematic form (18) for the parity-check matrix:
Then the generator matrix of (19):
At the same time if an error occurs:
which corresponds to the columns of the parity-check matrix (22).
It is possible to construct H and G matrices for this code (7,4,3) so that the syndrome corresponds to the bit number of the message. To do this, take the parity-check matrix H in the form (21) and calculate the syndrome using the formula:
This corresponds to the equations:
Solving these equations with respect to the parity bits x5, x6, x7, using binary arithmetic modulo 2:
Thus we get a slightly different version of the Hamming code (7,4,3), while the matrix H changes and G remains unchanged:
At the same time if an error occurs:
which corresponds to the bit number of the message x, as well as the columns of the parity-check matrix (28). This representation makes it possible to simplify the implementation of the software.
Notes:
-
At the same time the G matrix (29) remains the standard/systematic form, while the H matrix (28) does not
-
I prefer to use extended Hamming Code in my applications. In this case an additional parity-check bit is introduced for the entire message:
which a double error is detected. Extended Hamming codes is realized below: (8,4,4) and (16,11,4). See also the HammingCode.c/h/m
-
Here is an example of using (28), (29). Let’s take the information code as:
Then:
Let’s check the code word using the H matrix:
Now let’s make an error in 5th bits of the code word:
Then the syndrome:
which corresponds to a 5th bit error.
3. Extended Hamming Code (8,4,4)
Consider the extended Hamming code (8,4,4) with a parity-check matrix:
The first row of ones has been added to this matrix to check for the parity of the entire message (31) in order to identify double error in the message.
Checking the syndrome:
This corresponds to the following equations:
Solving these equations with respect to the parity bits x5, x6, x7, x8, using binary arithmetic modulo 2:
Then the generator matrix:
The code words are formed according to the formula:
At the same time:
See also the HammingCode.m/c/h
4. Extended Hamming Code (16,11,4)
Consider the extended Hamming code (16,11,4) with a parity-check matrix:
Checking the syndrome:
This corresponds to the following checks:
Let’s add up the second and third equations from (41):
the information bits restrictions are gotten, which is unacceptable and does not allow us to solve these equations with respect to parity bits:
Thus it is impossible to obtain a syndrome in the form of an erroneous bit number for the Hamming code (16, 11, 4) if the generator G matrix is presented in a standard/systematic form, as it was obtained for the Hamming code (8,4,4).
Note: The representation of the syndrome in the form of an erroneous bit number is easy to obtain if the parity bits are mixed with the information bits. This is exactly what the classic (original) Hamming code is done (see the [1]). An example of a classic Hamming code will be given below in the Conclusion section. Nevertheless I prefer to use a systematic representation of the matrix G in my Software. It means that the information bits occupy the first k positions, and m=(n-k) parity bits.
Now let’s use the standard/systematic representation (18), (19) and write the parity-check matrix H for the Hamming code (15,11,3):
Let’s rewrite the parity-check matrix in a systematic way changing the order of the columns:
Then the generator matrix G:
Now let’s move on to the extended Hamming code (16,11,4), adding a row of ones to the parity-check matrix to get the parity of the entire message to detect double errors:
A check is added:
The parity bits are calculated from the generator matrix (44):
Substituting (47) into (46):
Substitute the resulting new column into the generator matrix for extended Hamming Code (16,11,4):
Thus the parity-check (45) and the generator (49) matrices for the extended Hamming code (16,11,4) are obtained. See also the HammingCode.m/c/h
5. Conclusions for Extended Hamming Codes
-
Extended Hamming Code has a Hamming Distance of 4, which a single error is corrected in a code word and a double error is recognized without correcting it
-
Consider the values of the syndrome for extended Hamming code. Highest (first) syndrome bit shows an extended parity-check of the entire code word. If this bit is equal to 1, then this indicates that an odd number of errors occurred: 1, 3, 5, … If the syndrome is non-zero and the highest bit is zero, then an even number of errors occurred: 2, 4, 6, … So we can implement the verification and correction algorithm as follows. If the highest bit of the syndrome is 1, then we assume that a single error has occurred and using the low bits of the syndrome we correct the error in the code word. If the syndrome is not zero and the highest bit of the syndrome is zero, then a double error has occurred and, for example, we repeat receiving the message or set the fatal error flag, which indicates that it is impossible to correct the code word. See also the HammingCode.c/h
-
If there are 3, 5, 7, …, 2 i + 1 (odd) number of errors in the code word of the extended Hamming code, then the algorithm leads to wrong correction
-
It follows from the first three points that the use of Hamming codes is limited to a single error. If there is a high probability of double or more errors in the code word, then other codes should be used that have a sufficiently large Hamming distance and allow to correct the required number of errors
-
In this article Hamming codes were considered as linear codes. The theory of Hamming codes can also be built on the basis of cyclic codes. For more information about this, see [1], [2]. The cyclic codes I also described in the article: “Cyclic Redundancy Check (CRC): bitwise, lookup table, fast crc without lookup table, reversing crc. C Implementation Using the Octave GNU Tool”
-
As noted above I prefer to use Hamming codes in a standard/systematic way. In this case the first k bits are information bits, and the last m=(n-k) parity-check bits. The classical code proposed by Hamming used information and parity bits mixed together (see also the [1]). The parity bits were located at positions 2^i. Consider the classic Hamming code (15,11,3). The code has m = (n – k) = 15 – 11 = 4 parity bits: p1, p2, p3, p4. They will be located in positions:
The information bits u will occupy the remaining positions in x:
Next we will make a table for calculating the parity bits p:
To calculate the parity bits, the code word bits are included for which the position numbers are 1. See the (52). Then:
From (51) and (53) the generator G matrix is written :
The parity-check H matrix:
Now let’s move on to the classic extended Hamming code (16,11,4). Insert a new parity bit for the entire code word to recognize a double error:
Substituting (51) and (53) into (56):
Using (56) and (57), the G and H matrices are written for classic extended Hamming Code (16,11,4):
See also HammingCode.m
6. Octave GNU file HammingCode.m
The script generates and tests standard/systematic Hamming Codes: (7,4,3), (15,11,3), extended Hamming Codes (8,4,4), (16,11,4), as well as the classic Hamming Code: (15,11,3), classic extended Hamming Code (16,11,4)
7. C Language HammingCode.c/h
Supports standard/systematic extended Hamming Codes: (8,4,4) and (16,11,4):
/*
Hamming Distance between code1 and code2
Input
uint16 code1
uint16 code2
Return
uint8 – Hamming Distance
*/
uint8 HammingDistance(uint16 code1, uint16 code2)
/*
Generator of the Extended Hamming Code (8,4,4)
x = [x1,…,x8] = u*G = [u1,u2,u3,u4]*G
Input
uint8 u – input nibble
Return
uint8 x – Hamming output byte
*/
uint8 GeneratorHamming_8_4_4(uint8 u_in)
/*
Syndrome of the Extended Hamming Code (8,4,4)
syndrome = [s1,s2,s3,s4] = x*HT = [x1,x2,…,x8]*HT
Input
uint8 x_in – input Hamming Code
Return
uint8 syndrome – Hamming syndrome [s1,s2,s3,s4]
*/
uint8 SyndromeHamming_8_4_4(uint8 x_in)
/*
Correction of the Extended Hamming Code (8,4,4)
Input
uint8* ptr_x_in – pointer on the input Hamming Code
Return
uint8 flag – TRUE: Hamming code in the pointer is in order (successful correction or it was ok)
FALSE: Hamming code has double error and cannot be corrected
*/
uint8 CorrectionHamming_8_4_4(uint8* ptr_x_in)
/*
Generator of the Extended Hamming Code (16,11,4)
x = [x1,x2,…,x16] = u*G = [u1,u2,…,u11]*G
Input
uint16 u – input nibble
Return
uint16 x – Hamming output byte
*/
uint16 GeneratorHamming_16_11_4(uint16 u_in)
/*
Syndrome of the Extended Hamming Code (16,11,4)
syndrome = [s1,s2,s3,s4,s5] = x*HT = [x1,x2,…,x16]*HT
Input
uint16 x_in – input Hamming Code
Return
uint8 syndrome – Hamming syndrome [s1,s2,s3,s4,s5]
*/
uint8 SyndromeHamming_16_11_4(uint16 x_in)
/*
Correction of the Extended Hamming Code (16,11,4)
Input
uint16* ptr_x_in – pointer on the input Hamming Code
Return
uint8 flag – TRUE: Hamming code in the pointer is in order (successful correction or it was ok)
FALSE: Hamming code has double error and cannot be corrected
*/
uint8 CorrectionHamming_16_11_4(uint16* ptr_x_in)