In common with many similar numbers used in credit cards, price tags, inventory control and membership numbers, the ISBN is specially constructed to incorporate error detection. A study of the method used reveals many fundamental features of a robust error detection scheme (i.e. one which reliably detects errors).
Error control codes are often designed for known applications where certain types of errors are expected to occur. In this case, the most common errors expected would be those which humans would typically make when writing or typing a book order. These human errors would normally be either "write a single digit incorrectly", or "switch two adjacent digits". Of course, the ability to detect errors also allows identification of invalid numbers which might be encountered (for example, if someone were to forge a credit card number). Forgery is not expected to be a real concern for book numbers, but error detection is important.
The final digit in the ISBN provides the desired error detection capability and is appended when the book number is issued. Once complete, the entire ten digit ISBN can be checked for errors whenever it is transcribed or transmitted.
The dashes often found in ISBN numbers serve only for presentation and may be omitted.
ISBN 0 1 3 1 3 9 1 3 9 0 sum 0 + 1 + 3 + 1 + 3 + 9 + 1 + 3 + 9 = 30
Using MATLAB, the modulo 10 operation is mod(x,10) and is equivalent to finding the remainder after division by 10 as rem(x,10). In the above example, mod(30,10) returns 0. The required checksum digit may be found as:
>> checksum = mod( sum([0 1 3 1 3 9 1 3 9 0]), 10 )
ISBN 0 1 3 1 3 9 1 3 9 9 multiplier x10 x9 x8 x7 x6 x5 x4 x3 x2 x1 -- -- -- -- -- -- -- -- -- -- sum 0 + 9 + 24 + 7 + 18 + 45 + 4 + 9 + 18 + 9 = 143
Note that the remainder of division by eleven could be any number from 0 to 10. If it happens that the value of the check digit must be 10, the letter 'X' is used instead to complete the ISBN.
When compared to the previous (modulo 10) example, the cost of having error control is still only one extra digit yet this method is more performant and can detect swapped digits as well as wrong digits. The objective of error control theory is to achieve the best performance at the least cost.
Using MATLAB, the above number could be checked as the weighted sum:
>> check = mod( sum([0 1 3 1 3 9 1 3 9 9].*[10 9 8 7 6 5 4 3 2 1]), 11 )
ISBN 0 1 3 1 3 9 1 3 9 1 multiplier x10 x9 x8 x7 x6 x5 x4 x3 x2 x1 -- -- -- -- -- -- -- -- -- -- sum 0 + 9 + 24 + 7 + 18 + 45 + 4 + 9 + 18 + 1 = 90 = 0 (mod 5)
ISBN 0 1 3 1 3 9 1 3 9 X multiplier x10 x9 x8 x7 x6 x5 x4 x3 x2 x1 -- -- -- -- -- -- -- -- -- -- sum 0 + 9 + 24 + 7 + 18 + 45 + 4 + 9 + 18 + 10 = 144 = 0 (mod 12)
In either case, the error would go undetected because 12 has factors (1,2,3,4,6) and any of these digit positions could contain a different digit which gives some other multiple of 12.
This problem be overcome by using a prime integer for the modulus. With no alternate method of obtaining a result which is an exact multiple of 11, the ISBN is guaranteed to detect all single-digit errors and all swapped-digit errors. Only the choice of a suitable prime modulus assures this result. Of note, the set {0,1,2,...,10} modulo 11 defines a finite field GF(11) under normal addition and multiplication operations.
Using MATLAB, the contribution of every digit to the total sum can shown in a table as:
>> table = mod( [1:10]' * [0:9], 11 )
For an error to be detected, the sum (modulo 11) must change when a digit changes. In this table, every row contains ten distinct values, so any change of a single digit at a given ISBN position would necessarily be detected.
DIGIT | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Contribution (x1) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Contribution (x2) | 0 | 2 | 4 | 6 | 8 | 10 | 1 | 3 | 5 | 7 |
Contribution (x3) | 0 | 3 | 6 | 9 | 1 | 4 | 7 | 10 | 2 | 5 |
Contribution (x4) | 0 | 4 | 8 | 1 | 5 | 9 | 2 | 6 | 10 | 3 |
Contribution (x5) | 0 | 5 | 10 | 4 | 9 | 3 | 8 | 2 | 7 | 1 |
Contribution (x6) | 0 | 6 | 1 | 7 | 2 | 8 | 3 | 9 | 4 | 10 |
Contribution (x7) | 0 | 7 | 3 | 10 | 6 | 2 | 9 | 5 | 1 | 8 |
Contribution (x8) | 0 | 8 | 5 | 2 | 10 | 7 | 4 | 1 | 9 | 6 |
Contribution (x9) | 0 | 9 | 7 | 5 | 3 | 1 | 10 | 8 | 6 | 4 |
Contribution (x10) | 0 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 |
Swapping two digits is a special case of two single digit errors. For an error to be detected, the sum (modulo 11) must change when two digits are swapped. Inspection of the above table shows that every column contains eleven distinct values, with the exception of the column for the digit 0. (Only the digit 0 would give the same contribution to the sum if it moves to a different ISBN position.)
Let the two digits be A and B let them be at position (multiplier) N and N+1 respectively.
The contribution of this (A,B) to the correct sum is C0 = AN + B(N+1). If the two are swapped, their contribution becomes C1 = BN + A(N+1). In other words when these digits are exchanged, the ISBN sum simply changes by C1 - C0 = (A - B), regardless of where in the number these two adjacent digits appear. Since an ISBN error will go undetected only if the sum changes by a multiple of 11 ,and since the difference (A-B) can never equal 11, then adjacent swapped digits will always be detected.
In this example, the sum 143 changes to 149 when the digits 3 & 9 (highlighted) are swapped. The error is detected.
ISBN 0 1 3 1 [3] [9] 1 3 9 9 multiplier x10 x9 x8 x7 x6 x5 x4 x3 x2 x1 -- -- -- -- -- -- -- -- -- -- old sum 0 + 9 + 24 + 7 + 18 + 45 + 4 + 9 + 18 + 9 = 143 new sum 0 + 9 + 24 + 7 + 54 + 15 + 4 + 9 + 18 + 9 = 149
Further analysis shows that for two digits separated by a distance M the difference will be M(A-B), which can never be a multiple of eleven for M less than eleven. (Adjacent digits are the special case of M=1.) Again, this works well because 11 is a prime number. Consequently, the exchange of any two digits in the ISBN would always be detected.
The properties of the ISBN can be explored further using the EE4253 Online ISBN Checking Tool.
The use of a prime modulus is illustrated by the ISBN coding method. This observation is important in error control coding, and it lays the groundwork for more sophisticated approaches involving long binary messages.
ISBN 0 0 7 0 0 7 0 1 3 X multiplier x10 x9 x8 x7 x6 x5 x4 x3 x2 x1 -- -- -- -- -- -- -- -- -- -- sum 0 + 0 + 56 + 0 + 0 + 35 + 0 + 3 + 6 + 10 = 110Recall that 'X' used when '10' is required in the final digit.
Since 110 ≡ 0 (mod 11), this ISBN is correct.
ISBN 1 5 5 5 1 2 0 1 0 ? multiplier x10 x9 x8 x7 x6 x5 x4 x3 x2 x1 -- -- -- -- -- -- -- -- -- -- sum 10 + 45 + 40 + 35 + 6 + 10 + 0 + 3 + 0 + ? = 149Since 149 is not a multiple of 11, and the next closest multiple is 14x11= 154, the final digit should be 5.
Sum = 154 ≡ 0 (mod 11), for the valid ISBN 1-55512-010-5