Hamming Distance for Binary Variables
Finite binary 0 and 1 sequence is sometimes called a word in coding theory. If two words have the same length, we can count the number of digits in positions where they have different digit. The length of different digits is called Hamming distance. If = number of variables with value 1 for the th objects and 0 for the th object and = number of variables with value 0 for the th objects and 1 for the th object, we have
Formula
Below is an interactive program to compute hamming distance. Try to experiment with your own input values.
Input coordinate values of ObjectA and ObjectB (the coordinate are binary, 0 or 1), then press "Get Hamming Distance" button. The program will also directly calculate when you type the input.
Example
Feature of Fruit 
Sphere shape 
Sweet 
Sour 
Crunchy 
Object =Apple 
Yes 
Yes 
Yes 
Yes 
Object =Banana 
No 
Yes 
No 
No 
The coordinate of Apple is (1,1,1,1) and coordinate of Banana is (0,1,0,0). Because each object is represented by 4 variables, we say that these objects has 4 dimensions.
Word 1 (Apple) 
1 
1 
1 
1 
Word 2 (Banana) 
0 
1 
0 
0 
Is different 
Yes 
no 
Yes 
Yes 
Since there are 3 digits different ( and ) , we say that the hamming distance is 3.
Other examples:
(1, 1, 1, 0, 0) and
(0, 1, 0, 0, 0) has Hamming distance of 2 because the first and third digit are different.
(1, 0, 0, 1, 1, 0) and
(0, 1, 0, 1, 0, 1) has Hamming distance of 4 because the first, second, fifth and sith digits are different)
(0, 1, 0) and (0, 1, 0) has zero hamming distance because no digit different between the two words.
Hamming distance is somewhat related to Simple matching distance . Hamming distance divided by the length of word (= total number of variables) will give simple matching distance.
Hamming Distance for Ordinal Data
In principle, Hamming distance measure the number of disagreement between two vectors. Hamming distance can also be used for ordinal variables to measure disorder of disordervector from a patternvector.
The algorithm to compute Hamming distance for ordinal variable involve operation � Putting back � unmatched digit into the correct place, after remove them in the first step:
 Take/Delete all unmatched digits on disorder vector that does not match to the corresponding digit in pattern vector
 Put the deleted digit one by one into the correct place.
Hamming distance is the count of operation � Putting back �. This algorithm also works for binary variable. Of course, the Hamming distance is equal to the number of unmatched digit.
Example :
Suppose we have two judges (A and B) who give rank of importance over 6 products. The ranking vector is given as follow A=[1, 2, 3, 4, 5, 6] and B = [2, 5, 3, 1, 4, 6]. We want to measure Hamming distance between A and B
We set rank vector A as patternvector and vector B as disordervector. Our goal is to count the number of steps of operation �Putting back� of digits to make disordervector into patternvector. Diagram below show the steps. Since we count four number of interchange, thus the Hamming distance between A and B is 4.
<
Content

Previous

Next
>
See also:
Ordinal Variable
,
Contents
,
Ulam Distance
Preferable reference for this tutorial is
Teknomo, Kardi (2015) Similarity Measurement. http:\people.revoledu.comkardi tutorialSimilarity