Hamming Distance Calculator
Enter two strings of equal length to count how many positions differ - that count is the Hamming distance. Works with binary sequences (like 10110 vs 11010), plain text (like KAROLIN vs KATHRIN), and hexadecimal values. The calculator also shows you the similarity percentage, which positions differ, and how many errors the distance could detect or correct.
Formula
d(s1, s2) = number of positions i (1 ≤ i ≤ n) where s1[i] ≠ s2[i]
Similarity = (n - d) / n × 100%
Errors detectable = d - 1; errors correctable = floor((d - 1) / 2)
Worked example
Compare 10110 and 11010. Length n = 5. Position 2: 0 vs 1 (differ). Position 3: 1 vs 0 (differ). Positions 1, 4, and 5 match. Hamming distance = 2. Similarity = (5 - 2) / 5 = 60%. If 2 were the minimum distance of a code, that code could detect up to 1 error and correct 0.
What is the Hamming distance?
The Hamming distance between two strings of equal length is the number of positions at which the corresponding characters or bits differ. The concept was introduced by Richard Hamming in his 1950 paper "Error Detecting and Error Correcting Codes," where he was looking for systematic ways for computers to catch and fix the errors that punch-card readers frequently introduced. Today the Hamming distance is fundamental to coding theory, telecommunications, cryptography, and bioinformatics, where it measures the number of nucleotide differences between two DNA sequences of the same length.
How to calculate Hamming distance step by step
Place the two strings on top of each other so that corresponding positions are aligned. Then scan from left to right, counting every position where the top and bottom characters are not the same. That count is the Hamming distance. For example, compare "KAROLIN" and "KATHRIN": K-K (match), A-A (match), R-T (differ), O-H (differ), L-R (differ), I-I (match), N-N (match). Three positions differ, so the Hamming distance is 3. For binary strings the comparison is bit by bit: 10110 vs 11010 gives differences at positions 2 and 3, so the distance is 2. The formal expression is d(s1, s2) = sum over all i of the indicator that s1[i] not equal to s2[i].
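The scan described above can be sketched in a few lines of Python. The function names `hamming_distance` and `differing_positions` are illustrative choices, not part of this calculator; positions are reported 1-based to match the examples in the text.

```python
def hamming_distance(s1: str, s2: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(a != b for a, b in zip(s1, s2))


def differing_positions(s1: str, s2: str) -> list[int]:
    """Return the 1-based positions at which the two strings differ."""
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires strings of equal length")
    return [i for i, (a, b) in enumerate(zip(s1, s2), start=1) if a != b]


print(hamming_distance("KAROLIN", "KATHRIN"))     # 3
print(differing_positions("KAROLIN", "KATHRIN"))  # [3, 4, 5]
print(hamming_distance("10110", "11010"))         # 2
print(differing_positions("10110", "11010"))      # [2, 3]
```

Raising an error on unequal lengths, rather than silently truncating as `zip` would, keeps the result consistent with the definition.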
Error detection and correction in digital codes
The Hamming distance matters most in coding theory through the concept of minimum distance: the smallest Hamming distance between any two distinct valid codewords in a code. A code with minimum distance d can detect up to d minus 1 errors in a received message, because flipping fewer than d bits cannot transform one valid codeword into another. The same code can correct up to floor((d minus 1) divided by 2) errors, because a received word with fewer than d/2 flipped bits is still closer to the original codeword than to any other. The classic Hamming(7,4) code uses 7-bit codewords encoding 4 data bits and has a minimum distance of 3, so it can detect any error of up to 2 bits or correct any 1-bit error, but not both at once: a decoder set up to correct single-bit errors will miscorrect a 2-bit error into the wrong codeword. RAID-style storage, QR codes, and satellite communications all rely on codes engineered for specific minimum distances.
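To make the idea of minimum distance concrete, the sketch below builds all 16 codewords of a Hamming(7,4) code and finds the smallest pairwise distance by brute force. The generator matrix `G` is one common systematic layout (4 data bits followed by 3 parity bits), chosen here as an assumption; other equivalent layouts exist, and the helper names are illustrative.

```python
from itertools import combinations, product

# Assumed layout: one common systematic generator matrix for Hamming(7,4).
# Each row is the codeword for a single data bit set to 1.
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]


def encode(data):
    """Multiply a 4-bit data tuple by G, with arithmetic modulo 2."""
    return tuple(sum(d * g for d, g in zip(data, column)) % 2 for column in zip(*G))


def hamming_distance(a, b):
    """Count the positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_distance(codewords):
    """Smallest Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


codewords = [encode(data) for data in product((0, 1), repeat=4)]
print(len(codewords))               # 16
print(minimum_distance(codewords))  # 3
```

Checking every pair is fine for 16 codewords; for larger linear codes the minimum weight of a nonzero codeword gives the same answer with less work.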
Similarity, binary XOR, and practical uses
Similarity is the complement of the distance: (length minus distance) divided by length, expressed as a percentage. Two identical strings have distance 0 and similarity 100%. For binary sequences only, a shortcut is to XOR the two numbers and count the 1-bits in the result; that population count (popcount) equals the Hamming distance. Many modern CPUs expose popcount as a hardware instruction (POPCNT on x86), making Hamming distance checks extremely fast in firmware and network hardware. Beyond error codes, the Hamming distance appears in spell-checking (find words that differ by exactly one letter), nearest-neighbour classification, DNA analysis (count single-nucleotide differences between aligned sequences), and comparing binary hash codes in machine learning.
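The XOR shortcut and the similarity formula might look like this in Python. This is a minimal sketch assuming non-negative integers; the function names are illustrative. It counts bits with `bin(...).count("1")` so it runs on any Python 3 version (Python 3.10 and later also offer `int.bit_count()`).

```python
def hamming_distance_bits(a: int, b: int) -> int:
    """Hamming distance of two non-negative integers via XOR and popcount."""
    if a < 0 or b < 0:
        raise ValueError("this sketch assumes non-negative integers")
    return bin(a ^ b).count("1")


def similarity_percent(length: int, distance: int) -> float:
    """Similarity = (length - distance) / length, as a percentage."""
    if length <= 0:
        raise ValueError("length must be positive")
    return 100 * (length - distance) / length


a = 0b10110
b = 0b11010
d = hamming_distance_bits(a, b)
print(d)                         # 2
print(similarity_percent(5, d))  # 60.0
```

The length is passed separately because an integer does not remember its leading zeros: 00101 and 101 are the same number but different 5-bit and 3-bit strings.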
Hamming distance and error-correction capacity
| Min. distance (d) | Errors detectable | Errors correctable | Example code |
|---|---|---|---|
| 1 | 0 | 0 | Uncoded data (no redundancy) |
| 2 | 1 | 0 | Single parity bit (detect only) |
| 3 | 2 | 1 | Hamming(7,4) code |
| 4 | 3 | 1 | Extended Hamming(8,4), used for SECDED |
| 5 | 4 | 2 | BCH(15,7) code |
| 7 | 6 | 3 | Binary Golay(23,12) code |
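The minimum Hamming distance between any two valid codewords in a code determines how many errors it can detect and correct. The two columns are alternatives rather than simultaneous guarantees: a decoder that corrects up to t errors can at the same time detect only up to d - 1 - t errors.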
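The detectable and correctable columns follow directly from the two formulas d - 1 and floor((d - 1) / 2). A short sketch (the function name `capacity` is an illustrative choice) reproduces them for the distances listed in the table:

```python
def capacity(d: int) -> tuple[int, int]:
    """Return (errors detectable, errors correctable) for minimum distance d."""
    if d < 1:
        raise ValueError("minimum distance must be at least 1")
    return d - 1, (d - 1) // 2


for d in (1, 2, 3, 4, 5, 7):
    detect, correct = capacity(d)
    print(f"d={d}: detect up to {detect}, correct up to {correct}")
```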
Frequently asked questions
Do both strings have to be the same length?
Yes. The Hamming distance is only defined for strings of equal length, because it compares corresponding positions. If your strings have different lengths you may need a different metric such as the Levenshtein (edit) distance, which accounts for insertions and deletions. This calculator will zero-pad binary and hex inputs on the left so short strings can still be compared, but for text you must pad the shorter string manually, typically with spaces.
What does a Hamming distance of 0 mean?
A distance of 0 means the two strings are identical at every position - they are exactly the same sequence of characters or bits.
How is Hamming distance different from Levenshtein distance?
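Hamming distance counts only substitutions: positions where one character differs from another. Levenshtein (edit) distance also counts insertions and deletions, so it can compare strings of different lengths. For two strings of equal length the Levenshtein distance is never larger than the Hamming distance, and it can be smaller when a shift explains the difference more cheaply: "abcd" and "bcda" have Hamming distance 4 but Levenshtein distance 2 (delete the leading "a", append an "a"). When substitutions are the cheapest explanation, as with KAROLIN and KATHRIN, both metrics give the same result. If lengths differ, Hamming distance is undefined while Levenshtein distance still works.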
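To see the two metrics side by side, here is a sketch that pairs the position-by-position count with a textbook dynamic-programming edit distance. The Levenshtein routine is a standard formulation written for illustration, not this calculator's code.

```python
def hamming_distance(s1: str, s2: str) -> int:
    """Count differing positions; only defined for equal lengths."""
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(a != b for a, b in zip(s1, s2))


def levenshtein_distance(s1: str, s2: str) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    previous = list(range(len(s2) + 1))  # distances from "" to prefixes of s2
    for i, a in enumerate(s1, start=1):
        current = [i]                    # distance from s1[:i] to ""
        for j, b in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,             # delete a from s1
                current[j - 1] + 1,          # insert b into s1
                previous[j - 1] + (a != b),  # substitute, or free match
            ))
        previous = current
    return previous[-1]


print(hamming_distance("KAROLIN", "KATHRIN"), levenshtein_distance("KAROLIN", "KATHRIN"))  # 3 3
print(hamming_distance("abcd", "bcda"), levenshtein_distance("abcd", "bcda"))              # 4 2
print(levenshtein_distance("kitten", "sitting"))                                           # 3
```

The second line shows a shifted string, where one deletion plus one insertion is cheaper than four substitutions.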
How do I calculate Hamming distance for numbers?
Convert both numbers to the same base (binary is most common for error-coding purposes), optionally pad the shorter one with leading zeros to make them the same length, then count the positions that differ. For two binary numbers this is equivalent to XOR-ing them and counting the 1-bits in the result.
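As a rough illustration of the conversion and padding steps, the sketch below turns two non-negative integers into binary strings of equal width and compares them position by position. The helper names are hypothetical, and the inputs are arbitrary examples.

```python
def to_padded_binary(a: int, b: int) -> tuple[str, str]:
    """Render both numbers in binary, left-padded with zeros to equal width."""
    if a < 0 or b < 0:
        raise ValueError("this sketch assumes non-negative integers")
    width = max(a.bit_length(), b.bit_length(), 1)
    return format(a, f"0{width}b"), format(b, f"0{width}b")


def hamming_distance_numbers(a: int, b: int) -> int:
    """Count differing bit positions after padding to a common width."""
    s1, s2 = to_padded_binary(a, b)
    return sum(x != y for x, y in zip(s1, s2))


print(to_padded_binary(22, 5))               # ('10110', '00101')
print(hamming_distance_numbers(22, 5))       # 3
print(hamming_distance_numbers(0x3A, 0x0F))  # 4
print(bin(22 ^ 5).count("1"))                # 3, same answer via XOR
```

Hexadecimal inputs can be handled the same way by parsing them first with `int(text, 16)`.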
What is the Hamming distance used for in DNA analysis?
In genetics, the Hamming distance counts the positions at which two aligned DNA sequences of the same length carry different nucleotides. Because each position is a nucleotide (A, T, C, or G), the Hamming distance is a direct measure of how many point substitutions separate the two sequences, which is useful for phylogenetic analysis and for tracking how viruses mutate over time. It does not account for insertions or deletions, which change the sequence length.
Can Hamming distance be used for spell-checking?
Yes, for words of the same length. A Hamming distance of 1 between two words means exactly one letter differs, which covers many common typos (substituting the wrong key). Because it requires equal lengths, it works best for checking whether a word matches a known dictionary word of the same length. General spell-checkers typically also use Levenshtein distance to catch insertions and deletions.
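A toy version of this kind of lookup could be written as follows. The word list and the function name `one_letter_neighbours` are made up for illustration; a real spell-checker would use a full dictionary and a faster index.

```python
def hamming_distance(s1: str, s2: str) -> int:
    """Count differing positions; only defined for equal lengths."""
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(a != b for a, b in zip(s1, s2))


def one_letter_neighbours(word: str, dictionary: list[str]) -> list[str]:
    """Dictionary words of the same length that differ in exactly one letter."""
    return [
        candidate
        for candidate in dictionary
        if len(candidate) == len(word) and hamming_distance(word, candidate) == 1
    ]


words = ["cat", "cot", "cut", "coat", "dog", "car"]  # hypothetical word list
print(one_letter_neighbours("cat", words))  # ['cot', 'cut', 'car']
```

Note that "coat" is skipped because its length differs; catching that kind of typo needs an edit distance that allows insertions.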