How to modify Levenshteins Edit Distance to count adjacent letter exchanges as 1 edit
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
If you want adjacent letter swaps to count as one edit, you are no longer using plain Levenshtein distance. You are moving into Damerau-Levenshtein territory, where insertion, deletion, substitution, and adjacent transposition are all allowed operations.
Why Plain Levenshtein Is Not Enough
Standard Levenshtein distance counts the minimum number of:
- insertions
- deletions
- substitutions
That means transforming ab into ba takes two edits under plain Levenshtein: one substitution for each position, or a delete-plus-insert path.
If your domain treats a typo like teh instead of the as one mistake, you need to add transposition as a fourth operation.
The Key Recurrence Change
The ordinary dynamic-programming recurrence already considers insertion, deletion, and substitution. To support an adjacent swap, add another case when the current two characters are reversed relative to the other string.
In words, when these conditions hold:
- '
i > 1' - '
j > 1' - '
a[i - 1] == b[j - 2]' - '
a[i - 2] == b[j - 1]'
then you can treat that as a transposition costing 1.
So the extra candidate becomes conceptually:
- distance at
i - 2, j - 2plus1
That is the core modification.
A Working Python Implementation
Here is a simple dynamic-programming version that counts adjacent transpositions as one edit:
Output:
That is the behavior you wanted.
What This Variant Actually Is
The implementation above is often called the optimal string alignment variant. It handles adjacent transpositions cleanly and is enough for many typo-correction tasks.
It is a good fit when:
- transpositions are local typing mistakes
- you want a simple dynamic-programming extension
- you do not need the most general transposition behavior across repeated character interactions
If you need the full unrestricted Damerau-Levenshtein distance, the implementation becomes slightly more involved because it tracks more history.
Comparing the Behavior
A quick comparison helps show the difference.
Under plain Levenshtein:
This returns 2, while the transposition-aware version returns 1.
That difference is exactly why the modification matters for spell checking and typo-tolerant matching.
Complexity
The time complexity remains O(m * n) because you still fill a dynamic-programming table over both strings. The space complexity is also O(m * n) in the straightforward version.
So the modification does not change the overall asymptotic shape. It just adds one extra recurrence case per cell.
Common Pitfalls
The biggest mistake is calling the modified algorithm "Levenshtein distance" without qualification. Once adjacent transpositions cost one edit, you are using a Damerau-Levenshtein-style metric.
Another issue is assuming all transposition-aware variants behave identically. The optimal string alignment version is simpler but has slightly different behavior from the full unrestricted Damerau-Levenshtein distance in some repeated-character cases.
People also forget to guard the transposition case with i > 1 and j > 1. Without that, the recurrence reads outside the table boundaries.
Finally, do not add transposition support unless it matches the domain. In some applications, an adjacent swap is not a meaningful single error.
Summary
- Plain Levenshtein does not treat adjacent swaps as one edit.
- To do that, add a transposition case to the dynamic-programming recurrence.
- The resulting algorithm is in the Damerau-Levenshtein family.
- A simple adjacent-transposition version still runs in
O(m * n)time. - This modification is especially useful for typo-tolerant matching such as spell checking.

