Facebook Pixel

72. Edit Distance

Problem Description

This is the classic Edit Distance problem, also known as the Levenshtein Distance problem.

You are given two strings word1 and word2. Your task is to find the minimum number of operations needed to transform word1 into word2.

You can perform three types of operations on word1:

  • Insert a character at any position
  • Delete a character from any position
  • Replace a character with another character

The problem uses dynamic programming to find the optimal solution. We create a 2D table f[i][j] where each cell represents the minimum operations needed to convert the first i characters of word1 to the first j characters of word2.

The base cases are:

  • f[0][j] = j: Converting an empty string to the first j characters of word2 requires j insertions
  • f[i][0] = i: Converting the first i characters of word1 to an empty string requires i deletions

For each cell f[i][j], we check if the current characters match:

  • If word1[i-1] == word2[j-1], no operation is needed for these characters, so f[i][j] = f[i-1][j-1]
  • Otherwise, we take the minimum of three possible operations and add 1:
    • f[i-1][j] + 1: Delete from word1
    • f[i][j-1] + 1: Insert into word1
    • f[i-1][j-1] + 1: Replace in word1

The final answer is stored in f[m][n], where m and n are the lengths of word1 and word2 respectively.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is to think about this problem recursively and then optimize it using dynamic programming.

Imagine you're comparing two strings character by character from left to right. At each position, you face a decision: what's the best way to handle the current characters?

Let's say we're looking at position i in word1 and position j in word2. If the characters match (word1[i] == word2[j]), that's great! We don't need to do anything for these characters, and we can focus on the remaining parts of both strings.

But what if they don't match? We have three choices:

  1. Delete the character from word1 - this means we skip this character in word1 and try to match the rest of word1 with word2
  2. Insert a character into word1 - we add the matching character from word2, which moves us forward in word2 but stays at the same position in word1
  3. Replace the character in word1 - we change it to match word2[j], allowing us to move forward in both strings

The brilliant part is recognizing that we can build up the solution incrementally. Instead of solving the entire problem at once, we solve smaller subproblems first. For example, what's the edit distance between empty strings? Between one character and empty string? Between two single characters?

This leads us to dynamic programming: we create a table where each cell f[i][j] represents a subproblem - converting the first i characters of word1 to the first j characters of word2. We fill this table systematically, using previously computed values to calculate new ones.

The recurrence relation captures our decision-making process:

  • If characters match: f[i][j] = f[i-1][j-1] (no operation needed)
  • If they don't match: f[i][j] = 1 + min(f[i-1][j], f[i][j-1], f[i-1][j-1]) (choose the cheapest operation)

By building up from smaller subproblems to larger ones, we avoid redundant calculations and efficiently find the minimum edit distance.

Learn more about Dynamic Programming patterns.

Solution Approach

We implement the solution using a 2D dynamic programming table f where f[i][j] represents the minimum number of operations to convert the first i characters of word1 to the first j characters of word2.

Step 1: Initialize the DP table

Create a (m+1) × (n+1) table where m = len(word1) and n = len(word2). The extra row and column handle the base cases of empty strings.

f = [[0] * (n + 1) for _ in range(m + 1)]

Step 2: Set up base cases

  • First row f[0][j] = j: Converting empty string to first j characters of word2 requires j insertions
  • First column f[i][0] = i: Converting first i characters of word1 to empty string requires i deletions
for j in range(1, n + 1):
    f[0][j] = j

Step 3: Fill the DP table

For each cell f[i][j], we apply the state transition equation:

f[i][j]={f[i1][j1],if word1[i1]=word2[j1]min(f[i1][j],f[i][j1],f[i1][j1])+1,otherwisef[i][j] = \begin{cases} f[i - 1][j - 1], & \text{if } word1[i - 1] = word2[j - 1] \\ \min(f[i - 1][j], f[i][j - 1], f[i - 1][j - 1]) + 1, & \text{otherwise} \end{cases}

The implementation uses enumerate(word1, 1) to iterate through characters while maintaining 1-based indexing for the DP table:

for i, a in enumerate(word1, 1):
    f[i][0] = i  # Set first column value
    for j, b in enumerate(word2, 1):
        if a == b:
            f[i][j] = f[i - 1][j - 1]  # Characters match, no operation needed
        else:
            # Take minimum of three operations:
            # f[i-1][j]: delete from word1
            # f[i][j-1]: insert into word1
            # f[i-1][j-1]: replace in word1
            f[i][j] = min(f[i - 1][j], f[i][j - 1], f[i - 1][j - 1]) + 1

Step 4: Return the result

The answer is in f[m][n], representing the minimum operations to convert entire word1 to entire word2.

Time Complexity: O(m × n) where m and n are the lengths of the two strings, as we fill each cell in the DP table once.

Space Complexity: O(m × n) for storing the DP table.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through converting word1 = "cat" to word2 = "cut" step by step.

Step 1: Initialize the DP table

We create a 4×4 table (since both strings have length 3, we need an extra row and column for empty string cases):

    ""  c  u  t
""  0   1  2  3
c   1   ?  ?  ?
a   2   ?  ?  ?
t   3   ?  ?  ?

The first row represents converting an empty string to "c", "cu", "cut" (requires 1, 2, 3 insertions respectively). The first column represents converting "c", "ca", "cat" to an empty string (requires 1, 2, 3 deletions respectively).

Step 2: Fill the table row by row

Row 1 (i=1, character 'c' from word1):

  • f[1][1]: Compare 'c' with 'c' → They match! So f[1][1] = f[0][0] = 0
  • f[1][2]: Compare 'c' with 'u' → They don't match. Take min of:
    • Delete 'c': f[0][2] + 1 = 2 + 1 = 3
    • Insert 'u': f[1][1] + 1 = 0 + 1 = 1
    • Replace 'c' with 'u': f[0][1] + 1 = 1 + 1 = 2
    • Minimum is 1, so f[1][2] = 1
  • f[1][3]: Compare 'c' with 't' → They don't match. Take min of:
    • Delete 'c': f[0][3] + 1 = 3 + 1 = 4
    • Insert 't': f[1][2] + 1 = 1 + 1 = 2
    • Replace 'c' with 't': f[0][2] + 1 = 2 + 1 = 3
    • Minimum is 2, so f[1][3] = 2
    ""  c  u  t
""  0   1  2  3
c   1   0  1  2
a   2   ?  ?  ?
t   3   ?  ?  ?

Row 2 (i=2, character 'a' from word1):

  • f[2][1]: Compare 'a' with 'c' → They don't match. Min(1+1, 0+1, 0+1) = 1
  • f[2][2]: Compare 'a' with 'u' → They don't match. Min(0+1, 1+1, 0+1) = 1
  • f[2][3]: Compare 'a' with 't' → They don't match. Min(1+1, 1+1, 1+1) = 2
    ""  c  u  t
""  0   1  2  3
c   1   0  1  2
a   2   1  1  2
t   3   ?  ?  ?

Row 3 (i=3, character 't' from word1):

  • f[3][1]: Compare 't' with 'c' → They don't match. Min(2+1, 1+1, 1+1) = 2
  • f[3][2]: Compare 't' with 'u' → They don't match. Min(1+1, 2+1, 1+1) = 2
  • f[3][3]: Compare 't' with 't' → They match! So f[3][3] = f[2][2] = 1

Final table:

    ""  c  u  t
""  0   1  2  3
c   1   0  1  2
a   2   1  1  2
t   3   2  2  1

Result: f[3][3] = 1, meaning we need exactly 1 operation to convert "cat" to "cut".

The optimal operation is to replace 'a' with 'u', transforming "cat" → "cut" in one step.

Solution Implementation

1class Solution:
2    def minDistance(self, word1: str, word2: str) -> int:
3        """
4        Calculate the minimum edit distance (Levenshtein distance) between two strings.
5      
6        Args:
7            word1: The first string
8            word2: The second string
9          
10        Returns:
11            The minimum number of operations (insert, delete, replace) to transform word1 to word2
12        """
13        m, n = len(word1), len(word2)
14      
15        # Create a 2D DP table where dp[i][j] represents the edit distance
16        # between word1[0:i] and word2[0:j]
17        dp = [[0] * (n + 1) for _ in range(m + 1)]
18      
19        # Initialize first row: transforming empty string to word2[0:j]
20        # requires j insertions
21        for j in range(1, n + 1):
22            dp[0][j] = j
23      
24        # Fill the DP table
25        for i, char1 in enumerate(word1, 1):
26            # Initialize first column: transforming word1[0:i] to empty string
27            # requires i deletions
28            dp[i][0] = i
29          
30            for j, char2 in enumerate(word2, 1):
31                if char1 == char2:
32                    # Characters match, no operation needed
33                    dp[i][j] = dp[i - 1][j - 1]
34                else:
35                    # Characters don't match, take minimum of:
36                    # dp[i-1][j] + 1: delete from word1
37                    # dp[i][j-1] + 1: insert into word1
38                    # dp[i-1][j-1] + 1: replace in word1
39                    dp[i][j] = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]) + 1
40      
41        return dp[m][n]
42
1class Solution {
2    /**
3     * Calculate the minimum edit distance between two strings using dynamic programming.
4     * Edit operations allowed: insert, delete, replace.
5     * 
6     * @param word1 The first string
7     * @param word2 The second string
8     * @return The minimum number of operations to convert word1 to word2
9     */
10    public int minDistance(String word1, String word2) {
11        int word1Length = word1.length();
12        int word2Length = word2.length();
13      
14        // dp[i][j] represents the minimum edit distance between 
15        // the first i characters of word1 and the first j characters of word2
16        int[][] dp = new int[word1Length + 1][word2Length + 1];
17      
18        // Initialize base case: converting empty string to word2[0...j-1]
19        // requires j insertions
20        for (int j = 1; j <= word2Length; j++) {
21            dp[0][j] = j;
22        }
23      
24        // Process each character of word1
25        for (int i = 1; i <= word1Length; i++) {
26            // Initialize base case: converting word1[0...i-1] to empty string
27            // requires i deletions
28            dp[i][0] = i;
29          
30            // Process each character of word2
31            for (int j = 1; j <= word2Length; j++) {
32                // If characters match, no operation needed
33                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
34                    dp[i][j] = dp[i - 1][j - 1];
35                } else {
36                    // Characters don't match, choose minimum of three operations:
37                    // 1. dp[i - 1][j] + 1: delete from word1
38                    // 2. dp[i][j - 1] + 1: insert into word1
39                    // 3. dp[i - 1][j - 1] + 1: replace character in word1
40                    dp[i][j] = Math.min(
41                        dp[i - 1][j],                              // delete
42                        Math.min(dp[i][j - 1], dp[i - 1][j - 1])   // insert or replace
43                    ) + 1;
44                }
45            }
46        }
47      
48        // Return the minimum edit distance between complete strings
49        return dp[word1Length][word2Length];
50    }
51}
52
1class Solution {
2public:
3    int minDistance(string word1, string word2) {
4        int m = word1.size();
5        int n = word2.size();
6      
7        // dp[i][j] represents the minimum edit distance between 
8        // the first i characters of word1 and the first j characters of word2
9        vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
10      
11        // Initialize base case: converting empty string to word2[0...j-1]
12        // requires j insertions
13        for (int j = 0; j <= n; ++j) {
14            dp[0][j] = j;
15        }
16      
17        // Fill the dp table
18        for (int i = 1; i <= m; ++i) {
19            // Base case: converting word1[0...i-1] to empty string
20            // requires i deletions
21            dp[i][0] = i;
22          
23            for (int j = 1; j <= n; ++j) {
24                // If characters match, no operation needed
25                if (word1[i - 1] == word2[j - 1]) {
26                    dp[i][j] = dp[i - 1][j - 1];
27                } 
28                // Otherwise, take minimum of three operations:
29                // 1. dp[i - 1][j] + 1: delete from word1
30                // 2. dp[i][j - 1] + 1: insert into word1
31                // 3. dp[i - 1][j - 1] + 1: replace character in word1
32                else {
33                    dp[i][j] = min({dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]}) + 1;
34                }
35            }
36        }
37      
38        // Return the edit distance between word1 and word2
39        return dp[m][n];
40    }
41};
42
1/**
2 * Calculates the minimum edit distance (Levenshtein distance) between two strings.
3 * The edit distance is the minimum number of operations (insert, delete, replace)
4 * needed to transform word1 into word2.
5 * 
6 * @param word1 - The source string to transform
7 * @param word2 - The target string to transform into
8 * @returns The minimum number of edit operations required
9 */
10function minDistance(word1: string, word2: string): number {
11    const sourceLength: number = word1.length;
12    const targetLength: number = word2.length;
13  
14    // Create a 2D DP table where dp[i][j] represents the minimum edit distance
15    // between the first i characters of word1 and the first j characters of word2
16    const dp: number[][] = Array(sourceLength + 1)
17        .fill(0)
18        .map(() => Array(targetLength + 1).fill(0));
19  
20    // Initialize base case: transforming empty string to word2[0...j-1]
21    // requires j insertions
22    for (let j = 1; j <= targetLength; j++) {
23        dp[0][j] = j;
24    }
25  
26    // Fill the DP table
27    for (let i = 1; i <= sourceLength; i++) {
28        // Base case: transforming word1[0...i-1] to empty string
29        // requires i deletions
30        dp[i][0] = i;
31      
32        for (let j = 1; j <= targetLength; j++) {
33            // If characters match, no operation needed
34            if (word1[i - 1] === word2[j - 1]) {
35                dp[i][j] = dp[i - 1][j - 1];
36            } else {
37                // Take minimum of three operations:
38                // 1. dp[i - 1][j] + 1: Delete from word1
39                // 2. dp[i][j - 1] + 1: Insert into word1
40                // 3. dp[i - 1][j - 1] + 1: Replace character in word1
41                dp[i][j] = Math.min(
42                    dp[i - 1][j],      // Delete
43                    dp[i][j - 1],      // Insert
44                    dp[i - 1][j - 1]   // Replace
45                ) + 1;
46            }
47        }
48    }
49  
50    // Return the minimum edit distance between the complete strings
51    return dp[sourceLength][targetLength];
52}
53

Time and Space Complexity

The time complexity is O(m × n), where m is the length of word1 and n is the length of word2. This is because the algorithm uses a nested loop structure - the outer loop iterates through all characters of word1 (m iterations), and for each character, the inner loop iterates through all characters of word2 (n iterations). Each cell in the dynamic programming table is computed exactly once with constant time operations.

The space complexity is O(m × n). The algorithm creates a 2D array f with dimensions (m + 1) × (n + 1) to store the intermediate results for the dynamic programming solution. This table stores the minimum edit distance for all possible substrings of word1 and word2, requiring space proportional to the product of their lengths.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Index Confusion Between String and DP Table

One of the most common mistakes is confusing the indices when accessing characters from the strings versus the DP table. The DP table uses 1-based indexing (where dp[i][j] represents the first i characters of word1 and first j characters of word2), while string access uses 0-based indexing.

Incorrect Implementation:

for i in range(1, m + 1):
    for j in range(1, n + 1):
        if word1[i] == word2[j]:  # Wrong! This causes IndexError
            dp[i][j] = dp[i-1][j-1]

Correct Implementation:

for i in range(1, m + 1):
    for j in range(1, n + 1):
        if word1[i-1] == word2[j-1]:  # Correct: use i-1 and j-1 for string access
            dp[i][j] = dp[i-1][j-1]

2. Forgetting to Initialize Base Cases Properly

Another pitfall is incomplete initialization of the DP table's first row and column. Some implementations forget to set the first column values inside the main loop or miss initializing them entirely.

Incorrect Implementation:

dp = [[0] * (n + 1) for _ in range(m + 1)]
# Forgot to initialize first row and column!
for i in range(1, m + 1):
    for j in range(1, n + 1):
        # ... rest of the logic

Correct Implementation:

dp = [[0] * (n + 1) for _ in range(m + 1)]

# Initialize first row
for j in range(1, n + 1):
    dp[0][j] = j

# Initialize first column (can be done in main loop or separately)
for i in range(1, m + 1):
    dp[i][0] = i
    for j in range(1, n + 1):
        # ... rest of the logic

3. Space Optimization Attempt Gone Wrong

When trying to optimize space complexity from O(m×n) to O(min(m,n)) using a 1D or 2-row approach, it's easy to overwrite values that are still needed.

Problematic Space-Optimized Version:

def minDistance(self, word1: str, word2: str) -> int:
    m, n = len(word1), len(word2)
    dp = [j for j in range(n + 1)]  # Single row
  
    for i in range(1, m + 1):
        dp[0] = i  # Wrong! This overwrites the value we need for dp[1]
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[j] = dp[j-1]  # Wrong! dp[j-1] was already updated
            else:
                dp[j] = min(dp[j], dp[j-1], dp[j-1]) + 1

Correct Space-Optimized Version:

def minDistance(self, word1: str, word2: str) -> int:
    m, n = len(word1), len(word2)
    # Make word1 the shorter string for better space efficiency
    if m > n:
        word1, word2, m, n = word2, word1, n, m
  
    prev = list(range(n + 1))
  
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                curr[j] = prev[j-1]
            else:
                curr[j] = min(prev[j], curr[j-1], prev[j-1]) + 1
        prev = curr
  
    return prev[n]

4. Not Handling Edge Cases

Failing to consider empty strings or identical strings can lead to unnecessary computation or errors.

Better Implementation with Early Returns:

def minDistance(self, word1: str, word2: str) -> int:
    # Handle edge cases
    if word1 == word2:
        return 0
    if not word1:
        return len(word2)
    if not word2:
        return len(word1)
  
    # Continue with standard DP solution...
Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

A person thinks of a number between 1 and 1000. You may ask any number questions to them, provided that the question can be answered with either "yes" or "no".

What is the minimum number of questions you needed to ask so that you are guaranteed to know the number that the person is thinking?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More