Facebook Pixel

583. Delete Operation for Two Strings

Problem Description

You are given two strings word1 and word2. Your task is to find the minimum number of steps required to make both strings identical.

In each step, you can delete exactly one character from either word1 or word2. You need to determine the minimum total number of character deletions needed so that the two strings become the same.

For example, if word1 = "sea" and word2 = "eat", you would need to:

  • Delete 's' from word1 to get "ea"
  • Delete 't' from word2 to get "ea"

This requires 2 steps total to make both strings equal to "ea".

The solution uses dynamic programming where f[i][j] represents the minimum number of deletions needed to make the first i characters of word1 equal to the first j characters of word2.

The base cases are:

  • f[i][0] = i: To make i characters of word1 equal to an empty string, delete all i characters
  • f[0][j] = j: To make an empty string equal to j characters of word2, delete all j characters

For the general case:

  • If word1[i-1] == word2[j-1], the characters match, so f[i][j] = f[i-1][j-1] (no deletion needed)
  • Otherwise, we either delete from word1 or word2, taking the minimum: f[i][j] = min(f[i-1][j], f[i][j-1]) + 1

The final answer is f[m][n] where m and n are the lengths of word1 and word2 respectively.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is recognizing that making two strings identical through deletions is equivalent to finding their longest common subsequence (LCS) and then deleting all characters that are not part of this LCS.

Think about it this way: when two strings become identical after deletions, what remains must be a sequence of characters that existed in both original strings in the same order. This remaining sequence is actually a common subsequence of both strings.

To minimize the number of deletions, we want to keep as many characters as possible - which means finding the longest possible common subsequence. Once we know the LCS length, the minimum deletions needed would be:

  • Characters to delete from word1: len(word1) - LCS_length
  • Characters to delete from word2: len(word2) - LCS_length
  • Total deletions: len(word1) + len(word2) - 2 * LCS_length

However, instead of finding the LCS first and then calculating deletions, we can directly track the minimum deletions needed using dynamic programming. At each position (i, j), we consider:

  1. If characters match (word1[i-1] == word2[j-1]), we can keep both characters without any deletion, inheriting the deletion count from f[i-1][j-1].

  2. If characters don't match, we must delete at least one character. We have two choices:

    • Delete from word1: move to f[i-1][j] and add 1 deletion
    • Delete from word2: move to f[i][j-1] and add 1 deletion

    We choose whichever option gives us fewer total deletions.

This builds up a table where each cell f[i][j] tells us the minimum deletions needed to make the first i characters of word1 match the first j characters of word2, ultimately giving us the answer at f[m][n].

Learn more about Dynamic Programming patterns.

Solution Approach

We solve this problem using dynamic programming with a 2D table to track the minimum deletions needed.

Step 1: Initialize the DP table

Create a 2D array f of size (m+1) × (n+1) where m = len(word1) and n = len(word2). Each cell f[i][j] will store the minimum number of deletions required to make the first i characters of word1 equal to the first j characters of word2.

f = [[0] * (n + 1) for _ in range(m + 1)]

Step 2: Set up base cases

  • f[i][0] = i for all i from 1 to m: To make i characters of word1 equal to an empty string, we need to delete all i characters.
  • f[0][j] = j for all j from 1 to n: To make an empty string equal to j characters of word2, we need to delete all j characters.
for i in range(1, m + 1):
    f[i][0] = i
for j in range(1, n + 1):
    f[0][j] = j

Step 3: Fill the DP table

Iterate through each position (i, j) where i ranges from 1 to m and j ranges from 1 to n. For each position:

  • If word1[i-1] == word2[j-1] (characters match):

    • No deletion is needed for these matching characters
    • f[i][j] = f[i-1][j-1]
  • If word1[i-1] != word2[j-1] (characters don't match):

    • We must delete at least one character
    • Option 1: Delete character from word1, taking value from f[i-1][j]
    • Option 2: Delete character from word2, taking value from f[i][j-1]
    • f[i][j] = min(f[i-1][j], f[i][j-1]) + 1
for i, a in enumerate(word1, 1):
    for j, b in enumerate(word2, 1):
        if a == b:
            f[i][j] = f[i - 1][j - 1]
        else:
            f[i][j] = min(f[i - 1][j], f[i][j - 1]) + 1

Step 4: Return the result

The final answer is stored in f[m][n], which represents the minimum deletions needed to make the entire word1 equal to the entire word2.

Time Complexity: O(m × n) where m and n are the lengths of the two strings.

Space Complexity: O(m × n) for the DP table.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with word1 = "sea" and word2 = "eat".

Step 1: Initialize the DP table

We create a table of size 4×4 (since both strings have length 3, we need +1 for the empty string case):

    ""  e  a  t
""   0  0  0  0
s    0  0  0  0
e    0  0  0  0
a    0  0  0  0

Step 2: Fill base cases

For f[i][0], we need i deletions to make word1's prefix match empty string. For f[0][j], we need j deletions to make empty string match word2's prefix.

    ""  e  a  t
""   0  1  2  3
s    1  0  0  0
e    2  0  0  0
a    3  0  0  0

Step 3: Fill the DP table row by row

For position (1,1): Compare 's' with 'e'

  • Characters don't match
  • f[1][1] = min(f[0][1], f[1][0]) + 1 = min(1, 1) + 1 = 2

For position (1,2): Compare 's' with 'a'

  • Characters don't match
  • f[1][2] = min(f[0][2], f[1][1]) + 1 = min(2, 2) + 1 = 3

For position (1,3): Compare 's' with 't'

  • Characters don't match
  • f[1][3] = min(f[0][3], f[1][2]) + 1 = min(3, 3) + 1 = 4

After row 1:

    ""  e  a  t
""   0  1  2  3
s    1  2  3  4
e    2  0  0  0
a    3  0  0  0

For position (2,1): Compare 'e' with 'e'

  • Characters match!
  • f[2][1] = f[1][0] = 1 (no deletion needed for matching chars)

For position (2,2): Compare 'e' with 'a'

  • Characters don't match
  • f[2][2] = min(f[1][2], f[2][1]) + 1 = min(3, 1) + 1 = 2

For position (2,3): Compare 'e' with 't'

  • Characters don't match
  • f[2][3] = min(f[1][3], f[2][2]) + 1 = min(4, 2) + 1 = 3

After row 2:

    ""  e  a  t
""   0  1  2  3
s    1  2  3  4
e    2  1  2  3
a    3  0  0  0

For position (3,1): Compare 'a' with 'e'

  • Characters don't match
  • f[3][1] = min(f[2][1], f[3][0]) + 1 = min(1, 3) + 1 = 2

For position (3,2): Compare 'a' with 'a'

  • Characters match!
  • f[3][2] = f[2][1] = 1 (no deletion needed for matching chars)

For position (3,3): Compare 'a' with 't'

  • Characters don't match
  • f[3][3] = min(f[2][3], f[3][2]) + 1 = min(3, 1) + 1 = 2

Final table:

    ""  e  a  t
""   0  1  2  3
s    1  2  3  4
e    2  1  2  3
a    3  2  1  2

Step 4: Read the answer

The value at f[3][3] = 2 tells us we need minimum 2 deletions to make both strings identical.

To verify: Delete 's' from "sea" → "ea", and delete 't' from "eat" → "ea". Both strings become "ea" with 2 total deletions.

Solution Implementation

1class Solution:
2    def minDistance(self, word1: str, word2: str) -> int:
3        """
4        Calculate the minimum edit distance between two strings.
5        This implementation finds the minimum number of deletions needed
6        to make two strings equal (delete from either string).
7      
8        Args:
9            word1: First input string
10            word2: Second input string
11          
12        Returns:
13            Minimum number of deletion operations needed
14        """
15        len1, len2 = len(word1), len(word2)
16      
17        # Create a 2D DP table where dp[i][j] represents the minimum deletions
18        # needed to make word1[0:i] and word2[0:j] equal
19        dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]
20      
21        # Base case: If word2 is empty, delete all characters from word1
22        for i in range(1, len1 + 1):
23            dp[i][0] = i
24          
25        # Base case: If word1 is empty, delete all characters from word2
26        for j in range(1, len2 + 1):
27            dp[0][j] = j
28          
29        # Fill the DP table
30        for i, char1 in enumerate(word1, 1):
31            for j, char2 in enumerate(word2, 1):
32                if char1 == char2:
33                    # Characters match, no deletion needed
34                    dp[i][j] = dp[i - 1][j - 1]
35                else:
36                    # Characters don't match, delete from either string
37                    # Take minimum of:
38                    # 1. Delete from word1: dp[i-1][j] + 1
39                    # 2. Delete from word2: dp[i][j-1] + 1
40                    dp[i][j] = min(dp[i - 1][j], dp[i][j - 1]) + 1
41                  
42        return dp[len1][len2]
43
1class Solution {
2    /**
3     * Calculates the minimum number of steps required to make two strings the same
4     * by deleting characters from either string.
5     * 
6     * @param word1 The first string
7     * @param word2 The second string
8     * @return The minimum number of deletion operations needed
9     */
10    public int minDistance(String word1, String word2) {
11        int lengthWord1 = word1.length();
12        int lengthWord2 = word2.length();
13      
14        // dp[i][j] represents the minimum deletions needed to make 
15        // the first i characters of word1 equal to the first j characters of word2
16        int[][] dp = new int[lengthWord1 + 1][lengthWord2 + 1];
17      
18        // Initialize base case: to make word1[0...i-1] equal to empty string,
19        // we need to delete all i characters
20        for (int i = 0; i <= lengthWord1; i++) {
21            dp[i][0] = i;
22        }
23      
24        // Initialize base case: to make empty string equal to word2[0...j-1],
25        // we need to delete all j characters from word2
26        for (int j = 0; j <= lengthWord2; j++) {
27            dp[0][j] = j;
28        }
29      
30        // Fill the dp table using dynamic programming
31        for (int i = 1; i <= lengthWord1; i++) {
32            for (int j = 1; j <= lengthWord2; j++) {
33                // Get current characters from both strings (0-indexed)
34                char charFromWord1 = word1.charAt(i - 1);
35                char charFromWord2 = word2.charAt(j - 1);
36              
37                if (charFromWord1 == charFromWord2) {
38                    // If characters match, no deletion needed
39                    // Take the result from previous state
40                    dp[i][j] = dp[i - 1][j - 1];
41                } else {
42                    // If characters don't match, we need to delete one character
43                    // Either delete from word1 (dp[i-1][j]) or from word2 (dp[i][j-1])
44                    // Choose the option with minimum deletions and add 1
45                    dp[i][j] = Math.min(dp[i - 1][j], dp[i][j - 1]) + 1;
46                }
47            }
48        }
49      
50        // Return the minimum deletions needed for the entire strings
51        return dp[lengthWord1][lengthWord2];
52    }
53}
54
1class Solution {
2public:
3    int minDistance(string word1, string word2) {
4        int m = word1.length();
5        int n = word2.length();
6      
7        // dp[i][j] represents the minimum number of steps to make 
8        // word1[0...i-1] and word2[0...j-1] the same
9        vector<vector<int>> dp(m + 1, vector<int>(n + 1));
10      
11        // Base case: when word2 is empty, we need to delete all characters from word1
12        for (int i = 0; i <= m; ++i) {
13            dp[i][0] = i;
14        }
15      
16        // Base case: when word1 is empty, we need to delete all characters from word2
17        for (int j = 0; j <= n; ++j) {
18            dp[0][j] = j;
19        }
20      
21        // Fill the dp table
22        for (int i = 1; i <= m; ++i) {
23            for (int j = 1; j <= n; ++j) {
24                // Get current characters from both strings
25                char charFromWord1 = word1[i - 1];
26                char charFromWord2 = word2[j - 1];
27              
28                if (charFromWord1 == charFromWord2) {
29                    // If characters match, no deletion needed
30                    // Take the value from diagonal (both characters are kept)
31                    dp[i][j] = dp[i - 1][j - 1];
32                } else {
33                    // If characters don't match, we need to delete one character
34                    // Either delete from word1 (dp[i-1][j]) or from word2 (dp[i][j-1])
35                    // Choose the minimum and add 1 for the deletion operation
36                    dp[i][j] = min(dp[i - 1][j], dp[i][j - 1]) + 1;
37                }
38            }
39        }
40      
41        // Return the minimum steps to make both strings the same
42        return dp[m][n];
43    }
44};
45
1/**
2 * Calculates the minimum edit distance between two strings using dynamic programming.
3 * This computes the minimum number of deletions required to make two strings equal.
4 * 
5 * @param word1 - The first string to compare
6 * @param word2 - The second string to compare
7 * @returns The minimum number of deletion operations needed
8 */
9function minDistance(word1: string, word2: string): number {
10    const word1Length: number = word1.length;
11    const word2Length: number = word2.length;
12  
13    // Create a 2D DP table where dp[i][j] represents the minimum deletions
14    // needed to make word1[0...i-1] and word2[0...j-1] equal
15    const dpTable: number[][] = Array.from(
16        { length: word1Length + 1 }, 
17        () => Array(word2Length + 1).fill(0)
18    );
19  
20    // Initialize base case: deleting all characters from word1 to match empty word2
21    for (let i = 1; i <= word1Length; i++) {
22        dpTable[i][0] = i;
23    }
24  
25    // Initialize base case: deleting all characters from word2 to match empty word1
26    for (let j = 1; j <= word2Length; j++) {
27        dpTable[0][j] = j;
28    }
29  
30    // Fill the DP table using bottom-up approach
31    for (let i = 1; i <= word1Length; i++) {
32        for (let j = 1; j <= word2Length; j++) {
33            // If characters match, no deletion needed at this position
34            if (word1[i - 1] === word2[j - 1]) {
35                dpTable[i][j] = dpTable[i - 1][j - 1];
36            } else {
37                // Characters don't match, take minimum of:
38                // 1. Delete from word1: dpTable[i - 1][j] + 1
39                // 2. Delete from word2: dpTable[i][j - 1] + 1
40                dpTable[i][j] = Math.min(
41                    dpTable[i - 1][j],    // Delete character from word1
42                    dpTable[i][j - 1]     // Delete character from word2
43                ) + 1;
44            }
45        }
46    }
47  
48    // Return the minimum deletions needed for entire strings
49    return dpTable[word1Length][word2Length];
50}
51

Time and Space Complexity

The time complexity is O(m × n), where m is the length of word1 and n is the length of word2. This is because the algorithm uses nested loops that iterate through all positions in a 2D dynamic programming table. The outer loop runs m times (for each character in word1), and the inner loop runs n times (for each character in word2), resulting in m × n operations.

The space complexity is O(m × n). The algorithm creates a 2D array f with dimensions (m + 1) × (n + 1) to store the intermediate results of the dynamic programming solution. This array requires O((m + 1) × (n + 1)) space, which simplifies to O(m × n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Index Confusion Between String and DP Table

The Pitfall: One of the most common mistakes is confusing the indices when accessing characters from the strings versus the DP table. Since the DP table has dimensions (m+1) × (n+1) but the strings have lengths m and n, when we're at position [i][j] in the DP table, we need to access word1[i-1] and word2[j-1] in the strings.

Incorrect Code:

for i in range(1, len1 + 1):
    for j in range(1, len2 + 1):
        if word1[i] == word2[j]:  # Wrong! This will cause IndexError
            dp[i][j] = dp[i - 1][j - 1]

Correct Code:

for i in range(1, len1 + 1):
    for j in range(1, len2 + 1):
        if word1[i-1] == word2[j-1]:  # Correct: subtract 1 from indices
            dp[i][j] = dp[i - 1][j - 1]

2. Forgetting to Initialize Base Cases

The Pitfall: Forgetting to properly initialize the base cases will lead to incorrect results. The base cases represent the cost of converting a string to an empty string (or vice versa), which requires deleting all characters.

Incorrect Code:

dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]
# Missing base case initialization
for i in range(1, len1 + 1):
    for j in range(1, len2 + 1):
        # ... rest of the logic

Correct Code:

dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]

# Initialize base cases
for i in range(1, len1 + 1):
    dp[i][0] = i  # Delete all i characters from word1
for j in range(1, len2 + 1):
    dp[0][j] = j  # Delete all j characters from word2

# Then proceed with the main logic

3. Space Optimization Pitfall

The Pitfall: While it's possible to optimize space from O(m×n) to O(min(m,n)) by using only two rows instead of the full table, a common mistake is trying to optimize prematurely and incorrectly updating the values, leading to wrong results.

Incorrect Space-Optimized Attempt:

# Wrong: Trying to use a single row incorrectly
prev = [j for j in range(len2 + 1)]
for i in range(1, len1 + 1):
    curr = [i]  # Should track previous value properly
    for j in range(1, len2 + 1):
        if word1[i-1] == word2[j-1]:
            curr.append(prev[j-1])  # This might be overwritten
        else:
            curr.append(min(prev[j], curr[j-1]) + 1)
    prev = curr  # Lost information needed for next iteration

Correct Space-Optimized Solution:

def minDistance(self, word1: str, word2: str) -> int:
    if len(word1) < len(word2):
        word1, word2 = word2, word1  # Ensure word1 is longer
  
    len1, len2 = len(word1), len(word2)
    prev = list(range(len2 + 1))
  
    for i in range(1, len1 + 1):
        curr = [i]
        for j in range(1, len2 + 1):
            if word1[i-1] == word2[j-1]:
                curr.append(prev[j-1])
            else:
                curr.append(min(prev[j], curr[j-1]) + 1)
        prev = curr
  
    return prev[len2]

4. Alternative Problem Interpretation

The Pitfall: This problem asks for the minimum deletions to make strings equal. A related but different problem is finding the Longest Common Subsequence (LCS) and using it to calculate deletions. While this approach works, mixing up the recurrence relations can lead to errors.

Alternative (LCS-based) approach:

# Find LCS length first, then calculate deletions
# Deletions needed = len(word1) + len(word2) - 2 * LCS_length

Make sure you understand which approach you're implementing and don't mix the recurrence relations between direct deletion counting and LCS-based approaches.

Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Which algorithm should you use to find a node that is close to the root of the tree?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More