Facebook Pixel

1092. Shortest Common Supersequence

Problem Description

You are given two strings str1 and str2. Your task is to find the shortest string that contains both str1 and str2 as subsequences.

A subsequence is formed by deleting some (possibly zero) characters from a string without changing the order of the remaining characters. For example, "ace" is a subsequence of "abcde" (delete 'b' and 'd').

The goal is to construct the shortest possible string where:

  • If you delete some characters from this string, you can get str1
  • If you delete some (other) characters from this string, you can get str2

If multiple valid shortest strings exist, you can return any one of them.

Example:

  • If str1 = "abac" and str2 = "cab", one possible shortest common supersequence is "cabac".
    • Deleting the second 'c' gives "abac" (which is str1)
    • Deleting the first 'a' and last 'a' and 'c' gives "cab" (which is str2)

The solution uses dynamic programming to first find the Longest Common Subsequence (LCS) between the two strings, then reconstructs the shortest supersequence by:

  1. Building a DP table f[i][j] that stores the length of LCS for the first i characters of str1 and first j characters of str2
  2. Backtracking through the DP table to build the result string by:
    • Including characters that are common to both strings once
    • Including characters unique to each string in the appropriate positions
Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is recognizing that the shortest common supersequence is closely related to the Longest Common Subsequence (LCS) of the two strings.

Think about it this way: if two strings share some common characters in the same order, we only need to include those shared characters once in our result. The more characters they share (i.e., the longer their LCS), the shorter our final supersequence will be.

For example, if str1 = "abac" and str2 = "cab", their LCS is "ab". To build the shortest supersequence:

  • We must include all characters from both strings
  • But we can "merge" the common subsequence characters instead of duplicating them

The relationship can be expressed as: Length of shortest supersequence = len(str1) + len(str2) - len(LCS)

This is because:

  • We need all characters from str1 (length m)
  • We need all characters from str2 (length n)
  • But the LCS characters are counted twice, so we subtract them once

Once we understand this relationship, the solution strategy becomes clear:

  1. First, find the LCS using dynamic programming
  2. Then, reconstruct the actual supersequence by backtracking through the DP table

During reconstruction, we traverse backwards through the DP table:

  • When characters match (part of LCS), we include that character once
  • When they don't match, we need to include the character from whichever string would maintain the subsequence property
  • At boundaries (when one string is exhausted), we include all remaining characters from the other string

This greedy reconstruction ensures we build the shortest possible string that contains both original strings as subsequences.

Learn more about Dynamic Programming patterns.

Solution Approach

The implementation consists of two main phases: building the LCS table and reconstructing the shortest supersequence.

Phase 1: Building the LCS Dynamic Programming Table

We create a 2D table f[i][j] where f[i][j] represents the length of the LCS between the first i characters of str1 and the first j characters of str2.

f = [[0] * (n + 1) for _ in range(m + 1)]

The table is filled using the recurrence relation:

  • If str1[i-1] == str2[j-1]: f[i][j] = f[i-1][j-1] + 1 (characters match, extend LCS)
  • Otherwise: f[i][j] = max(f[i-1][j], f[i][j-1]) (take the better LCS without current character)

Phase 2: Backtracking to Build the Result

Starting from f[m][n] (bottom-right corner), we trace back to reconstruct the shortest supersequence:

i, j = m, n
ans = []

The backtracking logic follows these rules:

  1. Boundary cases:

    • If i == 0: We've exhausted str1, so add remaining characters from str2[j]
    • If j == 0: We've exhausted str2, so add remaining characters from str1[i]
  2. Non-boundary cases:

    • If f[i][j] == f[i-1][j]: The LCS value came from excluding str1[i-1], so we add str1[i-1] to result and move to (i-1, j)
    • Else if f[i][j] == f[i][j-1]: The LCS value came from excluding str2[j-1], so we add str2[j-1] to result and move to (i, j-1)
    • Else: Characters matched (part of LCS), add the character once and move diagonally to (i-1, j-1)

Since we build the result by traversing backwards, we reverse the final array:

return ''.join(ans[::-1])

Time Complexity: O(m * n) for building the DP table and O(m + n) for reconstruction Space Complexity: O(m * n) for the DP table

The algorithm guarantees the shortest supersequence because it includes each LCS character exactly once while preserving all non-LCS characters from both strings in their required order.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the algorithm with str1 = "abc" and str2 = "adc".

Step 1: Build the LCS DP Table

We create a table f[i][j] where f[i][j] represents the LCS length for the first i characters of str1 and first j characters of str2.

    ""  a  d  c
""   0  0  0  0
a    0  1  1  1
b    0  1  1  1
c    0  1  1  2

Building process:

  • f[1][1]: str1[0]='a' == str2[0]='a', so f[1][1] = f[0][0] + 1 = 1
  • f[1][2]: str1[0]='a' != str2[1]='d', so f[1][2] = max(f[0][2], f[1][1]) = 1
  • f[2][1]: str1[1]='b' != str2[0]='a', so f[2][1] = max(f[1][1], f[2][0]) = 1
  • f[3][3]: str1[2]='c' == str2[2]='c', so f[3][3] = f[2][2] + 1 = 2

The LCS length is 2 (characters 'a' and 'c').

Step 2: Backtrack to Build Result

Starting from (i=3, j=3), we trace backwards:

  1. At (3,3): f[3][3]=2 != f[2][3]=1 and != f[3][2]=1

    • This means str1[2]='c' and str2[2]='c' match
    • Add 'c' to result: ans = ['c']
    • Move diagonally to (2,2)
  2. At (2,2): f[2][2]=1 == f[2][1]=1

    • The LCS value came from excluding str2[1]='d'
    • Add 'd' to result: ans = ['c', 'd']
    • Move left to (2,1)
  3. At (2,1): f[2][1]=1 == f[1][1]=1

    • The LCS value came from excluding str1[1]='b'
    • Add 'b' to result: ans = ['c', 'd', 'b']
    • Move up to (1,1)
  4. At (1,1): f[1][1]=1 != f[0][1]=0 and != f[1][0]=0

    • This means str1[0]='a' and str2[0]='a' match
    • Add 'a' to result: ans = ['c', 'd', 'b', 'a']
    • Move diagonally to (0,0)
  5. At (0,0): Both strings exhausted, stop.

Step 3: Reverse the Result

Since we built backwards, reverse the array: ['c', 'd', 'b', 'a']"adbc"

Verification:

  • "adbc" contains "abc" as subsequence (delete 'd')
  • "adbc" contains "adc" as subsequence (delete 'b')
  • Length is 4, which equals len("abc") + len("adc") - LCS_length = 3 + 3 - 2 = 4

The shortest common supersequence is "adbc".

Solution Implementation

1class Solution:
2    def shortestCommonSupersequence(self, str1: str, str2: str) -> str:
3        # Get lengths of both strings
4        len1, len2 = len(str1), len(str2)
5      
6        # Create DP table to store length of longest common subsequence (LCS)
7        # dp[i][j] represents LCS length of str1[0:i] and str2[0:j]
8        dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]
9      
10        # Fill the DP table to find LCS lengths
11        for i in range(1, len1 + 1):
12            for j in range(1, len2 + 1):
13                if str1[i - 1] == str2[j - 1]:
14                    # Characters match, extend the LCS
15                    dp[i][j] = dp[i - 1][j - 1] + 1
16                else:
17                    # Characters don't match, take maximum from previous states
18                    dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
19      
20        # Build the shortest common supersequence by backtracking through DP table
21        result = []
22        i, j = len1, len2
23      
24        while i > 0 or j > 0:
25            if i == 0:
26                # Reached end of str1, add remaining characters from str2
27                j -= 1
28                result.append(str2[j])
29            elif j == 0:
30                # Reached end of str2, add remaining characters from str1
31                i -= 1
32                result.append(str1[i])
33            else:
34                # Both strings have remaining characters
35                if dp[i][j] == dp[i - 1][j]:
36                    # Current character from str1 is not part of LCS
37                    i -= 1
38                    result.append(str1[i])
39                elif dp[i][j] == dp[i][j - 1]:
40                    # Current character from str2 is not part of LCS
41                    j -= 1
42                    result.append(str2[j])
43                else:
44                    # Characters match and are part of LCS, add once
45                    i -= 1
46                    j -= 1
47                    result.append(str1[i])
48      
49        # Reverse the result since we built it backwards
50        return ''.join(reversed(result))
51
1class Solution {
2    public String shortestCommonSupersequence(String str1, String str2) {
3        int len1 = str1.length();
4        int len2 = str2.length();
5      
6        // dp[i][j] represents the length of LCS (Longest Common Subsequence) 
7        // of str1[0...i-1] and str2[0...j-1]
8        int[][] dp = new int[len1 + 1][len2 + 1];
9      
10        // Build the LCS table using dynamic programming
11        for (int i = 1; i <= len1; i++) {
12            for (int j = 1; j <= len2; j++) {
13                if (str1.charAt(i - 1) == str2.charAt(j - 1)) {
14                    // Characters match, extend the LCS
15                    dp[i][j] = dp[i - 1][j - 1] + 1;
16                } else {
17                    // Characters don't match, take the maximum LCS from previous states
18                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
19                }
20            }
21        }
22      
23        // Reconstruct the shortest common supersequence by backtracking through the DP table
24        int i = len1;
25        int j = len2;
26        StringBuilder result = new StringBuilder();
27      
28        while (i > 0 || j > 0) {
29            if (i == 0) {
30                // Reached the end of str1, append remaining characters from str2
31                result.append(str2.charAt(--j));
32            } else if (j == 0) {
33                // Reached the end of str2, append remaining characters from str1
34                result.append(str1.charAt(--i));
35            } else {
36                // Check where the current LCS value came from
37                if (dp[i][j] == dp[i - 1][j]) {
38                    // LCS value came from top cell, character from str1 is not in LCS
39                    result.append(str1.charAt(--i));
40                } else if (dp[i][j] == dp[i][j - 1]) {
41                    // LCS value came from left cell, character from str2 is not in LCS
42                    result.append(str2.charAt(--j));
43                } else {
44                    // Characters match and are part of LCS, add once and move both pointers
45                    result.append(str1.charAt(--i));
46                    j--;
47                }
48            }
49        }
50      
51        // Reverse the result since we built it backwards
52        return result.reverse().toString();
53    }
54}
55
1class Solution {
2public:
3    string shortestCommonSupersequence(string str1, string str2) {
4        int len1 = str1.size();
5        int len2 = str2.size();
6      
7        // Build LCS (Longest Common Subsequence) table
8        // dp[i][j] represents the length of LCS of str1[0...i-1] and str2[0...j-1]
9        vector<vector<int>> dp(len1 + 1, vector<int>(len2 + 1, 0));
10      
11        // Fill the LCS table using dynamic programming
12        for (int i = 1; i <= len1; ++i) {
13            for (int j = 1; j <= len2; ++j) {
14                if (str1[i - 1] == str2[j - 1]) {
15                    // Characters match, extend the LCS
16                    dp[i][j] = dp[i - 1][j - 1] + 1;
17                } else {
18                    // Characters don't match, take maximum from previous states
19                    dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
20                }
21            }
22        }
23      
24        // Reconstruct the shortest common supersequence by backtracking through the LCS table
25        int i = len1;
26        int j = len2;
27        string result;
28      
29        while (i > 0 || j > 0) {
30            if (i == 0) {
31                // Reached end of str1, add remaining characters from str2
32                result += str2[--j];
33            } else if (j == 0) {
34                // Reached end of str2, add remaining characters from str1
35                result += str1[--i];
36            } else {
37                // Check where the current LCS value came from
38                if (dp[i][j] == dp[i - 1][j]) {
39                    // LCS value came from top cell, add character from str1
40                    result += str1[--i];
41                } else if (dp[i][j] == dp[i][j - 1]) {
42                    // LCS value came from left cell, add character from str2
43                    result += str2[--j];
44                } else {
45                    // Characters matched (diagonal move), add the common character once
46                    result += str1[--i];
47                    --j;
48                }
49            }
50        }
51      
52        // Reverse the result since we built it backwards
53        reverse(result.begin(), result.end());
54        return result;
55    }
56};
57
1/**
2 * Finds the shortest common supersequence of two strings.
3 * A supersequence is a string that has both input strings as subsequences.
4 * 
5 * @param str1 - First input string
6 * @param str2 - Second input string
7 * @returns The shortest string containing both str1 and str2 as subsequences
8 */
9function shortestCommonSupersequence(str1: string, str2: string): string {
10    const length1: number = str1.length;
11    const length2: number = str2.length;
12  
13    // Create a 2D DP table to store the length of longest common subsequence
14    // dp[i][j] represents the LCS length of str1[0...i-1] and str2[0...j-1]
15    const dp: number[][] = Array.from({ length: length1 + 1 }, () => 
16        new Array<number>(length2 + 1).fill(0)
17    );
18  
19    // Build the LCS table using dynamic programming
20    for (let i = 1; i <= length1; i++) {
21        for (let j = 1; j <= length2; j++) {
22            if (str1[i - 1] === str2[j - 1]) {
23                // Characters match: extend the LCS by 1
24                dp[i][j] = dp[i - 1][j - 1] + 1;
25            } else {
26                // Characters don't match: take the maximum LCS from previous states
27                dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
28            }
29        }
30    }
31  
32    // Reconstruct the shortest common supersequence by backtracking through the DP table
33    const result: string[] = [];
34    let row: number = length1;
35    let col: number = length2;
36  
37    while (row > 0 || col > 0) {
38        if (row === 0) {
39            // Reached the top edge: add remaining characters from str2
40            result.push(str2[--col]);
41        } else if (col === 0) {
42            // Reached the left edge: add remaining characters from str1
43            result.push(str1[--row]);
44        } else {
45            if (dp[row][col] === dp[row - 1][col]) {
46                // Current LCS value came from above: add character from str1
47                result.push(str1[--row]);
48            } else if (dp[row][col] === dp[row][col - 1]) {
49                // Current LCS value came from left: add character from str2
50                result.push(str2[--col]);
51            } else {
52                // Characters matched: add the common character and move diagonally
53                result.push(str1[--row]);
54                col--;
55            }
56        }
57    }
58  
59    // Reverse the result since we built it backwards
60    return result.reverse().join('');
61}
62

Time and Space Complexity

Time Complexity: O(m * n + m + n) which simplifies to O(m * n)

The algorithm consists of two main phases:

  1. Building the DP table: The nested loops iterate through all cells of the f table, where each cell operation takes O(1) time. This results in O(m * n) time complexity.

  2. Reconstructing the shortest common supersequence: The while loop backtracks through the DP table from position (m, n) to (0, 0). In the worst case, we traverse at most m + n steps (moving either up or left in each iteration). Each operation within the loop takes O(1) time. This phase has O(m + n) time complexity.

The overall time complexity is O(m * n) + O(m + n) = O(m * n) since m * n dominates for non-trivial inputs.

Space Complexity: O(m * n)

The space usage breaks down as follows:

  1. DP table f: A 2D array of size (m + 1) × (n + 1) storing the longest common subsequence lengths, requiring O(m * n) space.

  2. Result list ans: In the worst case where the two strings have no common characters, the shortest common supersequence has length m + n, requiring O(m + n) space.

  3. Other variables: i, j, m, n use O(1) space.

The total space complexity is O(m * n) + O(m + n) + O(1) = O(m * n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Incorrect Backtracking Order

The Problem: A common mistake is incorrectly checking the DP table conditions during backtracking. Specifically, developers often check dp[i][j] == dp[i][j-1] before dp[i][j] == dp[i-1][j], which can lead to an incorrect supersequence that doesn't maintain the proper character order.

Why It Happens: The order matters because when both dp[i-1][j] and dp[i][j-1] equal dp[i][j], we have a choice. The algorithm must consistently follow one path to ensure correctness.

Incorrect Implementation:

if dp[i][j] == dp[i][j - 1]:  # Checking this first
    j -= 1
    result.append(str2[j])
elif dp[i][j] == dp[i - 1][j]:  # Checking this second
    i -= 1
    result.append(str1[i])

Solution: Always check dp[i-1][j] first, then dp[i][j-1]. This ensures a consistent traversal pattern:

if dp[i][j] == dp[i - 1][j]:
    i -= 1
    result.append(str1[i])
elif dp[i][j] == dp[i][j - 1]:
    j -= 1
    result.append(str2[j])
else:
    # Characters match (part of LCS)
    i -= 1
    j -= 1
    result.append(str1[i])

Pitfall 2: Off-by-One Errors in Index Management

The Problem: When backtracking, it's easy to confuse whether to use the character at index i or i-1 after decrementing. Since the DP table uses 1-based indexing for convenience but strings use 0-based indexing, this mismatch can cause bugs.

Incorrect Implementation:

# Wrong: Using index before decrementing
result.append(str1[i - 1])
i -= 1

Solution: Always decrement the index first, then use it to access the string:

# Correct: Decrement first, then use
i -= 1
result.append(str1[i])  # Now i is already decremented

Pitfall 3: Forgetting to Handle Empty String Cases

The Problem: If either str1 or str2 is empty, the algorithm should simply return the other string. Not handling this edge case explicitly can lead to unnecessary computation or errors.

Solution: Add early return conditions:

def shortestCommonSupersequence(self, str1: str, str2: str) -> str:
    if not str1:
        return str2
    if not str2:
        return str1
  
    # Continue with the main algorithm...

Pitfall 4: Building Result in Wrong Direction

The Problem: During backtracking, we traverse from (m, n) to (0, 0), which means we're building the result string backwards. Forgetting to reverse the final result is a common oversight.

Incorrect Implementation:

return ''.join(result)  # Forgot to reverse!

Solution: Always reverse the result before returning:

return ''.join(reversed(result))
# or
return ''.join(result[::-1])
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

In a binary min heap, the minimum element can be found in:


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More