Facebook Pixel

3571. Find the Shortest Superstring II 🔒

EasyString
Leetcode Link

Problem Description

You need to find the shortest string that contains both given strings s1 and s2 as substrings.

Given two strings s1 and s2, your task is to construct and return the shortest possible string that includes both s1 and s2 as substrings. A substring is a contiguous sequence of characters that appears within the string.

For example:

  • If s1 = "abc" and s2 = "bcd", the shortest string containing both would be "abcd" (where "abc" and "bcd" are both substrings)
  • If s1 = "cat" and s2 = "dog", since there's no overlap, the shortest would be either "catdog" or "dogcat" (both have the same length)

The key insight is that you want to maximize the overlap between the two strings to minimize the total length. This can happen in several ways:

  • One string might already contain the other as a substring
  • The end of one string might overlap with the beginning of the other
  • There might be no overlap at all

If multiple valid answers exist with the same minimum length, you can return any one of them.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

To find the shortest string containing both strings as substrings, we need to think about how two strings can be combined most efficiently. The key realization is that we want to maximize the overlap between them to minimize the total length.

Consider what happens when we try to combine two strings. If we just concatenate them directly like s1 + s2, we get a valid answer but it might not be the shortest. The shortest combination occurs when we can reuse parts of one string that match parts of the other.

Think about it step by step:

  1. First check for complete containment: If one string already contains the other entirely (like "abcdef" contains "cde"), then we don't need to add anything - the longer string is already our answer.

  2. Look for partial overlaps: If neither contains the other completely, we look for partial overlaps. This happens when the end of one string matches the beginning of another. For instance, if s1 = "abcd" and s2 = "cdef", we notice that "cd" appears at the end of s1 and at the beginning of s2. So instead of "abcdcdef" (8 characters), we can combine them as "abcdef" (6 characters).

  3. Consider both directions: We need to check both s1 followed by s2 AND s2 followed by s1, because the overlap might be different. For example, with s1 = "abc" and s2 = "cab", putting s2 first gives us "cabc" (4 characters) which is shorter than "abcab" (5 characters).

  4. Handle the no-overlap case: If there's no overlap at all (like "cat" and "dog"), we simply concatenate them.

The algorithm systematically checks all these possibilities by iterating through possible overlap lengths, starting from the maximum possible overlap and working down. For each potential overlap length i, it checks if the last i characters of one string match the first i characters of the other. Once a match is found, that's the maximum overlap for that ordering, and we can construct the combined string.

Solution Approach

The implementation follows a systematic approach to find the shortest superstring by checking all possible overlapping configurations between s1 and s2.

Step 1: Handle Length Optimization

if m > n:
    return self.shortestSuperstring(s2, s1)

We first ensure that s1 is the shorter string (or equal length) by swapping if necessary. This simplifies our logic since we always work with the shorter string as s1.

Step 2: Check for Complete Containment

if s1 in s2:
    return s2

If the shorter string s1 is already a substring of s2, then s2 itself is the shortest superstring. No concatenation needed.

Step 3: Check for Overlaps The algorithm then iterates through possible overlap lengths from 1 to m-1 (where m = len(s1)):

for i in range(m):

For each iteration i, we check two scenarios:

Scenario A: s1 before s2 with overlap

if s2.startswith(s1[i:]):
    return s1[:i] + s2

This checks if the suffix of s1 starting from index i matches the prefix of s2. If it does, we can overlap them:

  • Take the first i characters of s1 (the non-overlapping part)
  • Append the entire s2 (which already contains the overlapping suffix of s1)

For example: If s1 = "abcd" and s2 = "cdef", when i = 2:

  • s1[2:] = "cd"
  • s2.startswith("cd") is True
  • Result: s1[:2] + s2 = "ab" + "cdef" = "abcdef"

Scenario B: s2 before s1 with overlap

if s2.endswith(s1[:m-i]):
    return s2 + s1[m-i:]

This checks if the prefix of s1 of length m-i matches the suffix of s2. If it does:

  • Take the entire s2
  • Append only the non-overlapping part of s1 (from index m-i onwards)

For example: If s1 = "abc" and s2 = "xab", when i = 1:

  • s1[:2] = "ab" (m-i = 3-1 = 2)
  • s2.endswith("ab") is True
  • Result: s2 + s1[2:] = "xab" + "c" = "xabc"

Step 4: No Overlap Case

return s1 + s2

If no overlaps are found after checking all possibilities, we simply concatenate the two strings.

The algorithm is efficient because it:

  1. Checks containment first (O(n) operation)
  2. Iterates through at most m possible overlap positions
  3. For each position, performs string comparison operations
  4. Returns immediately upon finding the first (and maximum) overlap

The time complexity is O(m × n) where m and n are the lengths of the two strings, as we perform string matching operations for each possible overlap position.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the algorithm with s1 = "abc" and s2 = "bcd".

Step 1: Check lengths

  • m = len(s1) = 3, n = len(s2) = 3
  • Since m ≤ n, no swapping needed

Step 2: Check complete containment

  • Is "abc" in "bcd"? No, so continue

Step 3: Check for overlaps

  • Start iterating with i = 0:

    Iteration i = 0:

    • Check Scenario A: Does "bcd" start with s1[0:] = "abc"? No
    • Check Scenario B: Does "bcd" end with s1[:3] = "abc"? No

    Iteration i = 1:

    • Check Scenario A: Does "bcd" start with s1[1:] = "bc"? Yes! ✓
    • Return s1[:1] + s2 = "a" + "bcd" = "abcd"

The algorithm finds that the suffix "bc" of s1 overlaps with the prefix "bc" of s2, so it combines them efficiently to get "abcd" (length 4) instead of the naive concatenation "abcbcd" (length 6).

Let's trace another example where s1 = "cat" and s2 = "dog":

Step 1: m = 3, n = 3, no swap needed Step 2: "cat" is not in "dog" Step 3: Check overlaps:

  • i = 0: "dog" doesn't start with "cat", "dog" doesn't end with "cat"
  • i = 1: "dog" doesn't start with "at", "dog" doesn't end with "ca"
  • i = 2: "dog" doesn't start with "t", "dog" doesn't end with "c"

Step 4: No overlap found, return s1 + s2 = "catdog"


Let's trace through the algorithm with `s1 = "abc"` and `s2 = "bcd"` to find the shortest superstring.

**Initial Setup:**
- `s1 = "abc"` with length m = 3
- `s2 = "bcd"` with length n = 3

**Step 1: Length Check**
Since m ≤ n (both are 3), we proceed without swapping.

**Step 2: Containment Check**
Check if "abc" is a substring of "bcd": No match, so we continue.

**Step 3: Overlap Detection**
We iterate through possible overlap positions:

*Iteration i = 0:*
- Scenario A: Check if "bcd" starts with `s1[0:]` = "abc"
  - "bcd".startswith("abc") → False
- Scenario B: Check if "bcd" ends with `s1[:3]` = "abc"
  - "bcd".endswith("abc") → False

*Iteration i = 1:*
- Scenario A: Check if "bcd" starts with `s1[1:]` = "bc"
  - "bcd".startswith("bc") → True! ✓
  - Found overlap! Return `s1[:1] + s2` = "a" + "bcd" = "abcd"

**Result:** "abcd" (length 4)

The algorithm successfully identifies that the last 2 characters of s1 ("bc") match the first 2 characters of s2 ("bc"), allowing us to overlap them and create the shortest possible superstring "abcd" instead of the naive concatenation "abcbcd" (length 6).

Solution Implementation

1class Solution:
2    def shortestSuperstring(self, s1: str, s2: str) -> str:
3        """
4        Find the shortest superstring that contains both s1 and s2 as substrings.
5        A superstring is a string that contains all given strings as substrings.
6      
7        Args:
8            s1: First input string
9            s2: Second input string
10          
11        Returns:
12            The shortest string containing both s1 and s2 as substrings
13        """
14        m, n = len(s1), len(s2)
15      
16        # Optimization: always process the shorter string first
17        if m > n:
18            return self.shortestSuperstring(s2, s1)
19      
20        # Case 1: If s1 is already a substring of s2, s2 is the answer
21        if s1 in s2:
22            return s2
23      
24        # Case 2: Check for overlaps between s1 and s2
25        for i in range(m):
26            # Check if suffix of s1 (from position i) matches prefix of s2
27            # Example: s1="abcd", s2="cdef" -> overlap at "cd"
28            if s2.startswith(s1[i:]):
29                return s1[:i] + s2
30          
31            # Check if prefix of s1 matches suffix of s2
32            # Example: s1="cdef", s2="abcd" -> overlap at "cd"
33            if s2.endswith(s1[:m - i]):
34                return s2 + s1[m - i:]
35      
36        # Case 3: No overlap found, concatenate the strings
37        return s1 + s2
38
1class Solution {
2    /**
3     * Finds the shortest superstring that contains both input strings.
4     * A superstring is a string that contains both s1 and s2 as substrings.
5     * 
6     * @param s1 First input string
7     * @param s2 Second input string
8     * @return The shortest string containing both s1 and s2
9     */
10    public String shortestSuperstring(String s1, String s2) {
11        int lengthS1 = s1.length();
12        int lengthS2 = s2.length();
13      
14        // Ensure s1 is the shorter string for optimization
15        if (lengthS1 > lengthS2) {
16            return shortestSuperstring(s2, s1);
17        }
18      
19        // Check if s2 already contains s1 as a substring
20        if (s2.contains(s1)) {
21            return s2;
22        }
23      
24        // Try to find overlaps between s1 and s2
25        for (int i = 0; i < lengthS1; i++) {
26            // Check if suffix of s1 matches prefix of s2
27            // Example: s1="abc", s2="bcd" -> check if "bc" (suffix of s1) matches start of s2
28            if (s2.startsWith(s1.substring(i))) {
29                return s1.substring(0, i) + s2;
30            }
31          
32            // Check if prefix of s1 matches suffix of s2
33            // Example: s1="bcd", s2="abc" -> check if "bc" (prefix of s1) matches end of s2
34            if (s2.endsWith(s1.substring(0, lengthS1 - i))) {
35                return s2 + s1.substring(lengthS1 - i);
36            }
37        }
38      
39        // No overlap found, concatenate the strings
40        return s1 + s2;
41    }
42}
43
1class Solution {
2public:
3    string shortestSuperstring(string s1, string s2) {
4        int len1 = s1.size();
5        int len2 = s2.size();
6      
7        // Ensure s1 is the shorter string for optimization
8        if (len1 > len2) {
9            return shortestSuperstring(s2, s1);
10        }
11      
12        // Check if s1 is already a substring of s2
13        if (s2.find(s1) != string::npos) {
14            return s2;
15        }
16      
17        // Try to find overlaps between s1 and s2
18        for (int i = 0; i < len1; ++i) {
19            // Check if suffix of s1 (starting from index i) matches prefix of s2
20            // If yes, prepend the non-overlapping part of s1 to s2
21            if (s2.find(s1.substr(i)) == 0) {
22                return s1.substr(0, i) + s2;
23            }
24          
25            // Check if prefix of s1 (of length len1-i) matches suffix of s2
26            // If yes, append the non-overlapping part of s1 to s2
27            if (s2.rfind(s1.substr(0, len1 - i)) == s2.size() - (len1 - i)) {
28                return s2 + s1.substr(len1 - i);
29            }
30        }
31      
32        // No overlap found, concatenate both strings
33        return s1 + s2;
34    }
35};
36
1/**
2 * Finds the shortest superstring that contains both input strings as substrings.
3 * A superstring is a string that contains all given strings as substrings.
4 * 
5 * @param s1 - First input string
6 * @param s2 - Second input string
7 * @returns The shortest superstring containing both s1 and s2
8 */
9function shortestSuperstring(s1: string, s2: string): string {
10    const length1: number = s1.length;
11    const length2: number = s2.length;
12
13    // Ensure s1 is the shorter string for optimization
14    if (length1 > length2) {
15        return shortestSuperstring(s2, s1);
16    }
17
18    // Check if s1 is already a substring of s2
19    if (s2.includes(s1)) {
20        return s2;
21    }
22
23    // Try to find overlaps between the strings
24    for (let i: number = 0; i < length1; i++) {
25        // Check if suffix of s1 matches prefix of s2
26        // Example: s1="abc", s2="bcd" -> check if "bc" (suffix of s1) matches start of s2
27        if (s2.startsWith(s1.slice(i))) {
28            return s1.slice(0, i) + s2;
29        }
30      
31        // Check if prefix of s1 matches suffix of s2
32        // Example: s1="bcd", s2="abc" -> check if "bc" (prefix of s1) matches end of s2
33        if (s2.endsWith(s1.slice(0, length1 - i))) {
34            return s2 + s1.slice(length1 - i);
35        }
36    }
37
38    // No overlap found, concatenate the strings
39    return s1 + s2;
40}
41

Time and Space Complexity

The time complexity is O(n^2) where n is the maximum length of s1 and s2.

  • The initial length comparison and swap operation takes O(1) time
  • The substring check s1 in s2 takes O(m * n) time in the worst case, which is bounded by O(n^2) since m ≤ n after the swap
  • The for loop runs m iterations (where m ≤ n)
  • Inside each iteration:
    • s2.startswith(s1[i:]) creates a substring s1[i:] of length m-i and performs comparison in O(m-i) time
    • s2.endswith(s1[:m-i]) creates a substring s1[:m-i] of length m-i and performs comparison in O(m-i) time
  • The total work in the loop is O(m) + O(m-1) + ... + O(1) = O(m^2), which is bounded by O(n^2)
  • String concatenation operations take O(m+n) time, bounded by O(n)

The space complexity is O(n) where n is the maximum length of s1 and s2.

  • The substring operations s1[i:] and s1[:m-i] create new strings that take up to O(m) space, bounded by O(n)
  • The recursive call in the swap case uses O(1) additional stack space since it's tail recursion with at most one recursive call
  • The final string concatenation creates a new string of length at most O(m+n), which is O(n)

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Overlap Detection Logic

One of the most common mistakes is incorrectly checking for overlaps, particularly mixing up the indices when checking prefix-suffix matches.

Pitfall Example:

# WRONG: Incorrect index usage
if s2.startswith(s1[i:]):
    return s1[:i] + s2
if s2.endswith(s1[:i]):  # Should be s1[:m-i], not s1[:i]
    return s2 + s1[i:]      # Should be s1[m-i:], not s1[i:]

Why it's wrong: When checking if s2 ends with a prefix of s1, the prefix length should be m-i (decreasing as i increases), not i (increasing). Similarly, the concatenation should append the remaining part of s1, which starts at index m-i.

Correct Implementation:

if s2.startswith(s1[i:]):
    return s1[:i] + s2
if s2.endswith(s1[:m-i]):
    return s2 + s1[m-i:]

2. Missing the Containment Check

Pitfall Example:

def shortestSuperstring(self, s1: str, s2: str) -> str:
    # Directly checking overlaps without containment check
    for i in range(len(s1)):
        if s2.startswith(s1[i:]):
            return s1[:i] + s2
        # ... other overlap checks
    return s1 + s2

Why it's wrong: If one string completely contains the other (e.g., s1="bc", s2="abcd"), the overlap checking loop might not handle this case correctly. The loop would check partial overlaps but miss that the entire s1 is already within s2.

Solution: Always check for complete containment first:

if s1 in s2:
    return s2
if s2 in s1:
    return s1

3. Not Finding the Maximum Overlap

Pitfall Example:

# WRONG: Starting from the end and potentially missing larger overlaps
for i in range(m-1, -1, -1):  # Iterating backwards
    if s2.startswith(s1[i:]):
        return s1[:i] + s2

Why it's wrong: Starting from smaller overlaps (larger i values) means you might return a suboptimal solution. For example, with s1="abcde" and s2="cdefg", checking from i=4 downwards would first find the overlap "e" and return "abcdcdefg" (length 9), missing the better overlap "cde" which gives "abcdefg" (length 7).

Solution: Always iterate from i=0 to find the maximum overlap first:

for i in range(m):  # Start from 0 to find maximum overlap
    if s2.startswith(s1[i:]):
        return s1[:i] + s2

4. Forgetting Edge Cases with Empty or Single-Character Strings

Pitfall Example:

def shortestSuperstring(self, s1: str, s2: str) -> str:
    m, n = len(s1), len(s2)
    # No handling for empty strings
    for i in range(m):
        # This loop won't execute if m=0
        ...

Why it's wrong: If s1 is empty, the loop won't execute and the function will return s1 + s2 = "" + s2 = s2, which is correct. However, the logic becomes unclear and the containment check if s1 in s2 might behave unexpectedly with empty strings in some implementations.

Solution: Add explicit handling for edge cases:

if not s1:
    return s2
if not s2:
    return s1
if m > n:
    return self.shortestSuperstring(s2, s1)

5. Not Considering Both Ordering Possibilities

Pitfall Example:

# WRONG: Only checking s1 before s2
for i in range(m):
    if s2.startswith(s1[i:]):
        return s1[:i] + s2
return s1 + s2  # Missing s2 before s1 case

Why it's wrong: This only checks if we can place s1 before s2 with overlap, but misses cases where s2 before s1 would give a shorter result. For example, with s1="defg" and s2="abcde", the optimal solution is "abcdefg" (s2 before s1 with overlap "de").

Solution: Always check both orderings in the same loop:

for i in range(m):
    if s2.startswith(s1[i:]):
        return s1[:i] + s2
    if s2.endswith(s1[:m-i]):
        return s2 + s1[m-i:]
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

A person thinks of a number between 1 and 1000. You may ask any number questions to them, provided that the question can be answered with either "yes" or "no".

What is the minimum number of questions you needed to ask so that you are guaranteed to know the number that the person is thinking?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More