3571. Find the Shortest Superstring II 🔒
Problem Description
You need to find the shortest string that contains both given strings s1
and s2
as substrings.
Given two strings s1
and s2
, your task is to construct and return the shortest possible string that includes both s1
and s2
as substrings. A substring is a contiguous sequence of characters that appears within the string.
For example:
- If
s1 = "abc"
ands2 = "bcd"
, the shortest string containing both would be"abcd"
(where "abc" and "bcd" are both substrings) - If
s1 = "cat"
ands2 = "dog"
, since there's no overlap, the shortest would be either"catdog"
or"dogcat"
(both have the same length)
The key insight is that you want to maximize the overlap between the two strings to minimize the total length. This can happen in several ways:
- One string might already contain the other as a substring
- The end of one string might overlap with the beginning of the other
- There might be no overlap at all
If multiple valid answers exist with the same minimum length, you can return any one of them.
Intuition
To find the shortest string containing both strings as substrings, we need to think about how two strings can be combined most efficiently. The key realization is that we want to maximize the overlap between them to minimize the total length.
Consider what happens when we try to combine two strings. If we just concatenate them directly like s1 + s2
, we get a valid answer but it might not be the shortest. The shortest combination occurs when we can reuse parts of one string that match parts of the other.
Think about it step by step:
-
First check for complete containment: If one string already contains the other entirely (like "abcdef" contains "cde"), then we don't need to add anything - the longer string is already our answer.
-
Look for partial overlaps: If neither contains the other completely, we look for partial overlaps. This happens when the end of one string matches the beginning of another. For instance, if
s1 = "abcd"
ands2 = "cdef"
, we notice that "cd" appears at the end ofs1
and at the beginning ofs2
. So instead of "abcdcdef" (8 characters), we can combine them as "abcdef" (6 characters). -
Consider both directions: We need to check both
s1
followed bys2
ANDs2
followed bys1
, because the overlap might be different. For example, withs1 = "abc"
ands2 = "cab"
, puttings2
first gives us "cabc" (4 characters) which is shorter than "abcab" (5 characters). -
Handle the no-overlap case: If there's no overlap at all (like "cat" and "dog"), we simply concatenate them.
The algorithm systematically checks all these possibilities by iterating through possible overlap lengths, starting from the maximum possible overlap and working down. For each potential overlap length i
, it checks if the last i
characters of one string match the first i
characters of the other. Once a match is found, that's the maximum overlap for that ordering, and we can construct the combined string.
Solution Approach
The implementation follows a systematic approach to find the shortest superstring by checking all possible overlapping configurations between s1
and s2
.
Step 1: Handle Length Optimization
if m > n: return self.shortestSuperstring(s2, s1)
We first ensure that s1
is the shorter string (or equal length) by swapping if necessary. This simplifies our logic since we always work with the shorter string as s1
.
Step 2: Check for Complete Containment
if s1 in s2: return s2
If the shorter string s1
is already a substring of s2
, then s2
itself is the shortest superstring. No concatenation needed.
Step 3: Check for Overlaps
The algorithm then iterates through possible overlap lengths from 1 to m-1
(where m = len(s1)
):
for i in range(m):
For each iteration i
, we check two scenarios:
Scenario A: s1 before s2 with overlap
if s2.startswith(s1[i:]): return s1[:i] + s2
This checks if the suffix of s1
starting from index i
matches the prefix of s2
. If it does, we can overlap them:
- Take the first
i
characters ofs1
(the non-overlapping part) - Append the entire
s2
(which already contains the overlapping suffix ofs1
)
For example: If s1 = "abcd"
and s2 = "cdef"
, when i = 2
:
s1[2:]
= "cd"s2.startswith("cd")
is True- Result:
s1[:2] + s2
= "ab" + "cdef" = "abcdef"
Scenario B: s2 before s1 with overlap
if s2.endswith(s1[:m-i]): return s2 + s1[m-i:]
This checks if the prefix of s1
of length m-i
matches the suffix of s2
. If it does:
- Take the entire
s2
- Append only the non-overlapping part of
s1
(from indexm-i
onwards)
For example: If s1 = "abc"
and s2 = "xab"
, when i = 1
:
s1[:2]
= "ab" (m-i = 3-1 = 2)s2.endswith("ab")
is True- Result:
s2 + s1[2:]
= "xab" + "c" = "xabc"
Step 4: No Overlap Case
return s1 + s2
If no overlaps are found after checking all possibilities, we simply concatenate the two strings.
The algorithm is efficient because it:
- Checks containment first (O(n) operation)
- Iterates through at most
m
possible overlap positions - For each position, performs string comparison operations
- Returns immediately upon finding the first (and maximum) overlap
The time complexity is O(m × n) where m and n are the lengths of the two strings, as we perform string matching operations for each possible overlap position.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the algorithm with s1 = "abc"
and s2 = "bcd"
.
Step 1: Check lengths
m = len(s1) = 3
,n = len(s2) = 3
- Since
m ≤ n
, no swapping needed
Step 2: Check complete containment
- Is "abc" in "bcd"? No, so continue
Step 3: Check for overlaps
-
Start iterating with
i = 0
:Iteration i = 0:
- Check Scenario A: Does "bcd" start with
s1[0:]
= "abc"? No - Check Scenario B: Does "bcd" end with
s1[:3]
= "abc"? No
Iteration i = 1:
- Check Scenario A: Does "bcd" start with
s1[1:]
= "bc"? Yes! ✓ - Return
s1[:1] + s2
= "a" + "bcd" = "abcd"
- Check Scenario A: Does "bcd" start with
The algorithm finds that the suffix "bc" of s1 overlaps with the prefix "bc" of s2, so it combines them efficiently to get "abcd" (length 4) instead of the naive concatenation "abcbcd" (length 6).
Let's trace another example where s1 = "cat"
and s2 = "dog"
:
Step 1: m = 3
, n = 3
, no swap needed
Step 2: "cat" is not in "dog"
Step 3: Check overlaps:
- i = 0: "dog" doesn't start with "cat", "dog" doesn't end with "cat"
- i = 1: "dog" doesn't start with "at", "dog" doesn't end with "ca"
- i = 2: "dog" doesn't start with "t", "dog" doesn't end with "c"
Step 4: No overlap found, return s1 + s2
= "catdog"
Let's trace through the algorithm with `s1 = "abc"` and `s2 = "bcd"` to find the shortest superstring.
**Initial Setup:**
- `s1 = "abc"` with length m = 3
- `s2 = "bcd"` with length n = 3
**Step 1: Length Check**
Since m ≤ n (both are 3), we proceed without swapping.
**Step 2: Containment Check**
Check if "abc" is a substring of "bcd": No match, so we continue.
**Step 3: Overlap Detection**
We iterate through possible overlap positions:
*Iteration i = 0:*
- Scenario A: Check if "bcd" starts with `s1[0:]` = "abc"
- "bcd".startswith("abc") → False
- Scenario B: Check if "bcd" ends with `s1[:3]` = "abc"
- "bcd".endswith("abc") → False
*Iteration i = 1:*
- Scenario A: Check if "bcd" starts with `s1[1:]` = "bc"
- "bcd".startswith("bc") → True! ✓
- Found overlap! Return `s1[:1] + s2` = "a" + "bcd" = "abcd"
**Result:** "abcd" (length 4)
The algorithm successfully identifies that the last 2 characters of s1 ("bc") match the first 2 characters of s2 ("bc"), allowing us to overlap them and create the shortest possible superstring "abcd" instead of the naive concatenation "abcbcd" (length 6).
Solution Implementation
1class Solution:
2 def shortestSuperstring(self, s1: str, s2: str) -> str:
3 """
4 Find the shortest superstring that contains both s1 and s2 as substrings.
5 A superstring is a string that contains all given strings as substrings.
6
7 Args:
8 s1: First input string
9 s2: Second input string
10
11 Returns:
12 The shortest string containing both s1 and s2 as substrings
13 """
14 m, n = len(s1), len(s2)
15
16 # Optimization: always process the shorter string first
17 if m > n:
18 return self.shortestSuperstring(s2, s1)
19
20 # Case 1: If s1 is already a substring of s2, s2 is the answer
21 if s1 in s2:
22 return s2
23
24 # Case 2: Check for overlaps between s1 and s2
25 for i in range(m):
26 # Check if suffix of s1 (from position i) matches prefix of s2
27 # Example: s1="abcd", s2="cdef" -> overlap at "cd"
28 if s2.startswith(s1[i:]):
29 return s1[:i] + s2
30
31 # Check if prefix of s1 matches suffix of s2
32 # Example: s1="cdef", s2="abcd" -> overlap at "cd"
33 if s2.endswith(s1[:m - i]):
34 return s2 + s1[m - i:]
35
36 # Case 3: No overlap found, concatenate the strings
37 return s1 + s2
38
1class Solution {
2 /**
3 * Finds the shortest superstring that contains both input strings.
4 * A superstring is a string that contains both s1 and s2 as substrings.
5 *
6 * @param s1 First input string
7 * @param s2 Second input string
8 * @return The shortest string containing both s1 and s2
9 */
10 public String shortestSuperstring(String s1, String s2) {
11 int lengthS1 = s1.length();
12 int lengthS2 = s2.length();
13
14 // Ensure s1 is the shorter string for optimization
15 if (lengthS1 > lengthS2) {
16 return shortestSuperstring(s2, s1);
17 }
18
19 // Check if s2 already contains s1 as a substring
20 if (s2.contains(s1)) {
21 return s2;
22 }
23
24 // Try to find overlaps between s1 and s2
25 for (int i = 0; i < lengthS1; i++) {
26 // Check if suffix of s1 matches prefix of s2
27 // Example: s1="abc", s2="bcd" -> check if "bc" (suffix of s1) matches start of s2
28 if (s2.startsWith(s1.substring(i))) {
29 return s1.substring(0, i) + s2;
30 }
31
32 // Check if prefix of s1 matches suffix of s2
33 // Example: s1="bcd", s2="abc" -> check if "bc" (prefix of s1) matches end of s2
34 if (s2.endsWith(s1.substring(0, lengthS1 - i))) {
35 return s2 + s1.substring(lengthS1 - i);
36 }
37 }
38
39 // No overlap found, concatenate the strings
40 return s1 + s2;
41 }
42}
43
1class Solution {
2public:
3 string shortestSuperstring(string s1, string s2) {
4 int len1 = s1.size();
5 int len2 = s2.size();
6
7 // Ensure s1 is the shorter string for optimization
8 if (len1 > len2) {
9 return shortestSuperstring(s2, s1);
10 }
11
12 // Check if s1 is already a substring of s2
13 if (s2.find(s1) != string::npos) {
14 return s2;
15 }
16
17 // Try to find overlaps between s1 and s2
18 for (int i = 0; i < len1; ++i) {
19 // Check if suffix of s1 (starting from index i) matches prefix of s2
20 // If yes, prepend the non-overlapping part of s1 to s2
21 if (s2.find(s1.substr(i)) == 0) {
22 return s1.substr(0, i) + s2;
23 }
24
25 // Check if prefix of s1 (of length len1-i) matches suffix of s2
26 // If yes, append the non-overlapping part of s1 to s2
27 if (s2.rfind(s1.substr(0, len1 - i)) == s2.size() - (len1 - i)) {
28 return s2 + s1.substr(len1 - i);
29 }
30 }
31
32 // No overlap found, concatenate both strings
33 return s1 + s2;
34 }
35};
36
1/**
2 * Finds the shortest superstring that contains both input strings as substrings.
3 * A superstring is a string that contains all given strings as substrings.
4 *
5 * @param s1 - First input string
6 * @param s2 - Second input string
7 * @returns The shortest superstring containing both s1 and s2
8 */
9function shortestSuperstring(s1: string, s2: string): string {
10 const length1: number = s1.length;
11 const length2: number = s2.length;
12
13 // Ensure s1 is the shorter string for optimization
14 if (length1 > length2) {
15 return shortestSuperstring(s2, s1);
16 }
17
18 // Check if s1 is already a substring of s2
19 if (s2.includes(s1)) {
20 return s2;
21 }
22
23 // Try to find overlaps between the strings
24 for (let i: number = 0; i < length1; i++) {
25 // Check if suffix of s1 matches prefix of s2
26 // Example: s1="abc", s2="bcd" -> check if "bc" (suffix of s1) matches start of s2
27 if (s2.startsWith(s1.slice(i))) {
28 return s1.slice(0, i) + s2;
29 }
30
31 // Check if prefix of s1 matches suffix of s2
32 // Example: s1="bcd", s2="abc" -> check if "bc" (prefix of s1) matches end of s2
33 if (s2.endsWith(s1.slice(0, length1 - i))) {
34 return s2 + s1.slice(length1 - i);
35 }
36 }
37
38 // No overlap found, concatenate the strings
39 return s1 + s2;
40}
41
Time and Space Complexity
The time complexity is O(n^2)
where n
is the maximum length of s1
and s2
.
- The initial length comparison and swap operation takes
O(1)
time - The substring check
s1 in s2
takesO(m * n)
time in the worst case, which is bounded byO(n^2)
sincem ≤ n
after the swap - The for loop runs
m
iterations (wherem ≤ n
) - Inside each iteration:
s2.startswith(s1[i:])
creates a substrings1[i:]
of lengthm-i
and performs comparison inO(m-i)
times2.endswith(s1[:m-i])
creates a substrings1[:m-i]
of lengthm-i
and performs comparison inO(m-i)
time
- The total work in the loop is
O(m) + O(m-1) + ... + O(1) = O(m^2)
, which is bounded byO(n^2)
- String concatenation operations take
O(m+n)
time, bounded byO(n)
The space complexity is O(n)
where n
is the maximum length of s1
and s2
.
- The substring operations
s1[i:]
ands1[:m-i]
create new strings that take up toO(m)
space, bounded byO(n)
- The recursive call in the swap case uses
O(1)
additional stack space since it's tail recursion with at most one recursive call - The final string concatenation creates a new string of length at most
O(m+n)
, which isO(n)
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Overlap Detection Logic
One of the most common mistakes is incorrectly checking for overlaps, particularly mixing up the indices when checking prefix-suffix matches.
Pitfall Example:
# WRONG: Incorrect index usage if s2.startswith(s1[i:]): return s1[:i] + s2 if s2.endswith(s1[:i]): # Should be s1[:m-i], not s1[:i] return s2 + s1[i:] # Should be s1[m-i:], not s1[i:]
Why it's wrong: When checking if s2 ends with a prefix of s1, the prefix length should be m-i
(decreasing as i increases), not i
(increasing). Similarly, the concatenation should append the remaining part of s1, which starts at index m-i
.
Correct Implementation:
if s2.startswith(s1[i:]): return s1[:i] + s2 if s2.endswith(s1[:m-i]): return s2 + s1[m-i:]
2. Missing the Containment Check
Pitfall Example:
def shortestSuperstring(self, s1: str, s2: str) -> str:
# Directly checking overlaps without containment check
for i in range(len(s1)):
if s2.startswith(s1[i:]):
return s1[:i] + s2
# ... other overlap checks
return s1 + s2
Why it's wrong: If one string completely contains the other (e.g., s1="bc", s2="abcd"), the overlap checking loop might not handle this case correctly. The loop would check partial overlaps but miss that the entire s1 is already within s2.
Solution: Always check for complete containment first:
if s1 in s2: return s2 if s2 in s1: return s1
3. Not Finding the Maximum Overlap
Pitfall Example:
# WRONG: Starting from the end and potentially missing larger overlaps
for i in range(m-1, -1, -1): # Iterating backwards
if s2.startswith(s1[i:]):
return s1[:i] + s2
Why it's wrong: Starting from smaller overlaps (larger i values) means you might return a suboptimal solution. For example, with s1="abcde" and s2="cdefg", checking from i=4 downwards would first find the overlap "e" and return "abcdcdefg" (length 9), missing the better overlap "cde" which gives "abcdefg" (length 7).
Solution: Always iterate from i=0 to find the maximum overlap first:
for i in range(m): # Start from 0 to find maximum overlap
if s2.startswith(s1[i:]):
return s1[:i] + s2
4. Forgetting Edge Cases with Empty or Single-Character Strings
Pitfall Example:
def shortestSuperstring(self, s1: str, s2: str) -> str:
m, n = len(s1), len(s2)
# No handling for empty strings
for i in range(m):
# This loop won't execute if m=0
...
Why it's wrong: If s1 is empty, the loop won't execute and the function will return s1 + s2 = "" + s2 = s2, which is correct. However, the logic becomes unclear and the containment check if s1 in s2
might behave unexpectedly with empty strings in some implementations.
Solution: Add explicit handling for edge cases:
if not s1: return s2 if not s2: return s1 if m > n: return self.shortestSuperstring(s2, s1)
5. Not Considering Both Ordering Possibilities
Pitfall Example:
# WRONG: Only checking s1 before s2
for i in range(m):
if s2.startswith(s1[i:]):
return s1[:i] + s2
return s1 + s2 # Missing s2 before s1 case
Why it's wrong: This only checks if we can place s1 before s2 with overlap, but misses cases where s2 before s1 would give a shorter result. For example, with s1="defg" and s2="abcde", the optimal solution is "abcdefg" (s2 before s1 with overlap "de").
Solution: Always check both orderings in the same loop:
for i in range(m):
if s2.startswith(s1[i:]):
return s1[:i] + s2
if s2.endswith(s1[:m-i]):
return s2 + s1[m-i:]
A person thinks of a number between 1 and 1000. You may ask any number questions to them, provided that the question can be answered with either "yes" or "no".
What is the minimum number of questions you needed to ask so that you are guaranteed to know the number that the person is thinking?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!