1055. Shortest Way to Form String 🔒
Problem Description
You are given two strings: source
and target
. A subsequence is a string formed by deleting some (possibly zero) characters from the original string without changing the order of the remaining characters. For example, "ace"
is a subsequence of "abcde"
(formed by taking the 1st, 3rd, and 5th characters), but "aec"
is not a valid subsequence since it doesn't maintain the relative order.
Your task is to find the minimum number of subsequences from the source
string that, when concatenated together, form the target
string. If it's impossible to form the target
string using any number of subsequences from source
, return -1
.
For example:
- If
source = "abc"
andtarget = "abcbc"
, you would need 2 subsequences: first taking"abc"
and then taking"bc"
from another iteration through source, giving you"abc" + "bc" = "abcbc"
. - If
source = "abc"
andtarget = "acdbc"
, it would be impossible since'd'
doesn't exist in source, so you'd return-1
.
The key insight is that you can reuse the source string multiple times, taking different subsequences each time, and concatenate them to build the target string. The goal is to minimize the number of such subsequences needed.
Intuition
The key observation is that we want to use as many characters as possible from each pass through the source
string to minimize the total number of subsequences needed.
Think of it this way: we're trying to "match" characters from target
using characters from source
. Since we need to maintain the relative order of characters (subsequence property), once we pick a character from source
, we can only pick characters that come after it in the same pass.
This naturally leads to a greedy approach: for each pass through source
, we try to match as many consecutive characters from target
as possible. We scan through source
from left to right, and whenever we find a character that matches the current position in target
, we advance our position in target
. This ensures we're taking the maximum possible subsequence in each iteration.
For example, if source = "abc"
and target = "abcbc"
:
- First pass: we can match
"abc"
from target (positions 0, 1, 2) - Second pass: we can only match
"bc"
from target (positions 3, 4) - Total: 2 subsequences
The impossibility check is straightforward: if we go through the entire source
string and can't match even a single character from our current position in target
, it means that character doesn't exist in source
at all, making it impossible to form target
.
The two-pointer technique naturally implements this greedy strategy: one pointer (j
) tracks our progress in target
, while the other pointer (i
) scans through source
looking for matches. Each complete scan through source
represents one subsequence, and we continue until we've matched all characters in target
.
Learn more about Greedy, Two Pointers and Binary Search patterns.
Solution Approach
The solution implements a two-pointer greedy approach to find the minimum number of subsequences needed.
Main Algorithm Structure:
The solution uses a helper function f(i, j)
that performs one pass through the source
string starting from index i
, trying to match characters from target
starting at index j
. Here's how it works:
-
Helper Function
f(i, j)
:- Takes starting positions for both strings
- Iterates through
source
from positioni
to the end - When
source[i] == target[j]
, advancesj
to match the next character intarget
- Always advances
i
to continue scanning throughsource
- Returns the new position
j
after matching as many characters as possible
-
Main Loop Logic:
ans = j = 0 while j < n: k = f(0, j) if k == j: return -1 j = k ans += 1
- Initialize answer counter
ans
and target pointerj
to 0 - While we haven't matched all characters in
target
:- Call
f(0, j)
to scan throughsource
from the beginning - Store the new position as
k
- If
k == j
, no progress was made (no matching character found), return-1
- Otherwise, update
j = k
and increment the subsequence countans
- Call
- Initialize answer counter
Key Implementation Details:
-
Variables:
m, n
store the lengths ofsource
andtarget
respectivelyj
tracks our current position intarget
ans
counts the number of subsequences used
-
Impossibility Check: When
f(0, j)
returns the same value asj
, it means we couldn't find the charactertarget[j]
anywhere insource
, making it impossible to form the target string. -
Time Complexity:
O(m × n)
where each character intarget
might require scanning through the entiresource
string in the worst case. -
Space Complexity:
O(1)
as we only use a constant amount of extra space for pointers and counters.
The elegance of this solution lies in its simplicity - it greedily takes as much as possible from each pass through source
, ensuring we use the minimum number of subsequences.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's trace through the algorithm with source = "abc"
and target = "abcbc"
.
Initial Setup:
source = "abc"
(length m = 3)target = "abcbc"
(length n = 5)ans = 0
,j = 0
First Iteration (First Subsequence):
- Call
f(0, 0)
to scan through source starting at index 0, matching target from index 0 - Inside
f(0, 0)
:- i=0, j=0: source[0]='a' == target[0]='a' ✓ → j becomes 1
- i=1, j=1: source[1]='b' == target[1]='b' ✓ → j becomes 2
- i=2, j=2: source[2]='c' == target[2]='c' ✓ → j becomes 3
- Function returns j=3
- Back in main loop: k=3, since k(3) ≠j(0), we made progress
- Update: j=3, ans=1
- We've matched "abc" from target (indices 0-2)
Second Iteration (Second Subsequence):
- j=3 < n=5, so continue
- Call
f(0, 3)
to scan through source again, matching target from index 3 - Inside
f(0, 3)
:- i=0, j=3: source[0]='a' ≠target[3]='b' → j stays 3
- i=1, j=3: source[1]='b' == target[3]='b' ✓ → j becomes 4
- i=2, j=4: source[2]='c' == target[4]='c' ✓ → j becomes 5
- Function returns j=5
- Back in main loop: k=5, since k(5) ≠j(3), we made progress
- Update: j=5, ans=2
- We've matched "bc" from target (indices 3-4)
Termination:
- j=5 equals n=5, so we've matched all characters in target
- Return ans=2
The algorithm successfully found that we need 2 subsequences from source to form target:
- First pass: "abc" (taking all characters)
- Second pass: "bc" (skipping 'a', taking 'b' and 'c')
Concatenating these gives us "abc" + "bc" = "abcbc", which equals our target.
Solution Implementation
1class Solution:
2 def shortestWay(self, source: str, target: str) -> int:
3 def find_matching_chars(source_start: int, target_start: int) -> int:
4 """
5 Greedily match characters from source starting at source_start
6 with target starting at target_start.
7 Returns the new position in target after matching.
8 """
9 source_idx = source_start
10 target_idx = target_start
11
12 # Try to match as many characters as possible in one pass through source
13 while source_idx < source_length and target_idx < target_length:
14 if source[source_idx] == target[target_idx]:
15 target_idx += 1
16 source_idx += 1
17
18 return target_idx
19
20 # Get lengths of both strings
21 source_length = len(source)
22 target_length = len(target)
23
24 # Initialize counters
25 subsequence_count = 0 # Number of times we need to use source as subsequence
26 target_position = 0 # Current position in target string
27
28 # Keep matching until we've covered all characters in target
29 while target_position < target_length:
30 # Try to match characters starting from beginning of source
31 new_position = find_matching_chars(0, target_position)
32
33 # If no progress was made, target contains a character not in source
34 if new_position == target_position:
35 return -1
36
37 # Update position and increment subsequence count
38 target_position = new_position
39 subsequence_count += 1
40
41 return subsequence_count
42
1class Solution {
2 public int shortestWay(String source, String target) {
3 int sourceLength = source.length();
4 int targetLength = target.length();
5 int subsequenceCount = 0;
6 int targetIndex = 0;
7
8 // Continue until all characters in target are matched
9 while (targetIndex < targetLength) {
10 int sourceIndex = 0;
11 boolean foundMatch = false;
12
13 // Try to match as many characters as possible in one pass through source
14 while (sourceIndex < sourceLength && targetIndex < targetLength) {
15 // If characters match, move target pointer forward
16 if (source.charAt(sourceIndex) == target.charAt(targetIndex)) {
17 foundMatch = true;
18 targetIndex++;
19 }
20 sourceIndex++;
21 }
22
23 // If no character was matched in this pass, target cannot be formed
24 if (!foundMatch) {
25 return -1;
26 }
27
28 // Increment count as we used source string once
29 subsequenceCount++;
30 }
31
32 return subsequenceCount;
33 }
34}
35
1class Solution {
2public:
3 int shortestWay(string source, string target) {
4 int sourceLength = source.size();
5 int targetLength = target.size();
6 int subsequenceCount = 0; // Number of subsequences needed
7 int targetIndex = 0; // Current position in target string
8
9 // Keep forming subsequences until we've covered the entire target
10 while (targetIndex < targetLength) {
11 int sourceIndex = 0;
12 bool foundMatch = false; // Track if we found at least one matching character
13
14 // Try to match as many characters as possible in one pass through source
15 while (sourceIndex < sourceLength && targetIndex < targetLength) {
16 // If characters match, advance in target and mark that we found a match
17 if (source[sourceIndex] == target[targetIndex]) {
18 foundMatch = true;
19 ++targetIndex;
20 }
21 ++sourceIndex;
22 }
23
24 // If no character was matched in this pass, target cannot be formed
25 if (!foundMatch) {
26 return -1;
27 }
28
29 // We used one subsequence from source
30 ++subsequenceCount;
31 }
32
33 return subsequenceCount;
34 }
35};
36
1function shortestWay(source: string, target: string): number {
2 const sourceLength: number = source.length;
3 const targetLength: number = target.length;
4 let subsequenceCount: number = 0; // Number of subsequences needed
5 let targetIndex: number = 0; // Current position in target string
6
7 // Keep forming subsequences until we've covered the entire target
8 while (targetIndex < targetLength) {
9 let sourceIndex: number = 0;
10 let foundMatch: boolean = false; // Track if we found at least one matching character
11
12 // Try to match as many characters as possible in one pass through source
13 while (sourceIndex < sourceLength && targetIndex < targetLength) {
14 // If characters match, advance in target and mark that we found a match
15 if (source[sourceIndex] === target[targetIndex]) {
16 foundMatch = true;
17 targetIndex++;
18 }
19 sourceIndex++;
20 }
21
22 // If no character was matched in this pass, target cannot be formed
23 if (!foundMatch) {
24 return -1;
25 }
26
27 // We used one subsequence from source
28 subsequenceCount++;
29 }
30
31 return subsequenceCount;
32}
33
Time and Space Complexity
Time Complexity: O(m × n)
The algorithm uses a while loop that continues until all characters in target
are processed. In each iteration of the outer while loop, the helper function f(i, j)
is called, which iterates through the entire source
string (taking O(m)
time) to match as many characters as possible from target
.
In the worst case, we might need to traverse source
once for each character in target
. This happens when each pass through source
only matches one character from target
. For example, if source = "abc"
and target = "aaa"
, we need 3 passes through source
, matching one 'a' in each pass.
Therefore, the total time complexity is O(m × n)
, where m
is the length of source
and n
is the length of target
.
Space Complexity: O(1)
The algorithm only uses a constant amount of extra space. The variables used are:
m
,n
: store the lengths of the stringsans
: counter for the number of subsequencesj
,k
: pointers for tracking positions intarget
i
: local variable in functionf
for tracking position insource
All these variables use constant space regardless of the input size, resulting in O(1)
space complexity.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Inefficient Character Existence Check
A common pitfall is not pre-checking whether all characters in target
exist in source
. Without this check, the algorithm might scan through the entire source
string multiple times before discovering an impossible character, wasting computation time.
Problem Example:
source = "abc" target = "abcabcabcabcabcabcabcabcabcabcabcd" # 'd' at the end
The algorithm would process almost the entire target string before failing at the last character.
Solution: Pre-validate using a set to check character existence:
def shortestWay(self, source: str, target: str) -> int:
# Early validation - O(m + n) time, O(m) space
source_chars = set(source)
for char in target:
if char not in source_chars:
return -1
# Continue with the main algorithm...
2. Suboptimal Two-Pointer Reset Strategy
Another pitfall is always resetting the source pointer to 0 after each subsequence. While correct, this misses optimization opportunities when we could continue from where we left off if no match was found.
Enhanced Approach with Circular Scanning:
def shortestWay(self, source: str, target: str) -> int:
source_chars = set(source)
# Pre-check for impossible characters
for char in target:
if char not in source_chars:
return -1
subsequence_count = 1
source_idx = 0
for target_char in target:
# Find the character in current subsequence
start_idx = source_idx
while source[source_idx] != target_char:
source_idx = (source_idx + 1) % len(source)
# If we've made a full circle without finding the character
if source_idx == start_idx:
# Start a new subsequence
subsequence_count += 1
# Search from beginning of source
while source[source_idx] != target_char:
source_idx += 1
break
# Move to next position for next iteration
source_idx = (source_idx + 1) % len(source)
# If we've wrapped around, we need a new subsequence
if source_idx == 0 and target_char != target[-1]:
subsequence_count += 1
return subsequence_count
3. Memory-Intensive Preprocessing
Some developers might be tempted to precompute all possible subsequences or create complex data structures, leading to excessive memory usage.
Pitfall Example:
# DON'T DO THIS - generates all possible subsequences
def generate_all_subsequences(s):
subsequences = []
for mask in range(1, 2**len(s)):
subseq = ""
for i in range(len(s)):
if mask & (1 << i):
subseq += s[i]
subsequences.append(subseq)
return subsequences
Better Alternative - Use Index Mapping:
def shortestWay(self, source: str, target: str) -> int:
# Create a map of character to indices for O(1) lookups
char_to_indices = {}
for i, char in enumerate(source):
if char not in char_to_indices:
char_to_indices[char] = []
char_to_indices[char].append(i)
# Check if all target characters exist
for char in target:
if char not in char_to_indices:
return -1
subsequence_count = 1
current_pos = -1
for char in target:
indices = char_to_indices[char]
# Binary search for next occurrence after current_pos
left, right = 0, len(indices) - 1
next_pos = -1
while left <= right:
mid = (left + right) // 2
if indices[mid] > current_pos:
next_pos = indices[mid]
right = mid - 1
else:
left = mid + 1
if next_pos == -1:
# Need new subsequence
subsequence_count += 1
current_pos = indices[0]
else:
current_pos = next_pos
return subsequence_count
This approach uses O(m) extra space and provides O(n log m) time complexity with binary search optimization.
Which of these properties could exist for a graph but not a tree?
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Tech Interview Pattern Two Pointers Introduction If you prefer videos here's a super quick introduction to Two Pointers div class responsive iframe iframe src https www youtube com embed xZ4AfXHQ1VQ title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture allowfullscreen iframe div Two pointers is a common interview
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
Want a Structured Path to Master System Design Too? Don’t Miss This!