Facebook Pixel

1055. Shortest Way to Form String 🔒

Problem Description

You are given two strings: source and target. A subsequence is a string formed by deleting some (possibly zero) characters from the original string without changing the order of the remaining characters. For example, "ace" is a subsequence of "abcde" (formed by taking the 1st, 3rd, and 5th characters), but "aec" is not a valid subsequence since it doesn't maintain the relative order.

Your task is to find the minimum number of subsequences from the source string that, when concatenated together, form the target string. If it's impossible to form the target string using any number of subsequences from source, return -1.

For example:

  • If source = "abc" and target = "abcbc", you would need 2 subsequences: first taking "abc" and then taking "bc" from another iteration through source, giving you "abc" + "bc" = "abcbc".
  • If source = "abc" and target = "acdbc", it would be impossible since 'd' doesn't exist in source, so you'd return -1.

The key insight is that you can reuse the source string multiple times, taking different subsequences each time, and concatenate them to build the target string. The goal is to minimize the number of such subsequences needed.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key observation is that we want to use as many characters as possible from each pass through the source string to minimize the total number of subsequences needed.

Think of it this way: we're trying to "match" characters from target using characters from source. Since we need to maintain the relative order of characters (subsequence property), once we pick a character from source, we can only pick characters that come after it in the same pass.

This naturally leads to a greedy approach: for each pass through source, we try to match as many consecutive characters from target as possible. We scan through source from left to right, and whenever we find a character that matches the current position in target, we advance our position in target. This ensures we're taking the maximum possible subsequence in each iteration.

For example, if source = "abc" and target = "abcbc":

  • First pass: we can match "abc" from target (positions 0, 1, 2)
  • Second pass: we can only match "bc" from target (positions 3, 4)
  • Total: 2 subsequences

The impossibility check is straightforward: if we go through the entire source string and can't match even a single character from our current position in target, it means that character doesn't exist in source at all, making it impossible to form target.

The two-pointer technique naturally implements this greedy strategy: one pointer (j) tracks our progress in target, while the other pointer (i) scans through source looking for matches. Each complete scan through source represents one subsequence, and we continue until we've matched all characters in target.

Learn more about Greedy, Two Pointers and Binary Search patterns.

Solution Approach

The solution implements a two-pointer greedy approach to find the minimum number of subsequences needed.

Main Algorithm Structure:

The solution uses a helper function f(i, j) that performs one pass through the source string starting from index i, trying to match characters from target starting at index j. Here's how it works:

  1. Helper Function f(i, j):

    • Takes starting positions for both strings
    • Iterates through source from position i to the end
    • When source[i] == target[j], advances j to match the next character in target
    • Always advances i to continue scanning through source
    • Returns the new position j after matching as many characters as possible
  2. Main Loop Logic:

    ans = j = 0
    while j < n:
        k = f(0, j)
        if k == j:
            return -1
        j = k
        ans += 1
    • Initialize answer counter ans and target pointer j to 0
    • While we haven't matched all characters in target:
      • Call f(0, j) to scan through source from the beginning
      • Store the new position as k
      • If k == j, no progress was made (no matching character found), return -1
      • Otherwise, update j = k and increment the subsequence count ans

Key Implementation Details:

  • Variables:

    • m, n store the lengths of source and target respectively
    • j tracks our current position in target
    • ans counts the number of subsequences used
  • Impossibility Check: When f(0, j) returns the same value as j, it means we couldn't find the character target[j] anywhere in source, making it impossible to form the target string.

  • Time Complexity: O(m × n) where each character in target might require scanning through the entire source string in the worst case.

  • Space Complexity: O(1) as we only use a constant amount of extra space for pointers and counters.

The elegance of this solution lies in its simplicity - it greedily takes as much as possible from each pass through source, ensuring we use the minimum number of subsequences.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's trace through the algorithm with source = "abc" and target = "abcbc".

Initial Setup:

  • source = "abc" (length m = 3)
  • target = "abcbc" (length n = 5)
  • ans = 0, j = 0

First Iteration (First Subsequence):

  • Call f(0, 0) to scan through source starting at index 0, matching target from index 0
  • Inside f(0, 0):
    • i=0, j=0: source[0]='a' == target[0]='a' ✓ → j becomes 1
    • i=1, j=1: source[1]='b' == target[1]='b' ✓ → j becomes 2
    • i=2, j=2: source[2]='c' == target[2]='c' ✓ → j becomes 3
    • Function returns j=3
  • Back in main loop: k=3, since k(3) ≠ j(0), we made progress
  • Update: j=3, ans=1
  • We've matched "abc" from target (indices 0-2)

Second Iteration (Second Subsequence):

  • j=3 < n=5, so continue
  • Call f(0, 3) to scan through source again, matching target from index 3
  • Inside f(0, 3):
    • i=0, j=3: source[0]='a' ≠ target[3]='b' → j stays 3
    • i=1, j=3: source[1]='b' == target[3]='b' ✓ → j becomes 4
    • i=2, j=4: source[2]='c' == target[4]='c' ✓ → j becomes 5
    • Function returns j=5
  • Back in main loop: k=5, since k(5) ≠ j(3), we made progress
  • Update: j=5, ans=2
  • We've matched "bc" from target (indices 3-4)

Termination:

  • j=5 equals n=5, so we've matched all characters in target
  • Return ans=2

The algorithm successfully found that we need 2 subsequences from source to form target:

  1. First pass: "abc" (taking all characters)
  2. Second pass: "bc" (skipping 'a', taking 'b' and 'c')

Concatenating these gives us "abc" + "bc" = "abcbc", which equals our target.

Solution Implementation

1class Solution:
2    def shortestWay(self, source: str, target: str) -> int:
3        def find_matching_chars(source_start: int, target_start: int) -> int:
4            """
5            Greedily match characters from source starting at source_start
6            with target starting at target_start.
7            Returns the new position in target after matching.
8            """
9            source_idx = source_start
10            target_idx = target_start
11          
12            # Try to match as many characters as possible in one pass through source
13            while source_idx < source_length and target_idx < target_length:
14                if source[source_idx] == target[target_idx]:
15                    target_idx += 1
16                source_idx += 1
17          
18            return target_idx
19      
20        # Get lengths of both strings
21        source_length = len(source)
22        target_length = len(target)
23      
24        # Initialize counters
25        subsequence_count = 0  # Number of times we need to use source as subsequence
26        target_position = 0    # Current position in target string
27      
28        # Keep matching until we've covered all characters in target
29        while target_position < target_length:
30            # Try to match characters starting from beginning of source
31            new_position = find_matching_chars(0, target_position)
32          
33            # If no progress was made, target contains a character not in source
34            if new_position == target_position:
35                return -1
36          
37            # Update position and increment subsequence count
38            target_position = new_position
39            subsequence_count += 1
40      
41        return subsequence_count
42
1class Solution {
2    public int shortestWay(String source, String target) {
3        int sourceLength = source.length();
4        int targetLength = target.length();
5        int subsequenceCount = 0;
6        int targetIndex = 0;
7      
8        // Continue until all characters in target are matched
9        while (targetIndex < targetLength) {
10            int sourceIndex = 0;
11            boolean foundMatch = false;
12          
13            // Try to match as many characters as possible in one pass through source
14            while (sourceIndex < sourceLength && targetIndex < targetLength) {
15                // If characters match, move target pointer forward
16                if (source.charAt(sourceIndex) == target.charAt(targetIndex)) {
17                    foundMatch = true;
18                    targetIndex++;
19                }
20                sourceIndex++;
21            }
22          
23            // If no character was matched in this pass, target cannot be formed
24            if (!foundMatch) {
25                return -1;
26            }
27          
28            // Increment count as we used source string once
29            subsequenceCount++;
30        }
31      
32        return subsequenceCount;
33    }
34}
35
1class Solution {
2public:
3    int shortestWay(string source, string target) {
4        int sourceLength = source.size();
5        int targetLength = target.size();
6        int subsequenceCount = 0;  // Number of subsequences needed
7        int targetIndex = 0;        // Current position in target string
8      
9        // Keep forming subsequences until we've covered the entire target
10        while (targetIndex < targetLength) {
11            int sourceIndex = 0;
12            bool foundMatch = false;  // Track if we found at least one matching character
13          
14            // Try to match as many characters as possible in one pass through source
15            while (sourceIndex < sourceLength && targetIndex < targetLength) {
16                // If characters match, advance in target and mark that we found a match
17                if (source[sourceIndex] == target[targetIndex]) {
18                    foundMatch = true;
19                    ++targetIndex;
20                }
21                ++sourceIndex;
22            }
23          
24            // If no character was matched in this pass, target cannot be formed
25            if (!foundMatch) {
26                return -1;
27            }
28          
29            // We used one subsequence from source
30            ++subsequenceCount;
31        }
32      
33        return subsequenceCount;
34    }
35};
36
1function shortestWay(source: string, target: string): number {
2    const sourceLength: number = source.length;
3    const targetLength: number = target.length;
4    let subsequenceCount: number = 0;  // Number of subsequences needed
5    let targetIndex: number = 0;        // Current position in target string
6  
7    // Keep forming subsequences until we've covered the entire target
8    while (targetIndex < targetLength) {
9        let sourceIndex: number = 0;
10        let foundMatch: boolean = false;  // Track if we found at least one matching character
11      
12        // Try to match as many characters as possible in one pass through source
13        while (sourceIndex < sourceLength && targetIndex < targetLength) {
14            // If characters match, advance in target and mark that we found a match
15            if (source[sourceIndex] === target[targetIndex]) {
16                foundMatch = true;
17                targetIndex++;
18            }
19            sourceIndex++;
20        }
21      
22        // If no character was matched in this pass, target cannot be formed
23        if (!foundMatch) {
24            return -1;
25        }
26      
27        // We used one subsequence from source
28        subsequenceCount++;
29    }
30  
31    return subsequenceCount;
32}
33

Time and Space Complexity

Time Complexity: O(m × n)

The algorithm uses a while loop that continues until all characters in target are processed. In each iteration of the outer while loop, the helper function f(i, j) is called, which iterates through the entire source string (taking O(m) time) to match as many characters as possible from target.

In the worst case, we might need to traverse source once for each character in target. This happens when each pass through source only matches one character from target. For example, if source = "abc" and target = "aaa", we need 3 passes through source, matching one 'a' in each pass.

Therefore, the total time complexity is O(m × n), where m is the length of source and n is the length of target.

Space Complexity: O(1)

The algorithm only uses a constant amount of extra space. The variables used are:

  • m, n: store the lengths of the strings
  • ans: counter for the number of subsequences
  • j, k: pointers for tracking positions in target
  • i: local variable in function f for tracking position in source

All these variables use constant space regardless of the input size, resulting in O(1) space complexity.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Inefficient Character Existence Check

A common pitfall is not pre-checking whether all characters in target exist in source. Without this check, the algorithm might scan through the entire source string multiple times before discovering an impossible character, wasting computation time.

Problem Example:

source = "abc"
target = "abcabcabcabcabcabcabcabcabcabcabcd"  # 'd' at the end

The algorithm would process almost the entire target string before failing at the last character.

Solution: Pre-validate using a set to check character existence:

def shortestWay(self, source: str, target: str) -> int:
    # Early validation - O(m + n) time, O(m) space
    source_chars = set(source)
    for char in target:
        if char not in source_chars:
            return -1
  
    # Continue with the main algorithm...

2. Suboptimal Two-Pointer Reset Strategy

Another pitfall is always resetting the source pointer to 0 after each subsequence. While correct, this misses optimization opportunities when we could continue from where we left off if no match was found.

Enhanced Approach with Circular Scanning:

def shortestWay(self, source: str, target: str) -> int:
    source_chars = set(source)
  
    # Pre-check for impossible characters
    for char in target:
        if char not in source_chars:
            return -1
  
    subsequence_count = 1
    source_idx = 0
  
    for target_char in target:
        # Find the character in current subsequence
        start_idx = source_idx
      
        while source[source_idx] != target_char:
            source_idx = (source_idx + 1) % len(source)
          
            # If we've made a full circle without finding the character
            if source_idx == start_idx:
                # Start a new subsequence
                subsequence_count += 1
                # Search from beginning of source
                while source[source_idx] != target_char:
                    source_idx += 1
                break
      
        # Move to next position for next iteration
        source_idx = (source_idx + 1) % len(source)
      
        # If we've wrapped around, we need a new subsequence
        if source_idx == 0 and target_char != target[-1]:
            subsequence_count += 1
  
    return subsequence_count

3. Memory-Intensive Preprocessing

Some developers might be tempted to precompute all possible subsequences or create complex data structures, leading to excessive memory usage.

Pitfall Example:

# DON'T DO THIS - generates all possible subsequences
def generate_all_subsequences(s):
    subsequences = []
    for mask in range(1, 2**len(s)):
        subseq = ""
        for i in range(len(s)):
            if mask & (1 << i):
                subseq += s[i]
        subsequences.append(subseq)
    return subsequences

Better Alternative - Use Index Mapping:

def shortestWay(self, source: str, target: str) -> int:
    # Create a map of character to indices for O(1) lookups
    char_to_indices = {}
    for i, char in enumerate(source):
        if char not in char_to_indices:
            char_to_indices[char] = []
        char_to_indices[char].append(i)
  
    # Check if all target characters exist
    for char in target:
        if char not in char_to_indices:
            return -1
  
    subsequence_count = 1
    current_pos = -1
  
    for char in target:
        indices = char_to_indices[char]
      
        # Binary search for next occurrence after current_pos
        left, right = 0, len(indices) - 1
        next_pos = -1
      
        while left <= right:
            mid = (left + right) // 2
            if indices[mid] > current_pos:
                next_pos = indices[mid]
                right = mid - 1
            else:
                left = mid + 1
      
        if next_pos == -1:
            # Need new subsequence
            subsequence_count += 1
            current_pos = indices[0]
        else:
            current_pos = next_pos
  
    return subsequence_count

This approach uses O(m) extra space and provides O(n log m) time complexity with binary search optimization.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which of these properties could exist for a graph but not a tree?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More