Facebook Pixel

3291. Minimum Number of Valid Strings to Form Target I

MediumTrieSegment TreeArrayStringBinary SearchDynamic ProgrammingString MatchingHash FunctionRolling Hash
Leetcode Link

Problem Description

You are given an array of strings words and a string target.

A string x is considered valid if it is a prefix of at least one string in the words array. For example, if words = ["abc", "def"], then "a", "ab", "abc", "d", "de", and "def" are all valid strings, but "b", "ac", or "ef" are not valid.

Your task is to find the minimum number of valid strings that can be concatenated together to form the target string. The valid strings can be used multiple times and in any order.

If it's impossible to form the target string using valid strings, return -1.

For example:

  • If words = ["abc", "def"] and target = "abcd", you could use "abc" (a valid prefix of "abc") and "d" (a valid prefix of "def") to form "abcd", requiring 2 valid strings.
  • If words = ["abc"] and target = "def", it's impossible to form "def" using prefixes of "abc", so the answer would be -1.

The problem requires finding the optimal way to break down the target string into the fewest number of valid prefixes from the words array.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we need to efficiently check if any substring starting from a position in target matches a prefix of any word in words. This immediately suggests using a Trie (prefix tree) data structure, which excels at prefix matching operations.

Think of the problem as trying to "cover" the entire target string using valid prefixes. At each position in target, we have choices - we can use any valid prefix that starts at that position. This creates overlapping subproblems, which hints at a dynamic programming approach.

Consider how we'd solve this step by step:

  1. Starting from position i in target, we need to find all possible valid prefixes that match
  2. For each valid prefix of length len, we jump to position i + len and solve the subproblem from there
  3. We want the minimum among all these choices

The challenge is that checking all possible substrings against all words would be inefficient. This is where the Trie shines - by storing all words in a Trie, we can traverse it character by character while simultaneously traversing the target string. As soon as we can't find the next character in the Trie, we know there are no more valid prefixes from that starting position.

The recursive nature of the problem becomes clear: dfs(i) = minimum number of valid strings needed to form target[i:]. For each valid prefix found starting at position i, we recursively solve for the remaining substring. Since we might revisit the same positions multiple times through different paths, memoization prevents redundant calculations, transforming our solution into an efficient dynamic programming approach.

The combination of Trie for efficient prefix matching and memoized recursion for optimal substructure gives us an elegant solution to this string segmentation problem.

Learn more about Trie, Segment Tree, Binary Search and Dynamic Programming patterns.

Solution Approach

The solution uses a Trie data structure combined with memoized recursion to efficiently find the minimum number of valid strings needed.

Trie Implementation

First, we build a Trie to store all words from the words array:

class [Trie](/problems/trie_intro):
    def __init__(self):
        self.children: List[Optional[Trie]] = [None] * 26  # 26 lowercase letters
  
    def insert(self, w: str):
        node = self
        for i in map(lambda c: ord(c) - 97, w):  # Convert 'a'-'z' to 0-25
            if node.children[i] is None:
                node.children[i] = Trie()
            node = node.children[i]

The Trie allows us to efficiently check if a substring of target matches any prefix of words in words.

Dynamic Programming with Memoization

The core algorithm uses a recursive function dfs(i) that returns the minimum number of valid strings needed to form target[i:] (the substring from index i to the end):

@cache
def dfs(i: int) -> int:
    if i >= n:  # Base case: reached end of target
        return 0
  
    node = [trie](/problems/trie_intro)
    ans = inf
  
    # Try all possible prefixes starting from position i
    for j in range(i, n):
        k = ord(target[j]) - 97  # Convert character to index
        if node.children[k] is None:
            break  # No more valid prefixes from this position
      
        node = node.children[k]
        ans = min(ans, 1 + dfs(j + 1))  # Use this prefix and recurse
  
    return ans

Algorithm Flow

  1. Build the Trie: Insert all words from words into the Trie structure.

  2. Start recursion: Call dfs(0) to find the minimum number of strings needed starting from index 0.

  3. At each position i:

    • Try to match prefixes starting from target[i]
    • Traverse the Trie character by character
    • For each valid prefix found (ending at position j), recursively solve for dfs(j + 1)
    • Take the minimum among all valid choices and add 1 (for the current prefix used)
  4. Memoization: The @cache decorator automatically memorizes results for each position i, preventing redundant calculations when the same position is reached through different paths.

  5. Return result: If dfs(0) < inf, return it; otherwise return -1 (impossible to form target).

Time and Space Complexity

  • Time: O(n²) in the worst case, where n is the length of target. For each starting position, we might check up to n characters.
  • Space: O(m × k + n) where m is the total number of characters in all words (for the Trie), k is the average word length, and n for the memoization cache.

The elegance of this solution lies in combining the Trie's prefix-matching capability with dynamic programming's ability to solve overlapping subproblems efficiently.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a concrete example to illustrate the solution approach.

Given:

  • words = ["ab", "abc", "cd", "def"]
  • target = "abcd"

Step 1: Build the Trie

We insert all words into the Trie structure:

        root
       /    \
      a      c      d
     /       |      |
    b        d      e
   /                |
  c                 f

Step 2: Start DFS from position 0

Call dfs(0) to find minimum strings needed for target[0:] = "abcd"

At position 0 (character 'a'):

  • Start traversing Trie from root
  • Find 'a' in Trie ✓
  • Continue to position 1

At position 1 (character 'b'):

  • 'b' exists as child of 'a' ✓
  • We found valid prefix "ab" → Calculate: 1 + dfs(2)
  • Continue to position 2

At position 2 (character 'c'):

  • 'c' exists as child of 'b' ✓
  • We found valid prefix "abc" → Calculate: 1 + dfs(3)
  • Continue to position 3

At position 3 (character 'd'):

  • 'd' doesn't exist as child of 'c' ✗
  • Stop traversing from position 0

So from position 0, we have two choices:

  • Use "ab" → 1 + dfs(2)
  • Use "abc" → 1 + dfs(3)

Step 3: Solve dfs(2)

At position 2 (character 'c'):

  • Start traversing Trie from root
  • Find 'c' in Trie ✓
  • Continue to position 3

At position 3 (character 'd'):

  • 'd' exists as child of 'c' ✓
  • We found valid prefix "cd" → Calculate: 1 + dfs(4)

dfs(4) returns 0 (base case - reached end of target)

Therefore, dfs(2) = 1 + 0 = 1

Step 4: Solve dfs(3)

At position 3 (character 'd'):

  • Start traversing Trie from root
  • Find 'd' in Trie ✓
  • Continue to position 4 (out of bounds)
  • We found valid prefix "d" → Calculate: 1 + dfs(4)

dfs(4) returns 0 (base case)

Therefore, dfs(3) = 1 + 0 = 1

Step 5: Calculate final result

Back to dfs(0):

  • Using "ab": 1 + dfs(2) = 1 + 1 = 2
  • Using "abc": 1 + dfs(3) = 1 + 1 = 2

Both paths give us 2, so dfs(0) = 2

Answer: 2

The target "abcd" can be formed using:

  • "ab" + "cd" (2 valid strings)
  • "abc" + "d" (2 valid strings)

The memoization ensures that dfs(3) and dfs(4) are only computed once, even though they might be reached through different paths. This optimization transforms what could be exponential time complexity into polynomial time.

Solution Implementation

1from typing import List, Optional
2from functools import cache
3from math import inf
4
5
6class Trie:
7    """Trie (prefix tree) data structure for efficient string prefix matching."""
8  
9    def __init__(self):
10        # Initialize children array for 26 lowercase letters
11        self.children: List[Optional[Trie]] = [None] * 26
12  
13    def insert(self, word: str) -> None:
14        """Insert a word into the trie."""
15        node = self
16        for char in word:
17            # Convert character to index (0-25 for 'a'-'z')
18            index = ord(char) - ord('a')
19            if node.children[index] is None:
20                node.children[index] = Trie()
21            node = node.children[index]
22
23
24class Solution:
25    def minValidStrings(self, words: List[str], target: str) -> int:
26        """
27        Find minimum number of valid strings from words needed to form target.
28        Each valid string is a prefix of some word in words array.
29      
30        Args:
31            words: List of available words
32            target: Target string to form
33          
34        Returns:
35            Minimum number of valid strings needed, or -1 if impossible
36        """
37      
38        @cache
39        def dfs(start_index: int) -> int:
40            """
41            Dynamic programming function to find minimum strings needed
42            starting from given index in target.
43          
44            Args:
45                start_index: Current position in target string
46              
47            Returns:
48                Minimum number of strings needed from this position
49            """
50            # Base case: reached end of target string
51            if start_index >= target_length:
52                return 0
53          
54            # Try to match prefixes starting from current position
55            node = trie_root
56            min_strings = inf
57          
58            # Explore all possible prefix matches from current position
59            for end_index in range(start_index, target_length):
60                char_index = ord(target[end_index]) - ord('a')
61              
62                # No matching prefix found, stop exploring
63                if node.children[char_index] is None:
64                    break
65              
66                # Move to next node in trie
67                node = node.children[char_index]
68              
69                # Calculate minimum using current prefix + remaining substring
70                min_strings = min(min_strings, 1 + dfs(end_index + 1))
71          
72            return min_strings
73      
74        # Build trie from all words
75        trie_root = Trie()
76        for word in words:
77            trie_root.insert(word)
78      
79        # Initialize variables
80        target_length = len(target)
81      
82        # Find minimum valid strings needed
83        result = dfs(0)
84      
85        # Return result or -1 if impossible
86        return result if result < inf else -1
87
1/**
2 * Trie (Prefix Tree) data structure for efficient string prefix matching
3 */
4class Trie {
5    // Array to store 26 children nodes (one for each lowercase letter a-z)
6    Trie[] children = new Trie[26];
7
8    /**
9     * Inserts a word into the trie
10     * @param word The word to insert into the trie
11     */
12    void insert(String word) {
13        Trie currentNode = this;
14      
15        // Traverse each character in the word
16        for (int i = 0; i < word.length(); i++) {
17            // Calculate the index for the current character (0-25 for a-z)
18            int charIndex = word.charAt(i) - 'a';
19          
20            // Create a new node if it doesn't exist
21            if (currentNode.children[charIndex] == null) {
22                currentNode.children[charIndex] = new Trie();
23            }
24          
25            // Move to the child node
26            currentNode = currentNode.children[charIndex];
27        }
28    }
29}
30
31/**
32 * Solution class to find minimum number of valid strings needed to form target
33 */
34class Solution {
35    // Memoization array to store computed results for dynamic programming
36    private Integer[] memo;
37  
38    // Character array representation of the target string
39    private char[] targetChars;
40  
41    // Trie to store all words for efficient prefix matching
42    private Trie trie;
43  
44    // Constant representing infinity (impossible case)
45    private final int INFINITY = 1 << 30;
46
47    /**
48     * Finds the minimum number of valid strings from words array needed to form target
49     * @param words Array of valid strings that can be used
50     * @param target The target string to be formed
51     * @return Minimum number of strings needed, or -1 if impossible
52     */
53    public int minValidStrings(String[] words, String target) {
54        // Initialize the trie and insert all words
55        trie = new Trie();
56        for (String word : words) {
57            trie.insert(word);
58        }
59      
60        // Convert target string to character array for efficient access
61        targetChars = target.toCharArray();
62      
63        // Initialize memoization array
64        memo = new Integer[targetChars.length];
65      
66        // Start dynamic programming from index 0
67        int result = dfs(0);
68      
69        // Return result if valid, otherwise return -1
70        return result < INFINITY ? result : -1;
71    }
72
73    /**
74     * Dynamic programming with memoization to find minimum strings needed
75     * starting from given index
76     * @param startIndex Current starting position in the target string
77     * @return Minimum number of strings needed from this position
78     */
79    private int dfs(int startIndex) {
80        // Base case: reached the end of target string
81        if (startIndex >= targetChars.length) {
82            return 0;
83        }
84      
85        // Return memoized result if already computed
86        if (memo[startIndex] != null) {
87            return memo[startIndex];
88        }
89      
90        // Start traversing from the root of trie
91        Trie currentNode = trie;
92      
93        // Initialize result for current position as infinity
94        memo[startIndex] = INFINITY;
95      
96        // Try to match prefixes starting from current position
97        for (int endIndex = startIndex; endIndex < targetChars.length; endIndex++) {
98            // Get the character index (0-25 for a-z)
99            int charIndex = targetChars[endIndex] - 'a';
100          
101            // If no matching prefix exists, stop searching
102            if (currentNode.children[charIndex] == null) {
103                break;
104            }
105          
106            // Update minimum by considering using current prefix
107            // 1 (current string) + minimum needed for remaining part
108            memo[startIndex] = Math.min(memo[startIndex], 1 + dfs(endIndex + 1));
109          
110            // Move to the next node in trie
111            currentNode = currentNode.children[charIndex];
112        }
113      
114        return memo[startIndex];
115    }
116}
117
1class Trie {
2public:
3    // Array to store 26 children nodes (for 'a' to 'z')
4    Trie* children[26]{};
5  
6    // Insert a word into the trie
7    void insert(string& word) {
8        Trie* currentNode = this;
9      
10        // Traverse each character in the word
11        for (char& ch : word) {
12            int index = ch - 'a';  // Convert character to array index (0-25)
13          
14            // Create new node if it doesn't exist
15            if (!currentNode->children[index]) {
16                currentNode->children[index] = new Trie();
17            }
18          
19            // Move to the child node
20            currentNode = currentNode->children[index];
21        }
22    }
23};
24
25class Solution {
26public:
27    int minValidStrings(vector<string>& words, string target) {
28        int targetLength = target.size();
29      
30        // Build trie from all words
31        Trie* trieRoot = new Trie();
32        for (auto& word : words) {
33            trieRoot->insert(word);
34        }
35      
36        // Define infinity as a large value for comparison
37        const int INF = 1 << 30;
38      
39        // Memoization array: dp[i] stores minimum strings needed from index i to end
40        int dp[targetLength];
41        memset(dp, -1, sizeof(dp));
42      
43        // Recursive function with memoization to find minimum valid strings
44        auto dfs = [&](this auto&& dfs, int startIndex) -> int {
45            // Base case: reached the end of target string
46            if (startIndex >= targetLength) {
47                return 0;
48            }
49          
50            // Return memoized result if already computed
51            if (dp[startIndex] != -1) {
52                return dp[startIndex];
53            }
54          
55            // Initialize current position's result with infinity
56            dp[startIndex] = INF;
57          
58            // Try to match prefixes starting from current position
59            Trie* currentNode = trieRoot;
60            for (int endIndex = startIndex; endIndex < targetLength; ++endIndex) {
61                int charIndex = target[endIndex] - 'a';
62              
63                // If no matching prefix exists, stop searching
64                if (!currentNode->children[charIndex]) {
65                    break;
66                }
67              
68                // Move to next character in trie
69                currentNode = currentNode->children[charIndex];
70              
71                // Update minimum: 1 (current substring) + result from remaining part
72                dp[startIndex] = min(dp[startIndex], 1 + dfs(endIndex + 1));
73            }
74          
75            return dp[startIndex];
76        };
77      
78        // Start DFS from index 0
79        int result = dfs(0);
80      
81        // Return result if valid solution exists, otherwise return -1
82        return result < INF ? result : -1;
83    }
84};
85
1// Global Trie node structure
2interface TrieNode {
3    children: (TrieNode | null)[];
4}
5
6// Create a new Trie node
7function createTrieNode(): TrieNode {
8    return {
9        children: Array(26).fill(null)
10    };
11}
12
13// Insert a word into the Trie
14function insertWord(root: TrieNode, word: string): void {
15    let currentNode: TrieNode = root;
16  
17    for (const char of word) {
18        // Calculate the index for the character (0-25 for 'a'-'z')
19        const charIndex = char.charCodeAt(0) - 'a'.charCodeAt(0);
20      
21        // Create a new node if it doesn't exist
22        if (!currentNode.children[charIndex]) {
23            currentNode.children[charIndex] = createTrieNode();
24        }
25      
26        // Move to the next node
27        currentNode = currentNode.children[charIndex]!;
28    }
29}
30
31function minValidStrings(words: string[], target: string): number {
32    const targetLength = target.length;
33  
34    // Build the Trie from all words
35    const trieRoot = createTrieNode();
36    for (const word of words) {
37        insertWord(trieRoot, word);
38    }
39  
40    // Large value representing infinity
41    const INFINITY = 1 << 30;
42  
43    // Memoization array for dynamic programming
44    // memo[i] stores the minimum number of valid strings needed to form target[i:]
45    const memo = Array(targetLength).fill(0);
46  
47    // Recursive function with memoization to find minimum valid strings
48    function findMinStrings(startIndex: number): number {
49        // Base case: reached the end of target string
50        if (startIndex >= targetLength) {
51            return 0;
52        }
53      
54        // Return memoized result if already computed
55        if (memo[startIndex]) {
56            return memo[startIndex];
57        }
58      
59        // Initialize with infinity
60        memo[startIndex] = INFINITY;
61      
62        // Try to match prefixes starting from current index
63        let currentNode: TrieNode | null = trieRoot;
64      
65        for (let endIndex = startIndex; endIndex < targetLength; ++endIndex) {
66            // Get the character index for the current character in target
67            const charIndex = target[endIndex].charCodeAt(0) - 'a'.charCodeAt(0);
68          
69            // If no matching prefix exists, stop searching
70            if (!currentNode?.children[charIndex]) {
71                break;
72            }
73          
74            // Move to the next node in Trie
75            currentNode = currentNode.children[charIndex];
76          
77            // Update minimum: 1 (current prefix) + minimum for remaining string
78            memo[startIndex] = Math.min(
79                memo[startIndex], 
80                1 + findMinStrings(endIndex + 1)
81            );
82        }
83      
84        return memo[startIndex];
85    }
86  
87    // Calculate the result starting from index 0
88    const result = findMinStrings(0);
89  
90    // Return -1 if no valid combination exists, otherwise return the result
91    return result < INFINITY ? result : -1;
92}
93

Time and Space Complexity

Time Complexity: O(n^2 + L)

The time complexity consists of two main parts:

  • Trie Construction: Building the trie from all words takes O(L) time, where L is the total length of all words in the input array.
  • DFS with Memoization: The dfs function can be called at most n times (once for each starting position in the target string). For each call to dfs(i), the inner loop iterates from position i to potentially position n-1, checking each character against the trie. This gives us O(n) work per call. With n possible calls and O(n) work each, the total time for DFS is O(n^2).

Therefore, the overall time complexity is O(n^2 + L).

Space Complexity: O(n + L)

The space complexity consists of:

  • Trie Storage: The trie stores all the words, requiring O(L) space in the worst case, where L is the total length of all words.
  • Memoization Cache: The @cache decorator stores results for at most n different values of i (from 0 to n-1), requiring O(n) space.
  • Recursion Stack: In the worst case, the recursion depth can be O(n) when we match one character at a time.

Since the recursion stack and memoization cache both contribute O(n), and the trie contributes O(L), the total space complexity is O(n + L).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Not Understanding Valid Strings are Prefixes, Not Just Substrings

A critical misunderstanding is thinking that any substring of words in the array can be used. The problem specifically requires prefixes - strings that start from the beginning of at least one word.

Incorrect Approach:

# Wrong: Checking if substring exists anywhere in words
def is_valid(substring):
    for word in words:
        if substring in word:  # This checks for any occurrence, not just prefix
            return True
    return False

Correct Approach:

# Right: Using Trie to check valid prefixes
# The Trie naturally enforces prefix matching from the start of words

2. Inefficient Prefix Checking Without Trie

Without a Trie, checking all possible prefixes becomes inefficient, leading to Time Limit Exceeded (TLE) for large inputs.

Inefficient Approach:

def minValidStrings(words, target):
    @cache
    def dfs(i):
        if i >= len(target):
            return 0
      
        result = float('inf')
        # Inefficient: For each position, checking all words and all lengths
        for j in range(i + 1, len(target) + 1):
            substring = target[i:j]
            for word in words:
                if word.startswith(substring) or substring == word[:len(substring)]:
                    result = min(result, 1 + dfs(j))
      
        return result

This approach has O(n² × m × k) complexity where n is target length, m is number of words, and k is average word length.

Efficient Solution: Using Trie reduces the complexity by allowing simultaneous prefix checking while traversing the target string character by character.

3. Memory Issues with String Slicing in Recursion

Creating substring copies in recursive calls can cause memory issues.

Memory-Intensive Approach:

def dfs(remaining_target):  # Passing substring instead of index
    if not remaining_target:
        return 0
  
    for i in range(1, len(remaining_target) + 1):
        prefix = remaining_target[:i]
        # Creates new string objects repeatedly
        result = min(result, 1 + dfs(remaining_target[i:]))

Memory-Efficient Solution: Pass indices instead of creating substring copies:

@cache
def dfs(start_index):  # Using index instead of substring
    if start_index >= len(target):
        return 0
    # Work with indices, not string copies

4. Forgetting to Handle the Impossible Case

Not properly returning -1 when it's impossible to form the target string.

Incomplete Handling:

def minValidStrings(words, target):
    result = dfs(0)
    return result  # Wrong: might return infinity

Proper Handling:

def minValidStrings(words, target):
    result = dfs(0)
    return result if result < float('inf') else -1

5. Not Using Memoization Leading to Exponential Time Complexity

Without memoization, the same subproblems are solved repeatedly.

Without Memoization:

def dfs(i):  # No @cache decorator
    if i >= n:
        return 0
    # This will recompute the same positions multiple times
    # Time complexity becomes exponential

With Memoization:

@cache  # Critical for performance
def dfs(i):
    if i >= n:
        return 0
    # Each position is computed only once

6. Incorrect Base Case in Recursion

Using wrong conditions for the base case can lead to incorrect results or infinite recursion.

Wrong Base Case:

def dfs(i):
    if i == len(target):  # Might miss the last character
        return 0
    # Or
    if i > len(target):  # Might cause index out of bounds
        return 0

Correct Base Case:

def dfs(i):
    if i >= len(target):  # Handles both exact match and beyond
        return 0
Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

What does the following code do?

1def f(arr1, arr2):
2  i, j = 0, 0
3  new_arr = []
4  while i < len(arr1) and j < len(arr2):
5      if arr1[i] < arr2[j]:
6          new_arr.append(arr1[i])
7          i += 1
8      else:
9          new_arr.append(arr2[j])
10          j += 1
11  new_arr.extend(arr1[i:])
12  new_arr.extend(arr2[j:])
13  return new_arr
14
1public static List<Integer> f(int[] arr1, int[] arr2) {
2  int i = 0, j = 0;
3  List<Integer> newArr = new ArrayList<>();
4
5  while (i < arr1.length && j < arr2.length) {
6      if (arr1[i] < arr2[j]) {
7          newArr.add(arr1[i]);
8          i++;
9      } else {
10          newArr.add(arr2[j]);
11          j++;
12      }
13  }
14
15  while (i < arr1.length) {
16      newArr.add(arr1[i]);
17      i++;
18  }
19
20  while (j < arr2.length) {
21      newArr.add(arr2[j]);
22      j++;
23  }
24
25  return newArr;
26}
27
1function f(arr1, arr2) {
2  let i = 0, j = 0;
3  let newArr = [];
4  
5  while (i < arr1.length && j < arr2.length) {
6      if (arr1[i] < arr2[j]) {
7          newArr.push(arr1[i]);
8          i++;
9      } else {
10          newArr.push(arr2[j]);
11          j++;
12      }
13  }
14  
15  while (i < arr1.length) {
16      newArr.push(arr1[i]);
17      i++;
18  }
19  
20  while (j < arr2.length) {
21      newArr.push(arr2[j]);
22      j++;
23  }
24  
25  return newArr;
26}
27

Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More