3291. Minimum Number of Valid Strings to Form Target I
Problem Description
You are given an array of strings words
and a string target
.
A string x
is considered valid if it is a prefix of at least one string in the words
array. For example, if words = ["abc", "def"]
, then "a", "ab", "abc", "d", "de", and "def" are all valid strings, but "b", "ac", or "ef" are not valid.
Your task is to find the minimum number of valid strings that can be concatenated together to form the target
string. The valid strings can be used multiple times and in any order.
If it's impossible to form the target
string using valid strings, return -1
.
For example:
- If
words = ["abc", "def"]
andtarget = "abcd"
, you could use "abc" (a valid prefix of "abc") and "d" (a valid prefix of "def") to form "abcd", requiring 2 valid strings. - If
words = ["abc"]
andtarget = "def"
, it's impossible to form "def" using prefixes of "abc", so the answer would be-1
.
The problem requires finding the optimal way to break down the target
string into the fewest number of valid prefixes from the words
array.
Intuition
The key insight is that we need to efficiently check if any substring starting from a position in target
matches a prefix of any word in words
. This immediately suggests using a Trie (prefix tree) data structure, which excels at prefix matching operations.
Think of the problem as trying to "cover" the entire target
string using valid prefixes. At each position in target
, we have choices - we can use any valid prefix that starts at that position. This creates overlapping subproblems, which hints at a dynamic programming approach.
Consider how we'd solve this step by step:
- Starting from position
i
intarget
, we need to find all possible valid prefixes that match - For each valid prefix of length
len
, we jump to positioni + len
and solve the subproblem from there - We want the minimum among all these choices
The challenge is that checking all possible substrings against all words would be inefficient. This is where the Trie shines - by storing all words in a Trie, we can traverse it character by character while simultaneously traversing the target
string. As soon as we can't find the next character in the Trie, we know there are no more valid prefixes from that starting position.
The recursive nature of the problem becomes clear: dfs(i)
= minimum number of valid strings needed to form target[i:]
. For each valid prefix found starting at position i
, we recursively solve for the remaining substring. Since we might revisit the same positions multiple times through different paths, memoization prevents redundant calculations, transforming our solution into an efficient dynamic programming approach.
The combination of Trie for efficient prefix matching and memoized recursion for optimal substructure gives us an elegant solution to this string segmentation problem.
Learn more about Trie, Segment Tree, Binary Search and Dynamic Programming patterns.
Solution Approach
The solution uses a Trie data structure combined with memoized recursion to efficiently find the minimum number of valid strings needed.
Trie Implementation
First, we build a Trie to store all words from the words
array:
class [Trie](/problems/trie_intro):
def __init__(self):
self.children: List[Optional[Trie]] = [None] * 26 # 26 lowercase letters
def insert(self, w: str):
node = self
for i in map(lambda c: ord(c) - 97, w): # Convert 'a'-'z' to 0-25
if node.children[i] is None:
node.children[i] = Trie()
node = node.children[i]
The Trie allows us to efficiently check if a substring of target
matches any prefix of words in words
.
Dynamic Programming with Memoization
The core algorithm uses a recursive function dfs(i)
that returns the minimum number of valid strings needed to form target[i:]
(the substring from index i
to the end):
@cache
def dfs(i: int) -> int:
if i >= n: # Base case: reached end of target
return 0
node = [trie](/problems/trie_intro)
ans = inf
# Try all possible prefixes starting from position i
for j in range(i, n):
k = ord(target[j]) - 97 # Convert character to index
if node.children[k] is None:
break # No more valid prefixes from this position
node = node.children[k]
ans = min(ans, 1 + dfs(j + 1)) # Use this prefix and recurse
return ans
Algorithm Flow
-
Build the Trie: Insert all words from
words
into the Trie structure. -
Start recursion: Call
dfs(0)
to find the minimum number of strings needed starting from index 0. -
At each position
i
:- Try to match prefixes starting from
target[i]
- Traverse the Trie character by character
- For each valid prefix found (ending at position
j
), recursively solve fordfs(j + 1)
- Take the minimum among all valid choices and add 1 (for the current prefix used)
- Try to match prefixes starting from
-
Memoization: The
@cache
decorator automatically memorizes results for each positioni
, preventing redundant calculations when the same position is reached through different paths. -
Return result: If
dfs(0) < inf
, return it; otherwise return-1
(impossible to form target).
Time and Space Complexity
- Time: O(n²) in the worst case, where n is the length of
target
. For each starting position, we might check up to n characters. - Space: O(m × k + n) where m is the total number of characters in all words (for the Trie), k is the average word length, and n for the memoization cache.
The elegance of this solution lies in combining the Trie's prefix-matching capability with dynamic programming's ability to solve overlapping subproblems efficiently.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a concrete example to illustrate the solution approach.
Given:
words = ["ab", "abc", "cd", "def"]
target = "abcd"
Step 1: Build the Trie
We insert all words into the Trie structure:
root / \ a c d / | | b d e / | c f
Step 2: Start DFS from position 0
Call dfs(0)
to find minimum strings needed for target[0:] = "abcd"
At position 0 (character 'a'):
- Start traversing Trie from root
- Find 'a' in Trie ✓
- Continue to position 1
At position 1 (character 'b'):
- 'b' exists as child of 'a' ✓
- We found valid prefix "ab" → Calculate:
1 + dfs(2)
- Continue to position 2
At position 2 (character 'c'):
- 'c' exists as child of 'b' ✓
- We found valid prefix "abc" → Calculate:
1 + dfs(3)
- Continue to position 3
At position 3 (character 'd'):
- 'd' doesn't exist as child of 'c' ✗
- Stop traversing from position 0
So from position 0, we have two choices:
- Use "ab" →
1 + dfs(2)
- Use "abc" →
1 + dfs(3)
Step 3: Solve dfs(2)
At position 2 (character 'c'):
- Start traversing Trie from root
- Find 'c' in Trie ✓
- Continue to position 3
At position 3 (character 'd'):
- 'd' exists as child of 'c' ✓
- We found valid prefix "cd" → Calculate:
1 + dfs(4)
dfs(4)
returns 0 (base case - reached end of target)
Therefore, dfs(2) = 1 + 0 = 1
Step 4: Solve dfs(3)
At position 3 (character 'd'):
- Start traversing Trie from root
- Find 'd' in Trie ✓
- Continue to position 4 (out of bounds)
- We found valid prefix "d" → Calculate:
1 + dfs(4)
dfs(4)
returns 0 (base case)
Therefore, dfs(3) = 1 + 0 = 1
Step 5: Calculate final result
Back to dfs(0)
:
- Using "ab":
1 + dfs(2) = 1 + 1 = 2
- Using "abc":
1 + dfs(3) = 1 + 1 = 2
Both paths give us 2, so dfs(0) = 2
Answer: 2
The target "abcd" can be formed using:
- "ab" + "cd" (2 valid strings)
- "abc" + "d" (2 valid strings)
The memoization ensures that dfs(3)
and dfs(4)
are only computed once, even though they might be reached through different paths. This optimization transforms what could be exponential time complexity into polynomial time.
Solution Implementation
1from typing import List, Optional
2from functools import cache
3from math import inf
4
5
6class Trie:
7 """Trie (prefix tree) data structure for efficient string prefix matching."""
8
9 def __init__(self):
10 # Initialize children array for 26 lowercase letters
11 self.children: List[Optional[Trie]] = [None] * 26
12
13 def insert(self, word: str) -> None:
14 """Insert a word into the trie."""
15 node = self
16 for char in word:
17 # Convert character to index (0-25 for 'a'-'z')
18 index = ord(char) - ord('a')
19 if node.children[index] is None:
20 node.children[index] = Trie()
21 node = node.children[index]
22
23
24class Solution:
25 def minValidStrings(self, words: List[str], target: str) -> int:
26 """
27 Find minimum number of valid strings from words needed to form target.
28 Each valid string is a prefix of some word in words array.
29
30 Args:
31 words: List of available words
32 target: Target string to form
33
34 Returns:
35 Minimum number of valid strings needed, or -1 if impossible
36 """
37
38 @cache
39 def dfs(start_index: int) -> int:
40 """
41 Dynamic programming function to find minimum strings needed
42 starting from given index in target.
43
44 Args:
45 start_index: Current position in target string
46
47 Returns:
48 Minimum number of strings needed from this position
49 """
50 # Base case: reached end of target string
51 if start_index >= target_length:
52 return 0
53
54 # Try to match prefixes starting from current position
55 node = trie_root
56 min_strings = inf
57
58 # Explore all possible prefix matches from current position
59 for end_index in range(start_index, target_length):
60 char_index = ord(target[end_index]) - ord('a')
61
62 # No matching prefix found, stop exploring
63 if node.children[char_index] is None:
64 break
65
66 # Move to next node in trie
67 node = node.children[char_index]
68
69 # Calculate minimum using current prefix + remaining substring
70 min_strings = min(min_strings, 1 + dfs(end_index + 1))
71
72 return min_strings
73
74 # Build trie from all words
75 trie_root = Trie()
76 for word in words:
77 trie_root.insert(word)
78
79 # Initialize variables
80 target_length = len(target)
81
82 # Find minimum valid strings needed
83 result = dfs(0)
84
85 # Return result or -1 if impossible
86 return result if result < inf else -1
87
1/**
2 * Trie (Prefix Tree) data structure for efficient string prefix matching
3 */
4class Trie {
5 // Array to store 26 children nodes (one for each lowercase letter a-z)
6 Trie[] children = new Trie[26];
7
8 /**
9 * Inserts a word into the trie
10 * @param word The word to insert into the trie
11 */
12 void insert(String word) {
13 Trie currentNode = this;
14
15 // Traverse each character in the word
16 for (int i = 0; i < word.length(); i++) {
17 // Calculate the index for the current character (0-25 for a-z)
18 int charIndex = word.charAt(i) - 'a';
19
20 // Create a new node if it doesn't exist
21 if (currentNode.children[charIndex] == null) {
22 currentNode.children[charIndex] = new Trie();
23 }
24
25 // Move to the child node
26 currentNode = currentNode.children[charIndex];
27 }
28 }
29}
30
31/**
32 * Solution class to find minimum number of valid strings needed to form target
33 */
34class Solution {
35 // Memoization array to store computed results for dynamic programming
36 private Integer[] memo;
37
38 // Character array representation of the target string
39 private char[] targetChars;
40
41 // Trie to store all words for efficient prefix matching
42 private Trie trie;
43
44 // Constant representing infinity (impossible case)
45 private final int INFINITY = 1 << 30;
46
47 /**
48 * Finds the minimum number of valid strings from words array needed to form target
49 * @param words Array of valid strings that can be used
50 * @param target The target string to be formed
51 * @return Minimum number of strings needed, or -1 if impossible
52 */
53 public int minValidStrings(String[] words, String target) {
54 // Initialize the trie and insert all words
55 trie = new Trie();
56 for (String word : words) {
57 trie.insert(word);
58 }
59
60 // Convert target string to character array for efficient access
61 targetChars = target.toCharArray();
62
63 // Initialize memoization array
64 memo = new Integer[targetChars.length];
65
66 // Start dynamic programming from index 0
67 int result = dfs(0);
68
69 // Return result if valid, otherwise return -1
70 return result < INFINITY ? result : -1;
71 }
72
73 /**
74 * Dynamic programming with memoization to find minimum strings needed
75 * starting from given index
76 * @param startIndex Current starting position in the target string
77 * @return Minimum number of strings needed from this position
78 */
79 private int dfs(int startIndex) {
80 // Base case: reached the end of target string
81 if (startIndex >= targetChars.length) {
82 return 0;
83 }
84
85 // Return memoized result if already computed
86 if (memo[startIndex] != null) {
87 return memo[startIndex];
88 }
89
90 // Start traversing from the root of trie
91 Trie currentNode = trie;
92
93 // Initialize result for current position as infinity
94 memo[startIndex] = INFINITY;
95
96 // Try to match prefixes starting from current position
97 for (int endIndex = startIndex; endIndex < targetChars.length; endIndex++) {
98 // Get the character index (0-25 for a-z)
99 int charIndex = targetChars[endIndex] - 'a';
100
101 // If no matching prefix exists, stop searching
102 if (currentNode.children[charIndex] == null) {
103 break;
104 }
105
106 // Update minimum by considering using current prefix
107 // 1 (current string) + minimum needed for remaining part
108 memo[startIndex] = Math.min(memo[startIndex], 1 + dfs(endIndex + 1));
109
110 // Move to the next node in trie
111 currentNode = currentNode.children[charIndex];
112 }
113
114 return memo[startIndex];
115 }
116}
117
1class Trie {
2public:
3 // Array to store 26 children nodes (for 'a' to 'z')
4 Trie* children[26]{};
5
6 // Insert a word into the trie
7 void insert(string& word) {
8 Trie* currentNode = this;
9
10 // Traverse each character in the word
11 for (char& ch : word) {
12 int index = ch - 'a'; // Convert character to array index (0-25)
13
14 // Create new node if it doesn't exist
15 if (!currentNode->children[index]) {
16 currentNode->children[index] = new Trie();
17 }
18
19 // Move to the child node
20 currentNode = currentNode->children[index];
21 }
22 }
23};
24
25class Solution {
26public:
27 int minValidStrings(vector<string>& words, string target) {
28 int targetLength = target.size();
29
30 // Build trie from all words
31 Trie* trieRoot = new Trie();
32 for (auto& word : words) {
33 trieRoot->insert(word);
34 }
35
36 // Define infinity as a large value for comparison
37 const int INF = 1 << 30;
38
39 // Memoization array: dp[i] stores minimum strings needed from index i to end
40 int dp[targetLength];
41 memset(dp, -1, sizeof(dp));
42
43 // Recursive function with memoization to find minimum valid strings
44 auto dfs = [&](this auto&& dfs, int startIndex) -> int {
45 // Base case: reached the end of target string
46 if (startIndex >= targetLength) {
47 return 0;
48 }
49
50 // Return memoized result if already computed
51 if (dp[startIndex] != -1) {
52 return dp[startIndex];
53 }
54
55 // Initialize current position's result with infinity
56 dp[startIndex] = INF;
57
58 // Try to match prefixes starting from current position
59 Trie* currentNode = trieRoot;
60 for (int endIndex = startIndex; endIndex < targetLength; ++endIndex) {
61 int charIndex = target[endIndex] - 'a';
62
63 // If no matching prefix exists, stop searching
64 if (!currentNode->children[charIndex]) {
65 break;
66 }
67
68 // Move to next character in trie
69 currentNode = currentNode->children[charIndex];
70
71 // Update minimum: 1 (current substring) + result from remaining part
72 dp[startIndex] = min(dp[startIndex], 1 + dfs(endIndex + 1));
73 }
74
75 return dp[startIndex];
76 };
77
78 // Start DFS from index 0
79 int result = dfs(0);
80
81 // Return result if valid solution exists, otherwise return -1
82 return result < INF ? result : -1;
83 }
84};
85
1// Global Trie node structure
2interface TrieNode {
3 children: (TrieNode | null)[];
4}
5
6// Create a new Trie node
7function createTrieNode(): TrieNode {
8 return {
9 children: Array(26).fill(null)
10 };
11}
12
13// Insert a word into the Trie
14function insertWord(root: TrieNode, word: string): void {
15 let currentNode: TrieNode = root;
16
17 for (const char of word) {
18 // Calculate the index for the character (0-25 for 'a'-'z')
19 const charIndex = char.charCodeAt(0) - 'a'.charCodeAt(0);
20
21 // Create a new node if it doesn't exist
22 if (!currentNode.children[charIndex]) {
23 currentNode.children[charIndex] = createTrieNode();
24 }
25
26 // Move to the next node
27 currentNode = currentNode.children[charIndex]!;
28 }
29}
30
31function minValidStrings(words: string[], target: string): number {
32 const targetLength = target.length;
33
34 // Build the Trie from all words
35 const trieRoot = createTrieNode();
36 for (const word of words) {
37 insertWord(trieRoot, word);
38 }
39
40 // Large value representing infinity
41 const INFINITY = 1 << 30;
42
43 // Memoization array for dynamic programming
44 // memo[i] stores the minimum number of valid strings needed to form target[i:]
45 const memo = Array(targetLength).fill(0);
46
47 // Recursive function with memoization to find minimum valid strings
48 function findMinStrings(startIndex: number): number {
49 // Base case: reached the end of target string
50 if (startIndex >= targetLength) {
51 return 0;
52 }
53
54 // Return memoized result if already computed
55 if (memo[startIndex]) {
56 return memo[startIndex];
57 }
58
59 // Initialize with infinity
60 memo[startIndex] = INFINITY;
61
62 // Try to match prefixes starting from current index
63 let currentNode: TrieNode | null = trieRoot;
64
65 for (let endIndex = startIndex; endIndex < targetLength; ++endIndex) {
66 // Get the character index for the current character in target
67 const charIndex = target[endIndex].charCodeAt(0) - 'a'.charCodeAt(0);
68
69 // If no matching prefix exists, stop searching
70 if (!currentNode?.children[charIndex]) {
71 break;
72 }
73
74 // Move to the next node in Trie
75 currentNode = currentNode.children[charIndex];
76
77 // Update minimum: 1 (current prefix) + minimum for remaining string
78 memo[startIndex] = Math.min(
79 memo[startIndex],
80 1 + findMinStrings(endIndex + 1)
81 );
82 }
83
84 return memo[startIndex];
85 }
86
87 // Calculate the result starting from index 0
88 const result = findMinStrings(0);
89
90 // Return -1 if no valid combination exists, otherwise return the result
91 return result < INFINITY ? result : -1;
92}
93
Time and Space Complexity
Time Complexity: O(n^2 + L)
The time complexity consists of two main parts:
- Trie Construction: Building the trie from all words takes
O(L)
time, whereL
is the total length of all words in the input array. - DFS with Memoization: The
dfs
function can be called at mostn
times (once for each starting position in the target string). For each call todfs(i)
, the inner loop iterates from positioni
to potentially positionn-1
, checking each character against the trie. This gives usO(n)
work per call. Withn
possible calls andO(n)
work each, the total time for DFS isO(n^2)
.
Therefore, the overall time complexity is O(n^2 + L)
.
Space Complexity: O(n + L)
The space complexity consists of:
- Trie Storage: The trie stores all the words, requiring
O(L)
space in the worst case, whereL
is the total length of all words. - Memoization Cache: The
@cache
decorator stores results for at mostn
different values ofi
(from 0 ton-1
), requiringO(n)
space. - Recursion Stack: In the worst case, the recursion depth can be
O(n)
when we match one character at a time.
Since the recursion stack and memoization cache both contribute O(n)
, and the trie contributes O(L)
, the total space complexity is O(n + L)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Not Understanding Valid Strings are Prefixes, Not Just Substrings
A critical misunderstanding is thinking that any substring of words in the array can be used. The problem specifically requires prefixes - strings that start from the beginning of at least one word.
Incorrect Approach:
# Wrong: Checking if substring exists anywhere in words
def is_valid(substring):
for word in words:
if substring in word: # This checks for any occurrence, not just prefix
return True
return False
Correct Approach:
# Right: Using Trie to check valid prefixes # The Trie naturally enforces prefix matching from the start of words
2. Inefficient Prefix Checking Without Trie
Without a Trie, checking all possible prefixes becomes inefficient, leading to Time Limit Exceeded (TLE) for large inputs.
Inefficient Approach:
def minValidStrings(words, target):
@cache
def dfs(i):
if i >= len(target):
return 0
result = float('inf')
# Inefficient: For each position, checking all words and all lengths
for j in range(i + 1, len(target) + 1):
substring = target[i:j]
for word in words:
if word.startswith(substring) or substring == word[:len(substring)]:
result = min(result, 1 + dfs(j))
return result
This approach has O(n² × m × k) complexity where n is target length, m is number of words, and k is average word length.
Efficient Solution: Using Trie reduces the complexity by allowing simultaneous prefix checking while traversing the target string character by character.
3. Memory Issues with String Slicing in Recursion
Creating substring copies in recursive calls can cause memory issues.
Memory-Intensive Approach:
def dfs(remaining_target): # Passing substring instead of index
if not remaining_target:
return 0
for i in range(1, len(remaining_target) + 1):
prefix = remaining_target[:i]
# Creates new string objects repeatedly
result = min(result, 1 + dfs(remaining_target[i:]))
Memory-Efficient Solution: Pass indices instead of creating substring copies:
@cache
def dfs(start_index): # Using index instead of substring
if start_index >= len(target):
return 0
# Work with indices, not string copies
4. Forgetting to Handle the Impossible Case
Not properly returning -1 when it's impossible to form the target string.
Incomplete Handling:
def minValidStrings(words, target):
result = dfs(0)
return result # Wrong: might return infinity
Proper Handling:
def minValidStrings(words, target):
result = dfs(0)
return result if result < float('inf') else -1
5. Not Using Memoization Leading to Exponential Time Complexity
Without memoization, the same subproblems are solved repeatedly.
Without Memoization:
def dfs(i): # No @cache decorator
if i >= n:
return 0
# This will recompute the same positions multiple times
# Time complexity becomes exponential
With Memoization:
@cache # Critical for performance
def dfs(i):
if i >= n:
return 0
# Each position is computed only once
6. Incorrect Base Case in Recursion
Using wrong conditions for the base case can lead to incorrect results or infinite recursion.
Wrong Base Case:
def dfs(i):
if i == len(target): # Might miss the last character
return 0
# Or
if i > len(target): # Might cause index out of bounds
return 0
Correct Base Case:
def dfs(i):
if i >= len(target): # Handles both exact match and beyond
return 0
What does the following code do?
1def f(arr1, arr2):
2 i, j = 0, 0
3 new_arr = []
4 while i < len(arr1) and j < len(arr2):
5 if arr1[i] < arr2[j]:
6 new_arr.append(arr1[i])
7 i += 1
8 else:
9 new_arr.append(arr2[j])
10 j += 1
11 new_arr.extend(arr1[i:])
12 new_arr.extend(arr2[j:])
13 return new_arr
14
1public static List<Integer> f(int[] arr1, int[] arr2) {
2 int i = 0, j = 0;
3 List<Integer> newArr = new ArrayList<>();
4
5 while (i < arr1.length && j < arr2.length) {
6 if (arr1[i] < arr2[j]) {
7 newArr.add(arr1[i]);
8 i++;
9 } else {
10 newArr.add(arr2[j]);
11 j++;
12 }
13 }
14
15 while (i < arr1.length) {
16 newArr.add(arr1[i]);
17 i++;
18 }
19
20 while (j < arr2.length) {
21 newArr.add(arr2[j]);
22 j++;
23 }
24
25 return newArr;
26}
27
1function f(arr1, arr2) {
2 let i = 0, j = 0;
3 let newArr = [];
4
5 while (i < arr1.length && j < arr2.length) {
6 if (arr1[i] < arr2[j]) {
7 newArr.push(arr1[i]);
8 i++;
9 } else {
10 newArr.push(arr2[j]);
11 j++;
12 }
13 }
14
15 while (i < arr1.length) {
16 newArr.push(arr1[i]);
17 i++;
18 }
19
20 while (j < arr2.length) {
21 newArr.push(arr2[j]);
22 j++;
23 }
24
25 return newArr;
26}
27
Recommended Readings
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
Segment Tree Faster Range Queries For this article we want to introduce the idea of a Segment Tree Segment Trees allow us to quickly perform range queries as well as range updates Suppose we have an array and we want to know the sum of a particular range of numbers as well
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
Want a Structured Path to Master System Design Too? Don’t Miss This!