3292. Minimum Number of Valid Strings to Form Target II
Problem Description
You are given an array of strings words
and a string target
. A string x
is called valid if x
is a prefix of any string in words
. The task is to return the minimum number of valid strings that can be concatenated to form target
. If it is not possible to form target
, return -1
.
Intuition
The problem can be approached by considering the concept of string hashing and binary search to efficiently find valid prefixes of the target
in the given words
. The key is to understand how to determine the longest prefix of the target
starting from any position that matches a prefix in one of the words
.
-
String Hashing: Precompute hash values for all prefixes of each string in
words
and for all possible substrings oftarget
. This allows for quick substring comparisons using hash values instead of character-by-character comparison. -
Binary Search for Prefix Length: For every starting position
i
intarget
, use binary search to find the maximum lengthdist
where the substringtarget[i..(i+dist-1)]
is a valid prefix for some string inwords
. -
Greedy Approach to Count Valid Strings: Use a greedy strategy to determine the minimum number of valid strings needed to cover the whole
target
starting from index 0. Track the farthest position (mx
) that can be reached with valid strings from the current or future start point. If at any position the farthest reachable point is not progressing, return-1
. Otherwise, continue till the end oftarget
is reachable, counting the necessary valid strings along the way.
By leveraging the hashing for quick comparisons and binary search for efficient discovery of valid prefixes, the solution efficiently handles large inputs without timing out.
Learn more about Segment Tree, Binary Search and Dynamic Programming patterns.
Solution Approach
The solution employs a combination of string hashing, binary search, and a greedy algorithm. Here's how these components fit together:
-
String Hashing:
- Utilize a
Hashing
class to store hash values for bothtarget
and all prefixes of each string inwords
. - Each string is hashed using a polynomial hash method with a specified
base
andmod
. This allows the rapid comparison of substring equivalence by comparing their hash values rather than the strings themselves.
- Utilize a
-
Binary Search for Maximum Prefix Length:
- Define a helper function
f(i)
that uses binary search to compute the maximum lengthdist
such thattarget[i..i+dist-1]
is a valid prefix of any string inwords
. - For each position
i
intarget
, initialize binary search boundariesl
andr
based on the remaining length oftarget
and the maximum length of strings inwords
. - At each step, compute the hash of
target[i..i+mid-1]
to check if it exists in the precomputed hash setss
. Adjust the boundaries accordingly and returnl
as the maximum valid prefix length.
- Define a helper function
-
Greedy Algorithm for Minimum Valid Strings:
- Initialize variables:
last
to track the last valid concatenation point,mx
for the farthest reachable point from any start, andans
for the count of valid segments. - Traverse the
target
from start to end (i = 0
ton
):- Use the
f(i)
function to determine how far you can match from positioni
. - Check the farthest position (
i + f(i)
) and updatemx
. - If
i
equalslast
, it means it's time to perform another move. Check if further movement is possible (mx > i
); if not, return-1
. - Update
last
tomx
and incrementans
for each jump.
- Use the
- Initialize variables:
This structured combination of hashing for efficient string comparison, binary search for determining the longest valid prefix quickly, and a greedy approach ensures an optimal solution for large datasets.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach using a small example:
Example:
Given words = ["ab", "abc", "cd"]
and target = "abcd"
.
Step 1: String Hashing
For each word in words
and the target
string, compute hash values for all prefixes. For instance, the hash values for words
would include those for "a", "ab", "abc", "c", "cd", etc., and similarly for each prefix of target
like "a", "ab", "abc", and "abcd".
Step 2: Binary Search for Maximum Prefix Length
Define the helper function f(i)
to determine the maximum length dist
such that target[i..i+dist-1]
is a valid prefix of any string in words
. For example, starting from i = 0
in target
, find the maximum prefix:
-
Target substring from
i = 0
: "abcd"- Check "a" (valid, found in "ab")
- Check "ab" (valid, found in "ab")
- Check "abc" (valid, found in "abc")
- Returned max length
dist = 3
("abc" from "abc").
-
For
i = 3
, only the remaining substring "d" can be checked:- Check "d" (not valid)
- Binary search returns
dist = 0
.
Step 3: Greedy Algorithm for Minimum Valid Strings
Initialize last = 0
, mx = 0
, and ans = 0
.
-
Start from
i = 0
:- Compute
f(i) = 3
, somx = max(mx, i + f(i)) = 3
. - Since
i
equalslast
, incrementans
and updatelast = 3
.
- Compute
-
Move to
i = 3
:- Compute
f(i) = 0
, so no further valid substring extends coverage. - No more valid sections can be added as
mx
is not greater thani
.
- Compute
Finally, checking if we have covered the entire target
, we see that it's not possible with the given words
. Thus, return -1
as target
cannot be constructed.
Through hashing, binary search, and a greedy approach, we efficiently determine the result. In this case, since it's impossible to form target
with provided prefixes, the output is -1
.
Solution Implementation
1from typing import List
2
3class Hashing:
4 __slots__ = ["mod", "h", "p"]
5
6 def __init__(self, s: List[str], base: int, mod: int):
7 # Initialize the modulus for hashing
8 self.mod = mod
9 # Initialize arrays for hash values and powers of the base
10 self.h = [0] * (len(s) + 1)
11 self.p = [1] * (len(s) + 1)
12 # Compute hash and power arrays
13 for i in range(1, len(s) + 1):
14 self.h[i] = (self.h[i - 1] * base + ord(s[i - 1])) % mod
15 self.p[i] = (self.p[i - 1] * base) % mod
16
17 def query(self, l: int, r: int) -> int:
18 # Return the hash of the substring from index l to r
19 return (self.h[r] - self.h[l - 1] * self.p[r - l + 1]) % self.mod
20
21class Solution:
22 def minValidStrings(self, words: List[str], target: str) -> int:
23 def f(i: int) -> int:
24 l, r = 0, min(n - i, m)
25 # Binary search to find maximum valid substring length
26 while l < r:
27 mid = (l + r + 1) >> 1
28 sub = hashing.query(i + 1, i + mid)
29 if sub in s[mid]:
30 l = mid
31 else:
32 r = mid - 1
33 return l
34
35 base, mod = 13331, 998244353
36 hashing = Hashing(target, base, mod)
37 m = max(len(w) for w in words)
38 # Prepare hash sets for each word length
39 s = [set() for _ in range(m + 1)]
40 for w in words:
41 h = 0
42 # Compute and store all prefix hashes
43 for j, c in enumerate(w, 1):
44 h = (h * base + ord(c)) % mod
45 s[j].add(h)
46
47 ans = last = mx = 0
48 n = len(target)
49 for i in range(n):
50 # Calculate the maximum extendable substring at index i
51 dist = f(i)
52 mx = max(mx, i + dist)
53 # Check if the current position can reach the max extendable point
54 if i == last:
55 if i == mx:
56 return -1 # No solution found
57 last = mx
58 ans += 1 # Increment the count of valid strings
59 return ans
60
1import java.util.Arrays;
2import java.util.HashSet;
3import java.util.Set;
4
5// Class to handle hashing operations
6class Hashing {
7 private final long[] powers; // Array to store powers of the base
8 private final long[] hashes; // Array to store hash values of prefixes
9 private final long mod; // Modulo for hash operations
10
11 // Constructor to initialize hashing for a given word
12 public Hashing(String word, long base, long mod) {
13 int length = word.length();
14 powers = new long[length + 1];
15 hashes = new long[length + 1];
16 powers[0] = 1;
17 this.mod = mod;
18
19 // Compute powers of the base and prefix hashes
20 for (int i = 1; i <= length; i++) {
21 powers[i] = powers[i - 1] * base % mod;
22 hashes[i] = (hashes[i - 1] * base + word.charAt(i - 1)) % mod;
23 }
24 }
25
26 // Method to compute the hash of a substring from index l to r
27 public long query(int l, int r) {
28 return (hashes[r] - hashes[l - 1] * powers[r - l + 1] % mod + mod) % mod;
29 }
30}
31
32// Main solution class
33class Solution {
34 private Hashing hashing; // Hashing object for the target string
35 private Set<Long>[] substringHashes; // Array of hash sets for word substrings
36
37 // Method to find the minimum number of valid strings
38 public int minValidStrings(String[] words, String target) {
39 int base = 13331;
40 long mod = 998244353L; // Long ensures it supports larger mod values
41 hashing = new Hashing(target, base, mod);
42
43 // Find the maximum length of a word in the words array
44 int maxWordLength = Arrays.stream(words).mapToInt(String::length).max().orElse(0);
45 substringHashes = new Set[maxWordLength + 1];
46 Arrays.setAll(substringHashes, k -> new HashSet<>());
47
48 // Compute the prefix hashes for words and store them in sets
49 for (String word : words) {
50 long hash = 0;
51 for (int j = 0; j < word.length(); j++) {
52 hash = (hash * base + word.charAt(j)) % mod;
53 substringHashes[j + 1].add(hash);
54 }
55 }
56
57 int numValidStrings = 0; // Counter for valid strings
58 int lastPos = 0; // Last valid position
59 int maxPos = 0; // Maximum position that can be reached
60 int targetLength = target.length();
61
62 // Iterate through the target string
63 for (int i = 0; i < targetLength; i++) {
64 int distance = findMaxValidDistance(i, targetLength, maxWordLength);
65 maxPos = Math.max(maxPos, i + distance);
66
67 // If current position is the last valid position, try extending it
68 if (i == lastPos) {
69 if (i == maxPos) {
70 return -1; // No valid string found
71 }
72 lastPos = maxPos;
73 numValidStrings++;
74 }
75 }
76 return numValidStrings;
77 }
78
79 // Helper method to compute the maximum valid distance from a position
80 private int findMaxValidDistance(int i, int targetLength, int maxWordLength) {
81 int left = 0, right = Math.min(targetLength - i, maxWordLength);
82
83 // Binary search to find the maximum length of a valid substring from position i
84 while (left < right) {
85 int mid = (left + right + 1) >> 1;
86 long subHash = hashing.query(i + 1, i + mid);
87
88 // Check if the substring hash exists in the corresponding set
89 if (substringHashes[mid].contains(subHash)) {
90 left = mid;
91 } else {
92 right = mid - 1;
93 }
94 }
95 return left;
96 }
97}
98
1#include <vector>
2#include <string>
3#include <unordered_set>
4#include <algorithm>
5
6using namespace std;
7
8// Class to handle substring hashing using rolling hash technique
9class Hashing {
10private:
11 vector<long long> power; // Precomputed powers of the base
12 vector<long long> hash_values; // Hash values for prefixes of the string
13 long long modulo; // Modulo for hashing
14
15public:
16 // Constructor to initialize the class with the given word, base, and modulo
17 Hashing(const string& word, long long base, int mod) {
18 int n = word.size();
19 power.resize(n + 1);
20 hash_values.resize(n + 1);
21 power[0] = 1;
22 this->modulo = mod;
23
24 for (int i = 1; i <= n; i++) {
25 power[i] = (power[i - 1] * base) % mod; // Compute power[i] = base^i % mod
26 hash_values[i] = (hash_values[i - 1] * base + word[i - 1]) % mod; // Prefix hash calculation
27 }
28 }
29
30 // Query hash of the substring word[l...r] (1-based index)
31 long long query(int l, int r) {
32 return (hash_values[r] - hash_values[l - 1] * power[r - l + 1] % modulo + modulo) % modulo;
33 }
34};
35
36// The main solution class which includes the minValidStrings method
37class Solution {
38public:
39 // Function to find the minimum number of valid strings to cover the target
40 int minValidStrings(vector<string>& words, string target) {
41 int base = 13331, mod = 998244353; // Constants for hashing
42 Hashing hashing(target, base, mod); // Initialize hashing for target string
43
44 int maxWordLength = 0, n = target.size();
45 for (const string& word : words) {
46 maxWordLength = max(maxWordLength, (int)word.size());
47 }
48
49 // Preprocess all words into a set of hash values per length
50 vector<unordered_set<long long>> substringHashes(maxWordLength + 1);
51 for (const string& w : words) {
52 long long hash_value = 0;
53 for (int j = 0; j < w.size(); j++) {
54 hash_value = (hash_value * base + w[j]) % mod;
55 substringHashes[j + 1].insert(hash_value);
56 }
57 }
58
59 // Lambda function to determine the length of the valid substring from index i
60 auto maxValidSubstring = [&](int i) -> int {
61 int left = 0, right = min(n - i, maxWordLength);
62 while (left < right) {
63 int mid = (left + right + 1) >> 1;
64 long long subHash = hashing.query(i + 1, i + mid);
65 if (substringHashes[mid].count(subHash)) {
66 left = mid;
67 } else {
68 right = mid - 1;
69 }
70 }
71 return left;
72 };
73
74 // Algorithm to find the minimum number of valid strings
75 int result = 0, lastConvergedIndex = 0, maxReach = 0;
76 for (int i = 0; i < n; i++) {
77 int distance = maxValidSubstring(i);
78 maxReach = max(maxReach, i + distance);
79
80 if (i == lastConvergedIndex) {
81 if (i == maxReach) {
82 return -1; // If the current position is the max reach, return -1
83 }
84 lastConvergedIndex = maxReach; // Update last converged point
85 result++; // Increment result as a new string is used
86 }
87 }
88
89 return result; // Return the minimum number of strings used
90 }
91};
92
1// Defining global variables and methods for hashing and solution
2let power: number[] = []; // Precomputed powers of the base
3let hashValues: number[] = []; // Hash values for prefixes of the string
4let modulo: number; // Modulo for hashing
5const base = 13331, mod = 998244353; // Constants for hashing
6
7// Initialize hashing for a given word
8function initHashing(word: string): void {
9 const n = word.length;
10 power = new Array(n + 1);
11 hashValues = new Array(n + 1);
12 power[0] = 1;
13 modulo = mod;
14
15 for (let i = 1; i <= n; i++) {
16 power[i] = (power[i - 1] * base) % mod; // Compute power[i] = base^i % mod
17 hashValues[i] = (hashValues[i - 1] * base + word.charCodeAt(i - 1)) % mod; // Prefix hash calculation
18 }
19}
20
21// Function to get hash of the substring word[l...r] (1-based index)
22function query(l: number, r: number): number {
23 return (hashValues[r] - hashValues[l - 1] * power[r - l + 1] % modulo + modulo) % modulo;
24}
25
26// Function to find the minimum number of valid strings to cover the target
27function minValidStrings(words: string[], target: string): number {
28 initHashing(target); // Initialize hashing for the target string
29
30 let maxWordLength = 0;
31 const n = target.length;
32
33 // Find the maximum length of words in the list
34 for (const word of words) {
35 maxWordLength = Math.max(maxWordLength, word.length);
36 }
37
38 // Preprocess all words into a set of hash values per length
39 const substringHashes: Set<number>[] = Array.from({ length: maxWordLength + 1 }, () => new Set());
40
41 for (const word of words) {
42 let hashValue = 0;
43 for (let j = 0; j < word.length; j++) {
44 hashValue = (hashValue * base + word.charCodeAt(j)) % mod;
45 substringHashes[j + 1].add(hashValue);
46 }
47 }
48
49 // Helper function to determine the length of the valid substring from index i
50 function maxValidSubstring(i: number): number {
51 let left = 0;
52 let right = Math.min(n - i, maxWordLength);
53
54 while (left < right) {
55 const mid = (left + right + 1) >> 1;
56 const subHash = query(i + 1, i + mid);
57 if (substringHashes[mid].has(subHash)) {
58 left = mid;
59 } else {
60 right = mid - 1;
61 }
62 }
63 return left;
64 }
65
66 // Algorithm to find the minimum number of valid strings
67 let result = 0;
68 let lastConvergedIndex = 0;
69 let maxReach = 0;
70
71 for (let i = 0; i < n; i++) {
72 const distance = maxValidSubstring(i);
73 maxReach = Math.max(maxReach, i + distance);
74
75 if (i === lastConvergedIndex) {
76 if (i === maxReach) {
77 return -1; // If the current position is the max reach, return -1
78 }
79 lastConvergedIndex = maxReach; // Update last converged point
80 result++; // Increment result as a new string is used
81 }
82 }
83
84 return result; // Return the minimum number of strings used
85}
86
Time and Space Complexity
The implementation utilizes a substring hashing technique alongside a binary search approach to find valid strings from the target
.
-
The initialization of the
Hashing
class for thetarget
string runs inO(n)
, wheren
is the length oftarget
, to compute the hash and power arrays. -
In
Solution.minValidStrings
, constructing sets of all possible substring hashes forwords
takesO(L)
, whereL
is the total length of all provided strings inwords
. -
The binary search process inside the helper function
f(i)
runsO(\log n)
times for each of then
starting positions intarget
, with constant-time hash lookups.
Integrating all these parts, the overall time complexity for the function becomes O(n \times \log n + L)
.
The space complexity mainly arises from storing hashes in s
for every substring length and the Hashing
class's arrays, which is O(n + L)
total.
Learn more about how to find time and space complexity quickly.
Which of the following array represent a max heap?
Recommended Readings
Segment Tree Faster Range Queries For this article we want to introduce the idea of a Segment Tree Segment Trees allow us to quickly perform range queries as well as range updates Suppose we have an array and we want to know the sum of a particular range of numbers as well
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Want a Structured Path to Master System Design Too? Don’t Miss This!