3292. Minimum Number of Valid Strings to Form Target II

HardSegment TreeArrayStringBinary SearchDynamic ProgrammingString MatchingHash FunctionRolling Hash

Problem Description

You are given an array of strings words and a string target. A string x is called valid if x is a prefix of any string in words. The task is to return the minimum number of valid strings that can be concatenated to form target. If it is not possible to form target, return -1.

Intuition

The problem can be approached by considering the concept of string hashing and binary search to efficiently find valid prefixes of the target in the given words. The key is to understand how to determine the longest prefix of the target starting from any position that matches a prefix in one of the words.

String Hashing: Precompute hash values for all prefixes of each string in words and for all possible substrings of target. This allows for quick substring comparisons using hash values instead of character-by-character comparison.
Binary Search for Prefix Length: For every starting position i in target, use binary search to find the maximum length dist where the substring target[i..(i+dist-1)] is a valid prefix for some string in words.
Greedy Approach to Count Valid Strings: Use a greedy strategy to determine the minimum number of valid strings needed to cover the whole target starting from index 0. Track the farthest position (mx) that can be reached with valid strings from the current or future start point. If at any position the farthest reachable point is not progressing, return -1. Otherwise, continue till the end of target is reachable, counting the necessary valid strings along the way.

By leveraging the hashing for quick comparisons and binary search for efficient discovery of valid prefixes, the solution efficiently handles large inputs without timing out.

Learn more about Segment Tree, Binary Search and Dynamic Programming patterns.

Solution Approach

The solution employs a combination of string hashing, binary search, and a greedy algorithm. Here's how these components fit together:

String Hashing:
- Utilize a Hashing class to store hash values for both target and all prefixes of each string in words.
- Each string is hashed using a polynomial hash method with a specified base and mod. This allows the rapid comparison of substring equivalence by comparing their hash values rather than the strings themselves.
Binary Search for Maximum Prefix Length:
- Define a helper function f(i) that uses binary search to compute the maximum length dist such that target[i..i+dist-1] is a valid prefix of any string in words.
- For each position i in target, initialize binary search boundaries l and r based on the remaining length of target and the maximum length of strings in words.
- At each step, compute the hash of target[i..i+mid-1] to check if it exists in the precomputed hash sets s. Adjust the boundaries accordingly and return l as the maximum valid prefix length.
Greedy Algorithm for Minimum Valid Strings:
- Initialize variables: last to track the last valid concatenation point, mx for the farthest reachable point from any start, and ans for the count of valid segments.
- Traverse the target from start to end (i = 0 to n):
  - Use the f(i) function to determine how far you can match from position i.
  - Check the farthest position (i + f(i)) and update mx.
  - If i equals last, it means it's time to perform another move. Check if further movement is possible (mx > i); if not, return -1.
  - Update last to mx and increment ans for each jump.

This structured combination of hashing for efficient string comparison, binary search for determining the longest valid prefix quickly, and a greedy approach ensures an optimal solution for large datasets.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's illustrate the solution approach using a small example:

Example:

Given words = ["ab", "abc", "cd"] and target = "abcd".

Step 1: String Hashing

For each word in words and the target string, compute hash values for all prefixes. For instance, the hash values for words would include those for "a", "ab", "abc", "c", "cd", etc., and similarly for each prefix of target like "a", "ab", "abc", and "abcd".

Step 2: Binary Search for Maximum Prefix Length

Define the helper function f(i) to determine the maximum length dist such that target[i..i+dist-1] is a valid prefix of any string in words. For example, starting from i = 0 in target, find the maximum prefix:

Target substring from i = 0: "abcd"
- Check "a" (valid, found in "ab")
- Check "ab" (valid, found in "ab")
- Check "abc" (valid, found in "abc")
- Returned max length dist = 3 ("abc" from "abc").
For i = 3, only the remaining substring "d" can be checked:
- Check "d" (not valid)
- Binary search returns dist = 0.

Step 3: Greedy Algorithm for Minimum Valid Strings

Initialize last = 0, mx = 0, and ans = 0.

Start from i = 0:
- Compute f(i) = 3, so mx = max(mx, i + f(i)) = 3.
- Since i equals last, increment ans and update last = 3.
Move to i = 3:
- Compute f(i) = 0, so no further valid substring extends coverage.
- No more valid sections can be added as mx is not greater than i.

Finally, checking if we have covered the entire target, we see that it's not possible with the given words. Thus, return -1 as target cannot be constructed.

Through hashing, binary search, and a greedy approach, we efficiently determine the result. In this case, since it's impossible to form target with provided prefixes, the output is -1.

Solution Implementation

1from typing import List
2
3class Hashing:
4    __slots__ = ["mod", "h", "p"]
5
6    def __init__(self, s: List[str], base: int, mod: int):
7        # Initialize the modulus for hashing
8        self.mod = mod
9        # Initialize arrays for hash values and powers of the base
10        self.h = [0] * (len(s) + 1)
11        self.p = [1] * (len(s) + 1)
12        # Compute hash and power arrays
13        for i in range(1, len(s) + 1):
14            self.h[i] = (self.h[i - 1] * base + ord(s[i - 1])) % mod
15            self.p[i] = (self.p[i - 1] * base) % mod
16
17    def query(self, l: int, r: int) -> int:
18        # Return the hash of the substring from index l to r
19        return (self.h[r] - self.h[l - 1] * self.p[r - l + 1]) % self.mod
20
21class Solution:
22    def minValidStrings(self, words: List[str], target: str) -> int:
23        def f(i: int) -> int:
24            l, r = 0, min(n - i, m)
25            # Binary search to find maximum valid substring length
26            while l < r:
27                mid = (l + r + 1) >> 1
28                sub = hashing.query(i + 1, i + mid)
29                if sub in s[mid]:
30                    l = mid
31                else:
32                    r = mid - 1
33            return l
34
35        base, mod = 13331, 998244353
36        hashing = Hashing(target, base, mod)
37        m = max(len(w) for w in words)
38        # Prepare hash sets for each word length
39        s = [set() for _ in range(m + 1)]
40        for w in words:
41            h = 0
42            # Compute and store all prefix hashes
43            for j, c in enumerate(w, 1):
44                h = (h * base + ord(c)) % mod
45                s[j].add(h)
46
47        ans = last = mx = 0
48        n = len(target)
49        for i in range(n):
50            # Calculate the maximum extendable substring at index i
51            dist = f(i)
52            mx = max(mx, i + dist)
53            # Check if the current position can reach the max extendable point
54            if i == last:
55                if i == mx:
56                    return -1  # No solution found
57                last = mx
58                ans += 1  # Increment the count of valid strings
59        return ans
60

1import java.util.Arrays;
2import java.util.HashSet;
3import java.util.Set;
4
5// Class to handle hashing operations
6class Hashing {
7    private final long[] powers; // Array to store powers of the base
8    private final long[] hashes; // Array to store hash values of prefixes
9    private final long mod;      // Modulo for hash operations
10
11    // Constructor to initialize hashing for a given word
12    public Hashing(String word, long base, long mod) {
13        int length = word.length();
14        powers = new long[length + 1];
15        hashes = new long[length + 1];
16        powers[0] = 1;
17        this.mod = mod;
18
19        // Compute powers of the base and prefix hashes
20        for (int i = 1; i <= length; i++) {
21            powers[i] = powers[i - 1] * base % mod;
22            hashes[i] = (hashes[i - 1] * base + word.charAt(i - 1)) % mod;
23        }
24    }
25
26    // Method to compute the hash of a substring from index l to r
27    public long query(int l, int r) {
28        return (hashes[r] - hashes[l - 1] * powers[r - l + 1] % mod + mod) % mod;
29    }
30}
31
32// Main solution class
33class Solution {
34    private Hashing hashing; // Hashing object for the target string
35    private Set<Long>[] substringHashes; // Array of hash sets for word substrings
36
37    // Method to find the minimum number of valid strings
38    public int minValidStrings(String[] words, String target) {
39        int base = 13331;
40        long mod = 998244353L; // Long ensures it supports larger mod values
41        hashing = new Hashing(target, base, mod);
42
43        // Find the maximum length of a word in the words array
44        int maxWordLength = Arrays.stream(words).mapToInt(String::length).max().orElse(0);
45        substringHashes = new Set[maxWordLength + 1];
46        Arrays.setAll(substringHashes, k -> new HashSet<>());
47
48        // Compute the prefix hashes for words and store them in sets
49        for (String word : words) {
50            long hash = 0;
51            for (int j = 0; j < word.length(); j++) {
52                hash = (hash * base + word.charAt(j)) % mod;
53                substringHashes[j + 1].add(hash);
54            }
55        }
56
57        int numValidStrings = 0; // Counter for valid strings
58        int lastPos = 0; // Last valid position
59        int maxPos = 0; // Maximum position that can be reached
60        int targetLength = target.length();
61
62        // Iterate through the target string
63        for (int i = 0; i < targetLength; i++) {
64            int distance = findMaxValidDistance(i, targetLength, maxWordLength);
65            maxPos = Math.max(maxPos, i + distance);
66
67            // If current position is the last valid position, try extending it
68            if (i == lastPos) {
69                if (i == maxPos) {
70                    return -1; // No valid string found
71                }
72                lastPos = maxPos;
73                numValidStrings++;
74            }
75        }
76        return numValidStrings;
77    }
78
79    // Helper method to compute the maximum valid distance from a position
80    private int findMaxValidDistance(int i, int targetLength, int maxWordLength) {
81        int left = 0, right = Math.min(targetLength - i, maxWordLength);
82
83        // Binary search to find the maximum length of a valid substring from position i
84        while (left < right) {
85            int mid = (left + right + 1) >> 1;
86            long subHash = hashing.query(i + 1, i + mid);
87
88            // Check if the substring hash exists in the corresponding set
89            if (substringHashes[mid].contains(subHash)) {
90                left = mid;
91            } else {
92                right = mid - 1;
93            }
94        }
95        return left;
96    }
97}
98

1#include <vector>
2#include <string>
3#include <unordered_set>
4#include <algorithm>
5
6using namespace std;
7
8// Class to handle substring hashing using rolling hash technique
9class Hashing {
10private:
11    vector<long long> power; // Precomputed powers of the base
12    vector<long long> hash_values; // Hash values for prefixes of the string
13    long long modulo; // Modulo for hashing
14
15public:
16    // Constructor to initialize the class with the given word, base, and modulo
17    Hashing(const string& word, long long base, int mod) {
18        int n = word.size();
19        power.resize(n + 1);
20        hash_values.resize(n + 1);
21        power[0] = 1;
22        this->modulo = mod;
23
24        for (int i = 1; i <= n; i++) {
25            power[i] = (power[i - 1] * base) % mod; // Compute power[i] = base^i % mod
26            hash_values[i] = (hash_values[i - 1] * base + word[i - 1]) % mod; // Prefix hash calculation
27        }
28    }
29
30    // Query hash of the substring word[l...r] (1-based index)
31    long long query(int l, int r) {
32        return (hash_values[r] - hash_values[l - 1] * power[r - l + 1] % modulo + modulo) % modulo;
33    }
34};
35
36// The main solution class which includes the minValidStrings method
37class Solution {
38public:
39    // Function to find the minimum number of valid strings to cover the target
40    int minValidStrings(vector<string>& words, string target) {
41        int base = 13331, mod = 998244353; // Constants for hashing
42        Hashing hashing(target, base, mod); // Initialize hashing for target string
43
44        int maxWordLength = 0, n = target.size();
45        for (const string& word : words) {
46            maxWordLength = max(maxWordLength, (int)word.size());
47        }
48
49        // Preprocess all words into a set of hash values per length
50        vector<unordered_set<long long>> substringHashes(maxWordLength + 1);
51        for (const string& w : words) {
52            long long hash_value = 0;
53            for (int j = 0; j < w.size(); j++) {
54                hash_value = (hash_value * base + w[j]) % mod;
55                substringHashes[j + 1].insert(hash_value);
56            }
57        }
58
59        // Lambda function to determine the length of the valid substring from index i
60        auto maxValidSubstring = [&](int i) -> int {
61            int left = 0, right = min(n - i, maxWordLength);
62            while (left < right) {
63                int mid = (left + right + 1) >> 1;
64                long long subHash = hashing.query(i + 1, i + mid);
65                if (substringHashes[mid].count(subHash)) {
66                    left = mid;
67                } else {
68                    right = mid - 1;
69                }
70            }
71            return left;
72        };
73
74        // Algorithm to find the minimum number of valid strings
75        int result = 0, lastConvergedIndex = 0, maxReach = 0;
76        for (int i = 0; i < n; i++) {
77            int distance = maxValidSubstring(i);
78            maxReach = max(maxReach, i + distance);
79
80            if (i == lastConvergedIndex) {
81                if (i == maxReach) {
82                    return -1; // If the current position is the max reach, return -1
83                }
84                lastConvergedIndex = maxReach; // Update last converged point
85                result++; // Increment result as a new string is used
86            }
87        }
88
89        return result; // Return the minimum number of strings used
90    }
91};
92

1// Defining global variables and methods for hashing and solution
2let power: number[] = []; // Precomputed powers of the base
3let hashValues: number[] = []; // Hash values for prefixes of the string
4let modulo: number; // Modulo for hashing
5const base = 13331, mod = 998244353; // Constants for hashing
6
7// Initialize hashing for a given word
8function initHashing(word: string): void {
9    const n = word.length;
10    power = new Array(n + 1);
11    hashValues = new Array(n + 1);
12    power[0] = 1;
13    modulo = mod;
14
15    for (let i = 1; i <= n; i++) {
16        power[i] = (power[i - 1] * base) % mod; // Compute power[i] = base^i % mod
17        hashValues[i] = (hashValues[i - 1] * base + word.charCodeAt(i - 1)) % mod; // Prefix hash calculation
18    }
19}
20
21// Function to get hash of the substring word[l...r] (1-based index)
22function query(l: number, r: number): number {
23    return (hashValues[r] - hashValues[l - 1] * power[r - l + 1] % modulo + modulo) % modulo;
24}
25
26// Function to find the minimum number of valid strings to cover the target
27function minValidStrings(words: string[], target: string): number {
28    initHashing(target); // Initialize hashing for the target string
29
30    let maxWordLength = 0;
31    const n = target.length;
32
33    // Find the maximum length of words in the list
34    for (const word of words) {
35        maxWordLength = Math.max(maxWordLength, word.length);
36    }
37
38    // Preprocess all words into a set of hash values per length
39    const substringHashes: Set<number>[] = Array.from({ length: maxWordLength + 1 }, () => new Set());
40
41    for (const word of words) {
42        let hashValue = 0;
43        for (let j = 0; j < word.length; j++) {
44            hashValue = (hashValue * base + word.charCodeAt(j)) % mod;
45            substringHashes[j + 1].add(hashValue);
46        }
47    }
48
49    // Helper function to determine the length of the valid substring from index i
50    function maxValidSubstring(i: number): number {
51        let left = 0;
52        let right = Math.min(n - i, maxWordLength);
53
54        while (left < right) {
55            const mid = (left + right + 1) >> 1;
56            const subHash = query(i + 1, i + mid);
57            if (substringHashes[mid].has(subHash)) {
58                left = mid;
59            } else {
60                right = mid - 1;
61            }
62        }
63        return left;
64    }
65
66    // Algorithm to find the minimum number of valid strings
67    let result = 0;
68    let lastConvergedIndex = 0;
69    let maxReach = 0;
70
71    for (let i = 0; i < n; i++) {
72        const distance = maxValidSubstring(i);
73        maxReach = Math.max(maxReach, i + distance);
74
75        if (i === lastConvergedIndex) {
76            if (i === maxReach) {
77                return -1; // If the current position is the max reach, return -1
78            }
79            lastConvergedIndex = maxReach; // Update last converged point
80            result++; // Increment result as a new string is used
81        }
82    }
83
84    return result; // Return the minimum number of strings used
85}
86

Time and Space Complexity

The implementation utilizes a substring hashing technique alongside a binary search approach to find valid strings from the target.

The initialization of the Hashing class for the target string runs in O(n), where n is the length of target, to compute the hash and power arrays.
In Solution.minValidStrings, constructing sets of all possible substring hashes for words takes O(L), where L is the total length of all provided strings in words.
The binary search process inside the helper function f(i) runs O(\log n) times for each of the n starting positions in target, with constant-time hash lookups.

Integrating all these parts, the overall time complexity for the function becomes O(n \times \log n + L).

The space complexity mainly arises from storing hashes in s for every substring length and the Hashing class's arrays, which is O(n + L) total.

Learn more about how to find time and space complexity quickly.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Question 1 out of 10

Which of the following array represent a max heap?

20 12 16 1 2 3 4

20 30 10 5 6 7 18

30 20 11 2 3 4 12

50 30 40 10 20 30 40

I don’t know