2781. Length of the Longest Valid Substring
Problem Description
You're given a string word
and an array of strings forbidden
. The goal is to find the length of the longest substring within word
that doesn't contain any sequence of characters present in forbidden
. A substring is simply a contiguous sequence of characters within another string and it can even be an empty string.
To make a string "valid" in the context of this problem, none of its substrings can match any string present in forbidden
. It's your task to figure out what's the maximum length of such a "valid" substring within the given word
.
Intuition
The intuition behind the solution revolves around the concept of dynamic window sliding coupled with efficient substring checking. The brute force method of checking every possible substring would be too slow, especially since string word
can be quite large. Instead, we need to find a quicker way of validating substrings against the forbidden
list.
We create a sliding window that expands and contracts as needed while scanning the word
from left to right. The set s
allows for constant-time checks if a substring is in forbidden
āthis is much faster than repeatedly scanning a list.
Our approach is as follows:
- Start with two pointers
i
andj
, initializingi
to the beginning of the word andj
to increment over each character. - At every step, check the current window from
j
down toi
to see if any substring is inforbidden
. If it is, movei
past the forbidden substring's start to make a valid window again. - Keep track of the maximum valid window seen so far as
ans
.
One key optimization lies in the knowledge that the forbidden substrings have a maximum possible length. Thus the inner loop doesn't need to check further back than this maximum length from j
, which prevents needless comparisons and significantly speeds up the algorithm.
This solution effectively balances between the thoroughness of checking all relevant substrings and the speed of skipping unnecessary checks, achieving the task within acceptable time complexity.
Learn more about Sliding Window patterns.
Solution Approach
In the provided solution, a set data structure is utilized to store the forbidden
strings. The choice of a set is crucial because it provides O(1) lookup time complexity, which means checking if a substring is in forbidden
is very fast, regardless of the size of forbidden
.
Here is the step-by-step explanation of the code:
- The set
s
is created from theforbidden
array to take advantage of fast lookups. - Two pointers are introduced:
i
is initialized to 0, pointing to the beginning of the current valid substring.j
is used to iterate through each character inword
.
- A variable
ans
is used to keep track of the length of the longest valid substring found so far. - We iterate over the string
word
with the help ofj
.- During each iteration, we look for any potential
forbidden
substrings within a window --- the substring from[k : j + 1]
, wherek
will iterate backwards fromj
but not further back than 10 places (owing to the maximum length of forbidden substrings) or beforei
, whichever is lesser. - The use of
max(j - 10, i - 1)
ensures that we don't check beyond the necessary window backwards from the current character.
- During each iteration, we look for any potential
- If a forbidden substring is found,
i
is moved to one character past the start of the forbidden substring. This effectively skips the forbidden part, searching for the next valid starting point. ans
is updated to be the maximum of its current value orj - i + 1
, which represents the length of the new valid substring from the current positionj
back toi
.- The iteration continues until the end of
word
is reached, withans
being updated as necessary. - At the end,
ans
holds the length of the longest valid substring, and it is returned as the solution.
The key algorithmic pattern here is the sliding window, employed to dynamically adjust the substring being inspected. Effective use of data structuring and algorithmic patterns are what make this solution both elegant and efficient.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's take an example to understand how this solution works. Suppose we have word = "abcdefgh"
and forbidden = ["bc", "fg"]
. We want to find the longest substring of word
that doesn't contain any sequence of characters from forbidden
.
We start by creating a set s
containing forbidden
strings for quick lookup: s = {"bc", "fg"}
. We initialize two pointers: i
at index 0, and j
which will traverse through word
.
The ans
variable starts at 0 to keep track of the length of the longest valid substring found so far. Now, we begin iterating over word
:
-
With
j
at 0, our window is ["a"], and since it doesn't match any forbidden strings, we move on. -
With
j
at 1, our window is ["ab"], which is also not forbidden. We continue. -
With
j
at 2, our window is ["abc"], and checking from the previous 10 characters (or fromi
if less than 10), we find a match withs
because "bc" is in the window. We updatei
to one position after where "bc" started, soi
becomes 2. -
Now
j
moves to 3, and our new window starts fromi
(2) toj
(3), which is ["cd"]. It is valid, so we continue. -
With
j
at 4, our window is ["cde"], still valid. -
With
j
moving up to 5, our window is now ["cdef"], which is still not inforbidden
. -
At
j
= 6, our window becomes ["cdefg"], which contains "fg". We updatei
to move past the start of "fg", makingi
= 6. -
As
j
moves to 7, our window is ["h"], sincei
was just updated to 6. This window is valid.
Throughout the process, we update ans
to be the maximum length of valid substrings we have found. The sequence "cde" was the longest valid substring before we encountered "fg", with a length of 3. Therefore, ans
is 3.
In this case, the longest substring of word
without any forbidden
sequences would be "cde" which has a length of 3. This is the value we return.
Solution Implementation
1class Solution:
2 def longestValidSubstring(self, word: str, forbidden: list[str]) -> int:
3 # Create a set from the `forbidden` list for O(1) access time during checks.
4 forbidden_set = set(forbidden)
5
6 # Initialize the maximum length of the valid substring (ans) and the starting index (start) of the current substring.
7 ans = start = 0
8
9 # Iterate over each character index `end` of the `word`.
10 for end in range(len(word)):
11 # Check all substrings that end at `end` and do not exceed 10 characters in length.
12 # `max(end - 10, start - 1)` ensures we only consider substrings of up to 10 characters
13 # and do not reexamine substrings already invalidated by a forbidden word.
14 for check_start in range(end, max(end - 10, start - 1), -1):
15 # If the substring from `check_start` to `end` (inclusive) is in the forbidden set,
16 # update the `start` index to one position after `check_start`
17 # and break from the loop to start checking the next end position.
18 if word[check_start:end + 1] in forbidden_set:
19 start = check_start + 1
20 break
21
22 # Calculate the length of the current valid substring and update `ans` if it's greater than the previous maximum.
23 ans = max(ans, end - start + 1)
24
25 # Return the length of the longest valid substring that doesn't contain a forbidden word.
26 return ans
27
1class Solution {
2 public int longestValidSubstring(String word, List<String> forbiddenSubstrings) {
3 // A set to store the forbidden substrings for faster lookup
4 Set<String> forbiddenSet = new HashSet<>(forbiddenSubstrings);
5
6 // Initialize variables for the answer and the length of the word
7 int maxValidLength = 0;
8 int wordLength = word.length();
9
10 // Two pointers for iterating through the string
11 // i - start of the current valid substring
12 // j - end of the current valid substring
13 for (int i = 0, j = 0; j < wordLength; ++j) {
14 // Check substrings within the current window from [i,j]
15 for (int k = j; k > Math.max(j - 10, i - 1); --k) {
16 // If a forbidden substring is found within this window
17 if (forbiddenSet.contains(word.substring(k, j + 1))) {
18 // Move the start index after the forbidden substring
19 i = k + 1;
20 break;
21 }
22 }
23 // Calculate the length of the current valid substring
24 // And update the maximum if this is the longest so far
25 maxValidLength = Math.max(maxValidLength, j - i + 1);
26 }
27
28 // Return the length of the longest valid substring found
29 return maxValidLength;
30 }
31}
32
1#include <string>
2#include <vector>
3#include <unordered_set>
4using namespace std;
5
6class Solution {
7public:
8 /**
9 * Finds the longest valid substring that does not contain any forbidden substrings.
10 * @param word The main string in which to look for the substring.
11 * @param forbidden A vector of forbidden substrings.
12 * @return The length of the longest valid substring.
13 */
14 int longestValidSubstring(string word, vector<string>& forbidden) {
15 // Convert the forbidden vector to a set for efficient lookup
16 unordered_set<string> forbiddenSet(forbidden.begin(), forbidden.end());
17
18 // Variable to store the answer, i.e., the length of the longest valid substring
19 int maxLength = 0;
20 // Variable to store the length of the word
21 int wordLength = word.size();
22
23 // Two pointers technique to iterate through the string
24 // `start` denotes the start index of the current valid substring
25 // `end` denotes the end index of the current valid substring
26 for (int start = 0, end = 0; end < wordLength; ++end) {
27 // Check all possible forbidden substrings within the last 10 characters
28 for (int k = end; k > max(end - 10, start - 1); --k) {
29 // If a forbidden substring is found, move the start pointer past it
30 if (forbiddenSet.count(word.substr(k, end - k + 1))) {
31 start = k + 1;
32 break;
33 }
34 }
35 // Update the maximum length with the length of the current valid substring
36 maxLength = max(maxLength, end - start + 1);
37 }
38 // Return the length of the longest valid substring found
39 return maxLength;
40 }
41};
42
1// Defines a function to find the length of the longest substring without any forbidden words.
2function longestValidSubstring(word: string, forbidden: string[]): number {
3 // Create a Set to efficiently check if a substring is forbidden.
4 const forbiddenSet: Set<string> = new Set(forbidden);
5 // Get the length of the word to iterate through.
6 const wordLength = word.length;
7 // Initialize the answer to store the length of the longest valid substring.
8 let maxLength = 0;
9
10 // Initialize two pointers for the sliding window technique.
11 // 'start' tracks the start index of the current substring.
12 // 'end' tracks the end index of the current substring.
13 for (let start = 0, end = 0; end < wordLength; ++end) {
14 // Check all possible substrings from the current 'end' to 'start',
15 // but we limit the substring length to a maximum of 10 characters.
16 for (let k = end; k > Math.max(end - 10, start - 1); --k) {
17 // If a forbidden substring is found, move 'start' past the forbidden word,
18 // effectively shrinking the current window and continue searching.
19 if (forbiddenSet.has(word.substring(k, end + 1))) {
20 start = k + 1;
21 break;
22 }
23 }
24 // Update the length of the longest valid substring found.
25 maxLength = Math.max(maxLength, end - start + 1);
26 }
27 // Return the length of the longest valid substring.
28 return maxLength;
29}
30
Time and Space Complexity
Time Complexity
The time complexity of the given code can be analyzed by going through the nested loops and operations within them. The outer loop runs for len(word)
times, where 'word' is the input string and its length is denoted by 'n'. The inner loop at worst case runs for a maximum of 10 iterations since it looks back from the current position j
only up to 10 characters because of max(j - 10, i - 1)
condition.
Hence, in the worst-case scenario, for each character we might look at the next 10 characters to check if the substring is in the forbidden set. Since checking whether a substring is in a set is O(1) operation due to hashing, the worst case would consider these constant time checks up to 10 times for each character.
Combining both loops, the worst-case time complexity can therefore be simplified to O(n)
, where each character is checked against a constant number of potential forbidden substrings.
Space Complexity
The space complexity consists of the space required for the set 's' and a few integer variables. Since 's' contains the forbidden substrings, its space complexity will be O(m), where 'm' is the total length of all forbidden substrings combined.
The other variables, 'ans', 'i', and 'j' are constant space, adding O(1) to the total space complexity.
Therefore, the total space complexity of the code is O(m + 1)
which simplifies to O(m)
, denoting the space taken up by the set of forbidden substrings.
Learn more about how to find time and space complexity quickly using problem constraints.
Depth first search is equivalent to which of the tree traversal order?
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first