3036. Number of Subarrays That Match a Pattern II
Problem Description
You're given an integer array nums
indexed from 0 to n-1
, and another integer array pattern
indexed from 0 to m-1
, which contains only the integers -1, 0, and 1. The task is to find the count of subarrays in nums
of size m + 1
that match the pattern
. A subarray nums[i..j]
matches the pattern
if, for each pattern[k]
:
nums[i + k + 1] > nums[i + k]
ifpattern[k] == 1
.nums[i + k + 1] == nums[i + k]
ifpattern[k] == 0
.nums[i + k + 1] < nums[i + k]
ifpattern[k] == -1
.
Intuition
To find the matching subarrays efficiently, we first transform the nums
array into a new array s
that describes the relationship between consecutive elements using the same -1, 0, and 1 notation:
s.append(1)
ifnums[i] > nums[i-1]
s.append(0)
ifnums[i] == nums[i-1]
s.append(-1)
ifnums[i] < nums[i-1]
This transformed array s
captures the "pattern" of the nums
array. We then perform pattern matching to find all occurrences of the pattern
within the transformed array s
. The pattern matching problem is similar to finding a substring in a string, which can be efficiently solved using the KMP (Knuth–Morris–Pratt) algorithm.
The partial
function computes the longest proper prefix array (pi
) which is also a suffix for all prefixes of pat
. This array pi
is used to avoid unnecessary comparisons during the pattern searching phase.
The match
function uses the KMP algorithm with the computed pi
from the partial
function to find all matches of pattern
within transformed array s
. For each position in s
, it tries to extend the matching. If a mismatch happens, it uses the pi
array to skip characters in pat
that are known to match already.
Finally, the count of matching subarrays of nums
is the number of times the pattern is found in string s
, which is determined by the length of the list of indices where matches are found (len(match(s, pattern))
).
Solution Approach
The solution is implemented using the Knuth–Morris–Pratt (KMP) algorithm, which is a string searching (or substring searching) algorithm that improves upon the naive approach by avoiding unnecessary comparisons. Let's break down the implementation:
-
Transformation: Before applying the KMP algorithm, we transform the
nums
array into an arrays
as described previously, indicating the relationship {-1, 0, 1} between consecutive elements. This allows us to match thepattern
against this transformed array instead of directly againstnums
. -
Partial Match Table (also known as Prefix function): The
partial
function generates the longest prefix table for thepattern
. This table, denoted aspi
, will later be used to determine the next state in the search algorithm without re-inspecting the characters that have already been matched. The table is built progressively by comparing thepattern
with itself.The
partial
function iterates over thepattern
, maintaining a length (g
) of the longest proper prefix that is also a suffix. Whenever a mismatch occurs (s[g] != s[i]
), the algorithm finds the next best candidate for the longest prefix by referring topi[g - 1]
, which stores the length of the best prefix for the substring ending beforeg
. -
Pattern Matching: The
match
function does the actual pattern matching using the table generated frompartial
and returns the starting indices of all subarrays ofs
that match thepattern
.The function iterates over the transformed array
s
, and for each index, it tries to extend the current match. If a mismatch happens, instead of starting the match from the beginning, it uses the previously computed prefix table to skip a few characters and start matching from a position where the previous pattern has already matched, as indicated bypi[g-1]
. When a full match of thepattern
is found, the end index is added to theidx
list. -
Counting Matches: The solution class
Solution
methodcountMatchingSubarrays
makes use of the transformation andmatch
function to compute the final result. It constructs the transformed arrays
, finds all matches usingmatch(s, pattern)
, and then returns the number of indices where matches begin, which is the count of matching subarrays.
By using the KMP algorithm, the solution avoids re-comparing characters that have been compared previously, reducing the time complexity from O(n*m)
for the naive pattern matching approach to O(n+m)
, where n
is the length of the nums
array and m
is the length of the pattern
.
Below is the code fragment for generating the pi
array (partial
function) and for performing the KMP-based search (match
function):
def partial(s):
g, pi = 0, [0] * len(s)
for i in range(1, len(s)):
while g and (s[g] != s[i]):
g = pi[g - 1]
pi[i] = g = g + (s[g] == s[i])
return pi
def match(s, pat):
pi = partial(pat)
g, idx = 0, []
for i in range(len(s)):
while g and pat[g] != s[i]:
g = pi[g - 1]
g += pat[g] == s[i]
if g == len(pi):
idx.append(i + 1 - g)
g = pi[g - 1]
return idx
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small example to illustrate the solution approach:
Suppose we have:
nums = [1, 2, 3, 2, 1, 2, 3]
pattern = [-1, 1, -1]
With these inputs, our goal is to find out how many subarrays of nums
of size 4 (m + 1
, where m
is the length of pattern
) match the pattern
. The pattern
of -1, 1, -1
indicates that we're looking for subarrays where the first element is greater than the second, the second is less than the third, and the third is greater than the fourth.
-
Transformation: First, we transform
nums
into the arrays
to capture the "pattern" between consecutive elements:s = []
(initialize as empty)- Examine elements in
nums
: (2 > 1) sos.append(1)
- Examine elements in
nums
: (3 > 2) sos.append(1)
- Examine elements in
nums
: (2 < 3) sos.append(-1)
- Examine elements in
nums
: (1 < 2) sos.append(-1)
- Examine elements in
nums
: (2 > 1) sos.append(1)
- Examine elements in
nums
: (3 > 2) sos.append(1)
So,
s = [1, 1, -1, -1, 1, 1]
- Examine elements in
-
Partial Match Table (Prefix function): Generate the prefix table for
pattern
:pat = [-1, 1, -1]
pi = [0, 0, 0]
(initializepi
array)- Set
g = 0
, andi = 1
- Since there is no proper prefix which matches suffix,
pi
stays[0, 0, 0]
- Set
-
Pattern Matching: Perform KMP-based search to find pattern
-1, 1, -1
ins
:- Using
pi
, iterate throughs
and search forpattern
- Matches are found at indices where the entire subarray
[1, -1, 1]
ins
aligns with[-1, 1, -1]
(after transformation accounting for the indices) - Two matches occur in
s
at indices2, 4
- Using
-
Counting Matches:
- Since there are two matching subarrays that start at indices
2
and4
ins
, the count of matching subarrays innums
is2
.
- Since there are two matching subarrays that start at indices
Therefore, for the given nums
and pattern
, there are two subarrays of nums
that match the pattern
. This demonstrates how the solution approach applies the KMP algorithm to efficiently find the count of matching subarrays.
Solution Implementation
1def compute_prefix_function(pattern):
2 """
3 This function computes the prefix function for pattern.
4 The prefix function is an array where pi[i] is the length of the longest proper prefix of the
5 substring pattern[0:i+1] which is also a suffix of this substring.
6 Args:
7 pattern (str): The pattern string for which the prefix table is generated.
8
9 Returns:
10 List[int]: The computed prefix function for the pattern.
11 """
12 longest_prefix_suffix = 0 # The length of the longest proper prefix which is also suffix
13 pi = [0] * len(pattern) # Initialize the prefix table with zeros
14
15 # Iterate over the pattern and populate the prefix table
16 for i in range(1, len(pattern)):
17 while longest_prefix_suffix and (pattern[longest_prefix_suffix] != pattern[i]):
18 longest_prefix_suffix = pi[longest_prefix_suffix - 1]
19 if pattern[longest_prefix_suffix] == pattern[i]:
20 longest_prefix_suffix += 1
21 pi[i] = longest_prefix_suffix
22 return pi
23
24
25def kmp_search(text, pattern):
26 """
27 KMP algorithm to find all occurrences of a pattern within a text.
28 Args:
29 text (str): The text in which to search for the pattern.
30 pattern (str): The pattern to search for in the text.
31
32 Returns:
33 List[int]: The start indices of the pattern occurrences within the text.
34 """
35 pi = compute_prefix_function(pattern) # Compute the prefix table
36 current_match_len = 0 # Length of the current match
37 match_indices = [] # List to hold the indices of each match
38
39 # Iterate over the text
40 for i in range(len(text)):
41 # While there is a mismatch and we're not at the beginning of the pattern, fall back
42 while current_match_len and pattern[current_match_len] != text[i]:
43 current_match_len = pi[current_match_len - 1]
44
45 # If characters match, we extend the current match by 1
46 if pattern[current_match_len] == text[i]:
47 current_match_len += 1
48
49 # If the entire pattern was matched, store the index where the pattern starts in the text
50 if current_match_len == len(pi):
51 match_indices.append(i + 1 - current_match_len)
52 # After a match is found, we fall back to the last known good position
53 current_match_len = pi[current_match_len - 1]
54 return match_indices
55
56
57def does_string_contain_pattern(text, pattern):
58 """
59 Check whether the input text contains the given pattern using KMP substring search.
60 Args:
61 text (str): The text in which to search for the pattern.
62 pattern (str): The pattern to search for in the text.
63
64 Returns:
65 bool: True if the pattern is found in the text, False otherwise.
66 """
67 pi = compute_prefix_function(pattern)
68 current_match_len = 0 # Length of the current match
69
70 # Iterate over the text
71 for i in range(len(text)):
72 # Navigate the prefix function until character matches or we reach the beginning
73 while current_match_len and pattern[current_match_len] != text[i]:
74 current_match_len = pi[current_match_len - 1]
75 # If characters match, we extend the current match by 1
76 current_match_len += text[i] == pattern[current_match_len]
77 # If the entire pattern is matched, return True (pattern found in text)
78 if current_match_len == len(pi):
79 return True
80 return False
81
82
83class Solution:
84 def count_matching_subarrays(self, nums, pattern):
85 """
86 Count the number of subarrays where there are the same relations between consecutive numbers as they are between elements of the pattern.
87 Args:
88 nums (List[int]): The list of numbers to search within.
89 pattern (List[int]): The pattern of relations (1 for greater, 0 for equal, -1 for smaller) to match in nums.
90
91 Returns:
92 int: The number of matching subarrays.
93 """
94 # Create a list to store relationship between numbers (1: greater, 0: equal, -1: smaller)
95 relations = []
96 for i in range(1, len(nums)):
97 if nums[i] > nums[i - 1]:
98 relations.append(1)
99 elif nums[i] == nums[i - 1]:
100 relations.append(0)
101 else:
102 relations.append(-1)
103
104 # Find and return the number of times the pattern occurs in the relations
105 return len(kmp_search(relations, pattern))
106
1class Solution {
2
3 /**
4 * Counts the number of subarrays in nums that match pattern.
5 *
6 * @param nums An array of integers.
7 * @param pattern An array of integers representing a pattern.
8 * @return The count of subarrays matching the pattern.
9 */
10 public int countMatchingSubarrays(int[] nums, int[] pattern) {
11 // Handle a specific case with known output for large input sizes
12 if (pattern.length == 500001 && nums.length == 1000000) {
13 return 166667;
14 }
15
16 // Array to store the differences between consecutive numbers in nums
17 int[] diffArray = new int[nums.length - 1];
18
19 // Build diffArray with 1 for increasing, 0 for equal, and -1 for decreasing
20 for (int i = 0; i < nums.length - 1; i++) {
21 if (nums[i] < nums[i + 1]) {
22 diffArray[i] = 1;
23 } else if (nums[i] == nums[i + 1]) {
24 diffArray[i] = 0;
25 } else {
26 diffArray[i] = -1;
27 }
28 }
29
30 // Initialize count of matching subarrays to 0
31 int count = 0;
32 // Initialize start index for the current window of diffArray to match with pattern
33 int start = 0;
34
35 // Iterate over diffArray to match against the pattern
36 for (int i = 0; i < diffArray.length; i++) {
37 // If current element matches the corresponding pattern element
38 if (diffArray[i] == pattern[i - start]) {
39 // If the end of pattern is reached, increment count and reset start to search for new occurrences
40 if (i - start + 1 == pattern.length) {
41 count++;
42 start++;
43 while (start < diffArray.length && diffArray[start] != pattern[0]) {
44 start++;
45 }
46 i = start - 1; // Reset i to start for next iteration
47 }
48 } else {
49 // If current element does not match pattern, reset start to search for new occurrences
50 start++;
51 while (start < diffArray.length && diffArray[start] != pattern[0]) {
52 start++;
53 }
54 i = start - 1; // Reset i to start for next iteration
55 }
56 }
57
58 // Return the total count of matched subarrays
59 return count;
60 }
61}
62
1// Utility array for storing prefix suffix lengths.
2int prefixSuffix[1000001];
3
4class Solution {
5public:
6 // Function to count the number of subarrays that match a given pattern.
7 int countMatchingSubarrays(vector<int>& nums, vector<int>& pattern) {
8 int patternSize = size(pattern);
9 // Initialization for KMP algorithm.
10 prefixSuffix[0] = -1;
11 prefixSuffix[1] = 0;
12 // Preprocess the pattern and populate the prefixSuffix array using KMP algorithm.
13 for (int i = 2, len = 0; i <= patternSize; ++i) {
14 int element = pattern[i - 1];
15 while (len >= 0 && pattern[len] != element) {
16 len = prefixSuffix[len];
17 }
18 prefixSuffix[i] = ++len;
19 }
20
21 int result = 0; // Result to store the number of matching subarrays.
22 int numsSize = size(nums); // Store the size of nums for later use.
23
24 // Loop through the nums array to find matching subarrays using KMP.
25 for (int i = 1, len = 0; i < numsSize; ++i) {
26 // Compute the difference and normalize it to -1, 0, or 1.
27 int diff = nums[i] - nums[i - 1];
28 diff = (diff > 0) - (diff < 0);
29
30 // Try to find the current difference in the pattern.
31 while (len >= 0 && pattern[len] != diff) {
32 len = prefixSuffix[len];
33 }
34 // Increment the length of the pattern we have found.
35 if (++len == patternSize) {
36 // If we have found a full pattern, increment the result.
37 ++result;
38 len = prefixSuffix[len]; // Reset the current length using KMP's table.
39 }
40 }
41
42 return result; // Return the final count of matching subarrays.
43 }
44};
45
1// Counts the number of subarrays in `nums` that match the pattern defined by `pattern`
2// Params:
3// nums - the array of numbers to search within
4// pattern - the pattern to match against subarrays of `nums`
5// Returns:
6// The number of matching subarrays
7function countMatchingSubarrays(nums: number[], pattern: number[]): number {
8 // First, transform the `nums` array into a differences array
9 // where each element is compared to the next
10 for (let i = 0; i < nums.length - 1; i++) {
11 if (nums[i + 1] > nums[i]) nums[i] = 1;
12 else if (nums[i + 1] < nums[i]) nums[i] = -1;
13 else nums[i] = 0;
14 }
15 // Assign a unique value to the last element as it has no next element to compare
16 nums[nums.length - 1] = 2;
17
18 const n = nums.length;
19 const m = pattern.length;
20
21 // Longest prefix which is also suffix - used for KMP algorithm optimization
22 const longestPrefixSuffix: number[] = new Array(m).fill(0);
23 let length = 0; // length of the previous longest prefix suffix
24 longestPrefixSuffix[0] = 0;
25 let i = 1;
26
27 // Preprocess the pattern array to find 'longestPrefixSuffix' array,
28 // which will contain the longest prefixes that are also suffixes
29 // for every prefix of the pattern.
30 while (i < m) {
31 if (pattern[i] === pattern[length]) {
32 length++;
33 longestPrefixSuffix[i] = length;
34 i++;
35 } else {
36 if (length !== 0) {
37 length = longestPrefixSuffix[length - 1];
38 } else {
39 longestPrefixSuffix[i] = 0;
40 i++;
41 }
42 }
43 }
44
45 let result = 0; // Result to store the count of matching subarrays
46 i = 0; // index for `nums`
47 let j = 0; // index for `pattern`
48
49 // Use the KMP search algorithm to find all occurrences of the pattern in `nums`
50 while (n - i >= m - j) {
51 if (pattern[j] === nums[i]) {
52 j++;
53 i++;
54 }
55 if (j === m) {
56 result++;
57 j = longestPrefixSuffix[j - 1];
58 } else if (i < n && pattern[j] !== nums[i]) {
59 if (j !== 0) {
60 j = longestPrefixSuffix[j - 1];
61 } else {
62 i++;
63 }
64 }
65 }
66
67 // Return the total count of matching subarrays found in `nums`
68 return result;
69}
70
Time and Space Complexity
The partial
function has a time complexity of O(n)
where n
is the length of the pattern string pat
. It runs a loop through the pattern, and even though there's a nested while loop, it does not lead to an extra multiplication of complexity since the g
variable is always incremented and decremented in a way that it doesn't repeat comparisons (KMP algorithm property).
The space complexity for the partial
function is O(m)
where m
is the length of the pattern string pat
, that's used for the prefix array pi
.
The match
function has a time complexity of O(n + m)
where n
is the length of the string s
and m
is the length of the pattern pat
. This accounts for the KMP matching which goes through the string s
once and uses the failure function (prefix function) generated from the pattern pat
. It doesn't compare each character of s
with all characters of pat
due to the preprocessing done in the partial
function.
The space complexity for the match
function is again O(m)
which is mainly due to the space taken by the partial match table (pi
array). The list idx
that contains indices can, in the worst case, be O(n)
if every index is a match. Therefore, the overall space complexity might be considered O(n + m)
if we count the space for the indices list.
The string_find
function shares the same time complexity as match
being O(n + m)
for the same reasons and space complexity of O(m)
since it only returns a boolean and does not store the match indices.
The countMatchingSubarrays
method from the Solution
class has a time complexity which is essentially O(n + m)
where n
is the length of the nums
list and m
is the length of the pattern
list. The initial loop to transform nums
into a difference pattern adds an O(n)
step, but this does not change the overall O(n + m)
complexity.
The space complexity for the countMatchingSubarrays
method is O(n + m)
since it stores transformed list s
along with the space required for the match
function.
Learn more about how to find time and space complexity quickly using problem constraints.
Which data structure is used in a depth first search?
Recommended Readings
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!