1297. Maximum Number of Occurrences of a Substring
Problem Description
The problem is to find the substring that appears the most times in a given string s
. The catch is that these substrings have to comply with two rules:
- The quantity of unique characters in each substring must be less than or equal to
maxLetters
. - The length of the substring must be at least
minSize
and at mostmaxSize
.
A key thing to note is that though substrings can have lengths anywhere from minSize
to maxSize
, we are interested in finding the maximum number of occurrences, and typically substrings with smaller lengths will have higher chances of repeated occurrence. The solution should return the maximum frequency of any such substring found in s
that follows the above rules.
Intuition
When dissecting the problem, we focus on the constraints - unique characters within the substring and the substring's size. The fact that we're also looking for maximum occurrence leads to a strategy where smaller substrings are more likely to be repeated; hence, we prioritize evaluating substrings of minSize
.
The intuition for the solution is to slide a window of size minSize
across the string s
and use a set to track the unique characters and a counter (hashmap) to count occurrences of each substring that fits the rules. For each window, we check the following:
- If the set size is larger than
maxLetters
, we ignore this substring as it doesn't meet the first rule. - If the set size is within the limits, we add the substring to the counter and check if it's the most frequent one we've seen so far.
This way we iterate over the string once (O(n)
time complexity, where n
is the length of the string), and only consider substrings of length minSize
because regardless of maxSize
, any repeated instances of a longer substring will always contain a repeated minSize
substring within it. Thus, minSize
is the key to finding the maximum occurrence.
Learn more about Sliding Window patterns.
Solution Approach
The solution uses a sliding window algorithm and two data structuresโa set and a counter from Python's collections
module. The set is used to keep track of the unique characters within the current minSize
window of the string, while the Counter
is a hashmap to keep track of the frequency of each unique substring that fits the criteria.
Here are the steps in the implementation:
- Initialize
cnt
as aCounter
object to keep track of the frequency of each qualifying substring. - Initialize
ans
as an integer to keep track of the maximum frequency found. - Iterate over the string with a variable
i
ranging from0
tolen(s) - minSize + 1
. This is your sliding window that will only consider substrings of sizeminSize
. - At each iteration, create a substring
t
which is a slice ofs
starting from indexi
toi + minSize
. - For the substring
t
, create a setss
to determine the number of unique characters it contains. - A crucial optimization here is that if
len(ss)
is greater than themaxLetters
, the substring does not satisfy the required constraints, and no further action is taken for that window. - If
len(ss)
is less than or equal tomaxLetters
, add/subtract one to thecnt[t]
for that substring sequence, which increases the count of how many timest
has been seen. - Update
ans
which keeps track of the highest frequency seen so far by comparing it with the frequency count oft
incnt
. - After the loop finishes,
ans
will hold the highest frequency of occurrence of any valid substring and is returned.
The algorithm sweeps through the string only once, making it efficient with time complexity of O(n)
(for the loop) times O(k)
(for the set creation, where k
is at most minSize
), and it only considers substrings of length minSize
due to the provided insight that longer substrings will naturally contain these smaller, frequently occurring substrings if they are to be frequent themselves. The space complexity mainly depends on the number of unique substrings of length minSize
that can be formed from s
, which in worst cases is O(n)
.
Overall, the algorithm efficiently finds the most frequent substring of size within the given bounds that also contains no more than maxLetters
unique characters.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a simple example with the following parameters:
s
: "ababab"maxLetters
: 2minSize
: 2maxSize
: 3
Following the solution approach:
- We initialize
cnt
as an emptyCounter
object andans
as 0. - We then iterate over
s
using a window size ofminSize
. In this example,minSize
is 2, so we will be sliding through the string two characters at a time. - In the first iteration,
i
is 0, and our substringt
is "ab" (from index 0 to 1). We create a setss
containing unique characters in "ab", which are{'a', 'b'}
. - Since the size of
ss
is 2 and does not exceedmaxLetters
, which is also 2, we proceed to increment the count of this substring incnt
:cnt["ab"] += 1
. - Now we slide the window by one and repeat. Our next substring is "ba" (from index 1 to 2), and
ss
will again be{'b', 'a'}
. We incrementcnt["ba"]
. - This process is repeated for the entire string: we then check "ab" again, and "ba" again. Each time, we are incrementing the count in
cnt
for the respective substring. - After we've processed each substring, we observe that "ab" and "ba" each have a count of 3 in
cnt
. ans
is then updated to be the maximum of the values incnt
, which in this case is 3.
The final answer, held by ans
, is 3, indicating that the most frequent substrings of length minSize
are "ab" and "ba," each appearing 3 times in the string s
. Our example is now complete, demonstrating how the sliding window and counter work together to find the most frequent qualifying substring.
Solution Implementation
1from collections import Counter
2
3class Solution:
4 def maxFreq(self, s: str, maxLetters: int, minSize: int, maxSize: int) -> int:
5 # This function finds the maximum frequency of any substring that meets the criteria
6
7 # Initialize the variable to store the maximum frequency found
8 max_frequency = 0
9
10 # Create a Counter object to keep track of the frequencies of the substrings
11 substring_counter = Counter()
12
13 # Loop through the string to check all possible substrings of length minSize
14 for i in range(len(s) - minSize + 1):
15
16 # Extract a substring of length minSize
17 substring = s[i: i + minSize]
18
19 # Create a set of unique characters in the substring
20 unique_chars = set(substring)
21
22 # If the number of unique characters is less than or equal to maxLetters
23 if len(unique_chars) <= maxLetters:
24 # Increment the frequency count for this substring
25 substring_counter[substring] += 1
26
27 # Update max_frequency with the maximum value between the current max
28 # and the frequency of the current substring
29 max_frequency = max(max_frequency, substring_counter[substring])
30
31 # Return the maximum frequency
32 return max_frequency
33
1class Solution {
2 public int maxFreq(String s, int maxLetters, int minSize, int maxSize) {
3 // Initialize the answer variable to store the maximum frequency.
4 int maxFrequency = 0;
5 // Create a HashMap to store the frequency of substrings.
6 Map<String, Integer> substringFrequency = new HashMap<>();
7 // Loop through the string to find valid substrings of length minSize.
8 for (int start = 0; start <= s.length() - minSize; ++start) {
9 // Extract substring of size `minSize` from the original string.
10 String substring = s.substring(start, start + minSize);
11 // HashSet to track unique characters in the current substring.
12 Set<Character> uniqueChars = new HashSet<>();
13 // Calculate the number of unique characters in the substring.
14 for (int j = 0; j < minSize; ++j) {
15 // Add unique characters in the substring to the set.
16 uniqueChars.add(substring.charAt(j));
17 }
18 // Check if the number of unique characters meets the requirement.
19 if (uniqueChars.size() <= maxLetters) {
20 // Increment the count of this valid substring in the map.
21 substringFrequency.put(substring, substringFrequency.getOrDefault(substring, 0) + 1);
22 // Update the maximum frequency with the highest count found so far.
23 maxFrequency = Math.max(maxFrequency, substringFrequency.get(substring));
24 }
25 }
26 // Return the maximum frequency of any valid substring.
27 return maxFrequency;
28 }
29}
30
1#include <string>
2#include <unordered_map>
3#include <unordered_set>
4#include <algorithm>
5
6class Solution {
7public:
8 int maxFreq(string s, int maxLetters, int minSize, int maxSize) {
9 // Initialize the answer to be returned
10 int maxFrequency = 0;
11
12 // Make use of an unordered_map to count the occurrences of substrings
13 unordered_map<string, int> substringCounts;
14
15 // Loop through the string, only up to the point where minSize substrings can still be formed
16 for (int i = 0; i <= s.size() - minSize; ++i) {
17
18 // Extract the substring of length minSize starting at index i
19 string t = s.substr(i, minSize);
20
21 // Create a set of unique characters (ss) from the substring
22 unordered_set<char> uniqueChars(t.begin(), t.end());
23
24 // Check if the number of unique characters does not exceed maxLetters
25 if (uniqueChars.size() <= maxLetters) {
26
27 // If the conditions are met, increment the count for this substring
28 // and update the maxFrequency with the maximum value
29 maxFrequency = std::max(maxFrequency, ++substringCounts[t]);
30 }
31 }
32
33 // Return the maximum frequency found for any substring that meets the criteria
34 return maxFrequency;
35 }
36};
37
1// Import necessary elements from 'collections' for Map and Set
2import { Map, Set } from 'collections';
3
4// Function to calculate the maximum frequency of any substring meeting certain criteria
5function maxFreq(s: string, maxLetters: number, minSize: number, maxSize: number): number {
6 // Initialize the answer to be returned
7 let maxFrequency: number = 0;
8
9 // Make use of a Map to count the occurrences of substrings
10 let substringCounts: Map<string, number> = new Map<string, number>();
11
12 // Loop through the string, only up to the point where minSize substrings can still be formed
13 for (let i = 0; i <= s.length - minSize; ++i) {
14
15 // Extract the substring of length minSize starting at index i
16 let t: string = s.substring(i, i + minSize);
17
18 // Create a set of unique characters from the substring
19 let uniqueChars: Set<string> = new Set<string>(t.split(''));
20
21 // Check if the number of unique characters does not exceed maxLetters
22 if (uniqueChars.size <= maxLetters) {
23
24 // If the conditions are met, increment the count for this substring
25 let count: number = substringCounts.get(t) || 0;
26 substringCounts.set(t, count + 1);
27
28 // Update the maxFrequency with the maximum value
29 maxFrequency = Math.max(maxFrequency, count + 1);
30 }
31 }
32
33 // Return the maximum frequency found for any substring that meets the criteria
34 return maxFrequency;
35}
36
Time and Space Complexity
The time complexity of the code is O(n * minSize)
where n
is the length of the string s
. This is because the code iterates over the string in slices of length minSize
which takes O(minSize)
time for each slice, and this is done for all starting points in the string, of which there are n - minSize + 1
.
The space complexity of the code is O(n)
in the worst case. This is due to the fact that the Counter
could potentially store every unique substring of length minSize
in the worst scenario where all possible substrings are unique. Since the length of each substring is bounded by minSize
, this does not affect the order of space complexity.
Learn more about how to find time and space complexity quickly using problem constraints.
Consider the classic dynamic programming of longest increasing subsequence:
Find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.
For example, the length of LIS for [50, 3, 10, 7, 40, 80]
is 4
and LIS is
[3, 7, 40, 80]
.
What is the recurrence relation?
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in
Patterns The Shortest Path Algorithm for Coding Interviews The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we can determine the
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Got a question?ย Ask the Monster Assistantย anything you don't understand.
Still not clear? ย Submitย the part you don't understand to our editors. Or join ourย Discord and ask the community.