1297. Maximum Number of Occurrences of a Substring
Problem Description
You are given a string s
and three integer parameters: maxLetters
, minSize
, and maxSize
. Your task is to find the maximum number of times any substring appears in s
, where the substring must satisfy two conditions:
-
Character limit: The substring must contain at most
maxLetters
unique characters. For example, ifmaxLetters = 2
, then substrings like "aa", "ab", "bb" are valid (having 1 or 2 unique characters), but "abc" is not valid (having 3 unique characters). -
Length constraint: The length of the substring must be between
minSize
andmaxSize
(inclusive).
The goal is to find which valid substring appears the most frequently in s
and return that frequency count.
For example, if s = "aababcaab"
, maxLetters = 2
, minSize = 3
, and maxSize = 4
:
- The substring "aab" has 2 unique characters ('a' and 'b') and length 3, which satisfies both conditions
- "aab" appears 2 times in the string (at positions 0-2 and 6-8)
- This would be the maximum frequency among all valid substrings
The key insight in the solution is that if a longer substring meets the conditions and appears multiple times, then any of its substrings of length minSize
will also meet the conditions and appear at least as many times. Therefore, we only need to check substrings of length exactly minSize
to find the maximum frequency.
Intuition
The key observation is that if a substring of length k
(where minSize ≤ k ≤ maxSize
) appears n
times in the string, then every substring of length minSize
within that longer substring will also appear at least n
times.
To understand why, consider a substring of length 5 that appears 3 times. Any substring of length 3 within those 5 characters will also appear at least 3 times (at the same positions where the length-5 substring appears). This means that checking longer substrings is redundant - we'll never find a longer substring that has a higher frequency than the best substring of length minSize
.
Additionally, if a longer substring satisfies the unique character constraint (having at most maxLetters
unique characters), then any of its substrings will also satisfy this constraint, since a substring cannot have more unique characters than the string containing it.
Therefore, instead of checking all possible substring lengths from minSize
to maxSize
, we can optimize by only checking substrings of exactly length minSize
. This significantly reduces the search space while guaranteeing we find the maximum frequency.
The approach becomes straightforward:
- Generate all substrings of length
minSize
- For each substring, check if it has at most
maxLetters
unique characters - Count the frequency of valid substrings using a hash table
- Return the maximum frequency found
This reduces the time complexity from checking multiple lengths to just checking one optimal length, making the solution more efficient.
Learn more about Sliding Window patterns.
Solution Approach
The solution uses a hash table combined with a sliding window approach to efficiently find all substrings of length minSize
and track their frequencies.
Step-by-step implementation:
-
Initialize data structures: We use a
Counter
(hash table) to store the frequency of each valid substring, and a variableans
to track the maximum frequency found. -
Iterate through all possible starting positions: We loop through the string from index
0
tolen(s) - minSize
, ensuring we can always extract a substring of lengthminSize
without going out of bounds. -
Extract substring: For each position
i
, we extract a substringt = s[i : i + minSize]
of exactly lengthminSize
. -
Check unique character constraint: We convert the substring
t
into a setss = set(t)
to get unique characters. The size of this set tells us how many unique characters are in the substring. Iflen(ss) <= maxLetters
, the substring is valid. -
Update frequency count: For valid substrings, we increment their count in the hash table using
cnt[t] += 1
. -
Track maximum frequency: After updating the count, we immediately check if this is the new maximum frequency with
ans = max(ans, cnt[t])
. -
Return result: After checking all possible substrings of length
minSize
, we returnans
which contains the maximum frequency.
Time Complexity: O(n × minSize)
where n
is the length of string s
. We iterate through n - minSize + 1
positions, and for each position, we create a substring of length minSize
and convert it to a set.
Space Complexity: O(n × minSize)
in the worst case, where all substrings are unique and stored in the hash table.
The elegance of this solution lies in recognizing that we only need to check substrings of the minimum allowed length, which significantly simplifies the problem while guaranteeing the optimal answer.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach.
Input: s = "aabaab"
, maxLetters = 2
, minSize = 2
, maxSize = 3
Step 1: Initialize
- Create an empty hash table
cnt = {}
- Set
ans = 0
to track maximum frequency
Step 2: Extract all substrings of length minSize (2)
We'll iterate through positions 0 to 4 (since string length is 6 and we need substrings of length 2):
-
Position 0: Extract
"aa"
- Unique characters:
{'a'}
→ 1 unique character - 1 ≤ 2 (maxLetters), so it's valid
- Update:
cnt["aa"] = 1
,ans = max(0, 1) = 1
- Unique characters:
-
Position 1: Extract
"ab"
- Unique characters:
{'a', 'b'}
→ 2 unique characters - 2 ≤ 2 (maxLetters), so it's valid
- Update:
cnt["ab"] = 1
,ans = max(1, 1) = 1
- Unique characters:
-
Position 2: Extract
"ba"
- Unique characters:
{'b', 'a'}
→ 2 unique characters - 2 ≤ 2 (maxLetters), so it's valid
- Update:
cnt["ba"] = 1
,ans = max(1, 1) = 1
- Unique characters:
-
Position 3: Extract
"aa"
- Unique characters:
{'a'}
→ 1 unique character - 1 ≤ 2 (maxLetters), so it's valid
- Update:
cnt["aa"] = 2
,ans = max(1, 2) = 2
- Unique characters:
-
Position 4: Extract
"ab"
- Unique characters:
{'a', 'b'}
→ 2 unique characters - 2 ≤ 2 (maxLetters), so it's valid
- Update:
cnt["ab"] = 2
,ans = max(2, 2) = 2
- Unique characters:
Step 3: Return result
- Final hash table:
cnt = {"aa": 2, "ab": 2, "ba": 1}
- Maximum frequency:
ans = 2
Result: The function returns 2
, as both "aa" and "ab" appear twice in the string, which is the maximum frequency among all valid substrings.
Note how we ignored checking substrings of length 3 (maxSize). Even though "aab" appears twice and satisfies all constraints, we don't need to check it because any valid longer substring that appears n times will have at least one substring of minSize that also appears n times.
Solution Implementation
1from collections import Counter
2
3class Solution:
4 def maxFreq(self, s: str, maxLetters: int, minSize: int, maxSize: int) -> int:
5 """
6 Find the maximum number of occurrences of any substring that:
7 - Has length between minSize and maxSize
8 - Contains at most maxLetters unique characters
9
10 Key insight: We only need to check substrings of minSize length.
11 If a longer substring appears k times, all its minSize-length substrings
12 appear at least k times, so checking minSize is sufficient.
13
14 Args:
15 s: Input string to search in
16 maxLetters: Maximum number of unique letters allowed in substring
17 minSize: Minimum length of substring
18 maxSize: Maximum length of substring (not used due to optimization)
19
20 Returns:
21 Maximum frequency of any valid substring
22 """
23
24 # Initialize maximum frequency counter
25 max_frequency = 0
26
27 # Dictionary to count occurrences of each valid substring
28 substring_counter = Counter()
29
30 # Iterate through all possible substrings of length minSize
31 for start_index in range(len(s) - minSize + 1):
32 # Extract substring of length minSize starting at current position
33 current_substring = s[start_index : start_index + minSize]
34
35 # Count unique characters in the substring
36 unique_chars = set(current_substring)
37
38 # Check if substring satisfies the unique character constraint
39 if len(unique_chars) <= maxLetters:
40 # Increment count for this valid substring
41 substring_counter[current_substring] += 1
42
43 # Update maximum frequency if current count is higher
44 max_frequency = max(max_frequency, substring_counter[current_substring])
45
46 return max_frequency
47
1class Solution {
2 public int maxFreq(String s, int maxLetters, int minSize, int maxSize) {
3 int maxFrequency = 0;
4 // Map to store substring frequency count
5 Map<String, Integer> substringFrequencyMap = new HashMap<>();
6
7 // Iterate through all possible substrings of minSize length
8 for (int startIndex = 0; startIndex <= s.length() - minSize; startIndex++) {
9 // Extract substring of minSize length starting at startIndex
10 String currentSubstring = s.substring(startIndex, startIndex + minSize);
11
12 // Count unique characters in the current substring
13 Set<Character> uniqueCharacters = new HashSet<>();
14 for (int j = 0; j < minSize; j++) {
15 uniqueCharacters.add(currentSubstring.charAt(j));
16 }
17
18 // Check if the number of unique characters is within the limit
19 if (uniqueCharacters.size() <= maxLetters) {
20 // Update frequency count for this valid substring
21 int currentFrequency = substringFrequencyMap.getOrDefault(currentSubstring, 0) + 1;
22 substringFrequencyMap.put(currentSubstring, currentFrequency);
23
24 // Update the maximum frequency found so far
25 maxFrequency = Math.max(maxFrequency, currentFrequency);
26 }
27 }
28
29 return maxFrequency;
30 }
31}
32
1class Solution {
2public:
3 int maxFreq(string s, int maxLetters, int minSize, int maxSize) {
4 int maxFrequency = 0;
5 // Map to store substring frequency count
6 unordered_map<string, int> substringFrequency;
7
8 // Iterate through all possible substrings of length minSize
9 // Note: We only need to check minSize substrings because if a substring
10 // of length > minSize occurs multiple times, then a substring of minSize
11 // within it also occurs at least that many times
12 for (int i = 0; i <= s.size() - minSize; ++i) {
13 // Extract substring of length minSize starting at position i
14 string currentSubstring = s.substr(i, minSize);
15
16 // Count unique characters in the current substring using a set
17 unordered_set<char> uniqueChars(currentSubstring.begin(), currentSubstring.end());
18
19 // Check if the number of unique characters doesn't exceed maxLetters
20 if (uniqueChars.size() <= maxLetters) {
21 // Increment frequency count for this valid substring
22 substringFrequency[currentSubstring]++;
23 // Update the maximum frequency found so far
24 maxFrequency = max(maxFrequency, substringFrequency[currentSubstring]);
25 }
26 }
27
28 return maxFrequency;
29 }
30};
31
1/**
2 * Finds the maximum number of occurrences of any substring that satisfies the given constraints
3 * @param s - The input string to search for substrings
4 * @param maxLetters - Maximum number of unique characters allowed in a valid substring
5 * @param minSize - Minimum length of a valid substring
6 * @param maxSize - Maximum length of a valid substring (not used in optimization)
7 * @returns The maximum frequency of any valid substring
8 */
9function maxFreq(s: string, maxLetters: number, minSize: number, maxSize: number): number {
10 // Map to store frequency of each valid substring
11 const substringFrequencyMap = new Map<string, number>();
12
13 // Track the maximum frequency found
14 let maxFrequency = 0;
15
16 // Iterate through all possible substrings of minSize length
17 // Note: We only check minSize because any valid substring of larger size
18 // that appears multiple times will have a substring of minSize that appears at least as many times
19 for (let startIndex = 0; startIndex <= s.length - minSize; startIndex++) {
20 // Extract substring of minSize length starting at current position
21 const currentSubstring = s.slice(startIndex, startIndex + minSize);
22
23 // Create a set of unique characters in the substring
24 const uniqueCharacters = new Set(currentSubstring.split(''));
25
26 // Check if the number of unique characters is within the allowed limit
27 if (uniqueCharacters.size <= maxLetters) {
28 // Update frequency count for this valid substring
29 const currentFrequency = (substringFrequencyMap.get(currentSubstring) || 0) + 1;
30 substringFrequencyMap.set(currentSubstring, currentFrequency);
31
32 // Update maximum frequency if current frequency is higher
33 maxFrequency = Math.max(maxFrequency, currentFrequency);
34 }
35 }
36
37 return maxFrequency;
38}
39
Time and Space Complexity
The time complexity is O(n × m)
, where n
is the length of the string s
and m
is minSize
. This is because:
- The outer loop runs
n - minSize + 1
times, which isO(n)
- For each iteration, we extract a substring of length
minSize
using slicings[i : i + minSize]
, which takesO(m)
time - Creating a set from the substring takes
O(m)
time - The Counter update and max comparison are
O(1)
operations - Therefore, the total time complexity is
O(n × m)
The space complexity is O(n × m)
, where n
is the length of the string s
and m
is minSize
. This is because:
- The Counter
cnt
can store at mostn - minSize + 1
different substrings, which isO(n)
- Each substring stored as a key has length
minSize
, which ism
- Therefore, the total space needed to store all possible substrings is
O(n × m)
- The temporary set
ss
usesO(m)
space, but this doesn't affect the overall complexity
Note that in this problem, m
(which is minSize
) does not exceed 26 as per the reference answer.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Attempting to Check All Substring Lengths from minSize to maxSize
One of the most common mistakes is trying to check all possible substring lengths between minSize
and maxSize
. This seems intuitive since the problem states the substring length should be "between minSize and maxSize", but it leads to:
- Unnecessary complexity: The code becomes more complex with nested loops
- Worse time complexity: O(n × (maxSize - minSize + 1) × maxSize) instead of O(n × minSize)
- Potential incorrect results: Without careful implementation, you might count overlapping substrings incorrectly
Incorrect approach:
def maxFreq(self, s: str, maxLetters: int, minSize: int, maxSize: int) -> int:
max_frequency = 0
substring_counter = Counter()
# Wrong: Checking all lengths from minSize to maxSize
for length in range(minSize, maxSize + 1):
for start in range(len(s) - length + 1):
substring = s[start : start + length]
if len(set(substring)) <= maxLetters:
substring_counter[substring] += 1
max_frequency = max(max_frequency, substring_counter[substring])
return max_frequency
Why the optimization works: If a substring of length k appears n times, then each of its substrings of length minSize will appear at least n times. Therefore, the maximum frequency can always be achieved by a substring of minimum length.
2. Incorrectly Calculating String Bounds
Another frequent error is miscalculating the loop boundaries, leading to index out of range errors:
Incorrect boundary:
# Wrong: This will cause index out of bounds
for start_index in range(len(s) - minSize): # Missing +1
current_substring = s[start_index : start_index + minSize]
Correct boundary:
# Correct: Ensures we can always extract minSize characters
for start_index in range(len(s) - minSize + 1):
current_substring = s[start_index : start_index + minSize]
The +1
is crucial because when start_index = len(s) - minSize
, we can still extract exactly minSize
characters from position start_index
to start_index + minSize - 1
.
3. Using len() on the Substring Instead of Set for Unique Character Count
A subtle but critical mistake is checking the length of the substring instead of the number of unique characters:
Incorrect:
# Wrong: This checks substring length, not unique character count
if len(current_substring) <= maxLetters:
substring_counter[current_substring] += 1
Correct:
# Correct: This checks the number of unique characters
if len(set(current_substring)) <= maxLetters:
substring_counter[current_substring] += 1
4. Not Handling Edge Cases
Failing to consider edge cases can lead to unexpected results:
- Empty string or minSize > len(s): The loop won't execute, correctly returning 0
- maxLetters = 0: No valid substrings exist (unless considering empty substring, which isn't typically valid)
- minSize = 1: Every single character becomes a candidate
Solution: The provided code naturally handles these edge cases, but it's worth adding validation if needed:
if not s or minSize > len(s) or maxLetters <= 0:
return 0
Which type of traversal does breadth first search do?
Recommended Readings
https assets algo monster cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in the window The window
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!