Facebook Pixel

1297. Maximum Number of Occurrences of a Substring

MediumHash TableStringSliding Window
Leetcode Link

Problem Description

You are given a string s and three integer parameters: maxLetters, minSize, and maxSize. Your task is to find the maximum number of times any substring appears in s, where the substring must satisfy two conditions:

  1. Character limit: The substring must contain at most maxLetters unique characters. For example, if maxLetters = 2, then substrings like "aa", "ab", "bb" are valid (having 1 or 2 unique characters), but "abc" is not valid (having 3 unique characters).

  2. Length constraint: The length of the substring must be between minSize and maxSize (inclusive).

The goal is to find which valid substring appears the most frequently in s and return that frequency count.

For example, if s = "aababcaab", maxLetters = 2, minSize = 3, and maxSize = 4:

  • The substring "aab" has 2 unique characters ('a' and 'b') and length 3, which satisfies both conditions
  • "aab" appears 2 times in the string (at positions 0-2 and 6-8)
  • This would be the maximum frequency among all valid substrings

The key insight in the solution is that if a longer substring meets the conditions and appears multiple times, then any of its substrings of length minSize will also meet the conditions and appear at least as many times. Therefore, we only need to check substrings of length exactly minSize to find the maximum frequency.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key observation is that if a substring of length k (where minSize ≤ k ≤ maxSize) appears n times in the string, then every substring of length minSize within that longer substring will also appear at least n times.

To understand why, consider a substring of length 5 that appears 3 times. Any substring of length 3 within those 5 characters will also appear at least 3 times (at the same positions where the length-5 substring appears). This means that checking longer substrings is redundant - we'll never find a longer substring that has a higher frequency than the best substring of length minSize.

Additionally, if a longer substring satisfies the unique character constraint (having at most maxLetters unique characters), then any of its substrings will also satisfy this constraint, since a substring cannot have more unique characters than the string containing it.

Therefore, instead of checking all possible substring lengths from minSize to maxSize, we can optimize by only checking substrings of exactly length minSize. This significantly reduces the search space while guaranteeing we find the maximum frequency.

The approach becomes straightforward:

  1. Generate all substrings of length minSize
  2. For each substring, check if it has at most maxLetters unique characters
  3. Count the frequency of valid substrings using a hash table
  4. Return the maximum frequency found

This reduces the time complexity from checking multiple lengths to just checking one optimal length, making the solution more efficient.

Learn more about Sliding Window patterns.

Solution Approach

The solution uses a hash table combined with a sliding window approach to efficiently find all substrings of length minSize and track their frequencies.

Step-by-step implementation:

  1. Initialize data structures: We use a Counter (hash table) to store the frequency of each valid substring, and a variable ans to track the maximum frequency found.

  2. Iterate through all possible starting positions: We loop through the string from index 0 to len(s) - minSize, ensuring we can always extract a substring of length minSize without going out of bounds.

  3. Extract substring: For each position i, we extract a substring t = s[i : i + minSize] of exactly length minSize.

  4. Check unique character constraint: We convert the substring t into a set ss = set(t) to get unique characters. The size of this set tells us how many unique characters are in the substring. If len(ss) <= maxLetters, the substring is valid.

  5. Update frequency count: For valid substrings, we increment their count in the hash table using cnt[t] += 1.

  6. Track maximum frequency: After updating the count, we immediately check if this is the new maximum frequency with ans = max(ans, cnt[t]).

  7. Return result: After checking all possible substrings of length minSize, we return ans which contains the maximum frequency.

Time Complexity: O(n × minSize) where n is the length of string s. We iterate through n - minSize + 1 positions, and for each position, we create a substring of length minSize and convert it to a set.

Space Complexity: O(n × minSize) in the worst case, where all substrings are unique and stored in the hash table.

The elegance of this solution lies in recognizing that we only need to check substrings of the minimum allowed length, which significantly simplifies the problem while guaranteeing the optimal answer.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example to illustrate the solution approach.

Input: s = "aabaab", maxLetters = 2, minSize = 2, maxSize = 3

Step 1: Initialize

  • Create an empty hash table cnt = {}
  • Set ans = 0 to track maximum frequency

Step 2: Extract all substrings of length minSize (2)

We'll iterate through positions 0 to 4 (since string length is 6 and we need substrings of length 2):

  • Position 0: Extract "aa"

    • Unique characters: {'a'} → 1 unique character
    • 1 ≤ 2 (maxLetters), so it's valid
    • Update: cnt["aa"] = 1, ans = max(0, 1) = 1
  • Position 1: Extract "ab"

    • Unique characters: {'a', 'b'} → 2 unique characters
    • 2 ≤ 2 (maxLetters), so it's valid
    • Update: cnt["ab"] = 1, ans = max(1, 1) = 1
  • Position 2: Extract "ba"

    • Unique characters: {'b', 'a'} → 2 unique characters
    • 2 ≤ 2 (maxLetters), so it's valid
    • Update: cnt["ba"] = 1, ans = max(1, 1) = 1
  • Position 3: Extract "aa"

    • Unique characters: {'a'} → 1 unique character
    • 1 ≤ 2 (maxLetters), so it's valid
    • Update: cnt["aa"] = 2, ans = max(1, 2) = 2
  • Position 4: Extract "ab"

    • Unique characters: {'a', 'b'} → 2 unique characters
    • 2 ≤ 2 (maxLetters), so it's valid
    • Update: cnt["ab"] = 2, ans = max(2, 2) = 2

Step 3: Return result

  • Final hash table: cnt = {"aa": 2, "ab": 2, "ba": 1}
  • Maximum frequency: ans = 2

Result: The function returns 2, as both "aa" and "ab" appear twice in the string, which is the maximum frequency among all valid substrings.

Note how we ignored checking substrings of length 3 (maxSize). Even though "aab" appears twice and satisfies all constraints, we don't need to check it because any valid longer substring that appears n times will have at least one substring of minSize that also appears n times.

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def maxFreq(self, s: str, maxLetters: int, minSize: int, maxSize: int) -> int:
5        """
6        Find the maximum number of occurrences of any substring that:
7        - Has length between minSize and maxSize
8        - Contains at most maxLetters unique characters
9      
10        Key insight: We only need to check substrings of minSize length.
11        If a longer substring appears k times, all its minSize-length substrings 
12        appear at least k times, so checking minSize is sufficient.
13      
14        Args:
15            s: Input string to search in
16            maxLetters: Maximum number of unique letters allowed in substring
17            minSize: Minimum length of substring
18            maxSize: Maximum length of substring (not used due to optimization)
19          
20        Returns:
21            Maximum frequency of any valid substring
22        """
23      
24        # Initialize maximum frequency counter
25        max_frequency = 0
26      
27        # Dictionary to count occurrences of each valid substring
28        substring_counter = Counter()
29      
30        # Iterate through all possible substrings of length minSize
31        for start_index in range(len(s) - minSize + 1):
32            # Extract substring of length minSize starting at current position
33            current_substring = s[start_index : start_index + minSize]
34          
35            # Count unique characters in the substring
36            unique_chars = set(current_substring)
37          
38            # Check if substring satisfies the unique character constraint
39            if len(unique_chars) <= maxLetters:
40                # Increment count for this valid substring
41                substring_counter[current_substring] += 1
42              
43                # Update maximum frequency if current count is higher
44                max_frequency = max(max_frequency, substring_counter[current_substring])
45      
46        return max_frequency
47
1class Solution {
2    public int maxFreq(String s, int maxLetters, int minSize, int maxSize) {
3        int maxFrequency = 0;
4        // Map to store substring frequency count
5        Map<String, Integer> substringFrequencyMap = new HashMap<>();
6      
7        // Iterate through all possible substrings of minSize length
8        for (int startIndex = 0; startIndex <= s.length() - minSize; startIndex++) {
9            // Extract substring of minSize length starting at startIndex
10            String currentSubstring = s.substring(startIndex, startIndex + minSize);
11          
12            // Count unique characters in the current substring
13            Set<Character> uniqueCharacters = new HashSet<>();
14            for (int j = 0; j < minSize; j++) {
15                uniqueCharacters.add(currentSubstring.charAt(j));
16            }
17          
18            // Check if the number of unique characters is within the limit
19            if (uniqueCharacters.size() <= maxLetters) {
20                // Update frequency count for this valid substring
21                int currentFrequency = substringFrequencyMap.getOrDefault(currentSubstring, 0) + 1;
22                substringFrequencyMap.put(currentSubstring, currentFrequency);
23              
24                // Update the maximum frequency found so far
25                maxFrequency = Math.max(maxFrequency, currentFrequency);
26            }
27        }
28      
29        return maxFrequency;
30    }
31}
32
1class Solution {
2public:
3    int maxFreq(string s, int maxLetters, int minSize, int maxSize) {
4        int maxFrequency = 0;
5        // Map to store substring frequency count
6        unordered_map<string, int> substringFrequency;
7      
8        // Iterate through all possible substrings of length minSize
9        // Note: We only need to check minSize substrings because if a substring
10        // of length > minSize occurs multiple times, then a substring of minSize
11        // within it also occurs at least that many times
12        for (int i = 0; i <= s.size() - minSize; ++i) {
13            // Extract substring of length minSize starting at position i
14            string currentSubstring = s.substr(i, minSize);
15          
16            // Count unique characters in the current substring using a set
17            unordered_set<char> uniqueChars(currentSubstring.begin(), currentSubstring.end());
18          
19            // Check if the number of unique characters doesn't exceed maxLetters
20            if (uniqueChars.size() <= maxLetters) {
21                // Increment frequency count for this valid substring
22                substringFrequency[currentSubstring]++;
23                // Update the maximum frequency found so far
24                maxFrequency = max(maxFrequency, substringFrequency[currentSubstring]);
25            }
26        }
27      
28        return maxFrequency;
29    }
30};
31
1/**
2 * Finds the maximum number of occurrences of any substring that satisfies the given constraints
3 * @param s - The input string to search for substrings
4 * @param maxLetters - Maximum number of unique characters allowed in a valid substring
5 * @param minSize - Minimum length of a valid substring
6 * @param maxSize - Maximum length of a valid substring (not used in optimization)
7 * @returns The maximum frequency of any valid substring
8 */
9function maxFreq(s: string, maxLetters: number, minSize: number, maxSize: number): number {
10    // Map to store frequency of each valid substring
11    const substringFrequencyMap = new Map<string, number>();
12  
13    // Track the maximum frequency found
14    let maxFrequency = 0;
15  
16    // Iterate through all possible substrings of minSize length
17    // Note: We only check minSize because any valid substring of larger size
18    // that appears multiple times will have a substring of minSize that appears at least as many times
19    for (let startIndex = 0; startIndex <= s.length - minSize; startIndex++) {
20        // Extract substring of minSize length starting at current position
21        const currentSubstring = s.slice(startIndex, startIndex + minSize);
22      
23        // Create a set of unique characters in the substring
24        const uniqueCharacters = new Set(currentSubstring.split(''));
25      
26        // Check if the number of unique characters is within the allowed limit
27        if (uniqueCharacters.size <= maxLetters) {
28            // Update frequency count for this valid substring
29            const currentFrequency = (substringFrequencyMap.get(currentSubstring) || 0) + 1;
30            substringFrequencyMap.set(currentSubstring, currentFrequency);
31          
32            // Update maximum frequency if current frequency is higher
33            maxFrequency = Math.max(maxFrequency, currentFrequency);
34        }
35    }
36  
37    return maxFrequency;
38}
39

Time and Space Complexity

The time complexity is O(n × m), where n is the length of the string s and m is minSize. This is because:

  • The outer loop runs n - minSize + 1 times, which is O(n)
  • For each iteration, we extract a substring of length minSize using slicing s[i : i + minSize], which takes O(m) time
  • Creating a set from the substring takes O(m) time
  • The Counter update and max comparison are O(1) operations
  • Therefore, the total time complexity is O(n × m)

The space complexity is O(n × m), where n is the length of the string s and m is minSize. This is because:

  • The Counter cnt can store at most n - minSize + 1 different substrings, which is O(n)
  • Each substring stored as a key has length minSize, which is m
  • Therefore, the total space needed to store all possible substrings is O(n × m)
  • The temporary set ss uses O(m) space, but this doesn't affect the overall complexity

Note that in this problem, m (which is minSize) does not exceed 26 as per the reference answer.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Attempting to Check All Substring Lengths from minSize to maxSize

One of the most common mistakes is trying to check all possible substring lengths between minSize and maxSize. This seems intuitive since the problem states the substring length should be "between minSize and maxSize", but it leads to:

  • Unnecessary complexity: The code becomes more complex with nested loops
  • Worse time complexity: O(n × (maxSize - minSize + 1) × maxSize) instead of O(n × minSize)
  • Potential incorrect results: Without careful implementation, you might count overlapping substrings incorrectly

Incorrect approach:

def maxFreq(self, s: str, maxLetters: int, minSize: int, maxSize: int) -> int:
    max_frequency = 0
    substring_counter = Counter()
  
    # Wrong: Checking all lengths from minSize to maxSize
    for length in range(minSize, maxSize + 1):
        for start in range(len(s) - length + 1):
            substring = s[start : start + length]
            if len(set(substring)) <= maxLetters:
                substring_counter[substring] += 1
                max_frequency = max(max_frequency, substring_counter[substring])
  
    return max_frequency

Why the optimization works: If a substring of length k appears n times, then each of its substrings of length minSize will appear at least n times. Therefore, the maximum frequency can always be achieved by a substring of minimum length.

2. Incorrectly Calculating String Bounds

Another frequent error is miscalculating the loop boundaries, leading to index out of range errors:

Incorrect boundary:

# Wrong: This will cause index out of bounds
for start_index in range(len(s) - minSize):  # Missing +1
    current_substring = s[start_index : start_index + minSize]

Correct boundary:

# Correct: Ensures we can always extract minSize characters
for start_index in range(len(s) - minSize + 1):
    current_substring = s[start_index : start_index + minSize]

The +1 is crucial because when start_index = len(s) - minSize, we can still extract exactly minSize characters from position start_index to start_index + minSize - 1.

3. Using len() on the Substring Instead of Set for Unique Character Count

A subtle but critical mistake is checking the length of the substring instead of the number of unique characters:

Incorrect:

# Wrong: This checks substring length, not unique character count
if len(current_substring) <= maxLetters:
    substring_counter[current_substring] += 1

Correct:

# Correct: This checks the number of unique characters
if len(set(current_substring)) <= maxLetters:
    substring_counter[current_substring] += 1

4. Not Handling Edge Cases

Failing to consider edge cases can lead to unexpected results:

  • Empty string or minSize > len(s): The loop won't execute, correctly returning 0
  • maxLetters = 0: No valid substrings exist (unless considering empty substring, which isn't typically valid)
  • minSize = 1: Every single character becomes a candidate

Solution: The provided code naturally handles these edge cases, but it's worth adding validation if needed:

if not s or minSize > len(s) or maxLetters <= 0:
    return 0
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which type of traversal does breadth first search do?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More