Facebook Pixel

3329. Count Substrings With K-Frequency Characters II 🔒

HardHash TableStringSliding Window
Leetcode Link

Problem Description

You are given a string s and an integer k. Your task is to find the total number of substrings of s where at least one character appears at least k times.

A substring is a contiguous sequence of characters within the string. For example, if s = "aaab" and k = 2, the substrings where at least one character appears at least 2 times would include "aa" (where 'a' appears 2 times), "aaa" (where 'a' appears 3 times), "aaab" (where 'a' appears 3 times), and so on.

The solution uses a sliding window technique with two pointers. The right pointer expands the window by iterating through each character, while the left pointer contracts the window when we find a character that appears at least k times. The key insight is that once we have a character appearing at least k times in our current window [l, r], any substring ending at position r and starting from positions 0 to l-1 will also satisfy the condition (since they all contain the window as a subset). Therefore, we add l to our answer for each valid position.

The algorithm maintains a counter to track character frequencies in the current window. When a character's count reaches k, we shrink the window from the left until no character has a count of k or more, counting all valid substrings along the way.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key observation is that we need to count substrings where at least one character appears k or more times. Instead of directly counting these "valid" substrings, we can think inversely - what if we count substrings where NO character appears k or more times, and then use this to find our answer?

When we fix a right endpoint r and think about all possible left endpoints, we notice a pattern: there exists some position l such that:

  • For all substrings ending at r with starting positions from 0 to l-1, at least one character appears k or more times (these are valid)
  • For all substrings ending at r with starting positions from l to r, no character appears k or more times (these are invalid)

This creates a natural division point. As we move our right pointer through the string, we maintain a window [l, r] where no character appears k or more times. When we add a new character at position r:

  • If adding it doesn't make any character count reach k, we simply continue
  • If adding it makes some character count reach k, we need to shrink from the left until no character has count ≥ k

The brilliant insight is that when we're forced to move the left pointer from position l to some new position, all the substrings ending at r with starting positions [0, 1, ..., l-1] are valid substrings. Why? Because if the window [l, r] contains a character with count k, then any larger window that includes [l, r] will also contain that character at least k times.

Therefore, for each position r, we add l to our answer - representing the count of all valid substrings ending at r. This efficiently counts all valid substrings in O(n) time using the two-pointer technique.

Learn more about Sliding Window patterns.

Solution Approach

The solution implements a sliding window technique with two pointers to efficiently count the valid substrings.

Data Structures Used:

  • cnt: A Counter (hash map) to track the frequency of each character in the current window
  • l: Left pointer of the sliding window
  • ans: Accumulator for the final answer

Algorithm Steps:

  1. Initialize Variables: Start with an empty counter cnt, left pointer l = 0, and answer ans = 0.

  2. Iterate Through String: For each character c at position r (implicitly the right pointer):

    • Add the character to the window: cnt[c] += 1
  3. Shrink Window When Needed: While the current character c appears at least k times in the window:

    • Remove the leftmost character from the window: cnt[s[l]] -= 1
    • Move the left pointer right: l += 1
    • This ensures the window [l, r] contains no character with frequency ≥ k
  4. Count Valid Substrings: After adjusting the window, add l to the answer:

    • ans += l
    • This counts all substrings ending at position r that start from positions 0 to l-1
  5. Return Result: After processing all characters, return ans.

Why This Works:

The window [l, r] maintains the invariant that no character appears k or more times within it. When we're at position r:

  • Substrings from [0, r], [1, r], ..., [l-1, r] all contain at least one character appearing ≥ k times
  • Substrings from [l, r], [l+1, r], ..., [r, r] have no character appearing ≥ k times

By maintaining this invariant and adding l for each position r, we correctly count all valid substrings exactly once.

Time Complexity: O(n) where n is the length of the string, as each character is visited at most twice (once by right pointer, once by left pointer).

Space Complexity: O(1) or O(26) for the counter, as we only store counts for at most 26 lowercase English letters.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the algorithm with s = "aaab" and k = 3.

We want to count all substrings where at least one character appears at least 3 times.

Initial State:

  • l = 0, ans = 0, cnt = {}

Step 1: r = 0, character = 'a'

  • Add 'a' to window: cnt = {'a': 1}
  • Check: 'a' appears 1 time (< 3), so no shrinking needed
  • Add l to answer: ans = 0 + 0 = 0
  • Window: [0,0] = "a"

Step 2: r = 1, character = 'a'

  • Add 'a' to window: cnt = {'a': 2}
  • Check: 'a' appears 2 times (< 3), so no shrinking needed
  • Add l to answer: ans = 0 + 0 = 0
  • Window: [0,1] = "aa"

Step 3: r = 2, character = 'a'

  • Add 'a' to window: cnt = {'a': 3}
  • Check: 'a' appears 3 times (≥ 3), so we need to shrink!
  • Shrinking process:
    • Remove s[0]='a': cnt = {'a': 2}, l = 1
    • Check: 'a' appears 2 times (< 3), shrinking complete
  • Add l to answer: ans = 0 + 1 = 1
  • Window: [1,2] = "aa"
  • This counts the substring [0,2] = "aaa" as valid

Step 4: r = 3, character = 'b'

  • Add 'b' to window: cnt = {'a': 2, 'b': 1}
  • Check: no character appears ≥ 3 times, so no shrinking needed
  • Add l to answer: ans = 1 + 1 = 2
  • Window: [1,3] = "aab"
  • This counts the substring [0,3] = "aaab" as valid

Final Answer: 2

The valid substrings are:

  1. "aaa" (positions 0-2): 'a' appears 3 times
  2. "aaab" (positions 0-3): 'a' appears 3 times

Note how the algorithm cleverly counts these by tracking when the window needs to shrink. When we had to move l from 0 to 1 at step 3, it meant that any substring ending at position 2 and starting before position 1 would be valid. Similarly, at step 4, since l = 1, the substring starting at position 0 and ending at position 3 is valid.

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def numberOfSubstrings(self, s: str, k: int) -> int:
5        # Counter to track frequency of characters in current window
6        char_count = Counter()
7      
8        # Total count of valid substrings and left pointer for sliding window
9        total_substrings = 0
10        left = 0
11      
12        # Iterate through string with right pointer
13        for right, char in enumerate(s):
14            # Add current character to the window
15            char_count[char] += 1
16          
17            # When current character appears at least k times,
18            # all substrings ending at current position and starting 
19            # from index 0 to left (inclusive) are valid
20            while char_count[char] >= k:
21                # Remove leftmost character from window
22                char_count[s[left]] -= 1
23                left += 1
24          
25            # Add count of valid substrings ending at current position
26            # (all substrings from index 0 to left-1)
27            total_substrings += left
28      
29        return total_substrings
30
1class Solution {
2    public long numberOfSubstrings(String s, int k) {
3        // Array to track frequency of each character (26 lowercase letters)
4        int[] charFrequency = new int[26];
5        long totalSubstrings = 0;
6      
7        // Two-pointer sliding window approach
8        int left = 0;
9      
10        for (int right = 0; right < s.length(); right++) {
11            // Add current character to the window
12            int currentCharIndex = s.charAt(right) - 'a';
13            charFrequency[currentCharIndex]++;
14          
15            // Shrink window from left while current character appears at least k times
16            while (charFrequency[currentCharIndex] >= k) {
17                // Remove leftmost character from window
18                int leftCharIndex = s.charAt(left) - 'a';
19                charFrequency[leftCharIndex]--;
20                left++;
21            }
22          
23            // All substrings ending at right with starting positions [0, left-1]
24            // satisfy the condition (contain at least one character with frequency >= k)
25            totalSubstrings += left;
26        }
27      
28        return totalSubstrings;
29    }
30}
31
1class Solution {
2public:
3    long long numberOfSubstrings(string s, int k) {
4        int n = s.size();
5        long long result = 0;
6        int leftIndex = 0;
7      
8        // Frequency array for 26 lowercase letters
9        int charFrequency[26] = {0};
10      
11        // Iterate through each character as the right boundary of the window
12        for (int rightIndex = 0; rightIndex < n; rightIndex++) {
13            // Increment frequency of current character
14            charFrequency[s[rightIndex] - 'a']++;
15          
16            // Shrink window from left while current character appears k or more times
17            while (charFrequency[s[rightIndex] - 'a'] >= k) {
18                charFrequency[s[leftIndex] - 'a']--;
19                leftIndex++;
20            }
21          
22            // Add count of valid substrings ending at rightIndex
23            // All substrings from index 0 to leftIndex-1 combined with 
24            // substring from their position to rightIndex are valid
25            result += leftIndex;
26        }
27      
28        return result;
29    }
30};
31
1/**
2 * Counts the number of substrings where at least one character appears k or more times
3 * Uses a sliding window approach with two pointers
4 * 
5 * @param s - The input string to analyze
6 * @param k - The minimum frequency threshold for a character
7 * @returns The total count of valid substrings
8 */
9function numberOfSubstrings(s: string, k: number): number {
10    // Initialize result counter and left pointer for sliding window
11    let result: number = 0;
12    let leftPointer: number = 0;
13  
14    // Array to track frequency of each lowercase letter (a-z)
15    // Index 0 represents 'a', index 1 represents 'b', etc.
16    const charFrequency: number[] = Array(26).fill(0);
17  
18    // Iterate through string with right pointer
19    for (const currentChar of s) {
20        // Convert character to array index (0-25 for a-z)
21        const charIndex: number = currentChar.charCodeAt(0) - 97;
22      
23        // Increment frequency of current character
24        charFrequency[charIndex]++;
25      
26        // Shrink window from left while current character appears k or more times
27        // This maintains the invariant that no character in the window has frequency >= k
28        while (charFrequency[charIndex] >= k) {
29            // Get character at left pointer and decrease its frequency
30            const leftCharIndex: number = s[leftPointer].charCodeAt(0) - 97;
31            charFrequency[leftCharIndex]--;
32            leftPointer++;
33        }
34      
35        // Add number of valid substrings ending at current position
36        // All substrings from index 0 to leftPointer-1 combined with 
37        // the current suffix form valid substrings
38        result += leftPointer;
39    }
40  
41    return result;
42}
43

Time and Space Complexity

The time complexity is O(n), where n is the length of the string s. Each character in the string is visited at most twice - once when the right pointer (implicit in the for loop) moves forward, and potentially once when the left pointer l moves forward. Since each character is processed a constant number of times, the overall time complexity remains linear.

The space complexity is O(|Σ|), where Σ is the character set. The Counter object cnt stores at most all unique characters that appear in the string. Since the problem deals with lowercase letters (based on the reference answer), we have |Σ| = 26, making the space complexity O(26) = O(1) constant space.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Misunderstanding the Counting Logic

The Issue: A common mistake is thinking that we should add r - l + 1 (the number of substrings ending at r within the current window) instead of l. This misunderstanding stems from confusing which substrings are valid.

Why It's Wrong:

  • The window [l, r] maintains that NO character appears ≥ k times within it
  • We want to count substrings where AT LEAST ONE character appears ≥ k times
  • Therefore, we count substrings that extend BEYOND the current window to the left

Correct Understanding:

  • When we're at position r, substrings [0, r], [1, r], ..., [l-1, r] are valid (they contain characters with frequency ≥ k)
  • Substrings [l, r], [l+1, r], ..., [r, r] are NOT valid (no character appears ≥ k times)
  • So we add l (the count of valid substrings), not r - l + 1

Pitfall 2: Incorrect Window Shrinking Condition

The Issue: Some might write the shrinking condition as:

while any(count >= k for count in char_count.values()):
    char_count[s[left]] -= 1
    left += 1

Why It's Wrong: This checks if ANY character in the window has frequency ≥ k, but we only need to shrink when the NEWLY ADDED character char reaches frequency k. Checking all characters is:

  • Inefficient (O(26) check per iteration)
  • Logically incorrect for maintaining our invariant

Correct Approach:

while char_count[char] >= k:  # Only check the character we just added
    char_count[s[left]] -= 1
    left += 1

Pitfall 3: Off-by-One Error in Counting

The Issue: Adding left + 1 instead of left to the answer, thinking we need to include the substring starting at position left.

Why It's Wrong:

  • After the while loop, the window [left, right] has no character with frequency ≥ k
  • The substring [left, right] is NOT valid
  • Only substrings [0, right] through [left-1, right] are valid
  • There are exactly left such substrings (indices 0 through left-1)

Correct Implementation:

total_substrings += left  # Not left + 1

Complete Corrected Solution:

from collections import Counter

class Solution:
    def numberOfSubstrings(self, s: str, k: int) -> int:
        char_count = Counter()
        total_substrings = 0
        left = 0
      
        for right, char in enumerate(s):
            char_count[char] += 1
          
            # Only shrink when the current character reaches k occurrences
            while char_count[char] >= k:
                char_count[s[left]] -= 1
                left += 1
          
            # Add exactly 'left' valid substrings ending at position right
            total_substrings += left
      
        return total_substrings
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement priority queue?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More