Facebook Pixel

3325. Count Substrings With K-Frequency Characters I

MediumHash TableStringSliding Window
Leetcode Link

Problem Description

You are given a string s and an integer k. Your task is to count the total number of substrings where at least one character appears at least k times.

A substring is a contiguous sequence of characters within the string. For each valid substring, there must be at least one character that occurs k or more times within that substring.

For example, if s = "aaab" and k = 2, the substring "aaa" would be valid because the character 'a' appears 3 times (which is ≥ 2). Similarly, "aaab" would also be valid because 'a' appears 3 times.

The solution uses a sliding window approach with two pointers. It enumerates each position as a potential right endpoint of substrings. For each right endpoint position, it maintains a left pointer that moves rightward whenever the current window contains a character appearing at least k times. The key insight is that once a window [l, r] contains a character appearing at least k times, all substrings ending at r with starting positions from 0 to l-1 are also valid (since they would contain the entire window [l, r] as a subset). This allows us to efficiently count valid substrings by adding l to our answer for each position of r.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key observation is that if a substring contains a character that appears at least k times, then any longer substring that contains this substring will also satisfy the condition. This monotonic property suggests we can use a sliding window approach.

Let's think about this differently: instead of directly counting valid substrings, we can count them by fixing the right endpoint and determining how many valid left endpoints exist.

For any position r as the right endpoint, we want to find the rightmost position l such that the substring s[l...r] contains at least one character appearing k times. Once we find this position l, all substrings ending at r with starting positions from 0 to l-1 will be valid because they all contain the substring s[l...r].

Why does this work? Consider a window [l, r] where some character c appears exactly k times. Any substring that starts before l and ends at r will contain all occurrences of c in the window [l, r], so it will also have at least k occurrences of c.

The sliding window maintains the invariant: the window [l, r] is the minimal window ending at r that contains a character appearing at least k times. As we extend r, we shrink from the left (increment l) whenever we have a character with count ≥ k, ensuring we always have the minimal such window. The number of valid substrings ending at each position r is exactly l, representing all possible starting positions from 0 to l-1.

Learn more about Sliding Window patterns.

Solution Approach

We implement a sliding window solution using two pointers to efficiently count valid substrings.

Data Structures:

  • cnt: A Counter (hash map) to track the frequency of each character in the current window [l, r]
  • l: Left pointer of the sliding window, initially set to 0
  • ans: Accumulator for the total count of valid substrings

Algorithm Steps:

  1. Enumerate the right endpoint: We iterate through each character c in the string s, treating each position as a potential right endpoint r of substrings.

  2. Expand the window: For each character at position r, we add it to our frequency counter: cnt[c] += 1

  3. Shrink from the left when condition is met: When the current character c appears at least k times in the window (cnt[c] >= k), we need to find the minimal window ending at r that satisfies our condition. We do this by:

    • Removing characters from the left: cnt[s[l]] -= 1
    • Moving the left pointer rightward: l += 1
    • Continuing this process while cnt[c] >= k
  4. Count valid substrings: After adjusting the window, l represents the leftmost position where the window [l, r] no longer contains any character appearing ≥ k times. This means all substrings ending at r with starting positions from 0 to l-1 are valid. We add l to our answer.

Example Walkthrough:

Consider s = "aaab" and k = 2:

  • When r = 0 (first 'a'): cnt['a'] = 1, no character has count ≥ 2, so l = 0, add 0
  • When r = 1 (second 'a'): cnt['a'] = 2, now 'a' appears 2 times
    • Shrink: remove s[0], cnt['a'] = 1, move l = 1
    • Now 'a' appears only once in window [1, 1]
    • Add l = 1 to answer (substring "aa" starting at index 0 is valid)
  • When r = 2 (third 'a'): cnt['a'] = 2 again
    • Shrink: remove s[1], move l = 2
    • Add l = 2 to answer (substrings "aaa" and "aa" starting at indices 0 and 1 are valid)
  • When r = 3 ('b'): cnt['b'] = 1, no shrinking needed
    • Add l = 2 to answer (substrings ending at 'b' starting from indices 0 and 1 are valid)

The time complexity is O(n) where n is the length of the string, as each character is added and removed from the window at most once. The space complexity is O(min(n, 26)) for storing character frequencies.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's trace through the algorithm with s = "abba" and k = 2.

Initial State:

  • l = 0, ans = 0, cnt = {}

Step 1: r = 0, c = 'a'

  • Add 'a' to counter: cnt = {'a': 1}
  • cnt['a'] = 1 < 2, so no shrinking needed
  • Add l = 0 to answer → ans = 0
  • Window [0,0] = "a" has no character appearing ≥ 2 times

Step 2: r = 1, c = 'b'

  • Add 'b' to counter: cnt = {'a': 1, 'b': 1}
  • cnt['b'] = 1 < 2, so no shrinking needed
  • Add l = 0 to answer → ans = 0
  • Window [0,1] = "ab" has no character appearing ≥ 2 times

Step 3: r = 2, c = 'b'

  • Add 'b' to counter: cnt = {'a': 1, 'b': 2}
  • cnt['b'] = 2 ≥ 2, so we need to shrink!
    • Remove s[0] = 'a': cnt = {'a': 0, 'b': 2}
    • Move l = 1
    • Check: cnt['b'] = 2 ≥ 2, continue shrinking
    • Remove s[1] = 'b': cnt = {'a': 0, 'b': 1}
    • Move l = 2
    • Check: cnt['b'] = 1 < 2, stop shrinking
  • Add l = 2 to answer → ans = 2
  • This counts substrings "abb" (starting at 0) and "bb" (starting at 1)

Step 4: r = 3, c = 'a'

  • Add 'a' to counter: cnt = {'a': 1, 'b': 1}
  • cnt['a'] = 1 < 2, so no shrinking needed
  • Add l = 2 to answer → ans = 4
  • This counts substrings "abba" (starting at 0) and "bba" (starting at 1)

Final Result: ans = 4

The valid substrings are:

  1. "abb" - 'b' appears 2 times
  2. "abba" - 'b' appears 2 times
  3. "bb" - 'b' appears 2 times
  4. "bba" - 'b' appears 2 times

The algorithm correctly identifies that once we have a minimal window with a character appearing ≥ k times, all extensions to the left create valid substrings.

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def numberOfSubstrings(self, s: str, k: int) -> int:
5        # Counter to track frequency of characters in current window
6        char_count = Counter()
7      
8        # Total number of valid substrings
9        total_substrings = 0
10      
11        # Left pointer of the sliding window
12        left = 0
13      
14        # Iterate through string with right pointer
15        for right, char in enumerate(s):
16            # Add current character to the window
17            char_count[char] += 1
18          
19            # While current character appears at least k times
20            # All substrings ending at current position and starting 
21            # from index 0 to left-1 are valid
22            while char_count[char] >= k:
23                # Remove leftmost character from window
24                char_count[s[left]] -= 1
25                # Move left pointer forward
26                left += 1
27          
28            # Add count of valid substrings ending at current position
29            # These are substrings from index 0 to left-1 that end at right
30            total_substrings += left
31      
32        return total_substrings
33
1class Solution {
2    public int numberOfSubstrings(String s, int k) {
3        // Array to store the frequency of each character (26 lowercase letters)
4        int[] charFrequency = new int[26];
5      
6        // Total count of valid substrings
7        int totalSubstrings = 0;
8      
9        // Left pointer for the sliding window
10        int leftPointer = 0;
11      
12        // Iterate through the string with the right pointer
13        for (int rightPointer = 0; rightPointer < s.length(); rightPointer++) {
14            // Get the current character index (0-25 for 'a'-'z')
15            int currentCharIndex = s.charAt(rightPointer) - 'a';
16          
17            // Increment the frequency of the current character
18            charFrequency[currentCharIndex]++;
19          
20            // Shrink the window from the left while the current character appears at least k times
21            while (charFrequency[currentCharIndex] >= k) {
22                // Decrement the frequency of the character at the left pointer
23                int leftCharIndex = s.charAt(leftPointer) - 'a';
24                charFrequency[leftCharIndex]--;
25              
26                // Move the left pointer to the right
27                leftPointer++;
28            }
29          
30            // Add the number of valid substrings ending at rightPointer
31            // All substrings from index 0 to (leftPointer-1) combined with 
32            // the current window form valid substrings
33            totalSubstrings += leftPointer;
34        }
35      
36        return totalSubstrings;
37    }
38}
39
1class Solution {
2public:
3    int numberOfSubstrings(string s, int k) {
4        int n = s.size();
5        int result = 0;
6        int leftPointer = 0;
7        int charFrequency[26] = {0};  // Array to store frequency of each character (a-z)
8      
9        // Iterate through each character in the string
10        for (char& currentChar : s) {
11            // Increment frequency of current character
12            charFrequency[currentChar - 'a']++;
13          
14            // Shrink window from left while current character frequency >= k
15            while (charFrequency[currentChar - 'a'] >= k) {
16                // Decrement frequency of leftmost character and move left pointer
17                charFrequency[s[leftPointer] - 'a']--;
18                leftPointer++;
19            }
20          
21            // Add number of valid substrings ending before current position
22            result += leftPointer;
23        }
24      
25        return result;
26    }
27};
28```
29
30However, if the goal is to count substrings where at least one character appears k or more times, here's the corrected version:
31
32```cpp
33class Solution {
34public:
35    int numberOfSubstrings(string s, int k) {
36        int n = s.size();
37        int result = 0;
38      
39        // For each starting position
40        for (int start = 0; start < n; start++) {
41            int charFrequency[26] = {0};  // Frequency array for characters a-z
42          
43            // Extend window from current start position
44            for (int end = start; end < n; end++) {
45                // Update frequency of current character
46                charFrequency[s[end] - 'a']++;
47              
48                // Check if any character has frequency >= k
49                bool hasValidChar = false;
50                for (int i = 0; i < 26; i++) {
51                    if (charFrequency[i] >= k) {
52                        hasValidChar = true;
53                        break;
54                    }
55                }
56              
57                // If valid, count this substring
58                if (hasValidChar) {
59                    result++;
60                }
61            }
62        }
63      
64        return result;
65    }
66};
67
1function numberOfSubstrings(s: string, k: number): number {
2    // Total count of valid substrings
3    let validSubstringCount: number = 0;
4  
5    // Left pointer for the sliding window
6    let leftPointer: number = 0;
7  
8    // Array to track character frequency (26 letters in alphabet)
9    const characterFrequency: number[] = Array(26).fill(0);
10  
11    // Iterate through each character in the string (right pointer of sliding window)
12    for (const currentChar of s) {
13        // Convert character to index (0-25 for a-z)
14        const charIndex: number = currentChar.charCodeAt(0) - 'a'.charCodeAt(0);
15      
16        // Increment frequency of current character
17        characterFrequency[charIndex]++;
18      
19        // Shrink window from left while current character appears k or more times
20        // This maintains the invariant that no character in the window appears >= k times
21        while (characterFrequency[charIndex] >= k) {
22            // Get the index of the character at left pointer
23            const leftCharIndex: number = s[leftPointer].charCodeAt(0) - 'a'.charCodeAt(0);
24          
25            // Decrement its frequency and move left pointer right
26            characterFrequency[leftCharIndex]--;
27            leftPointer++;
28        }
29      
30        // All substrings ending at current position with starting index < leftPointer
31        // are valid (they contain at least one character appearing >= k times)
32        validSubstringCount += leftPointer;
33    }
34  
35    return validSubstringCount;
36}
37

Time and Space Complexity

The time complexity is O(n), where n is the length of the string s. This is because we iterate through each character in the string exactly once with the outer loop. While there is an inner while loop, each character can only be processed by the left pointer l at most once throughout the entire execution (the left pointer only moves forward and never backwards), making the amortized cost O(1) per character. Therefore, the total operations are bounded by 2n, which simplifies to O(n).

The space complexity is O(|Σ|), where Σ is the character set. Since the problem deals with lowercase letters (as implied by the reference answer), we have |Σ| = 26. The Counter object cnt stores at most one entry for each distinct character that appears in the string, which is bounded by the size of the alphabet. Therefore, the space complexity is O(26) = O(1) in terms of constant space, or more precisely O(|Σ|) when considering the character set size as a parameter.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Misunderstanding the Counting Logic

The Problem: A common mistake is thinking that when we find a valid window [l, r] where some character appears at least k times, we should count ALL substrings within that window. This leads to incorrect counting logic like:

# INCORRECT approach
if char_count[char] >= k:
    # Wrong: counting all substrings in current window
    total_substrings += (right - left + 1)

Why It's Wrong: This counts substrings that may not contain any character appearing k times. For example, if window [0, 3] has character 'a' appearing 3 times, the substring [1, 1] within it might only have 'b', which doesn't satisfy our condition.

The Correct Understanding: The solution counts substrings that END at position right. When we shrink the window to [left, right] where no character appears ≥ k times, it means all substrings ending at right and starting from positions 0 to left-1 ARE valid (because they contain the previous valid window as a subset).

Pitfall 2: Incorrect Window Shrinking Condition

The Problem: Some might try to shrink the window while ANY character in the window has count ≥ k:

# INCORRECT shrinking condition
while any(count >= k for count in char_count.values()):
    char_count[s[left]] -= 1
    left += 1

Why It's Wrong: This is inefficient and conceptually incorrect. We only need to shrink when the NEWLY ADDED character at position right reaches count k. This ensures we find the minimal valid window ending at each position.

The Correct Approach: Only shrink while the current character being processed has count ≥ k:

while char_count[char] >= k:  # 'char' is s[right]
    char_count[s[left]] -= 1
    left += 1

Pitfall 3: Off-by-One Error in Counting

The Problem: After shrinking the window, some might incorrectly count:

# INCORRECT counting
total_substrings += left + 1  # Wrong!

Why It's Wrong: After shrinking, left points to the first position where the window [left, right] does NOT satisfy our condition. This means valid substrings ending at right start from indices 0 to left-1, which gives us exactly left valid substrings, not left + 1.

The Correct Count:

total_substrings += left  # Correct: counts substrings from [0,right] to [left-1,right]
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which algorithm should you use to find a node that is close to the root of the tree?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More