3329. Count Substrings With K-Frequency Characters II 🔒
Problem Description
You are given a string s
and an integer k
. Your task is to find the total number of substrings of s
where at least one character appears at least k
times.
A substring is a contiguous sequence of characters within the string. For example, if s = "aaab"
and k = 2
, the substrings where at least one character appears at least 2 times would include "aa" (where 'a' appears 2 times), "aaa" (where 'a' appears 3 times), "aaab" (where 'a' appears 3 times), and so on.
The solution uses a sliding window technique with two pointers. The right pointer expands the window by iterating through each character, while the left pointer contracts the window when we find a character that appears at least k
times. The key insight is that once we have a character appearing at least k
times in our current window [l, r]
, any substring ending at position r
and starting from positions 0
to l-1
will also satisfy the condition (since they all contain the window as a subset). Therefore, we add l
to our answer for each valid position.
The algorithm maintains a counter to track character frequencies in the current window. When a character's count reaches k
, we shrink the window from the left until no character has a count of k
or more, counting all valid substrings along the way.
Intuition
The key observation is that we need to count substrings where at least one character appears k
or more times. Instead of directly counting these "valid" substrings, we can think inversely - what if we count substrings where NO character appears k
or more times, and then use this to find our answer?
When we fix a right endpoint r
and think about all possible left endpoints, we notice a pattern: there exists some position l
such that:
- For all substrings ending at
r
with starting positions from0
tol-1
, at least one character appearsk
or more times (these are valid) - For all substrings ending at
r
with starting positions froml
tor
, no character appearsk
or more times (these are invalid)
This creates a natural division point. As we move our right pointer through the string, we maintain a window [l, r]
where no character appears k
or more times. When we add a new character at position r
:
- If adding it doesn't make any character count reach
k
, we simply continue - If adding it makes some character count reach
k
, we need to shrink from the left until no character has count≥ k
The brilliant insight is that when we're forced to move the left pointer from position l
to some new position, all the substrings ending at r
with starting positions [0, 1, ..., l-1]
are valid substrings. Why? Because if the window [l, r]
contains a character with count k
, then any larger window that includes [l, r]
will also contain that character at least k
times.
Therefore, for each position r
, we add l
to our answer - representing the count of all valid substrings ending at r
. This efficiently counts all valid substrings in O(n)
time using the two-pointer technique.
Learn more about Sliding Window patterns.
Solution Approach
The solution implements a sliding window technique with two pointers to efficiently count the valid substrings.
Data Structures Used:
cnt
: A Counter (hash map) to track the frequency of each character in the current windowl
: Left pointer of the sliding windowans
: Accumulator for the final answer
Algorithm Steps:
-
Initialize Variables: Start with an empty counter
cnt
, left pointerl = 0
, and answerans = 0
. -
Iterate Through String: For each character
c
at positionr
(implicitly the right pointer):- Add the character to the window:
cnt[c] += 1
- Add the character to the window:
-
Shrink Window When Needed: While the current character
c
appears at leastk
times in the window:- Remove the leftmost character from the window:
cnt[s[l]] -= 1
- Move the left pointer right:
l += 1
- This ensures the window
[l, r]
contains no character with frequency≥ k
- Remove the leftmost character from the window:
-
Count Valid Substrings: After adjusting the window, add
l
to the answer:ans += l
- This counts all substrings ending at position
r
that start from positions0
tol-1
-
Return Result: After processing all characters, return
ans
.
Why This Works:
The window [l, r]
maintains the invariant that no character appears k
or more times within it. When we're at position r
:
- Substrings from
[0, r]
,[1, r]
, ...,[l-1, r]
all contain at least one character appearing≥ k
times - Substrings from
[l, r]
,[l+1, r]
, ...,[r, r]
have no character appearing≥ k
times
By maintaining this invariant and adding l
for each position r
, we correctly count all valid substrings exactly once.
Time Complexity: O(n)
where n
is the length of the string, as each character is visited at most twice (once by right pointer, once by left pointer).
Space Complexity: O(1)
or O(26)
for the counter, as we only store counts for at most 26 lowercase English letters.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the algorithm with s = "aaab"
and k = 3
.
We want to count all substrings where at least one character appears at least 3 times.
Initial State:
l = 0
,ans = 0
,cnt = {}
Step 1: r = 0, character = 'a'
- Add 'a' to window:
cnt = {'a': 1}
- Check: 'a' appears 1 time (< 3), so no shrinking needed
- Add
l
to answer:ans = 0 + 0 = 0
- Window:
[0,0]
= "a"
Step 2: r = 1, character = 'a'
- Add 'a' to window:
cnt = {'a': 2}
- Check: 'a' appears 2 times (< 3), so no shrinking needed
- Add
l
to answer:ans = 0 + 0 = 0
- Window:
[0,1]
= "aa"
Step 3: r = 2, character = 'a'
- Add 'a' to window:
cnt = {'a': 3}
- Check: 'a' appears 3 times (≥ 3), so we need to shrink!
- Shrinking process:
- Remove s[0]='a':
cnt = {'a': 2}
,l = 1
- Check: 'a' appears 2 times (< 3), shrinking complete
- Remove s[0]='a':
- Add
l
to answer:ans = 0 + 1 = 1
- Window:
[1,2]
= "aa" - This counts the substring
[0,2]
= "aaa" as valid
Step 4: r = 3, character = 'b'
- Add 'b' to window:
cnt = {'a': 2, 'b': 1}
- Check: no character appears ≥ 3 times, so no shrinking needed
- Add
l
to answer:ans = 1 + 1 = 2
- Window:
[1,3]
= "aab" - This counts the substring
[0,3]
= "aaab" as valid
Final Answer: 2
The valid substrings are:
- "aaa" (positions 0-2): 'a' appears 3 times
- "aaab" (positions 0-3): 'a' appears 3 times
Note how the algorithm cleverly counts these by tracking when the window needs to shrink. When we had to move l
from 0 to 1 at step 3, it meant that any substring ending at position 2 and starting before position 1 would be valid. Similarly, at step 4, since l = 1
, the substring starting at position 0 and ending at position 3 is valid.
Solution Implementation
1from collections import Counter
2
3class Solution:
4 def numberOfSubstrings(self, s: str, k: int) -> int:
5 # Counter to track frequency of characters in current window
6 char_count = Counter()
7
8 # Total count of valid substrings and left pointer for sliding window
9 total_substrings = 0
10 left = 0
11
12 # Iterate through string with right pointer
13 for right, char in enumerate(s):
14 # Add current character to the window
15 char_count[char] += 1
16
17 # When current character appears at least k times,
18 # all substrings ending at current position and starting
19 # from index 0 to left (inclusive) are valid
20 while char_count[char] >= k:
21 # Remove leftmost character from window
22 char_count[s[left]] -= 1
23 left += 1
24
25 # Add count of valid substrings ending at current position
26 # (all substrings from index 0 to left-1)
27 total_substrings += left
28
29 return total_substrings
30
1class Solution {
2 public long numberOfSubstrings(String s, int k) {
3 // Array to track frequency of each character (26 lowercase letters)
4 int[] charFrequency = new int[26];
5 long totalSubstrings = 0;
6
7 // Two-pointer sliding window approach
8 int left = 0;
9
10 for (int right = 0; right < s.length(); right++) {
11 // Add current character to the window
12 int currentCharIndex = s.charAt(right) - 'a';
13 charFrequency[currentCharIndex]++;
14
15 // Shrink window from left while current character appears at least k times
16 while (charFrequency[currentCharIndex] >= k) {
17 // Remove leftmost character from window
18 int leftCharIndex = s.charAt(left) - 'a';
19 charFrequency[leftCharIndex]--;
20 left++;
21 }
22
23 // All substrings ending at right with starting positions [0, left-1]
24 // satisfy the condition (contain at least one character with frequency >= k)
25 totalSubstrings += left;
26 }
27
28 return totalSubstrings;
29 }
30}
31
1class Solution {
2public:
3 long long numberOfSubstrings(string s, int k) {
4 int n = s.size();
5 long long result = 0;
6 int leftIndex = 0;
7
8 // Frequency array for 26 lowercase letters
9 int charFrequency[26] = {0};
10
11 // Iterate through each character as the right boundary of the window
12 for (int rightIndex = 0; rightIndex < n; rightIndex++) {
13 // Increment frequency of current character
14 charFrequency[s[rightIndex] - 'a']++;
15
16 // Shrink window from left while current character appears k or more times
17 while (charFrequency[s[rightIndex] - 'a'] >= k) {
18 charFrequency[s[leftIndex] - 'a']--;
19 leftIndex++;
20 }
21
22 // Add count of valid substrings ending at rightIndex
23 // All substrings from index 0 to leftIndex-1 combined with
24 // substring from their position to rightIndex are valid
25 result += leftIndex;
26 }
27
28 return result;
29 }
30};
31
1/**
2 * Counts the number of substrings where at least one character appears k or more times
3 * Uses a sliding window approach with two pointers
4 *
5 * @param s - The input string to analyze
6 * @param k - The minimum frequency threshold for a character
7 * @returns The total count of valid substrings
8 */
9function numberOfSubstrings(s: string, k: number): number {
10 // Initialize result counter and left pointer for sliding window
11 let result: number = 0;
12 let leftPointer: number = 0;
13
14 // Array to track frequency of each lowercase letter (a-z)
15 // Index 0 represents 'a', index 1 represents 'b', etc.
16 const charFrequency: number[] = Array(26).fill(0);
17
18 // Iterate through string with right pointer
19 for (const currentChar of s) {
20 // Convert character to array index (0-25 for a-z)
21 const charIndex: number = currentChar.charCodeAt(0) - 97;
22
23 // Increment frequency of current character
24 charFrequency[charIndex]++;
25
26 // Shrink window from left while current character appears k or more times
27 // This maintains the invariant that no character in the window has frequency >= k
28 while (charFrequency[charIndex] >= k) {
29 // Get character at left pointer and decrease its frequency
30 const leftCharIndex: number = s[leftPointer].charCodeAt(0) - 97;
31 charFrequency[leftCharIndex]--;
32 leftPointer++;
33 }
34
35 // Add number of valid substrings ending at current position
36 // All substrings from index 0 to leftPointer-1 combined with
37 // the current suffix form valid substrings
38 result += leftPointer;
39 }
40
41 return result;
42}
43
Time and Space Complexity
The time complexity is O(n)
, where n
is the length of the string s
. Each character in the string is visited at most twice - once when the right pointer (implicit in the for loop) moves forward, and potentially once when the left pointer l
moves forward. Since each character is processed a constant number of times, the overall time complexity remains linear.
The space complexity is O(|Σ|)
, where Σ
is the character set. The Counter
object cnt
stores at most all unique characters that appear in the string. Since the problem deals with lowercase letters (based on the reference answer), we have |Σ| = 26
, making the space complexity O(26) = O(1)
constant space.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall 1: Misunderstanding the Counting Logic
The Issue:
A common mistake is thinking that we should add r - l + 1
(the number of substrings ending at r
within the current window) instead of l
. This misunderstanding stems from confusing which substrings are valid.
Why It's Wrong:
- The window
[l, r]
maintains that NO character appears ≥ k times within it - We want to count substrings where AT LEAST ONE character appears ≥ k times
- Therefore, we count substrings that extend BEYOND the current window to the left
Correct Understanding:
- When we're at position
r
, substrings[0, r]
,[1, r]
, ...,[l-1, r]
are valid (they contain characters with frequency ≥ k) - Substrings
[l, r]
,[l+1, r]
, ...,[r, r]
are NOT valid (no character appears ≥ k times) - So we add
l
(the count of valid substrings), notr - l + 1
Pitfall 2: Incorrect Window Shrinking Condition
The Issue: Some might write the shrinking condition as:
while any(count >= k for count in char_count.values()):
char_count[s[left]] -= 1
left += 1
Why It's Wrong:
This checks if ANY character in the window has frequency ≥ k, but we only need to shrink when the NEWLY ADDED character char
reaches frequency k. Checking all characters is:
- Inefficient (O(26) check per iteration)
- Logically incorrect for maintaining our invariant
Correct Approach:
while char_count[char] >= k: # Only check the character we just added char_count[s[left]] -= 1 left += 1
Pitfall 3: Off-by-One Error in Counting
The Issue:
Adding left + 1
instead of left
to the answer, thinking we need to include the substring starting at position left
.
Why It's Wrong:
- After the while loop, the window
[left, right]
has no character with frequency ≥ k - The substring
[left, right]
is NOT valid - Only substrings
[0, right]
through[left-1, right]
are valid - There are exactly
left
such substrings (indices 0 through left-1)
Correct Implementation:
total_substrings += left # Not left + 1
Complete Corrected Solution:
from collections import Counter
class Solution:
def numberOfSubstrings(self, s: str, k: int) -> int:
char_count = Counter()
total_substrings = 0
left = 0
for right, char in enumerate(s):
char_count[char] += 1
# Only shrink when the current character reaches k occurrences
while char_count[char] >= k:
char_count[s[left]] -= 1
left += 1
# Add exactly 'left' valid substrings ending at position right
total_substrings += left
return total_substrings
Which data structure is used to implement priority queue?
Recommended Readings
https assets algo monster cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in the window The window
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!