2067. Number of Equal Count Substrings

MediumStringCountingPrefix Sum
Leetcode Link

Problem Description

In the given LeetCode problem, we need to deal with a string s that is composed of lowercase English letters, and we are provided with an integer count. The problem's central concept is to identify "equal count substrings." An equal count substring is defined as a substring where each unique letter appears exactly count times within it.

Essentially, the task is to find all the substrings where the frequency of each distinct letter within the substring equals the given count. We then need to return the total count of such substrings.

Given the nature of strings and substrings, the problem might appear complex because there can be a lot of substrings to consider, especially as the length of the string increases. We are asked to calculate the number of substrings that meet the criteria, not to list them, which slightly simplifies the task.

Intuition

The intuition behind the solution is to leverage the sliding window technique combined with frequency counting for every unique letter. Here's a breakdown of how we can approach the problem:

  1. We need to find substrings that contain each unique character count times. Since we are limited to lowercase English letters, there is a maximum of 26 unique characters that can appear in substrings.

  2. We initialize a counter called ans that will keep track of the total number of equal count substrings found.

  3. Begin by iterating over a range from 1 to 27, which corresponds to the possible number of unique characters in a substring (as there are 26 lowercase letters). The variable x represents this count of unique characters.

  4. For each x, we calculate m, the minimum length a substring needs to be to contain x unique characters count times each. If m exceeds the string length, we can break out of the loop because it is no longer possible for any further substrings to meet the criteria.

  5. We use a sliding window method in combination with a counter cnt to keep track of the occurrences of each character as we slide through the string.

  6. By moving the right edge of the window, we add characters to the cnt and keep track of how many characters meet the count condition (y).

  7. Once the window reaches the size m, we start moving the left edge to remove characters from the cnt. Adjust the counts for the character we remove and modify the y accordingly to reflect the current state of our window.

  8. We increment ans each time the number of characters meeting the criterion x is the same as y, which implies we found an equal count substring.

By the end of this process, we will have iterated over all possible substrings that can have an equal count of all its unique characters. Each time the window slides and the conditions match, we would increment ans, and finally, we return the value of ans as the result.

Solution Approach

The solution employs a combination of the sliding window technique and a frequency counter (Counter) to efficiently identify all equal count substrings within the given string s.

Here's how the implementation works:

  • A Python Counter from the collections module is utilized to maintain the frequency of each character within the current sliding window.

  • The solution begins by iterating over all potential counts of unique characters a substring might have (x), ranging from 1 to 26. This corresponds to the possible number of characters (considering the alphabet has 26 letters) that can fulfill the equal count condition.

  • m is calculated as count * x and represents the minimum size the window needs to have for there to be x unique characters each appearing count times.

  • The sliding window is iterated across the string s using a for-loop. For each character c encountered, the Counter cnt is updated to include the new character.

  • y tracks how many characters within the current window meet the condition of appearing exactly count times.

  • When the end of the window goes beyond index i - m, indicating that the window is larger than the size needed to fit x unique characters count times each, we start to shrink the window from the left.

  • As we shrink the window, we decrease the frequency of the character that goes out of the window and update y to reflect whether the character going out still meets the count condition or not.

  • If the current window exactly contains x characters that meet the count condition, then this window is a valid equal count substring, and ans is incremented.

  • The above steps are repeated for each x until all potential substring lengths have been checked or until m exceeds the length of the string.

The Python Counter data structure speedily updates and keeps track of character frequencies, and the sliding window technique ensures that the algorithm considers each substring without redundant recalculations.

In code, this is represented as:

1class Solution:
2    def equalCountSubstrings(self, s: str, count: int) -> int:
3        ans = 0
4        for x in range(1, 27):
5            m = count * x
6            if m > len(s):
7                break
8            cnt = Counter()
9            y = 0
10            for i, c in enumerate(s):
11                cnt[c] += 1
12                y += cnt[c] == count
13                y -= cnt[c] == count + 1
14                j = i - m
15                if j >= 0:
16                    cnt[s[j]] -= 1
17                    y += cnt[s[j]] == count
18                    y -= cnt[s[j]] == count - 1
19                ans += x == y
20        return ans

Each iteration adaptive to the character presence updates the answer ans when a valid window is found. The algorithm's complexity is primarily driven by the length of the input string s and the interaction between the count of unique characters and the sliding window.

💪
Level Up Your
Algo Skills

Example Walkthrough

Let's walk through a small example to illustrate the solution approach with a string s = "aabbcc" and count = 2.

We must find substrings where each distinct character appears exactly twice. Here's how the sliding window technique and frequency counting would be used:

  1. We iterate from x = 1 to x = 26, where x represents the number of unique characters that must each appear count times within a substring.

  2. Let's start with x = 1. The minimum length m for a substring to have one unique character appearing twice is m = count * x = 2 * 1 = 2.

  3. We initialize a counter cnt and a variable y to keep track of characters meeting the condition.

  4. As we slide through the string, we will encounter "aa", "bb", and "cc" as substrings of length m = 2. Each time we slide and the condition matches (where x = y), we increment ans.

  5. For our example, "aa", "bb", and "cc" are valid substrings when x=1. So, ans = 3 after processing for x=1.

  6. Next, take x = 2. Here, m = count * x = 2 * 2 = 4. We use the window size of 4, slide through the string, and find "aabb", "bbcc".

  7. For x = 2, our window "aabb" and "bbcc" are valid, so ans = ans + 2 = 3 + 2 = 5.

  8. We proceed with this process, increasing x until m exceeds the string length or until x = 26.

  9. In this example, for x = 3, m = count * x = 2 * 3 = 6, which is equal to the length of the string. So "aabbcc" is also a valid substring, ans = ans + 1 = 5 + 1 = 6.

  10. Since we've reached a window length equal to the entire string, we stop our iteration here as no longer substrings can be formed.

Through this process for each possible value of x, we have found the total number of equal count substrings to be 6. This method aggregates counts for all possible window sizes that match the condition of having each unique character appear exactly count times.

Python Solution

1from collections import Counter
2
3class Solution:
4    def equalCountSubstrings(self, s: str, count: int) -> int:
5        num_substrings = 0  # Initialize the number of valid substrings
6
7        # Iterate through the possible numbers of unique characters (1 to 26 for the alphabet)
8        for unique_chars in range(1, 27):
9            # Calculate the length of the substring that must be considered given the count
10            required_length = count * unique_chars
11            # If the required length exceeds the string length, no further consideration is needed
12            if required_length > len(s):
13                break
14          
15            # Initialize a counter for the occurrence of characters
16            char_count = Counter()
17            # Initialize a variable to track the number of unique characters with exactly 'count' occurrences
18            current_count_match = 0
19          
20            # Iterate through characters of the string
21            for index, char in enumerate(s):
22                char_count[char] += 1
23                # If a character reaches the exact 'count', increment the current count match
24                if char_count[char] == count:
25                    current_count_match += 1
26                # If a character exceeds the 'count', decrement the current count match
27                elif char_count[char] == count + 1:
28                    current_count_match -= 1
29              
30                # Calculate the start index of the current window 
31                start_index = index - required_length
32                # If start index is valid, update the character count and current count match
33                if start_index >= 0:
34                    char_count[s[start_index]] -= 1
35                    # If count of a character becomes 'count', increment the current count match
36                    if char_count[s[start_index]] == count:
37                        current_count_match += 1
38                    # If count of a character falls below 'count', decrement the current count match
39                    elif char_count[s[start_index]] == count - 1:
40                        current_count_match -= 1
41              
42                # If the current count match is equal to the number of unique characters needed, 
43                # increment the result
44                num_substrings += unique_chars == current_count_match
45
46        # Return the total number of valid substrings
47        return num_substrings
48

Java Solution

1class Solution {
2    public int equalCountSubstrings(String s, int count) {
3        int answer = 0; // This will contain the final count of valid substrings.
4        int stringLength = s.length(); // The length of the input string.
5      
6        // Iterate over possible distinct character counts from 1 to 26 (inclusive),
7        // but only if 'count * x' does not exceed the length of the string.
8        for (int numDistinctChars = 1; numDistinctChars <= 26 && count * numDistinctChars <= stringLength; ++numDistinctChars) {
9            // The length of the substring we're looking for based on 'numDistinctChars'.
10            int substringLength = count * numDistinctChars;
11            int[] charCount = new int[26]; // Count of each character in the current window.
12            int validCharCount = 0; // Number of characters with exactly 'count' occurrences in the current window.
13          
14            // Iterate over the characters of the string.
15            for (int i = 0; i < stringLength; ++i) {
16                int currentCharIndex = s.charAt(i) - 'a'; // Convert char to index (0-25).
17                ++charCount[currentCharIndex]; // Increment the count for this character.
18              
19                // Update validCharCount for the current character count.
20                if (charCount[currentCharIndex] == count) {
21                    ++validCharCount;
22                }
23                if (charCount[currentCharIndex] == count + 1) {
24                    --validCharCount;
25                }
26              
27                // Remove the character from the start of the window if it's outside the window.
28                int startWindowIndex = i - substringLength;
29                if (startWindowIndex >= 0) {
30                    int startCharIndex = s.charAt(startWindowIndex) - 'a';
31                    --charCount[startCharIndex]; // Decrement the count for this character.
32                  
33                    // Update validCharCount after decrementing.
34                    if (charCount[startCharIndex] == count) {
35                        ++validCharCount;
36                    }
37                    if (charCount[startCharIndex] == count - 1) {
38                        --validCharCount;
39                    }
40                }
41              
42                // If the number of valid characters matches the distinct character count,
43                // we've found a valid substring, so increment the answer.
44                if (numDistinctChars == validCharCount) {
45                    ++answer;
46                }
47            }
48        }
49        return answer; // Return the total count of valid substrings.
50    }
51}
52

C++ Solution

1class Solution {
2public:
3    int equalCountSubstrings(string s, int count) {
4        int totalSubstrings = 0; // This will hold the total count of substrings found
5        int strLength = s.size(); // The length of the input string
6        int charCount[26]; // Array to keep count of each character within a window
7
8        // Iterate over possible lengths of substrings which are multiples of count
9        for (int substringFactor = 1; substringFactor * count <= strLength; ++substringFactor) {
10            int windowSize = count * substringFactor; // Calculate window size
11            memset(charCount, 0, sizeof charCount); // Initialize character counts to 0
12            int currentFactorCount = 0; // Tracks how many characters have reached the 'count' within the window
13
14            // Iterate over each character in the input string
15            for (int i = 0; i < strLength; ++i) {
16                int charIndex = s[i] - 'a'; // Convert character to index (0-25)
17                ++charCount[charIndex]; // Increment the count for this character
18
19                // Increment currentFactorCount if count for this character has reached 'count'
20                if (charCount[charIndex] == count) {
21                    ++currentFactorCount;
22                } else if (charCount[charIndex] == count + 1) {
23                    --currentFactorCount; // Decrement if the count exceeds 'count'
24                }
25
26                // If we've moved past the first window, start removing characters that fall out
27                if (i >= windowSize) {
28                    int removeIndex = s[i - windowSize] - 'a'; // Character to remove from window
29                    --charCount[removeIndex]; // Decrement the count for removed character
30
31                    // Adjust currentFactorCount based on the new count of the removed character
32                    if (charCount[removeIndex] == count) {
33                        ++currentFactorCount;
34                    } else if (charCount[removeIndex] == count - 1) {
35                        --currentFactorCount;
36                    }
37                }
38
39                // If currentFactorCount equals substringFactor, it means we have a valid substring
40                totalSubstrings += (substringFactor == currentFactorCount);
41            }
42        }
43        return totalSubstrings; // Return the total count of valid substrings
44    }
45};
46

Typescript Solution

1/**
2 * Counts the number of substrings where each unique character occurs exactly `count` times.
3 * @param {string} s - The input string.
4 * @param {number} count - The exact number of times each character should repeat in a substring.
5 * @return {number} The count of valid substrings.
6 */
7var equalCountSubstrings = function(s: string, count: number): number {
8    let answer: number = 0;
9    const strLength: number = s.length;
10
11    // Iterate through possible substrings with unique character counts from 1 to 26
12    for (let numOfUniqueChars = 1; 
13         numOfUniqueChars <= 26 && numOfUniqueChars * count <= strLength; 
14         ++numOfUniqueChars) {
15        const windowSize: number = numOfUniqueChars * count;
16        const charCount: number[] = new Array(26).fill(0);
17        let validCount: number = 0;
18      
19        // Iterate through the string to count the occurrences of each character
20        for (let i = 0; i < strLength; ++i) {
21            const currentCharIndex: number = s.charCodeAt(i) - 'a'.charCodeAt(0);
22            ++charCount[currentCharIndex];
23            validCount += charCount[currentCharIndex] === count ? 1 : 0;
24            validCount -= charCount[currentCharIndex] === count + 1 ? 1 : 0;
25          
26            const startIndex = i - windowSize;
27          
28            // Decrement the count for the character going out of the window
29            if (startIndex >= 0) {
30                const oldCharIndex: number = s.charCodeAt(startIndex) - 'a'.charCodeAt(0);
31                --charCount[oldCharIndex];
32                validCount += charCount[oldCharIndex] === count ? 1 : 0;
33                validCount -= charCount[oldCharIndex] === count - 1 ? 1 : 0;
34            }
35          
36            // If the number of valid characters equals the number of unique characters,
37            // it is a valid substring.
38            answer += numOfUniqueChars === validCount ? 1 : 0;
39        }
40    }
41  
42    return answer;
43};
44

Time and Space Complexity

Time Complexity

The provided Python code involves nested loops where the outer loop iterates over a range determined by the input string length and the constant count. Specifically, the outer loop can run up to 26 times representing the number of unique letters in the English alphabet, and it breaks early if the count multiplied by the current index exceeds the length of the string s.

  • The outer loop runs at most 26 times (for each possible unique character count x).
  • The inner loop runs linearly with the length of the string s, processing each character exactly once.
  • Within the inner loop, operations such as incrementing Counter values, conditional checks, and +/- operations are constant time.

The time complexity, therefore, would be O(26 * n) which simplifies to O(n), where n is the length of string s.

However, if we delve deeper, we may notice that the computations depend not only on n but also on m which is count * x. The inner loop performs work proportional to both n and the window size determined by m. So, even though we iterate over the string once, we must consider the cost of maintaining the sliding window. Therefore, the time complexity is more accurately represented as O(n * x), where x can be up to a maximum of n / count.

Space Complexity

  • The Counter object holds at most 26 keys at any time (for each letter of the English alphabet), but the number of keys will actually correspond to the number of unique characters in each considered substring of s, bound by m.

  • The variables ans, y, x, m, i, j, and c use constant space.

Thus, the space complexity is primarily determined by the Counter object, which is O(1) because the number of unique characters in Counter is capped by the alphabet size, which is a constant.

In conclusion, the code has a linear time complexity with respect to the size of the string 's' and a constant space complexity.

😈
Become an
Algo Monster

Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫