2953. Count Complete Substrings

HardHash TableStringSliding Window
Leetcode Link

Problem Description

In this problem, we are given two inputs: a string word and an integer k. Our task is to determine the number of substrings of word that are considered "complete." A substring is considered complete if it adheres to two specific rules:

  1. Every character within the substring occurs exactly k times.
  2. The difference in alphabet positions between any two adjacent characters in the substring is no more than 2.

The string word consists of lowercase English letters, and we have to find non-empty substrings that satisfy the above criteria. The ultimate goal is to return the count of such complete substrings from the given word.

Intuition

The intuition behind the solution approach is to recognize that the problem can be broken down into more manageable chunks by first dealing with condition 2 (the difference between adjacent characters must be at most 2). We can iterate through the string and identify substrings that meet this second criterion as separated segments.

Once we have those segments, we can focus on counting the number of complete substrings within each segment that also satisfy condition 1 (each character in the substring occurs exactly k times). This involves a bit more complexity since we now have to track character frequencies within a sliding window.

To approach this, we use a sliding window technique, allowing us to dynamically update the frequency of characters as we move through the substring. By doing so, we can efficiently keep track of the number of characters appearing exactly k times within any given window length — and as a result, we can count complete substrings that meet both of the stated conditions.

We encapsulate the counting logic within a helper function f(s), which takes a substring s (already satisfying condition 2) and returns the number of complete substrings (satisfying both conditions 1 and 2). By enumerating the possible number of character types in s and sliding a window of a suitable length over s, we can find and count complete substrings. This application of the sliding window technique alongside character frequency counting is the crux of the problem's solution.

Learn more about Sliding Window patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

What are the two properties the problem needs to have for dynamic programming to be applicable? (Select 2)

Solution Approach

The implementation of the solution employs a combination of the sliding window pattern, hash tables (dictionaries in Python), and character frequency analysis. Here's how we could break down the approach:

Firstly, we define the inner function f(s), which operates on a substring s. Its goal is to count the complete substrings of s where each character appears exactly k times. The function works as follows:

  1. The enumaration of each unique chararacter type ranging from 1 to 26 (as there are 26 letters in the English alphabet) is conducted.
  2. For each character type i, calculate the required length of the sliding window l = i * k. This results from the rule that each character must appear exactly k times to form a complete substring.
  3. The cnt dictionary tracks each character's frequency within the current window, and freq records how many characters appear with each frequency.
  4. If freq[k] is equal to i, we've found a valid substring, as exactly i different characters occur k times each.
  5. While enumerating through s, the character frequencies are updated: whenever the window slides right, we update frequencies for the new character included and the one that's excluded.

Then, using the main function:

  1. A loop through the string word splits it into segments where each segment satisfies condition 2 (the maximum difference in alphabet positions between adjacent characters is 2). We split by incrementing j until the condition is violated.
  2. For each segment found, we invoke f(s) to get the count of complete substrings within it.
  3. We accumulate these counts to maintain an overall total count of complete substrings.

This strategy of dividing the problem into segments (according to condition 2) and solving each one independently with our sliding window and frequency count (for condition 1) makes for an efficient and methodical solution. By using hash tables, we can quickly access and update character frequencies, making the algorithm relatively fast and effective for the task at hand.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which algorithm is best for finding the shortest distance between two points in an unweighted graph?

Example Walkthrough

Let's take a small example to illustrate the solution approach. Suppose we are given the string word = "abcabbcc" and k = 2. Our goal is to count the number of complete substrings following the rules mentioned.

Firstly, we split the string word into segments where the alphabetical difference between adjacent characters is no more than 2.

  1. Starting from the first character, 'a', we move forward until this condition is violated. We find that the substring "abc" satisfies this condition, but adding 'a' again violates it because the difference between 'c' and 'a' is more than 2.

  2. The next segment is "ab" since 'a' and 'b' have a difference of 1, but adding 'b' again violates the condition with respect to the first 'a'.

  3. Next, we have the segment "bcc".

  4. We are left with no more segments since we have reached the end of word.

Now, let's apply the function f(s) to count the complete substrings within each segment while ensuring every character appears exactly k times:

Segment "abc":

  • There are no complete substrings here since no character appears exactly twice.

Segment "ab":

  • No character appears exactly twice, so again no complete substrings.

Segment "bcc":

  • We could choose the window size l = 2 * k because we only have one type of character ('c') that can appear twice to make a complete substring.
  • Sliding the window across "bcc", we find one complete substring which is "cc".

After accounting for all segments, we have counted one complete substring which is "cc" from the segment "bcc". So, the result for this example would be 1.

This example illustrates the core technique of segmenting the input string and then using a sliding window with a character frequency count to identify complete substrings within those segments.

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def countCompleteSubstrings(self, word: str, k: int) -> int:
5        # Helper function to count completed substrings in a given segment
6        def count_completed_substr(segment: str) -> int:
7            segment_length = len(segment)
8            completed_substr_count = 0
9          
10            # Start iterating with multiples of k
11            for num_unique_chars in range(1, 27):  # There are 26 letters in the English alphabet
12                length_of_group = num_unique_chars * k
13              
14                # If length is greater than segment length, stop the loop
15                if length_of_group > segment_length:
16                    break
17              
18                # Initialize the counter for the first window of size 'length_of_group'
19                char_counter = Counter(segment[:length_of_group])
20                # Counter to track the frequencies of character counts in the current window
21                frequency_counter = Counter(char_counter.values())
22
23                # Check if there are exactly num_unique_chars characters with k occurrences
24                completed_substr_count += (frequency_counter[k] == num_unique_chars)
25
26                for end_index in range(length_of_group, segment_length):
27                    # Slide the window: add one char at the end and remove one at the start
28                    frequency_counter[char_counter[segment[end_index]]] -= 1
29                    char_counter[segment[end_index]] += 1
30                    frequency_counter[char_counter[segment[end_index]]] += 1
31                  
32                    start_index = end_index - length_of_group
33                    frequency_counter[char_counter[segment[start_index]]] -= 1
34                    char_counter[segment[start_index]] -= 1
35                    frequency_counter[char_counter[segment[start_index]]] += 1
36
37                    # Again, check if the new window is a completed substring
38                    completed_substr_count += (frequency_counter[k] == num_unique_chars)
39            return completed_substr_count
40      
41        # Initialize counters
42        total_completed_substr_count = 0
43        current_index = 0
44        word_length = len(word)
45      
46        # Process the entire input string
47        while current_index < word_length:
48            next_index = current_index + 1
49          
50            # Find the end of a 'segment' where consecutive characters differ by at most 2
51            while next_index < word_length and abs(ord(word[next_index]) - ord(word[next_index - 1])) <= 2:
52                next_index += 1
53          
54            # Count completed substrings in the current 'segment'
55            total_completed_substr_count += count_completed_substr(word[current_index:next_index])
56            current_index = next_index
57      
58        # Return the total count of completed substrings
59        return total_completed_substr_count
60
1class Solution {
2    // Main function to count the complete substrings according to given conditions
3    public int countCompleteSubstrings(String word, int k) {
4        int wordLength = word.length();
5        int totalCount = 0;
6
7        // Iterate over the word to find all the substrings where consecutive characters have absolute difference <= 2
8        for (int i = 0; i < wordLength;) {
9            int nextIndex = i + 1;
10            // Find the end of the current substring based on the character difference condition
11            while (nextIndex < wordLength && Math.abs(word.charAt(nextIndex) - word.charAt(nextIndex - 1)) <= 2) {
12                ++nextIndex;
13            }
14
15            // Count complete substrings within the currently identified substring
16            totalCount += countValidSubstrings(word.substring(i, nextIndex), k);
17            i = nextIndex; // Move to the next position after the current substring
18        }
19        return totalCount;
20    }
21
22    // Helper function to count substrings where each character appears exactly 'k' times
23    private int countValidSubstrings(String s, int k) {
24        int substringLength = s.length();
25        int count = 0;
26
27        // Iterate for each possible length of alphabet characters, constrained by 'k' and length of 's'
28        for (int i = 1; i <= 26 && i * k <= substringLength; ++i) {
29            int segmentLength = i * k; // The length of the substring we are looking for
30            int[] charCounts = new int[26]; // Array to count occurrences of each letter
31
32            // Count occurrences of each letter in the first segment of length 'segmentLength'
33            for (int j = 0; j < segmentLength; ++j) {
34                ++charCounts[s.charAt(j) - 'a'];
35            }
36
37            // Store the frequency of frequencies
38            Map<Integer, Integer> frequencyMap = new HashMap<>();
39            for (int countValue : charCounts) {
40                if (countValue > 0) {
41                    frequencyMap.merge(countValue, 1, Integer::sum);
42                }
43            }
44
45            // If the map contains exactly 'i' entries of 'k', it's a valid substring
46            if (frequencyMap.getOrDefault(k, 0) == i) {
47                ++count;
48            }
49
50            // Slide the window of size 'segmentLength' and update counts and frequencies
51            for (int j = segmentLength; j < substringLength; ++j) {
52                int newCharIndex = s.charAt(j) - 'a';
53                int oldCharIndex = s.charAt(j - segmentLength) - 'a';
54
55                // Update frequency map for the character entering the sliding window
56                frequencyMap.merge(charCounts[newCharIndex], -1, Integer::sum);
57                ++charCounts[newCharIndex];
58                frequencyMap.merge(charCounts[newCharIndex], 1, Integer::sum);
59
60                // Update frequency map for the character leaving the sliding window
61                frequencyMap.merge(charCounts[oldCharIndex], -1, Integer::sum);
62                --charCounts[oldCharIndex];
63                frequencyMap.merge(charCounts[oldCharIndex], 1, Integer::sum);
64
65                // If the map condition holds again, increase the count
66                if (frequencyMap.getOrDefault(k, 0) == i) {
67                    ++count;
68                }
69            }
70        }
71        return count;
72    }
73}
74
1#include <string>
2#include <unordered_map>
3#include <vector>
4using namespace std;
5
6class Solution {
7public:
8    // Function to count the number of complete substrings with the given constraints
9    int countCompleteSubstrings(string word, int k) {
10        // Get the length of the input string
11        int length = word.length();
12        // Initialize the answer
13        int completeSubstrings = 0;
14        // Lambda function to process each segmented string
15        auto countSegmentedSubstrings = [&](string segment) {
16            // Get the length of the segment
17            int segmentLength = segment.length();
18            int count = 0;
19            // Iterate through possible character counts up to 26 (number of letters in alphabet)
20            for (int i = 1; i <= 26 && i * k <= segmentLength; ++i) {
21                int requiredLength = i * k;
22                int charCounts[26] = {0}; // Initialize array to keep character counts
23                // Count occurrences of each character in the initial window of size requiredLength
24                for (int j = 0; j < requiredLength; ++j) {
25                    ++charCounts[segment[j] - 'a'];
26                }
27                unordered_map<int, int> frequencyMap; // Map to store frequencies of character counts
28                // Count frequencies of the character counts
29                for (int count : charCounts) {
30                    if (count > 0) {
31                        frequencyMap[count]++;
32                    }
33                }
34                // Check if there are exactly 'i' character counts equal to 'k'
35                if (frequencyMap[k] == i) {
36                    ++count;
37                }
38                // Slide the window one character at a time, updating counts and checking for complete substrings
39                for (int j = requiredLength; j < segmentLength; ++j) {
40                    int newCharIndex = segment[j] - 'a';
41                    int oldCharIndex = segment[j - requiredLength] - 'a';
42                  
43                    // Update the frequency map for the new character
44                    frequencyMap[charCounts[newCharIndex]]--;
45                    charCounts[newCharIndex]++;
46                    frequencyMap[charCounts[newCharIndex]]++;
47                  
48                    // Update the frequency map for the old character
49                    frequencyMap[charCounts[oldCharIndex]]--;
50                    charCounts[oldCharIndex]--;
51                    frequencyMap[charCounts[oldCharIndex]]++;
52                  
53                    // Recheck for complete substrings in the new window
54                    if (frequencyMap[k] == i) {
55                        ++count;
56                    }
57                }
58            }
59            return count;
60        };
61        // Loop through the input string and apply the lambda function to each segment
62        for (int i = 0; i < length;) {
63            int j = i + 1;
64            // Find segment boundaries where the absolute difference is greater than 2
65            while (j < length && abs(word[j] - word[j - 1]) <= 2) {
66                ++j;
67            }
68            // Count complete substrings for the segment and add it to the total count
69            completeSubstrings += countSegmentedSubstrings(word.substr(i, j - i));
70            // Move to the next segment
71            i = j;
72        }
73        // Return the total count of complete substrings
74        return completeSubstrings;
75    }
76};
77
1function countCompleteSubstrings(word: string, k: number): number {
2    // Helper function to calculate count of valid substrings in a given string `s`
3    const calculateValidSubstrings = (s: string): number => {
4        const lengthOfS = s.length;
5        let totalCount = 0;
6
7        for (let i = 1; i <= 26 && i * k <= lengthOfS; i++) {
8            const currentLength = i * k;
9            const charCount: number[] = new Array(26).fill(0);
10
11            // Initialize character counts for the first `currentLength` characters
12            for (let j = 0; j < currentLength; j++) {
13                charCount[s.charCodeAt(j) - 'a'.charCodeAt(0)]++;
14            }
15
16            // Create a frequency map of the character counts
17            const countFrequency: { [key: number]: number } = {};
18            for (const count of charCount) {
19                if (count > 0) {
20                    countFrequency[count] = (countFrequency[count] || 0) + 1;
21                }
22            }
23
24            // Check if the substring matches the conditions
25            if (countFrequency[k] === i) {
26                totalCount++;
27            }
28
29            // Slide the window of `currentLength` across the string and update counts
30            for (let j = currentLength; j < lengthOfS; j++) {
31                const incomingCharIndex = s.charCodeAt(j) - 'a'.charCodeAt(0);
32                const outgoingCharIndex = s.charCodeAt(j - currentLength) - 'a'.charCodeAt(0);
33
34                countFrequency[charCount[incomingCharIndex]]--;
35                charCount[incomingCharIndex]++;
36                countFrequency[charCount[incomingCharIndex]] = (countFrequency[charCount[incomingCharIndex]] || 0) + 1;
37
38                countFrequency[charCount[outgoingCharIndex]]--;
39                charCount[outgoingCharIndex]--;
40                countFrequency[charCount[outgoingCharIndex]] = (countFrequency[charCount[outgoingCharIndex]] || 0) + 1;
41
42                // Check if the updated substring matches the conditions
43                if (countFrequency[k] === i) {
44                    totalCount++;
45                }
46            }
47        }
48
49        return totalCount;
50    };
51
52    // Main loop to iterate over the word and segment it into non-decreasing alphabetical sequences
53    let wordLength = word.length;
54    let totalSubstringsCount = 0;
55    for (let i = 0; i < wordLength; ) {
56        let j = i + 1;
57        // Find the end of the current alphabetical sequence
58        while (j < wordLength && Math.abs(word.charCodeAt(j) - word.charCodeAt(j - 1)) <= 2) {
59            j++;
60        }
61        // Calculate valid substrings within the current alphabetical sequence
62        totalSubstringsCount += calculateValidSubstrings(word.substring(i, j));
63        i = j; // Move to the next sequence
64    }
65    return totalSubstringsCount;
66}
67
Not Sure What to Study? Take the 2-min Quiz:

How does merge sort divide the problem into subproblems?

Time and Space Complexity

Time Complexity

The time complexity of the given code can be broken down into the operations it performs:

  1. Outer Loop: The outermost while loop runs at most n times where n is the length of the word since i increases by at least 1 in every iteration.

  2. Inner Loop: The nested while loop determines consecutive characters with an absolute difference of their ASCII values not greater than 2. In the worst case, this could run for the entire length of the string, but cumulatively across all iterations, it also operates within O(n) time, as each character gets checked exactly once throughout the entire execution of both while loops.

  3. Function f(s: str): Within this function:

    • The outer for loop runs up to 26 times representing each character in the English alphabet (1 ≤ i ≤ 26).
    • Inside this for loop, another nested loop runs sliding over the string with a window of size l = i * k. This loop iterates approximately m - l times, where m is the length of the substring here. Given that l can be at most 26 * k and that f(s) can be called several times, the total contribution of these nested loops can be approximated to O(n) since each character is considered once for every character in the alphabet—resulting in O(n * |\Sigma|) time complexity overall.

Conclusively, the combination of the loops and function calls give us a time complexity of O(n * |\Sigma|), where |\Sigma| = 26 for lowercase English letters.

Space Complexity

The space complexity is determined by the storage requirements:

  1. Counter Objects: There are two counter objects (cnt and freq) that are used to store the frequency of each character and the frequency of those frequencies. The cnt object at most will store |\Sigma| entries (one for each unique character), and the freq object will store at most k entries (since k is the maximum frequency we're interested in counting).

  2. Constant Space: Other than the counters, the function utilizes a few variables for logic and arithmetic operations which take up constant space.

Thus, the space complexity is O(|\Sigma| + k). However, since k is at most 26 (as i ranges from 1 to 26), it is valid to simplify this expression to O(|\Sigma|).

In the context of the given problem where |\Sigma| represents the size of the set of lowercase English letters, |\Sigma| = 26. Hence, the final space complexity is O(26) which simplifies to O(1)—constant space complexity.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which one best describes the time complexity of the following code?

1int factorial(int n) {
2  if (n < 0) {
3    return -1;
4  } else if (n == 0) {
5    return 1;
6  } else {
7    return n * factorial(n - 1);
8  }
9}

Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫