1180. Count Substrings with Only One Distinct Letter


Problem Description

The problem requires counting the substrings within a given string s that meet a specific criterion: each substring must contain exactly one distinct letter. In other words, any given substring is a sequence of the same character, with no other different characters inside. To illustrate, if s is "aaa", we should count the individual substrings "a", "a", "a", the pairs "aa", "aa", and the entire string "aaa", which gives us a total of 6 such substrings.

Intuition

To solve this problem, we can observe patterns in the strings and how substrings forming consecutive identical characters contribute to the total count. Consider a substring with n identical characters; it will have 1 + 2 + 3 + ... + n distinct substrings with one distinct letter, which is the sum of the first n natural numbers. We can use the formula for this sum, n * (n + 1) / 2, to find the number of valid substrings in a piece of the sequence without the need for an exhaustive search for all possible substrings.

With this knowledge, we can look for these consecutive substrings of identical characters. We iterate through the string with two pointers (or indices) i and j. We use i to start at the beginning of a substring with identical characters and j to find its end. For each substring we find, we calculate its contribution to the total count using our sum formula and add it to ans, which keeps track of the number of valid substrings. We repeat this process until we've considered every character in the string, at which point ans will contain the final count of substrings with one distinct letter.

Solution Approach

The solution utilizes a simple iterative approach with two pointers to traverse the string efficiently. The key idea is to find consecutive groups of the same character and then calculate the number of substrings that can be formed from these groups using a mathematical formula. This eliminates the need for nested loops to consider every possible substring explicitly. We'll break down the steps of the implementation:

  1. Initialize two pointers, i and j. i will scan through the string, and j will be used to find the end of a consecutive group of identical characters starting at i.
  2. Initialize ans to zero, which will be used to accumulate the total number of substrings with one distinct letter.
  3. Use a while loop to iterate over the string with the condition i < n, ensuring we don't go past the string's end.
  4. Within the loop, set j to i to mark the beginning of the new possible consecutive group.
  5. Start a nested while loop to move j forward as long as j < n and s[j] is equal to s[i]. This loop determines the length of the consecutive group.
  6. With the length of the consecutive group determined (j - i), calculate the number of substrings using the formula (1 + j - i) * (j - i) // 2. This formula represents the sum of the first j - i natural numbers.
  7. Add this calculated number to ans, accumulating the count of valid substrings.
  8. Update i to j to move the first pointer to the next group of identical characters.
  9. Return ans once the entire string has been scanned.

The algorithm makes a single pass over the string, achieving linear time complexity, O(n), where n is the length of the input string. The space complexity is O(1) as it only uses a constant amount of extra space for the pointers and counter. This efficiency stems from the combination of the two pointers technique and the application of the arithmetic series sum formula.

Example Walkthrough

Let's apply the solution approach using the string s = "aabbbc" as an example.

Following the steps:

  1. Initialize two pointers, i and j. We start by setting both i = 0 and j = 0.

  2. Initialize ans to zero. This will hold the count of valid substrings. So, ans = 0.

  3. Use a while loop to iterate over the string. Since i < n where n = 6 (length of "aabbbc"), the loop begins.

  4. Within the loop, set j = i. This acknowledges the start of a new consecutive group (we expect j to find how long it goes).

  5. Start a nested while loop with j < n and s[j] == s[i].

    • For i = 0, j moves from 0 to 1 as s[0] (which is 'a') is the same as s[1]. j stops at 2 because s[2] is different ('b').
    • The group "aa" ends at index 1, thus with length j - i = 2.
  6. Calculate the number of substrings: (1 + j - i) * (j - i) // 2 = (1 + 2 - 0) * (2 - 0) // 2 = 3 * 2 // 2 = 3.

  7. Add to ans: ans = ans + 3 which becomes ans = 3.

  8. Update i to j: set i to 2 where j had stopped.

Now we continue the steps for the next group of identical characters 'b':

  1. While loop continues with i = 2. As before, we set j = i.

  2. Nested while loop identifies the group "bbb": The loop starts at j = 2 and increments j until it reaches 5, right before 'c', as 'b's are consecutive until index 4.

  3. Calculate for "bbb": (1 + j - i) * (j - i) // 2 = (1 + 5 - 2) * (5 - 2) // 2 = 4 * 3 // 2 = 6.

  4. Add to ans: ans = ans + 6 which now becomes ans = 9.

  5. Update i to the new position j = 5, skipping over the 'bbb' group.

The loop won't find any more groups longer than 1 character ('c' is a single character).

  1. Remaining single character will be the group "c":

    • Since j == n and no further groups are possible, we simply add 1 for the solo 'c'.
    • ans = ans + 1 which now becomes ans = 10.

The string has been fully scanned:

  1. Return ans. The final count of substrings where each contains exactly one distinct letter is 10.

We've just applied the solution approach using two-pointer technique to efficiently calculate the number of substrings within the string "aabbbc" that consist of the same character. The total count of valid substrings in this case is 10, which is the sum of substrings from groups "aa", "bbb", and "c".

Python Solution

1class Solution:
2    def countLetters(self, s: str) -> int:
3        # Initialize the length of the string for easy reference
4        string_length = len(s)
5      
6        # Initialize the index and answer variables
7        index, total_count = 0, 0
8      
9        # Iterate over the string, using 'index' as the starting pointer
10        while index < string_length:
11            # Set 'current_char' as the character at the current index
12            current_char = s[index]
13            # Initialize 'span_length' which will count the span of identical characters
14            span_length = 0
15          
16            # Count continuous span of identical characters starting from 'index'
17            while index < string_length and s[index] == current_char:
18                span_length += 1
19                index += 1
20          
21            # Calculate the total substrings for the span and add to 'total_count'
22            # The formula (span_length * (span_length + 1)) // 2 calculates the number of total
23            # possible substrings in a string containing identical characters.
24            total_count += (span_length * (span_length + 1)) // 2
25      
26        # Return the total count of all possible substrings
27        return total_count
28

Java Solution

1class Solution {
2
3    // This method counts all possible substrings which consist of the same character
4    public int countLetters(String s) {
5        int totalCount = 0; // Initialize total count of valid substrings
6      
7        // Iterate through the string starting from the first character
8        for (int currentIndex = 0, stringLength = s.length(); currentIndex < stringLength;) {
9            int nextIndex = currentIndex; // Index to find the end of a group of identical characters
10            // Continue while we have the same character as at currentIndex
11            while (nextIndex < stringLength && s.charAt(nextIndex) == s.charAt(currentIndex)) {
12                nextIndex++;
13            }
14
15            // Calculate the number of substrings that can be formed with the same character
16            // and add it to totalCount. It is based on the arithmetic series (n(n+1)/2).
17            int sameCharCount = nextIndex - currentIndex;
18            totalCount += (sameCharCount + 1) * sameCharCount / 2;
19
20            // Skip to the next character group
21            currentIndex = nextIndex;
22        }
23      
24        // Return the total count of substrings
25        return totalCount;
26    }
27}
28

C++ Solution

1class Solution {
2public:
3    // Function to count the total number of substrings that have all the same letters
4    int countLetters(string s) {
5        int totalCount = 0; // This will hold the final count of substrings
6        int n = s.size(); // Get the size of the string to iterate over
7
8        // Loop through the string
9        for (int i = 0; i < n;) {
10            // Start of the current substring with the same character
11            int start = i;
12
13            // Find the end of the current group of the same character
14            while (i < n && s[i] == s[start]) {
15                ++i;
16            }
17
18            // Length of the group of the same character
19            int length = i - start;
20
21            // Add the count of substrings for this group to totalCount
22            // Counts the number of substrings that can be formed with 'length' characters,
23            // which is the sum of the series 1 + 2 + ... + length.
24            totalCount += (1 + length) * length / 2;
25
26            // No need to set i = j, as i is already at the end of the current character group
27        }
28
29        return totalCount; // Return the total count of all such substrings
30    }
31};
32

Typescript Solution

1/**
2 * Counts the total number of contiguous occurrences of each letter
3 * in the string `s`. For each continuous group of the same character,
4 * it adds up a series, where the nth character contributes n to the count
5 * (e.g., for "aa" it adds 1 for the first 'a' and 2 for the second 'a', giving 3).
6 *
7 * @param {string} s - The string to analyze.
8 * @return {number} - The total count of contiguous letters.
9 */
10function countLetters(s: string): number {
11    let totalCount = 0; // Initialize total count
12    const lengthOfS = s.length; // Cache the length of the string
13
14    // Iterate over the string
15    for (let index = 0; index < lengthOfS; ) {
16        let currentIndex = index; // Index used to find contiguous characters
17        let contiguousCount = 0; // Reset counter for contiguous characters
18
19        // Count contiguous occurrences of the character
20        while (currentIndex < lengthOfS && s[currentIndex] === s[index]) {
21            ++currentIndex; // Move to the next character
22            totalCount += ++contiguousCount; // Increment and add to total count
23        }
24
25        // Continue from where the last contiguous sequence ended
26        index = currentIndex;
27    }
28
29    return totalCount; // Return the computed total count
30}
31

Time and Space Complexity

Time Complexity

The given Python code for countLetters has two nested loops. However, the inner loop does not start from the beginning every time, but rather from the index where the outer loop left off. The inner loop only runs when it finds characters in string s that are the same as the character at index i, and once it finds a different character, it breaks and sets i to j (the next start position). This means each character in the string is visited exactly once by the inner loop, and thus the total number of iterations across both loops is O(n), where n is the length of the string s.

Therefore, time complexity is O(n).

Space Complexity

As for the space complexity, the code uses a fixed number of integer variables (n, i, j, ans) that do not depend on the size of the input string s. No additional data structures are used that would grow with the input size.

Therefore, space complexity is O(1).


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫