1759. Count Number of Homogenous Substrings
Problem Description
The task is to find the number of homogenous substrings in a given string s
. A string is considered homogenous if all characters in the string are identical. A substring refers to a consecutive sequence of characters within a string. The answer should be returned modulo 10^9 + 7
to prevent overflow issues due to potentially large numbers.
For example, if s = "aaa"
, then the homogenous substrings are "a"
, "a"
, "a"
, "aa"
, "aa"
, and "aaa"
, which totals up to 6.
Intuition
To solve this problem efficiently, we can utilize a two-pointer technique.
- We iterate over the string using two pointers,
i
andj
. - The first pointer
i
marks the start of a potential homogenous substring, whilej
scans ahead to find where this substring ends (i.e., where a character different froms[i]
is encountered). - For each character position
i
, we find the longest stretch of the same character by incrementally increasingj
as long ass[j]
is equal tos[i]
. - The length of the homogenous substring starting at
i
is(j - i)
. For each such substring, we calculate the number of homogenous substrings that can be made, which is given by the formula(1 + cnt) * cnt / 2
, wherecnt
is the length of the homogenous substring. - Why this formula? Consider a homogenous string of length
n
. We can maken
single-character substrings,n-1
substrings of length 2,n-2
of length 3, and so on, down to 1 substring of lengthn
. This forms an arithmetic sequence that sums ton*(n+1)/2
. - The answer is incremented by this count for each homogenous stretch we find.
- We use the modulo operation to keep our calculations within the prescribed limit to avoid integer overflow.
- The first pointer
i
is then set toj
to start searching for the next homogenous substring.
This approach optimizes the process by minimizing the number of times we traverse the string, leading to an efficient solution.
Learn more about Math patterns.
Solution Approach
The implementation of the solution uses a two-pointer technique along with basic arithmetic calculations to find the number of homogenous substrings. Here is the walkthrough of the code:
- The function
countHomogenous
starts by initializing the variablemod
to10**9 + 7
for modulo operations to prevent overflow. - Two variables are declared,
i
being the start pointer (initialized at index 0) andn
being the length of the input strings
. - We also initialize a variable
ans
to store the cumulative number of homogenous substrings found.
The solution enters a loop that continues until the start pointer i
has reached the end of the string (i < n
):
- A second pointer
j
is set to start at the same position asi
. This will be used to find the end of the current homogenous substring. - A while loop is used to move
j
forward as long ass[j]
is the same ass[i]
. Whens[j]
is different froms[i]
, it means we have found the end of the homogenous substring. - After the while loop, we now have a substring from index
i
toj - 1
that is homogenous. The length of this substring iscnt = j - i
. - To find the number of homogenous substrings within this section, we use the arithmetic series sum formula
(1 + cnt) * cnt / 2
, wherecnt
is the length of the homogenous substring. - The result is added to
ans
, which keeps the running total of homogenous substrings. Every time a new count is added, we perform a modulo operation to make sureans
doesn't exceed10**9 + 7
. - Finally, we move the start pointer
i
to the position wherej
ended, as everything beforej
is already part of a homogenous substring we've counted.
The use of the two-pointer technique efficiently reduces the time complexity since each character in the string is checked only once. By only considering stretches of identical characters and using the arithmetic series sum formula, we avoid having to individually count each possible substring. This is what makes the algorithm efficient.
The function ends by returning the total count ans
as the resulting number of homogenous substrings.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example. Assume the input string s = "abbba"
.
- The
countHomogenous
function begins by initializing the variables (mod = 10^9 + 7
,i = 0
,n = 5
, andans = 0
). - The main loop starts with
i < n
, which is true at the start (i = 0
).
Starting with i = 0
:
-
We set the second pointer
j = i = 0
. We are now looking for a homogenous substring starting at 'a'. -
As we increment
j
, we realize thats[j] == s[i]
only forj = 0
. As soon asj
becomes 1,s[j]
becomes 'b', which is different froms[i]
('a'). So the while loop stops withj = 1
. -
We found a homogenous substring 'a' from index
0
to0
. Thuscnt = j - i = 1 - 0 = 1
. -
The number of homogenous substrings for this
cnt
is(1 + 1) * 1 / 2 = 1
. -
We add this to
ans
,ans = 0 + 1 = 1
. -
We then set
i = j
, soi
is now1
and we start looking for a new homogenous substring.
Next with i = 1
:
-
Set
j
to1
. The character at this index is 'b'. -
We increment
j
whiles[j]
is the same ass[i]
. This gives usj = 4
because indices 1, 2, and 3 are all 'b'. -
We found a homogenous substring 'bbb' from index
1
to3
, socnt = j - i = 4 - 1 = 3
. -
Using the formula for
cnt = 3
, the number of homogenous substrings is(1 + 3) * 3 / 2 = 6
. -
This is added to
ans
, which becomesans = 1 + 6 = 7
. -
The
i
is moved to the index4
.
The last character is evaluated but as there are no repeating characters beyond this point, j
never increments. Therefore, this results in a cnt
of 1
, which provides a single homogenous substring 'a'.
The answer, or total number of homogenous substrings, is summed up as ans = 7 + 1 = 8
, which is the result returned by the function.
This walkthrough simplifies how the solution makes efficient use of the two-pointer technique to calculate the number of homogenous substrings without having to count each one individually.
Solution Implementation
1class Solution:
2 def countHomogenous(self, s: str) -> int:
3 # Define the modulus as mentioned in the problem statement.
4 modulo_factor = 10**9 + 7
5
6 # Initialize starting index for iterating the string and the length of the string.
7 current_index, string_length = 0, len(s)
8
9 # Initialize the answer to count homogenous substrings.
10 homogenous_count = 0
11
12 # Iterate over the string to find homogenous substrings.
13 while current_index < string_length:
14 # Find the end of the current homogenous substring.
15 next_index = current_index
16 while next_index < string_length and s[next_index] == s[current_index]:
17 next_index += 1
18
19 # Calculate the length of the current homogenous substring.
20 substring_length = next_index - current_index
21
22 # Count the number of homogenous substrings which can be formed from
23 # the current homogenous substring using the formula for the sum of
24 # the first n natural numbers: n * (n + 1) / 2.
25 homogenous_count += (1 + substring_length) * substring_length // 2
26
27 # Use modulo operation to avoid large numbers as per problem constraints.
28 homogenous_count %= modulo_factor
29
30 # Move to the beginning of the next potential homogenous substring.
31 current_index = next_index
32
33 # Return the final count of homogenous substrings.
34 return homogenous_count
35
1class Solution {
2 private static final int MOD = (int) 1e9 + 7;
3
4 public int countHomogenous(String s) {
5 // Length of the input string
6 int length = s.length();
7 // Variable to hold the total count of homogenous substrings
8 long totalHomogenousSubstrings = 0;
9
10 // Loop through the string characters
11 for (int startIndex = 0, endIndex = 0; startIndex < length; startIndex = endIndex) {
12 // Set the end index to the current start index
13 endIndex = startIndex;
14 // Extend the end index while the end character is the same as the start character
15 while (endIndex < length && s.charAt(endIndex) == s.charAt(startIndex)) {
16 endIndex++;
17 }
18 // Calculate the length of the homogeneous substring
19 int homogeneousLength = endIndex - startIndex;
20 // Use the formula for sum of first n natural numbers to calculate the number of substrings
21 totalHomogenousSubstrings += (long) (1 + homogeneousLength) * homogeneousLength / 2;
22 // Apply modulo operation to prevent overflow
23 totalHomogenousSubstrings %= MOD;
24 }
25 // Cast the result to int before returning, since the final output must be an integer
26 return (int) totalHomogenousSubstrings;
27 }
28}
29
1class Solution {
2public:
3 static constexpr int MOD = 1e9 + 7; // Define the modulus constant for preventing integer overflow.
4
5 // Method to count the number of homogenous substrings in the given string s.
6 int countHomogenous(string s) {
7 int length = s.size(); // The length of the input string.
8 long long answer = 0; // To store the final answer, initialized to 0.
9
10 // Loop through the string to count all homogenous substrings.
11 for (int start = 0, end = 0; start < length; start = end) {
12 // Find the end of the current homogenous substring.
13 end = start;
14 while (end < length && s[end] == s[start]) {
15 ++end;
16 }
17
18 // Compute the count of characters in the current homogenous substring.
19 int count = end - start;
20
21 // Calculate the number of possible homogenous substrings,
22 // which is the sum of the first count natural numbers: count * (count + 1) / 2.
23 // We use long long to avoid integer overflow during the calculation.
24 answer += static_cast<long long>(count + 1) * count / 2;
25 answer %= MOD; // Apply the modulus to keep the answer within the integer limits.
26 }
27
28 return static_cast<int>(answer); // Return the answer as an integer.
29 }
30};
31
1// Function to count the number of homogenous substrings in a given string.
2// A homogenous substring is one that consists of a single unique character.
3// For example, in the string "aa", there would be three homogenous substrings: "a", "a", and "aa".
4function countHomogenous(s: string): number {
5 const MODULO: number = 1e9 + 7; // Define a constant for modulo to avoid large numbers
6 const n: number = s.length; // Length of the input string
7 let count: number = 0; // Initialize the count of homogenous substrings
8
9 // Use two pointers to iterate through the string
10 for (let startIndex = 0, currentIndex = 0; currentIndex < n; currentIndex++) {
11 // If the current character is different from the starting character,
12 // update the starting index to the current index
13 if (s[startIndex] !== s[currentIndex]) {
14 startIndex = currentIndex;
15 }
16 // Calculate the number of homogenous substrings found so far based on the current sequence
17 // Add the number of new homogenous substrings from startIndex to currentIndex
18 count = (count + (currentIndex - startIndex + 1)) % MODULO;
19 }
20 // Return the total count of homogenous substrings modulo the defined constant
21 return count;
22}
23
Time and Space Complexity
The time complexity of the given code is O(n)
, where n
is the length of the string s
. This is because each character in the string is checked exactly once to form homogenous substrings (characters that are the same and contiguous). The inner while
loop runs only once for each homogenous substring, and since it only moves j
to the end of a homogenous substring, the iterations of the inner loop throughout the entire run of the algorithm sum up to O(n)
.
The space complexity of the code is O(1)
. This is because the algorithm uses a constant number of additional variables (mod
, i
, n
, ans
, j
, cnt
) which do not scale with the input size - they use a constant amount of space regardless of the length of the string s
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following is a good use case for backtracking?
Recommended Readings
Math for Technical Interviews How much math do I need to know for technical interviews The short answer is about high school level math Computer science is often associated with math and some universities even place their computer science department under the math faculty However the reality is that you
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!