2083. Substrings That Begin and End With the Same Letter
Problem Description
The problem presents a situation where we are given a string s
, which contains only lowercase English letters. Our goal is to determine the total number of substrings within this string that start and end with the same character. It's essential to remember that a substring is simply a sequence of characters that are adjacent in the string and that the string in question uses 0-based indexing.
For example, if given the string s = "ababa"
, some valid substrings that begin and end with the same character would be ["a", "aba", "ababa", "b", "bab", "a"]
. In this challenge, we need to enumerate all possible substrings that meet this criterion and return their count.
Intuition
The solution strategy revolves around the observation that for each occurrence of a character in the string, it can potentially form a substring with each of its previous appearances. This is because any sequence of characters between the two same characters (including no characters at all) will still result in a substring that starts and ends with that character.
For example, if we have the string s = "aa"
, then there are three such substrings: "a", "a", "aa"
. For the first 'a', there is 1 substring which is itself. For the second 'a', there are 2 substrings: itself and the substring "aa" that includes both the first and the second 'a'.
To implement this, we utilize a dictionary to keep track of the count of each character as we iterate through the string. Whenever we encounter a character, we increment its count in the dictionary. The current count of the character gives us the number of substrings that end with this character and start with a previous instance of the same character. We add this count to our answer for each character we encounter. The sum of all these counts will give us the total number of desired substrings.
The Counter
class from the collections
module in Python provides a convenient way to count occurrences of each character. The variable cnt
is our counter, and ans
is used to accumulate the count of valid substrings. We iterate through each character in the string, update its count in cnt
, and add the updated count to ans
, yielding the total number of substrings that start and end with the same character by the time we've processed the entire string.
Learn more about Math and Prefix Sum patterns.
Solution Approach
The implementation uses a hash table to keep track of how many times each character has appeared in the string so far. The hash table is implemented using the Counter
class from Python's collections
module, which efficiently handles the counting of hashable objects.
Here's a step-by-step breakdown of the implementation:
- A
Counter
object namedcnt
is created to keep track of the occurrences of each character in the string. - An integer variable
ans
is initialized to0
. It will serve as an accumulator for the total count of substrings satisfying the given condition. - The code enters a loop that iterates over each character
c
in the strings
. - For each character, the loop increments its count in
cnt
. This is achieved by the expressioncnt[c] += 1
. In theCounter
, if the characterc
doesn't exist yet, it would be initialized to0
before being incremented. So, after this step,cnt[c]
contains the number of timesc
has appeared so far. - The current value of
cnt[c]
is then added toans
. This is based on the observation that if we have seen a charactern
times so far, there aren
substrings that can end with this instance of the character, each starting with one of the previously seen instances of this character (including a substring just consisting of this one character). - After the loop ends,
ans
contains the total number of qualified substrings, and the function returns this value.
This approach is effective for the problem at hand because it leverages the fact that the number of substrings ending with a certain character equates to the instances of that character seen so far. This allows the algorithm to count the requisite substrings in a single linear pass over the string, thus having a time complexity of O(n), where n is the length of the string.
Here is the reference solution approach implemented in Python:
class Solution:
def numberOfSubstrings(self, s: str) -> int:
cnt = Counter()
ans = 0
for c in s:
cnt[c] += 1
ans += cnt[c]
return ans
In this code, for c in s
iterates over the characters in the string. The Counter
object cnt
keeps track of the occurrence count, and the accumulator ans
aggregates the number of substrings. After visiting all characters, the total count is held in ans
, which is returned as the final result.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's use a small example to illustrate the solution approach described above. Suppose we have the string s = "abcab"
.
- We start with an empty
Counter
objectcnt
and an accumulatorans
, initialized to0
. - We iterate through each character in the string, in order.
- Upon encountering the first character
a
, we update the countercnt['a'] += 1
. Now,cnt = Counter({'a': 1})
, and we incrementans
bycnt['a']
, soans = 1
. - Next, we see the character
b
. We updatecnt['b'] += 1
, makingcnt = Counter({'a': 1, 'b': 1})
, and appendcnt['b']
toans
, resulting inans = 2
. - The third character is
c
. We updatecnt['c'] += 1
, giving uscnt = Counter({'a': 1, 'b': 1, 'c': 1})
, and incrementans
bycnt['c']
, leading toans = 3
. - The fourth character is
a
again. This time,cnt['a'] += 1
makescnt = Counter({'a': 2, 'b': 1, 'c': 1})
becausea
has appeared twice. We addcnt['a']
toans
, and nowans = 5
(sincecnt['a']
is now2
, we add2
). - Finally, we see another
b
. We incrementcnt['b']
, socnt = Counter({'a': 2, 'b': 2, 'c': 1})
, and thenans += cnt['b']
makesans = 7
.
At the end of the loop, we have considered every character and updated our accumulator ans
based on the occurrences of each character. The final value of ans
is 7
, which is the number of substrings that start and end with the same character in the string s = "abcab"
.
These substrings are specifically "a"
, "b"
, "c"
, "abca"
, "bcab"
, "aba"
, and "bab"
. Notice how substrings that start and end with the same character and contain other characters in the middle are also included, as illustrated by "abca"
and "bcab"
.
Solution Implementation
1from collections import Counter
2
3class Solution:
4 def numberOfSubstrings(self, string: str) -> int:
5 # Initialize an empty Counter to keep track of character frequencies.
6 char_count = Counter()
7
8 # Initialize the answer variable to store the result.
9 total_substrings = 0
10
11 # Iterate over each character in the string.
12 for char in string:
13 # Increment the count of the current character.
14 char_count[char] += 1
15
16 # The number of substrings ending with the current character is equal
17 # to its current count (since any substring ending before could be extended by this character).
18 total_substrings += char_count[char]
19
20 # Return the total number of substrings.
21 return total_substrings
22
1class Solution {
2 public long numberOfSubstrings(String s) {
3 // Create an array to store the count of each letter.
4 int[] letterCount = new int[26];
5
6 // Initialize a variable to store the total number of substrings.
7 long totalSubstrings = 0;
8
9 // Iterate over each character in the string.
10 for (int i = 0; i < s.length(); ++i) {
11 // Calculate the index of the current character 'a' being 0, 'b' being 1, ... 'z' being 25.
12 int index = s.charAt(i) - 'a';
13
14 // Increment the count of the current letter in the letter count array.
15 ++letterCount[index];
16
17 // Each time a new character is processed, increment totalSubstrings by the count of that character.
18 // The count represents the number of substrings ending with that character.
19 totalSubstrings += letterCount[index];
20 }
21
22 // Return the total number of substrings.
23 return totalSubstrings;
24 }
25}
26
1class Solution {
2public:
3 long long numberOfSubstrings(string s) {
4 int count[26] = {}; // Initialize array to store the count of each letter
5 long long answer = 0; // Initialize the answer variable to store the total number of substrings
6
7 // Loop through each character in the string
8 for (char& c : s) {
9 // Increment the count of the current letter and add the new count to the answer
10 // The new count represents the number of substrings ending with the current letter
11 // that have not been counted before
12 answer += ++count[c - 'a'];
13 }
14
15 return answer; // Return the total number of substrings
16 }
17};
18
1// Function to count the number of substrings
2function numberOfSubstrings(s: string): number {
3 // Initialize an array to store the count of each letter ('a' to 'z')
4 let count: number[] = new Array(26).fill(0);
5 // Initialize the answer variable to store the total number of substrings
6 let answer: number = 0;
7
8 // Loop through each character in the string
9 for (const c of s) {
10 // Increment the count of the current letter (using its ASCII value to map to an index)
11 // The new count represents the number of substrings ending with the current letter
12 // that have not been counted before
13 answer += ++count[c.charCodeAt(0) - 'a'.charCodeAt(0)];
14 }
15
16 // Return the total number of substrings
17 return answer;
18}
19
20// Note: Since this function is defined globally, it can be called directly with a string parameter.
21
Time and Space Complexity
The given code calculates the number of substrings that can be formed in a string with each character being counted once it is part of a substring as it's introduced.
Time Complexity:
Let's analyze the time complexity of the provided function:
- The function iterates through each character in the string
s
which has a length ofn
. - In each iteration, it performs an update on the
Counter()
dictionary and an addition operation to theans
variable.
The Counter()
update operation and addition operation are both O(1)
operations.
Hence, the loop runs n
times with constant-time operations within it, resulting in a time complexity of O(n)
.
Space Complexity:
Now, let's analyze the space complexity:
- The
Counter()
data structure is used to keep a count of each character. In the worst case, this would store as many keys as there are unique characters in the strings
. - Assuming the input string
s
could include all possible characters in the charset used (let's say ASCII for simplicity), there is a constant number of possible characters, which means theCounter
would have at mostC
keys, whereC
is the size of the charset.
Therefore, since the size of the charset is fixed and does not grow with the input size n
, the space required by the Counter
is O(1)
.
As a result, the space complexity of the code is O(1)
if we consider the size of the charset to be constant and not dependent on the size of the input string. If considering unicode or a variable character set that indeed scales with n
, it would be O(n)
for space complexity.
Learn more about how to find time and space complexity quickly using problem constraints.
What is the best way of checking if an element exists in an unsorted array once in terms of time complexity? Select the best that applies.
Recommended Readings
Math for Technical Interviews How much math do I need to know for technical interviews The short answer is about high school level math Computer science is often associated with math and some universities even place their computer science department under the math faculty However the reality is that you
Prefix Sum The prefix sum is an incredibly powerful and straightforward technique Its primary goal is to allow for constant time range sum queries on an array What is Prefix Sum The prefix sum of an array at index i is the sum of all numbers from index 0 to i By
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Want a Structured Path to Master System Design Too? Don’t Miss This!