2955. Number of Same-End Substrings 🔒
Problem Description
You are given a string s
(0-indexed) and a 2D array queries
. Each query queries[i] = [li, ri]
represents a substring of s
from index li
to index ri
(both inclusive).
A same-end substring is defined as a substring that has the same character at both its beginning and end positions. For example, "aba" is a same-end substring because it starts and ends with 'a', while "abc" is not.
For each query [li, ri]
, you need to count how many same-end substrings exist within the substring s[li..ri]
.
Example breakdown:
- If we have substring "aba" (from a query), the same-end substrings would be:
- Single characters: "a" (at position 0), "b" (at position 1), "a" (at position 2) - all single characters are same-end by definition
- Length 2: none that are same-end
- Length 3: "aba" (starts and ends with 'a')
The task is to return an array ans
where ans[i]
contains the count of all same-end substrings for the i-th query.
Key Points:
- A substring is a contiguous sequence of characters
- Single characters always count as same-end substrings
- For longer substrings, only those with matching first and last characters count
- Each query defines a specific portion of the original string to examine
Intuition
Let's think about what makes a substring "same-end". For any substring to have the same character at both ends, we need at least two occurrences of that character in our range (except for single characters which are always same-end).
Consider a simple example: if we have the character 'a' appearing 3 times in our query range, how many same-end substrings can we form using these 'a's? We can pick any two positions of 'a' - the first one will be the start and the second will be the end of our substring. The number of ways to choose 2 positions from 3 is C(3,2) = 3
.
This observation is key: for each character that appears x
times in our query range, it contributes C(x,2) = x*(x-1)/2
same-end substrings (choosing any 2 positions as start and end).
But wait, we also need to count single-character substrings! Each character in the range forms a same-end substring by itself. If our query range has length r-l+1
, that's exactly r-l+1
single-character same-end substrings.
So the total count for a query [l,r]
becomes:
- Count of single characters:
r - l + 1
- For each unique character
c
in the string, if it appearsx
times in range[l,r]
, addx*(x-1)/2
To efficiently calculate how many times each character appears in any given range, we can use prefix sums. For each character, we maintain a prefix sum array that tells us how many times that character has appeared up to each position. Then for any range [l,r]
, the count of character c
is simply prefix[c][r+1] - prefix[c][l]
.
This preprocessing allows us to answer each query in O(26) time (for lowercase English letters) rather than scanning through the entire range for each query.
Learn more about Prefix Sum patterns.
Solution Approach
The solution uses prefix sum preprocessing combined with combinatorial counting to efficiently answer each query.
Step 1: Build Prefix Sum Arrays
First, we identify all unique characters in the string s
and store them in a set cs
. For each unique character, we create a prefix sum array:
cnt = {c: [0] * (n + 1) for c in cs}
This creates a dictionary where each character maps to an array of size n+1
(to handle 1-indexing conveniently).
Step 2: Populate Prefix Sums
We iterate through the string and update the prefix sum arrays:
for i, a in enumerate(s, 1):
for c in cs:
cnt[c][i] = cnt[c][i - 1] # Copy previous count
cnt[a][i] += 1 # Increment count for current character
After this, cnt[c][i]
contains the number of times character c
appears in s[0..i-1]
.
Step 3: Process Each Query
For each query [l, r]
:
-
Initialize the count with single-character substrings:
t = r - l + 1
-
For each unique character
c
, calculate how many times it appears in the range[l, r]
:x = cnt[c][r + 1] - cnt[c][l]
This uses the prefix sum to get the count in O(1) time.
-
Add the number of same-end substrings formed by character
c
:t += x * (x - 1) // 2
This is the combinatorial formula
C(x, 2)
- choosing any 2 occurrences ofc
as endpoints. -
Append the total count
t
to the answer array.
Time Complexity:
- Preprocessing: O(n × |Σ|) where |Σ| is the number of unique characters
- Per query: O(|Σ|)
- Total: O(n × |Σ| + q × |Σ|) where q is the number of queries
Space Complexity: O(n × |Σ|) for storing the prefix sum arrays
The key insight is that by preprocessing the character counts, we can answer each query in constant time relative to the string length, making this approach efficient for multiple queries on the same string.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with s = "abcba"
and queries = [[0, 4], [1, 3]]
.
Step 1: Build Prefix Sum Arrays
First, identify unique characters: cs = {'a', 'b', 'c'}
Create prefix sum arrays:
cnt['a'] = [0, 0, 0, 0, 0, 0] # size n+1 = 6 cnt['b'] = [0, 0, 0, 0, 0, 0] cnt['c'] = [0, 0, 0, 0, 0, 0]
Step 2: Populate Prefix Sums
Process each character in s = "abcba"
:
- i=1, s[0]='a':
cnt['a'] = [0, 1, 0, 0, 0, 0]
- i=2, s[1]='b':
cnt['b'] = [0, 0, 1, 0, 0, 0]
- i=3, s[2]='c':
cnt['c'] = [0, 0, 0, 1, 0, 0]
- i=4, s[3]='b':
cnt['b'] = [0, 0, 1, 1, 2, 0]
- i=5, s[4]='a':
cnt['a'] = [0, 1, 1, 1, 1, 2]
Final prefix sums:
cnt['a'] = [0, 1, 1, 1, 1, 2] # 'a' appears at positions 0 and 4 cnt['b'] = [0, 0, 1, 1, 2, 2] # 'b' appears at positions 1 and 3 cnt['c'] = [0, 0, 0, 1, 1, 1] # 'c' appears at position 2
Step 3: Process Queries
Query 1: [0, 4] (examining "abcba")
- Single characters:
t = 4 - 0 + 1 = 5
(each character is a same-end substring) - Character 'a': appears
cnt['a'][5] - cnt['a'][0] = 2 - 0 = 2
times- Contributes:
2 * (2-1) / 2 = 1
same-end substring ("abcba")
- Contributes:
- Character 'b': appears
cnt['b'][5] - cnt['b'][0] = 2 - 0 = 2
times- Contributes:
2 * (2-1) / 2 = 1
same-end substring ("bcb")
- Contributes:
- Character 'c': appears
cnt['c'][5] - cnt['c'][0] = 1 - 0 = 1
time- Contributes:
1 * (1-1) / 2 = 0
same-end substrings
- Contributes:
- Total:
5 + 1 + 1 + 0 = 7
Query 2: [1, 3] (examining "bcb")
- Single characters:
t = 3 - 1 + 1 = 3
- Character 'a': appears
cnt['a'][4] - cnt['a'][1] = 1 - 1 = 0
times- Contributes:
0
same-end substrings
- Contributes:
- Character 'b': appears
cnt['b'][4] - cnt['b'][1] = 2 - 0 = 2
times- Contributes:
2 * (2-1) / 2 = 1
same-end substring ("bcb")
- Contributes:
- Character 'c': appears
cnt['c'][4] - cnt['c'][1] = 1 - 0 = 1
time- Contributes:
1 * (1-1) / 2 = 0
same-end substrings
- Contributes:
- Total:
3 + 0 + 1 + 0 = 4
Final Answer: [7, 4]
The same-end substrings are:
- Query 1: "a", "b", "c", "b", "a", "abcba", "bcb"
- Query 2: "b", "c", "b", "bcb"
Solution Implementation
1class Solution:
2 def sameEndSubstringCount(self, s: str, queries: List[List[int]]) -> List[int]:
3 n = len(s)
4
5 # Get all unique characters in the string
6 unique_chars = set(s)
7
8 # Create prefix count arrays for each unique character
9 # prefix_counts[char][i] = count of 'char' in s[0:i]
10 prefix_counts = {char: [0] * (n + 1) for char in unique_chars}
11
12 # Build prefix count arrays
13 for index, current_char in enumerate(s, 1):
14 # Copy previous counts for all characters
15 for char in unique_chars:
16 prefix_counts[char][index] = prefix_counts[char][index - 1]
17 # Increment count for current character
18 prefix_counts[current_char][index] += 1
19
20 result = []
21
22 # Process each query
23 for left, right in queries:
24 # Start with single character substrings (each character forms a valid substring)
25 total = right - left + 1
26
27 # For each unique character, count substrings with this character at both ends
28 for char in unique_chars:
29 # Count occurrences of this character in the range [left, right]
30 char_count = prefix_counts[char][right + 1] - prefix_counts[char][left]
31
32 # Calculate number of substrings with this character at both ends
33 # Formula: choose 2 positions from char_count positions = C(char_count, 2)
34 total += char_count * (char_count - 1) // 2
35
36 result.append(total)
37
38 return result
39
1class Solution {
2 public int[] sameEndSubstringCount(String s, int[][] queries) {
3 int stringLength = s.length();
4
5 // Create a 2D array to store cumulative count of each character
6 // prefixCount[i][j] represents count of character 'a'+i in substring s[0...j-1]
7 int[][] prefixCount = new int[26][stringLength + 1];
8
9 // Build the prefix count array
10 for (int position = 1; position <= stringLength; position++) {
11 // Copy previous counts for all characters
12 for (int charIndex = 0; charIndex < 26; charIndex++) {
13 prefixCount[charIndex][position] = prefixCount[charIndex][position - 1];
14 }
15 // Increment count for current character
16 int currentCharIndex = s.charAt(position - 1) - 'a';
17 prefixCount[currentCharIndex][position]++;
18 }
19
20 int queryCount = queries.length;
21 int[] result = new int[queryCount];
22
23 // Process each query
24 for (int queryIndex = 0; queryIndex < queryCount; queryIndex++) {
25 int leftBound = queries[queryIndex][0];
26 int rightBound = queries[queryIndex][1];
27
28 // Initialize with count of all single character substrings
29 result[queryIndex] = rightBound - leftBound + 1;
30
31 // For each character, calculate substrings with same start and end character
32 for (int charIndex = 0; charIndex < 26; charIndex++) {
33 // Get count of current character in the query range
34 int charCountInRange = prefixCount[charIndex][rightBound + 1] - prefixCount[charIndex][leftBound];
35
36 // Add number of ways to choose 2 positions from charCountInRange positions
37 // This gives count of substrings starting and ending with same character
38 result[queryIndex] += charCountInRange * (charCountInRange - 1) / 2;
39 }
40 }
41
42 return result;
43 }
44}
45
1class Solution {
2public:
3 vector<int> sameEndSubstringCount(string s, vector<vector<int>>& queries) {
4 int n = s.size();
5
6 // Create a 2D array to store prefix counts for each character (a-z)
7 // prefixCount[i][j] represents count of character 'a'+i in s[0...j-1]
8 int prefixCount[26][n + 1];
9 memset(prefixCount, 0, sizeof(prefixCount));
10
11 // Build prefix count array for each character
12 for (int pos = 1; pos <= n; ++pos) {
13 // Copy counts from previous position
14 for (int charIdx = 0; charIdx < 26; ++charIdx) {
15 prefixCount[charIdx][pos] = prefixCount[charIdx][pos - 1];
16 }
17 // Increment count for current character
18 prefixCount[s[pos - 1] - 'a'][pos]++;
19 }
20
21 // Process each query and calculate result
22 vector<int> result;
23 for (auto& query : queries) {
24 int left = query[0];
25 int right = query[1];
26
27 // Start with count of single-character substrings
28 int totalCount = right - left + 1;
29
30 // For each character, calculate substrings with same start and end character
31 for (int charIdx = 0; charIdx < 26; ++charIdx) {
32 // Get count of current character in range [left, right]
33 int charCount = prefixCount[charIdx][right + 1] - prefixCount[charIdx][left];
34
35 // Add combinations: choose 2 positions from charCount occurrences
36 // This gives us all substrings starting and ending with this character
37 totalCount += charCount * (charCount - 1) / 2;
38 }
39
40 result.push_back(totalCount);
41 }
42
43 return result;
44 }
45};
46
1/**
2 * Counts the number of substrings with the same starting and ending character
3 * for each query range in the given string.
4 *
5 * @param s - The input string
6 * @param queries - Array of query ranges, where each query is [left, right] (0-indexed)
7 * @returns Array of counts for each query
8 */
9function sameEndSubstringCount(s: string, queries: number[][]): number[] {
10 const stringLength: number = s.length;
11
12 // Create a 2D prefix sum array to store character counts
13 // prefixCount[charIndex][position] stores count of character 'charIndex' up to position
14 const prefixCount: number[][] = Array.from(
15 { length: 26 },
16 () => Array(stringLength + 1).fill(0)
17 );
18
19 // Build prefix sum array for each character (a-z)
20 for (let position = 1; position <= stringLength; position++) {
21 // Copy counts from previous position
22 for (let charIndex = 0; charIndex < 26; charIndex++) {
23 prefixCount[charIndex][position] = prefixCount[charIndex][position - 1];
24 }
25
26 // Increment count for current character at this position
27 const currentCharIndex: number = s.charCodeAt(position - 1) - 'a'.charCodeAt(0);
28 prefixCount[currentCharIndex][position]++;
29 }
30
31 const results: number[] = [];
32
33 // Process each query
34 for (const [leftIndex, rightIndex] of queries) {
35 // Initialize with count of all single-character substrings in range
36 let substringCount: number = rightIndex - leftIndex + 1;
37
38 // Calculate substrings with same start and end character
39 for (let charIndex = 0; charIndex < 26; charIndex++) {
40 // Get frequency of current character in the query range
41 const charFrequency: number = prefixCount[charIndex][rightIndex + 1] - prefixCount[charIndex][leftIndex];
42
43 // Add combinations: choose 2 positions from charFrequency occurrences
44 // This gives us all substrings where both ends are the same character
45 substringCount += (charFrequency * (charFrequency - 1)) / 2;
46 }
47
48 results.push(substringCount);
49 }
50
51 return results;
52}
53
Time and Space Complexity
Time Complexity: O((n + m) × |Σ|)
The algorithm consists of two main phases:
-
Preprocessing phase: Building the prefix count arrays takes
O(n × |Σ|)
time, where we iterate through each position in the string (n
positions) and update counts for each unique character (|Σ|
characters). -
Query processing phase: For each of the
m
queries, we iterate through all unique characters in the set to calculate the count of substrings. This takesO(m × |Σ|)
time.
Therefore, the total time complexity is O(n × |Σ| + m × |Σ|) = O((n + m) × |Σ|)
.
Space Complexity: O(n × |Σ|)
The algorithm uses a dictionary cnt
where each unique character maps to an array of size n + 1
to store prefix counts. Since there are |Σ|
unique characters and each requires an array of size n + 1
, the space complexity is O(n × |Σ|)
.
Where:
n
is the length of the strings
m
is the number of queries|Σ|
is the size of the character set (number of unique characters ins
), which is at most 26 for lowercase English letters
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Index Confusion Between 0-based String and 1-based Prefix Arrays
The most common pitfall in this solution is mishandling the indexing when building or querying the prefix sum arrays. The string s
uses 0-based indexing, but the prefix sum arrays are built with 1-based logic for convenience.
Incorrect Implementation:
# Wrong: Using 0-based indexing directly
for i in range(n):
for c in unique_chars:
prefix_counts[c][i] = prefix_counts[c][i - 1] # Error when i=0!
prefix_counts[s[i]][i] += 1
# Wrong: Incorrect query range calculation
char_count = prefix_counts[char][right] - prefix_counts[char][left - 1] # Error when left=0!
Correct Implementation:
# Use enumerate with start=1 for cleaner 1-based indexing
for index, current_char in enumerate(s, 1):
for char in unique_chars:
prefix_counts[char][index] = prefix_counts[char][index - 1]
prefix_counts[current_char][index] += 1
# Proper query with adjusted indices
char_count = prefix_counts[char][right + 1] - prefix_counts[char][left]
2. Forgetting to Initialize Single Character Counts
Another pitfall is forgetting that single characters always count as same-end substrings. Some might only count the multi-character same-end substrings.
Incorrect:
# Wrong: Only counting multi-character substrings total = 0 for char in unique_chars: char_count = prefix_counts[char][right + 1] - prefix_counts[char][left] total += char_count * (char_count - 1) // 2
Correct:
# Include single character substrings first total = right - left + 1 # All single characters in the range for char in unique_chars: char_count = prefix_counts[char][right + 1] - prefix_counts[char][left] total += char_count * (char_count - 1) // 2
3. Memory Optimization Oversight
When the string contains many unique characters but queries focus on small ranges, creating prefix arrays for all characters might be inefficient.
Alternative Approach for Small Query Ranges:
def sameEndSubstringCount(self, s: str, queries: List[List[int]]) -> List[int]:
result = []
for left, right in queries:
# For small ranges, direct counting might be more efficient
if right - left + 1 <= 100: # Threshold can be adjusted
count = right - left + 1 # Single characters
char_positions = {}
# Collect positions for each character in range
for i in range(left, right + 1):
if s[i] not in char_positions:
char_positions[s[i]] = []
char_positions[s[i]].append(i)
# Count pairs for each character
for positions in char_positions.values():
count += len(positions) * (len(positions) - 1) // 2
result.append(count)
else:
# Use prefix sum approach for large ranges
# ... (original solution)
return result
This hybrid approach can be more efficient when query ranges are small relative to the string length.
A heap is a ...?
Recommended Readings
Prefix Sum The prefix sum is an incredibly powerful and straightforward technique Its primary goal is to allow for constant time range sum queries on an array What is Prefix Sum The prefix sum of an array at index i is the sum of all numbers from index 0 to i By
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!