Facebook Pixel

2955. Number of Same-End Substrings 🔒

MediumArrayHash TableStringCountingPrefix Sum
Leetcode Link

Problem Description

You are given a string s (0-indexed) and a 2D array queries. Each query queries[i] = [li, ri] represents a substring of s from index li to index ri (both inclusive).

A same-end substring is defined as a substring that has the same character at both its beginning and end positions. For example, "aba" is a same-end substring because it starts and ends with 'a', while "abc" is not.

For each query [li, ri], you need to count how many same-end substrings exist within the substring s[li..ri].

Example breakdown:

  • If we have substring "aba" (from a query), the same-end substrings would be:
    • Single characters: "a" (at position 0), "b" (at position 1), "a" (at position 2) - all single characters are same-end by definition
    • Length 2: none that are same-end
    • Length 3: "aba" (starts and ends with 'a')

The task is to return an array ans where ans[i] contains the count of all same-end substrings for the i-th query.

Key Points:

  • A substring is a contiguous sequence of characters
  • Single characters always count as same-end substrings
  • For longer substrings, only those with matching first and last characters count
  • Each query defines a specific portion of the original string to examine
Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

Let's think about what makes a substring "same-end". For any substring to have the same character at both ends, we need at least two occurrences of that character in our range (except for single characters which are always same-end).

Consider a simple example: if we have the character 'a' appearing 3 times in our query range, how many same-end substrings can we form using these 'a's? We can pick any two positions of 'a' - the first one will be the start and the second will be the end of our substring. The number of ways to choose 2 positions from 3 is C(3,2) = 3.

This observation is key: for each character that appears x times in our query range, it contributes C(x,2) = x*(x-1)/2 same-end substrings (choosing any 2 positions as start and end).

But wait, we also need to count single-character substrings! Each character in the range forms a same-end substring by itself. If our query range has length r-l+1, that's exactly r-l+1 single-character same-end substrings.

So the total count for a query [l,r] becomes:

  • Count of single characters: r - l + 1
  • For each unique character c in the string, if it appears x times in range [l,r], add x*(x-1)/2

To efficiently calculate how many times each character appears in any given range, we can use prefix sums. For each character, we maintain a prefix sum array that tells us how many times that character has appeared up to each position. Then for any range [l,r], the count of character c is simply prefix[c][r+1] - prefix[c][l].

This preprocessing allows us to answer each query in O(26) time (for lowercase English letters) rather than scanning through the entire range for each query.

Learn more about Prefix Sum patterns.

Solution Approach

The solution uses prefix sum preprocessing combined with combinatorial counting to efficiently answer each query.

Step 1: Build Prefix Sum Arrays

First, we identify all unique characters in the string s and store them in a set cs. For each unique character, we create a prefix sum array:

cnt = {c: [0] * (n + 1) for c in cs}

This creates a dictionary where each character maps to an array of size n+1 (to handle 1-indexing conveniently).

Step 2: Populate Prefix Sums

We iterate through the string and update the prefix sum arrays:

for i, a in enumerate(s, 1):
    for c in cs:
        cnt[c][i] = cnt[c][i - 1]  # Copy previous count
    cnt[a][i] += 1  # Increment count for current character

After this, cnt[c][i] contains the number of times character c appears in s[0..i-1].

Step 3: Process Each Query

For each query [l, r]:

  1. Initialize the count with single-character substrings: t = r - l + 1

  2. For each unique character c, calculate how many times it appears in the range [l, r]:

    x = cnt[c][r + 1] - cnt[c][l]

    This uses the prefix sum to get the count in O(1) time.

  3. Add the number of same-end substrings formed by character c:

    t += x * (x - 1) // 2

    This is the combinatorial formula C(x, 2) - choosing any 2 occurrences of c as endpoints.

  4. Append the total count t to the answer array.

Time Complexity:

  • Preprocessing: O(n × |Σ|) where |Σ| is the number of unique characters
  • Per query: O(|Σ|)
  • Total: O(n × |Σ| + q × |Σ|) where q is the number of queries

Space Complexity: O(n × |Σ|) for storing the prefix sum arrays

The key insight is that by preprocessing the character counts, we can answer each query in constant time relative to the string length, making this approach efficient for multiple queries on the same string.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with s = "abcba" and queries = [[0, 4], [1, 3]].

Step 1: Build Prefix Sum Arrays

First, identify unique characters: cs = {'a', 'b', 'c'}

Create prefix sum arrays:

cnt['a'] = [0, 0, 0, 0, 0, 0]  # size n+1 = 6
cnt['b'] = [0, 0, 0, 0, 0, 0]
cnt['c'] = [0, 0, 0, 0, 0, 0]

Step 2: Populate Prefix Sums

Process each character in s = "abcba":

  • i=1, s[0]='a': cnt['a'] = [0, 1, 0, 0, 0, 0]
  • i=2, s[1]='b': cnt['b'] = [0, 0, 1, 0, 0, 0]
  • i=3, s[2]='c': cnt['c'] = [0, 0, 0, 1, 0, 0]
  • i=4, s[3]='b': cnt['b'] = [0, 0, 1, 1, 2, 0]
  • i=5, s[4]='a': cnt['a'] = [0, 1, 1, 1, 1, 2]

Final prefix sums:

cnt['a'] = [0, 1, 1, 1, 1, 2]  # 'a' appears at positions 0 and 4
cnt['b'] = [0, 0, 1, 1, 2, 2]  # 'b' appears at positions 1 and 3
cnt['c'] = [0, 0, 0, 1, 1, 1]  # 'c' appears at position 2

Step 3: Process Queries

Query 1: [0, 4] (examining "abcba")

  • Single characters: t = 4 - 0 + 1 = 5 (each character is a same-end substring)
  • Character 'a': appears cnt['a'][5] - cnt['a'][0] = 2 - 0 = 2 times
    • Contributes: 2 * (2-1) / 2 = 1 same-end substring ("abcba")
  • Character 'b': appears cnt['b'][5] - cnt['b'][0] = 2 - 0 = 2 times
    • Contributes: 2 * (2-1) / 2 = 1 same-end substring ("bcb")
  • Character 'c': appears cnt['c'][5] - cnt['c'][0] = 1 - 0 = 1 time
    • Contributes: 1 * (1-1) / 2 = 0 same-end substrings
  • Total: 5 + 1 + 1 + 0 = 7

Query 2: [1, 3] (examining "bcb")

  • Single characters: t = 3 - 1 + 1 = 3
  • Character 'a': appears cnt['a'][4] - cnt['a'][1] = 1 - 1 = 0 times
    • Contributes: 0 same-end substrings
  • Character 'b': appears cnt['b'][4] - cnt['b'][1] = 2 - 0 = 2 times
    • Contributes: 2 * (2-1) / 2 = 1 same-end substring ("bcb")
  • Character 'c': appears cnt['c'][4] - cnt['c'][1] = 1 - 0 = 1 time
    • Contributes: 1 * (1-1) / 2 = 0 same-end substrings
  • Total: 3 + 0 + 1 + 0 = 4

Final Answer: [7, 4]

The same-end substrings are:

  • Query 1: "a", "b", "c", "b", "a", "abcba", "bcb"
  • Query 2: "b", "c", "b", "bcb"

Solution Implementation

1class Solution:
2    def sameEndSubstringCount(self, s: str, queries: List[List[int]]) -> List[int]:
3        n = len(s)
4      
5        # Get all unique characters in the string
6        unique_chars = set(s)
7      
8        # Create prefix count arrays for each unique character
9        # prefix_counts[char][i] = count of 'char' in s[0:i]
10        prefix_counts = {char: [0] * (n + 1) for char in unique_chars}
11      
12        # Build prefix count arrays
13        for index, current_char in enumerate(s, 1):
14            # Copy previous counts for all characters
15            for char in unique_chars:
16                prefix_counts[char][index] = prefix_counts[char][index - 1]
17            # Increment count for current character
18            prefix_counts[current_char][index] += 1
19      
20        result = []
21      
22        # Process each query
23        for left, right in queries:
24            # Start with single character substrings (each character forms a valid substring)
25            total = right - left + 1
26          
27            # For each unique character, count substrings with this character at both ends
28            for char in unique_chars:
29                # Count occurrences of this character in the range [left, right]
30                char_count = prefix_counts[char][right + 1] - prefix_counts[char][left]
31              
32                # Calculate number of substrings with this character at both ends
33                # Formula: choose 2 positions from char_count positions = C(char_count, 2)
34                total += char_count * (char_count - 1) // 2
35          
36            result.append(total)
37      
38        return result
39
1class Solution {
2    public int[] sameEndSubstringCount(String s, int[][] queries) {
3        int stringLength = s.length();
4      
5        // Create a 2D array to store cumulative count of each character
6        // prefixCount[i][j] represents count of character 'a'+i in substring s[0...j-1]
7        int[][] prefixCount = new int[26][stringLength + 1];
8      
9        // Build the prefix count array
10        for (int position = 1; position <= stringLength; position++) {
11            // Copy previous counts for all characters
12            for (int charIndex = 0; charIndex < 26; charIndex++) {
13                prefixCount[charIndex][position] = prefixCount[charIndex][position - 1];
14            }
15            // Increment count for current character
16            int currentCharIndex = s.charAt(position - 1) - 'a';
17            prefixCount[currentCharIndex][position]++;
18        }
19      
20        int queryCount = queries.length;
21        int[] result = new int[queryCount];
22      
23        // Process each query
24        for (int queryIndex = 0; queryIndex < queryCount; queryIndex++) {
25            int leftBound = queries[queryIndex][0];
26            int rightBound = queries[queryIndex][1];
27          
28            // Initialize with count of all single character substrings
29            result[queryIndex] = rightBound - leftBound + 1;
30          
31            // For each character, calculate substrings with same start and end character
32            for (int charIndex = 0; charIndex < 26; charIndex++) {
33                // Get count of current character in the query range
34                int charCountInRange = prefixCount[charIndex][rightBound + 1] - prefixCount[charIndex][leftBound];
35              
36                // Add number of ways to choose 2 positions from charCountInRange positions
37                // This gives count of substrings starting and ending with same character
38                result[queryIndex] += charCountInRange * (charCountInRange - 1) / 2;
39            }
40        }
41      
42        return result;
43    }
44}
45
1class Solution {
2public:
3    vector<int> sameEndSubstringCount(string s, vector<vector<int>>& queries) {
4        int n = s.size();
5      
6        // Create a 2D array to store prefix counts for each character (a-z)
7        // prefixCount[i][j] represents count of character 'a'+i in s[0...j-1]
8        int prefixCount[26][n + 1];
9        memset(prefixCount, 0, sizeof(prefixCount));
10      
11        // Build prefix count array for each character
12        for (int pos = 1; pos <= n; ++pos) {
13            // Copy counts from previous position
14            for (int charIdx = 0; charIdx < 26; ++charIdx) {
15                prefixCount[charIdx][pos] = prefixCount[charIdx][pos - 1];
16            }
17            // Increment count for current character
18            prefixCount[s[pos - 1] - 'a'][pos]++;
19        }
20      
21        // Process each query and calculate result
22        vector<int> result;
23        for (auto& query : queries) {
24            int left = query[0];
25            int right = query[1];
26          
27            // Start with count of single-character substrings
28            int totalCount = right - left + 1;
29          
30            // For each character, calculate substrings with same start and end character
31            for (int charIdx = 0; charIdx < 26; ++charIdx) {
32                // Get count of current character in range [left, right]
33                int charCount = prefixCount[charIdx][right + 1] - prefixCount[charIdx][left];
34              
35                // Add combinations: choose 2 positions from charCount occurrences
36                // This gives us all substrings starting and ending with this character
37                totalCount += charCount * (charCount - 1) / 2;
38            }
39          
40            result.push_back(totalCount);
41        }
42      
43        return result;
44    }
45};
46
1/**
2 * Counts the number of substrings with the same starting and ending character
3 * for each query range in the given string.
4 * 
5 * @param s - The input string
6 * @param queries - Array of query ranges, where each query is [left, right] (0-indexed)
7 * @returns Array of counts for each query
8 */
9function sameEndSubstringCount(s: string, queries: number[][]): number[] {
10    const stringLength: number = s.length;
11  
12    // Create a 2D prefix sum array to store character counts
13    // prefixCount[charIndex][position] stores count of character 'charIndex' up to position
14    const prefixCount: number[][] = Array.from(
15        { length: 26 }, 
16        () => Array(stringLength + 1).fill(0)
17    );
18  
19    // Build prefix sum array for each character (a-z)
20    for (let position = 1; position <= stringLength; position++) {
21        // Copy counts from previous position
22        for (let charIndex = 0; charIndex < 26; charIndex++) {
23            prefixCount[charIndex][position] = prefixCount[charIndex][position - 1];
24        }
25      
26        // Increment count for current character at this position
27        const currentCharIndex: number = s.charCodeAt(position - 1) - 'a'.charCodeAt(0);
28        prefixCount[currentCharIndex][position]++;
29    }
30  
31    const results: number[] = [];
32  
33    // Process each query
34    for (const [leftIndex, rightIndex] of queries) {
35        // Initialize with count of all single-character substrings in range
36        let substringCount: number = rightIndex - leftIndex + 1;
37      
38        // Calculate substrings with same start and end character
39        for (let charIndex = 0; charIndex < 26; charIndex++) {
40            // Get frequency of current character in the query range
41            const charFrequency: number = prefixCount[charIndex][rightIndex + 1] - prefixCount[charIndex][leftIndex];
42          
43            // Add combinations: choose 2 positions from charFrequency occurrences
44            // This gives us all substrings where both ends are the same character
45            substringCount += (charFrequency * (charFrequency - 1)) / 2;
46        }
47      
48        results.push(substringCount);
49    }
50  
51    return results;
52}
53

Time and Space Complexity

Time Complexity: O((n + m) × |Σ|)

The algorithm consists of two main phases:

  1. Preprocessing phase: Building the prefix count arrays takes O(n × |Σ|) time, where we iterate through each position in the string (n positions) and update counts for each unique character (|Σ| characters).

  2. Query processing phase: For each of the m queries, we iterate through all unique characters in the set to calculate the count of substrings. This takes O(m × |Σ|) time.

Therefore, the total time complexity is O(n × |Σ| + m × |Σ|) = O((n + m) × |Σ|).

Space Complexity: O(n × |Σ|)

The algorithm uses a dictionary cnt where each unique character maps to an array of size n + 1 to store prefix counts. Since there are |Σ| unique characters and each requires an array of size n + 1, the space complexity is O(n × |Σ|).

Where:

  • n is the length of the string s
  • m is the number of queries
  • |Σ| is the size of the character set (number of unique characters in s), which is at most 26 for lowercase English letters

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Index Confusion Between 0-based String and 1-based Prefix Arrays

The most common pitfall in this solution is mishandling the indexing when building or querying the prefix sum arrays. The string s uses 0-based indexing, but the prefix sum arrays are built with 1-based logic for convenience.

Incorrect Implementation:

# Wrong: Using 0-based indexing directly
for i in range(n):
    for c in unique_chars:
        prefix_counts[c][i] = prefix_counts[c][i - 1]  # Error when i=0!
    prefix_counts[s[i]][i] += 1

# Wrong: Incorrect query range calculation
char_count = prefix_counts[char][right] - prefix_counts[char][left - 1]  # Error when left=0!

Correct Implementation:

# Use enumerate with start=1 for cleaner 1-based indexing
for index, current_char in enumerate(s, 1):
    for char in unique_chars:
        prefix_counts[char][index] = prefix_counts[char][index - 1]
    prefix_counts[current_char][index] += 1

# Proper query with adjusted indices
char_count = prefix_counts[char][right + 1] - prefix_counts[char][left]

2. Forgetting to Initialize Single Character Counts

Another pitfall is forgetting that single characters always count as same-end substrings. Some might only count the multi-character same-end substrings.

Incorrect:

# Wrong: Only counting multi-character substrings
total = 0
for char in unique_chars:
    char_count = prefix_counts[char][right + 1] - prefix_counts[char][left]
    total += char_count * (char_count - 1) // 2

Correct:

# Include single character substrings first
total = right - left + 1  # All single characters in the range
for char in unique_chars:
    char_count = prefix_counts[char][right + 1] - prefix_counts[char][left]
    total += char_count * (char_count - 1) // 2

3. Memory Optimization Oversight

When the string contains many unique characters but queries focus on small ranges, creating prefix arrays for all characters might be inefficient.

Alternative Approach for Small Query Ranges:

def sameEndSubstringCount(self, s: str, queries: List[List[int]]) -> List[int]:
    result = []
  
    for left, right in queries:
        # For small ranges, direct counting might be more efficient
        if right - left + 1 <= 100:  # Threshold can be adjusted
            count = right - left + 1  # Single characters
            char_positions = {}
          
            # Collect positions for each character in range
            for i in range(left, right + 1):
                if s[i] not in char_positions:
                    char_positions[s[i]] = []
                char_positions[s[i]].append(i)
          
            # Count pairs for each character
            for positions in char_positions.values():
                count += len(positions) * (len(positions) - 1) // 2
          
            result.append(count)
        else:
            # Use prefix sum approach for large ranges
            # ... (original solution)
  
    return result

This hybrid approach can be more efficient when query ranges are small relative to the string length.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

A heap is a ...?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More