3298. Count Substrings That Can Be Rearranged to Contain a String II

Problem Description

You are given two strings word1 and word2.

A string x is considered valid if it can be rearranged (its characters can be reordered) to have word2 as a prefix. In other words, after rearranging the characters of x, the resulting string should start with word2.

Your task is to find and return the total number of valid substrings of word1. A substring is a contiguous sequence of characters within word1.

For example, if word1 = "abcabc" and word2 = "abc", then substrings like "abc", "bca", "cab", "abca", "bcab", "cabc", and "abcabc" are all valid because they contain at least the characters 'a', 'b', and 'c' with the required frequencies, allowing them to be rearranged to have "abc" as a prefix.

Important constraint: The memory limits for this problem are smaller than usual, so you must implement a solution with linear runtime complexity O(n).

The key insight is that for a substring to be valid, it must contain at least all the characters from word2 with their respective frequencies. Once a substring contains all required characters, it can be rearranged to place word2 at the beginning.

Quick Interview Experience

Help others by sharing your interview experience

Have you seen this problem before?

Intuition

The core observation is that a substring is valid if it contains at least all characters from word2 with their required frequencies. Once we have these characters, we can rearrange them to form word2 as a prefix.

This immediately suggests using a sliding window approach. But here's the key insight: instead of finding all valid windows separately, we can be clever about counting them.

Consider when we have a window [l, r] that just barely contains all required characters. If we extend the right boundary to r+1, we still have a valid window. More importantly, if [l, r] is valid, then any window [l', r] where l' < l is also valid because it contains even more characters.

This leads us to the counting strategy: for each right boundary position r, we want to find the rightmost left boundary l such that [l, r] is still valid. Then all windows [0, r], [1, r], ..., [l-1, r] are valid, giving us exactly l valid substrings ending at position r.

The algorithm works by:

Expanding the window by moving the right pointer and adding characters
When we have all required characters (a valid window), we shrink from the left as much as possible while maintaining validity
Once we can't shrink anymore (removing one more character would make it invalid), we count all valid substrings ending at the current right position

The use of need counter is elegant - it tracks how many distinct characters still need to reach their required frequency. When need = 0, we have a valid window. When we shrink and lose a required character, need becomes positive again, signaling we've found the boundary.

This approach ensures we count each valid substring exactly once and achieves the required O(n) time complexity since each character is visited at most twice (once by right pointer, once by left pointer).

Learn more about Sliding Window patterns.

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def validSubstringCount(self, word1: str, word2: str) -> int:
5        # If word1 is shorter than word2, no valid substring is possible
6        if len(word1) < len(word2):
7            return 0
8      
9        # Count frequency of each character in word2 (target frequencies)
10        target_count = Counter(word2)
11      
12        # Number of unique characters we need to match
13        chars_needed = len(target_count)
14      
15        # Initialize result and left pointer for sliding window
16        result = 0
17        left = 0
18      
19        # Current window character frequencies
20        window_count = Counter()
21      
22        # Iterate through word1 with right pointer
23        for right, char in enumerate(word1):
24            # Add current character to window
25            window_count[char] += 1
26          
27            # If this character now matches the required frequency from word2
28            if window_count[char] == target_count[char]:
29                chars_needed -= 1
30          
31            # When all required characters are satisfied
32            while chars_needed == 0:
33                # Check if removing left character would break the validity
34                if window_count[word1[left]] == target_count[word1[left]]:
35                    chars_needed += 1
36              
37                # Remove leftmost character from window
38                window_count[word1[left]] -= 1
39                left += 1
40          
41            # Add number of valid substrings ending at current position
42            # All substrings from index 0 to left-1 form valid substrings when combined with [left, right]
43            result += left
44      
45        return result
46

1class Solution {
2    public long validSubstringCount(String word1, String word2) {
3        // Early return if word1 is shorter than word2
4        if (word1.length() < word2.length()) {
5            return 0;
6        }
7      
8        // Count frequency of each character in word2
9        int[] targetFreq = new int[26];
10        int uniqueCharsNeeded = 0;
11      
12        // Build frequency map for word2 and count unique characters
13        for (int i = 0; i < word2.length(); i++) {
14            int charIndex = word2.charAt(i) - 'a';
15            if (++targetFreq[charIndex] == 1) {
16                uniqueCharsNeeded++;
17            }
18        }
19      
20        long result = 0;
21        int[] windowFreq = new int[26];
22      
23        // Sliding window approach with two pointers
24        int left = 0;
25        for (int right = 0; right < word1.length(); right++) {
26            // Add current character to window
27            int rightCharIndex = word1.charAt(right) - 'a';
28            if (++windowFreq[rightCharIndex] == targetFreq[rightCharIndex]) {
29                uniqueCharsNeeded--;
30            }
31          
32            // Shrink window from left while maintaining validity
33            while (uniqueCharsNeeded == 0) {
34                int leftCharIndex = word1.charAt(left) - 'a';
35              
36                // Check if removing left character breaks the validity
37                if (windowFreq[leftCharIndex] == targetFreq[leftCharIndex]) {
38                    uniqueCharsNeeded++;
39                }
40              
41                // Remove left character from window
42                windowFreq[leftCharIndex]--;
43                left++;
44            }
45          
46            // All substrings starting from indices 0 to (left-1) and ending at right are valid
47            result += left;
48        }
49      
50        return result;
51    }
52}
53

1class Solution {
2public:
3    long long validSubstringCount(string word1, string word2) {
4        // Early return if word1 is shorter than word2
5        if (word1.size() < word2.size()) {
6            return 0;
7        }
8      
9        // Count frequency of each character in word2
10        int targetFreq[26] = {0};
11        int uniqueCharsNeeded = 0;
12      
13        for (char& ch : word2) {
14            int charIndex = ch - 'a';
15            if (++targetFreq[charIndex] == 1) {
16                // This is a new unique character in word2
17                ++uniqueCharsNeeded;
18            }
19        }
20      
21        long long result = 0;
22        int windowFreq[26] = {0};  // Frequency of characters in current window
23        int left = 0;  // Left pointer of sliding window
24      
25        // Iterate through word1 with right pointer
26        for (char& ch : word1) {
27            int charIndex = ch - 'a';
28          
29            // Add current character to window
30            if (++windowFreq[charIndex] == targetFreq[charIndex]) {
31                // We've satisfied the requirement for this character
32                --uniqueCharsNeeded;
33            }
34          
35            // Shrink window from left while it's still valid
36            while (uniqueCharsNeeded == 0) {
37                int leftCharIndex = word1[left] - 'a';
38              
39                // Check if removing left character breaks validity
40                if (windowFreq[leftCharIndex] == targetFreq[leftCharIndex]) {
41                    ++uniqueCharsNeeded;
42                }
43              
44                // Remove left character from window
45                --windowFreq[leftCharIndex];
46                ++left;
47            }
48          
49            // All substrings ending at current position with starting index < left are valid
50            // Since we moved left to the first position that makes window invalid,
51            // positions [0, left-1] all form valid substrings with current right pointer
52            result += left;
53        }
54      
55        return result;
56    }
57};
58

1/**
2 * Counts the number of valid substrings in word1 that contain all characters from word2
3 * with at least the same frequency
4 * @param word1 - The source string to search for valid substrings
5 * @param word2 - The pattern string whose characters must be contained in valid substrings
6 * @returns The count of valid substrings
7 */
8function validSubstringCount(word1: string, word2: string): number {
9    // Early return if word1 is shorter than word2
10    if (word1.length < word2.length) {
11        return 0;
12    }
13  
14    // Array to store character frequency requirements from word2 (26 lowercase letters)
15    const requiredCharCount: number[] = Array(26).fill(0);
16    // Number of distinct characters we need to satisfy
17    let distinctCharsNeeded: number = 0;
18  
19    // Count character frequencies in word2 and track distinct characters
20    for (const char of word2) {
21        const charIndex = char.charCodeAt(0) - 97; // Convert 'a'-'z' to 0-25
22        if (++requiredCharCount[charIndex] === 1) {
23            // First occurrence of this character, increment distinct count
24            ++distinctCharsNeeded;
25        }
26    }
27  
28    // Array to track character frequencies in current window
29    const windowCharCount: number[] = Array(26).fill(0);
30    // Result counter and left pointer for sliding window
31    let validSubstringCount: number = 0;
32    let leftPointer: number = 0;
33  
34    // Iterate through word1 with right pointer
35    for (const char of word1) {
36        const charIndex = char.charCodeAt(0) - 97;
37      
38        // Add current character to window
39        if (++windowCharCount[charIndex] === requiredCharCount[charIndex]) {
40            // This character now meets the required frequency
41            --distinctCharsNeeded;
42        }
43      
44        // Shrink window from left while it remains valid
45        while (distinctCharsNeeded === 0) {
46            const leftCharIndex = word1[leftPointer].charCodeAt(0) - 97;
47          
48            // Check if removing left character breaks validity
49            if (windowCharCount[leftCharIndex] === requiredCharCount[leftCharIndex]) {
50                ++distinctCharsNeeded;
51            }
52          
53            // Remove left character from window and move pointer
54            --windowCharCount[leftCharIndex];
55            ++leftPointer;
56        }
57      
58        // All substrings starting from indices 0 to leftPointer-1 and ending at current position are valid
59        validSubstringCount += leftPointer;
60    }
61  
62    return validSubstringCount;
63}
64

Solution Approach

The solution uses a sliding window technique with two pointers to efficiently count valid substrings.

Initial Setup:

First, check if len(word1) < len(word2). If true, return 0 since it's impossible to have valid substrings.
Create a counter cnt to store the frequency of each character in word2.
Initialize need = len(cnt), which tracks how many distinct characters from word2 we still need to match their required frequencies.
Initialize win as a counter for the current window, ans = 0 for the result, and l = 0 for the left boundary.

Main Algorithm: We iterate through each character in word1 using it as the right boundary of our window:

Expand the window: For each character c at position r:
- Add it to the window: win[c] += 1
- Check if this character helps us meet a requirement: if win[c] == cnt[c], we've just satisfied the frequency requirement for character c, so decrement need -= 1
Shrink when valid: When need == 0 (all requirements met):
- Try to shrink from the left to find the smallest valid window ending at r
- Before removing word1[l] from the window:
  - If win[word1[l]] == cnt[word1[l]], removing this character will break a requirement, so increment need += 1
  - Decrement the count: win[word1[l]] -= 1
  - Move left pointer: l += 1
- This loop continues until need > 0, meaning we've found the rightmost l where [l, r] is still valid
Count valid substrings: After shrinking, add l to the answer. This represents all valid substrings ending at position r:
- [0, r], [1, r], ..., [l-1, r] are all valid (total of l substrings)
- [l, r] and beyond are not valid (don't contain all required characters)

Key Insights:

The need counter elegantly tracks validity: when it's 0, we have all required characters with correct frequencies
By shrinking to find the rightmost valid left boundary, we can count all valid substrings ending at each position in O(1) time
Each character is visited at most twice (once by right pointer, once by left pointer), ensuring O(n) time complexity
The space complexity is O(1) since we only use fixed-size counters (at most 26 characters)

The algorithm efficiently counts all valid substrings by recognizing that if [l, r] is the minimal valid window ending at r, then there are exactly l valid substrings ending at that position.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the algorithm with word1 = "bcabc" and word2 = "abc".

Initial Setup:

cnt = {'a': 1, 'b': 1, 'c': 1} (frequency of characters in word2)
need = 3 (we need to match 3 distinct characters)
win = {} (empty window counter)
ans = 0, l = 0

Step-by-step execution:

r = 0, character 'b':

Add 'b' to window: win = {'b': 1}
Since win['b'] == cnt['b'] (both are 1), we satisfied 'b': need = 2
need > 0, so we don't shrink
ans += 0 (no valid substrings yet)

r = 1, character 'c':

Add 'c' to window: win = {'b': 1, 'c': 1}
Since win['c'] == cnt['c'] (both are 1), we satisfied 'c': need = 1
need > 0, so we don't shrink
ans += 0 (still no valid substrings)

r = 2, character 'a':

Add 'a' to window: win = {'b': 1, 'c': 1, 'a': 1}
Since win['a'] == cnt['a'] (both are 1), we satisfied 'a': need = 0
Now need == 0, we have a valid window [0,2] = "bca"
Try to shrink:
- Check word1[0] = 'b': if we remove it, win['b'] would become 0, breaking the requirement
- Since win['b'] == cnt['b'], removing it would make window invalid: need = 1
- Remove 'b': win = {'b': 0, 'c': 1, 'a': 1}, l = 1
- Stop shrinking since need > 0
ans += 1 (substring [0,2] = "bca" is valid)

r = 3, character 'b':

Add 'b' to window: win = {'b': 1, 'c': 1, 'a': 1}
Since win['b'] == cnt['b'] (both are 1), we satisfied 'b' again: need = 0
need == 0, valid window [1,3] = "cab"
Try to shrink:
- Check word1[1] = 'c': if we remove it, win['c'] would become 0
- Since win['c'] == cnt['c'], removing it would break requirement: need = 1
- Remove 'c': win = {'b': 1, 'c': 0, 'a': 1}, l = 2
- Stop shrinking since need > 0
ans += 2 (substrings [0,3] = "bcab" and [1,3] = "cab" are valid)
Total: ans = 3

r = 4, character 'c':

Add 'c' to window: win = {'b': 1, 'c': 1, 'a': 1}
Since win['c'] == cnt['c'] (both are 1), we satisfied 'c' again: need = 0
need == 0, valid window [2,4] = "abc"
Try to shrink:
- Check word1[2] = 'a': if we remove it, win['a'] would become 0
- Since win['a'] == cnt['a'], removing it would break requirement: need = 1
- Remove 'a': win = {'b': 1, 'c': 1, 'a': 0}, l = 3
- Stop shrinking since need > 0
ans += 3 (substrings [0,4] = "bcabc", [1,4] = "cabc", and [2,4] = "abc" are valid)
Total: ans = 6

Final Result: 6 valid substrings

The valid substrings are:

"bca" (can be rearranged to "abc")
"bcab" (can be rearranged to "abcb" with "abc" prefix)
"cab" (can be rearranged to "abc")
"bcabc" (can be rearranged to "abcbc" with "abc" prefix)
"cabc" (can be rearranged to "abc" + "c")
"abc" (already has all required characters)

Time and Space Complexity

The time complexity is O(n + m), where n is the length of word1 and m is the length of word2. The algorithm first counts the characters in word2 using Counter, which takes O(m) time. Then it uses a sliding window approach with two pointers that traverse word1 once. The right pointer moves through all n characters exactly once, and the left pointer also moves at most n times throughout the entire execution, giving us O(n) for the main loop. Therefore, the total time complexity is O(n + m).

The space complexity is O(|Σ|), where Σ represents the character set. The algorithm uses two Counter objects: cnt to store character frequencies from word2 and win to track the current window's character frequencies. Since the problem deals with lowercase English letters, the maximum number of distinct characters is 26, making the space complexity O(1) or constant space. Even though we store counters, their size is bounded by the alphabet size rather than the input size.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Misunderstanding the Counting Logic

The Problem: Many developers incorrectly assume that when need == 0 (window is valid), they should count ALL substrings within the current window [left, right]. This leads to code like:

if chars_needed == 0:
    # WRONG: This counts internal substrings multiple times
    result += (right - left + 1)

Why It's Wrong: This approach counts substrings multiple times. For example, if [0, 5] is valid, this would count substring [1, 4] here. But when we later process [1, 4] directly, we'd count it again.

The Correct Approach: Count only substrings that END at the current position right. After shrinking to find the smallest valid window [left, right], we know that:

[0, right], [1, right], ..., [left-1, right] are all valid (total of left substrings)
[left, right] onwards are invalid (missing required characters)

Pitfall 2: Incorrect Window Shrinking Condition

The Problem: Some implementations shrink the window while it remains valid:

# WRONG: Shrinks while window is still valid
while chars_needed == 0:
    result += 1  # or some counting logic
    # shrink window...

Why It's Wrong: This approach finds valid windows but doesn't correctly count all valid substrings. It might miss counting larger valid substrings that contain the minimal window.

The Correct Approach: Shrink until the window becomes invalid, then count based on the last valid position:

while chars_needed == 0:
    # First check if removing would break validity
    if window_count[word1[left]] == target_count[word1[left]]:
        chars_needed += 1
    # Then remove and move pointer
    window_count[word1[left]] -= 1
    left += 1
# After loop: left is now at first position that makes window invalid
result += left  # Count all valid substrings ending at right

Pitfall 3: Overcounting Due to Character Frequency Satisfaction

The Problem: Incorrectly decrementing chars_needed every time a character count increases:

# WRONG: Decrements even when already satisfied
window_count[char] += 1
if char in target_count and window_count[char] >= target_count[char]:
    chars_needed -= 1

Why It's Wrong: If a character already has sufficient count (e.g., we need 2 'a's and already have 3), adding another 'a' would incorrectly decrement chars_needed again, leading to negative values or incorrect validity checks.

The Correct Approach: Only decrement chars_needed when we EXACTLY meet the requirement:

window_count[char] += 1
# Only when we go from insufficient to exactly sufficient
if window_count[char] == target_count[char]:
    chars_needed -= 1

Similarly, when shrinking, only increment chars_needed when we go from exactly sufficient to insufficient:

if window_count[word1[left]] == target_count[word1[left]]:
    chars_needed += 1
window_count[word1[left]] -= 1

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Consider the classic dynamic programming of fibonacci numbers, what is the recurrence relation?

dp[i] = dp[i - 1] + dp[i - 2]

dp[i] = dp[i - 1] + 1

dp[i] = dp[i - 1] + dp[i - 2] + dp[i - 3]

dp[i] = dp[i - 1]

I don't know