Facebook Pixel

1371. Find the Longest Substring Containing Vowels in Even Counts

MediumBit ManipulationHash TableStringPrefix Sum
Leetcode Link

Problem Description

You are given a string s and need to find the length of the longest substring where each of the five vowels ('a', 'e', 'i', 'o', 'u') appears an even number of times.

The key insight is that we can track the parity (odd/even) of vowel occurrences using a bitmask. Each vowel is assigned a bit position, and we toggle that bit whenever we encounter the vowel. When two positions in the string have the same bitmask value, it means the substring between them has an even count of all vowels.

The algorithm uses a hash table d to store the first occurrence index of each bitmask state. Starting with d[0] = -1 (representing the state before any characters), we traverse the string:

  1. For each vowel encountered, we XOR the corresponding bit in mask (toggling between odd/even state)
  2. If the current mask was seen before at index j, the substring from j+1 to i has all vowels appearing an even number of times
  3. We update the answer with the maximum length found

The XOR operation mask ^= 1 << (ord(c) - ord("a")) efficiently toggles the bit for vowel c. For example, if 'a' appears, it toggles bit 0; if 'e' appears, it toggles bit 4.

The beauty of this approach is that when two positions share the same mask value, all vowels in the substring between them must have been toggled an even number of times (returning to their original parity state), ensuring an even count for each vowel in that substring.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The challenge asks for substrings where each vowel appears an even number of times. Let's think about what this means - we need to track whether each vowel has appeared an odd or even number of times as we scan through the string.

The key realization is that we don't need to count the exact number of occurrences; we only care about parity (odd vs. even). This is perfect for using bits - a bit can be 0 (even) or 1 (odd).

Consider this: if at position i vowel 'a' has appeared 3 times (odd), and at position j later it has appeared 7 times (still odd), then in the substring between i and j, 'a' must have appeared 4 times (even). This pattern holds for all vowels.

This leads to a crucial insight: if two positions in the string have the same parity state for all vowels, then the substring between them contains each vowel an even number of times. Why? Because getting from one state to the same state means each vowel that changed must have changed an even number of times.

We can represent the parity state of all five vowels using just 5 bits in a binary number (the mask). Each time we see a vowel, we flip its corresponding bit using XOR. For instance, if we see 'a', we XOR with 1 << 0; if we see 'e', we XOR with 1 << 4.

The XOR operation is perfect here because:

  • 0 XOR 1 = 1 (even count becomes odd)
  • 1 XOR 1 = 0 (odd count becomes even)

By storing the first occurrence of each mask state and comparing with later occurrences of the same mask, we can find all valid substrings. The longest one will be our answer. The initial state d[0] = -1 handles the case where a valid substring starts from the beginning of the string.

Learn more about Prefix Sum patterns.

Solution Approach

The solution uses Prefix XOR with a Hash Table to efficiently track the parity states of vowels.

Data Structures:

  • A hash table d to store the first occurrence index of each mask state
  • An integer mask to represent the current parity state using bits
  • An integer ans to track the maximum length found

Implementation Steps:

  1. Initialize the hash table: Set d[0] = -1 to handle substrings starting from index 0. This represents the state before processing any characters where all vowels have appeared 0 times (even).

  2. Process each character: Iterate through the string with index i and character c:

    • If c is a vowel, toggle its corresponding bit in mask using XOR:
      mask ^= 1 << (ord(c) - ord("a"))
    • This maps vowels to bit positions: 'a'→0, 'e'→4, 'i'→8, 'o'→14, 'u'→20
  3. Check for valid substrings:

    • If the current mask exists in d, we've found a valid substring:
      • Retrieve the previous occurrence index j = d[mask]
      • The substring from index j+1 to i has all vowels appearing an even number of times
      • Update the answer: ans = max(ans, i - j)
    • If mask is new, store the current index: d[mask] = i
  4. Return the result: After processing all characters, ans contains the length of the longest valid substring.

Why this works:

  • The XOR operation naturally handles the toggle between odd/even states
  • When two positions have the same mask value, the XOR of all vowel occurrences between them results in 0 for each vowel bit, meaning even occurrences
  • By storing only the first occurrence of each mask, we maximize the substring length when we encounter the same mask again

Time Complexity: O(n) where n is the length of the string, as we traverse it once Space Complexity: O(2^5) = O(32) in the worst case for the hash table, which is O(1)

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the string s = "eleetminicoworoep" to illustrate the solution approach.

We'll track the mask value and hash table d as we process each character. The mask uses 5 bits for vowels: a(bit 0), e(bit 4), i(bit 8), o(bit 14), u(bit 20).

Initial state:

  • mask = 0 (binary: 00000, all vowels have even count)
  • d = {0: -1} (before processing any characters)
  • ans = 0

Processing each character:

IndexCharIs Vowel?ActionMask (binary)Mask (decimal)dans
0'e'YesToggle bit 41000016{0: -1, 16: 0}0
1'l'NoNo change1000016{0: -1, 16: 0}0
2'e'YesToggle bit 4000000{0: -1, 16: 0}max(0, 2-(-1))=3
3'e'YesToggle bit 41000016{0: -1, 16: 0}max(3, 3-0)=3
4't'NoNo change1000016{0: -1, 16: 0}max(3, 4-0)=4
5'm'NoNo change1000016{0: -1, 16: 0}max(4, 5-0)=5
6'i'YesToggle bit 8110000272{0: -1, 16: 0, 272: 6}5
7'n'NoNo change110000272{0: -1, 16: 0, 272: 6}5
8'i'YesToggle bit 81000016{0: -1, 16: 0, 272: 6}max(5, 8-0)=8
9'c'NoNo change1000016{0: -1, 16: 0, 272: 6}max(8, 9-0)=9
10'o'YesToggle bit 1410000001000016400{0: -1, 16: 0, 272: 6, 16400: 10}9
11'w'NoNo change10000001000016400{0: -1, 16: 0, 272: 6, 16400: 10}9
12'o'YesToggle bit 141000016{0: -1, 16: 0, 272: 6, 16400: 10}max(9, 12-0)=12
13'r'NoNo change1000016{0: -1, 16: 0, 272: 6, 16400: 10}max(12, 13-0)=13
14'o'YesToggle bit 1410000001000016400{0: -1, 16: 0, 272: 6, 16400: 10}max(13, 14-10)=13
15'e'YesToggle bit 410000000000016384{0: -1, 16: 0, 272: 6, 16400: 10, 16384: 15}13
16'p'NoNo change10000000000016384{0: -1, 16: 0, 272: 6, 16400: 10, 16384: 15}13

Key observations:

  • At index 2, mask returns to 0 (same as at index -1), so substring "ele" has even vowels (e:2)
  • At index 8, mask returns to 16 (same as at index 0), so substring "leetmini" has even vowels (e:2, i:2)
  • At index 12, mask returns to 16 (same as at index 0), so substring "leetminicowo" has even vowels (e:2, i:2, o:2)
  • The final answer is 13, corresponding to substring "leetminicowor" from index 1 to 13

The algorithm efficiently finds that the longest valid substring has length 13 by tracking when the mask returns to a previously seen state.

Solution Implementation

1class Solution:
2    def findTheLongestSubstring(self, s: str) -> int:
3        # Dictionary to store the first occurrence index of each bitmask state
4        # Initialize with mask 0 at index -1 (before string starts)
5        first_occurrence = {0: -1}
6      
7        # Initialize result and current bitmask
8        max_length = 0
9        current_mask = 0
10      
11        # Iterate through each character with its index
12        for index, char in enumerate(s):
13            # If character is a vowel, toggle its corresponding bit
14            if char in "aeiou":
15                # Calculate bit position based on character's ASCII value
16                # 'a' -> bit 0, 'e' -> bit 4, 'i' -> bit 8, 'o' -> bit 14, 'u' -> bit 20
17                bit_position = ord(char) - ord("a")
18                current_mask ^= (1 << bit_position)
19          
20            # Check if we've seen this mask state before
21            if current_mask in first_occurrence:
22                # If yes, calculate the length of substring between first occurrence and current position
23                previous_index = first_occurrence[current_mask]
24                substring_length = index - previous_index
25                max_length = max(max_length, substring_length)
26            else:
27                # If not, record the first occurrence of this mask state
28                first_occurrence[current_mask] = index
29      
30        return max_length
31
1class Solution {
2    public int findTheLongestSubstring(String s) {
3        // Define vowels for checking
4        String vowels = "aeiou";
5      
6        // Array to store the first occurrence index of each bitmask state
7        // Size 32 because we have 5 vowels, so 2^5 = 32 possible states
8        int[] firstOccurrence = new int[32];
9      
10        // Initialize all positions with a large value (represents not seen yet)
11        Arrays.fill(firstOccurrence, Integer.MAX_VALUE);
12      
13        // Empty prefix has bitmask 0 at position 0
14        firstOccurrence[0] = 0;
15      
16        // Track the maximum length found and current bitmask state
17        int maxLength = 0;
18        int currentMask = 0;
19      
20        // Iterate through each character in the string (1-indexed for position)
21        for (int position = 1; position <= s.length(); position++) {
22            char currentChar = s.charAt(position - 1);
23          
24            // Check if current character is a vowel and update mask
25            for (int vowelIndex = 0; vowelIndex < 5; vowelIndex++) {
26                if (currentChar == vowels.charAt(vowelIndex)) {
27                    // Toggle the bit corresponding to this vowel
28                    // Bit represents odd (1) or even (0) count of this vowel
29                    currentMask ^= (1 << vowelIndex);
30                    break;
31                }
32            }
33          
34            // Update maximum length if we've seen this mask before
35            // The substring between two same masks has even counts of all vowels
36            maxLength = Math.max(maxLength, position - firstOccurrence[currentMask]);
37          
38            // Record the first occurrence of this mask state
39            firstOccurrence[currentMask] = Math.min(firstOccurrence[currentMask], position);
40        }
41      
42        return maxLength;
43    }
44}
45
1class Solution {
2public:
3    int findTheLongestSubstring(string s) {
4        // Define vowels for checking
5        string vowels = "aeiou";
6      
7        // Array to store the earliest index where each bitmask state appears
8        // Size 32 because we have 5 vowels, so 2^5 = 32 possible states
9        // Initialize with INT_MAX to indicate unvisited states
10        vector<int> earliestIndex(32, INT_MAX);
11      
12        // Empty prefix (before any character) has all vowels with even count (mask = 0)
13        earliestIndex[0] = 0;
14      
15        // Initialize result and current bitmask
16        int maxLength = 0;
17        int currentMask = 0;
18      
19        // Iterate through each character position (1-indexed for calculation convenience)
20        for (int i = 1; i <= s.length(); ++i) {
21            char currentChar = s[i - 1];
22          
23            // Check if current character is a vowel and update mask accordingly
24            for (int j = 0; j < 5; ++j) {
25                if (currentChar == vowels[j]) {
26                    // Toggle the bit corresponding to this vowel
27                    // Bit j represents the parity of vowel at index j in "aeiou"
28                    currentMask ^= (1 << j);
29                    break;
30                }
31            }
32          
33            // If we've seen this mask before, we can form a valid substring
34            // from the previous occurrence to current position
35            if (earliestIndex[currentMask] != INT_MAX) {
36                maxLength = max(maxLength, i - earliestIndex[currentMask]);
37            }
38          
39            // Update the earliest occurrence of this mask
40            earliestIndex[currentMask] = min(earliestIndex[currentMask], i);
41        }
42      
43        return maxLength;
44    }
45};
46
1/**
2 * Finds the length of the longest substring containing an even count of each vowel.
3 * Uses bit masking to track the parity (odd/even) of vowel counts.
4 * 
5 * @param s - The input string to process
6 * @returns The length of the longest valid substring
7 */
8function findTheLongestSubstring(s: string): number {
9    // Define the vowels to track
10    const VOWELS = 'aeiou';
11  
12    // Initialize array to store first occurrence of each bitmask state
13    // Size 32 because we have 5 vowels, so 2^5 = 32 possible states
14    // Initialize with large values except for state 0 (no vowels seen)
15    const firstOccurrence: number[] = Array(32).fill(1 << 29);
16    firstOccurrence[0] = 0;
17  
18    // Track the maximum substring length and current bitmask state
19    let maxLength = 0;
20    let currentMask = 0;
21  
22    // Iterate through each character position (1-indexed for calculation)
23    for (let position = 1; position <= s.length; position++) {
24        const currentChar = s[position - 1];
25      
26        // Check if current character is a vowel and update mask accordingly
27        for (let vowelIndex = 0; vowelIndex < 5; vowelIndex++) {
28            if (currentChar === VOWELS[vowelIndex]) {
29                // Toggle the bit corresponding to this vowel
30                currentMask ^= 1 << vowelIndex;
31                break;
32            }
33        }
34      
35        // Calculate the length of substring from first occurrence of this mask state
36        maxLength = Math.max(maxLength, position - firstOccurrence[currentMask]);
37      
38        // Update the first occurrence of this mask state if earlier
39        firstOccurrence[currentMask] = Math.min(firstOccurrence[currentMask], position);
40    }
41  
42    return maxLength;
43}
44

Time and Space Complexity

The time complexity is O(n), where n is the length of the string s. The algorithm iterates through the string exactly once using a single for loop. Within each iteration, all operations are constant time: checking if a character is a vowel, XOR operation, dictionary lookup, dictionary insertion, and max comparison all take O(1) time.

The space complexity is O(1). Although the code uses a dictionary d to store mask-to-index mappings, the key insight is that the mask represents the parity state of vowels using bits. Since there are only 5 vowels (a, e, i, o, u), the mask can have at most 2^5 = 32 different values. Therefore, the dictionary d can contain at most 32 entries regardless of the input string length, making the space usage constant.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Forgetting to Initialize the Hash Table with {0: -1}

The Pitfall: Many developers initialize an empty hash table without the base case {0: -1}. This causes the algorithm to miss valid substrings that start from index 0.

Why it happens:

  • When the substring from index 0 to some index i has all vowels appearing an even number of times, the mask at position i will be 0
  • Without {0: -1} in the hash table, we cannot calculate the correct length for these substrings
  • The substring length should be i - (-1) = i + 1, but without this initialization, we'd miss this case entirely

Example scenario: For string "eleetminicoworoep", if the first 5 characters "eleet" result in mask 0 (all vowels even), we need the -1 reference point to calculate length as 4 - (-1) = 5.

Solution: Always initialize the hash table with:

first_occurrence = {0: -1}

2. Incorrectly Updating the Hash Table for Repeated Mask Values

The Pitfall: Updating the hash table every time we encounter a mask value, even if we've seen it before:

# WRONG approach
if current_mask in first_occurrence:
    max_length = max(max_length, index - first_occurrence[current_mask])
first_occurrence[current_mask] = index  # This overwrites the first occurrence!

Why it's wrong:

  • To maximize substring length, we need the earliest occurrence of each mask
  • Overwriting with later occurrences reduces the potential substring length
  • Future encounters of the same mask will calculate shorter lengths

Solution: Only store the first occurrence of each mask:

if current_mask in first_occurrence:
    max_length = max(max_length, index - first_occurrence[current_mask])
else:
    # Only store if it's the first time seeing this mask
    first_occurrence[current_mask] = index

3. Using Incorrect Bit Positions for Vowels

The Pitfall: Assigning consecutive bit positions (0, 1, 2, 3, 4) to vowels instead of using their relative positions in the alphabet:

# WRONG approach
vowel_to_bit = {'a': 0, 'e': 1, 'i': 2, 'o': 3, 'u': 4}
current_mask ^= (1 << vowel_to_bit[char])

Why it might seem correct: This approach still works functionally, but it doesn't match the problem's described implementation that uses ord(c) - ord("a").

The issue: While both approaches are valid, using the alphabet-based positioning:

  • Makes the code more elegant and eliminates the need for a mapping dictionary
  • Follows the standard implementation pattern
  • Results in a sparse bitmask (bits 0, 4, 8, 14, 20) but this doesn't affect correctness

Solution: Stick to the alphabet-based bit positioning for consistency:

if char in "aeiou":
    bit_position = ord(char) - ord("a")
    current_mask ^= (1 << bit_position)
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which algorithm is best for finding the shortest distance between two points in an unweighted graph?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More