Facebook Pixel

2564. Substring XOR Queries

MediumBit ManipulationArrayHash TableString
Leetcode Link

Problem Description

You are given a binary string s (containing only '0' and '1' characters) and a 2D integer array queries where each query is represented as queries[i] = [firsti, secondi].

For each query, you need to find the shortest substring of s that satisfies a specific XOR condition:

  • Take a substring from s and convert it to its decimal value val
  • This value should satisfy: val XOR firsti = secondi
  • In other words, when you XOR the decimal value of the substring with firsti, it should equal secondi

For each query, you need to return the starting and ending indices (0-indexed) of the shortest valid substring as [lefti, righti]. If no such substring exists, return [-1, -1]. When multiple substrings of the same length satisfy the condition, choose the one that appears earliest in the string (minimum left index).

The solution works by preprocessing all possible substrings of length 1 to 32 bits (since we're dealing with binary to decimal conversion and typical integer ranges). For each starting position i in the string:

  • It builds decimal values by reading consecutive bits
  • It stores the first occurrence of each decimal value with its indices in a hash table d
  • For value 0, it stops early since adding more leading zeros won't change the value

When processing queries, for each [first, second] pair:

  • It calculates the target value as first XOR second (because if val XOR first = second, then val = first XOR second)
  • It looks up this target value in the preprocessed hash table
  • It returns the stored indices or [-1, -1] if not found
Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight comes from understanding the XOR operation's property: if val XOR first = second, then we can rearrange this to get val = first XOR second. This is because XOR is its own inverse operation - applying XOR twice with the same value returns the original value.

This means for each query [first, second], we don't need to check every substring and test if it satisfies the condition. Instead, we can directly calculate what value we're looking for: target = first XOR second. Now the problem reduces to finding the shortest substring in s whose decimal value equals this target.

Since we need to answer multiple queries efficiently, it makes sense to preprocess the string once and store all possible substring values. But how many substrings should we consider?

The constraint comes from practical considerations - we're dealing with integers, and typically integers are represented in 32 bits or less. So any substring longer than 32 characters would represent a number too large for standard integer operations. This limits our search space significantly.

For the preprocessing strategy, we iterate through each starting position and build decimal values incrementally by reading one bit at a time. This is efficient because we can build the decimal value on the fly: x = x << 1 | int(s[i + j]) shifts the current value left by one bit and adds the new bit.

We use a hash table to store the first occurrence of each value because we want the minimum left index when multiple substrings have the same value. The special case for x == 0 is handled by breaking early - once we have a '0', adding more leading zeros won't create a different value, so there's no point continuing.

This preprocessing approach transforms a potentially expensive substring search problem into a simple hash table lookup for each query, making the solution efficient even for multiple queries.

Solution Approach

The solution uses a preprocessing and hash table lookup approach to efficiently handle multiple queries.

Step 1: Preprocessing Phase

We create a hash table d to store the first occurrence of each decimal value found in the string. For each starting position i in the string:

  1. Initialize a variable x = 0 to build the decimal value
  2. Iterate up to 32 positions from the current starting point (or until the string ends)
  3. For each position j from the starting point:
    • Update the decimal value: x = x << 1 | int(s[i + j])
    • This operation shifts x left by 1 bit and adds the current bit
    • Example: if x = 5 (binary: 101) and next bit is '1', then x becomes 11 (binary: 1011)
  4. Store the indices only if this value hasn't been seen before: if x not in d: d[x] = [i, i + j]
  5. Special optimization: if x == 0, break early since adding leading zeros won't create new values

Step 2: Query Processing

For each query [first, second]:

  1. Calculate the target value: target = first XOR second
    • This uses the XOR property: if val XOR first = second, then val = first XOR second
  2. Look up the target value in the hash table: d.get(first ^ second, [-1, -1])
  3. Return the stored indices if found, otherwise return [-1, -1]

Time Complexity Analysis:

  • Preprocessing: O(n * 32) where n is the length of string s
    • For each of n positions, we check up to 32 substrings
  • Query processing: O(q) where q is the number of queries
    • Each query is just a hash table lookup, which is O(1) on average
  • Total: O(n * 32 + q)

Space Complexity:

  • The hash table can store at most O(n * 32) entries in the worst case
  • Each entry stores a pair of indices

The algorithm efficiently handles the constraint that we're looking for the shortest substring and the one with minimum left index by:

  1. Building substrings from shortest to longest (incrementally adding bits)
  2. Only storing the first occurrence of each value in the hash table

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with s = "101" and queries = [[4, 3], [1, 2]].

Step 1: Preprocessing Phase

We'll build a hash table d by examining all substrings up to length 32 (or until string ends).

Starting at index 0 (s[0] = '1'):

  • j=0: substring "1" → decimal value = 1, store d[1] = [0, 0]
  • j=1: substring "10" → decimal value = 2 (binary: 10), store d[2] = [0, 1]
  • j=2: substring "101" → decimal value = 5 (binary: 101), store d[5] = [0, 2]

Starting at index 1 (s[1] = '0'):

  • j=0: substring "0" → decimal value = 0, store d[0] = [1, 1]
  • Since x = 0, we break (adding leading zeros won't create new values)

Starting at index 2 (s[2] = '1'):

  • j=0: substring "1" → decimal value = 1
  • Value 1 already exists in d, so we don't update it (keeping the earliest occurrence)

Final hash table: d = {1: [0, 0], 2: [0, 1], 5: [0, 2], 0: [1, 1]}

Step 2: Query Processing

Query 1: [4, 3]

  • Calculate target: 4 XOR 3 = 7
  • Look up 7 in hash table: not found
  • Return [-1, -1]

Query 2: [1, 2]

  • Calculate target: 1 XOR 2 = 3
  • Look up 3 in hash table: not found
  • Return [-1, -1]

Let's try another example where we find matches. Consider s = "111" and queries = [[5, 2]]:

Preprocessing:

  • Index 0: "1" → 1, "11" → 3, "111" → 7
  • Index 1: "1" → 1 (skip, already exists), "11" → 3 (skip)
  • Index 2: "1" → 1 (skip)
  • Hash table: d = {1: [0, 0], 3: [0, 1], 7: [0, 2]}

Query [5, 2]:

  • Target: 5 XOR 2 = 7
  • Look up 7: found at d[7] = [0, 2]
  • Return [0, 2] (substring "111" has decimal value 7, and 7 XOR 5 = 2 ✓)

Solution Implementation

1from typing import List
2
3class Solution:
4    def substringXorQueries(self, s: str, queries: List[List[int]]) -> List[List[int]]:
5        """
6        Find the leftmost substring in binary string s that equals first XOR second for each query.
7      
8        Args:
9            s: Binary string containing only '0' and '1'
10            queries: List of [first, second] pairs where we need to find substring equal to first XOR second
11          
12        Returns:
13            List of [left, right] indices for each query, or [-1, -1] if not found
14        """
15        # Dictionary to store the leftmost occurrence of each binary value
16        # Key: decimal value, Value: [start_index, end_index] of substring
17        value_to_indices = {}
18        string_length = len(s)
19      
20        # Generate all possible substrings up to 32 bits (sufficient for 32-bit integers)
21        for start_idx in range(string_length):
22            current_value = 0
23          
24            # Build binary values starting from position start_idx
25            # Maximum length is 32 bits (covers all possible XOR results of two 32-bit integers)
26            for offset in range(32):
27                # Check if we've reached the end of the string
28                if start_idx + offset >= string_length:
29                    break
30              
31                # Shift current value left and add the next bit
32                # This builds the decimal representation of the binary substring
33                current_value = (current_value << 1) | int(s[start_idx + offset])
34              
35                # Store the first occurrence of this value (leftmost substring)
36                if current_value not in value_to_indices:
37                    value_to_indices[current_value] = [start_idx, start_idx + offset]
38              
39                # Optimization: if current value is 0, no need to continue extending
40                # (leading zeros don't change the value)
41                if current_value == 0:
42                    break
43      
44        # Process each query and find the corresponding substring
45        # For each [first, second] pair, we need substring with value = first XOR second
46        result = []
47        for first, second in queries:
48            target_value = first ^ second
49            # Get indices if target value exists, otherwise return [-1, -1]
50            result.append(value_to_indices.get(target_value, [-1, -1]))
51      
52        return result
53
1class Solution {
2    public int[][] substringXorQueries(String s, int[][] queries) {
3        // Map to store the first occurrence of each binary number value
4        // Key: decimal value, Value: [startIndex, endIndex] of substring
5        Map<Integer, int[]> valueToIndicesMap = new HashMap<>();
6        int stringLength = s.length();
7      
8        // Iterate through all possible starting positions in the binary string
9        for (int startIndex = 0; startIndex < stringLength; ++startIndex) {
10            int currentValue = 0;
11          
12            // Build binary numbers starting from startIndex
13            // Limit to 32 bits (maximum integer size) or string boundary
14            for (int length = 0; length < 32 && startIndex + length < stringLength; ++length) {
15                // Shift left and add the current bit to build the decimal value
16                currentValue = currentValue << 1 | (s.charAt(startIndex + length) - '0');
17              
18                // Store the first occurrence of this value with its indices
19                valueToIndicesMap.putIfAbsent(currentValue, new int[] {startIndex, startIndex + length});
20              
21                // Optimization: if we encounter 0, no need to continue
22                // as leading zeros don't change the value
23                if (currentValue == 0) {
24                    break;
25                }
26            }
27        }
28      
29        int queryCount = queries.length;
30        int[][] result = new int[queryCount][2];
31      
32        // Process each query to find the substring indices
33        for (int i = 0; i < queryCount; ++i) {
34            int firstValue = queries[i][0];
35            int secondValue = queries[i][1];
36          
37            // Calculate the target value using XOR operation
38            int targetValue = firstValue ^ secondValue;
39          
40            // Retrieve the indices for the target value, or [-1, -1] if not found
41            result[i] = valueToIndicesMap.getOrDefault(targetValue, new int[] {-1, -1});
42        }
43      
44        return result;
45    }
46}
47
1class Solution {
2public:
3    vector<vector<int>> substringXorQueries(string s, vector<vector<int>>& queries) {
4        // Map to store the first occurrence of each value with its [start, end] indices
5        unordered_map<int, vector<int>> valueToIndices;
6        int stringLength = s.size();
7      
8        // Iterate through all possible starting positions in the string
9        for (int startPos = 0; startPos < stringLength; ++startPos) {
10            int currentValue = 0;
11          
12            // Build binary numbers starting from startPos, up to 32 bits or string end
13            for (int length = 0; length < 32 && startPos + length < stringLength; ++length) {
14                // Shift left and add the current bit
15                currentValue = (currentValue << 1) | (s[startPos + length] - '0');
16              
17                // Store the first occurrence of this value
18                if (valueToIndices.find(currentValue) == valueToIndices.end()) {
19                    valueToIndices[currentValue] = {startPos, startPos + length};
20                }
21              
22                // Optimization: If we encounter 0, no need to continue 
23                // (leading zeros don't change the value)
24                if (currentValue == 0) {
25                    break;
26                }
27            }
28        }
29      
30        // Process queries and build result
31        vector<vector<int>> result;
32        for (const auto& query : queries) {
33            int firstValue = query[0];
34            int secondValue = query[1];
35          
36            // Calculate the target value using XOR
37            int targetValue = firstValue ^ secondValue;
38          
39            // Check if the target value exists in our map
40            if (valueToIndices.find(targetValue) != valueToIndices.end()) {
41                result.push_back(valueToIndices[targetValue]);
42            } else {
43                // Target value not found, return [-1, -1]
44                result.push_back({-1, -1});
45            }
46        }
47      
48        return result;
49    }
50};
51
1function substringXorQueries(s: string, queries: number[][]): number[][] {
2    // Map to store the first occurrence of each value with its [start, end] indices
3    const valueToIndices: Map<number, number[]> = new Map();
4    const stringLength: number = s.length;
5  
6    // Iterate through all possible starting positions in the string
7    for (let startPos = 0; startPos < stringLength; startPos++) {
8        let currentValue = 0;
9      
10        // Build binary numbers starting from startPos, up to 32 bits or string end
11        for (let length = 0; length < 32 && startPos + length < stringLength; length++) {
12            // Shift left and add the current bit
13            currentValue = (currentValue << 1) | (parseInt(s[startPos + length]));
14          
15            // Store the first occurrence of this value
16            if (!valueToIndices.has(currentValue)) {
17                valueToIndices.set(currentValue, [startPos, startPos + length]);
18            }
19          
20            // Optimization: If we encounter 0, no need to continue 
21            // (leading zeros don't change the value)
22            if (currentValue === 0) {
23                break;
24            }
25        }
26    }
27  
28    // Process queries and build result
29    const result: number[][] = [];
30    for (const query of queries) {
31        const firstValue: number = query[0];
32        const secondValue: number = query[1];
33      
34        // Calculate the target value using XOR
35        const targetValue: number = firstValue ^ secondValue;
36      
37        // Check if the target value exists in our map
38        if (valueToIndices.has(targetValue)) {
39            result.push(valueToIndices.get(targetValue)!);
40        } else {
41            // Target value not found, return [-1, -1]
42            result.push([-1, -1]);
43        }
44    }
45  
46    return result;
47}
48

Time and Space Complexity

Time Complexity: O(n × log M + m)

The algorithm iterates through each position i in string s (from 0 to n-1). For each starting position, it builds substrings by examining up to 32 consecutive characters (since we're looking for values that fit in 32 bits). This gives us O(n × 32) = O(n × log M) where M = 2^31 - 1 represents the maximum integer value, and log M ≈ 31 bits. The inner loop runs at most 32 times because:

  • We limit j to range(32)
  • We break early if we exceed string bounds or encounter a zero value

After building the dictionary, we process m queries, each taking O(1) time for dictionary lookup and XOR operation, contributing O(m) to the total complexity.

Space Complexity: O(n × log M)

The dictionary d stores at most O(n × 32) = O(n × log M) entries because:

  • Each starting position i can generate up to 32 different substrings (binary numbers)
  • We store the first occurrence of each unique value
  • The maximum number of unique values is bounded by the number of possible substrings we examine

Each dictionary entry stores a pair of indices [i, i + j], which takes O(1) space. Therefore, the total space complexity is O(n × log M).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Not Handling Leading Zeros Correctly

The Pitfall: A common mistake is not properly handling binary strings with leading zeros. For example, the binary strings "0001", "001", "01", and "1" all represent the decimal value 1. If we don't handle this correctly, we might:

  • Store multiple entries for the same decimal value
  • Miss the shortest substring because we stored a longer one first
  • Continue processing unnecessarily after encountering a zero value

Example of Incorrect Approach:

# INCORRECT: This doesn't handle leading zeros properly
for start_idx in range(string_length):
    current_value = 0
    for offset in range(32):
        if start_idx + offset >= string_length:
            break
        current_value = (current_value << 1) | int(s[start_idx + offset])
        # Problem: This might overwrite a shorter substring with a longer one
        value_to_indices[current_value] = [start_idx, start_idx + offset]

The Solution: The correct code addresses this by:

  1. Only storing the first occurrence of each value using if current_value not in value_to_indices
  2. Breaking early when current_value == 0 to avoid redundant processing of leading zeros
  3. Building substrings from each position incrementally (shortest to longest), ensuring we capture the shortest valid substring first

2. Integer Overflow or Incorrect Bit Limit

The Pitfall: Using too many or too few bits when building decimal values. Processing more than 32 bits could lead to:

  • Integer overflow in languages with fixed-size integers
  • Unnecessary computation for values that can't be represented in standard integer ranges
  • Missing valid substrings if the limit is too small

Example of Potential Issue:

# RISKY: Processing too many bits might cause issues
for offset in range(len(s)):  # Processing entire remaining string
    current_value = (current_value << 1) | int(s[start_idx + offset])
    # This could create values beyond typical integer ranges

The Solution: The code correctly limits processing to 32 bits, which is sufficient for:

  • All possible XOR results of two 32-bit integers
  • Avoiding overflow issues in most programming languages
  • Maintaining reasonable time complexity

3. Incorrect XOR Property Application

The Pitfall: Misunderstanding the XOR relationship and searching for the wrong value. Some might incorrectly try to:

  • Search for second directly
  • Calculate first XOR val for each substring and compare with second
  • Use incorrect XOR property inversions

Example of Incorrect Logic:

# INCORRECT: Wrong understanding of XOR relationship
for first, second in queries:
    # Wrong: Looking for second directly
    result.append(value_to_indices.get(second, [-1, -1]))
  
    # Or wrong: Looking for first XOR with something else
    result.append(value_to_indices.get(first, [-1, -1]))

The Solution: The code correctly uses the XOR property:

  • If val XOR first = second, then val = first XOR second
  • This is because XOR is its own inverse: (a XOR b) XOR b = a
  • Therefore, we search for target_value = first ^ second
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

What are the two properties the problem needs to have for dynamic programming to be applicable? (Select 2)


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More