2564. Substring XOR Queries
Problem Description
You are given a binary string s
(containing only '0' and '1' characters) and a 2D integer array queries
where each query is represented as queries[i] = [firsti, secondi]
.
For each query, you need to find the shortest substring of s
that satisfies a specific XOR condition:
- Take a substring from
s
and convert it to its decimal valueval
- This value should satisfy:
val XOR firsti = secondi
- In other words, when you XOR the decimal value of the substring with
firsti
, it should equalsecondi
For each query, you need to return the starting and ending indices (0-indexed) of the shortest valid substring as [lefti, righti]
. If no such substring exists, return [-1, -1]
. When multiple substrings of the same length satisfy the condition, choose the one that appears earliest in the string (minimum left index).
The solution works by preprocessing all possible substrings of length 1 to 32 bits (since we're dealing with binary to decimal conversion and typical integer ranges). For each starting position i
in the string:
- It builds decimal values by reading consecutive bits
- It stores the first occurrence of each decimal value with its indices in a hash table
d
- For value 0, it stops early since adding more leading zeros won't change the value
When processing queries, for each [first, second]
pair:
- It calculates the target value as
first XOR second
(because ifval XOR first = second
, thenval = first XOR second
) - It looks up this target value in the preprocessed hash table
- It returns the stored indices or
[-1, -1]
if not found
Intuition
The key insight comes from understanding the XOR operation's property: if val XOR first = second
, then we can rearrange this to get val = first XOR second
. This is because XOR is its own inverse operation - applying XOR twice with the same value returns the original value.
This means for each query [first, second]
, we don't need to check every substring and test if it satisfies the condition. Instead, we can directly calculate what value we're looking for: target = first XOR second
. Now the problem reduces to finding the shortest substring in s
whose decimal value equals this target.
Since we need to answer multiple queries efficiently, it makes sense to preprocess the string once and store all possible substring values. But how many substrings should we consider?
The constraint comes from practical considerations - we're dealing with integers, and typically integers are represented in 32 bits or less. So any substring longer than 32 characters would represent a number too large for standard integer operations. This limits our search space significantly.
For the preprocessing strategy, we iterate through each starting position and build decimal values incrementally by reading one bit at a time. This is efficient because we can build the decimal value on the fly: x = x << 1 | int(s[i + j])
shifts the current value left by one bit and adds the new bit.
We use a hash table to store the first occurrence of each value because we want the minimum left index when multiple substrings have the same value. The special case for x == 0
is handled by breaking early - once we have a '0', adding more leading zeros won't create a different value, so there's no point continuing.
This preprocessing approach transforms a potentially expensive substring search problem into a simple hash table lookup for each query, making the solution efficient even for multiple queries.
Solution Approach
The solution uses a preprocessing and hash table lookup approach to efficiently handle multiple queries.
Step 1: Preprocessing Phase
We create a hash table d
to store the first occurrence of each decimal value found in the string. For each starting position i
in the string:
- Initialize a variable
x = 0
to build the decimal value - Iterate up to 32 positions from the current starting point (or until the string ends)
- For each position
j
from the starting point:- Update the decimal value:
x = x << 1 | int(s[i + j])
- This operation shifts
x
left by 1 bit and adds the current bit - Example: if
x = 5
(binary: 101) and next bit is '1', thenx
becomes11
(binary: 1011)
- Update the decimal value:
- Store the indices only if this value hasn't been seen before:
if x not in d: d[x] = [i, i + j]
- Special optimization: if
x == 0
, break early since adding leading zeros won't create new values
Step 2: Query Processing
For each query [first, second]
:
- Calculate the target value:
target = first XOR second
- This uses the XOR property: if
val XOR first = second
, thenval = first XOR second
- This uses the XOR property: if
- Look up the target value in the hash table:
d.get(first ^ second, [-1, -1])
- Return the stored indices if found, otherwise return
[-1, -1]
Time Complexity Analysis:
- Preprocessing:
O(n * 32)
wheren
is the length of strings
- For each of
n
positions, we check up to 32 substrings
- For each of
- Query processing:
O(q)
whereq
is the number of queries- Each query is just a hash table lookup, which is
O(1)
on average
- Each query is just a hash table lookup, which is
- Total:
O(n * 32 + q)
Space Complexity:
- The hash table can store at most
O(n * 32)
entries in the worst case - Each entry stores a pair of indices
The algorithm efficiently handles the constraint that we're looking for the shortest substring and the one with minimum left index by:
- Building substrings from shortest to longest (incrementally adding bits)
- Only storing the first occurrence of each value in the hash table
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example with s = "101" and queries = [[4, 3], [1, 2]]
.
Step 1: Preprocessing Phase
We'll build a hash table d
by examining all substrings up to length 32 (or until string ends).
Starting at index 0 (s[0] = '1'):
- j=0: substring "1" → decimal value = 1, store d[1] = [0, 0]
- j=1: substring "10" → decimal value = 2 (binary: 10), store d[2] = [0, 1]
- j=2: substring "101" → decimal value = 5 (binary: 101), store d[5] = [0, 2]
Starting at index 1 (s[1] = '0'):
- j=0: substring "0" → decimal value = 0, store d[0] = [1, 1]
- Since x = 0, we break (adding leading zeros won't create new values)
Starting at index 2 (s[2] = '1'):
- j=0: substring "1" → decimal value = 1
- Value 1 already exists in d, so we don't update it (keeping the earliest occurrence)
Final hash table: d = {1: [0, 0], 2: [0, 1], 5: [0, 2], 0: [1, 1]}
Step 2: Query Processing
Query 1: [4, 3]
- Calculate target: 4 XOR 3 = 7
- Look up 7 in hash table: not found
- Return [-1, -1]
Query 2: [1, 2]
- Calculate target: 1 XOR 2 = 3
- Look up 3 in hash table: not found
- Return [-1, -1]
Let's try another example where we find matches. Consider s = "111"
and queries = [[5, 2]]
:
Preprocessing:
- Index 0: "1" → 1, "11" → 3, "111" → 7
- Index 1: "1" → 1 (skip, already exists), "11" → 3 (skip)
- Index 2: "1" → 1 (skip)
- Hash table:
d = {1: [0, 0], 3: [0, 1], 7: [0, 2]}
Query [5, 2]:
- Target: 5 XOR 2 = 7
- Look up 7: found at d[7] = [0, 2]
- Return [0, 2] (substring "111" has decimal value 7, and 7 XOR 5 = 2 ✓)
Solution Implementation
1from typing import List
2
3class Solution:
4 def substringXorQueries(self, s: str, queries: List[List[int]]) -> List[List[int]]:
5 """
6 Find the leftmost substring in binary string s that equals first XOR second for each query.
7
8 Args:
9 s: Binary string containing only '0' and '1'
10 queries: List of [first, second] pairs where we need to find substring equal to first XOR second
11
12 Returns:
13 List of [left, right] indices for each query, or [-1, -1] if not found
14 """
15 # Dictionary to store the leftmost occurrence of each binary value
16 # Key: decimal value, Value: [start_index, end_index] of substring
17 value_to_indices = {}
18 string_length = len(s)
19
20 # Generate all possible substrings up to 32 bits (sufficient for 32-bit integers)
21 for start_idx in range(string_length):
22 current_value = 0
23
24 # Build binary values starting from position start_idx
25 # Maximum length is 32 bits (covers all possible XOR results of two 32-bit integers)
26 for offset in range(32):
27 # Check if we've reached the end of the string
28 if start_idx + offset >= string_length:
29 break
30
31 # Shift current value left and add the next bit
32 # This builds the decimal representation of the binary substring
33 current_value = (current_value << 1) | int(s[start_idx + offset])
34
35 # Store the first occurrence of this value (leftmost substring)
36 if current_value not in value_to_indices:
37 value_to_indices[current_value] = [start_idx, start_idx + offset]
38
39 # Optimization: if current value is 0, no need to continue extending
40 # (leading zeros don't change the value)
41 if current_value == 0:
42 break
43
44 # Process each query and find the corresponding substring
45 # For each [first, second] pair, we need substring with value = first XOR second
46 result = []
47 for first, second in queries:
48 target_value = first ^ second
49 # Get indices if target value exists, otherwise return [-1, -1]
50 result.append(value_to_indices.get(target_value, [-1, -1]))
51
52 return result
53
1class Solution {
2 public int[][] substringXorQueries(String s, int[][] queries) {
3 // Map to store the first occurrence of each binary number value
4 // Key: decimal value, Value: [startIndex, endIndex] of substring
5 Map<Integer, int[]> valueToIndicesMap = new HashMap<>();
6 int stringLength = s.length();
7
8 // Iterate through all possible starting positions in the binary string
9 for (int startIndex = 0; startIndex < stringLength; ++startIndex) {
10 int currentValue = 0;
11
12 // Build binary numbers starting from startIndex
13 // Limit to 32 bits (maximum integer size) or string boundary
14 for (int length = 0; length < 32 && startIndex + length < stringLength; ++length) {
15 // Shift left and add the current bit to build the decimal value
16 currentValue = currentValue << 1 | (s.charAt(startIndex + length) - '0');
17
18 // Store the first occurrence of this value with its indices
19 valueToIndicesMap.putIfAbsent(currentValue, new int[] {startIndex, startIndex + length});
20
21 // Optimization: if we encounter 0, no need to continue
22 // as leading zeros don't change the value
23 if (currentValue == 0) {
24 break;
25 }
26 }
27 }
28
29 int queryCount = queries.length;
30 int[][] result = new int[queryCount][2];
31
32 // Process each query to find the substring indices
33 for (int i = 0; i < queryCount; ++i) {
34 int firstValue = queries[i][0];
35 int secondValue = queries[i][1];
36
37 // Calculate the target value using XOR operation
38 int targetValue = firstValue ^ secondValue;
39
40 // Retrieve the indices for the target value, or [-1, -1] if not found
41 result[i] = valueToIndicesMap.getOrDefault(targetValue, new int[] {-1, -1});
42 }
43
44 return result;
45 }
46}
47
1class Solution {
2public:
3 vector<vector<int>> substringXorQueries(string s, vector<vector<int>>& queries) {
4 // Map to store the first occurrence of each value with its [start, end] indices
5 unordered_map<int, vector<int>> valueToIndices;
6 int stringLength = s.size();
7
8 // Iterate through all possible starting positions in the string
9 for (int startPos = 0; startPos < stringLength; ++startPos) {
10 int currentValue = 0;
11
12 // Build binary numbers starting from startPos, up to 32 bits or string end
13 for (int length = 0; length < 32 && startPos + length < stringLength; ++length) {
14 // Shift left and add the current bit
15 currentValue = (currentValue << 1) | (s[startPos + length] - '0');
16
17 // Store the first occurrence of this value
18 if (valueToIndices.find(currentValue) == valueToIndices.end()) {
19 valueToIndices[currentValue] = {startPos, startPos + length};
20 }
21
22 // Optimization: If we encounter 0, no need to continue
23 // (leading zeros don't change the value)
24 if (currentValue == 0) {
25 break;
26 }
27 }
28 }
29
30 // Process queries and build result
31 vector<vector<int>> result;
32 for (const auto& query : queries) {
33 int firstValue = query[0];
34 int secondValue = query[1];
35
36 // Calculate the target value using XOR
37 int targetValue = firstValue ^ secondValue;
38
39 // Check if the target value exists in our map
40 if (valueToIndices.find(targetValue) != valueToIndices.end()) {
41 result.push_back(valueToIndices[targetValue]);
42 } else {
43 // Target value not found, return [-1, -1]
44 result.push_back({-1, -1});
45 }
46 }
47
48 return result;
49 }
50};
51
1function substringXorQueries(s: string, queries: number[][]): number[][] {
2 // Map to store the first occurrence of each value with its [start, end] indices
3 const valueToIndices: Map<number, number[]> = new Map();
4 const stringLength: number = s.length;
5
6 // Iterate through all possible starting positions in the string
7 for (let startPos = 0; startPos < stringLength; startPos++) {
8 let currentValue = 0;
9
10 // Build binary numbers starting from startPos, up to 32 bits or string end
11 for (let length = 0; length < 32 && startPos + length < stringLength; length++) {
12 // Shift left and add the current bit
13 currentValue = (currentValue << 1) | (parseInt(s[startPos + length]));
14
15 // Store the first occurrence of this value
16 if (!valueToIndices.has(currentValue)) {
17 valueToIndices.set(currentValue, [startPos, startPos + length]);
18 }
19
20 // Optimization: If we encounter 0, no need to continue
21 // (leading zeros don't change the value)
22 if (currentValue === 0) {
23 break;
24 }
25 }
26 }
27
28 // Process queries and build result
29 const result: number[][] = [];
30 for (const query of queries) {
31 const firstValue: number = query[0];
32 const secondValue: number = query[1];
33
34 // Calculate the target value using XOR
35 const targetValue: number = firstValue ^ secondValue;
36
37 // Check if the target value exists in our map
38 if (valueToIndices.has(targetValue)) {
39 result.push(valueToIndices.get(targetValue)!);
40 } else {
41 // Target value not found, return [-1, -1]
42 result.push([-1, -1]);
43 }
44 }
45
46 return result;
47}
48
Time and Space Complexity
Time Complexity: O(n × log M + m)
The algorithm iterates through each position i
in string s
(from 0 to n-1). For each starting position, it builds substrings by examining up to 32 consecutive characters (since we're looking for values that fit in 32 bits). This gives us O(n × 32)
= O(n × log M)
where M = 2^31 - 1
represents the maximum integer value, and log M ≈ 31
bits. The inner loop runs at most 32 times because:
- We limit
j
to range(32) - We break early if we exceed string bounds or encounter a zero value
After building the dictionary, we process m
queries, each taking O(1)
time for dictionary lookup and XOR operation, contributing O(m)
to the total complexity.
Space Complexity: O(n × log M)
The dictionary d
stores at most O(n × 32)
= O(n × log M)
entries because:
- Each starting position
i
can generate up to 32 different substrings (binary numbers) - We store the first occurrence of each unique value
- The maximum number of unique values is bounded by the number of possible substrings we examine
Each dictionary entry stores a pair of indices [i, i + j]
, which takes O(1)
space. Therefore, the total space complexity is O(n × log M)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Not Handling Leading Zeros Correctly
The Pitfall: A common mistake is not properly handling binary strings with leading zeros. For example, the binary strings "0001", "001", "01", and "1" all represent the decimal value 1. If we don't handle this correctly, we might:
- Store multiple entries for the same decimal value
- Miss the shortest substring because we stored a longer one first
- Continue processing unnecessarily after encountering a zero value
Example of Incorrect Approach:
# INCORRECT: This doesn't handle leading zeros properly
for start_idx in range(string_length):
current_value = 0
for offset in range(32):
if start_idx + offset >= string_length:
break
current_value = (current_value << 1) | int(s[start_idx + offset])
# Problem: This might overwrite a shorter substring with a longer one
value_to_indices[current_value] = [start_idx, start_idx + offset]
The Solution: The correct code addresses this by:
- Only storing the first occurrence of each value using
if current_value not in value_to_indices
- Breaking early when
current_value == 0
to avoid redundant processing of leading zeros - Building substrings from each position incrementally (shortest to longest), ensuring we capture the shortest valid substring first
2. Integer Overflow or Incorrect Bit Limit
The Pitfall: Using too many or too few bits when building decimal values. Processing more than 32 bits could lead to:
- Integer overflow in languages with fixed-size integers
- Unnecessary computation for values that can't be represented in standard integer ranges
- Missing valid substrings if the limit is too small
Example of Potential Issue:
# RISKY: Processing too many bits might cause issues
for offset in range(len(s)): # Processing entire remaining string
current_value = (current_value << 1) | int(s[start_idx + offset])
# This could create values beyond typical integer ranges
The Solution: The code correctly limits processing to 32 bits, which is sufficient for:
- All possible XOR results of two 32-bit integers
- Avoiding overflow issues in most programming languages
- Maintaining reasonable time complexity
3. Incorrect XOR Property Application
The Pitfall: Misunderstanding the XOR relationship and searching for the wrong value. Some might incorrectly try to:
- Search for
second
directly - Calculate
first XOR val
for each substring and compare withsecond
- Use incorrect XOR property inversions
Example of Incorrect Logic:
# INCORRECT: Wrong understanding of XOR relationship for first, second in queries: # Wrong: Looking for second directly result.append(value_to_indices.get(second, [-1, -1])) # Or wrong: Looking for first XOR with something else result.append(value_to_indices.get(first, [-1, -1]))
The Solution: The code correctly uses the XOR property:
- If
val XOR first = second
, thenval = first XOR second
- This is because XOR is its own inverse:
(a XOR b) XOR b = a
- Therefore, we search for
target_value = first ^ second
What are the two properties the problem needs to have for dynamic programming to be applicable? (Select 2)
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!