2564. Substring XOR Queries

MediumBit ManipulationArrayHash TableString

Problem Description

In this problem, we are given a binary string s and an array of queries, where each query is represented by a pair of integers [first_i, second_i]. The goal is to find the shortest substring within s that, when converted to a decimal value val and XORed with first_i, yields second_i. In simpler terms, for each query, we want to find a substring of s such that val ^ first_i == second_i.

The result of each query should be an array [left_i, right_i] indicating the 0-indexed start and end positions of the substring within s. If there is more than one possible substring, we select the one with the smallest left_i. If no such substring exists, we return [-1, -1] for that query.

A substring, in this context, is defined as a sequential chunk of characters from the string s without altering the order or skipping any characters in between.

Intuition

The intuition behind the solution is to use a dictionary to keep track of all possible substrings and their decimal values. We iterate over the string s, and at each position, we try to build substrings of different lengths (up to 32-bits, considering the problem constraints) while keeping track of their decimal values.

We use the following steps:

Create an empty dictionary d to store the substrings.
Iterate over the binary string s (using index i for the start of the substring).
Within each iteration of i, we construct substrings and their decimal values by appending one bit at a time (using index j), shifting the previously calculated value to the left, and adding the new bit.
If the decimal value x of the current substring has not been seen before, we store it in the dictionary with its starting and ending indexes [i, i + j].
If the value is zero, we can break the inner loop early since zero XORed with any number gives that number, and we can't find any smaller substring with a value of zero.
Now that we have precomputed the dictionary with substrings and their decimal representations, we can answer each query by looking for the decimal value that, when XORed with first_i, gives second_i. We use XOR in the query because of the property that if a ^ b = c, then a = b ^ c.

For every query [first, second], we perform the XOR operation first ^ second and look up the result in the dictionary. If there's a match, we return the corresponding substring's start and end indexes; otherwise, we return [-1, -1] for that query.

The benefit of this approach is that it avoids recalculating decimal values of the same substrings for different queries, leveraging the properties of XOR and the lookup efficiency of a dictionary.

Solution Approach

The solution approach utilizes a lookup table (Python dictionary) to store the decimal values of all possible substrings and their corresponding starting and ending positions. Here's the step-by-step breakdown of the implemented algorithm:

Initialization: A Python dictionary d is created to serve as the lookup table.
Outer Loop - Iterating over String s:
- An outer for-loop is set up with variable i which iterates from 0 to the length of string s minus 1. Each i represents the starting position of the potential substrings.
Inner Loop - Building the Substrings:
- For each position i, an inner for-loop with variable j is used to try to build substrings of increasing size, sequentially appended by one character at a time (up to a maximum of 31 iterations for a 32-bit limit).
- During construction, a temporary variable x is used to calculate the decimal representation of the substring using bit manipulation. At each iteration, x is left-shifted by one bit (multiplied by 2) and the next bit of s is taken into account by using a bitwise OR (denoted by |) with the result. This is equivalent to appending the next character of s and converting the new binary string to a decimal value.
```
x = x << 1 | int(s[i + j])
```
Updating the Lookup Table:
- If the current decimal value x does not already exist in the dictionary d, it is added, with its value being a list [i, i + j], representing the start and end indexes of the current substring.
- If x is 0, we can break from the loop early because no smaller substring than the already found will have the same value, due to the nature of XOR.
Handling Queries:
- After populating the lookup table, we can efficiently answer each query using list comprehension.
- For each query [first, second] in the queries list, perform the XOR operation first ^ second and look it up in the dictionary d:
```
d.get(first ^ second, [-1, -1])
```
- If the result of the XOR operation is found in d, the starting and ending positions are returned. If the XOR result is not found, [-1, -1] is returned, indicating no such substring exists.

By converting substrings to their decimal values once and storing them in the dictionary, the algorithm avoids redundant calculations for different queries. Bit manipulation and a lookup table are central to the algorithm's efficiency.

This algorithm greatly optimizes the process of finding the required substrings, especially when dealing with multiple queries, as it leverages the constant time complexity (average case) of dictionary lookups in Python.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example to illustrate the solution approach:

Assume our binary string s is "1011" and we have a single query queries = [[2, 3]]. We want to find the shortest substring that, when converted to decimal and XORed with 2, results in 3.

Following the solution approach:

Initialization: We start by initializing an empty Python dictionary d.
Outer Loop - Iterating over String s:
- We iterate over string s. In our example, our string is "1011", hence i will run from 0 to 3.
Inner Loop - Building the Substrings:
- For i = 0, we try to build substrings starting at s[0].
  - For j = 0, our substring is "1" (in binary), which equals 1 in decimal, so we store {1: [0, 0]} in our dictionary.
  - For j = 1, our substring is "10", which equals 2 in decimal, so {2: [0, 1]} is added to the dictionary.
  - Continuing, we get "101" (5 in decimal) and "1011" (11 in decimal), resulting in {5: [0, 2]} and {11: [0, 3]} added to the dictionary.
- We repeat this for i = 1, 2, 3, each time constructing new substrings and converting them to decimal numbers and adding to the dictionary if they are not yet present.
  - For i = 1 (and j = 0), the substring is "0", which as per step 4 of the implementation does not need to be stored since 0 XORed with any number will not help us find a unique answer we do not already have.
  - For i = 2 (and j = 0), the substring is "11" (3 in decimal), so {3: [2, 2]} is added to the dictionary.
  - i = 3 only gives us the substring "1" again, which is already in the dictionary, so we do not add it.
The final dictionary d after processing will look like this:
```
d = {
  1: [0, 0],
  2: [0, 1],
  5: [0, 2],
  11: [0, 3],
  3: [2, 2]
}
```
Handling Queries:
- Now, we answer our query [2, 3]. We compute 2 ^ 3 which equals 1. We look up the value 1 in our dictionary.
- We find it as d[1], which is [0, 0].

Therefore, for the query [2, 3], the answer is [0, 0]. The shortest substring from s starting at index 0 and ending at index 0 when converted to decimal and XORed with 2 results in 3.

This illustrates how the algorithm prepares a lookup table for efficient query handling based on substring decimal values. It leverages the dictionary data structure to preemptively store all possible substrings and their indexes, significantly speeding up the processing of each query.

Solution Implementation

1class Solution:
2    def substring_xor_queries(self, s: str, queries: List[List[int]]) -> List[List[int]]:
3      
4        # Dictionary to store XOR values as keys and their substring indices as values.
5        xor_to_indices = {}
6        n = len(s)
7      
8        # Iterate through each character in the string.
9        for i in range(n):
10            # Initialize XOR value.
11            x = 0
12            # Iterate through bit positions up to 32 bits.
13            for j in range(32):
14                # Break if the substring index exceeds string length.
15                if i + j >= n:
16                    break
17                # Shift XOR value to left and add the current bit.
18                x = x << 1 | int(s[i + j])
19                # Store the substring indices in the dictionary if XOR value is not present.
20                if x not in xor_to_indices:
21                    xor_to_indices[x] = [i, i + j]
22                # Break if XOR value becomes 0 to avoid unnecessary calculations.
23                if x == 0:
24                    break
25      
26        # Process queries to find substring indices for XOR values obtained from the queries.
27        return [xor_to_indices.get(first ^ second, [-1, -1]) for first, second in queries]
28

1import java.util.HashMap;
2import java.util.Map;
3
4class Solution {
5    public int[][] substringXorQueries(String s, int[][] queries) {
6        // Create a map to store the XOR value with its corresponding substring indices.
7        Map<Integer, int[]> xorToIndexMap = new HashMap<>();
8        int stringLength = s.length();
9      
10        // Iterate through each possible starting index for substrings.
11        for (int startIndex = 0; startIndex < stringLength; ++startIndex) {
12            int xorValue = 0;
13          
14            // Build the XOR value for the substring starting at 'startIndex'
15            // and ending before reaching the 32-bit limit or the end of string.
16            for (int offset = 0; offset < 32 && startIndex + offset < stringLength; ++offset) {
17                xorValue = xorValue << 1 | (s.charAt(startIndex + offset) - '0');
18                // Store only the first occurrence of each XOR value.
19                xorToIndexMap.putIfAbsent(xorValue, new int[] {startIndex, startIndex + offset});
20              
21                // If the XOR value is 0, no need to continue further for this substring.
22                if (xorValue == 0) {
23                    break;
24                }
25            }
26        }
27      
28        // Number of queries to process.
29        int numberOfQueries = queries.length;
30        // Array to store the results of the queries.
31        int[][] results = new int[numberOfQueries][2];
32      
33        // Process each query.
34        for (int i = 0; i < numberOfQueries; ++i) {
35            int firstQueryValue = queries[i][0];
36            int secondQueryValue = queries[i][1];
37            // Calculate the XOR value for the given query.
38            int queryXor = firstQueryValue ^ secondQueryValue;
39            // Fetch the substring indices corresponding to the XOR value or default to [-1, -1] if not found.
40            results[i] = xorToIndexMap.getOrDefault(queryXor, new int[] {-1, -1});
41        }
42
43        return results;
44    }
45}
46

1#include <vector>
2#include <string>
3#include <unordered_map>
4using namespace std;
5
6class Solution {
7public:
8    // Function to process the XOR queries on substrings
9    vector<vector<int>> substringXorQueries(string s, vector<vector<int>>& queries) {
10        // Dictionary to store the XOR values and their corresponding start and end indices
11        unordered_map<int, vector<int>> xorDictionary;
12        // Length of the input string
13        int length = s.size();
14
15        // Iterating through each character of the string
16        for (int i = 0; i < length; ++i) {
17            int xorValue = 0;
18
19            // Generate XOR for substrings starting at i and having a length of up to 32 bits
20            for (int j = 0; j < 32 && i + j < length; ++j) {
21                // Shift the XOR value left by 1 bit and add the current bit value
22                xorValue = (xorValue << 1) | (s[i + j] - '0');
23              
24                // If the XOR value has not been seen before, store it with its indices
25                if (!xorDictionary.count(xorValue)) {
26                    xorDictionary[xorValue] = {i, i + j};
27                }
28              
29                // If the XOR value is 0, break, as all future XORs will remain 0
30                if (xorValue == 0) {
31                    break;
32                }
33            }
34        }
35
36        // Vector to store the results of the queries
37        vector<vector<int>> results;
38      
39        // Process each query
40        for (auto& query : queries) {
41            int startIndex = query[0], endIndex = query[1];
42            int xorQueryValue = startIndex ^ endIndex; // Compute the XOR of the given indices
43          
44            // Check if the XOR value exists in our dictionary
45            if (xorDictionary.count(xorQueryValue)) {
46                // If so, add the corresponding indices to the results
47                results.emplace_back(xorDictionary[xorQueryValue]);
48            } else {
49                // Otherwise, add {-1, -1} to indicate the value isn't found
50                results.push_back({-1, -1});
51            }
52        }
53      
54        // Return the results of the queries
55        return results;
56    }
57};
58

1// Importing necessary functionality
2import { string } from "prop-types";
3
4// Method to process the XOR queries on substrings
5function substringXorQueries(s: string, queries: number[][]): number[][] {
6    // Dictionary to store the XOR values and their corresponding start and end indices
7    const xorDictionary: { [key: number]: number[] } = {};
8    // Length of the input string
9    const length: number = s.length;
10
11    // Iterating through each character of the string
12    for (let i = 0; i < length; ++i) {
13        let xorValue = 0;
14
15        // Generate XOR for substrings starting at i and having a length of up to 32 bits
16        for (let j = 0; j < 32 && i + j < length; ++j) {
17            // Shift the XOR value left by 1 bit and add the current bit value
18            xorValue = (xorValue << 1) | (s.charCodeAt(i + j) - '0'.charCodeAt(0));
19
20            // If the XOR value has not been seen before, store it with its indices
21            if (xorDictionary[xorValue] === undefined) {
22                xorDictionary[xorValue] = [i, i + j];
23            }
24
25            // If the XOR value is 0, break, as all future XORs will remain 0
26            if (xorValue === 0) {
27                break;
28            }
29        }
30    }
31
32    // Array to store the results of the queries
33    const results: number[][] = [];
34
35    // Process each query
36    for (const query of queries) {
37        const startIndex = query[0], endIndex = query[1];
38        const xorQueryValue = startIndex ^ endIndex; // Compute the XOR of the given indices
39
40        // Check if the XOR value exists in our dictionary
41        if (xorDictionary[xorQueryValue] !== undefined) {
42            // If so, add the corresponding indices to the results
43            results.push(xorDictionary[xorQueryValue]);
44        } else {
45            // Otherwise, add [-1, -1] to indicate the value isn't found
46            results.push([-1, -1]);
47        }
48    }
49
50    // Return the results of the queries
51    return results;
52}
53
54// Example of how to use the function
55const s = "11010";
56const queries = [[0, 1], [1, 2]];
57const results = substringXorQueries(s, queries);
58console.log(results); // Expected output based on the function's logic
59

Time and Space Complexity

Time Complexity

The time complexity of the given code can be broken down as follows:

The outer loop runs n times where n is the length of the string s.
The inner loop, which is responsible for constructing the different XOR substrings and storing them into the dictionary d, runs for at most 32 iterations (since we're looking at a maximum of a 32-bit integer representation).
Within the inner loop, there is a constant-time operation of left shifting (x << 1) and OR-ing (| int(s[i + j])), as well as dictionary insertion or check which can be considered O(1) on average.
After the loops, the code iterates through each query to look up the XOR result in the dictionary, which can be done in O(1) time on average for each query.

Assuming m queries, the total time complexity is:

O(n * 32 + m) = O(n + m)

This assumes dictionary operations take constant time.

However, we must bear in mind that if there were a lot of collisions and the Python dictionary had to handle these, the worst-case time complexity for insertions and lookups could degrade to O(n). But this is highly unlikely with a good hash function and is not considered in average case analysis.

Space Complexity

The space complexity mainly comes from the dictionary d where all unique XOR results of substrings starting at different positions are stored.
In the worst case, every possible starting index and length could yield a unique XOR value, which means the size of d could grow to n * 32.
The space complexity also includes the output list of lists, but this only adds space linearly proportional to the number of queries m.

Thus, the space complexity is O(n * 32 + m) = O(n + m), with the same assumption about average case behavior of dictionary storage.

Learn more about how to find time and space complexity quickly using problem constraints.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which technique can we use to find the middle of a linked list?

Fast-slow pointers

Two pointers moving in opposite direction

Binary search

Depth first search

I don't know