645. Set Mismatch
Problem Description
In this problem, we're dealing with a particular set of integers ranging from 1
to n
where due to an error, one of the numbers was duplicated, causing another number to be missing. Our task is to figure out which number was duplicated (appears twice) and which one is missing from the array nums
given to us.
The key to solving this problem lies in taking advantage of the properties of exclusive or (XOR) operations and the specifics of the data corruption that occurred.
Intuition
The intuition behind the solution can be developed by considering the properties of XOR operation. We know that for any number x
, doing x XOR x
gives us 0
, and x XOR 0
gives us x
. Thus, XOR-ing a number with itself cancels it out.
With this property in mind, we can XOR all elements in the nums
array with each integer from 1
to n
. Ideally, if no number was duplicated and none missing, the result should be 0
because every number would cancel itself out. However, since we have one duplicate and one missing number, we'll be left with the XOR of these two numbers.
From there, we can distinguish between the duplicate and missing numbers by noticing that they must differ in at least one bit. We find this differing bit (the rightmost bit is usually a good choice), and use it as a mask to separate the numbers into two groups - one with this bit set and one without it.
Now, we apply XOR operation within each group, including the numbers from 1
to n
. This separate XORing would give us the duplicate number and the missing number, but we don't know which is which yet. We verify by checking each number in nums
again. The number that we come across in the array is the duplicate, and the other one is the missing number.
By applying XOR in such strategic ways, we leverage its ability to cancel equal pairs and isolate the odd ones out: in this case, our duplicate and missing numbers.
Learn more about Sorting patterns.
Solution Approach
There are multiple approaches to solving this problem, each using different algorithms, data structures, or patterns. Here, we'll discuss three distinct approaches.
Solution 1: Mathematics
Using the principle of arithmetical sums, we can find the difference between the sum of a complete set from 1
to n
(s_1
) and the sum of our corrupted array (s_2
after duplicates removal). The sum difference s - s_2
gives us the repeated element, while s_1 - s_2
identifies the missing element. This approach is direct and relies on simple arithmetic operations. However, it requires handling potentially large sums, which could lead to integer overflow issues if not careful.
Solution 2: Hash Table
This approach is more straightforward: construct a hash table to count the number of times each element appears in nums
. Then, simply iterate through the range 1
to n
and check the count for each number. If the count is 2
, the number is duplicated; if 0
, it's the missing number. The hash table provides a direct mapping from elements to their counts, making lookups efficient (O(1)
on average).
Solution 3: Bit Operation (implemented in the provided code)
The bit manipulation solution is more complex but efficient, especially regarding space complexity. The idea revolves around XOR operations. We iterate through the array nums
and range from 1
to n
, XOR-ing each element and index. This process will give us xs
, the XOR of the duplicate number a
and the missing number b
(xs = a XOR b)
.
To separate these two values, we find the rightmost bit that is set in xs
(lb = xs & -xs
). We use this bit to divide the numbers into two groups and perform XOR operations on those groups. After separating elements into two piles, we XOR all elements that have this bit set and all elements that don't. This will result in two numbers, one being a
and the other b
. To identify which is which, we iterate through the nums
array again. The one we find in the array is our duplicate number.
Finally, we return the duplicate and missing numbers as a list [a
, b
], ensuring the order is correct based on the final check with the original array.
The brilliance of this approach lies in using the distinct patterns of XOR to segregate and identify unique elements while maintaining a very low space complexity.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the third solution approach with XOR operation using a small example.
Assume our array nums
is [1, 2, 2, 4]
for n = 4
. The ideal array without any errors would be [1, 2, 3, 4]
.
-
We start by initializing
xs
to0
and we XOR all elements innums
and all numbers from1
ton
. Doing so gives us:xs = 1^2^2^4^1^2^3^4
After canceling out duplicate pairs
(1^1, 2^2, and 4^4)
we are left with:xs = 2^3
xs
is now the XOR of the duplicate number and the missing number. -
Next, we find the rightmost set bit in
xs
, which we'll use as a bitmask to differentiate between the two numbers that didn't cancel out.xs = 2^3 = 0010^0011 = 0001
The rightmost set bit is
1
(we can find it by performingxs & -xs
). -
We now use this bit as a mask to split the elements into two groups. The first group will have this bit set (
3
for our case) and the second will not (2
for our case).We perform XOR operations separately on these two groups alongside the numbers from
1
ton
:- With bit set:
3
- Without bit set:
1^2^2^4^1^4
After cancellation:
- With bit set:
3
(already isolated, no duplicates to cancel it out) - Without bit set:
2
(since1^1
and4^4
cancel themselves out leaving2^2
which, after XORing, gives2
)
- With bit set:
-
Now, we have two results from each group:
3
and2
. We don't yet know which is the duplicate number and which is the missing one. -
To find out, we have to compare it against the
nums
array. In our example, we find2
innums
, meaning2
is the duplicate number. -
Since
3
did not appear innums
, it is the missing number. -
We conclude that the duplicate number (
a
) is2
, and the missing number (b
) is3
. Our ordered result would be[2, 3]
.
In this example walkthrough, we have effectively demonstrated how the third solution approach, which focuses on bit manipulation, can identify the duplicate and the missing numbers in an array with a corrupted sequence of integers.
Solution Implementation
1class Solution:
2 def findErrorNums(self, nums: List[int]) -> List[int]:
3 xor_result = 0
4 # First pass: calculate the XOR of all indices and the elements in nums
5 for index, value in enumerate(nums, 1):
6 xor_result ^= index ^ value
7
8 # Obtain the rightmost set bit (least significant bit)
9 # This will serve as a mask for segregating numbers
10 rightmost_set_bit = xor_result & -xor_result
11
12 # We will use `number_a` to find one of the numbers by segregating into two buckets
13 # The rightmost set bit will allow us to xor numbers into two groups
14 # One group where the bit is set and another where it's not
15 number_a = 0
16 for index, value in enumerate(nums, 1):
17 if index & rightmost_set_bit:
18 number_a ^= index
19 if value & rightmost_set_bit:
20 number_a ^= value
21
22 # `number_b` is the other number we need to find
23 number_b = xor_result ^ number_a
24
25 # Check if `number_a` is the duplicated number or the missing number
26 # If `number_a` is found in nums, it is the duplicated one
27 for value in nums:
28 if value == number_a:
29 return [number_a, number_b]
30
31 # If `number_a` is not in nums, it's the missing number and `number_b` is the duplicate
32 return [number_b, number_a]
33
1class Solution {
2 public int[] findErrorNums(int[] nums) {
3 // The length of the array
4 int length = nums.length;
5 // Variable to store the XOR of all indices and values
6 int xorResult = 0;
7
8 // XOR all the indices with their corresponding values
9 for (int i = 1; i <= length; ++i) {
10 xorResult ^= i ^ nums[i - 1];
11 }
12
13 // Find the rightmost set bit of xorResult
14 int rightmostSetBit = xorResult & -xorResult;
15
16 // Initialize variables to 0 that will store the two unique numbers
17 int number1 = 0;
18
19 // Separate the numbers into two groups and XOR them individually
20 for (int i = 1; i <= length; ++i) {
21 // Check for the rightmost set bit and XOR
22 if ((i & rightmostSetBit) > 0) {
23 number1 ^= i;
24 }
25 if ((nums[i - 1] & rightmostSetBit) > 0) {
26 number1 ^= nums[i - 1];
27 }
28 }
29
30 // XORing with xorResult to find the other number
31 int number2 = xorResult ^ number1;
32
33 // Iterate over the array to check which one is the duplicate number
34 for (int i = 0; i < length; ++i) {
35 if (nums[i] == number1) {
36 // If number1 is the duplicate, return number1 and number2 in order
37 return new int[] {number1, number2};
38 }
39 }
40
41 // If number1 is not the duplicate then number2 is, so return number2 and number1 in order
42 return new int[] {number2, number1};
43 }
44}
45
1class Solution {
2public:
3 vector<int> findErrorNums(vector<int>& nums) {
4 int size = nums.size(); // Get the size of the input vector
5 int xorSum = 0; // Initialize variable for XOR sum
6
7 // Calculate XOR for all numbers from 1 to n and all elements in nums
8 for (int i = 1; i <= size; ++i) {
9 xorSum ^= i ^ nums[i - 1];
10 }
11
12 // Determine the rightmost set bit (which differs between the missing and duplicate)
13 int rightMostBit = xorSum & -xorSum;
14 int oneBitSetNum = 0; // This will hold one of the two numbers (missing or duplicate)
15
16 // Divide the numbers into two groups based on the rightMostBit and XOR within groups
17 for (int i = 1; i <= size; ++i) {
18 if (i & rightMostBit) { // If the bit is set
19 oneBitSetNum ^= i; // XOR for the range 1 to N
20 }
21 if (nums[i - 1] & rightMostBit) { // If the bit is set in the current element
22 oneBitSetNum ^= nums[i - 1]; // XOR with the nums elements
23 }
24 }
25
26 int otherNum = xorSum ^ oneBitSetNum; // Find the second number using the xorSum
27
28 // Determine which number is the missing number and which one is the duplicate
29 for (int i = 0; i < size; ++i) {
30 if (nums[i] == oneBitSetNum) {
31 // If we find oneBitSetNum in nums array, it is the duplicate
32 return {oneBitSetNum, otherNum}; // Return the duplicate and missing numbers
33 }
34 }
35
36 // If not found, then oneBitSetNum is the missing number and otherNum is the duplicate
37 return {otherNum, oneBitSetNum};
38 }
39};
40
1function findErrorNums(nums: number[]): number[] {
2 // Calculate the total number of elements in the array.
3 const length = nums.length;
4
5 // Initialize xorSum to 0, which will be used to find the XOR sum of all elements and their indices.
6 let xorSum = 0;
7
8 // Loop through the array and XOR both the index and value of each element.
9 for (let index = 1; index <= length; ++index) {
10 xorSum ^= index ^ nums[index - 1];
11 }
12
13 // Determine the rightmost set bit in xorSum using two's complement.
14 const rightmostSetBit = xorSum & -xorSum;
15
16 // Initialize a variable to hold one of the two numbers we're looking for.
17 let oneNumber = 0;
18
19 // Loop through the array and indices to partition them based on the rightmostSetBit.
20 for (let index = 1; index <= length; ++index) {
21 if (index & rightmostSetBit) {
22 oneNumber ^= index;
23 }
24 if (nums[index - 1] & rightmostSetBit) {
25 oneNumber ^= nums[index - 1];
26 }
27 }
28
29 // Determine the other number by XORing oneNumber with xorSum.
30 const otherNumber = xorSum ^ oneNumber;
31
32 // Check which number is missing from the array and which number is duplicated.
33 // The number that appears in the nums array is the duplicated one,
34 // and the one that's missing is the number that should have been there.
35 return nums.includes(oneNumber) ? [oneNumber, otherNumber] : [otherNumber, oneNumber];
36}
37
Time and Space Complexity
The time complexity of the given code is O(n)
where n
is the length of the array nums
. This is because each of the loops in the function (for i, x in enumerate(nums, 1):
and for x in nums:
) iterates over the list nums
exactly once, and all operations within the loops are constant time operations.
The space complexity of the function is O(1)
, which means it uses a constant amount of extra space. Despite the input list size, the space consumed by the variables xs
, lb
, a
, and b
does not scale with n
, thus having no additional space depending on the input size.
Learn more about how to find time and space complexity quickly using problem constraints.
Which algorithm is best for finding the shortest distance between two points in an unweighted graph?
Recommended Readings
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!