2670. Find the Distinct Difference Array
Problem Description
The problem presents us with an integer array called nums
, indexed from 0
, which means the first element of the array is at index 0
. Our task is to calculate a new array, diff
, also of the same length as nums
, based on a specific rule. For each position i
in the nums
array, we have to find the count of distinct elements in two different parts of nums
:
- The "prefix" which includes all elements from the start of
nums
up to and including the element at indexi
. - The "suffix" which includes all elements after index
i
up to the end of thenums
.
diff[i]
is then defined as the number of distinct elements in the prefix minus the number of distinct elements in the suffix.
Note that when working with subarrays or slices of an array, nums[i, ..., j]
means we are considering the elements of nums
from index i
through to index j
, inclusive. If i
is greater than j
, it means the subarray is empty.
Intuition
To construct the diff
array as described, we need to keep track of the distinct elements as we move through the nums
array. Using two sets, we can capture the unique elements in the prefixes and suffixes of nums
at each index.
The algorithm includes the following steps:
-
Initialize an extra array
suf
with the same length asnums
plus one, to record the number of distinct elements in the suffix part starting from each index. We add an extra space since we are including the empty subarray wheni > j
. -
Traverse the
nums
array in reverse (right to left), starting from the last element, and with each element, we encounter, add it to a set to ensure only distinct elements are counted. Update the correspondingsuf
entry with the length of this set to reflect the current number of distinct elements in the suffix up to the current index. -
Clear the set (for counting distinct elements in the prefixes) and then traverse the
nums
array once more, from left to right. With each element, we add it to the set to keep track of distinct elements and calculatediff[i]
by subtracting the number of distinct elements in the suffix (obtained fromsuf[i + 1]
) from the number of distinct elements in the prefix (length of the set at this point in the traversal).
This approach allows us to incrementally build up the distinct element counts for the prefix and suffix arrays, which can then be used to quickly compute the diff
array without having to re-check all the elements at each index.
Solution Approach
The solution uses a couple of Python data structures - lists and sets - and leverages their properties to solve the problem efficiently:
-
Lists: They are used to store the input (
nums
) and output (ans
for the distinct difference array, andsuf
for tracking the number of distinct elements in suffixes). -
Sets: A set is used to keep track of distinct elements because sets naturally eliminate duplicates.
The implementation follows these steps:
-
Initialize the
suf
list to haven + 1
zeros, wheren
is the length ofnums
. This is done to handle the suffix count of distinct elements as well as accounting for the empty subarray at the end. -
Start by populating the
suf
array from the end ofnums
moving towards the beginning. For each indexi
, addnums[i]
to a set (this ensures only distinct elements are counted). Then, setsuf[i]
to the current size of the set, which represents the count of distinct elements fromnums[i]
to the end of the array. -
Once the
suf
array is complete, reset (clear) the set to use it for counting distinct elements in prefixes. -
Create an answer list
ans
initialized with zeros of lengthn
to hold the final result. -
Iterate through
nums
from start to end. For each indexi
, addnums[i]
to the set (again, ensuring uniqueness) and subtract the number of distinct elements in the suffix (obtained fromsuf[i + 1]
) from the number of distinct elements in the prefix (the current size of the set). This gives thediff[i]
value. -
Assign this result to
ans[i]
, building the answer as the loop progresses. -
Finally, return the
ans
list once the loop is complete.
The key algorithmic insight here is to use two passes, one for suffixes and one for prefixes, with the help of sets to dynamically maintain the count of distinct elements efficiently. This negates the need to re-calculate the number of distinct elements for each index from scratch, thus improving the time complexity significantly.
Here's the main loop in the code explained:
for i in range(n - 1, -1, -1):
s.add(nums[i])
suf[i] = len(s)
s.clear()
for i, x in enumerate(nums):
s.add(x)
ans[i] = len(s) - suf[i + 1]
In the first loop, we fill the suf
array with the count of distinct elements for the suffix starting from each index. In the second loop, we use the suf
array and the prefix count of distinct elements to calculate the diff
for each index, which is stored in ans
.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach using a small example. Suppose we have the following integer array nums
:
nums = [4, 6, 4, 3]
We need to calculate the diff
array where diff[i]
is the count of distinct elements from the start of nums
up to i
, minus the count of distinct elements from i+1
to the end of nums
. Here's how we would apply the solution step by step:
- Initialize the
suf
list to lengthn+1
with zero values.n
is the length ofnums
, sosuf
initially will be[0, 0, 0, 0, 0]
. - Initialize an empty set
s
to record distinct elements as we go.
We first populate the suf
array in backward fashion:
- For
i = 3
,nums[3] = 3
. We add this to the sets
, which becomes{3}
, andsuf[3] = 1
. - For
i = 2
,nums[2] = 4
. We add this to the sets
, which becomes{3, 4}
, andsuf[2] = 2
. - For
i = 1
,nums[1] = 6
. We add this to the sets
, which becomes{3, 4, 6}
, andsuf[1] = 3
. - For
i = 0
,nums[0] = 4
. The sets
is already{3, 4, 6}
, so adding4
doesn't change it, andsuf[0] remains at 3
.
Now our suf
array looks like this: [3, 3, 2, 1, 0]
.
Next, we'll go over nums
from left to right to build our diff
array:
- Clear the set
s
. - For
i = 0
, we addnums[0] = 4
tos
, which becomes{4}
. We calculatediff[0] = len(s) - suf[1] = 1 - 3 = -2
. - For
i = 1
, we addnums[1] = 6
tos
, which becomes{4, 6}
. We calculatediff[1] = len(s) - suf[2] = 2 - 2 = 0
. - For
i = 2
, we addnums[2] = 4
tos
, which doesn't change it as4
is already in the set. Thus, we calculatediff[2] = len(s) - suf[3] = 2 - 1 = 1
. - For
i = 3
, we addnums[3] = 3
tos
, which becomes{3, 4, 6}
. We calculatediff[3] = len(s) - suf[4] = 3 - 0 = 3
.
Our final diff
array is [-2, 0, 1, 3]
which is the result of our algorithm.
Solution Implementation
1from typing import List
2
3class Solution:
4 def distinctDifferenceArray(self, nums: List[int]) -> List[int]:
5 # Get the length of the input list
6 length = len(nums)
7 # Initialize a suffix array of length 'length + 1' with zeros
8 suffix_count = [0] * (length + 1)
9 # Create an empty set to store unique elements
10 unique_elements = set()
11
12 # Populate the suffix_count array with count of unique elements from the end
13 for i in reversed(range(length)):
14 unique_elements.add(nums[i])
15 suffix_count[i] = len(unique_elements)
16
17 # Clear the set for re-use
18 unique_elements.clear()
19 # Initialize the answer array with zeros
20 answer = [0] * length
21
22 # Calculate the distinct count difference for each position
23 for i, number in enumerate(nums):
24 unique_elements.add(number)
25 # Calculate the difference between the number of unique elements seen
26 # so far and the number of unique elements that will be seen in the suffix
27 answer[i] = len(unique_elements) - suffix_count[i + 1]
28
29 # Return the answer array
30 return answer
31
1class Solution {
2 public int[] distinctDifferenceArray(int[] nums) {
3 // The length of the input array
4 int length = nums.length;
5
6 // Suffix array to store the number of distinct elements from index 'i' to the end
7 int[] suffixDistinctCount = new int[length + 1];
8
9 // A set to keep track of distinct numbers as we traverse the array from the end
10 Set<Integer> distinctNumbers = new HashSet<>();
11
12 // Populate the suffixDistinctCount array with the count of distinct numbers
13 // starting from the end of nums array
14 for (int i = length - 1; i >= 0; --i) {
15 distinctNumbers.add(nums[i]);
16 suffixDistinctCount[i] = distinctNumbers.size();
17 }
18
19 // Clear the set to reuse it for the next loop
20 distinctNumbers.clear();
21
22 // Resultant array to store the desired distinct differences
23 int[] result = new int[length];
24
25 // Traverse the nums array and calculate the distinct difference
26 for (int i = 0; i < length; ++i) {
27 distinctNumbers.add(nums[i]);
28 // The current distinct difference is the size of the set
29 // minus the size of the suffixDistinctCount at the next index
30 result[i] = distinctNumbers.size() - suffixDistinctCount[i + 1];
31 }
32
33 // Return the computed distinct difference array
34 return result;
35 }
36}
37
1#include <vector>
2#include <unordered_set>
3using namespace std;
4
5class Solution {
6public:
7 // Function to return a vector that contains the number of distinct integers
8 // in the array nums after removing the elements to the right of the current element
9 vector<int> distinctDifferenceArray(vector<int>& nums) {
10 int n = nums.size();
11
12 // Create a suffix array to store the count of distinct numbers from the current index to the end
13 vector<int> distinctCountSuffix(n + 1);
14
15 // Use a set to store distinct numbers to facilitate counting
16 unordered_set<int> distinctNumbers;
17
18 // Fill the suffix array with distinct number count, starting from the end of the vector
19 for (int i = n - 1; i >= 0; --i) {
20 distinctNumbers.insert(nums[i]);
21 distinctCountSuffix[i] = distinctNumbers.size();
22 }
23
24 // Clear the set for reuse from the start of the vector
25 distinctNumbers.clear();
26
27 // Initialize the answer vector to store the result
28 vector<int> ans(n);
29
30 // Calculate the distinct difference for each position
31 for (int i = 0; i < n; ++i) {
32 // Insert number into the set
33 distinctNumbers.insert(nums[i]);
34 // The distinct difference is the count of unique numbers we've seen so far
35 // minus the count of unique numbers from the next index onwards
36 ans[i] = distinctNumbers.size() - distinctCountSuffix[i + 1];
37 }
38 // Return the final answer vector
39 return ans;
40 }
41};
42
1function distinctDifferenceArray(nums: number[]): number[] {
2 // The length of the input array
3 const length = nums.length;
4
5 // This array will hold the count of distinct numbers from the current index to the end.
6 const suffixDistinctCount = new Array(length + 1).fill(0);
7
8 // A set to keep track of distinct numbers seen so far.
9 const seenNumbers = new Set<number>();
10
11 // Populate the suffixDistinctCount array with distinct number counts by iterating backwards.
12 for (let i = length - 1; i >= 0; --i) {
13 seenNumbers.add(nums[i]);
14 suffixDistinctCount[i] = seenNumbers.size;
15 }
16
17 // Clear the set to reuse it for the prefix pass.
18 seenNumbers.clear();
19
20 // The resulting array that will hold the final difference counts.
21 const result = new Array(length);
22
23 // Iterate through the input array to calculate the difference in the distinct count
24 // before the current index and after the current index.
25 for (let i = 0; i < length; ++i) {
26 seenNumbers.add(nums[i]);
27 // The distinct count difference at the current index is the difference between the count of
28 // distinct numbers seen so far and the distinct numbers from the next index to the end.
29 result[i] = seenNumbers.size - suffixDistinctCount[i + 1];
30 }
31
32 return result;
33}
34
Time and Space Complexity
The distinctDifferenceArray
function aims to calculate a list where each element at index i
reflects the count of unique numbers from index i
to the end of the list nums
, minus the count of unique numbers from index i+1
to the end.
Time Complexity
The function comprises two main loops which both iterate over the array nums
:
-
The first loop runs backward from
n-1
to0
. In each iteration, it adds the current element into the sets
. The length of the set is then recorded in thesuf
array. Since adding an element to the set and checking its length are O(1) operations, this loop has a complexity of O(n), where n is the number of elements innums
. -
The second loop runs forward from
0
ton
. Similar to the first loop, elements are added to a cleared sets
, and the length difference between the sets is stored in theans
array. The complexity of this loop is also O(n).
There are no nested loops, so the overall time complexity is the sum of the two loops, which is O(n) + O(n) = O(2n). Simplified, the time complexity is O(n).
Space Complexity
The space complexity includes the memory used by the data structures that scale with the input size:
-
The two sets,
s
andsuf
:s
will contain at most n unique elements, thus O(n) space complexity.suf
is an array of size n+1, which also gives O(n). -
The
ans
array: Created to store the resulting differences and is of size n, contributing another O(n) space complexity. -
Auxiliary variables like
i
,x
, andn
: These use constant space, O(1).
Summing these up, the total space complexity is O(n) + O(n) + O(n) which simplifies to O(3n). However, in Big O notation, constant factors are discarded, so the space complexity simplifies to O(n).
Learn more about how to find time and space complexity quickly using problem constraints.
What does the following code do?
1def f(arr1, arr2):
2 i, j = 0, 0
3 new_arr = []
4 while i < len(arr1) and j < len(arr2):
5 if arr1[i] < arr2[j]:
6 new_arr.append(arr1[i])
7 i += 1
8 else:
9 new_arr.append(arr2[j])
10 j += 1
11 new_arr.extend(arr1[i:])
12 new_arr.extend(arr2[j:])
13 return new_arr
14
1public static List<Integer> f(int[] arr1, int[] arr2) {
2 int i = 0, j = 0;
3 List<Integer> newArr = new ArrayList<>();
4
5 while (i < arr1.length && j < arr2.length) {
6 if (arr1[i] < arr2[j]) {
7 newArr.add(arr1[i]);
8 i++;
9 } else {
10 newArr.add(arr2[j]);
11 j++;
12 }
13 }
14
15 while (i < arr1.length) {
16 newArr.add(arr1[i]);
17 i++;
18 }
19
20 while (j < arr2.length) {
21 newArr.add(arr2[j]);
22 j++;
23 }
24
25 return newArr;
26}
27
1function f(arr1, arr2) {
2 let i = 0, j = 0;
3 let newArr = [];
4
5 while (i < arr1.length && j < arr2.length) {
6 if (arr1[i] < arr2[j]) {
7 newArr.push(arr1[i]);
8 i++;
9 } else {
10 newArr.push(arr2[j]);
11 j++;
12 }
13 }
14
15 while (i < arr1.length) {
16 newArr.push(arr1[i]);
17 i++;
18 }
19
20 while (j < arr2.length) {
21 newArr.push(arr2[j]);
22 j++;
23 }
24
25 return newArr;
26}
27
Recommended Readings
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!