1852. Distinct Numbers in Each Subarray


Problem Description

In this problem, we are given an array of integers, nums, and another integer, k. Our task is to find the number of distinct integers in every contiguous subarray of nums that has a length of k. We construct an array ans such that each element ans[i] represents the number of distinct integers in the subarray starting from index i to index i+k-1 inclusively. The result will be an array of the counts of distinct integers for each subarray of size k.

For example, suppose nums is [1, 2, 3, 2, 2, 1] and k is 3. The subarrays of size three are:

  • [1, 2, 3], which has 3 distinct numbers;
  • [2, 3, 2], which has 2 distinct numbers;
  • [3, 2, 2], which has 2 distinct numbers;
  • [2, 2, 1], which has 2 distinct numbers. The resulting ans would be [3, 2, 2, 2].

Intuition

To find the number of distinct values in each subarray of size k, we need to efficiently count and update the distinct elements as we move the subarray from the beginning of nums to the end. A naive approach would involve recomputing the number of distinct elements for each subarray by traversing all elements within each subarray, resulting in an inefficient solution with a high time complexity.

However, as we slide the subarray window by one element from left to right, we notice that all but one element of the new subarray is the same as in the previous subarray. Specifically, each time we move the window, we remove the first element from the previous window and add a new element from the right. This suggests a more efficient approach, where we can update the counts of elements in the overlap of the two subarrays in constant time.

To achieve this, we use a Counter (in Python, a dictionary that automatically initializes values to 0 if keys are not yet present) to maintain the counts of elements in the current subarray window. For the initial subarray of size k, we populate the Counter with the frequency of each distinct number. We record the number of distinct elements (the size of the Counter) in our ans array.

As we slide the window to the right, we decrement the count of the element that is dropped from the subarray and remove the key from the Counter if its count drops to zero, ensuring that only counts of distinct elements are maintained. Then, we increment the count of the new element added to the subarray window. After each window slide, we append the current number of distinct elements (the size of the Counter) to the ans array.

This process continues until we have covered all subarrays of size k, resulting in a fully populated ans array with the correct counts. The use of the Counter allows us to maintain the update of distinct elements efficiently, which is the crux of this solution approach.

Learn more about Sliding Window patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which data structure is used to implement priority queue?

Solution Approach

The solution involves using a sliding window approach along with a data structure, the Counter, which is essentially a specialized dictionary for counting hashable objects in Python. These tools together provide an efficient way to dynamically track the changes in the frequency of distinct numbers as we move the window across the nums array.

Let's break down the code step by step:

  • First, we calculate n, the length of the nums array. We also initialize the Counter with the first k elements of nums, which represents our first window.

  • We add the length of the Counter len(cnt) to our ans list as our first entry, which corresponds to the number of distinct elements in the initial window.

  • Next, we iterate over the remaining elements of the array starting from the k-th element to the end. For every iteration, we perform two operations:

    • We adjust the count of the element that's moving out of the window (nums[i - k]). We decrement its counter by 1, and if the count drops to 0, we remove the element from the Counter, effectively updating the distinct elements.
    • We adjust the count of the new element that's coming into the window (nums[i]). We increment its counter by 1. If the element was not present in the Counter, it gets added, again updating the distinct elements.
  • After adjusting the Counter, we append the current length of the Counter len(cnt) to our ans list. This indicates the number of distinct elements in the current window.

  • The loop continues, effectively moving the sliding window to the right by one element and updating the distinct element count on each move.

  • Once we've gone through all elements from k to n-1, our ans array is complete and we return it.

This sliding window technique coupled with the Counter avoids the need for double loops (which would increase the time complexity significantly) and ensures that we maintain a running count of distinct elements in constant time, as only two elements change positions in the window at each step.

By intelligently managing the frequency counts and updating them in constant time as the window slides, we achieve an efficient solution with O(n) time complexity, where n is the number of elements in the nums array, and O(k) space complexity, where k is the window size, which corresponds to the maximum size of the Counter.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which of the following is equvalent to O(3*2^n + n^3 + n!+ log n)?

Example Walkthrough

Let's illustrate the solution approach with a small example. Consider the array nums = [4, 5, 4, 3, 5] and k = 2. We are to find the number of distinct numbers in each subarray of length k.

  1. Initialize the Counter with the first k elements of nums. Here, k = 2, so the Counter includes {4: 1, 5: 1} after accounting for the first two elements [4, 5].
  2. Add the number of distinct elements in the Counter to the ans array. At this point, we have ans = [2] since there are two distinct numbers [4, 5].
  3. Start sliding the window across nums by one element at a time and adjust the Counter as we go.
    • Move the window one step to include [5, 4], which means we should decrease the count of 4 (as we move past the first one) and then increase the count of the new 4 (the duplicate). The Counter remains unchanged {4: 1, 5: 1}, and we append 2 to ans, resulting in ans = [2, 2].
    • Move to the next window [4, 3]. We decrease the count of 5 (dropping it off as it's no longer within the window) and include 3. The Counter updates to {4: 1, 3: 1}, and ans becomes ans = [2, 2, 2].
    • Finally, for the last window [3, 5], we decrease the count of 4 (removing it from the Counter since its count goes to zero) and add 5. Our Counter is {3: 1, 5: 1}. We append 2 to ans, ending with ans = [2, 2, 2, 2].

Throughout the process, we efficiently kept track of the distinct numbers as the sliding window moved from left to right across nums, resulting in the final answer ans = [2, 2, 2, 2], indicating that for each subarray of size k there are exactly two distinct numbers.

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def distinctNumbers(self, nums, k):
5        """
6        Return a list of the count of distinct numbers in every contiguous subarray of length k.
7      
8        :param nums: List[int] - the input array of numbers.
9        :param k: int - the length of each subarray to consider.
10        :return: List[int] - list containing the number of distinct elements in each subarray.
11        """
12      
13        # Calculate the length of the input list.
14        n = len(nums)
15
16        # Initialize a counter for the first window of size 'k'.
17        num_counter = Counter(nums[:k])
18
19        # List to store the count of distinct numbers in each window.
20        distinct_count = [len(num_counter)]
21
22        # Loop over the remaining elements starting from 'k' to the end of the list.
23        for i in range(k, n):
24            # 'outgoing_num' is the number exiting the window.
25            outgoing_num = nums[i - k]
26
27            # Decrease the count of the outgoing number.
28            num_counter[outgoing_num] -= 1
29
30            # If the outgoing number's count reaches 0, remove it from the counter.
31            if num_counter[outgoing_num] == 0:
32                num_counter.pop(outgoing_num)
33
34            # 'incoming_num' is the new number entering the window.
35            incoming_num = nums[i]
36
37            # Increase the count of the incoming number.
38            num_counter[incoming_num] += 1
39
40            # Append the current number of distinct elements to the result list.
41            distinct_count.append(len(num_counter))
42
43        # Return the list of counts of distinct numbers in each window.
44        return distinct_count
45
1class Solution {
2    public int[] distinctNumbers(int[] nums, int k) {
3        // declare and initialize an array to keep track of counts of each number
4        int[] count = new int[100010];
5        // distinctCount is used to keep the number of distinct elements in the current window of size k
6        int distinctCount = 0;
7      
8        // Initializing count for the first window of size k
9        for (int i = 0; i < k; ++i) {
10            // If we haven't seen this number before in the current window, increment distinctCount
11            if (count[nums[i]]++ == 0) {
12                ++distinctCount;
13            }
14        }
15      
16        // The total number of elements in the array
17        int n = nums.length;
18        // Prepare an array to store the result. n - k + 1 is the number of windows we'll have
19        int[] result = new int[n - k + 1];
20        // Store the count of distinct numbers for the first window
21        result[0] = distinctCount;
22      
23        // Iterate over the array starting from k, sliding the window one position at a time
24        for (int i = k; i < n; ++i) {
25            // Remove the count of the leftmost element of the previous window
26            // If the count of this number becomes 0 after decrementing, we've lost a distinct number
27            if (--count[nums[i - k]] == 0) {
28                --distinctCount;
29            }
30            // Add the count of the new element of the current window
31            // If this number wasn't in the window before, increment distinctCount
32            if (count[nums[i]]++ == 0) {
33                ++distinctCount;
34            }
35            // Store the count of distinct numbers for the current window
36            result[i - k + 1] = distinctCount;
37        }
38      
39        // return the array containing counts of distinct numbers for each window
40        return result;
41    }
42}
43
1#include <vector>
2
3class Solution {
4public:
5    // This function returns a vector of integers, representing the count of
6    // distinct numbers in each subarray of length 'k' within the input 'nums' vector.
7    vector<int> distinctNumbers(vector<int>& nums, int k) {
8      
9        // Initialize a counting array to keep track of the frequency of each number.
10        // Assuming the range of numbers in the input vector does not exceed 100010.
11        int frequencyCounter[100010] = {0};
12      
13        // Variable to keep track of the number of distinct elements.
14        int distinctCount = 0;
15      
16        // Count distinct numbers in the first window of size 'k'.
17        for (int i = 0; i < k; ++i) {
18            if (frequencyCounter[nums[i]]++ == 0) { // If the number appears for the first time.
19                ++distinctCount;
20            }
21        }
22      
23        // Get the size of the input vector.
24        int n = nums.size();
25      
26        // Initialize the answer vector with appropriate size.
27        vector<int> answer(n - k + 1);
28      
29        // The count for the first window is already found.
30        answer[0] = distinctCount;
31      
32        // Loop through the rest of the vector, sliding the window by one each time.
33        for (int i = k; i < n; ++i) {
34            // Remove the count of the first element in the previous window.
35            if (--frequencyCounter[nums[i - k]] == 0) { // If the number's frequency drops to zero.
36                --distinctCount;
37            }
38            // Add the count for the new element in the current window.
39            if (frequencyCounter[nums[i]]++ == 0) { // If the number appears for the first time in the window.
40                ++distinctCount;
41            }
42            // Record the number of distinct elements for this window.
43            answer[i - k + 1] = distinctCount;
44        }
45      
46        // Return the final vector containing counts of distinct numbers for all windows.
47        return answer;
48    }
49};
50
1// Importing necessary components for TypeScript.
2import { Vector } from 'prelude-ts';
3
4// This function computes a vector of integers, representing the count of distinct numbers
5// in each subarray of length 'k' within the input 'nums' vector.
6function distinctNumbers(nums: number[], k: number): number[] {
7
8    // Assuming the range of numbers in the input array does not exceed 100010.
9    // Initialize a frequency counter array to keep track of the frequency of each number.
10    const frequencyCounter: number[] = new Array(100010).fill(0);
11
12    // Variable to keep track of the number of distinct elements.
13    let distinctCount: number = 0;
14
15    // Count distinct numbers in the first window of size 'k'.
16    for (let i = 0; i < k; ++i) {
17        if (frequencyCounter[nums[i]]++ === 0) {
18            // If the number appears for the first time.
19            ++distinctCount;
20        }
21    }
22
23    // Variable to capture the size of the input array.
24    const n: number = nums.length;
25
26    // Initialize the answer array with appropriate size.
27    const answer: number[] = new Array(n - k + 1);
28
29    // The count for the first window has already been calculated.
30    answer[0] = distinctCount;
31
32    // Loop through the rest of the array, sliding the window by one element each iteration.
33    for (let i = k; i < n; ++i) {
34        // Decrease the count of the element exiting the window.
35        if (--frequencyCounter[nums[i - k]] === 0) {
36            // If the element's frequency drops to zero, decrement the distinct count.
37            --distinctCount;
38        }
39        // Process the new element entering the window.
40        if (frequencyCounter[nums[i]]++ === 0) {
41            // If the element is unique within the current window, increment the distinct count.
42            ++distinctCount;
43        }
44        // Store the number of distinct elements for this window in the answer array.
45        answer[i - k + 1] = distinctCount;
46    }
47  
48    // Return the computed array containing counts of distinct numbers for all windows.
49    return answer;
50}
51
Not Sure What to Study? Take the 2-min Quiz:

Which of the two traversal algorithms (BFS and DFS) can be used to find whether two nodes are connected?

Time and Space Complexity

Time Complexity

The time complexity of the given code can be analyzed by looking at the operations performed within the main for-loop which runs from k to n (where n is the length of nums):

  • Each iteration of the for-loop performs a constant time operation to decrement the counter for the outgoing element (u = nums[i - k] and cnt[u] -= 1).
  • If the outgoing element's count drops to zero, it's removed from the counter with cnt.pop(u), which also takes constant time since Counter is a subclass of a Python dictionary which generally has O(1) complexity for addition and removal.
  • Increment the counter for the new incoming element – this also takes constant time cnt[nums[i]] += 1.
  • Append the current number of distinct elements in the current window (ans.append(len(cnt))) – getting the length of cnt is constant time as well.

Given these operations, since all of them have a time complexity of O(1) and we're running these for n - k times, the overall time complexity of the for-loop is O(n - k).

However, we also have an initial setup cost where we initialize the counter with the first k elements of nums, this also takes O(k) time.

So, the overall time complexity of the function is O(k) + O(n - k) which simplifies to O(n).

Space Complexity

For space complexity, we need to consider:

  • The cnt counter, which can hold at most min(n, k) unique values if all elements are distinct within any window of size k. Hence the space complexity for the counter would be O(min(n, k)).
  • The answer list ans, which will contain n - k + 1 elements corresponding to the number of distinct elements in each window. Hence, its space complexity is O(n - k + 1).

The overall space complexity of the function is the maximum of the space complexities of cnt and ans, which would be O(n) since O(min(n, k)) is bounded by O(n) and O(n - k + 1) simplifies to O(n).

Therefore, both the time complexity and space complexity of the code are O(n).

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which of these properties could exist for a graph but not a tree?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫