Facebook Pixel

992. Subarrays with K Different Integers

HardArrayHash TableCountingSliding Window
Leetcode Link

Problem Description

You are given an integer array nums and an integer k. Your task is to find the total number of good subarrays in nums.

A good array is defined as an array that contains exactly k different integers. For instance, the array [1,2,3,1,2] contains exactly 3 different integers: 1, 2, and 3.

A subarray is a contiguous sequence of elements from the original array. This means the elements must appear consecutively in the same order as they appear in the original array.

Your goal is to count all possible subarrays of nums that have exactly k distinct integers.

For example:

  • If nums = [1,2,3,1,2] and k = 3, you need to find all contiguous subarrays that contain exactly 3 different integers
  • The subarray [1,2,3] would be valid because it has exactly 3 distinct integers
  • The subarray [1,2] would not be valid because it only has 2 distinct integers
  • The subarray [1,2,3,1] would be valid because it has exactly 3 distinct integers (1, 2, and 3)

The function should return the total count of such good subarrays.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that directly counting subarrays with exactly k distinct integers is challenging. However, we can transform this problem using a clever observation: the number of subarrays with exactly k distinct integers equals the number of subarrays with at most k distinct integers minus the number of subarrays with at most k-1 distinct integers.

Why does this work? Think about it this way:

  • Subarrays with at most k distinct integers include those with 1, 2, 3, ..., up to k distinct integers
  • Subarrays with at most k-1 distinct integers include those with 1, 2, 3, ..., up to k-1 distinct integers
  • When we subtract the second from the first, we're left with only those subarrays that have exactly k distinct integers

This transformation is powerful because counting subarrays with "at most k distinct integers" is much easier using a sliding window approach.

For the sliding window technique, we can use two pointers. For each ending position i, we find the leftmost starting position j such that the subarray from j to i contains at most k distinct integers. All subarrays ending at position i and starting from any position between j and i will have at most k distinct integers.

The solution cleverly stores these leftmost valid positions in an array pos. For each ending position i, pos[i] stores the leftmost starting position where the subarray still has at most k distinct integers. The difference between positions when we compute for k-1 and k gives us the count of subarrays with exactly k distinct integers ending at each position.

This approach reduces a complex counting problem into two simpler sliding window problems, making the solution both elegant and efficient.

Learn more about Sliding Window patterns.

Solution Approach

The implementation uses a helper function f(k) that calculates, for each position in the array, the leftmost starting position of a subarray ending at that position with at most k distinct integers.

Helper Function f(k) Implementation:

  1. Initialize data structures:

    • pos: An array to store the leftmost valid starting position for each ending position
    • cnt: A Counter (hash map) to track the frequency of each integer in the current window
    • j: A pointer representing the left boundary of the sliding window, initialized to 0
  2. Sliding window process:

    • Iterate through the array with index i and value x
    • Add the current element to the window: cnt[x] += 1
    • Shrink window when needed: While the number of distinct integers exceeds k:
      • Decrement the count of the leftmost element: cnt[nums[j]] -= 1
      • If the count becomes 0, remove that element from the counter entirely: cnt.pop(nums[j])
      • Move the left pointer right: j += 1
    • Store the current left boundary for position i: pos[i] = j
  3. Return the position array which contains the leftmost valid starting positions for subarrays with at most k distinct integers

Main Algorithm:

  1. Call f(k-1) to get positions for subarrays with at most k-1 distinct integers
  2. Call f(k) to get positions for subarrays with at most k distinct integers
  3. Calculate the result: For each ending position, subtract the position from f(k) from the position from f(k-1)
    • This difference gives the number of subarrays ending at that position with exactly k distinct integers
    • Sum all these differences: sum(a - b for a, b in zip(f(k - 1), f(k)))

Why the subtraction works:

  • For position i, f(k)[i] gives the leftmost position where a subarray can start and have at most k distinct integers
  • f(k-1)[i] gives the leftmost position for at most k-1 distinct integers
  • Subarrays starting between f(k)[i] and f(k-1)[i] - 1 will have exactly k distinct integers
  • The count of such subarrays is f(k-1)[i] - f(k)[i]

The time complexity is O(n) where n is the length of the array, as each element is visited at most twice by the two pointers. The space complexity is O(n) for storing the position arrays.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with nums = [1,2,1,2,3] and k = 2.

Goal: Find all subarrays with exactly 2 distinct integers.

Step 1: Find subarrays with at most k-1=1 distinct integers

Using the sliding window with f(1):

  • i=0, nums[0]=1: Window [1], distinct=1, j=0, pos[0]=0
  • i=1, nums[1]=2: Window [2], distinct=1 (shrink: remove 1), j=1, pos[1]=1
  • i=2, nums[2]=1: Window [1], distinct=1 (shrink: remove 2), j=2, pos[2]=2
  • i=3, nums[3]=2: Window [2], distinct=1 (shrink: remove 1), j=3, pos[3]=3
  • i=4, nums[4]=3: Window [3], distinct=1 (shrink: remove 2), j=4, pos[4]=4

Result: f(1) = [0, 1, 2, 3, 4]

Step 2: Find subarrays with at most k=2 distinct integers

Using the sliding window with f(2):

  • i=0, nums[0]=1: Window [1], distinct=1, j=0, pos[0]=0
  • i=1, nums[1]=2: Window [1,2], distinct=2, j=0, pos[1]=0
  • i=2, nums[2]=1: Window [1,2,1], distinct=2, j=0, pos[2]=0
  • i=3, nums[3]=2: Window [1,2,1,2], distinct=2, j=0, pos[3]=0
  • i=4, nums[4]=3: Window [2,1,2,3], distinct=3 (shrink: remove 1s), j=1, pos[4]=1

Result: f(2) = [0, 0, 0, 0, 1]

Step 3: Calculate the difference

For each position i, the number of subarrays ending at i with exactly 2 distinct integers is:

  • i=0: f(1)[0] - f(2)[0] = 0 - 0 = 0 (no subarrays with exactly 2 distinct)
  • i=1: f(1)[1] - f(2)[1] = 1 - 0 = 1 (subarray [1,2])
  • i=2: f(1)[2] - f(2)[2] = 2 - 0 = 2 (subarrays [1,2,1] and [2,1])
  • i=3: f(1)[3] - f(2)[3] = 3 - 0 = 3 (subarrays [1,2,1,2], [2,1,2], and [1,2])
  • i=4: f(1)[4] - f(2)[4] = 4 - 1 = 3 (subarrays [2,3], [1,2,3], and [2,1,2,3])

Total: 0 + 1 + 2 + 3 + 3 = 9 subarrays with exactly 2 distinct integers

The subarrays are: [1,2], [1,2,1], [2,1], [1,2,1,2], [2,1,2], [1,2] (at position 2-3), [2,3], [1,2,3], [2,1,2,3]

Solution Implementation

1class Solution:
2    def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
3        def find_left_boundaries(max_distinct):
4            """
5            For each index i, find the leftmost index j such that 
6            nums[j:i+1] contains at most max_distinct distinct integers.
7          
8            Returns a list where left_boundaries[i] represents the leftmost 
9            valid starting position for subarrays ending at index i.
10            """
11            left_boundaries = [0] * len(nums)
12            frequency_map = Counter()
13            left_pointer = 0
14          
15            for right_pointer, current_num in enumerate(nums):
16                # Add current element to the window
17                frequency_map[current_num] += 1
18              
19                # Shrink window from left while we have too many distinct elements
20                while len(frequency_map) > max_distinct:
21                    left_num = nums[left_pointer]
22                    frequency_map[left_num] -= 1
23                  
24                    # Remove element from map if its count reaches 0
25                    if frequency_map[left_num] == 0:
26                        del frequency_map[left_num]
27                  
28                    left_pointer += 1
29              
30                # Store the leftmost valid position for current right_pointer
31                left_boundaries[right_pointer] = left_pointer
32          
33            return left_boundaries
34      
35        # Get left boundaries for at most (k-1) and at most k distinct elements
36        boundaries_k_minus_1 = find_left_boundaries(k - 1)
37        boundaries_k = find_left_boundaries(k)
38      
39        # The difference gives us the count of subarrays with exactly k distinct
40        # For each position i, boundaries_k_minus_1[i] - boundaries_k[i] gives
41        # the number of valid starting positions for subarrays ending at i
42        # that have exactly k distinct integers
43        return sum(left_k_minus_1 - left_k 
44                   for left_k_minus_1, left_k in zip(boundaries_k_minus_1, boundaries_k))
45
1class Solution {
2    public int subarraysWithKDistinct(int[] nums, int k) {
3        // Get the leftmost valid position for each index with at most k distinct elements
4        int[] leftmostWithAtMostK = findLeftmostPositions(nums, k);
5      
6        // Get the leftmost valid position for each index with at most k-1 distinct elements
7        int[] leftmostWithAtMostKMinus1 = findLeftmostPositions(nums, k - 1);
8      
9        // Calculate the result using the difference
10        // For each ending position i, the number of subarrays with exactly k distinct elements
11        // equals (subarrays with at most k) - (subarrays with at most k-1)
12        int result = 0;
13        for (int i = 0; i < nums.length; i++) {
14            result += leftmostWithAtMostKMinus1[i] - leftmostWithAtMostK[i];
15        }
16        return result;
17    }
18
19    /**
20     * Finds the leftmost starting position for each ending position
21     * such that the subarray has at most k distinct elements.
22     * 
23     * @param nums the input array
24     * @param k maximum number of distinct elements allowed
25     * @return array where positions[i] represents the leftmost valid starting index
26     *         for subarrays ending at index i with at most k distinct elements
27     */
28    private int[] findLeftmostPositions(int[] nums, int k) {
29        int n = nums.length;
30      
31        // Frequency counter for each element (array size is n+1 to handle all possible values)
32        int[] frequency = new int[n + 1];
33      
34        // Result array storing leftmost valid position for each ending index
35        int[] positions = new int[n];
36      
37        // Counter for number of distinct elements in current window
38        int distinctCount = 0;
39      
40        // Two pointers: left (j) and right (i)
41        int left = 0;
42      
43        for (int right = 0; right < n; right++) {
44            // Add current element to the window
45            if (++frequency[nums[right]] == 1) {
46                // If this element appears for the first time, increment distinct count
47                distinctCount++;
48            }
49          
50            // Shrink window from left while we have more than k distinct elements
51            while (distinctCount > k) {
52                // Remove leftmost element from the window
53                if (--frequency[nums[left]] == 0) {
54                    // If this element's count becomes 0, decrement distinct count
55                    distinctCount--;
56                }
57                left++;
58            }
59          
60            // Store the leftmost valid position for current ending position
61            positions[right] = left;
62        }
63      
64        return positions;
65    }
66}
67
1class Solution {
2public:
3    int subarraysWithKDistinct(vector<int>& nums, int k) {
4        // Get the leftmost position for subarrays with at most k distinct integers
5        vector<int> leftmostWithK = findLeftmostPositions(nums, k);
6      
7        // Get the leftmost position for subarrays with at most k-1 distinct integers
8        vector<int> leftmostWithKMinus1 = findLeftmostPositions(nums, k - 1);
9      
10        int result = 0;
11      
12        // For each ending position i, count subarrays with exactly k distinct integers
13        // The count equals: (subarrays with at most k) - (subarrays with at most k-1)
14        for (int i = 0; i < nums.size(); ++i) {
15            result += leftmostWithKMinus1[i] - leftmostWithK[i];
16        }
17      
18        return result;
19    }
20
21private:
22    // Find the leftmost valid starting position for each ending position
23    // such that the subarray has at most k distinct integers
24    vector<int> findLeftmostPositions(vector<int>& nums, int k) {
25        int n = nums.size();
26        vector<int> leftmostPos(n);
27      
28        // Frequency map for elements (assuming elements are in range [1, n])
29        vector<int> frequency(n + 1, 0);
30      
31        int distinctCount = 0;  // Number of distinct integers in current window
32        int left = 0;           // Left pointer of sliding window
33      
34        // Iterate through each position as the right boundary
35        for (int right = 0; right < n; ++right) {
36            // Add current element to window
37            if (++frequency[nums[right]] == 1) {
38                // New distinct element encountered
39                ++distinctCount;
40            }
41          
42            // Shrink window from left while we have more than k distinct elements
43            while (distinctCount > k) {
44                if (--frequency[nums[left]] == 0) {
45                    // Element completely removed from window
46                    --distinctCount;
47                }
48                ++left;
49            }
50          
51            // Store the leftmost valid position for current right boundary
52            leftmostPos[right] = left;
53        }
54      
55        return leftmostPos;
56    }
57};
58
1function subarraysWithKDistinct(nums: number[], k: number): number {
2    // Get the leftmost position for subarrays with at most k distinct integers
3    const leftmostWithK = findLeftmostPositions(nums, k);
4  
5    // Get the leftmost position for subarrays with at most k-1 distinct integers
6    const leftmostWithKMinus1 = findLeftmostPositions(nums, k - 1);
7  
8    let result = 0;
9  
10    // For each ending position i, count subarrays with exactly k distinct integers
11    // The count equals: (subarrays with at most k) - (subarrays with at most k-1)
12    for (let i = 0; i < nums.length; i++) {
13        result += leftmostWithKMinus1[i] - leftmostWithK[i];
14    }
15  
16    return result;
17}
18
19// Find the leftmost valid starting position for each ending position
20// such that the subarray has at most k distinct integers
21function findLeftmostPositions(nums: number[], k: number): number[] {
22    const n = nums.length;
23    const leftmostPos: number[] = new Array(n);
24  
25    // Frequency map for elements (using Map for flexibility with any integer values)
26    const frequency = new Map<number, number>();
27  
28    let distinctCount = 0;  // Number of distinct integers in current window
29    let left = 0;           // Left pointer of sliding window
30  
31    // Iterate through each position as the right boundary
32    for (let right = 0; right < n; right++) {
33        // Add current element to window
34        const currentFreq = frequency.get(nums[right]) || 0;
35        frequency.set(nums[right], currentFreq + 1);
36      
37        // New distinct element encountered
38        if (currentFreq === 0) {
39            distinctCount++;
40        }
41      
42        // Shrink window from left while we have more than k distinct elements
43        while (distinctCount > k) {
44            const leftElement = nums[left];
45            const leftFreq = frequency.get(leftElement)!;
46            frequency.set(leftElement, leftFreq - 1);
47          
48            // Element completely removed from window
49            if (leftFreq - 1 === 0) {
50                distinctCount--;
51            }
52            left++;
53        }
54      
55        // Store the leftmost valid position for current right boundary
56        leftmostPos[right] = left;
57    }
58  
59    return leftmostPos;
60}
61

Time and Space Complexity

Time Complexity: O(n)

The algorithm calls function f twice with parameters k-1 and k. Each call to f performs:

  • A single pass through the array with the outer loop iterating n times
  • The inner while loop moves pointer j from left to right across the array
  • Since j only moves forward and never resets, each element is visited at most twice (once by i, once by j)
  • Counter operations (add, remove, check size) are O(1) on average
  • The final zip and sum operation is O(n)

Therefore, the total time complexity is O(n) + O(n) + O(n) = O(n).

Space Complexity: O(n + k)

The space usage includes:

  • Two arrays pos of size n (one for each call to f): O(n)
  • Counter dictionary cnt which stores at most k distinct elements in each call: O(k)
  • The zip operation creates pairs but processes them on-the-fly: O(1) additional space

Therefore, the total space complexity is O(n + k), which simplifies to O(n) when k ≤ n.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Understanding of the Subtraction Logic

Pitfall: Many developers struggle to understand why subtracting f(k) from f(k-1) gives subarrays with exactly k distinct integers. They might incorrectly think they need to count subarrays directly.

Why it happens: The concept that "exactly k" = "at most k" - "at most k-1" is not immediately intuitive.

Solution: Visualize with an example:

  • If f(k)[i] = 2, subarrays starting from indices 2, 3, 4, ..., i have at most k distinct integers
  • If f(k-1)[i] = 5, subarrays starting from indices 5, 6, 7, ..., i have at most k-1 distinct integers
  • Therefore, subarrays starting from indices 2, 3, 4 (positions between f(k)[i] and f(k-1)[i]-1) have exactly k distinct integers

2. Off-by-One Error in Window Management

Pitfall: Incorrectly managing the sliding window boundaries, especially when removing elements from the frequency map.

# WRONG: Removing before decrementing
while len(frequency_map) > max_distinct:
    if frequency_map[nums[left_pointer]] == 1:
        del frequency_map[nums[left_pointer]]
    frequency_map[nums[left_pointer]] -= 1  # This will fail!
    left_pointer += 1

Solution: Always decrement first, then check if removal is needed:

# CORRECT: Decrement first, then remove if zero
while len(frequency_map) > max_distinct:
    frequency_map[nums[left_pointer]] -= 1
    if frequency_map[nums[left_pointer]] == 0:
        del frequency_map[nums[left_pointer]]
    left_pointer += 1

3. Edge Case: k = 0 or k > array length

Pitfall: Not handling edge cases where k is 0 or greater than the number of unique elements in the array.

Solution: Add validation at the beginning:

def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
    if k == 0 or k > len(set(nums)):
        return 0
    # ... rest of the code

4. Memory Inefficiency with Position Arrays

Pitfall: Storing entire position arrays when only the differences are needed, which can be memory-intensive for large inputs.

Solution: Calculate the result on-the-fly without storing full arrays:

def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
    def count_at_most_k(max_distinct):
        result = 0
        frequency_map = Counter()
        left_pointer = 0
      
        for right_pointer, current_num in enumerate(nums):
            frequency_map[current_num] += 1
          
            while len(frequency_map) > max_distinct:
                frequency_map[nums[left_pointer]] -= 1
                if frequency_map[nums[left_pointer]] == 0:
                    del frequency_map[nums[left_pointer]]
                left_pointer += 1
          
            # Add count of all subarrays ending at right_pointer
            result += right_pointer - left_pointer + 1
      
        return result
  
    return count_at_most_k(k) - count_at_most_k(k - 1)

5. Using Dictionary Instead of Counter

Pitfall: Using a regular dictionary and forgetting to initialize keys before incrementing.

# WRONG: This will raise KeyError
frequency_map = {}
frequency_map[current_num] += 1  # KeyError if current_num not in map

Solution: Use Counter from collections or handle initialization:

# Option 1: Use Counter
from collections import Counter
frequency_map = Counter()

# Option 2: Handle initialization manually
frequency_map = {}
frequency_map[current_num] = frequency_map.get(current_num, 0) + 1
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

What is the best way of checking if an element exists in an unsorted array once in terms of time complexity? Select the best that applies.


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More