992. Subarrays with K Different Integers
Problem Description
You are given an integer array nums
and an integer k
. Your task is to find the total number of good subarrays in nums
.
A good array is defined as an array that contains exactly k
different integers. For instance, the array [1,2,3,1,2]
contains exactly 3 different integers: 1
, 2
, and 3
.
A subarray is a contiguous sequence of elements from the original array. This means the elements must appear consecutively in the same order as they appear in the original array.
Your goal is to count all possible subarrays of nums
that have exactly k
distinct integers.
For example:
- If
nums = [1,2,3,1,2]
andk = 3
, you need to find all contiguous subarrays that contain exactly 3 different integers - The subarray
[1,2,3]
would be valid because it has exactly 3 distinct integers - The subarray
[1,2]
would not be valid because it only has 2 distinct integers - The subarray
[1,2,3,1]
would be valid because it has exactly 3 distinct integers (1, 2, and 3)
The function should return the total count of such good subarrays.
Intuition
The key insight is that directly counting subarrays with exactly k
distinct integers is challenging. However, we can transform this problem using a clever observation: the number of subarrays with exactly k
distinct integers equals the number of subarrays with at most k
distinct integers minus the number of subarrays with at most k-1
distinct integers.
Why does this work? Think about it this way:
- Subarrays with at most
k
distinct integers include those with 1, 2, 3, ..., up tok
distinct integers - Subarrays with at most
k-1
distinct integers include those with 1, 2, 3, ..., up tok-1
distinct integers - When we subtract the second from the first, we're left with only those subarrays that have exactly
k
distinct integers
This transformation is powerful because counting subarrays with "at most k
distinct integers" is much easier using a sliding window approach.
For the sliding window technique, we can use two pointers. For each ending position i
, we find the leftmost starting position j
such that the subarray from j
to i
contains at most k
distinct integers. All subarrays ending at position i
and starting from any position between j
and i
will have at most k
distinct integers.
The solution cleverly stores these leftmost valid positions in an array pos
. For each ending position i
, pos[i]
stores the leftmost starting position where the subarray still has at most k
distinct integers. The difference between positions when we compute for k-1
and k
gives us the count of subarrays with exactly k
distinct integers ending at each position.
This approach reduces a complex counting problem into two simpler sliding window problems, making the solution both elegant and efficient.
Learn more about Sliding Window patterns.
Solution Approach
The implementation uses a helper function f(k)
that calculates, for each position in the array, the leftmost starting position of a subarray ending at that position with at most k
distinct integers.
Helper Function f(k)
Implementation:
-
Initialize data structures:
pos
: An array to store the leftmost valid starting position for each ending positioncnt
: A Counter (hash map) to track the frequency of each integer in the current windowj
: A pointer representing the left boundary of the sliding window, initialized to 0
-
Sliding window process:
- Iterate through the array with index
i
and valuex
- Add the current element to the window:
cnt[x] += 1
- Shrink window when needed: While the number of distinct integers exceeds
k
:- Decrement the count of the leftmost element:
cnt[nums[j]] -= 1
- If the count becomes 0, remove that element from the counter entirely:
cnt.pop(nums[j])
- Move the left pointer right:
j += 1
- Decrement the count of the leftmost element:
- Store the current left boundary for position
i
:pos[i] = j
- Iterate through the array with index
-
Return the position array which contains the leftmost valid starting positions for subarrays with at most
k
distinct integers
Main Algorithm:
- Call
f(k-1)
to get positions for subarrays with at mostk-1
distinct integers - Call
f(k)
to get positions for subarrays with at mostk
distinct integers - Calculate the result: For each ending position, subtract the position from
f(k)
from the position fromf(k-1)
- This difference gives the number of subarrays ending at that position with exactly
k
distinct integers - Sum all these differences:
sum(a - b for a, b in zip(f(k - 1), f(k)))
- This difference gives the number of subarrays ending at that position with exactly
Why the subtraction works:
- For position
i
,f(k)[i]
gives the leftmost position where a subarray can start and have at mostk
distinct integers f(k-1)[i]
gives the leftmost position for at mostk-1
distinct integers- Subarrays starting between
f(k)[i]
andf(k-1)[i] - 1
will have exactlyk
distinct integers - The count of such subarrays is
f(k-1)[i] - f(k)[i]
The time complexity is O(n)
where n
is the length of the array, as each element is visited at most twice by the two pointers. The space complexity is O(n)
for storing the position arrays.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example with nums = [1,2,1,2,3]
and k = 2
.
Goal: Find all subarrays with exactly 2 distinct integers.
Step 1: Find subarrays with at most k-1=1 distinct integers
Using the sliding window with f(1)
:
- i=0, nums[0]=1: Window [1], distinct=1, j=0, pos[0]=0
- i=1, nums[1]=2: Window [2], distinct=1 (shrink: remove 1), j=1, pos[1]=1
- i=2, nums[2]=1: Window [1], distinct=1 (shrink: remove 2), j=2, pos[2]=2
- i=3, nums[3]=2: Window [2], distinct=1 (shrink: remove 1), j=3, pos[3]=3
- i=4, nums[4]=3: Window [3], distinct=1 (shrink: remove 2), j=4, pos[4]=4
Result: f(1) = [0, 1, 2, 3, 4]
Step 2: Find subarrays with at most k=2 distinct integers
Using the sliding window with f(2)
:
- i=0, nums[0]=1: Window [1], distinct=1, j=0, pos[0]=0
- i=1, nums[1]=2: Window [1,2], distinct=2, j=0, pos[1]=0
- i=2, nums[2]=1: Window [1,2,1], distinct=2, j=0, pos[2]=0
- i=3, nums[3]=2: Window [1,2,1,2], distinct=2, j=0, pos[3]=0
- i=4, nums[4]=3: Window [2,1,2,3], distinct=3 (shrink: remove 1s), j=1, pos[4]=1
Result: f(2) = [0, 0, 0, 0, 1]
Step 3: Calculate the difference
For each position i, the number of subarrays ending at i with exactly 2 distinct integers is:
- i=0: f(1)[0] - f(2)[0] = 0 - 0 = 0 (no subarrays with exactly 2 distinct)
- i=1: f(1)[1] - f(2)[1] = 1 - 0 = 1 (subarray [1,2])
- i=2: f(1)[2] - f(2)[2] = 2 - 0 = 2 (subarrays [1,2,1] and [2,1])
- i=3: f(1)[3] - f(2)[3] = 3 - 0 = 3 (subarrays [1,2,1,2], [2,1,2], and [1,2])
- i=4: f(1)[4] - f(2)[4] = 4 - 1 = 3 (subarrays [2,3], [1,2,3], and [2,1,2,3])
Total: 0 + 1 + 2 + 3 + 3 = 9 subarrays with exactly 2 distinct integers
The subarrays are: [1,2], [1,2,1], [2,1], [1,2,1,2], [2,1,2], [1,2] (at position 2-3), [2,3], [1,2,3], [2,1,2,3]
Solution Implementation
1class Solution:
2 def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
3 def find_left_boundaries(max_distinct):
4 """
5 For each index i, find the leftmost index j such that
6 nums[j:i+1] contains at most max_distinct distinct integers.
7
8 Returns a list where left_boundaries[i] represents the leftmost
9 valid starting position for subarrays ending at index i.
10 """
11 left_boundaries = [0] * len(nums)
12 frequency_map = Counter()
13 left_pointer = 0
14
15 for right_pointer, current_num in enumerate(nums):
16 # Add current element to the window
17 frequency_map[current_num] += 1
18
19 # Shrink window from left while we have too many distinct elements
20 while len(frequency_map) > max_distinct:
21 left_num = nums[left_pointer]
22 frequency_map[left_num] -= 1
23
24 # Remove element from map if its count reaches 0
25 if frequency_map[left_num] == 0:
26 del frequency_map[left_num]
27
28 left_pointer += 1
29
30 # Store the leftmost valid position for current right_pointer
31 left_boundaries[right_pointer] = left_pointer
32
33 return left_boundaries
34
35 # Get left boundaries for at most (k-1) and at most k distinct elements
36 boundaries_k_minus_1 = find_left_boundaries(k - 1)
37 boundaries_k = find_left_boundaries(k)
38
39 # The difference gives us the count of subarrays with exactly k distinct
40 # For each position i, boundaries_k_minus_1[i] - boundaries_k[i] gives
41 # the number of valid starting positions for subarrays ending at i
42 # that have exactly k distinct integers
43 return sum(left_k_minus_1 - left_k
44 for left_k_minus_1, left_k in zip(boundaries_k_minus_1, boundaries_k))
45
1class Solution {
2 public int subarraysWithKDistinct(int[] nums, int k) {
3 // Get the leftmost valid position for each index with at most k distinct elements
4 int[] leftmostWithAtMostK = findLeftmostPositions(nums, k);
5
6 // Get the leftmost valid position for each index with at most k-1 distinct elements
7 int[] leftmostWithAtMostKMinus1 = findLeftmostPositions(nums, k - 1);
8
9 // Calculate the result using the difference
10 // For each ending position i, the number of subarrays with exactly k distinct elements
11 // equals (subarrays with at most k) - (subarrays with at most k-1)
12 int result = 0;
13 for (int i = 0; i < nums.length; i++) {
14 result += leftmostWithAtMostKMinus1[i] - leftmostWithAtMostK[i];
15 }
16 return result;
17 }
18
19 /**
20 * Finds the leftmost starting position for each ending position
21 * such that the subarray has at most k distinct elements.
22 *
23 * @param nums the input array
24 * @param k maximum number of distinct elements allowed
25 * @return array where positions[i] represents the leftmost valid starting index
26 * for subarrays ending at index i with at most k distinct elements
27 */
28 private int[] findLeftmostPositions(int[] nums, int k) {
29 int n = nums.length;
30
31 // Frequency counter for each element (array size is n+1 to handle all possible values)
32 int[] frequency = new int[n + 1];
33
34 // Result array storing leftmost valid position for each ending index
35 int[] positions = new int[n];
36
37 // Counter for number of distinct elements in current window
38 int distinctCount = 0;
39
40 // Two pointers: left (j) and right (i)
41 int left = 0;
42
43 for (int right = 0; right < n; right++) {
44 // Add current element to the window
45 if (++frequency[nums[right]] == 1) {
46 // If this element appears for the first time, increment distinct count
47 distinctCount++;
48 }
49
50 // Shrink window from left while we have more than k distinct elements
51 while (distinctCount > k) {
52 // Remove leftmost element from the window
53 if (--frequency[nums[left]] == 0) {
54 // If this element's count becomes 0, decrement distinct count
55 distinctCount--;
56 }
57 left++;
58 }
59
60 // Store the leftmost valid position for current ending position
61 positions[right] = left;
62 }
63
64 return positions;
65 }
66}
67
1class Solution {
2public:
3 int subarraysWithKDistinct(vector<int>& nums, int k) {
4 // Get the leftmost position for subarrays with at most k distinct integers
5 vector<int> leftmostWithK = findLeftmostPositions(nums, k);
6
7 // Get the leftmost position for subarrays with at most k-1 distinct integers
8 vector<int> leftmostWithKMinus1 = findLeftmostPositions(nums, k - 1);
9
10 int result = 0;
11
12 // For each ending position i, count subarrays with exactly k distinct integers
13 // The count equals: (subarrays with at most k) - (subarrays with at most k-1)
14 for (int i = 0; i < nums.size(); ++i) {
15 result += leftmostWithKMinus1[i] - leftmostWithK[i];
16 }
17
18 return result;
19 }
20
21private:
22 // Find the leftmost valid starting position for each ending position
23 // such that the subarray has at most k distinct integers
24 vector<int> findLeftmostPositions(vector<int>& nums, int k) {
25 int n = nums.size();
26 vector<int> leftmostPos(n);
27
28 // Frequency map for elements (assuming elements are in range [1, n])
29 vector<int> frequency(n + 1, 0);
30
31 int distinctCount = 0; // Number of distinct integers in current window
32 int left = 0; // Left pointer of sliding window
33
34 // Iterate through each position as the right boundary
35 for (int right = 0; right < n; ++right) {
36 // Add current element to window
37 if (++frequency[nums[right]] == 1) {
38 // New distinct element encountered
39 ++distinctCount;
40 }
41
42 // Shrink window from left while we have more than k distinct elements
43 while (distinctCount > k) {
44 if (--frequency[nums[left]] == 0) {
45 // Element completely removed from window
46 --distinctCount;
47 }
48 ++left;
49 }
50
51 // Store the leftmost valid position for current right boundary
52 leftmostPos[right] = left;
53 }
54
55 return leftmostPos;
56 }
57};
58
1function subarraysWithKDistinct(nums: number[], k: number): number {
2 // Get the leftmost position for subarrays with at most k distinct integers
3 const leftmostWithK = findLeftmostPositions(nums, k);
4
5 // Get the leftmost position for subarrays with at most k-1 distinct integers
6 const leftmostWithKMinus1 = findLeftmostPositions(nums, k - 1);
7
8 let result = 0;
9
10 // For each ending position i, count subarrays with exactly k distinct integers
11 // The count equals: (subarrays with at most k) - (subarrays with at most k-1)
12 for (let i = 0; i < nums.length; i++) {
13 result += leftmostWithKMinus1[i] - leftmostWithK[i];
14 }
15
16 return result;
17}
18
19// Find the leftmost valid starting position for each ending position
20// such that the subarray has at most k distinct integers
21function findLeftmostPositions(nums: number[], k: number): number[] {
22 const n = nums.length;
23 const leftmostPos: number[] = new Array(n);
24
25 // Frequency map for elements (using Map for flexibility with any integer values)
26 const frequency = new Map<number, number>();
27
28 let distinctCount = 0; // Number of distinct integers in current window
29 let left = 0; // Left pointer of sliding window
30
31 // Iterate through each position as the right boundary
32 for (let right = 0; right < n; right++) {
33 // Add current element to window
34 const currentFreq = frequency.get(nums[right]) || 0;
35 frequency.set(nums[right], currentFreq + 1);
36
37 // New distinct element encountered
38 if (currentFreq === 0) {
39 distinctCount++;
40 }
41
42 // Shrink window from left while we have more than k distinct elements
43 while (distinctCount > k) {
44 const leftElement = nums[left];
45 const leftFreq = frequency.get(leftElement)!;
46 frequency.set(leftElement, leftFreq - 1);
47
48 // Element completely removed from window
49 if (leftFreq - 1 === 0) {
50 distinctCount--;
51 }
52 left++;
53 }
54
55 // Store the leftmost valid position for current right boundary
56 leftmostPos[right] = left;
57 }
58
59 return leftmostPos;
60}
61
Time and Space Complexity
Time Complexity: O(n)
The algorithm calls function f
twice with parameters k-1
and k
. Each call to f
performs:
- A single pass through the array with the outer loop iterating
n
times - The inner while loop moves pointer
j
from left to right across the array - Since
j
only moves forward and never resets, each element is visited at most twice (once byi
, once byj
) - Counter operations (add, remove, check size) are
O(1)
on average - The final zip and sum operation is
O(n)
Therefore, the total time complexity is O(n) + O(n) + O(n) = O(n)
.
Space Complexity: O(n + k)
The space usage includes:
- Two arrays
pos
of sizen
(one for each call tof
):O(n)
- Counter dictionary
cnt
which stores at mostk
distinct elements in each call:O(k)
- The zip operation creates pairs but processes them on-the-fly:
O(1)
additional space
Therefore, the total space complexity is O(n + k)
, which simplifies to O(n)
when k ≤ n
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Understanding of the Subtraction Logic
Pitfall: Many developers struggle to understand why subtracting f(k)
from f(k-1)
gives subarrays with exactly k
distinct integers. They might incorrectly think they need to count subarrays directly.
Why it happens: The concept that "exactly k" = "at most k" - "at most k-1" is not immediately intuitive.
Solution: Visualize with an example:
- If
f(k)[i] = 2
, subarrays starting from indices 2, 3, 4, ..., i have at most k distinct integers - If
f(k-1)[i] = 5
, subarrays starting from indices 5, 6, 7, ..., i have at most k-1 distinct integers - Therefore, subarrays starting from indices 2, 3, 4 (positions between f(k)[i] and f(k-1)[i]-1) have exactly k distinct integers
2. Off-by-One Error in Window Management
Pitfall: Incorrectly managing the sliding window boundaries, especially when removing elements from the frequency map.
# WRONG: Removing before decrementing
while len(frequency_map) > max_distinct:
if frequency_map[nums[left_pointer]] == 1:
del frequency_map[nums[left_pointer]]
frequency_map[nums[left_pointer]] -= 1 # This will fail!
left_pointer += 1
Solution: Always decrement first, then check if removal is needed:
# CORRECT: Decrement first, then remove if zero
while len(frequency_map) > max_distinct:
frequency_map[nums[left_pointer]] -= 1
if frequency_map[nums[left_pointer]] == 0:
del frequency_map[nums[left_pointer]]
left_pointer += 1
3. Edge Case: k = 0 or k > array length
Pitfall: Not handling edge cases where k
is 0 or greater than the number of unique elements in the array.
Solution: Add validation at the beginning:
def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
if k == 0 or k > len(set(nums)):
return 0
# ... rest of the code
4. Memory Inefficiency with Position Arrays
Pitfall: Storing entire position arrays when only the differences are needed, which can be memory-intensive for large inputs.
Solution: Calculate the result on-the-fly without storing full arrays:
def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
def count_at_most_k(max_distinct):
result = 0
frequency_map = Counter()
left_pointer = 0
for right_pointer, current_num in enumerate(nums):
frequency_map[current_num] += 1
while len(frequency_map) > max_distinct:
frequency_map[nums[left_pointer]] -= 1
if frequency_map[nums[left_pointer]] == 0:
del frequency_map[nums[left_pointer]]
left_pointer += 1
# Add count of all subarrays ending at right_pointer
result += right_pointer - left_pointer + 1
return result
return count_at_most_k(k) - count_at_most_k(k - 1)
5. Using Dictionary Instead of Counter
Pitfall: Using a regular dictionary and forgetting to initialize keys before incrementing.
# WRONG: This will raise KeyError frequency_map = {} frequency_map[current_num] += 1 # KeyError if current_num not in map
Solution: Use Counter
from collections or handle initialization:
# Option 1: Use Counter from collections import Counter frequency_map = Counter() # Option 2: Handle initialization manually frequency_map = {} frequency_map[current_num] = frequency_map.get(current_num, 0) + 1
What is the best way of checking if an element exists in an unsorted array once in terms of time complexity? Select the best that applies.
Recommended Readings
https assets algo monster cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in the window The window
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!