3134. Find the Median of the Uniqueness Array
Problem Description
You are given an integer array nums
. Your task is to find the median of a special array called the "uniqueness array".
The uniqueness array is constructed as follows:
- For every possible subarray of
nums
(including single elements and the entire array), count the number of distinct elements in that subarray - Collect all these counts into an array
- Sort this array in non-decreasing order
For example, if nums = [1, 2, 1]
:
- Subarray
[1]
has 1 distinct element - Subarray
[2]
has 1 distinct element - Subarray
[1]
has 1 distinct element - Subarray
[1, 2]
has 2 distinct elements - Subarray
[2, 1]
has 2 distinct elements - Subarray
[1, 2, 1]
has 2 distinct elements - The uniqueness array would be
[1, 1, 1, 2, 2, 2]
after sorting
The median of the uniqueness array is:
- The middle element if the array has an odd number of elements
- The smaller of the two middle elements if the array has an even number of elements
Since there are n * (n + 1) / 2
total subarrays for an array of length n
, the uniqueness array will have exactly this many elements.
Your goal is to return the median value from this uniqueness array without actually constructing the entire array (which would be inefficient for large inputs).
Intuition
The key insight is that we don't need to generate the entire uniqueness array to find its median. Instead, we can use binary search to directly find the median value.
Think about it this way: if we pick any value x
, we can count how many elements in the uniqueness array are less than or equal to x
. For the median, we need the value where exactly (m + 1) / 2
elements are less than or equal to it, where m = n * (n + 1) / 2
is the total number of subarrays.
The crucial observation is that as x
increases, the count of elements ≤ x
also increases (monotonic property). This makes binary search perfect for finding the smallest x
where at least (m + 1) / 2
elements are ≤ x
.
But how do we count subarrays with at most x
distinct elements efficiently? Here's where the sliding window technique comes in. We can maintain a window that expands to the right and contracts from the left to ensure it never has more than x
distinct elements.
For each position r
as the right endpoint:
- We expand the window to include
nums[r]
- If the window has more than
x
distinct elements, we shrink from the left until it's valid again - Once valid, all subarrays ending at
r
with starting positions froml
tor
are valid (that'sr - l + 1
subarrays)
This two-pointer approach lets us count all valid subarrays in O(n)
time for a given x
. Combined with binary search over possible values of x
(which ranges from 1 to n
), we get an efficient solution without explicitly constructing the massive uniqueness array.
The beauty of this approach is that we transform the problem from "find the median of a huge array" to "find the first value where enough subarrays satisfy a condition" - a much more manageable task.
Learn more about Binary Search and Sliding Window patterns.
Solution Approach
The solution uses binary search combined with a sliding window technique to efficiently find the median without constructing the entire uniqueness array.
Binary Search Setup
First, we calculate the total number of subarrays: m = n * (n + 1) / 2
. The median will be at position (m + 1) / 2
in the sorted uniqueness array.
We set up binary search with:
- Left boundary:
l = 0
(minimum possible distinct elements) - Right boundary:
r = n
(maximum possible distinct elements)
We search for the smallest value mid
such that at least (m + 1) / 2
subarrays have at most mid
distinct elements.
The Check Function
The check(mx)
function determines if there are at least (m + 1) / 2
subarrays with at most mx
distinct elements.
Data Structures:
cnt
: A dictionary tracking the frequency of each element in the current windowk
: Counter for the number of valid subarrays foundl
,r
: Left and right pointers for the sliding window
Algorithm:
- For each position
r
from 0 ton-1
:- Add
nums[r]
to the window:cnt[nums[r]] += 1
- Add
- While the window has more than
mx
distinct elements (len(cnt) > mx
):- Remove elements from the left:
- Decrement
cnt[nums[l]]
- If count becomes 0, remove the key from
cnt
- Move left pointer:
l += 1
- Decrement
- Remove elements from the left:
- Count valid subarrays:
- All subarrays ending at
r
with start positions froml
tor
are valid - Add
r - l + 1
to our counterk
- All subarrays ending at
- Early termination:
- If
k >= (m + 1) / 2
, returnTrue
(we've found enough subarrays)
- If
- If we finish the loop without finding enough subarrays, return
False
Binary Search Logic
The main function uses bisect_left
to find the first value where check
returns True
:
bisect_left(range(n), True, key=check)
searches through values 0 to n-1- It finds the leftmost position where
check(x)
evaluates toTrue
- This gives us the smallest number of distinct elements such that at least half of the subarrays have that many or fewer distinct elements
Time Complexity
- Binary search:
O(log n)
iterations - Each
check
call:O(n)
time (sliding window traverses the array once) - Overall:
O(n log n)
Space Complexity
O(n)
for thecnt
dictionary in the worst case (all elements are distinct)
The elegance of this solution lies in avoiding the explicit construction of the uniqueness array while still finding its median through careful counting and binary search.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with nums = [1, 2, 1]
.
Step 1: Setup
- Array length
n = 3
- Total subarrays:
m = 3 * 4 / 2 = 6
- Median position:
(6 + 1) / 2 = 3.5
, so we need the 4th smallest element (ceiling of 3.5) - Binary search range:
l = 0
,r = 3
Step 2: Binary Search Iterations
First iteration: mid = (0 + 3) / 2 = 1
- Check if at least 4 subarrays have ≤ 1 distinct element
- Using sliding window with
mx = 1
:- Window
[1]
: 1 distinct element, valid → count = 1 - Window
[2]
: 1 distinct element, valid → count = 2 - Window
[1]
: 1 distinct element, valid → count = 3 - Total: 3 subarrays with ≤ 1 distinct element
- Window
- Since 3 < 4,
check(1) = False
- Update:
l = 2
(search in upper half)
Second iteration: mid = (2 + 3) / 2 = 2
- Check if at least 4 subarrays have ≤ 2 distinct elements
- Using sliding window with
mx = 2
:r = 0
: Window[1]
→ 1 distinct, add 1 subarray → count = 1r = 1
: Window[1,2]
→ 2 distinct, add 2 subarrays ([2]
and[1,2]
) → count = 3r = 2
: Window[1,2,1]
→ 2 distinct, add 3 subarrays ([1]
,[2,1]
,[1,2,1]
) → count = 6- Total: 6 subarrays with ≤ 2 distinct elements
- Since 6 ≥ 4,
check(2) = True
- Update:
r = 2
(search in lower half to find smallest valid value)
Third iteration: Binary search converges
- We've found that 2 is the smallest value where at least 4 subarrays have that many or fewer distinct elements
- This means the median is 2
Verification: The actual uniqueness array (sorted) would be:
[1]
→ 1 distinct[2]
→ 1 distinct[1]
→ 1 distinct[1,2]
→ 2 distinct[2,1]
→ 2 distinct[1,2,1]
→ 2 distinct
Sorted: [1, 1, 1, 2, 2, 2]
The 4th element (median for even-length array) is indeed 2.
The algorithm correctly found the median without constructing the entire array!
Solution Implementation
1from typing import List
2from collections import defaultdict
3from bisect import bisect_left
4
5class Solution:
6 def medianOfUniquenessArray(self, nums: List[int]) -> int:
7 def can_achieve_median(max_unique: int) -> bool:
8 """
9 Check if we can have at least half of all subarrays
10 with uniqueness <= max_unique
11 """
12 # Dictionary to track frequency of elements in current window
13 element_count = defaultdict(int)
14 subarrays_count = 0
15 left = 0
16
17 # Sliding window approach
18 for right, current_num in enumerate(nums):
19 # Add current element to window
20 element_count[current_num] += 1
21
22 # Shrink window if unique elements exceed max_unique
23 while len(element_count) > max_unique:
24 left_num = nums[left]
25 element_count[left_num] -= 1
26 if element_count[left_num] == 0:
27 del element_count[left_num]
28 left += 1
29
30 # Count subarrays ending at 'right' with unique elements <= max_unique
31 subarrays_count += right - left + 1
32
33 # Check if we've reached the median position
34 if subarrays_count >= (total_subarrays + 1) // 2:
35 return True
36
37 return False
38
39 # Calculate total number of subarrays
40 n = len(nums)
41 total_subarrays = n * (n + 1) // 2
42
43 # Binary search for the minimum uniqueness value that can be the median
44 # Using bisect_left to find the leftmost position where check returns True
45 return bisect_left(range(n), True, key=can_achieve_median)
46
1class Solution {
2 private long totalSubarrays;
3 private int[] nums;
4
5 /**
6 * Finds the median of uniqueness values for all subarrays.
7 * Uses binary search to find the minimum uniqueness value such that
8 * at least half of all subarrays have uniqueness <= this value.
9 *
10 * @param nums the input array
11 * @return the median uniqueness value
12 */
13 public int medianOfUniquenessArray(int[] nums) {
14 int n = nums.length;
15 this.nums = nums;
16
17 // Calculate total number of subarrays: n*(n+1)/2
18 totalSubarrays = (1L + n) * n / 2;
19
20 // Binary search on the answer (uniqueness value)
21 int left = 0;
22 int right = n;
23
24 while (left < right) {
25 int mid = (left + right) >> 1;
26
27 // Check if there are at least (totalSubarrays + 1) / 2 subarrays
28 // with uniqueness <= mid
29 if (hasEnoughSubarraysWithMaxUniqueness(mid)) {
30 right = mid; // Try to find a smaller valid answer
31 } else {
32 left = mid + 1; // Need a larger uniqueness value
33 }
34 }
35
36 return left;
37 }
38
39 /**
40 * Checks if at least half of all subarrays have uniqueness <= maxUniqueness.
41 * Uses sliding window technique to count subarrays with at most maxUniqueness distinct elements.
42 *
43 * @param maxUniqueness the maximum allowed number of distinct elements
44 * @return true if at least half of subarrays have uniqueness <= maxUniqueness
45 */
46 private boolean hasEnoughSubarraysWithMaxUniqueness(int maxUniqueness) {
47 Map<Integer, Integer> frequencyMap = new HashMap<>();
48 long subarrayCount = 0;
49
50 // Sliding window approach
51 int left = 0;
52 for (int right = 0; right < nums.length; right++) {
53 // Add current element to the window
54 int currentElement = nums[right];
55 frequencyMap.merge(currentElement, 1, Integer::sum);
56
57 // Shrink window from left while number of distinct elements > maxUniqueness
58 while (frequencyMap.size() > maxUniqueness) {
59 int leftElement = nums[left];
60 left++;
61
62 // Decrease frequency and remove if it becomes 0
63 if (frequencyMap.merge(leftElement, -1, Integer::sum) == 0) {
64 frequencyMap.remove(leftElement);
65 }
66 }
67
68 // Count all subarrays ending at right with at most maxUniqueness distinct elements
69 // These are: [left, right], [left+1, right], ..., [right, right]
70 subarrayCount += right - left + 1;
71
72 // Check if we've found enough subarrays (at least half of total)
73 if (subarrayCount >= (totalSubarrays + 1) / 2) {
74 return true;
75 }
76 }
77
78 return false;
79 }
80}
81
1class Solution {
2public:
3 int medianOfUniquenessArray(vector<int>& nums) {
4 int n = nums.size();
5 using ll = long long;
6
7 // Total number of subarrays: n*(n+1)/2
8 ll totalSubarrays = (1LL + n) * n / 2;
9
10 // Binary search range: minimum distinct count is 1, maximum is n
11 int left = 1, right = n;
12
13 // Lambda function to check if there are at least (totalSubarrays+1)/2 subarrays
14 // with at most 'maxDistinct' unique elements
15 auto countSubarraysWithAtMostKDistinct = [&](int maxDistinct) -> bool {
16 unordered_map<int, int> frequencyMap;
17 ll subarrayCount = 0;
18
19 // Sliding window approach
20 for (int windowStart = 0, windowEnd = 0; windowEnd < n; ++windowEnd) {
21 // Add current element to the window
22 int currentElement = nums[windowEnd];
23 ++frequencyMap[currentElement];
24
25 // Shrink window if number of distinct elements exceeds maxDistinct
26 while (frequencyMap.size() > maxDistinct) {
27 int leftElement = nums[windowStart++];
28 if (--frequencyMap[leftElement] == 0) {
29 frequencyMap.erase(leftElement);
30 }
31 }
32
33 // Count all subarrays ending at windowEnd with at most maxDistinct unique elements
34 subarrayCount += windowEnd - windowStart + 1;
35
36 // Check if we've found enough subarrays (at least half of total)
37 if (subarrayCount >= (totalSubarrays + 1) / 2) {
38 return true;
39 }
40 }
41 return false;
42 };
43
44 // Binary search to find the minimum value where at least half of subarrays
45 // have that many or fewer distinct elements
46 while (left < right) {
47 int mid = (left + right) >> 1;
48 if (countSubarraysWithAtMostKDistinct(mid)) {
49 // If true, the median is at most mid, search in left half
50 right = mid;
51 } else {
52 // If false, the median is greater than mid, search in right half
53 left = mid + 1;
54 }
55 }
56
57 return left;
58 }
59};
60
1/**
2 * Finds the median of the uniqueness array for all subarrays
3 * @param nums - The input array of numbers
4 * @returns The median value of the uniqueness array
5 */
6function medianOfUniquenessArray(nums: number[]): number {
7 const arrayLength: number = nums.length;
8 // Total number of subarrays: n*(n+1)/2
9 const totalSubarrays: number = Math.floor(((1 + arrayLength) * arrayLength) / 2);
10
11 // Binary search bounds for the median value
12 let left: number = 0;
13 let right: number = arrayLength;
14
15 /**
16 * Checks if at least half of all subarrays have uniqueness <= maxUniqueness
17 * @param maxUniqueness - The maximum allowed number of unique elements
18 * @returns True if at least half of subarrays have uniqueness <= maxUniqueness
19 */
20 const checkIfMedianCandidate = (maxUniqueness: number): boolean => {
21 // Map to track frequency of elements in current window
22 const frequencyMap = new Map<number, number>();
23 let windowStart: number = 0;
24 let subarrayCount: number = 0;
25
26 // Sliding window approach to count subarrays with uniqueness <= maxUniqueness
27 for (let windowEnd = 0; windowEnd < arrayLength; ++windowEnd) {
28 const currentElement: number = nums[windowEnd];
29
30 // Add current element to frequency map
31 frequencyMap.set(currentElement, (frequencyMap.get(currentElement) || 0) + 1);
32
33 // Shrink window if unique count exceeds maxUniqueness
34 while (frequencyMap.size > maxUniqueness) {
35 const elementToRemove: number = nums[windowStart++];
36 const updatedFrequency: number = frequencyMap.get(elementToRemove)! - 1;
37 frequencyMap.set(elementToRemove, updatedFrequency);
38
39 // Remove element from map if frequency becomes 0
40 if (updatedFrequency === 0) {
41 frequencyMap.delete(elementToRemove);
42 }
43 }
44
45 // All subarrays ending at windowEnd with start >= windowStart are valid
46 subarrayCount += windowEnd - windowStart + 1;
47
48 // Check if we've found at least half of all subarrays
49 if (subarrayCount >= Math.floor((totalSubarrays + 1) / 2)) {
50 return true;
51 }
52 }
53
54 return false;
55 };
56
57 // Binary search for the minimum value where at least half subarrays have that uniqueness
58 while (left < right) {
59 const mid: number = (left + right) >> 1;
60
61 if (checkIfMedianCandidate(mid)) {
62 // If mid works, try to find a smaller value
63 right = mid;
64 } else {
65 // If mid doesn't work, we need a larger value
66 left = mid + 1;
67 }
68 }
69
70 return left;
71}
72
Time and Space Complexity
The time complexity is O(n × log n)
, where n
is the length of the array nums
.
Time Complexity Analysis:
- The outer binary search using
bisect_left
searches over the range[0, n)
, which takesO(log n)
iterations. - For each binary search iteration, the
check
function is called. - Inside the
check
function:- There's a sliding window approach with two pointers
l
andr
that traverse the array once. - Each element is visited at most twice (once when
r
moves right, once whenl
moves right). - The operations inside the loop (dictionary updates, comparisons) are
O(1)
on average. - Therefore, each call to
check
takesO(n)
time.
- There's a sliding window approach with two pointers
- Total time complexity:
O(log n) × O(n) = O(n × log n)
.
The space complexity is O(n)
.
Space Complexity Analysis:
- The
cnt
dictionary (defaultdict) stores the frequency of distinct elements in the current window. - In the worst case, all elements in the array are distinct, so the dictionary can contain up to
n
entries. - The binary search and other variables use
O(1)
additional space. - Total space complexity:
O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Median Position Calculation
Pitfall: Using total_subarrays // 2
instead of (total_subarrays + 1) // 2
for the median position. This leads to finding the wrong element in the sorted uniqueness array.
Why it happens: For even-length arrays, we need the smaller of the two middle elements. The position (total_subarrays + 1) // 2
correctly gives us this position (1-indexed), which translates to checking if at least this many subarrays have uniqueness ≤ our target value.
Solution: Always use (total_subarrays + 1) // 2
for the median position check.
2. Off-by-One Error in Binary Search Range
Pitfall: Setting the binary search range as bisect_left(range(n + 1), ...)
instead of bisect_left(range(n), ...)
. This searches for values from 0 to n, but the check function expects values from 0 to n-1.
Why it happens: Confusion about whether the maximum distinct elements should be n or n-1. Since we're looking for the first value where the condition is satisfied, and the maximum possible distinct elements is n, we should search in range(n).
Solution: Use range(n)
which searches from 0 to n-1, and the function will automatically return n if no value in this range satisfies the condition.
3. Memory Leak with defaultdict
Pitfall: Not removing elements from the dictionary when their count reaches 0, causing unnecessary memory usage and incorrect distinct element counting.
# Wrong approach
while len(element_count) > max_unique:
element_count[nums[left]] -= 1
# Missing: del element_count[nums[left]] when count is 0
left += 1
Solution: Always remove dictionary entries when their count reaches 0:
while len(element_count) > max_unique:
element_count[nums[left]] -= 1
if element_count[nums[left]] == 0:
del element_count[nums[left]]
left += 1
4. Misunderstanding the Binary Search Target
Pitfall: Thinking we're searching for the exact median value rather than the minimum number of distinct elements that satisfies our condition.
Why it happens: The problem asks for the median of uniqueness values, but the binary search finds the minimum value where at least half of the subarrays have that many or fewer distinct elements.
Solution: Remember that bisect_left
finds the first (smallest) value where the condition becomes True, which correctly gives us the median of the sorted uniqueness array.
Which data structure is used to implement recursion?
Recommended Readings
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
https assets algo monster cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in the window The window
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Want a Structured Path to Master System Design Too? Don’t Miss This!