Facebook Pixel

3134. Find the Median of the Uniqueness Array

Problem Description

You are given an integer array nums. Your task is to find the median of a special array called the "uniqueness array".

The uniqueness array is constructed as follows:

  • For every possible subarray of nums (including single elements and the entire array), count the number of distinct elements in that subarray
  • Collect all these counts into an array
  • Sort this array in non-decreasing order

For example, if nums = [1, 2, 1]:

  • Subarray [1] has 1 distinct element
  • Subarray [2] has 1 distinct element
  • Subarray [1] has 1 distinct element
  • Subarray [1, 2] has 2 distinct elements
  • Subarray [2, 1] has 2 distinct elements
  • Subarray [1, 2, 1] has 2 distinct elements
  • The uniqueness array would be [1, 1, 1, 2, 2, 2] after sorting

The median of the uniqueness array is:

  • The middle element if the array has an odd number of elements
  • The smaller of the two middle elements if the array has an even number of elements

Since there are n * (n + 1) / 2 total subarrays for an array of length n, the uniqueness array will have exactly this many elements.

Your goal is to return the median value from this uniqueness array without actually constructing the entire array (which would be inefficient for large inputs).

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we don't need to generate the entire uniqueness array to find its median. Instead, we can use binary search to directly find the median value.

Think about it this way: if we pick any value x, we can count how many elements in the uniqueness array are less than or equal to x. For the median, we need the value where exactly (m + 1) / 2 elements are less than or equal to it, where m = n * (n + 1) / 2 is the total number of subarrays.

The crucial observation is that as x increases, the count of elements ≤ x also increases (monotonic property). This makes binary search perfect for finding the smallest x where at least (m + 1) / 2 elements are ≤ x.

But how do we count subarrays with at most x distinct elements efficiently? Here's where the sliding window technique comes in. We can maintain a window that expands to the right and contracts from the left to ensure it never has more than x distinct elements.

For each position r as the right endpoint:

  • We expand the window to include nums[r]
  • If the window has more than x distinct elements, we shrink from the left until it's valid again
  • Once valid, all subarrays ending at r with starting positions from l to r are valid (that's r - l + 1 subarrays)

This two-pointer approach lets us count all valid subarrays in O(n) time for a given x. Combined with binary search over possible values of x (which ranges from 1 to n), we get an efficient solution without explicitly constructing the massive uniqueness array.

The beauty of this approach is that we transform the problem from "find the median of a huge array" to "find the first value where enough subarrays satisfy a condition" - a much more manageable task.

Learn more about Binary Search and Sliding Window patterns.

Solution Approach

The solution uses binary search combined with a sliding window technique to efficiently find the median without constructing the entire uniqueness array.

Binary Search Setup

First, we calculate the total number of subarrays: m = n * (n + 1) / 2. The median will be at position (m + 1) / 2 in the sorted uniqueness array.

We set up binary search with:

  • Left boundary: l = 0 (minimum possible distinct elements)
  • Right boundary: r = n (maximum possible distinct elements)

We search for the smallest value mid such that at least (m + 1) / 2 subarrays have at most mid distinct elements.

The Check Function

The check(mx) function determines if there are at least (m + 1) / 2 subarrays with at most mx distinct elements.

Data Structures:

  • cnt: A dictionary tracking the frequency of each element in the current window
  • k: Counter for the number of valid subarrays found
  • l, r: Left and right pointers for the sliding window

Algorithm:

  1. For each position r from 0 to n-1:
    • Add nums[r] to the window: cnt[nums[r]] += 1
  2. While the window has more than mx distinct elements (len(cnt) > mx):
    • Remove elements from the left:
      • Decrement cnt[nums[l]]
      • If count becomes 0, remove the key from cnt
      • Move left pointer: l += 1
  3. Count valid subarrays:
    • All subarrays ending at r with start positions from l to r are valid
    • Add r - l + 1 to our counter k
  4. Early termination:
    • If k >= (m + 1) / 2, return True (we've found enough subarrays)
  5. If we finish the loop without finding enough subarrays, return False

Binary Search Logic

The main function uses bisect_left to find the first value where check returns True:

  • bisect_left(range(n), True, key=check) searches through values 0 to n-1
  • It finds the leftmost position where check(x) evaluates to True
  • This gives us the smallest number of distinct elements such that at least half of the subarrays have that many or fewer distinct elements

Time Complexity

Space Complexity

  • O(n) for the cnt dictionary in the worst case (all elements are distinct)

The elegance of this solution lies in avoiding the explicit construction of the uniqueness array while still finding its median through careful counting and binary search.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with nums = [1, 2, 1].

Step 1: Setup

  • Array length n = 3
  • Total subarrays: m = 3 * 4 / 2 = 6
  • Median position: (6 + 1) / 2 = 3.5, so we need the 4th smallest element (ceiling of 3.5)
  • Binary search range: l = 0, r = 3

Step 2: Binary Search Iterations

First iteration: mid = (0 + 3) / 2 = 1

  • Check if at least 4 subarrays have ≤ 1 distinct element
  • Using sliding window with mx = 1:
    • Window [1]: 1 distinct element, valid → count = 1
    • Window [2]: 1 distinct element, valid → count = 2
    • Window [1]: 1 distinct element, valid → count = 3
    • Total: 3 subarrays with ≤ 1 distinct element
  • Since 3 < 4, check(1) = False
  • Update: l = 2 (search in upper half)

Second iteration: mid = (2 + 3) / 2 = 2

  • Check if at least 4 subarrays have ≤ 2 distinct elements
  • Using sliding window with mx = 2:
    • r = 0: Window [1] → 1 distinct, add 1 subarray → count = 1
    • r = 1: Window [1,2] → 2 distinct, add 2 subarrays ([2] and [1,2]) → count = 3
    • r = 2: Window [1,2,1] → 2 distinct, add 3 subarrays ([1], [2,1], [1,2,1]) → count = 6
    • Total: 6 subarrays with ≤ 2 distinct elements
  • Since 6 ≥ 4, check(2) = True
  • Update: r = 2 (search in lower half to find smallest valid value)

Third iteration: Binary search converges

  • We've found that 2 is the smallest value where at least 4 subarrays have that many or fewer distinct elements
  • This means the median is 2

Verification: The actual uniqueness array (sorted) would be:

  • [1] → 1 distinct
  • [2] → 1 distinct
  • [1] → 1 distinct
  • [1,2] → 2 distinct
  • [2,1] → 2 distinct
  • [1,2,1] → 2 distinct

Sorted: [1, 1, 1, 2, 2, 2] The 4th element (median for even-length array) is indeed 2.

The algorithm correctly found the median without constructing the entire array!

Solution Implementation

1from typing import List
2from collections import defaultdict
3from bisect import bisect_left
4
5class Solution:
6    def medianOfUniquenessArray(self, nums: List[int]) -> int:
7        def can_achieve_median(max_unique: int) -> bool:
8            """
9            Check if we can have at least half of all subarrays 
10            with uniqueness <= max_unique
11            """
12            # Dictionary to track frequency of elements in current window
13            element_count = defaultdict(int)
14            subarrays_count = 0
15            left = 0
16          
17            # Sliding window approach
18            for right, current_num in enumerate(nums):
19                # Add current element to window
20                element_count[current_num] += 1
21              
22                # Shrink window if unique elements exceed max_unique
23                while len(element_count) > max_unique:
24                    left_num = nums[left]
25                    element_count[left_num] -= 1
26                    if element_count[left_num] == 0:
27                        del element_count[left_num]
28                    left += 1
29              
30                # Count subarrays ending at 'right' with unique elements <= max_unique
31                subarrays_count += right - left + 1
32              
33                # Check if we've reached the median position
34                if subarrays_count >= (total_subarrays + 1) // 2:
35                    return True
36          
37            return False
38      
39        # Calculate total number of subarrays
40        n = len(nums)
41        total_subarrays = n * (n + 1) // 2
42      
43        # Binary search for the minimum uniqueness value that can be the median
44        # Using bisect_left to find the leftmost position where check returns True
45        return bisect_left(range(n), True, key=can_achieve_median)
46
1class Solution {
2    private long totalSubarrays;
3    private int[] nums;
4
5    /**
6     * Finds the median of uniqueness values for all subarrays.
7     * Uses binary search to find the minimum uniqueness value such that
8     * at least half of all subarrays have uniqueness <= this value.
9     * 
10     * @param nums the input array
11     * @return the median uniqueness value
12     */
13    public int medianOfUniquenessArray(int[] nums) {
14        int n = nums.length;
15        this.nums = nums;
16      
17        // Calculate total number of subarrays: n*(n+1)/2
18        totalSubarrays = (1L + n) * n / 2;
19      
20        // Binary search on the answer (uniqueness value)
21        int left = 0;
22        int right = n;
23      
24        while (left < right) {
25            int mid = (left + right) >> 1;
26          
27            // Check if there are at least (totalSubarrays + 1) / 2 subarrays
28            // with uniqueness <= mid
29            if (hasEnoughSubarraysWithMaxUniqueness(mid)) {
30                right = mid;  // Try to find a smaller valid answer
31            } else {
32                left = mid + 1;  // Need a larger uniqueness value
33            }
34        }
35      
36        return left;
37    }
38
39    /**
40     * Checks if at least half of all subarrays have uniqueness <= maxUniqueness.
41     * Uses sliding window technique to count subarrays with at most maxUniqueness distinct elements.
42     * 
43     * @param maxUniqueness the maximum allowed number of distinct elements
44     * @return true if at least half of subarrays have uniqueness <= maxUniqueness
45     */
46    private boolean hasEnoughSubarraysWithMaxUniqueness(int maxUniqueness) {
47        Map<Integer, Integer> frequencyMap = new HashMap<>();
48        long subarrayCount = 0;
49      
50        // Sliding window approach
51        int left = 0;
52        for (int right = 0; right < nums.length; right++) {
53            // Add current element to the window
54            int currentElement = nums[right];
55            frequencyMap.merge(currentElement, 1, Integer::sum);
56          
57            // Shrink window from left while number of distinct elements > maxUniqueness
58            while (frequencyMap.size() > maxUniqueness) {
59                int leftElement = nums[left];
60                left++;
61              
62                // Decrease frequency and remove if it becomes 0
63                if (frequencyMap.merge(leftElement, -1, Integer::sum) == 0) {
64                    frequencyMap.remove(leftElement);
65                }
66            }
67          
68            // Count all subarrays ending at right with at most maxUniqueness distinct elements
69            // These are: [left, right], [left+1, right], ..., [right, right]
70            subarrayCount += right - left + 1;
71          
72            // Check if we've found enough subarrays (at least half of total)
73            if (subarrayCount >= (totalSubarrays + 1) / 2) {
74                return true;
75            }
76        }
77      
78        return false;
79    }
80}
81
1class Solution {
2public:
3    int medianOfUniquenessArray(vector<int>& nums) {
4        int n = nums.size();
5        using ll = long long;
6      
7        // Total number of subarrays: n*(n+1)/2
8        ll totalSubarrays = (1LL + n) * n / 2;
9      
10        // Binary search range: minimum distinct count is 1, maximum is n
11        int left = 1, right = n;
12      
13        // Lambda function to check if there are at least (totalSubarrays+1)/2 subarrays
14        // with at most 'maxDistinct' unique elements
15        auto countSubarraysWithAtMostKDistinct = [&](int maxDistinct) -> bool {
16            unordered_map<int, int> frequencyMap;
17            ll subarrayCount = 0;
18          
19            // Sliding window approach
20            for (int windowStart = 0, windowEnd = 0; windowEnd < n; ++windowEnd) {
21                // Add current element to the window
22                int currentElement = nums[windowEnd];
23                ++frequencyMap[currentElement];
24              
25                // Shrink window if number of distinct elements exceeds maxDistinct
26                while (frequencyMap.size() > maxDistinct) {
27                    int leftElement = nums[windowStart++];
28                    if (--frequencyMap[leftElement] == 0) {
29                        frequencyMap.erase(leftElement);
30                    }
31                }
32              
33                // Count all subarrays ending at windowEnd with at most maxDistinct unique elements
34                subarrayCount += windowEnd - windowStart + 1;
35              
36                // Check if we've found enough subarrays (at least half of total)
37                if (subarrayCount >= (totalSubarrays + 1) / 2) {
38                    return true;
39                }
40            }
41            return false;
42        };
43      
44        // Binary search to find the minimum value where at least half of subarrays
45        // have that many or fewer distinct elements
46        while (left < right) {
47            int mid = (left + right) >> 1;
48            if (countSubarraysWithAtMostKDistinct(mid)) {
49                // If true, the median is at most mid, search in left half
50                right = mid;
51            } else {
52                // If false, the median is greater than mid, search in right half
53                left = mid + 1;
54            }
55        }
56      
57        return left;
58    }
59};
60
1/**
2 * Finds the median of the uniqueness array for all subarrays
3 * @param nums - The input array of numbers
4 * @returns The median value of the uniqueness array
5 */
6function medianOfUniquenessArray(nums: number[]): number {
7    const arrayLength: number = nums.length;
8    // Total number of subarrays: n*(n+1)/2
9    const totalSubarrays: number = Math.floor(((1 + arrayLength) * arrayLength) / 2);
10  
11    // Binary search bounds for the median value
12    let left: number = 0;
13    let right: number = arrayLength;
14  
15    /**
16     * Checks if at least half of all subarrays have uniqueness <= maxUniqueness
17     * @param maxUniqueness - The maximum allowed number of unique elements
18     * @returns True if at least half of subarrays have uniqueness <= maxUniqueness
19     */
20    const checkIfMedianCandidate = (maxUniqueness: number): boolean => {
21        // Map to track frequency of elements in current window
22        const frequencyMap = new Map<number, number>();
23        let windowStart: number = 0;
24        let subarrayCount: number = 0;
25      
26        // Sliding window approach to count subarrays with uniqueness <= maxUniqueness
27        for (let windowEnd = 0; windowEnd < arrayLength; ++windowEnd) {
28            const currentElement: number = nums[windowEnd];
29          
30            // Add current element to frequency map
31            frequencyMap.set(currentElement, (frequencyMap.get(currentElement) || 0) + 1);
32          
33            // Shrink window if unique count exceeds maxUniqueness
34            while (frequencyMap.size > maxUniqueness) {
35                const elementToRemove: number = nums[windowStart++];
36                const updatedFrequency: number = frequencyMap.get(elementToRemove)! - 1;
37                frequencyMap.set(elementToRemove, updatedFrequency);
38              
39                // Remove element from map if frequency becomes 0
40                if (updatedFrequency === 0) {
41                    frequencyMap.delete(elementToRemove);
42                }
43            }
44          
45            // All subarrays ending at windowEnd with start >= windowStart are valid
46            subarrayCount += windowEnd - windowStart + 1;
47          
48            // Check if we've found at least half of all subarrays
49            if (subarrayCount >= Math.floor((totalSubarrays + 1) / 2)) {
50                return true;
51            }
52        }
53      
54        return false;
55    };
56  
57    // Binary search for the minimum value where at least half subarrays have that uniqueness
58    while (left < right) {
59        const mid: number = (left + right) >> 1;
60      
61        if (checkIfMedianCandidate(mid)) {
62            // If mid works, try to find a smaller value
63            right = mid;
64        } else {
65            // If mid doesn't work, we need a larger value
66            left = mid + 1;
67        }
68    }
69  
70    return left;
71}
72

Time and Space Complexity

The time complexity is O(n × log n), where n is the length of the array nums.

Time Complexity Analysis:

  • The outer binary search using bisect_left searches over the range [0, n), which takes O(log n) iterations.
  • For each binary search iteration, the check function is called.
  • Inside the check function:
    • There's a sliding window approach with two pointers l and r that traverse the array once.
    • Each element is visited at most twice (once when r moves right, once when l moves right).
    • The operations inside the loop (dictionary updates, comparisons) are O(1) on average.
    • Therefore, each call to check takes O(n) time.
  • Total time complexity: O(log n) × O(n) = O(n × log n).

The space complexity is O(n).

Space Complexity Analysis:

  • The cnt dictionary (defaultdict) stores the frequency of distinct elements in the current window.
  • In the worst case, all elements in the array are distinct, so the dictionary can contain up to n entries.
  • The binary search and other variables use O(1) additional space.
  • Total space complexity: O(n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Median Position Calculation

Pitfall: Using total_subarrays // 2 instead of (total_subarrays + 1) // 2 for the median position. This leads to finding the wrong element in the sorted uniqueness array.

Why it happens: For even-length arrays, we need the smaller of the two middle elements. The position (total_subarrays + 1) // 2 correctly gives us this position (1-indexed), which translates to checking if at least this many subarrays have uniqueness ≤ our target value.

Solution: Always use (total_subarrays + 1) // 2 for the median position check.

2. Off-by-One Error in Binary Search Range

Pitfall: Setting the binary search range as bisect_left(range(n + 1), ...) instead of bisect_left(range(n), ...). This searches for values from 0 to n, but the check function expects values from 0 to n-1.

Why it happens: Confusion about whether the maximum distinct elements should be n or n-1. Since we're looking for the first value where the condition is satisfied, and the maximum possible distinct elements is n, we should search in range(n).

Solution: Use range(n) which searches from 0 to n-1, and the function will automatically return n if no value in this range satisfies the condition.

3. Memory Leak with defaultdict

Pitfall: Not removing elements from the dictionary when their count reaches 0, causing unnecessary memory usage and incorrect distinct element counting.

# Wrong approach
while len(element_count) > max_unique:
    element_count[nums[left]] -= 1
    # Missing: del element_count[nums[left]] when count is 0
    left += 1

Solution: Always remove dictionary entries when their count reaches 0:

while len(element_count) > max_unique:
    element_count[nums[left]] -= 1
    if element_count[nums[left]] == 0:
        del element_count[nums[left]]
    left += 1

4. Misunderstanding the Binary Search Target

Pitfall: Thinking we're searching for the exact median value rather than the minimum number of distinct elements that satisfies our condition.

Why it happens: The problem asks for the median of uniqueness values, but the binary search finds the minimum value where at least half of the subarrays have that many or fewer distinct elements.

Solution: Remember that bisect_left finds the first (smallest) value where the condition becomes True, which correctly gives us the median of the sorted uniqueness array.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement recursion?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More