Facebook Pixel

3134. Find the Median of the Uniqueness Array


Problem Description

You are given an integer array nums. The uniqueness array of nums is the sorted array that contains the number of distinct elements of all the subarrays of nums. In other words, it is a sorted array consisting of distinct(nums[i..j]), for all 0 <= i <= j < nums.length.

Here, distinct(nums[i..j]) denotes the number of distinct elements in the subarray that starts at index i and ends at index j.

Return the median of the uniqueness array of nums.

Note that the median of an array is defined as the middle element of the array when it is sorted in non-decreasing order. If there are two choices for a median, the smaller of the two values is taken.

Intuition

To solve this problem, we need to understand the structure of the uniqueness array that records the number of distinct elements in all possible subarrays. For an array nums of length n, the length of the uniqueness array is m = (1 + n) * n / 2, which represents the total number of subarrays that can be formed from nums.

Given this, we need to find the median value of these unique counts. The challenge here is the potentially large number of subarrays, which can lead to inefficiencies if calculated directly. However, the properties of these subarrays regarding their unique counts are monotonic. As more elements are added to a subarray, the count of unique elements either increases or remains the same.

Utilizing this monotonic property, we can employ a binary search strategy to find the median of the uniqueness array. By defining a helper function check(mx), we attempt to determine how many subarrays have distinct element counts less than or equal to a certain threshold mx.

The plan is to find the first mx where at least half of the subarrays have distinct counts less than or equal to mx, thus finding the median by binary search from 0 to n. By leveraging two pointers and a hashmap in the check(mx) function, we maintain a sliding window to count distinct elements efficiently, shifting the window as necessary to ensure we only count valid subarrays for each potential median value during the binary search.

Learn more about Binary Search and Sliding Window patterns.

Solution Approach

We approach the problem using a combination of binary search and a two-pointer technique. Here’s how the solution is implemented:

  1. Binary Search for Median Determination:

    • Define the length of the uniqueness array, m = (1 + n) * n / 2, where n is the length of the input array nums.
    • We're looking for the median, which is the (m + 1) / 2-th smallest number in the uniqueness array.
    • Establish a binary search range from 0 to n, where each midpoint (mid) represents a potential median value.
  2. Checking Validity with Two Pointers:

    • Implement a helper function check(mx) which returns True if the number of subarrays with unique counts less than or equal to mx is at least (m + 1) / 2.
    • Within check(mx), use a dictionary cnt to keep track of the count of elements in the current window.
    • Use two pointers, l (left) and r (right), to manage the sliding window on nums. Start both pointers at the beginning of the array.
    • As r iterates over nums, add nums[r] to the window and update cnt.
    • Adjust the left pointer l to maintain the condition that the number of distinct elements in the current window does not exceed mx.
    • Count all valid subarrays ending at r using r - l + 1 once the constraint is satisfied.
    • If the cumulative count k of these valid subarrays reaches or exceeds (m + 1) / 2, return True.
  3. Finding the Median:

    • Use bisect_left with the range [0, n) where check(mx) is used as the key to find the smallest mx that satisfies the condition from step 2.
    • The result of bisect_left is the median of the uniqueness array.

This approach effectively combines the precision of binary search with the efficiency of the two-pointer technique to handle potentially large input arrays and their subarray combinations.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through an example using a small input array to understand the solution approach.

Example Input:

nums = [1, 2, 1]

Step-by-step Explanation:

  1. Identify Subarrays and Their Unique Counts:

    • Subarrays of nums and their distinct counts:

      • [1] → 1 distinct element
      • [2] → 1 distinct element
      • [1, 2] → 2 distinct elements
      • [2, 1] → 2 distinct elements
      • [1, 2, 1] → 2 distinct elements
      • [1, 1] → 1 distinct element
    • Uniqueness array in unsorted form: [1, 1, 2, 2, 2, 1]

    • Sorted Uniqueness array: [1, 1, 1, 2, 2, 2]

  2. Calculate Median of Uniqueness Array:

    • Total number of subarrays m = (1 + 3) * 3 / 2 = 6

    • We are looking for the 4th smallest number (since (m + 1) / 2 = 4).

    • From [1, 1, 1, 2, 2, 2], the 4th smallest number is 2.

    Hence, the median of the uniqueness array is 2.

  3. Using Binary Search to Find Median:

    • Define the binary search range from 0 to 3 (length of nums).
    • Implement the check(mx) function to determine if enough subarrays have distinct counts ≤ mx.
  4. Implementation of check(mx):

    • Assume mx starts at 1.
    • Initialize a hashmap cnt and two pointers l = 0, r = 0 for sliding the window.
    • Iterate while maintaining mx number of distinct elements in the window.
    • Count valid subarrays where distinct elements ≤ mx.
    • If the count meets or exceeds (m + 1) / 2, adjust mx accordingly via binary search.
  5. Finding the Median using bisect_left:

    • Continue iterations to pinpoint where exactly mx allows the half (3) of all subarrays.
    • Once you identify the smallest mx where subarrays with enough distinct values is possible, you've found the median.

This step-by-step process uses binary search to efficiently determine the median of the uniqueness array, employing a two-pointer strategy to manage subarrays while counting unique elements.

Solution Implementation

1from typing import List
2from collections import defaultdict
3from bisect import bisect_left
4
5class Solution:
6    def medianOfUniquenessArray(self, nums: List[int]) -> int:
7        # Function to check if the current maximum number of unique elements 'mx' is possible
8        def check(mx: int) -> bool:
9            count = defaultdict(int)
10            k = l = 0
11            # Iterate over the array
12            for r, x in enumerate(nums):
13                count[x] += 1  # Increase the count for the element at index 'r'
14              
15                # if the count of unique elements is greater than mx, shrink the window from the left
16                while len(count) > mx:
17                    y = nums[l]
18                    count[y] -= 1
19                    if count[y] == 0:
20                        count.pop(y)
21                    l += 1  # Move the left boundary of the window to the right
22              
23                k += r - l + 1  # Calculate the number of subarrays with at most 'mx' unique elements
24                # Check if the number of subarrays reaches at least half of the total subarrays
25                if k >= (m + 1) // 2:
26                    return True
27            return False
28
29        n = len(nums)  # Get the length of the input array
30        m = (1 + n) * n // 2  # Calculate the total number of subarrays in 'nums'
31
32        # Use binary search to find the minimum 'mx' that allows enough subarrays with the condition
33        return bisect_left(range(n), True, key=check)
34
1import java.util.HashMap;
2import java.util.Map;
3
4class Solution {
5    private long medianThreshold;
6    private int[] nums;
7
8    // Method to calculate the median of the uniqueness of an array using a binary search approach
9    public int medianOfUniquenessArray(int[] nums) {
10        int arrayLength = nums.length;
11        this.nums = nums;
12        // Calculate the threshold for the median
13        medianThreshold = (1L + arrayLength) * arrayLength / 2;
14        int left = 0, right = arrayLength;
15
16        // Perform binary search to find the smallest 'mx' that satisfies the condition
17        while (left < right) {
18            int mid = (left + right) >> 1; // Equivalent to (left + right) / 2
19            if (check(mid)) {
20                right = mid;
21            } else {
22                left = mid + 1;
23            }
24        }
25        return left;
26    }
27
28    // Helper function to check if there is a subarray with at most 'mx' unique elements whose subarray sum meets the requirement
29    private boolean check(int maxUnique) {
30        Map<Integer, Integer> countMap = new HashMap<>();
31        long currentSum = 0;
32      
33        // Use two pointers to manage the sliding window
34        for (int left = 0, right = 0; right < nums.length; ++right) {
35            int currentElement = nums[right];
36            countMap.merge(currentElement, 1, Integer::sum);
37
38            // Adjust the left pointer to maintain at most 'maxUnique' unique elements in the current window
39            while (countMap.size() > maxUnique) {
40                int leftElement = nums[left++];
41                if (countMap.merge(leftElement, -1, Integer::sum) == 0) {
42                    countMap.remove(leftElement);
43                }
44            }
45
46            // Update the current sum with the number of subarrays ending at 'right'
47            currentSum += right - left + 1;
48
49            // Check if the current sum is sufficient to return true
50            if (currentSum >= (medianThreshold + 1) / 2) {
51                return true;
52            }
53        }
54        return false;
55    }
56}
57
1#include <vector>
2#include <unordered_map>
3#include <climits>
4
5class Solution {
6public:
7    int medianOfUniquenessArray(std::vector<int>& nums) {
8        int n = nums.size();
9        using ll = long long;
10        ll totalSubarrays = (1LL + n) * n / 2;  // Calculate the total number of subarrays
11        int left = 0, right = n;
12
13        // Define a lambda function to check if a given maximum number of unique elements works
14        auto isValid = [&](int maxUnique) -> bool {
15            std::unordered_map<int, int> elementCount;
16            ll subarrayCount = 0;
17
18            for (int leftPointer = 0, rightPointer = 0; rightPointer < n; ++rightPointer) {
19                int currentNum = nums[rightPointer];
20                ++elementCount[currentNum];  // Increment the count for the current number
21
22                // Ensure the size of the unique elements doesn't exceed maxUnique
23                while (elementCount.size() > maxUnique) {
24                    int leftNum = nums[leftPointer++];
25                    if (--elementCount[leftNum] == 0) {
26                        elementCount.erase(leftNum);
27                    }
28                }
29
30                // Count subarrays with the current right end
31                subarrayCount += rightPointer - leftPointer + 1;
32
33                // If the number of subarrays is at least half of the total, return true
34                if (subarrayCount >= (totalSubarrays + 1) / 2) {
35                    return true;
36                }
37            }
38            return false; // Return false otherwise
39        };
40
41        // Perform a binary search to find the minimum maxUnique
42        while (left < right) {
43            int mid = (left + right) >> 1;  // Compute the midpoint
44            if (isValid(mid)) {
45                right = mid;  // If valid, try a smaller value
46            } else {
47                left = mid + 1;  // If not valid, increase the lower bound
48            }
49        }
50
51        return left;  // Return the smallest number of unique elements that satisfies the condition
52    }
53};
54
1function medianOfUniquenessArray(nums: number[]): number {
2    const n = nums.length;
3    // Calculate the total number of subarrays in the form of n * (n + 1) / 2
4    const totalSubarrays = Math.floor(((1 + n) * n) / 2);
5    // Set initial binary search boundaries
6    let left = 0;
7    let right = n;
8
9    // Helper function to determine if there exists a subarray with median uniqueness of at least mx
10    const check = (maximumUniqueCount: number): boolean => {
11        const countMap = new Map<number, number>(); // Map to hold count of elements
12        let leftPointer = 0;
13        let subarrayCount = 0;
14
15        for (let rightPointer = 0; rightPointer < n; ++rightPointer) {
16            const currentNumber = nums[rightPointer];
17            countMap.set(currentNumber, (countMap.get(currentNumber) || 0) + 1);
18
19            // Maintain subarray with maximum unique elements not exceeding maximumUniqueCount
20            while (countMap.size > maximumUniqueCount) {
21                const leftNumber = nums[leftPointer++];
22                countMap.set(leftNumber, countMap.get(leftNumber)! - 1);
23
24                if (countMap.get(leftNumber) === 0) {
25                    countMap.delete(leftNumber);
26                }
27            }
28
29            // Calculate how many subarrays end at rightPointer and start between leftPointer and rightPointer
30            subarrayCount += rightPointer - leftPointer + 1;
31
32            // Check if at least half of the total subarrays meet the condition
33            if (subarrayCount >= Math.floor((totalSubarrays + 1) / 2)) {
34                return true;
35            }
36        }
37        return false;
38    };
39
40    // Binary search to find the minimum uniqueness level
41    while (left < right) {
42        const mid = (left + right) >> 1;
43        if (check(mid)) {
44            right = mid; // If condition met, adjust right boundary
45        } else {
46            left = mid + 1; // If not met, adjust left boundary
47        }
48    }
49
50    return left; // Return the median of uniqueness
51}
52

Time and Space Complexity

The time complexity of the code is O(n × log n), where n is the length of the array nums. This complexity arises from the use of the bisect_left function, which applies a binary search strategy over n possible values, and each invocation of the check function individually takes O(n) time due to the sliding window logic that checks counting frequency of elements and adjusts the window size.

The space complexity of the code is O(n). This is because a defaultdict is used to keep track of the frequency of elements within the sliding window, and in the worst-case scenario, this dictionary could store entries for every unique element in the array.

Learn more about how to find time and space complexity quickly.


Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:
Question 1 out of 10

Which technique can we use to find the middle of a linked list?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!


Load More