2856. Minimum Array Length After Pair Removals


Problem Description

You are provided with a sorted array of integers nums, where the sorting is in non-decreasing order. The problem presents an operation that can be performed repeatedly, as many times as you wish. Each operation involves picking two distinct indices i and j (where i < j) such that nums[i] < nums[j], and then removing the elements at both of these indices from the array. The goal is to determine the minimum length of the array nums after doing this operation as many times as possible (you could also opt to not perform any operations).

Intuition

The solution to this problem involves recognizing that you can always remove pairs of distinct numbers due to the array's sorted nature. Since the array is sorted, nums[i] < nums[j] for any i < j where the elements are different.

A key observation here is that when you remove a pair of numbers, only two elements are removed. Therefore, to reach the minimum length of the array, you want to maximize the number of such removal operations. However, you need to ensure that you're pairing elements effectively.

How do we maximize the number of removals? Let's consider the frequency of each number in the sorted array. If you have a number that appears, say, three times, you can only form one complete pair out of it, leaving one instance of the number unpaired. It's clear that each unpaired number will remain in the final array, adding to its length.

The intuition behind the provided Python solution involves using a max-heap to efficiently pick out the two numbers with the highest frequencies for the pairing—each removal of a pair will decrease the count of these numbers by one. By continuously pairing off the numbers with the highest frequency (and hence the highest chance of having unpaired elements left over), we ensure that a minimal number of elements remain unpaired.

After all possible pairings, you are left with either an empty heap (all elements have been paired off), or a heap with one element remaining (an element that couldn't be paired). If the heap is not empty, the final count of the remaining element contributes to the final size of the array post-operations.

The answer is thus the initial length of the array minus twice the number of successful pairings (since two elements are removed for each pairing).

Learn more about Greedy, Two Pointers and Binary Search patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which two pointer techniques do you use to check if a string is a palindrome?

Solution Approach

The solution follows a greedy approach using a priority queue (max-heap) to perform operations that effectively pair off elements. Here's how the algorithm is implemented step-by-step:

  1. Frequency Count: First, it counts the frequency of each unique number in the nums array using a Counter from Python's collections module. This is because the frequency of numbers will determine how many pairs can be formed. Numbers with a frequency of 1 cannot be paired, while those with higher frequencies can be paired multiple times.

  2. Convert to Max-Heap: It then converts the frequency counts into a max-heap using the -x technique because Python's heapq module provides a min-heap by default. By negating the values, the solution turns it into a max-heap for our purpose. The heapq.heapify function is used to transform the list of frequencies into a heap, which now lets us easily access the two largest frequencies.

  3. Pair Up Elements: The algorithm then enters a loop that runs as long as there is more than one element left in the max-heap. In each iteration, it pops the two largest elements (which are the two most frequent numbers in the array), decrements their count to reflect that a pair has been formed and removes them from the array, and then pushes the decremented frequencies back into the heap if they are greater than zero. This ensures that each time we are attempting to pair off the most frequent numbers first, gradually decreasing their frequency and thus incrementally reducing the array size by two elements for each pair.

  4. Decrease Array Length: After successfully pairing two elements, the solution decreases the running total length of the array ans by 2 since that's how many elements are removed by the performed operation.

  5. Remaining Elements: Once the loop finishes, the resulting length (ans) represents the minimum length of the array after performing the maximum number of valid operations. If one element is left in the heap, it cannot be paired and will remain in the final array.

In mathematical terms, if f(e) is the frequency of element e, the operation essentially reduces both f(e_i) and f(e_j) by one for each pair (e_i, e_j). The ans variable is initialized to the total length of the nums array, and with every pairing operation (each decrements two frequencies by one), ans is decremented by 2. The algorithm terminates either when all frequencies have been paired off (and no element with frequency greater than 0 can form a pair), or only one frequency count is left in the max-heap (which then directly contributes to the length of the final array).

The use of the heapq for the removal and insertion of elements allows this solution to perform these operations efficiently, taking O(n log n) time complexity where n is the number of unique elements, as insertion and removal in a heap take O(log n) time. It's an elegant solution leveraging a priority queue to solve the problem at hand.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

What is the space complexity of the following code?

1int sum(int n) {
2  if (n <= 0) {
3    return 0;
4  }
5  return n + sum(n - 1);
6}

Example Walkthrough

Let's consider a small array nums = [1,2,2,3,3,3,4,4,4,4] to illustrate the solution approach.

  1. Frequency Count: We first count the frequency of each unique number in the array.

    • Frequency of 1 is 1
    • Frequency of 2 is 2
    • Frequency of 3 is 3
    • Frequency of 4 is 4
  2. Convert to Max-Heap: Convert these frequencies to a max-heap using negative values (to implement the max-heap with Python's min-heap structure).

    • Max-heap would be represented as [-4, -3, -2, -1]
  3. Pair Up Elements: Now, we enter a loop to start pairing elements.

    • We pop the two largest elements from the heap, which are -4 and -3 (representing 4 and 3 with frequencies 4 and 3, respectively).

    • After pairing them, their frequencies decrease by 1. The heap now has [-3, -2, -1, -2] (representing 4, 2, 1, 3).

    • The array length is reduced by 2, going from 10 to 8.

    • Pair the next two largest elements, which are -3 (representing 4) and -2 (representing 3).

    • After pairing them, the heap is now [-2, -1, -2, -1] (representing 4, 1, 3, 2).

    • The array length is reduced by another 2, now 6.

    • This process continues, pairing the two largest elements each time and updating the heap.

    • The final operations will be unable to pair the last frequency of 1 (assuming we end with one element having a frequency of 1), which means the array cannot be reduced further.

  4. Decrease Array Length: Each pairing operation decreases the ans variable by 2. After the first operation, ans = 8; after the second, ans = 6; and so on.

  5. Remaining Elements: If we are left with an element that cannot be paired, it adds to the final length of the array. In this example, we would end with one unpaired element (frequency 1), which contributes to the final array size.

So, if we start with an array of length 10, after performing the pairings, we could end up with a final array length that is the original length minus twice the number of operations plus any remaining unpaired elements. In this case, we might end up with either an empty array (if all elements can be paired) or an array of length 1 (if one unpaired element remains).

Solution Implementation

1from heapq import heapify, heappop, heappush
2from collections import Counter
3from typing import List
4
5class Solution:
6    def minLengthAfterRemovals(self, nums: List[int]) -> int:
7        # Create a Counter object to count the frequency of each number in the input list
8        frequency_counter = Counter(nums)
9      
10        # Create a max heap to store the negative counts (since Python's heapq is a min heap by default)
11        max_heap = [-count for count in frequency_counter.values()]
12        heapify(max_heap)
13      
14        # Initialize the answer as the length of the input list
15        answer_length = len(nums)
16      
17        # Process the heap while there are at least two elements in the heap
18        while len(max_heap) > 1:
19            # Pop the two largest elements (most frequent numbers); invert them back to positive
20            count1, count2 = -heappop(max_heap), -heappop(max_heap)
21          
22            # Decrease their counts (simulate removal)
23            count1 -= 1
24            count2 -= 1
25          
26            # If there are any remaining counts for the number, push it back into the heap
27            if count1 > 0:
28                heappush(max_heap, -count1)
29            if count2 > 0:
30                heappush(max_heap, -count2)
31          
32            # Since we remove 2 elements, decrease the total length of the array by 2, updating the answer
33            answer_length -= 2
34      
35        # Return the final answer
36        return answer_length
37
1class Solution {
2    public int minLengthAfterRemovals(List<Integer> numbers) {
3        // A map to store the frequency count of each unique number
4        Map<Integer, Integer> frequencyMap = new HashMap<>();
5        // Counting the frequency of each number in the list
6        for (int number : numbers) {
7            // If the number is already in the map, increase its count by 1, otherwise add it with a count of 1
8            frequencyMap.merge(number, 1, Integer::sum);
9        }
10      
11        // PriorityQueue to store frequencies in descending order (largest frequency at the top)
12        PriorityQueue<Integer> frequencyQueue = new PriorityQueue<>(Comparator.reverseOrder());
13      
14        // Add all frequencies to the PriorityQueue
15        for (int frequency : frequencyMap.values()) {
16            frequencyQueue.offer(frequency);
17        }
18      
19        // Initialize the answer with the total count of numbers in the list
20        int minLength = numbers.size();
21      
22        // Process pairs until we only have one frequency or none left
23        while (frequencyQueue.size() > 1) {
24            // Poll the two top frequencies
25            int firstFrequency = frequencyQueue.poll();
26            int secondFrequency = frequencyQueue.poll();
27          
28            // Decrement the frequencies as we are pairing and removing one occurrence of each
29            firstFrequency--;
30            secondFrequency--;
31          
32            // If any updated frequencies are greater than 0, offer them back to the priority queue
33            if (firstFrequency > 0) {
34                frequencyQueue.offer(firstFrequency);
35            }
36            if (secondFrequency > 0) {
37                frequencyQueue.offer(secondFrequency);
38            }
39          
40            // Each removal reduces the length by 2 since we remove one occurrence of each of the two numbers
41            minLength -= 2;
42        }
43      
44        // Return the minimum length after making all possible removals
45        return minLength;
46    }
47}
48
1#include <vector>
2#include <queue>
3#include <unordered_map>
4
5class Solution {
6public:
7    int minLengthAfterRemovals(vector<int>& nums) {
8        // Create a hashmap to count the occurrence of each number in the vector.
9        unordered_map<int, int> countMap;
10        for (int num : nums) {
11            ++countMap[num]; // Increment the count for this number.
12        }
13
14        // Use a max-heap (priority queue) to keep track of the counts.
15        priority_queue<int> maxHeap;
16        for (auto& keyValue : countMap) {
17            maxHeap.push(keyValue.second); // Push counts to the max-heap.
18        }
19
20        // Initialize final answer to the size of the input vector.
21        int finalLength = nums.size();
22      
23        // Process the heap until there is one or no element left.
24        while (maxHeap.size() > 1) {
25            int topCount1 = maxHeap.top(); // Get the top (maximum) element from the max-heap.
26            maxHeap.pop(); // Remove that element.
27            int topCount2 = maxHeap.top(); // Get the next top element.
28            maxHeap.pop(); // Remove that element as well.
29
30            // Decrement both counts as we are planning to remove one occurrence of each.
31            topCount1--;
32            topCount2--;
33
34            // If there are still occurrences left after removal, push the new count back to the heap.
35            if (topCount1 > 0) {
36                maxHeap.push(topCount1);
37            }
38            if (topCount2 > 0) {
39                maxHeap.push(topCount2);
40            }
41
42            // Since we remove two elements from the array (one each for the two counts), decrease the length by 2.
43            finalLength -= 2;
44        }
45
46        // Return the final length of the vector after the removals.
47        return finalLength;
48    }
49};
50
1import { MaxPriorityQueue } from '@datastructures-js/priority-queue';
2
3// This function calculates the minimum length of the array after performing
4// removals such that each pair of adjacent numbers is not the same.
5function minLengthAfterRemovals(nums: number[]): number {
6    // A map to store the frequency count of each number in the array.
7    const frequencyCount: Map<number, number> = new Map();
8  
9    // Count the frequency of each number in the input array.
10    for (const num of nums) {
11        frequencyCount.set(num, (frequencyCount.get(num) ?? 0) + 1);
12    }
13  
14    // A priority queue to order the counts in decreasing order.
15    const priorityQueue = new MaxPriorityQueue<number>();
16  
17    // Enqueue the frequency counts into the priority queue.
18    for (const [, count] of frequencyCount) {
19        priorityQueue.enqueue(count);
20    }
21  
22    // Variable to track the answer which is initialized to the length of the input array.
23    let minimumLength = nums.length;
24  
25    // Keep removing elements in pairs until one or no elements remain in the priority queue.
26    while (priorityQueue.size() > 1) {
27        let countX = priorityQueue.dequeue().element;
28        let countY = priorityQueue.dequeue().element;
29
30        // Decrease the counts and re-enqueue if they are still greater than zero.
31        if (--countX > 0) {
32            priorityQueue.enqueue(countX);
33        }
34        if (--countY > 0) {
35            priorityQueue.enqueue(countY);
36        }
37      
38        // Each valid removal decreases the minimum length by 2.
39        minimumLength -= 2;
40    }
41  
42    // Return the min length of the array after removals.
43    return minimumLength;
44}
45
46// You will need to install @datastructures-js/priority-queue package to use MaxPriorityQueue.
47// If you’re using npm, you can install with: npm install @datastructures-js/priority-queue
48
Not Sure What to Study? Take the 2-min Quiz:

How would you design a stack which has a function min that returns the minimum element in the stack, in addition to push and pop? All push, pop, min should have running time O(1).

Time and Space Complexity

Time Complexity

The time complexity of the given code is determined by several factors:

  1. Counting the elements in nums using Counter constructor has a time complexity of O(n), where n is the number of elements in nums.
  2. The list comprehension used to create pq from the counts also has a time complexity of O(n), assuming there are n unique elements. This is because it traverses the counter once.
  3. Constructing a heap from the list of counts using heapify has a time complexity of O(n).
  4. The while loop runs until there is only one or no element left in the priority queue. In the worst case, the loop runs for n/2 iterations if every element is unique. The total number of heappop and heappush operations will together contribute to the time complexity.
    • For each heappop operation, the time complexity is O(log n).
    • For each heappush operation after decrementing the element, if it is still greater than 0, the time complexity is also O(log n).

Taking into account that each iteration may involve up to two heappop and two heappush operations, the total complexity for the loop is O(n log n) for n/2 iterations, leading to an overall complexity of O(n log n).

Adding the complexities together, the dominant term is O(n log n) due to the while loop. Hence, the overall time complexity of the code is O(n log n).

Space Complexity

The space complexity is determined by the additional storage used by the code:

  1. The Counter data structure requires O(u) space, where u is the number of unique elements in nums.
  2. The priority queue (heap) also requires O(u) space.
  3. No other significant space is used since the operations inside the loop use a constant amount of space.

Thus, the overall space complexity of the code is O(u), where u represents the number of unique elements in nums.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Breadth first search can be used to find the shortest path between two nodes in a directed graph.


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫