2856. Minimum Array Length After Pair Removals
Problem Description
You are provided with a sorted array of integers nums
, where the sorting is in non-decreasing order. The problem presents an operation that can be performed repeatedly, as many times as you wish. Each operation involves picking two distinct indices i
and j
(where i < j
) such that nums[i] < nums[j]
, and then removing the elements at both of these indices from the array. The goal is to determine the minimum length of the array nums
after doing this operation as many times as possible (you could also opt to not perform any operations).
Intuition
The solution to this problem involves recognizing that you can always remove pairs of distinct numbers due to the array's sorted nature. Since the array is sorted, nums[i] < nums[j]
for any i < j
where the elements are different.
A key observation here is that when you remove a pair of numbers, only two elements are removed. Therefore, to reach the minimum length of the array, you want to maximize the number of such removal operations. However, you need to ensure that you're pairing elements effectively.
How do we maximize the number of removals? Let's consider the frequency of each number in the sorted array. If you have a number that appears, say, three times, you can only form one complete pair out of it, leaving one instance of the number unpaired. It's clear that each unpaired number will remain in the final array, adding to its length.
The intuition behind the provided Python solution involves using a max-heap to efficiently pick out the two numbers with the highest frequencies for the pairing—each removal of a pair will decrease the count of these numbers by one. By continuously pairing off the numbers with the highest frequency (and hence the highest chance of having unpaired elements left over), we ensure that a minimal number of elements remain unpaired.
After all possible pairings, you are left with either an empty heap (all elements have been paired off), or a heap with one element remaining (an element that couldn't be paired). If the heap is not empty, the final count of the remaining element contributes to the final size of the array post-operations.
The answer is thus the initial length of the array minus twice the number of successful pairings (since two elements are removed for each pairing).
Learn more about Greedy, Two Pointers and Binary Search patterns.
Solution Approach
The solution follows a greedy approach using a priority queue (max-heap) to perform operations that effectively pair off elements. Here's how the algorithm is implemented step-by-step:
-
Frequency Count: First, it counts the frequency of each unique number in the
nums
array using aCounter
from Python'scollections
module. This is because the frequency of numbers will determine how many pairs can be formed. Numbers with a frequency of 1 cannot be paired, while those with higher frequencies can be paired multiple times. -
Convert to Max-Heap: It then converts the frequency counts into a max-heap using the
-x
technique because Python'sheapq
module provides a min-heap by default. By negating the values, the solution turns it into a max-heap for our purpose. Theheapq.heapify
function is used to transform the list of frequencies into a heap, which now lets us easily access the two largest frequencies. -
Pair Up Elements: The algorithm then enters a loop that runs as long as there is more than one element left in the max-heap. In each iteration, it pops the two largest elements (which are the two most frequent numbers in the array), decrements their count to reflect that a pair has been formed and removes them from the array, and then pushes the decremented frequencies back into the heap if they are greater than zero. This ensures that each time we are attempting to pair off the most frequent numbers first, gradually decreasing their frequency and thus incrementally reducing the array size by two elements for each pair.
-
Decrease Array Length: After successfully pairing two elements, the solution decreases the running total length of the array
ans
by 2 since that's how many elements are removed by the performed operation. -
Remaining Elements: Once the loop finishes, the resulting length (
ans
) represents the minimum length of the array after performing the maximum number of valid operations. If one element is left in the heap, it cannot be paired and will remain in the final array.
In mathematical terms, if f(e)
is the frequency of element e
, the operation essentially reduces both f(e_i)
and f(e_j)
by one for each pair (e_i, e_j)
. The ans
variable is initialized to the total length of the nums
array, and with every pairing operation (each decrements two frequencies by one), ans
is decremented by 2. The algorithm terminates either when all frequencies have been paired off (and no element with frequency greater than 0 can form a pair), or only one frequency count is left in the max-heap (which then directly contributes to the length of the final array).
The use of the heapq for the removal and insertion of elements allows this solution to perform these operations efficiently, taking O(n log n)
time complexity where n
is the number of unique elements, as insertion and removal in a heap take O(log n)
time. It's an elegant solution leveraging a priority queue to solve the problem at hand.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small array nums = [1,2,2,3,3,3,4,4,4,4]
to illustrate the solution approach.
-
Frequency Count: We first count the frequency of each unique number in the array.
- Frequency of
1
is1
- Frequency of
2
is2
- Frequency of
3
is3
- Frequency of
4
is4
- Frequency of
-
Convert to Max-Heap: Convert these frequencies to a max-heap using negative values (to implement the max-heap with Python's min-heap structure).
- Max-heap would be represented as
[-4, -3, -2, -1]
- Max-heap would be represented as
-
Pair Up Elements: Now, we enter a loop to start pairing elements.
-
We pop the two largest elements from the heap, which are
-4
and-3
(representing4
and3
with frequencies4
and3
, respectively). -
After pairing them, their frequencies decrease by
1
. The heap now has[-3, -2, -1, -2]
(representing4
,2
,1
,3
). -
The array length is reduced by
2
, going from10
to8
. -
Pair the next two largest elements, which are
-3
(representing4
) and-2
(representing3
). -
After pairing them, the heap is now
[-2, -1, -2, -1]
(representing4
,1
,3
,2
). -
The array length is reduced by another
2
, now6
. -
This process continues, pairing the two largest elements each time and updating the heap.
-
The final operations will be unable to pair the last frequency of
1
(assuming we end with one element having a frequency of1
), which means the array cannot be reduced further.
-
-
Decrease Array Length: Each pairing operation decreases the
ans
variable by2
. After the first operation,ans = 8
; after the second,ans = 6
; and so on. -
Remaining Elements: If we are left with an element that cannot be paired, it adds to the final length of the array. In this example, we would end with one unpaired element (frequency
1
), which contributes to the final array size.
So, if we start with an array of length 10
, after performing the pairings, we could end up with a final array length that is the original length minus twice the number of operations plus any remaining unpaired elements. In this case, we might end up with either an empty array (if all elements can be paired) or an array of length 1
(if one unpaired element remains).
Solution Implementation
1from heapq import heapify, heappop, heappush
2from collections import Counter
3from typing import List
4
5class Solution:
6 def minLengthAfterRemovals(self, nums: List[int]) -> int:
7 # Create a Counter object to count the frequency of each number in the input list
8 frequency_counter = Counter(nums)
9
10 # Create a max heap to store the negative counts (since Python's heapq is a min heap by default)
11 max_heap = [-count for count in frequency_counter.values()]
12 heapify(max_heap)
13
14 # Initialize the answer as the length of the input list
15 answer_length = len(nums)
16
17 # Process the heap while there are at least two elements in the heap
18 while len(max_heap) > 1:
19 # Pop the two largest elements (most frequent numbers); invert them back to positive
20 count1, count2 = -heappop(max_heap), -heappop(max_heap)
21
22 # Decrease their counts (simulate removal)
23 count1 -= 1
24 count2 -= 1
25
26 # If there are any remaining counts for the number, push it back into the heap
27 if count1 > 0:
28 heappush(max_heap, -count1)
29 if count2 > 0:
30 heappush(max_heap, -count2)
31
32 # Since we remove 2 elements, decrease the total length of the array by 2, updating the answer
33 answer_length -= 2
34
35 # Return the final answer
36 return answer_length
37
1class Solution {
2 public int minLengthAfterRemovals(List<Integer> numbers) {
3 // A map to store the frequency count of each unique number
4 Map<Integer, Integer> frequencyMap = new HashMap<>();
5 // Counting the frequency of each number in the list
6 for (int number : numbers) {
7 // If the number is already in the map, increase its count by 1, otherwise add it with a count of 1
8 frequencyMap.merge(number, 1, Integer::sum);
9 }
10
11 // PriorityQueue to store frequencies in descending order (largest frequency at the top)
12 PriorityQueue<Integer> frequencyQueue = new PriorityQueue<>(Comparator.reverseOrder());
13
14 // Add all frequencies to the PriorityQueue
15 for (int frequency : frequencyMap.values()) {
16 frequencyQueue.offer(frequency);
17 }
18
19 // Initialize the answer with the total count of numbers in the list
20 int minLength = numbers.size();
21
22 // Process pairs until we only have one frequency or none left
23 while (frequencyQueue.size() > 1) {
24 // Poll the two top frequencies
25 int firstFrequency = frequencyQueue.poll();
26 int secondFrequency = frequencyQueue.poll();
27
28 // Decrement the frequencies as we are pairing and removing one occurrence of each
29 firstFrequency--;
30 secondFrequency--;
31
32 // If any updated frequencies are greater than 0, offer them back to the priority queue
33 if (firstFrequency > 0) {
34 frequencyQueue.offer(firstFrequency);
35 }
36 if (secondFrequency > 0) {
37 frequencyQueue.offer(secondFrequency);
38 }
39
40 // Each removal reduces the length by 2 since we remove one occurrence of each of the two numbers
41 minLength -= 2;
42 }
43
44 // Return the minimum length after making all possible removals
45 return minLength;
46 }
47}
48
1#include <vector>
2#include <queue>
3#include <unordered_map>
4
5class Solution {
6public:
7 int minLengthAfterRemovals(vector<int>& nums) {
8 // Create a hashmap to count the occurrence of each number in the vector.
9 unordered_map<int, int> countMap;
10 for (int num : nums) {
11 ++countMap[num]; // Increment the count for this number.
12 }
13
14 // Use a max-heap (priority queue) to keep track of the counts.
15 priority_queue<int> maxHeap;
16 for (auto& keyValue : countMap) {
17 maxHeap.push(keyValue.second); // Push counts to the max-heap.
18 }
19
20 // Initialize final answer to the size of the input vector.
21 int finalLength = nums.size();
22
23 // Process the heap until there is one or no element left.
24 while (maxHeap.size() > 1) {
25 int topCount1 = maxHeap.top(); // Get the top (maximum) element from the max-heap.
26 maxHeap.pop(); // Remove that element.
27 int topCount2 = maxHeap.top(); // Get the next top element.
28 maxHeap.pop(); // Remove that element as well.
29
30 // Decrement both counts as we are planning to remove one occurrence of each.
31 topCount1--;
32 topCount2--;
33
34 // If there are still occurrences left after removal, push the new count back to the heap.
35 if (topCount1 > 0) {
36 maxHeap.push(topCount1);
37 }
38 if (topCount2 > 0) {
39 maxHeap.push(topCount2);
40 }
41
42 // Since we remove two elements from the array (one each for the two counts), decrease the length by 2.
43 finalLength -= 2;
44 }
45
46 // Return the final length of the vector after the removals.
47 return finalLength;
48 }
49};
50
1import { MaxPriorityQueue } from '@datastructures-js/priority-queue';
2
3// This function calculates the minimum length of the array after performing
4// removals such that each pair of adjacent numbers is not the same.
5function minLengthAfterRemovals(nums: number[]): number {
6 // A map to store the frequency count of each number in the array.
7 const frequencyCount: Map<number, number> = new Map();
8
9 // Count the frequency of each number in the input array.
10 for (const num of nums) {
11 frequencyCount.set(num, (frequencyCount.get(num) ?? 0) + 1);
12 }
13
14 // A priority queue to order the counts in decreasing order.
15 const priorityQueue = new MaxPriorityQueue<number>();
16
17 // Enqueue the frequency counts into the priority queue.
18 for (const [, count] of frequencyCount) {
19 priorityQueue.enqueue(count);
20 }
21
22 // Variable to track the answer which is initialized to the length of the input array.
23 let minimumLength = nums.length;
24
25 // Keep removing elements in pairs until one or no elements remain in the priority queue.
26 while (priorityQueue.size() > 1) {
27 let countX = priorityQueue.dequeue().element;
28 let countY = priorityQueue.dequeue().element;
29
30 // Decrease the counts and re-enqueue if they are still greater than zero.
31 if (--countX > 0) {
32 priorityQueue.enqueue(countX);
33 }
34 if (--countY > 0) {
35 priorityQueue.enqueue(countY);
36 }
37
38 // Each valid removal decreases the minimum length by 2.
39 minimumLength -= 2;
40 }
41
42 // Return the min length of the array after removals.
43 return minimumLength;
44}
45
46// You will need to install @datastructures-js/priority-queue package to use MaxPriorityQueue.
47// If you’re using npm, you can install with: npm install @datastructures-js/priority-queue
48
Time and Space Complexity
Time Complexity
The time complexity of the given code is determined by several factors:
- Counting the elements in
nums
usingCounter
constructor has a time complexity ofO(n)
, wheren
is the number of elements innums
. - The list comprehension used to create
pq
from the counts also has a time complexity ofO(n)
, assuming there aren
unique elements. This is because it traverses the counter once. - Constructing a heap from the list of counts using
heapify
has a time complexity ofO(n)
. - The
while
loop runs until there is only one or no element left in the priority queue. In the worst case, the loop runs forn/2
iterations if every element is unique. The total number of heappop and heappush operations will together contribute to the time complexity.- For each heappop operation, the time complexity is
O(log n)
. - For each heappush operation after decrementing the element, if it is still greater than 0, the time complexity is also
O(log n)
.
- For each heappop operation, the time complexity is
Taking into account that each iteration may involve up to two heappop and two heappush operations, the total complexity for the loop is O(n log n)
for n/2
iterations, leading to an overall complexity of O(n log n)
.
Adding the complexities together, the dominant term is O(n log n)
due to the while loop. Hence, the overall time complexity of the code is O(n log n)
.
Space Complexity
The space complexity is determined by the additional storage used by the code:
- The
Counter
data structure requiresO(u)
space, whereu
is the number of unique elements innums
. - The priority queue (heap) also requires
O(u)
space. - No other significant space is used since the operations inside the loop use a constant amount of space.
Thus, the overall space complexity of the code is O(u)
, where u
represents the number of unique elements in nums
.
Learn more about how to find time and space complexity quickly using problem constraints.
How many ways can you arrange the three letters A, B and C?
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Tech Interview Pattern Two Pointers Introduction If you prefer videos here's a super quick introduction to Two Pointers div class responsive iframe iframe src https www youtube com embed xZ4AfXHQ1VQ title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture allowfullscreen iframe div Two pointers is a common interview
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
Want a Structured Path to Master System Design Too? Don’t Miss This!