2856. Minimum Array Length After Pair Removals
Problem Description
You are given an integer array nums
that is sorted in non-decreasing order.
You can perform the following operation any number of times:
- Select two indices
i
andj
wherenums[i] < nums[j]
(the elements at these positions must be different) - Remove both elements at indices
i
andj
from the array - The remaining elements keep their original order, and the array gets re-indexed
Your task is to find the minimum possible length of the array after performing this operation zero or more times.
The key insight is that each operation removes a pair of different elements from the array. Since we want to minimize the final length, we should try to remove as many pairs as possible. The strategy involves pairing up elements greedily - we can use elements with higher frequencies to pair with elements with lower frequencies.
For example, if nums = [1, 1, 2, 2, 2]
, we can pair one 1
with one 2
, and another 1
with another 2
, leaving us with one 2
that cannot be paired. The minimum length would be 1
.
The solution uses a max heap to always pair the two most frequent elements first, ensuring we create the maximum number of pairs possible. Each pairing reduces both frequencies by 1 and the total array length by 2. The process continues until we can no longer form pairs (when fewer than 2 distinct elements remain).
Intuition
To minimize the array length, we need to maximize the number of pairs we can remove. Since each operation requires two different elements (nums[i] < nums[j]
), we can think of this as a matching problem where we want to pair up as many elements as possible.
The first observation is that the actual values don't matter - only their frequencies do. If we have 3 copies of value a
and 2 copies of value b
, we can form 2 pairs, leaving 1 copy of a
unpaired. This suggests we should count the frequency of each unique element.
The next insight is about the pairing strategy. Consider if we have frequencies [5, 3, 2, 1]
. We could pair the frequency-5 element with the frequency-1 element completely (removing all 1 copy), leaving frequencies [4, 3, 2]
. But this isn't optimal! Instead, we should balance the pairing by always matching the two highest frequencies. This way, we avoid creating a situation where one element dominates and cannot be paired with others.
Why does pairing the two highest frequencies work? Think of it like distributing the "pairing burden" evenly. If we always reduce the two largest frequencies by 1, we prevent any single element from becoming too dominant. This greedy approach ensures we can continue forming pairs for as long as possible.
The process naturally terminates when either:
- Only one unique element remains (all have the same value, so no pairs can be formed)
- No elements remain (we've successfully paired everything)
This leads us to use a max heap: repeatedly extract the two largest frequencies, reduce both by 1, and put them back if they're still positive. Each iteration removes 2 elements from the original array, and we continue until we can't form any more pairs.
Learn more about Greedy, Two Pointers and Binary Search patterns.
Solution Approach
The implementation uses a greedy strategy with a max heap (priority queue) to efficiently pair elements with the highest frequencies.
Step 1: Count Frequencies
First, we use a Counter to get the frequency of each unique element in nums
:
cnt = Counter(nums)
Step 2: Build Max Heap
Python's heapq
module provides a min heap by default. To simulate a max heap, we negate all frequency values before adding them to the heap:
pq = [-x for x in cnt.values()] heapify(pq)
Now the largest frequency (most negative value) will be at the top of the heap.
Step 3: Initialize Result We start with the original array length:
ans = len(nums)
Step 4: Greedy Pairing Process While we have at least 2 different element types (heap size ≥ 2), we repeatedly:
-
Extract the two largest frequencies from the heap:
x, y = -heappop(pq), -heappop(pq)
(We negate them back to get positive values)
-
Reduce both frequencies by 1 (representing one pairing):
x -= 1 y -= 1
-
If any frequency is still positive after reduction, push it back to the heap:
if x > 0: heappush(pq, -x) if y > 0: heappush(pq, -y)
-
Decrease the answer by 2 (we removed a pair of elements):
ans -= 2
Step 5: Return Result
When the heap has fewer than 2 elements, no more pairs can be formed. The remaining value in ans
is our minimum possible array length.
The time complexity is O(n log k)
where k
is the number of unique elements, as each heap operation takes O(log k)
time and we perform at most n/2
iterations. The space complexity is O(k)
for storing the frequencies.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with nums = [2, 3, 3, 3, 4, 4, 5]
.
Step 1: Count Frequencies
- Value 2: frequency 1
- Value 3: frequency 3
- Value 4: frequency 2
- Value 5: frequency 1
Step 2: Build Max Heap
Create a max heap with frequencies: [3, 2, 1, 1]
(In implementation, we use negative values: [-3, -2, -1, -1]
)
Step 3: Initialize Result
ans = 7
(original array length)
Step 4: Greedy Pairing Process
Iteration 1:
- Extract two largest: 3 and 2
- Pair one element from each group (reduce both by 1)
- New frequencies: 2 and 1
- Push both back to heap:
[2, 1, 1, 1]
- Update
ans = 7 - 2 = 5
Iteration 2:
- Extract two largest: 2 and 1
- Pair one element from each group
- New frequencies: 1 and 0
- Push only 1 back (0 is discarded):
[1, 1, 1]
- Update
ans = 5 - 2 = 3
Iteration 3:
- Extract two largest: 1 and 1
- Pair one element from each group
- New frequencies: 0 and 0
- Both are 0, so nothing pushed back:
[1]
- Update
ans = 3 - 2 = 1
Step 5: Return Result
Heap has only 1 element left, so we can't form more pairs.
Final answer: 1
This means after optimally pairing elements, we're left with just 1 element that couldn't be paired.
Solution Implementation
1from typing import List
2from collections import Counter
3from heapq import heapify, heappop, heappush
4
5
6class Solution:
7 def minLengthAfterRemovals(self, nums: List[int]) -> int:
8 # Count frequency of each element in the array
9 frequency_map = Counter(nums)
10
11 # Create max heap using negative values (Python has min heap by default)
12 # Store negative frequencies to simulate max heap behavior
13 max_heap = [-freq for freq in frequency_map.values()]
14 heapify(max_heap)
15
16 # Initialize result with total number of elements
17 remaining_elements = len(nums)
18
19 # Keep pairing elements with highest frequencies
20 while len(max_heap) > 1:
21 # Extract two most frequent elements (negate to get actual values)
22 first_freq = -heappop(max_heap)
23 second_freq = -heappop(max_heap)
24
25 # Remove one occurrence of each element (form a pair)
26 first_freq -= 1
27 second_freq -= 1
28
29 # Push back to heap if elements still have remaining occurrences
30 if first_freq > 0:
31 heappush(max_heap, -first_freq)
32 if second_freq > 0:
33 heappush(max_heap, -second_freq)
34
35 # Two elements were removed in this pairing
36 remaining_elements -= 2
37
38 return remaining_elements
39
1class Solution {
2 public int minLengthAfterRemovals(List<Integer> nums) {
3 // Count frequency of each number in the list
4 Map<Integer, Integer> frequencyMap = new HashMap<>();
5 for (int num : nums) {
6 frequencyMap.merge(num, 1, Integer::sum);
7 }
8
9 // Create a max heap to store frequencies in descending order
10 PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Comparator.reverseOrder());
11 for (int frequency : frequencyMap.values()) {
12 maxHeap.offer(frequency);
13 }
14
15 // Initialize result with total size of the array
16 int remainingElements = nums.size();
17
18 // Repeatedly pair up elements from the two most frequent groups
19 while (maxHeap.size() > 1) {
20 // Get the two largest frequencies
21 int firstFreq = maxHeap.poll();
22 int secondFreq = maxHeap.poll();
23
24 // Remove one element from each group (simulate pairing)
25 firstFreq--;
26 secondFreq--;
27
28 // Add back to heap if frequency is still positive
29 if (firstFreq > 0) {
30 maxHeap.offer(firstFreq);
31 }
32 if (secondFreq > 0) {
33 maxHeap.offer(secondFreq);
34 }
35
36 // Two elements were removed (paired)
37 remainingElements -= 2;
38 }
39
40 return remainingElements;
41 }
42}
43
1class Solution {
2public:
3 int minLengthAfterRemovals(vector<int>& nums) {
4 // Count frequency of each element in the array
5 unordered_map<int, int> frequencyMap;
6 for (int num : nums) {
7 ++frequencyMap[num];
8 }
9
10 // Use max heap to store frequencies in descending order
11 priority_queue<int> maxHeap;
12 for (const auto& [element, frequency] : frequencyMap) {
13 maxHeap.push(frequency);
14 }
15
16 // Initialize result with total array size
17 int remainingLength = nums.size();
18
19 // Greedily pair elements with highest frequencies
20 while (maxHeap.size() > 1) {
21 // Get two most frequent elements
22 int firstFreq = maxHeap.top();
23 maxHeap.pop();
24 int secondFreq = maxHeap.top();
25 maxHeap.pop();
26
27 // Remove one occurrence of each element (form a pair)
28 firstFreq--;
29 secondFreq--;
30
31 // Push back remaining frequencies if they're positive
32 if (firstFreq > 0) {
33 maxHeap.push(firstFreq);
34 }
35 if (secondFreq > 0) {
36 maxHeap.push(secondFreq);
37 }
38
39 // Two elements were removed in this pairing
40 remainingLength -= 2;
41 }
42
43 return remainingLength;
44 }
45};
46
1/**
2 * Calculates the minimum length of array after removing pairs of different elements
3 * @param nums - Input array of numbers
4 * @returns Minimum length after all possible removals
5 */
6function minLengthAfterRemovals(nums: number[]): number {
7 // Count frequency of each number in the array
8 const frequencyMap: Map<number, number> = new Map();
9 for (const num of nums) {
10 frequencyMap.set(num, (frequencyMap.get(num) ?? 0) + 1);
11 }
12
13 // Create a max priority queue to store frequencies
14 // Higher frequencies will be processed first
15 const maxHeap = new MaxPriorityQueue<number>();
16 for (const [_, frequency] of frequencyMap) {
17 maxHeap.enqueue(frequency);
18 }
19
20 // Initialize result with the original array length
21 let remainingLength = nums.length;
22
23 // Keep pairing elements with highest frequencies
24 while (maxHeap.size() > 1) {
25 // Get the two highest frequencies
26 let firstFrequency = maxHeap.dequeue();
27 let secondFrequency = maxHeap.dequeue();
28
29 // Decrease both frequencies by 1 (removing one pair)
30 firstFrequency--;
31 secondFrequency--;
32
33 // If frequency is still positive, add it back to the heap
34 if (firstFrequency > 0) {
35 maxHeap.enqueue(firstFrequency);
36 }
37 if (secondFrequency > 0) {
38 maxHeap.enqueue(secondFrequency);
39 }
40
41 // Two elements were removed (one pair)
42 remainingLength -= 2;
43 }
44
45 return remainingLength;
46}
47
Time and Space Complexity
The time complexity of this algorithm is O(n × log n)
, where n
is the length of the array nums
.
Breaking down the time complexity:
- Creating the Counter from
nums
takesO(n)
time - Converting counter values to a list and negating them takes
O(k)
time, wherek
is the number of unique elements (at mostn
) - Heapifying the list takes
O(k)
time - The while loop can run at most
k-1
times (when we reduce the heap to 1 or 0 elements) - Each iteration performs two
heappop
operations and up to twoheappush
operations, each takingO(log k)
time - Total for the while loop:
O(k × log k)
Since k ≤ n
, the overall time complexity is O(n + k × log k) = O(n + n × log n) = O(n × log n)
.
The space complexity is O(n)
:
- The Counter stores at most
n
key-value pairs:O(n)
- The priority queue
pq
stores at mostn
elements (when all elements are unique):O(n)
- Other variables use constant space:
O(1)
Therefore, the total space complexity is O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Misunderstanding the Pairing Constraint
A common mistake is thinking we can only pair adjacent elements or that we need to maintain the sorted order during operations. The problem allows pairing ANY two indices i
and j
where nums[i] < nums[j]
, meaning we can pair any two different values regardless of their positions.
Incorrect approach:
# Wrong: Trying to pair only adjacent different elements
def minLengthAfterRemovals(nums):
result = len(nums)
i = 0
while i < len(nums) - 1:
if nums[i] < nums[i + 1]:
result -= 2
i += 2
else:
i += 1
return result
Correct understanding: We can pair any two different elements from anywhere in the array, which is why the frequency-based approach works.
2. Forgetting to Handle Single Element Dominance
When one element's frequency is more than half the array length, it will dominate and some of its occurrences cannot be paired. Failing to recognize this pattern leads to incorrect results.
Example scenario:
nums = [1, 1, 1, 1, 2] # Element 1 appears 4 times out of 5 total # Maximum pairs possible: 2 (pairing all 2s and some 1s) # Minimum length: 1 (one 1 will remain unpaired)
Solution verification: The heap approach naturally handles this because when one frequency dominates, it will eventually be the only element left in the heap.
3. Using Min Heap Instead of Max Heap Incorrectly
Python's heapq
is a min heap, and forgetting to negate values leads to pairing elements with the LOWEST frequencies first, which is suboptimal.
Incorrect implementation:
# Wrong: Using min heap directly without negation
max_heap = list(frequency_map.values()) # Should be negated!
heapify(max_heap)
while len(max_heap) > 1:
first_freq = heappop(max_heap) # Gets minimum, not maximum
second_freq = heappop(max_heap)
# ... rest of logic
Correct implementation:
# Correct: Negate values for max heap simulation
max_heap = [-freq for freq in frequency_map.values()]
heapify(max_heap)
while len(max_heap) > 1:
first_freq = -heappop(max_heap) # Negate back to get actual value
second_freq = -heappop(max_heap)
# ... rest of logic
4. Not Re-adding Elements with Remaining Frequency
After pairing, if an element still has occurrences left (frequency > 0), it must be added back to the heap for future pairings. Forgetting this step causes incorrect results.
Incorrect code:
# Wrong: Not pushing back elements with remaining frequency
while len(max_heap) > 1:
first_freq = -heappop(max_heap)
second_freq = -heappop(max_heap)
first_freq -= 1
second_freq -= 1
# Missing: should push back if freq > 0!
remaining_elements -= 2
Correct code:
# Correct: Always check and push back if frequency > 0
while len(max_heap) > 1:
first_freq = -heappop(max_heap)
second_freq = -heappop(max_heap)
first_freq -= 1
second_freq -= 1
if first_freq > 0:
heappush(max_heap, -first_freq)
if second_freq > 0:
heappush(max_heap, -second_freq)
remaining_elements -= 2
5. Edge Case: Empty Array or Single Element Type
If the input array is empty or contains only one unique element, no pairs can be formed.
Test cases to consider:
nums = [] # Expected: 0 nums = [1] # Expected: 1 nums = [1, 1, 1, 1] # Expected: 4 (all same, no pairs possible)
The current solution handles these correctly because:
- Empty array: Counter returns empty dict, heap is empty, returns 0
- Single element type: Heap size is 1, loop never executes, returns original length
What's the relationship between a tree and a graph?
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Tech Interview Pattern Two Pointers Introduction If you prefer videos here's a super quick introduction to Two Pointers div class responsive iframe iframe src https www youtube com embed xZ4AfXHQ1VQ title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture allowfullscreen iframe div Two pointers is a common interview
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
Want a Structured Path to Master System Design Too? Don’t Miss This!