480. Sliding Window Median
Problem Description
The problem involves finding the median of each window of size k
in an array called nums
. A window is a contiguous subset of the array, and it slides one position to the right on each step, so the subsequent window contains k
elements as well, with one new element added on the right and one old element removed from the left.
The median is defined as the middle value of an ordered list. If the size of the list is odd, it's simply the middle value. If the size of the list is even, it's the average of the two middle values.
The task is to output an array containing the median of the array nums
within each sliding window.
Intuition
Finding the median requires the elements to be in sorted order. Sorting the entire window for each position of the sliding window would result in a time-consuming algorithm, especially for large windows or arrays.
To optimize this, we can use two heaps: a max-heap to store the smaller half of the elements and a min-heap to store the larger half. By maintaining these two heaps, we can ensure that we have quick access to the median elements at all times. The max-heap allows us to access the largest element in the smaller half, and the min-heap allows us to access the smallest element in the larger half. When the window size k
is odd, the median is the top element of the max-heap. When k
is even, the median is the average of the top elements of both heaps.
Additional complications arise from the sliding nature of the window: We need to add a new element and remove an old one with each step. To prevent the heaps from growing with duplicates of shifted-out elements, a delayed deletion approach using a hash map (defaultdict) is employed. With this method, elements that should be removed are marked ('delayed'), and actual removal from the heaps occurs lazily, which means we only remove when the element is at the top of the respective heap. Proper rebalancing of heaps is done after each add or remove operation to maintain the half-size property of the two heaps.
The code defines a MedianFinder
class to encapsulate this logic, and uses it within the medianSlidingWindow
function to generate the array of medians as required by iterating through the entire array, sliding the window one step at a time and efficiently calculating the medians.
Learn more about Sliding Window and Heap (Priority Queue) patterns.
Solution Approach
The solution uses a class MedianFinder
to encapsulate the logic for finding medians in a sliding window. Here's a step-by-step breakdown of how the MedianFinder
class and the solution function medianSlidingWindow
works:
-
Initialization: The
MedianFinder
object initializes two heaps:self.small
(a max-heap) to store the smaller half of the numbers andself.large
(a min-heap) to store the larger half. It also maintains a hash mapself.delayed
to keep track of the numbers that have been "logically" removed but not physically removed from the heaps.self.small_size
andself.large_size
help track the size of the two halves. -
Adding Numbers: When a new number is added (
add_num
), it is compared with the smallest number inself.large
. If it is smaller or equal, it should belong to theself.small
heap, and it's added there; otherwise, it's added toself.large
. Once a number is added,rebalance
is called to make sure that both heaps have roughly the same number of elements, allowing for a maximum size difference of one. -
Rebalancing the Heaps: The
rebalance
method ensures that the number of elements inself.small
andself.large
are either equal or thatself.small
has one more element thanself.large
ifk
is odd. This is achieved by transferring the top element from one heap to the other when the size difference condition is violated. -
Finding the Median: The
find_median
method calculates the median based on the sizes of the heaps. Ifk
is odd, the median is the top element ofself.small
. Ifk
is even, the median is the average of the top elements ofself.small
andself.large
. -
Removing Numbers: When moving the window, numbers that exit the window need to be removed (
remove_num
). To avoid costly removal operations, a lazy deletion approach is used where the number is marked for deletion in theself.delayed
hash map but not immediately removed from the heap. The actual removal happens in theprune
method, which is triggered when the number to be removed is at the top of the respective heap. -
Calculating Medians in Sliding Window: The
medianSlidingWindow
function creates aMedianFinder
instance and then slides the window across thenums
array. Initially, the firstk
numbers are added to theMedianFinder
, from where the first median is obtained. As the window slides, the function adds a new element and removes the leftmost element of the previous window, then calculates the new median and appends it to theans
list. -
Time Complexity: Each add and remove operation can be done in logarithmic time regarding the total number of elements in the heaps. Consequently, the entire sliding window median calculation takes
O(n log k)
time, wheren
is the number of elements innums
andk
is the size of the sliding window. -
Space Complexity: The space used is
O(k)
for storing elements in the heaps and the hash map for delayed removals.
By following this approach, medians of sliding windows can be calculated efficiently without resorting to sorting the window each time it slides, thus saving computational time and making the algorithm suitable for large datasets.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's take a simple example to illustrate the solution approach. Consider the array nums = [1,3,-1,-3,5,3,6,7]
and a window size k = 3
. The goal is to find the median for each window of size 3 as we slide through nums
.
Here's how the algorithm process would work:
Initial Preparation
- Instantiate a
MedianFinder
object. - Heaps are initialized but empty.
self.small
is the max-heap, andself.large
is the min-heap.
First Window: [1,3,-1]
- Add
1
,3
,-1
to theMedianFinder
. After adding and rebalancing:self.small
=[1]
,self.large
=[-1, 3]
- Calculate the median and append to results. The median is
1
.
Slide to Second Window: [3,-1,-3]
- Remove
1
(by marking it delayed). - Add
3
and-3
:3
is added toself.large
since it's not smaller than the smallest inself.large
which is-1
.-3
is added toself.small
after comparison withself.large
.
self.small
=[-3, 1]
, (1 is delayed)self.large
=[-1, 3, 3]
after rebalancing.- Calculate the median. The median is
-1
.
Slide to Third Window: [-1,-3,5]
- Remove
3
(mark it delayed). - Add
5
:5
is added toself.large
since it's larger than the smallest inself.large
.
- Rebalance the heaps.
self.small
=[-3, -1]
,self.large
=[5, 3, 3]
(two 3's are marked delayed) after rebalancing.- Calculate the median. The median is
-1
.
Slide to Fourth Window: [-3,5,3]
- Continue this process until the end of the array.
Final Results
- Once we apply the above operations for each window, we get the list of medians =
[1, -1, -1, 3, 5, 6]
.
Efficiency and Complexity
- This process avoids full sorting of each window:
- Adding and removing elements and rebalancing takes
O(log k)
time due to heaps. - The process is repeated for each of the
n - k + 1
windows. - Thus, the overall time complexity is
O(n log k)
.
- Adding and removing elements and rebalancing takes
- Space complexity remains
O(k)
due to storage in heaps and the delayed removal hash map.
By using the MedianFinder
class and the medianSlidingWindow
method, we can see how to efficiently calculate the median of each sliding window by leveraging min-heaps, max-heaps, and delayed removals, avoiding the inefficiency of sorting each window.
Solution Implementation
1import heapq
2from collections import defaultdict
3from typing import List
4
5class MedianFinder:
6 def __init__(self, k: int):
7 self.k = k # size of the sliding window
8 self.small_max_heap = [] # max heap to store the smaller half of numbers
9 self.large_min_heap = [] # min heap to store the larger half of numbers
10 self.delayed = defaultdict(int) # count of delayed elements for lazy deletion
11 self.small_size = 0 # size of small_max_heap (not necessarily len(small_max_heap))
12 self.large_size = 0 # size of large_min_heap (not necessarily len(large_min_heap))
13
14 def add_num(self, num: int):
15 # Add a number to one of the heaps
16 if not self.small_max_heap or num <= -self.small_max_heap[0]:
17 heapq.heappush(self.small_max_heap, -num)
18 self.small_size += 1
19 else:
20 heapq.heappush(self.large_min_heap, num)
21 self.large_size += 1
22 self.rebalance()
23
24 def find_median(self) -> float:
25 # Find the median of the current numbers
26 return -self.small_max_heap[0] if self.k % 2 == 1 else (-self.small_max_heap[0] + self.large_min_heap[0]) / 2
27
28 def remove_num(self, num: int):
29 # Lazily remove a number (the number will be removed later during pruning)
30 self.delayed[num] += 1
31 if num <= -self.small_max_heap[0]:
32 self.small_size -= 1
33 if num == -self.small_max_heap[0]:
34 self.prune(self.small_max_heap)
35 else:
36 self.large_size -= 1
37 if num == self.large_min_heap[0]:
38 self.prune(self.large_min_heap)
39 self.rebalance()
40
41 def prune(self, heap: List[int]):
42 # Remove elements that have been flagged for removal
43 while heap and self.delayed[heap[0]] > 0:
44 self.delayed[heap[0]] -= 1
45 if self.delayed[heap[0]] == 0:
46 del self.delayed[heap[0]]
47 heapq.heappop(heap)
48
49 def rebalance(self):
50 # Balance the two heaps to maintain the invariant small_max_heap.size() > large_min_heap.size()
51 while self.small_size > self.large_size + 1:
52 heapq.heappush(self.large_min_heap, -heapq.heappop(self.small_max_heap))
53 self.small_size -= 1
54 self.large_size += 1
55 self.prune(self.small_max_heap)
56 while self.small_size < self.large_size:
57 heapq.heappush(self.small_max_heap, -heapq.heappop(self.large_min_heap))
58 self.large_size -= 1
59 self.small_size += 1
60 self.prune(self.large_min_heap)
61
62class Solution:
63 def median_sliding_window(self, nums: List[int], k: int) -> List[float]:
64 finder = MedianFinder(k)
65 for x in nums[:k]:
66 finder.add_num(x)
67 medians = [finder.find_median()]
68 for i in range(k, len(nums)):
69 finder.add_num(nums[i])
70 finder.remove_num(nums[i - k])
71 medians.append(finder.find_median())
72 return medians
73
1class MedianFinder {
2 private PriorityQueue<Integer> minHeap = new PriorityQueue<>(); // holds all the larger elements
3 private PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Collections.reverseOrder()); // holds the smaller elements, reversed to act as a max-heap
4 private Map<Integer, Integer> countMap = new HashMap<>(); // keeps track of the number of instances a number has been delayed for removal
5 private int minHeapSize = 0;
6 private int maxHeapSize = 0;
7 private final int windowSize; // the size of the sliding window
8
9 public MedianFinder(int k) {
10 this.windowSize = k;
11 }
12
13 public void addNum(int num) {
14 // Add the number to the appropriate heap
15 if (maxHeap.isEmpty() || num <= maxHeap.peek()) {
16 maxHeap.offer(num);
17 maxHeapSize++;
18 } else {
19 minHeap.offer(num);
20 minHeapSize++;
21 }
22 rebalanceHeaps(); // Ensure both heaps remain in a valid state
23 }
24
25 public double findMedian() {
26 // If odd window size, the median is the top of the max heap, otherwise, the median is the average of the tops of both heaps
27 return (windowSize & 1) == 1 ? maxHeap.peek() : ((double) maxHeap.peek() + minHeap.peek()) / 2.0;
28 }
29
30 public void removeNum(int num) {
31 countMap.put(num, countMap.getOrDefault(num, 0) + 1); // Mark the number for delayed removal
32 if (num <= maxHeap.peek()) {
33 maxHeapSize--;
34 if (num == maxHeap.peek()) {
35 pruneHeap(maxHeap); // Remove elements that should be delayed from the maxHeap
36 }
37 } else {
38 minHeapSize--;
39 if (num == minHeap.peek()) {
40 pruneHeap(minHeap); // Remove elements that should be delayed from the minHeap
41 }
42 }
43 rebalanceHeaps(); // Ensure both heaps remain in a valid state
44 }
45
46 private void pruneHeap(PriorityQueue<Integer> heap) {
47 // Remove elements from the heap that were marked for delayed removal
48 while (!heap.isEmpty() && countMap.containsKey(heap.peek())) {
49 int heapTop = heap.peek();
50 countMap.put(heapTop, countMap.get(heapTop) - 1);
51
52 if (countMap.get(heapTop) == 0) {
53 countMap.remove(heapTop); // remove the entry from the map if the count goes to zero
54 }
55
56 heap.poll(); // remove the top element in the heap as it's already accounted for in the delayed removal
57 }
58 }
59
60 private void rebalanceHeaps() {
61 // Maintain the property that maxHeap has the same number of elements as minHeap, or 1 more
62 if (maxHeapSize > minHeapSize + 1) {
63 minHeap.offer(maxHeap.poll());
64 maxHeapSize--;
65 minHeapSize++;
66 pruneHeap(maxHeap); // Ensure maxHeap is pruned
67 } else if (maxHeapSize < minHeapSize) {
68 maxHeap.offer(minHeap.poll());
69 minHeapSize--;
70 maxHeapSize++;
71 pruneHeap(minHeap); // Ensure minHeap is pruned
72 }
73 }
74}
75
76class Solution {
77 public double[] medianSlidingWindow(int[] nums, int k) {
78 MedianFinder finder = new MedianFinder(k);
79 // Populate the MedianFinder with the initial sliding window
80 for (int i = 0; i < k; ++i) {
81 finder.addNum(nums[i]);
82 }
83
84 int n = nums.length;
85 double[] medians = new double[n - k + 1]; // Array to hold the medians
86 medians[0] = finder.findMedian(); // Find the median for the first window
87
88 // Move the sliding window and calculate the median for each window
89 for (int i = k; i < n; ++i) {
90 finder.addNum(nums[i]); // Add the new element to the sliding window
91 finder.removeNum(nums[i - k]); // Remove the oldest element from the sliding window
92 medians[i - k + 1] = finder.findMedian(); // Find the median for the current window
93 }
94 return medians; // Return the array of medians
95 }
96}
97
1#include <functional>
2#include <queue>
3#include <unordered_map>
4#include <vector>
5
6class MedianFinder {
7public:
8 // Constructor that initializes the finder with a window size 'k'.
9 MedianFinder(int k) : windowSize(k), smallSize(0), largeSize(0) {}
10
11 // Adds a number into the data structure.
12 void addNum(int num) {
13 if (small.empty() || num <= small.top()) {
14 small.push(num);
15 ++smallSize;
16 } else {
17 large.push(num);
18 ++largeSize;
19 }
20 // Ensure the heaps are balanced after the insertion.
21 rebalanceHeaps();
22 }
23
24 // Removes a number from the data structure.
25 void removeNum(int num) {
26 delayed[num]++;
27 if (num <= small.top()) {
28 --smallSize;
29 if (num == small.top()) {
30 prune(small);
31 }
32 } else {
33 --largeSize;
34 if (num == large.top()) {
35 prune(large);
36 }
37 }
38 // Ensure the heaps are balanced after the removal.
39 rebalanceHeaps();
40 }
41
42 // Returns the median of current numbers.
43 double findMedian() {
44 if (windowSize & 1) {
45 return small.top();
46 } else {
47 return (static_cast<double>(small.top()) + large.top()) * 0.5;
48 }
49 }
50
51private:
52 // Max heap to store the smaller half of the elements
53 std::priority_queue<int> small;
54 // Min heap to store the larger half of the elements
55 std::priority_queue<int, std::vector<int>, std::greater<int>> large;
56 // Hashmap to keep track of delayed removals when numbers are out of the current window
57 std::unordered_map<int, int> delayed;
58 // Sizes of the heaps
59 int smallSize, largeSize;
60 // Window size
61 int windowSize;
62
63 // Prunes/removes elements from the heap that should be delayed-removed.
64 template <typename T>
65 void prune(T& heap) {
66 while (!heap.empty() && delayed[heap.top()]) {
67 if (--delayed[heap.top()] == 0) {
68 delayed.erase(heap.top());
69 }
70 heap.pop();
71 }
72 }
73
74 // Balances the two heaps to maintain the median property.
75 void rebalanceHeaps() {
76 if (smallSize > largeSize + 1) {
77 large.push(small.top());
78 small.pop();
79 --smallSize;
80 ++largeSize;
81 // Re-prune in case the top element is meant to be delayed-removed.
82 prune(small);
83 } else if (smallSize < largeSize) {
84 small.push(large.top());
85 large.pop();
86 ++smallSize;
87 --largeSize;
88 // Re-prune in case the top element is meant to be delayed-removed.
89 prune(large);
90 }
91 }
92};
93
94class Solution {
95public:
96 // Calculates the median for each window of size 'k' in the array.
97 std::vector<double> medianSlidingWindow(std::vector<int>& nums, int k) {
98 MedianFinder finder(k);
99 // Initialize by adding the first 'k' elements.
100 for (int i = 0; i < k; ++i) {
101 finder.addNum(nums[i]);
102 }
103 // Store the first median.
104 std::vector<double> medians = {finder.findMedian()};
105 // Slide the window, one element at a time, maintaining the median.
106 for (int i = k; i < nums.size(); ++i) {
107 finder.addNum(nums[i]); // Add the new element to the window.
108 finder.removeNum(nums[i - k]); // Remove the element that is no longer in the window.
109 // Store the new median.
110 medians.push_back(finder.findMedian());
111 }
112 return medians;
113 }
114};
115
1// Create an alias for the comparison function type.
2type CompareFunction<T> = (a: T, b: T) => number;
3
4// Define PriorityQueue class template.
5class PriorityQueue<T> {
6 private data: T[] = [];
7 private compare: CompareFunction<T>;
8
9 constructor(compareFunction: CompareFunction<T>) {
10 this.compare = compareFunction;
11 }
12
13 private leftChildIndex(parentIndex: number): number {
14 return 2 * parentIndex + 1;
15 }
16
17 private rightChildIndex(parentIndex: number): number {
18 return 2 * parentIndex + 2;
19 }
20
21 private parentIndex(childIndex: number): number {
22 return Math.floor((childIndex - 1) / 2);
23 }
24
25 private hasLeftChild(index: number): boolean {
26 return this.leftChildIndex(index) < this.data.length;
27 }
28
29 private hasRightChild(index: number): boolean {
30 return this.rightChildIndex(index) < this.data.length;
31 }
32
33 private hasParent(index: number): boolean {
34 return this.parentIndex(index) >= 0;
35 }
36
37 private leftChild(index: number): T {
38 return this.data[this.leftChildIndex(index)];
39 }
40
41 private rightChild(index: number): T {
42 return this.data[this.rightChildIndex(index)];
43 }
44
45 private parent(index: number): T {
46 return this.data[this.parentIndex(index)];
47 }
48
49 private swap(indexOne: number, indexTwo: number): void {
50 let temp = this.data[indexOne];
51 this.data[indexOne] = this.data[indexTwo];
52 this.data[indexTwo] = temp;
53 }
54
55 private heapifyUp(): void {
56 let index = this.data.length - 1;
57 while (this.hasParent(index) && this.compare(this.parent(index), this.data[index]) > 0) {
58 this.swap(this.parentIndex(index), index);
59 index = this.parentIndex(index);
60 }
61 }
62
63 private heapifyDown(): void {
64 let index = 0;
65 while (this.hasLeftChild(index)) {
66 let smallerChildIndex = this.leftChildIndex(index);
67 if (this.hasRightChild(index) && this.compare(this.rightChild(index), this.leftChild(index)) < 0) {
68 smallerChildIndex = this.rightChildIndex(index);
69 }
70
71 if (this.compare(this.data[index], this.data[smallerChildIndex]) < 0) {
72 break;
73 } else {
74 this.swap(index, smallerChildIndex);
75 }
76 index = smallerChildIndex;
77 }
78 }
79
80 public isEmpty(): boolean {
81 return this.data.length === 0;
82 }
83
84 public peek(): T | undefined {
85 if (this.data.length === 0) return undefined;
86 return this.data[0];
87 }
88
89 public push(value: T): void {
90 this.data.push(value);
91 this.heapifyUp();
92 }
93
94 public pop(): T | undefined {
95 if (this.data.length === 0) return undefined;
96 if (this.data.length === 1) return this.data.pop();
97
98 const item = this.data[0];
99 this.data[0] = this.data.pop()!;
100 this.heapifyDown();
101 return item;
102 }
103
104 public size(): number {
105 return this.data.length;
106 }
107}
108
109// A max heap comparator for numbers.
110const maxHeapCompare: CompareFunction<number> = (a, b) => b - a;
111
112// A min heap comparator for numbers.
113const minHeapCompare: CompareFunction<number> = (a, b) => a - b;
114
115// Small values are stored in a max heap.
116const small: PriorityQueue<number> = new PriorityQueue(maxHeapCompare);
117// Large values are stored in a min heap.
118const large: PriorityQueue<number> = new PriorityQueue(minHeapCompare);
119
120// Map to keep track of values that need to be pruned from heaps.
121const delayed = new Map<number, number>();
122
123let windowSize = 0;
124let smallSize = 0;
125let largeSize = 0;
126
127function addNum(num: number): void {
128 if (small.isEmpty() || num <= small.peek()!) {
129 small.push(num);
130 smallSize++;
131 } else {
132 large.push(num);
133 largeSize++;
134 }
135 rebalanceHeaps();
136}
137
138function removeNum(num: number): void {
139 delayed.set(num, (delayed.get(num) || 0) + 1);
140 if (num <= small.peek()!) {
141 smallSize--;
142 if (num === small.peek()) {
143 prune(small);
144 }
145 } else {
146 largeSize--;
147 if (num === large.peek()) {
148 prune(large);
149 }
150 }
151 rebalanceHeaps();
152}
153
154function findMedian(): number {
155 if (windowSize % 2 === 1) {
156 return small.peek()!;
157 } else {
158 return (small.peek()! + large.peek()!) / 2.0;
159 }
160}
161
162function prune(heap: PriorityQueue<number>): void {
163 while (!heap.isEmpty() && delayed.get(heap.peek()!)! > 0) {
164 delayed.set(heap.peek()!, delayed.get(heap.peek()!)! - 1);
165 if (delayed.get(heap.peek()!) === 0) {
166 delayed.delete(heap.peek()!);
167 }
168 heap.pop();
169 }
170}
171
172function rebalanceHeaps(): void {
173 if (smallSize > largeSize + 1) {
174 large.push(small.pop()!);
175 smallSize--;
176 largeSize++;
177 prune(small);
178 } else if (smallSize < largeSize) {
179 small.push(large.pop()!);
180 smallSize++;
181 largeSize--;
182 prune(large);
183 }
184}
185
186function medianSlidingWindow(nums: number[], k: number): number[] {
187 windowSize = k;
188 const medians: number[] = [];
189
190 for (let i = 0; i < k; ++i) {
191 addNum(nums[i]);
192 }
193
194 medians.push(findMedian());
195
196 for (let i = k; i < nums.length; ++i) {
197 addNum(nums[i]);
198 removeNum(nums[i - k]);
199 medians.push(findMedian());
200 }
201
202 return medians;
203}
204
Time and Space Complexity
Time Complexity:
The primary operations involving computational complexity in the MedianFinder
class are add_num
, find_median
, remove_num
, prune
, and rebalance
. Each of these functions relies on heap operations which generally have a time complexity of O(log n)
for insertion and removal, where n
is the number of elements in the heap.
- The
add_num
function inserts an element into one of the heaps. This operation isO(log k)
, as at most one element is added to a heap with no more thank
elements. - The
remove_num
function involves decrementing the counter and possibly removing the top element from a heap which also has a complexity ofO(log k)
. - The
prune
function uses a while loop which can, in the worst-case scenario, run as many times as there are elements inpq
. However, each pop operation isO(log k)
. The actual frequency of pruning depends on the elements being removed and the state of the delayed dictionary. - The
rebalance
function ensures both heaps remain balanced after adding or removing an element. It may callheappush
andheappop
at most twice if the sizes of heaps differ by more than one. Hence, rebalancing operations are alsoO(log k)
each.
The medianSlidingWindow
function in the Solution
class calls the add_num
and remove_num
methods of MedianFinder
for each element in nums
. Since nums
has n
elements, and each operation is O(log k)
, the overall time complexity for processing the sliding window across all the elements is O(n log k)
.
Space Complexity:
The space complexity mainly comes from the data stored in MedianFinder
, specifically the two heaps and the delayed elements dictionary.
- Two heaps are maintained, one for the smaller half and one for the larger half of the numbers, and they can contain up to
k
elements combined. Therefore, the space complexity for the heaps isO(k)
. - The
defaultdict
, nameddelayed
, holds the count of the elements delayed for removal. In the worst case, this could hold each unique number once, so the space complexity fordelayed
isO(k)
(assuming the sliding window could contain up tok
unique elements).
Combined, the space complexity of the MedianFinder class is O(k)
.
Considering both the heaps and the additional structures like delayed
, the overall space complexity of the algorithm is O(k)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Depth first search is equivalent to which of the tree traversal order?