3369. Design an Array Statistics Tracker π
Problem Description
You need to design a data structure that can track numbers and efficiently answer statistical queries about them. The data structure should support adding numbers, removing the oldest number, and computing the mean, median, and mode of the current numbers.
The StatisticsTracker
class must implement these operations:
-
StatisticsTracker()
: Creates an empty statistics tracker. -
addNumber(int number)
: Adds a new number to the data structure. -
removeFirstAddedNumber()
: Removes the earliest added number (following FIFO order). -
getMean()
: Returns the floored mean of all current numbers. The mean is calculated as the sum of all values divided by the count, with the result floored (rounded down). -
getMedian()
: Returns the median of all current numbers. The median is the middle value when numbers are sorted in non-decreasing order. When there's an even count of numbers (two middle values), return the larger one. -
getMode()
: Returns the mode of all current numbers. The mode is the most frequently occurring number. If multiple numbers have the same highest frequency, return the smallest one.
The solution uses multiple data structures to efficiently handle these operations:
- A queue (
deque
) to maintain the order of insertion for FIFO removal - A variable to track the running sum for mean calculation
- A hash table to count occurrences of each number
- Two sorted lists: one for finding the median and another for tracking modes (sorted by frequency descending, then by value ascending)
Each operation is optimized: getMean()
runs in O(1)
, getMedian()
and getMode()
leverage sorted data structures for efficient access, while addNumber()
and removeFirstAddedNumber()
run in O(log n)
due to the sorted list operations.
Intuition
The key insight is that we need to efficiently support different statistical operations while maintaining the ability to remove elements in FIFO order. This suggests we need multiple data structures, each optimized for a specific operation.
For tracking insertion order, a queue is the natural choice since we need FIFO removal. We can't rely solely on the queue for statistics though, because finding the median or mode in an unsorted queue would require O(n log n)
sorting or O(n)
scanning for each query.
For the mean, we realize we don't need to sum all elements every time - we can maintain a running sum. When we add a number, we add it to the sum; when we remove a number, we subtract it. The mean is simply sum // count
.
For the median, we need quick access to the middle element of a sorted sequence. A sorted list allows us to maintain sorted order during insertions and deletions (O(log n)
), and gives us O(1)
access to any element by index. The median is always at index len(q) // 2
due to the problem's specific median definition.
For the mode, we need to track frequencies and quickly find the number with the highest frequency (breaking ties by smallest value). We use two complementary structures:
- A hash table (
cnt
) forO(1)
frequency lookups and updates - A sorted list (
sl2
) that maintains tuples of(number, frequency)
sorted by frequency descending, then by number ascending
The clever part is how we update sl2
: when a number's frequency changes, we first remove the old (number, old_frequency)
tuple, then add the new (number, new_frequency)
tuple. This keeps sl2
always sorted correctly, and the mode is always the first element.
This design trades some space complexity (maintaining multiple structures) for optimal time complexity on each operation, which is crucial when these statistical queries are called frequently.
Learn more about Queue, Binary Search, Data Stream and Heap (Priority Queue) patterns.
Solution Approach
The implementation uses five main components to efficiently handle all operations:
Data Structures:
- Queue (
q
): A deque to maintain insertion order for FIFO removal - Sum variable (
s
): Tracks the running sum of all numbers - Hash table (
cnt
): Maps each number to its occurrence count - Sorted list (
sl
): Maintains all numbers in sorted order for median calculation - Sorted list with custom key (
sl2
): Stores(number, frequency)
tuples, sorted by frequency descending then by number ascending for mode calculation
Implementation Details:
addNumber(number)
:
- Append the number to queue
q
for maintaining insertion order - Add the number to sorted list
sl
to maintain sorted order - Update the frequency tracking in
sl2
:- First discard the old tuple
(number, cnt[number])
fromsl2
- Increment
cnt[number]
by 1 - Add the new tuple
(number, cnt[number])
tosl2
- First discard the old tuple
- Add the number to running sum
s
- Time complexity:
O(log n)
due to sorted list operations
removeFirstAddedNumber()
:
- Pop the first element from queue
q
usingpopleft()
- Remove this number from sorted list
sl
- Update the frequency tracking in
sl2
:- Discard the old tuple
(number, cnt[number])
fromsl2
- Decrement
cnt[number]
by 1 - Add the updated tuple
(number, cnt[number])
tosl2
- Discard the old tuple
- Subtract the number from running sum
s
- Time complexity:
O(log n)
due to sorted list operations
getMean()
:
- Simply return
s // len(q)
(integer division for floored mean) - Time complexity:
O(1)
getMedian()
:
- Access the element at index
len(q) // 2
in sorted listsl
- This works because for even-length arrays,
len(q) // 2
gives us the larger of the two middle elements - Time complexity:
O(1)
in Python (direct index access in SortedList)
getMode()
:
- Return the first element's number from
sl2[0][0]
- Since
sl2
is sorted by frequency descending (using negative frequency in the key), the first element has the highest frequency - When frequencies are tied, the sorting by number ascending ensures we get the smallest number
- Time complexity:
O(1)
The key insight in the sorting key for sl2
is using lambda x: (-x[1], x[0])
, which sorts by negative frequency (highest first) and then by the number itself (smallest first) for tie-breaking.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a sequence of operations to see how the data structures work together:
Initial State:
- Queue
q = []
- Sum
s = 0
- Count map
cnt = {}
- Sorted list
sl = []
- Mode list
sl2 = []
Operation 1: addNumber(5)
- Add 5 to queue:
q = [5]
- Add 5 to sorted list:
sl = [5]
- Update frequency tracking:
- Old tuple (5, 0) doesn't exist in sl2, so no removal
- Update cnt:
cnt = {5: 1}
- Add new tuple:
sl2 = [(5, 1)]
sorted by(-1, 5)
- Update sum:
s = 5
Operation 2: addNumber(3)
- Add 3 to queue:
q = [5, 3]
- Add 3 to sorted list:
sl = [3, 5]
(maintains sorted order) - Update frequency tracking:
- cnt becomes
{5: 1, 3: 1}
- Add (3, 1):
sl2 = [(3, 1), (5, 1)]
sorted by(-1, 3)
and(-1, 5)
- cnt becomes
- Update sum:
s = 8
Operation 3: addNumber(5)
- Add 5 to queue:
q = [5, 3, 5]
- Add 5 to sorted list:
sl = [3, 5, 5]
- Update frequency tracking:
- Remove old (5, 1) from sl2
- cnt becomes
{5: 2, 3: 1}
- Add new (5, 2):
sl2 = [(5, 2), (3, 1)]
sorted by(-2, 5)
then(-1, 3)
- Update sum:
s = 13
Query: getMean()
- Return
s // len(q) = 13 // 3 = 4
Query: getMedian()
- Index =
len(q) // 2 = 3 // 2 = 1
- Return
sl[1] = 5
Query: getMode()
- Return
sl2[0][0] = 5
(highest frequency is 2)
Operation 4: removeFirstAddedNumber()
- Remove first element (5) from queue:
q = [3, 5]
- Remove 5 from sorted list:
sl = [3, 5]
- Update frequency tracking:
- Remove old (5, 2) from sl2
- cnt becomes
{5: 1, 3: 1}
- Add new (5, 1):
sl2 = [(3, 1), (5, 1)]
(both have frequency 1, sorted by value)
- Update sum:
s = 8
Query: getMode()
- Return
sl2[0][0] = 3
(both have frequency 1, return smallest value)
This example demonstrates how:
- The queue maintains insertion order for FIFO removal
- The sorted list provides O(1) median access
- The frequency tracking with sl2 ensures the mode is always at index 0
- The running sum allows O(1) mean calculation
Solution Implementation
1from collections import deque, defaultdict
2from sortedcontainers import SortedList
3
4
5class StatisticsTracker:
6 def __init__(self):
7 # Queue to maintain insertion order for FIFO removal
8 self.queue = deque()
9
10 # Running sum for mean calculation
11 self.total_sum = 0
12
13 # Dictionary to track frequency of each number
14 self.frequency_map = defaultdict(int)
15
16 # Sorted list for median calculation
17 self.sorted_numbers = SortedList()
18
19 # Sorted list for mode calculation
20 # Key sorts by: highest frequency first (negative), then smallest number
21 self.frequency_sorted = SortedList(key=lambda x: (-x[1], x[0]))
22
23 def addNumber(self, number: int) -> None:
24 """Add a number to the tracker and update all statistics"""
25 # Add to queue for FIFO tracking
26 self.queue.append(number)
27
28 # Add to sorted list for median
29 self.sorted_numbers.add(number)
30
31 # Update frequency tracking for mode
32 # Remove old frequency entry if exists
33 self.frequency_sorted.discard((number, self.frequency_map[number]))
34 # Increment frequency
35 self.frequency_map[number] += 1
36 # Add new frequency entry
37 self.frequency_sorted.add((number, self.frequency_map[number]))
38
39 # Update running sum
40 self.total_sum += number
41
42 def removeFirstAddedNumber(self) -> None:
43 """Remove the oldest number (FIFO) and update all statistics"""
44 # Remove oldest number from queue
45 number = self.queue.popleft()
46
47 # Remove from sorted list for median
48 self.sorted_numbers.remove(number)
49
50 # Update frequency tracking for mode
51 # Remove old frequency entry
52 self.frequency_sorted.discard((number, self.frequency_map[number]))
53 # Decrement frequency
54 self.frequency_map[number] -= 1
55 # Add new frequency entry (even if frequency is 0)
56 self.frequency_sorted.add((number, self.frequency_map[number]))
57
58 # Update running sum
59 self.total_sum -= number
60
61 def getMean(self) -> int:
62 """Return the mean (average) of all numbers, using integer division"""
63 return self.total_sum // len(self.queue)
64
65 def getMedian(self) -> int:
66 """Return the median (middle value when sorted)"""
67 # For even number of elements, returns the upper middle element
68 return self.sorted_numbers[len(self.queue) // 2]
69
70 def getMode(self) -> int:
71 """Return the mode (most frequent number, smallest if tie)"""
72 # First element has highest frequency due to sorting key
73 return self.frequency_sorted[0][0]
74
75
76# Your StatisticsTracker object will be instantiated and called as such:
77# obj = StatisticsTracker()
78# obj.addNumber(number)
79# obj.removeFirstAddedNumber()
80# param_3 = obj.getMean()
81# param_4 = obj.getMedian()
82# param_5 = obj.getMode()
83
1/**
2 * A data structure that maintains the median of a dynamic collection of numbers
3 * with support for both insertion and deletion operations.
4 * Uses two heaps (max heap for smaller half, min heap for larger half) and lazy deletion.
5 */
6class MedianFinder {
7 // Max heap to store the smaller half of numbers
8 private final PriorityQueue<Integer> smallerHalf = new PriorityQueue<>(Comparator.reverseOrder());
9
10 // Min heap to store the larger half of numbers
11 private final PriorityQueue<Integer> largerHalf = new PriorityQueue<>();
12
13 // Map to track numbers that need to be deleted lazily
14 private final Map<Integer, Integer> delayedDeletions = new HashMap<>();
15
16 // Actual size of smaller half (excluding delayed deletions)
17 private int smallerHalfSize;
18
19 // Actual size of larger half (excluding delayed deletions)
20 private int largerHalfSize;
21
22 /**
23 * Adds a number to the median finder.
24 * @param num The number to add
25 */
26 public void addNum(int num) {
27 // Determine which heap to add the number to
28 if (smallerHalf.isEmpty() || num <= smallerHalf.peek()) {
29 smallerHalf.offer(num);
30 smallerHalfSize++;
31 } else {
32 largerHalf.offer(num);
33 largerHalfSize++;
34 }
35
36 // Ensure heaps remain balanced
37 rebalance();
38 }
39
40 /**
41 * Finds and returns the current median.
42 * @return The median value
43 */
44 public Integer findMedian() {
45 // If sizes are equal, median is the smaller element of the larger half
46 // Otherwise, median is the largest element of the smaller half
47 return smallerHalfSize == largerHalfSize ? largerHalf.peek() : smallerHalf.peek();
48 }
49
50 /**
51 * Removes a number from the median finder using lazy deletion.
52 * @param num The number to remove
53 */
54 public void removeNum(int num) {
55 // Mark the number for lazy deletion
56 delayedDeletions.merge(num, 1, Integer::sum);
57
58 // Update size and prune if necessary
59 if (num <= smallerHalf.peek()) {
60 smallerHalfSize--;
61 if (num == smallerHalf.peek()) {
62 prune(smallerHalf);
63 }
64 } else {
65 largerHalfSize--;
66 if (num == largerHalf.peek()) {
67 prune(largerHalf);
68 }
69 }
70
71 // Ensure heaps remain balanced after removal
72 rebalance();
73 }
74
75 /**
76 * Removes elements from the top of the heap that are marked for deletion.
77 * @param heap The heap to prune
78 */
79 private void prune(PriorityQueue<Integer> heap) {
80 while (!heap.isEmpty() && delayedDeletions.containsKey(heap.peek())) {
81 int topElement = heap.peek();
82
83 // Decrement the deletion count for this element
84 if (delayedDeletions.merge(topElement, -1, Integer::sum) == 0) {
85 delayedDeletions.remove(topElement);
86 }
87
88 // Remove the element from the heap
89 heap.poll();
90 }
91 }
92
93 /**
94 * Rebalances the two heaps to maintain the median property.
95 * Ensures that the size difference between heaps is at most 1.
96 */
97 private void rebalance() {
98 // If smaller half has more than one extra element, move one to larger half
99 if (smallerHalfSize > largerHalfSize + 1) {
100 largerHalf.offer(smallerHalf.poll());
101 smallerHalfSize--;
102 largerHalfSize++;
103 prune(smallerHalf);
104 }
105 // If larger half has more elements, move one to smaller half
106 else if (smallerHalfSize < largerHalfSize) {
107 smallerHalf.offer(largerHalf.poll());
108 largerHalfSize--;
109 smallerHalfSize++;
110 prune(largerHalf);
111 }
112 }
113}
114
115/**
116 * A statistics tracker that maintains mean, median, and mode for a dynamic collection
117 * of numbers with support for adding numbers and removing the oldest added number.
118 */
119class StatisticsTracker {
120 // Queue to maintain insertion order for FIFO removal
121 private final Deque<Integer> numberQueue = new ArrayDeque<>();
122
123 // Sum of all current numbers
124 private long sum;
125
126 // Frequency count of each number
127 private final Map<Integer, Integer> frequencyMap = new HashMap<>();
128
129 // Median finder utility
130 private final MedianFinder medianFinder = new MedianFinder();
131
132 // TreeSet to efficiently track mode (sorted by frequency descending, then by value ascending)
133 private final TreeSet<int[]> frequencySet = new TreeSet<>((a, b) ->
134 a[1] == b[1] ? a[0] - b[0] : b[1] - a[1]);
135
136 /**
137 * Constructor for StatisticsTracker.
138 */
139 public StatisticsTracker() {
140 }
141
142 /**
143 * Adds a new number to the tracker.
144 * @param number The number to add
145 */
146 public void addNumber(int number) {
147 // Add to queue for order tracking
148 numberQueue.offerLast(number);
149
150 // Update sum
151 sum += number;
152
153 // Update frequency tracking for mode calculation
154 int oldFrequency = frequencyMap.getOrDefault(number, 0);
155 frequencySet.remove(new int[] {number, oldFrequency});
156 frequencyMap.merge(number, 1, Integer::sum);
157 int newFrequency = frequencyMap.get(number);
158 frequencySet.add(new int[] {number, newFrequency});
159
160 // Update median finder
161 medianFinder.addNum(number);
162 }
163
164 /**
165 * Removes the first (oldest) added number from the tracker.
166 */
167 public void removeFirstAddedNumber() {
168 // Remove the oldest number from queue
169 int number = numberQueue.pollFirst();
170
171 // Update sum
172 sum -= number;
173
174 // Update frequency tracking for mode calculation
175 int oldFrequency = frequencyMap.get(number);
176 frequencySet.remove(new int[] {number, oldFrequency});
177 frequencyMap.merge(number, -1, Integer::sum);
178 int newFrequency = frequencyMap.get(number);
179 frequencySet.add(new int[] {number, newFrequency});
180
181 // Update median finder
182 medianFinder.removeNum(number);
183 }
184
185 /**
186 * Calculates and returns the mean of current numbers.
187 * @return The mean value as an integer
188 */
189 public int getMean() {
190 return (int) (sum / numberQueue.size());
191 }
192
193 /**
194 * Returns the median of current numbers.
195 * @return The median value
196 */
197 public int getMedian() {
198 return medianFinder.findMedian();
199 }
200
201 /**
202 * Returns the mode (most frequent number) of current numbers.
203 * @return The mode value
204 */
205 public int getMode() {
206 return frequencySet.first()[0];
207 }
208}
209
1class MedianFinder {
2public:
3 // Add a number to the data structure
4 void addNum(int num) {
5 // Add to max heap (small values) if it's empty or num is smaller than max heap's top
6 if (maxHeap.empty() || num <= maxHeap.top()) {
7 maxHeap.push(num);
8 ++maxHeapSize;
9 } else {
10 // Otherwise add to min heap (large values)
11 minHeap.push(num);
12 ++minHeapSize;
13 }
14 rebalance();
15 }
16
17 // Remove a number from the data structure (lazy deletion)
18 void removeNum(int num) {
19 // Mark the number for delayed removal
20 ++delayedRemoval[num];
21
22 // Update size and prune if necessary
23 if (num <= maxHeap.top()) {
24 --maxHeapSize;
25 if (num == maxHeap.top()) {
26 pruneHeap(maxHeap);
27 }
28 } else {
29 --minHeapSize;
30 if (num == minHeap.top()) {
31 pruneHeap(minHeap);
32 }
33 }
34 rebalance();
35 }
36
37 // Find the median of current numbers
38 int findMedian() {
39 // If equal sizes, median is the smaller of the two tops; otherwise it's from the larger heap
40 return maxHeapSize == minHeapSize ? minHeap.top() : maxHeap.top();
41 }
42
43private:
44 priority_queue<int> maxHeap; // Max heap for smaller half
45 priority_queue<int, vector<int>, greater<int>> minHeap; // Min heap for larger half
46 unordered_map<int, int> delayedRemoval; // Track numbers to be removed lazily
47 int maxHeapSize = 0; // Effective size of max heap
48 int minHeapSize = 0; // Effective size of min heap
49
50 // Remove delayed elements from the top of the heap
51 template <typename T>
52 void pruneHeap(T& heap) {
53 while (!heap.empty() && delayedRemoval[heap.top()]) {
54 // Decrement the delayed count
55 if (--delayedRemoval[heap.top()] == 0) {
56 delayedRemoval.erase(heap.top());
57 }
58 heap.pop();
59 }
60 }
61
62 // Maintain balance between two heaps
63 void rebalance() {
64 // Max heap should have at most one more element than min heap
65 if (maxHeapSize > minHeapSize + 1) {
66 minHeap.push(maxHeap.top());
67 maxHeap.pop();
68 --maxHeapSize;
69 ++minHeapSize;
70 pruneHeap(maxHeap);
71 }
72 // Min heap should never have more elements than max heap
73 else if (maxHeapSize < minHeapSize) {
74 maxHeap.push(minHeap.top());
75 minHeap.pop();
76 ++maxHeapSize;
77 --minHeapSize;
78 pruneHeap(minHeap);
79 }
80 }
81};
82
83class StatisticsTracker {
84private:
85 queue<int> numberQueue; // Queue to maintain insertion order
86 long long sum = 0; // Sum of all current numbers
87 unordered_map<int, int> frequencyMap; // Frequency count of each number
88 MedianFinder medianFinder; // Helper class to find median
89 set<pair<int, int>> frequencySet; // Set sorted by frequency (negative for max frequency first)
90
91public:
92 StatisticsTracker() {}
93
94 // Add a new number to the tracker
95 void addNumber(int number) {
96 // Add to queue and update sum
97 numberQueue.push(number);
98 sum += number;
99
100 // Update frequency in the sorted set
101 frequencySet.erase({-frequencyMap[number], number});
102 frequencyMap[number]++;
103 medianFinder.addNum(number);
104 frequencySet.insert({-frequencyMap[number], number});
105 }
106
107 // Remove the oldest number from the tracker
108 void removeFirstAddedNumber() {
109 // Get and remove the first added number
110 int number = numberQueue.front();
111 numberQueue.pop();
112 sum -= number;
113
114 // Update frequency in the sorted set
115 frequencySet.erase({-frequencyMap[number], number});
116 frequencyMap[number]--;
117
118 // Only re-insert if frequency is still positive
119 if (frequencyMap[number] > 0) {
120 frequencySet.insert({-frequencyMap[number], number});
121 }
122
123 medianFinder.removeNum(number);
124 }
125
126 // Calculate and return the mean of all numbers
127 int getMean() {
128 return static_cast<int>(sum / numberQueue.size());
129 }
130
131 // Get the median of all numbers
132 int getMedian() {
133 return medianFinder.findMedian();
134 }
135
136 // Get the mode (most frequent number, smallest if tie)
137 int getMode() {
138 return frequencySet.begin()->second;
139 }
140};
141
1// Queue to maintain insertion order of numbers
2let numberQueue: number[] = [];
3// Sum of all current numbers
4let sum: number = 0;
5// Frequency count of each number
6let frequencyMap: Map<number, number> = new Map();
7// Max heap for smaller half of numbers
8let maxHeap: number[] = [];
9// Min heap for larger half of numbers
10let minHeap: number[] = [];
11// Track numbers to be removed lazily from heaps
12let delayedRemoval: Map<number, number> = new Map();
13// Effective size of max heap
14let maxHeapSize: number = 0;
15// Effective size of min heap
16let minHeapSize: number = 0;
17// Set sorted by frequency (stores [frequency, number] pairs)
18let frequencySet: Array<[number, number]> = [];
19
20// Helper function to maintain max heap property
21function maxHeapPush(num: number): void {
22 maxHeap.push(num);
23 let childIndex = maxHeap.length - 1;
24 while (childIndex > 0) {
25 const parentIndex = Math.floor((childIndex - 1) / 2);
26 if (maxHeap[parentIndex] >= maxHeap[childIndex]) break;
27 [maxHeap[parentIndex], maxHeap[childIndex]] = [maxHeap[childIndex], maxHeap[parentIndex]];
28 childIndex = parentIndex;
29 }
30}
31
32// Helper function to pop from max heap
33function maxHeapPop(): number {
34 const result = maxHeap[0];
35 maxHeap[0] = maxHeap[maxHeap.length - 1];
36 maxHeap.pop();
37
38 let parentIndex = 0;
39 while (true) {
40 const leftChild = 2 * parentIndex + 1;
41 const rightChild = 2 * parentIndex + 2;
42 let largest = parentIndex;
43
44 if (leftChild < maxHeap.length && maxHeap[leftChild] > maxHeap[largest]) {
45 largest = leftChild;
46 }
47 if (rightChild < maxHeap.length && maxHeap[rightChild] > maxHeap[largest]) {
48 largest = rightChild;
49 }
50 if (largest === parentIndex) break;
51
52 [maxHeap[parentIndex], maxHeap[largest]] = [maxHeap[largest], maxHeap[parentIndex]];
53 parentIndex = largest;
54 }
55 return result;
56}
57
58// Helper function to maintain min heap property
59function minHeapPush(num: number): void {
60 minHeap.push(num);
61 let childIndex = minHeap.length - 1;
62 while (childIndex > 0) {
63 const parentIndex = Math.floor((childIndex - 1) / 2);
64 if (minHeap[parentIndex] <= minHeap[childIndex]) break;
65 [minHeap[parentIndex], minHeap[childIndex]] = [minHeap[childIndex], minHeap[parentIndex]];
66 childIndex = parentIndex;
67 }
68}
69
70// Helper function to pop from min heap
71function minHeapPop(): number {
72 const result = minHeap[0];
73 minHeap[0] = minHeap[minHeap.length - 1];
74 minHeap.pop();
75
76 let parentIndex = 0;
77 while (true) {
78 const leftChild = 2 * parentIndex + 1;
79 const rightChild = 2 * parentIndex + 2;
80 let smallest = parentIndex;
81
82 if (leftChild < minHeap.length && minHeap[leftChild] < minHeap[smallest]) {
83 smallest = leftChild;
84 }
85 if (rightChild < minHeap.length && minHeap[rightChild] < minHeap[smallest]) {
86 smallest = rightChild;
87 }
88 if (smallest === parentIndex) break;
89
90 [minHeap[parentIndex], minHeap[smallest]] = [minHeap[smallest], minHeap[parentIndex]];
91 parentIndex = smallest;
92 }
93 return result;
94}
95
96// Remove delayed elements from the top of max heap
97function pruneMaxHeap(): void {
98 while (maxHeap.length > 0 && delayedRemoval.has(maxHeap[0]) && delayedRemoval.get(maxHeap[0])! > 0) {
99 const count = delayedRemoval.get(maxHeap[0])! - 1;
100 if (count === 0) {
101 delayedRemoval.delete(maxHeap[0]);
102 } else {
103 delayedRemoval.set(maxHeap[0], count);
104 }
105 maxHeapPop();
106 }
107}
108
109// Remove delayed elements from the top of min heap
110function pruneMinHeap(): void {
111 while (minHeap.length > 0 && delayedRemoval.has(minHeap[0]) && delayedRemoval.get(minHeap[0])! > 0) {
112 const count = delayedRemoval.get(minHeap[0])! - 1;
113 if (count === 0) {
114 delayedRemoval.delete(minHeap[0]);
115 } else {
116 delayedRemoval.set(minHeap[0], count);
117 }
118 minHeapPop();
119 }
120}
121
122// Maintain balance between two heaps
123function rebalance(): void {
124 // Max heap should have at most one more element than min heap
125 if (maxHeapSize > minHeapSize + 1) {
126 minHeapPush(maxHeap[0]);
127 maxHeapPop();
128 maxHeapSize--;
129 minHeapSize++;
130 pruneMaxHeap();
131 }
132 // Min heap should never have more elements than max heap
133 else if (maxHeapSize < minHeapSize) {
134 maxHeapPush(minHeap[0]);
135 minHeapPop();
136 maxHeapSize++;
137 minHeapSize--;
138 pruneMinHeap();
139 }
140}
141
142// Add a number to the median finder data structure
143function addNumToMedianFinder(num: number): void {
144 // Add to max heap (small values) if it's empty or num is smaller than max heap's top
145 if (maxHeap.length === 0 || num <= maxHeap[0]) {
146 maxHeapPush(num);
147 maxHeapSize++;
148 } else {
149 // Otherwise add to min heap (large values)
150 minHeapPush(num);
151 minHeapSize++;
152 }
153 rebalance();
154}
155
156// Remove a number from the median finder data structure (lazy deletion)
157function removeNumFromMedianFinder(num: number): void {
158 // Mark the number for delayed removal
159 delayedRemoval.set(num, (delayedRemoval.get(num) || 0) + 1);
160
161 // Update size and prune if necessary
162 if (maxHeap.length > 0 && num <= maxHeap[0]) {
163 maxHeapSize--;
164 if (num === maxHeap[0]) {
165 pruneMaxHeap();
166 }
167 } else {
168 minHeapSize--;
169 if (minHeap.length > 0 && num === minHeap[0]) {
170 pruneMinHeap();
171 }
172 }
173 rebalance();
174}
175
176// Find the median of current numbers
177function findMedian(): number {
178 // If equal sizes, median is the smaller of the two tops; otherwise it's from the larger heap
179 return maxHeapSize === minHeapSize ? minHeap[0] : maxHeap[0];
180}
181
182// Update frequency set by removing old entry and adding new one
183function updateFrequencySet(number: number, oldFreq: number, newFreq: number): void {
184 // Remove old frequency entry if it exists
185 if (oldFreq > 0) {
186 const indexToRemove = frequencySet.findIndex(
187 ([freq, num]) => freq === -oldFreq && num === number
188 );
189 if (indexToRemove !== -1) {
190 frequencySet.splice(indexToRemove, 1);
191 }
192 }
193
194 // Add new frequency entry if frequency is positive
195 if (newFreq > 0) {
196 frequencySet.push([-newFreq, number]);
197 // Sort by frequency (descending) then by number (ascending)
198 frequencySet.sort((a, b) => {
199 if (a[0] !== b[0]) return a[0] - b[0];
200 return a[1] - b[1];
201 });
202 }
203}
204
205// Add a new number to the tracker
206function addNumber(number: number): void {
207 // Add to queue and update sum
208 numberQueue.push(number);
209 sum += number;
210
211 // Update frequency in the map and sorted set
212 const oldFreq = frequencyMap.get(number) || 0;
213 const newFreq = oldFreq + 1;
214 frequencyMap.set(number, newFreq);
215 updateFrequencySet(number, oldFreq, newFreq);
216
217 // Add to median finder
218 addNumToMedianFinder(number);
219}
220
221// Remove the oldest number from the tracker
222function removeFirstAddedNumber(): void {
223 // Get and remove the first added number
224 const number = numberQueue.shift()!;
225 sum -= number;
226
227 // Update frequency in the map and sorted set
228 const oldFreq = frequencyMap.get(number)!;
229 const newFreq = oldFreq - 1;
230
231 if (newFreq > 0) {
232 frequencyMap.set(number, newFreq);
233 } else {
234 frequencyMap.delete(number);
235 }
236 updateFrequencySet(number, oldFreq, newFreq);
237
238 // Remove from median finder
239 removeNumFromMedianFinder(number);
240}
241
242// Calculate and return the mean of all numbers
243function getMean(): number {
244 return Math.floor(sum / numberQueue.length);
245}
246
247// Get the median of all numbers
248function getMedian(): number {
249 return findMedian();
250}
251
252// Get the mode (most frequent number, smallest if tie)
253function getMode(): number {
254 return frequencySet[0][1];
255}
256
Time and Space Complexity
Time Complexity:
-
addNumber(number)
:O(log n)
wheren
is the current number of elementsdeque.append()
:O(1)
SortedList.add()
:O(log n)
SortedList.discard()
:O(log n)
SortedList.add()
for sl2:O(log n)
- Dictionary operations and arithmetic:
O(1)
-
removeFirstAddedNumber()
:O(log n)
wheren
is the current number of elementsdeque.popleft()
:O(1)
SortedList.remove()
:O(log n)
SortedList.discard()
:O(log n)
SortedList.add()
for sl2:O(log n)
- Dictionary operations and arithmetic:
O(1)
-
getMean()
:O(1)
- Simple division operation
-
getMedian()
:O(log n)
- Accessing element by index in SortedList:
O(log n)
- Accessing element by index in SortedList:
-
getMode()
:O(log n)
- Accessing element by index in SortedList:
O(log n)
- Accessing element by index in SortedList:
Space Complexity: O(n)
where n
is the number of added numbers
deque
: stores alln
numbers -O(n)
SortedList sl
: stores alln
numbers -O(n)
defaultdict cnt
: stores at mostn
unique numbers and their counts -O(n)
SortedList sl2
: stores at mostn
unique (number, count) pairs -O(n)
- Variable
s
: single integer -O(1)
The overall space complexity is O(n)
as all data structures grow linearly with the number of elements stored.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Median Calculation for Even-Length Arrays
A frequent mistake is using (len(q) - 1) // 2
or misunderstanding which middle element to return when there's an even count of numbers. The problem specifically asks for the larger of the two middle elements.
Incorrect approach:
def getMedian(self) -> int:
# Wrong: This returns the smaller middle element for even counts
return self.sorted_numbers[(len(self.queue) - 1) // 2]
Correct approach:
def getMedian(self) -> int:
# Correct: len(q) // 2 gives the upper middle element
# For n=4: indices [0,1,2,3], len//2 = 2 (correct)
# For n=5: indices [0,1,2,3,4], len//2 = 2 (correct middle)
return self.sorted_numbers[len(self.queue) // 2]
2. Not Handling Zero Frequency in removeFirstAddedNumber()
When removing a number, developers often forget that after decrementing, a number might have zero occurrences. Leaving entries with zero frequency in the frequency_sorted list can cause incorrect mode calculations.
Problematic approach:
def removeFirstAddedNumber(self) -> None:
number = self.queue.popleft()
self.frequency_sorted.discard((number, self.frequency_map[number]))
self.frequency_map[number] -= 1
# Bug: Only add back if frequency > 0
if self.frequency_map[number] > 0:
self.frequency_sorted.add((number, self.frequency_map[number]))
Better approach: Either add all entries (including zero frequency) and filter when getting mode, or properly clean up zero-frequency entries:
def removeFirstAddedNumber(self) -> None:
number = self.queue.popleft()
self.frequency_sorted.discard((number, self.frequency_map[number]))
self.frequency_map[number] -= 1
if self.frequency_map[number] > 0:
self.frequency_sorted.add((number, self.frequency_map[number]))
else:
# Clean up zero frequency from the map
del self.frequency_map[number]
3. Forgetting to Update frequency_sorted Before Changing Frequency
The order of operations is crucial. You must remove the old (number, frequency) tuple from frequency_sorted before updating the frequency in frequency_map.
Wrong order:
def addNumber(self, number: int) -> None:
# Wrong: Incrementing before removing old tuple
self.frequency_map[number] += 1
# This won't find the tuple since frequency has changed!
self.frequency_sorted.discard((number, self.frequency_map[number] - 1))
self.frequency_sorted.add((number, self.frequency_map[number]))
Correct order:
def addNumber(self, number: int) -> None:
# Remove old tuple first (using current frequency)
self.frequency_sorted.discard((number, self.frequency_map[number]))
# Then update frequency
self.frequency_map[number] += 1
# Add new tuple with updated frequency
self.frequency_sorted.add((number, self.frequency_map[number]))
4. Using Regular Division Instead of Integer Division for getMean()
The problem explicitly asks for the floored mean, which requires integer division.
Incorrect:
def getMean(self) -> int:
# Wrong: Returns float, not int
return self.total_sum / len(self.queue)
Correct:
def getMean(self) -> int:
# Correct: Integer division floors the result
return self.total_sum // len(self.queue)
5. Incorrect Sorting Key for Mode Calculation
The mode requires finding the most frequent number, with ties broken by selecting the smallest number. Using the wrong sorting key will produce incorrect results.
Common mistakes:
# Wrong: Sorts by frequency ascending (smallest first) self.frequency_sorted = SortedList(key=lambda x: (x[1], x[0])) # Wrong: Doesn't handle tie-breaking correctly self.frequency_sorted = SortedList(key=lambda x: -x[1])
Correct approach:
# Correct: -x[1] for descending frequency, x[0] for ascending number self.frequency_sorted = SortedList(key=lambda x: (-x[1], x[0]))
Which data structure is used in a depth first search?
Recommended Readings
Queue Intro Think of the last time you stood in line to buy movie tickets The first person in line gets their ticket first and any newcomers join the end of the line This system where the first person to arrive is the first to be served is a queue in real
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
Median of Data Stream Given a stream of numbers find the median number at any given time accurate to 1 decimal place Example add_number 1 add_number 2 add_number 3 get_median 2 0 add_number 4 get_median 2 5 Try it yourself Explanation Intuition Brute force way is to sort the entire list every time we get a new number This would be O N log N for each add_number operation One step up is to notice that the list is sorted before we add any new number to it So every
Want a Structured Path to Master System Design Too? Donβt Miss This!