1825. Finding MK Average


Problem Description

The given problem involves creating a data structure that calculates the MKAverage for a sequence of integers. To calculate this MKAverage, you have to follow a specific series of steps:

  1. Maintain the last m integers of the stream. If there are fewer than m integers in the stream, the MKAverage is -1.
  2. From these m integers, remove the smallest k integers and the largest k integers.
  3. The MKAverage is the average of the remaining integers after the removal, rounded down to the nearest integer.

This calculation needs to happen every time a new integer is added to the stream, and the MKAverage is requested.

Intuition

The solution approach uses a balanced approach to maintain the order of numbers by categorizing them into three separate SortedList objects: lo for the smallest k numbers, hi for the largest k numbers, and mid for the rest. Additionally, we maintain a running sum s of the numbers in mid. Here's how we arrive at the solution:

  • When adding an element, determine where it should go: lo, hi, or mid. Add the element to the appropriate SortedList, adjusting the running sum s if the element is added to mid.
  • If the total count of elements exceeds m, remove the oldest element. Depending on which list the oldest element is in, update s.
  • Periodically move elements between the lists to maintain exactly k elements in both lo and hi.
    • If lo has more than k elements, move the largest of these to mid. Adjust s by adding the moved element's value.
    • If hi has more than k elements, move the smallest of these to mid. Adjust s by adding the moved element's value.
    • If lo has fewer than k elements and mid has elements, move the smallest element from mid to lo. Adjust s by subtracting the moved element's value.
    • If hi has fewer than k elements and mid has elements, move the largest element from mid to hi. Adjust s by subtracting the moved element's value.

By keeping track of these transitions and maintaining the correct number of elements in each list, we can quickly calculate the MKAverage when requested by dividing the running sum s by the number of elements in mid, which is m - 2 * k.

Learn more about Queue, Data Stream and Heap (Priority Queue) patterns.

Solution Approach

The solution to this problem utilizes several data structures and algorithms to efficiently maintain the necessary elements and perform calculations on demand.

Data Structures Used:

  1. SortedList (from sortedcontainers): A SortedList is used three times within the class to maintain the smallest k elements (lo), the largest k elements (hi), and the middle elements (mid). This data structure provides efficient insertion, removal, and access of the smallest and largest elements.
  2. deque: A deque (double-ended queue) is used to keep track of the order of elements, specifically to remove the oldest element once the size exceeds m.

Patterns and Algorithms:

  1. Balanced Partitioning: The stream's elements are partitioned across lo, mid, and hi such that:

    • lo contains the smallest k elements.
    • hi contains the largest k elements.
    • mid contains the remaining elements.
  2. Dynamic Update: The running sum s of the elements in mid is dynamically updated whenever an element is:

    • Added to mid: Sum is incremented by the value of the new element.
    • Removed from mid: Sum is decremented by the value of the removed element.
  3. Stream Buffering: The deque serves as a buffer for the stream, enforcing the maximum length of m elements by removing the oldest entry when exceeded.

  4. Conditional Rebalancing: The addElement method rebalances the three SortedLists whenever an element is added or removed. This ensures that lo and hi always have exactly k elements, if possible, while the remaining elements are in mid.

  5. Efficiency: By maintaining the sorted order and performing selective insertions and deletions, the algorithm avoids having to sort the entire list every time an element is added, significantly improving efficiency.

Here is the breakdown of the key steps in the addElement method:

  • Determine the List for the New Element: Upon adding a new element num, we have three cases:

    • If num is less than or equal to the last element in lo (or lo is empty), add num to lo.
    • If num is greater than or equal to the first element in hi (or hi is empty), add num to hi.
    • Otherwise, add num to mid and adjust the sum s.
  • Handle Stream Exceeding Capacity m: If adding num causes the stream to exceed m elements, remove the oldest element and make corresponding adjustments to maintain the size invariant.

  • Rebalance SortedLists: After the addition and potential removal of elements, check the sizes of lo and hi. We execute several corrective steps:

    • Move elements from lo to mid if lo is too large, and likewise from hi to mid.
    • Conversely, if lo or hi are too small, move elements from mid.

The calculateMKAverage method simply returns the integer division of s by m - 2 * k, provided that we have at least m elements in the stream; otherwise, it returns -1.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's consider an example where m = 5 and k = 1. This implies we are tracking the last 5 numbers, and we will remove the smallest and largest number before calculating the MKAverage.

Following the sequence of steps for the given approach:

  1. Initialize the data structure with m = 5 and k = 1.
  2. Let's add some elements: 3, 1, 10, 5, 5, and request the MKAverage at the end.

Here is how the process will look like step by step:

  • Adding 3: lo = [3], mid = [], hi = [], s = 0. Stream = [3]. MKAverage is not possible yet as we don't have m elements.
  • Adding 1: lo = [1], mid = [3], hi = [], s = 3. Stream = [3, 1]. Still fewer than m elements, hence MKAverage = -1.
  • Adding 10: Now, lo = [1], mid = [3], hi = [10], s = 3. Stream = [3, 1, 10]. Still fewer than m elements, hence MKAverage = -1.
  • Adding 5: It goes into mid: lo = [1], mid = [3, 5], hi = [10], s = 8. Stream = [3, 1, 10, 5]. Still fewer than m elements, hence MKAverage = -1.
  • Adding 5: Another 5 goes into mid: lo = [1], mid = [3, 5, 5], hi = [10], s = 13. Stream = [3, 1, 10, 5, 5]. Now we have m elements, so we can calculate MKAverage = (\lfloor\frac{13}{5 - 2 \times 1}\rfloor) = (\lfloor\frac{13}{3}\rfloor) = 4.

After processing the sequence of numbers 3, 1, 10, 5, 5 with an m value of 5 and a k value of 1, our final SortedLists are:

  • lo = [1] (smallest k = 1 numbers)
  • mid = [3, 5, 5] (middle m - 2 * k = 3 numbers and the sum s of these numbers is 13)
  • hi = [10] (largest k = 1 numbers)

And our current MKAverage is 4, as per the calculation detailed above.

  1. Adding another number (e.g., 7): We need to manage the total elements to be the last m integers. So with the addition of 7, our data structure will look like this:
  • The oldest number (3) will be removed from the Stream = [1, 10, 5, 5, 7]. The corresponding number (3) will be removed from mid and s will be adjusted to s = 10.
  • 7 will be compared to the least element in hi (10) and the largest in lo (1). Since 7 is between them, it will be added to mid: lo = [1], mid = [5, 5, 7], hi = [10], s = 17.
  • Calculate the new MKAverage: MKAverage = (\lfloor\frac{17}{5 - 2 \times 1}\rfloor) = (\lfloor\frac{17}{3}\rfloor) = 5.

The addElement and calculateMKAverage methods thus maintain and compute the MKAverage after each addition, in line with the description provided, utilizing the SortedList partitioning and the dynamic update rules.

Solution Implementation

1from sortedcontainers import SortedList
2from collections import deque
3
4class MKAverage:
5    def __init__(self, m: int, k: int):
6        # Initialize variables:
7        # m represents the length of the moving window.
8        # k represents the number of elements to exclude from the top and bottom.
9        self.m = m
10        self.k = k
11        # Initialize sum of middle segment.
12        self.middle_sum = 0
13        # Initialize queue for the moving window.
14        self.window = deque()
15        # SortedList 'low' will contain the lowest k elements in the window.
16        self.low = SortedList()
17        # SortedList 'middle' will contain the elements other than the top k and bottom k.
18        self.middle = SortedList()
19        # SortedList 'high' will contain the highest k elements in the window.
20        self.high = SortedList()
21
22    def add_element(self, num: int) -> None:
23        # Add a new element to appropriate sorted list and update sums as needed.
24        if not self.low or num <= self.low[-1]:
25            self.low.add(num)
26        elif not self.high or num >= self.high[0]:
27            self.high.add(num)
28        else:
29            self.middle.add(num)
30            self.middle_sum += num
31      
32        # Add the new element to the end of the window deque.
33        self.window.append(num)
34      
35        # If the window size exceeded 'm', remove the oldest element.
36        if len(self.window) > self.m:
37            oldest_element = self.window.popleft()
38            if oldest_element in self.low:
39                self.low.remove(oldest_element)
40            elif oldest_element in self.high:
41                self.high.remove(oldest_element)
42            else:
43                self.middle.remove(oldest_element)
44                self.middle_sum -= oldest_element
45
46        # Rebalance the three sorted lists if necessary.
47        self.rebalance_lists()
48
49    def rebalance_lists(self):
50        # Ensure that low and high always have exactly 'k' elements by moving them to and from middle.
51        while len(self.low) > self.k:
52            x = self.low.pop()
53            self.middle.add(x)
54            self.middle_sum += x
55
56        while len(self.high) > self.k:
57            x = self.high.pop(0)
58            self.middle.add(x)
59            self.middle_sum += x
60
61        while len(self.low) < self.k and self.middle:
62            x = self.middle.pop(0)
63            self.low.add(x)
64            self.middle_sum -= x
65
66        while len(self.high) < self.k and self.middle:
67            x = self.middle.pop()
68            self.high.add(x)
69            self.middle_sum -= x
70
71    def calculate_mk_average(self) -> int:
72        # If there are fewer than 'm' elements, return -1.
73        if len(self.window) < self.m:
74            return -1
75        # Otherwise, calculate and return MKAverage: sum of middle segment divided by its length.
76        return self.middle_sum // (self.m - 2 * self.k)
77
78# Usage
79# obj = MKAverage(m, k)
80# obj.add_element(num)
81# param_2 = obj.calculate_mk_average()
82
1import java.util.ArrayDeque;
2import java.util.Deque;
3import java.util.TreeMap;
4
5class MKAverage {
6
7    private int m, k;
8    private long sumTotal;
9    private int sizeLow, sizeHigh;
10    private Deque<Integer> elementsQueue = new ArrayDeque<>();
11    private TreeMap<Integer, Integer> lowElements = new TreeMap<>();
12    private TreeMap<Integer, Integer> midElements = new TreeMap<>();
13    private TreeMap<Integer, Integer> highElements = new TreeMap<>();
14
15    public MKAverage(int m, int k) {
16        this.m = m;
17        this.k = k;
18    }
19
20    // Adds a new element to the MKAverage structure
21    public void addElement(int num) {
22        // Decide the right bucket and add the number to it
23        if (lowElements.isEmpty() || num <= lowElements.lastKey()) {
24            lowElements.merge(num, 1, Integer::sum);
25            ++sizeLow;
26        } else if (highElements.isEmpty() || num >= highElements.firstKey()) {
27            highElements.merge(num, 1, Integer::sum);
28            ++sizeHigh;
29        } else {
30            midElements.merge(num, 1, Integer::sum);
31            sumTotal += num;
32        }
33      
34        // Add the new number to the end of the deque
35        elementsQueue.offer(num);
36      
37        // If the number of elements exceeds m, remove the oldest one
38        if (elementsQueue.size() > m) {
39            int oldestNum = elementsQueue.poll();
40            // Remove the oldest number from the corresponding bucket
41            if (lowElements.containsKey(oldestNum)) {
42                if (lowElements.merge(oldestNum, -1, Integer::sum) == 0) {
43                    lowElements.remove(oldestNum);
44                }
45                --sizeLow;
46            } else if (highElements.containsKey(oldestNum)) {
47                if (highElements.merge(oldestNum, -1, Integer::sum) == 0) {
48                    highElements.remove(oldestNum);
49                }
50                --sizeHigh;
51            } else {
52                if (midElements.merge(oldestNum, -1, Integer::sum) == 0) {
53                    midElements.remove(oldestNum);
54                }
55                sumTotal -= oldestNum;
56            }
57        }
58      
59        // Rebalance the buckets if needed
60        balanceBuckets();
61    }
62
63    // Calculates the MKAverage value
64    public int calculateMKAverage() {
65        // If not enough elements present, return -1
66        if (elementsQueue.size() < m) return -1;
67        // Calculate the average, excluding k smallest and k largest elements
68        return (int) (sumTotal / (m - k * 2));
69    }
70
71    // Helper function to rebalance the buckets
72    private void balanceBuckets() {
73        // Move elements from low to middle
74        while (sizeLow > k) {
75            int x = lowElements.lastKey();
76            if (lowElements.merge(x, -1, Integer::sum) == 0) {
77                lowElements.remove(x);
78            }
79            midElements.merge(x, 1, Integer::sum);
80            sumTotal += x;
81            --sizeLow;
82        }
83        // Move elements from high to middle
84        while (sizeHigh > k) {
85            int x = highElements.firstKey();
86            if (highElements.merge(x, -1, Integer::sum) == 0) {
87                highElements.remove(x);
88            }
89            midElements.merge(x, 1, Integer::sum);
90            sumTotal += x;
91            --sizeHigh;
92        }
93        // Move elements from middle to low
94        while (sizeLow < k && !midElements.isEmpty()) {
95            int x = midElements.firstKey();
96            if (midElements.merge(x, -1, Integer::sum) == 0) {
97                midElements.remove(x);
98            }
99            sumTotal -= x;
100            lowElements.merge(x, 1, Integer::sum);
101            ++sizeLow;
102        }
103        // Move elements from middle to high
104        while (sizeHigh < k && !midElements.isEmpty()) {
105            int x = midElements.lastKey();
106            if (midElements.merge(x, -1, Integer::sum) == 0) {
107                midElements.remove(x);
108            }
109            sumTotal -= x;
110            highElements.merge(x, 1, Integer::sum);
111            ++sizeHigh;
112        }
113    }
114}
115
116// The commented code below is how you would use the MKAverage class.
117/*
118MKAverage obj = new MKAverage(m, k);
119obj.addElement(num);
120int averageValue = obj.calculateMKAverage();
121*/
122
1#include <queue>
2#include <set>
3
4class MKAverage {
5public:
6    // Initialize the MKAverage object with the window size 'm' and integer 'k'.
7    MKAverage(int m, int k) : m(m), k(k), sum(0) {}
8
9    // Add the new element to the data structure.
10    void addElement(int num) {
11        if (smallSet.empty() || num <= *smallSet.rbegin()) {
12            smallSet.insert(num); // insert into smaller numbers set
13        } else if (largeSet.empty() || num >= *largeSet.begin()) {
14            largeSet.insert(num); // insert into larger numbers set
15        } else {
16            middleSet.insert(num); // insert into middle numbers set
17            sum += num; // add to sum of the middle set
18        }
19
20        // Add new element to the queue to maintain the sliding window
21        elementsQueue.push(num);
22        if (elementsQueue.size() > m) {
23            // Remove the oldest element from the sliding window
24            int oldest = elementsQueue.front();
25            elementsQueue.pop();
26            if (smallSet.find(oldest) != smallSet.end()) {
27                smallSet.erase(smallSet.find(oldest));
28            } else if (largeSet.find(oldest) != largeSet.end()) {
29                largeSet.erase(largeSet.find(oldest));
30            } else {
31                middleSet.erase(middleSet.find(oldest));
32                sum -= oldest; // subtract from sum of the middle set
33            }
34        }
35
36        // Balance the sets so that 'k' smallest and 'k' largest are in their respective sets
37        // and the middle set contains the rest.
38        rebalanceSets();
39    }
40
41    // Calculate the MKAverage for the current data.
42    int calculateMKAverage() {
43        // If there aren't enough elements, return -1.
44        if (elementsQueue.size() < m) return -1;
45        // Calculate and return the average of the middle set.
46        return sum / (m - k * 2);
47    }
48
49private:
50    int m, k; // Window size 'm' and integer 'k'
51    long long sum; // Sum of the elements in the middle set
52    std::queue<int> elementsQueue; // Queue to maintain the sliding window
53    std::multiset<int> smallSet, middleSet, largeSet; // Multisets to keep track of the smallest, middle and largest elements
54
55    // Helper method to rebalance the multisets so that the smallSet and largeSet
56    // contain exactly 'k' elements each, and the middleSet contains the rest.
57    void rebalanceSets() {
58        while (smallSet.size() > k) {
59            auto it = prev(smallSet.end());
60            sum += *it;
61            middleSet.insert(*it);
62            smallSet.erase(it);
63        }
64        while (largeSet.size() > k) {
65            auto it = largeSet.begin();
66            sum += *it;
67            middleSet.insert(*it);
68            largeSet.erase(it);
69        }
70        while (smallSet.size() < k && !middleSet.empty()) {
71            auto it = middleSet.begin();
72            sum -= *it;
73            smallSet.insert(*it);
74            middleSet.erase(it);
75        }
76        while (largeSet.size() < k && !middleSet.empty()) {
77            auto it = prev(middleSet.end());
78            sum -= *it;
79            largeSet.insert(*it);
80            middleSet.erase(it);
81        }
82    }
83};
84
1let m: number = 0; // The window size 'm'.
2let k: number = 0; // The number 'k' used to exclude k smallest and k largest elements.
3let sum: number = 0; // The sum of the elements in the middle set.
4let elementsQueue: number[] = []; // An array to act as a queue to maintain the sliding window.
5let smallSet: Set<number> = new Set(); // A set to keep track of the smallest elements.
6let middleSet: Set<number> = new Set(); // A set to keep track of the middle elements.
7let largeSet: Set<number> = new Set(); // A set to keep track of the largest elements.
8
9// Initialize the MKAverage object with the window size 'm' and integer 'k'.
10function init(mVal: number, kVal: number): void {
11    m = mVal;
12    k = kVal;
13    sum = 0;
14    elementsQueue = [];
15    smallSet = new Set();
16    middleSet = new Set();
17    largeSet = new Set();
18}
19
20// Add the new element to the data structure.
21function addElement(num: number): void {
22    if (smallSet.size === 0 || num <= Math.max(...smallSet)) {
23        smallSet.add(num);
24    } else if (largeSet.size === 0 || num >= Math.min(...largeSet)) {
25        largeSet.add(num);
26    } else {
27        middleSet.add(num);
28        sum += num;
29    }
30
31    elementsQueue.push(num);
32    if (elementsQueue.length > m) {
33        let oldest: number = elementsQueue.shift()!;
34        if (smallSet.has(oldest)) {
35            smallSet.delete(oldest);
36        } else if (largeSet.has(oldest)) {
37            largeSet.delete(oldest);
38        } else {
39            middleSet.delete(oldest);
40            sum -= oldest;
41        }
42    }
43
44    rebalanceSets();
45}
46
47// Calculate the MKAverage for the current data.
48function calculateMKAverage(): number {
49    if (elementsQueue.length < m) {
50        return -1;
51    }
52    return sum / (m - 2 * k);
53}
54
55// Helper function to rebalance the sets.
56function rebalanceSets(): void {
57    while (smallSet.size > k) {
58        let toMove: number = Math.max(...smallSet);
59        smallSet.delete(toMove);
60        middleSet.add(toMove);
61        sum += toMove;
62    }
63    while (largeSet.size > k) {
64        let toMove: number = Math.min(...largeSet);
65        largeSet.delete(toMove);
66        middleSet.add(toMove);
67        sum += toMove;
68    }
69    while (smallSet.size < k && middleSet.size > 0) {
70        let toMove: number = Math.min(...middleSet);
71        middleSet.delete(toMove);
72        smallSet.add(toMove);
73        sum -= toMove;
74    }
75    while (largeSet.size < k && middleSet.size > 0) {
76        let toMove: number = Math.max(...middleSet);
77        middleSet.delete(toMove);
78        largeSet.add(toMove);
79        sum -= toMove;
80    }
81}
82

Time and Space Complexity

Time Complexity

addElement Method:

  • Inserting an element into the sorted list: Insertions into SortedList are O(log n) where n is the length of the list.
  • Popping from deque: The deque operations are done in O(1) time.
  • Removing an element from the sorted list: Removals are O(log n) since it's a sorted list.
  • Balancing the sorted lists: Each while loop potentially moves one element between the lists, so it's O(log n) per movement, but since it only happens if there's an unbalance, the amortized cost could be considered lower. However, treating each potential movement separately, the worst-case scenario in one call can go up to O(log n) for all movements (in cases where many movements are required to rebalance the lists).

Considering all of these operations, the worst-case time complexity of addElement in a single operation can be considered O(log n).

calculateMKAverage Method:

  • Simply returns a value or performs a division, which are both O(1) operations.

Therefore, the time complexity for addElement is O(log n) and for calculateMKAverage is O(1).

Space Complexity

  • Extra space for storage: The lists lo, mid, and hi, along with the deque q, store up to m elements, so they all take up O(m) space together.
  • SortedList overhead: A SortedList has some additional space overhead. However, this does not change the overall space complexity and it remains O(m).

Thus, the overall space complexity of the operations and data structures used in the MKAverage class is O(m).

Learn more about how to find time and space complexity quickly using problem constraints.


Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:
Question 1 out of 10

Which algorithm should you use to find a node that is close to the root of the tree?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!


Load More