352. Data Stream as Disjoint Intervals

HardDesignBinary SearchOrdered Set
Leetcode Link

Problem Description

This LeetCode problem involves writing a class called SummaryRanges that is designed to keep track of a stream of non-negative integers and provide a summarized view of the data in the form of disjoint intervals. The class should support two operations:

  1. addNum(value): This method takes a non-negative integer value and adds it to the stream.
  2. getIntervals(): This method returns the current summary of the stream as a list of disjoint intervals [start_i, end_i]. Each interval represents a sequence of consecutive numbers that have been added to the stream. The list of intervals is sorted based on the start_i of each interval.

The goal of SummaryRanges is to condense the stream of integers into as few intervals as possible by merging consecutive numbers together.

Intuition

To solve this problem, one needs to efficiently manage intervals of consecutive numbers with operations to insert a number into the stream and then update and merge these intervals. The key challenge is handling numbers that might need to merge with existing intervals when they fit just right before the start, after the end, or in between two existing intervals.

To address this efficiently, we can use a data structure that maintains the intervals in a sorted order to easily locate where the new number fits relative to existing intervals. This would simplify operations for finding, adding, or merging intervals. A SortedDict, which is a dictionary that keeps its keys sorted, can be used for this purpose. Each key is an integer from the data stream, and the corresponding value is an interval [start, end] representing consecutive numbers.

Here is the intuition behind handling the addition of a new number, val, to the stream:

  1. Check if val can merge with an existing interval on its left or right or if it fills a gap between two intervals:

    • If val is just after the end of one interval and just before the start of the next, we merge these two intervals.
    • If val is just after the end of an existing interval but not close enough to the next interval, we extend the existing interval to include val.
    • Similarly, if val is just before the start of an existing interval but not close enough to the previous interval, we adjust the start of this interval to include val.
  2. If val does not fit next to or between existing intervals, we create a new interval [val, val].

For getting the intervals, since we maintain a sorted dictionary where each interval is stored as values and the keys are the starting points of each interval, we simply return these values.

The provided solution uses this approach with a SortedDict to ensure that we can quickly locate adjacent intervals and merge intervals or adjust their boundaries as necessary when a new number is added.

Solution Approach

The implementation of the SummaryRanges class utilizes a SortedDict from the sortedcontainers module in Python. A SortedDict keeps keys in a sorted order, which is vital for efficient interval management in this problem.

__init__ Method:

The constructor initializes a SortedDict named self.mp. This will store the intervals where the keys will be the start of the intervals.

addNum Method:

This method handles the addition of a new number, val, to the stream and updates the intervals accordingly:

  1. We start by determining the position (right index ridx) of the value with respect to the sorted keys using self.mp.bisect_right(val). This gives us the index of the first key that is greater than val (if any). Similarly, we calculate the left index (lidx) which is just ridx - 1 if val is not the first number to be inserted or n (the length of the SortedDict) if ridx is 0.

  2. With keys and values of the SortedDict extracted, we use the indices to find the adjacent intervals (if they exist).

  3. We check if the new value val can merge with adjacent intervals or if it belongs to a new, separate interval:

    • If the new value val is in between two existing intervals and can merge them (values[lidx][1] + 1 == val and values[ridx][0] - 1 == val), we extend the left interval to cover the right one and remove the right interval.
    • If the new value val can be appended to the end of the left interval (val <= values[lidx][1] + 1), we update the end of this interval.
    • If the new value val can be prepended to the start of the right interval (val >= values[ridx][0] - 1), we update the start of this interval.
    • If none of the above conditions are met, val creates its own interval [val, val].
  4. We then insert the new interval or update the existing intervals in the SortedDict accordingly.

getIntervals Method:

This is a straightforward method that returns all values (which are the intervals) in the SortedDict. Since the SortedDict maintains the intervals in ascending order based on their starting points, no further sorting is needed.

Overall, the algorithm effectively maintains and manipulates intervals, using the fast lookup and ordered nature of the SortedDict. The choice of data structure greatly simplifies the complexity of adding and merging intervals while keeping the stream data summarized efficiently.

💪
Level Up Your
Algo Skills

Example Walkthrough

Let's walk through a small example to illustrate the solution approach.

Assume we have an instance of SummaryRanges and we perform the following operations:

  1. addNum(1)
  2. addNum(3)
  3. getIntervals() - Should return [[1, 1], [3, 3]]
  4. addNum(2)
  5. getIntervals() - Should return [[1, 3]]

Here's what happens internally with each operation:

After operation 1: addNum(1)

  • There are no existing intervals.
  • val = 1 creates a new interval [1, 1].
  • self.mp now stores {1: [1, 1]}.

After operation 2: addNum(3)

  • Keys in self.mp are [1].
  • val = 3 does not fit with the existing interval [1, 1] as it is not consecutive.
  • A new interval [3, 3] is created since it is separate.
  • self.mp now stores {1: [1, 1], 3: [3, 3]}.

After operation 3: getIntervals()

  • The getIntervals method is called.
  • We simply return the values of self.mp as a list: [[1, 1], [3, 3]].

After operation 4: addNum(2)

  • Keys in self.mp are [1, 3].
  • val = 2 is checked against the intervals.
  • val = 2 fits between the intervals [1, 1] and [3, 3] and can merge them since 1 + 1 = 2 = 3 - 1.
  • We extend the left interval to [1, 3] and remove the interval [3, 3].
  • self.mp now stores {1: [1, 3]}.

After operation 5: getIntervals()

  • The getIntervals method is called once again.
  • We simply return the values of self.mp as a list: [[1, 3]].

In conclusion, as shown in the example, the SummaryRanges class efficiently manages the addition and merging of intervals. The addNum method updates the SortedDict as required, with the main logic focused on checking if the new value can be merged with existing intervals or if it forms a new interval. The getIntervals method provides the current summarized view as ordered intervals. The effective use of SortedDict dramatically simplifies complex interval logic and ensures efficient manipulation of the data stream.

Python Solution

1from sortedcontainers import SortedDict
2
3
4class SummaryRanges:
5    def __init__(self):
6        """
7        Initialize a data structure to keep track of non-overlapping intervals.
8        """
9        self.intervals_map = SortedDict()
10
11    def addNum(self, val: int) -> None:
12        """
13        Add a value to the non-overlapping intervals. Merge intervals if necessary.
14        """
15        num_intervals = len(self.intervals_map)
16        right_index = self.intervals_map.bisect_right(val)
17        left_index = num_intervals if right_index == 0 else right_index - 1
18
19        keys = self.intervals_map.keys()
20        values = self.intervals_map.values()
21
22        # If val can merge two adjancent intervals into one.
23        if (left_index != num_intervals and
24            right_index != num_intervals and
25            values[left_index][1] + 1 == val and
26            values[right_index][0] - 1 == val):
27
28            self.intervals_map[keys[left_index]][1] = self.intervals_map[keys[right_index]][1]
29            self.intervals_map.pop(keys[right_index])
30
31        # If val is a duplicate or extends an interval to the right.
32        elif left_index != num_intervals and val <= values[left_index][1] + 1:
33            self.intervals_map[keys[left_index]][1] = max(val, self.intervals_map[keys[left_index]][1])
34
35        # If val extends an interval to the left.
36        elif right_index != num_intervals and val >= values[right_index][0] - 1:
37            self.intervals_map[keys[right_index]][0] = min(val, self.intervals_map[keys[right_index]][0])
38
39        # If val starts a new interval.
40        else:
41            self.intervals_map[val] = [val, val]
42
43    def getIntervals(self) -> list:
44        """
45        Get the list of intervals as a list of lists.
46        """
47        return list(self.intervals_map.values())
48
49
50# Example of how this class might be used:
51# obj = SummaryRanges()
52# obj.addNum(val)
53# intervals = obj.getIntervals()
54

Java Solution

1import java.util.Map;
2import java.util.TreeMap;
3
4class SummaryRanges {
5    // TreeMap to hold the intervals. The key is the start of the interval,
6    // and the value is an array containing the start and end of the interval.
7    private TreeMap<Integer, int[]> intervalsMap;
8
9    // Constructor to initialize the TreeMap
10    public SummaryRanges() {
11        intervalsMap = new TreeMap<>();
12    }
13
14    // Function to add a number into the set and merge intervals if necessary
15    public void addNum(int val) {
16        // Find the interval immediately before the new value
17        Integer left = intervalsMap.floorKey(val);
18      
19        // Find the interval immediately after the new value
20        Integer right = intervalsMap.ceilingKey(val);
21      
22        // Check if there's a need to merge the intervals to the left and right of the new value
23        if (left != null && right != null && intervalsMap.get(left)[1] + 1 == val && intervalsMap.get(right)[0] - 1 == val) {
24            // Merge both intervals since the new value bridges them
25            intervalsMap.get(left)[1] = intervalsMap.get(right)[1];
26          
27            // Remove the redundant interval (the one that was to the right)
28            intervalsMap.remove(right);
29        } else if (left != null && val <= intervalsMap.get(left)[1] + 1) {
30            // Extend the interval to the left if the new value fits within or adjacent to it
31            intervalsMap.get(left)[1] = Math.max(val, intervalsMap.get(left)[1]);
32        } else if (right != null && val >= intervalsMap.get(right)[0] - 1) {
33            // Extend the interval to the right if the new value fits within or adjacent to it
34            intervalsMap.get(right)[0] = Math.min(val, intervalsMap.get(right)[0]);
35        } else {
36            // If the value is not adjacent to any interval, add it as a new interval
37            intervalsMap.put(val, new int[] {val, val});
38        }
39    }
40
41    // Function to return a list of intervals
42    public int[][] getIntervals() {
43        // Initialize the result array with the size of the intervalsMap
44        int[][] result = new int[intervalsMap.size()][2];
45      
46        // Populate the result with the intervals
47        int i = 0;
48        for (int[] range : intervalsMap.values()) {
49            result[i++] = range;
50        }
51      
52        // Return the populated list of intervals
53        return result;
54    }
55}
56

C++ Solution

1#include <map>
2#include <vector>
3
4using namespace std;
5
6class SummaryRanges {
7private:
8    // Map to store the start of each interval as key and interval (as a pair) as value.
9    map<int, pair<int, int>> intervals;
10
11public:
12    // Constructor does nothing since intervals map is automatically initialized.
13    SummaryRanges() {
14    }
15
16    // Add a number into the data structure.
17    void addNum(int val) {
18        // Find the first interval that starts after the value.
19        auto next = intervals.upper_bound(val);
20        // Find the last interval that starts before the value.
21        auto prev = next == intervals.begin() ? intervals.end() : std::prev(next);
22      
23        // Check if value connects two adjacent intervals.
24        if (prev != intervals.end() && next != intervals.end() && prev->second.second + 1 == val && next->first - 1 == val) {
25            // Merge the two intervals by updating the end of the previous interval
26            // and removing the next interval.
27            prev->second.second = next->second.second;
28            intervals.erase(next);
29        } else if (prev != intervals.end() && val <= prev->second.second + 1) {
30            // If value falls within or just after the previous interval, extend that interval.
31            prev->second.second = max(val, prev->second.second);
32        } else if (next != intervals.end() && val >= next->first - 1) {
33            // If value falls right before the next interval, extend the next interval
34            // to include value and change the key in the map accordingly.
35            intervals[val] = {val, next->second.second};
36            intervals.erase(next);
37        } else {
38            // Otherwise, insert the value as a new interval.
39            intervals[val] = {val, val};
40        }
41    }
42
43    // Get the current list of intervals.
44    vector<vector<int>> getIntervals() {
45        vector<vector<int>> result;
46      
47        // Build a list of intervals based on the map values.
48        for (const auto& kvp : intervals) {
49            result.push_back({kvp.second.first, kvp.second.second});
50        }
51      
52        return result;
53    }
54};
55

Typescript Solution

1// Importing necessary features from TypeScript standard library
2import { Map } from "typescript-collections";
3
4// Initialize the map to store the start of each interval as key and interval as value
5let intervals: Map<number, [number, number]> = new Map<number, [number, number]>();
6
7// Add a number into the data structure
8function addNum(val: number): void {
9    // Find the first interval that starts after the value
10    let next = intervals.higherKey(val);
11    // Find the last interval that starts before the value
12    let prevKey = next !== null ? intervals.lowerKey(val) : null;
13  
14    // Conditional to check if the previous key is needed
15    let prev = prevKey !== null ? intervals.getValue(prevKey) : null;
16
17    // Check if value connects two adjacent intervals
18    if (prev !== null && next !== null && prev[1] + 1 === val && next - 1 === val) {
19        // Merge the two intervals
20        intervals.setValue(prevKey, [prev[0], intervals.getValue(next)[1]]);
21        intervals.remove(next);
22    } else if (prev !== null && val <= prev[1] + 1) {
23        // If value falls within or just after the previous interval, extend that interval
24        prev[1] = Math.max(val, prev[1]);
25        intervals.setValue(prevKey, prev);
26    } else if (next !== null && val >= next - 1) {
27        // If value is right before to the start of next interval, merge val into it
28        let nextInterval = intervals.getValue(next);
29        intervals.remove(next);
30        intervals.setValue(val, [val, nextInterval[1]]);
31    } else {
32        // Otherwise, insert the value as a new interval
33        intervals.setValue(val, [val, val]);
34    }
35}
36
37// Get the current list of intervals
38function getIntervals(): Array<Array<number>> {
39    let result: Array<Array<number>> = [];
40    // Build a list of intervals based on the map entries
41    intervals.forEach((key, value) => {
42        result.push([value[0], value[1]]);
43    });
44  
45    return result;
46}
47

Time and Space Complexity

Time Complexity

The addNum function:

  • bisect_right: O(logN) where N is the number of existing intervals in self.mp. This operation performs a binary search.
  • Update operations: O(1) for basic integer operations.
  • pop: O(logN) because when popping from a sorted dictionary, it maintains the order. Hence, the time complexity for addNum is O(logN) per operation due to the binary search and potential pop operations.

The getIntervals function simply returns the values of the sorted dictionary:

  • This operation is O(N) because it constructs a list from the values stored in mp.

Space Complexity

The main space usage in SummaryRanges comes from the sorted dictionary self.mp:

  • For addNum: Additional space complexity is O(1) because it stores the interval as a pair of integers only when a new interval is created or extended.
  • The SortedDict self.mp can have at most N elements if no intervals overlap, where N is the total number of times addNum is called. Hence, space complexity is O(N) for self.mp.
  • For getIntervals: O(N) to store the output list.

Therefore, the overall space complexity is O(N).

😈
Become an
Algo Monster

Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫