757. Set Intersection Size At Least Two

Problem Description

You are given a collection of intervals, where each interval is represented as [start, end] and includes all integers from start to end (both inclusive).

Your task is to find the smallest set of integers such that every interval contains at least 2 integers from this set.

For example, given intervals [[1,3], [3,7], [8,9]]:

The interval [1,3] contains integers 1, 2, and 3
The interval [3,7] contains integers 3, 4, 5, 6, and 7
The interval [8,9] contains integers 8 and 9

A valid containing set could be [2,3,4,8,9] because:

Interval [1,3] contains 2 and 3 from the set (2 integers ✓)
Interval [3,7] contains 3 and 4 from the set (2 integers ✓)
Interval [8,9] contains 8 and 9 from the set (2 integers ✓)

The solution uses a greedy approach by sorting intervals by their end points (and by start points in descending order when end points are equal). It then tracks the last two selected points (s and e) and decides how many new points to add based on the overlap with the current interval:

If the interval's start is less than or equal to s, it already contains both tracked points
If the interval's start is greater than e, no overlap exists, so we add 2 new points from the interval's end
Otherwise, there's partial overlap with one point, so we add 1 new point

The algorithm returns the total count of points added to form the minimum containing set.

Quick Interview Experience

Help others by sharing your interview experience

Have you seen this problem before?

Intuition

The key insight is to think about this problem greedily - we want to place our selected integers as strategically as possible to cover multiple intervals simultaneously.

Consider what happens when we process intervals in a specific order. If we sort intervals by their ending points, we can make locally optimal choices that lead to a globally optimal solution. Why? Because once we've decided which points to use for an interval, those points might also satisfy future intervals that overlap with it.

The critical observation is that for any interval, the best strategy is to place our selected points as far to the right as possible (near the interval's end). This maximizes the chance that these points will also satisfy upcoming intervals. If we need to pick 2 points from interval [a, b], choosing b-1 and b gives us the best chance of reusing these points for other intervals.

Now, as we process each new interval, we need to check how many of our previously selected points it already contains:

If it contains 2 or more points, we don't need to add anything
If it contains exactly 1 point, we need to add 1 more point (choosing the rightmost position in the interval)
If it contains 0 points, we need to add 2 new points (choosing the two rightmost positions)

By maintaining just the last two selected points (s and e where s < e), we can efficiently determine the overlap. Since intervals are sorted by endpoints, and we always choose points from the right side of intervals, these two most recent points are the only ones that could possibly overlap with the current interval.

The secondary sorting by start point in descending order (when endpoints are equal) ensures we process wider intervals first, which helps minimize redundant point selections when intervals share the same endpoint.

Learn more about Greedy and Sorting patterns.

Solution Approach

The implementation follows a greedy algorithm with careful interval processing:

1. Sort the intervals:

intervals.sort(key=lambda x: (x[1], -x[0]))

We sort primarily by endpoint in ascending order, and secondarily by start point in descending order (using -x[0]). This ensures we process intervals ending earlier first, and among those with the same endpoint, we process wider intervals first.

2. Initialize tracking variables:

s = e = -1
ans = 0

s and e track the last two selected points (with s < e)
Initialize them to -1 since all intervals have non-negative values
ans counts the total number of points in our containing set

3. Process each interval greedily:

for a, b in intervals:

For each interval [a, b], we check how many of our previously selected points it contains.

Case 1: Interval already contains both points

if a <= s:
    continue

If the interval's start a is less than or equal to s (the smaller of our two tracked points), then the interval contains both s and e. No new points needed.

Case 2: Interval contains no tracked points

if a > e:
    ans += 2
    s, e = b - 1, b

If the interval's start a is greater than e (the larger tracked point), the interval contains neither point. We add 2 new points: b-1 and b (the rightmost positions in the interval).

Case 3: Interval contains exactly one tracked point

else:
    ans += 1
    s, e = e, b

If s < a <= e, the interval contains only e. We add 1 new point at position b, and update our tracked points to e (the old one that's still in the interval) and b (the new one).

4. Return the result:

return ans

The total count represents the minimum size of the containing set.

This greedy approach works because by always choosing points from the right side of intervals and processing intervals in sorted order, we maximize the reuse of selected points across multiple intervals.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the algorithm with intervals [[1,3], [1,4], [2,5], [3,5]].

Step 1: Sort the intervals

Original: [[1,3], [1,4], [2,5], [3,5]]
After sorting by (endpoint ascending, start descending): [[1,3], [1,4], [2,5], [3,5]]

Step 2: Initialize variables

s = -1, e = -1 (our tracked points)
ans = 0 (count of points in our set)

Step 3: Process each interval

Interval [1,3]:

Check: a = 1 > e = -1? Yes, no overlap with tracked points
Add 2 points from the end: choose points 2 and 3
Update: ans = 2, s = 2, e = 3
Our set so far: {2, 3}

Interval [1,4]:

Check: a = 1 <= s = 2? Yes, interval contains both tracked points (2 and 3)
No new points needed
Keep: ans = 2, s = 2, e = 3
Our set remains: {2, 3}

Interval [2,5]:

Check: a = 2 <= s = 2? Yes, interval contains both tracked points (2 and 3)
No new points needed
Keep: ans = 2, s = 2, e = 3
Our set remains: {2, 3}

Interval [3,5]:

Check: a = 3 <= s = 2? No
Check: a = 3 > e = 3? No
So s < a <= e, meaning interval contains only point 3
Add 1 point from the end: choose point 5
Update: ans = 3, s = 3 (the old point still in interval), e = 5 (the new point)
Our set now: {2, 3, 5}

Final verification:

[1,3] contains {2, 3} ✓
[1,4] contains {2, 3} ✓
[2,5] contains {2, 3, 5} - at least 2 ✓
[3,5] contains {3, 5} ✓

The algorithm returns 3, and our minimum containing set has 3 elements.

Solution Implementation

1class Solution:
2    def intersectionSizeTwo(self, intervals: List[List[int]]) -> int:
3        # Sort intervals by end point (ascending), then by start point (descending)
4        # This ensures we process intervals with earlier endpoints first
5        # For same endpoints, we prefer intervals with larger start points
6        intervals.sort(key=lambda interval: (interval[1], -interval[0]))
7      
8        # Initialize two points that will be in our result set
9        # second_last: the second-to-last point we've chosen
10        # last: the last (most recent) point we've chosen
11        second_last = -1
12        last = -1
13        result_size = 0
14      
15        for start, end in intervals:
16            # Case 1: Current interval already contains both tracked points
17            # No new points needed
18            if start <= second_last:
19                continue
20          
21            # Case 2: Current interval doesn't contain any tracked points
22            # We need to add 2 new points (choose the two largest in the interval)
23            if start > last:
24                result_size += 2
25                second_last = end - 1
26                last = end
27          
28            # Case 3: Current interval contains only the last point
29            # We need to add 1 new point (choose the largest in the interval)
30            else:
31                result_size += 1
32                second_last = last
33                last = end
34      
35        return result_size
36

1class Solution {
2    public int intersectionSizeTwo(int[][] intervals) {
3        // Sort intervals by end point ascending, if equal then by start point descending
4        // This ensures we process intervals with earlier endpoints first
5        // When endpoints are equal, we prefer intervals with larger start points (smaller intervals)
6        Arrays.sort(intervals, (a, b) -> a[1] == b[1] ? b[0] - a[0] : a[1] - b[1]);
7      
8        // Initialize the result counter for minimum points needed
9        int result = 0;
10      
11        // Track the last two points in our intersection set
12        // secondLast: the second-to-last point added to the set
13        // last: the most recent point added to the set
14        int secondLast = -1;
15        int last = -1;
16      
17        // Process each interval in sorted order
18        for (int[] interval : intervals) {
19            int start = interval[0];
20            int end = interval[1];
21          
22            // Case 1: Current interval already contains both tracked points
23            // No new points needed
24            if (start <= secondLast) {
25                continue;
26            }
27          
28            // Case 2: Current interval doesn't contain any of the tracked points
29            // Need to add 2 new points (the last two points of current interval)
30            if (start > last) {
31                result += 2;
32                secondLast = end - 1;
33                last = end;
34            } 
35            // Case 3: Current interval contains only the last point
36            // Need to add 1 new point (the endpoint of current interval)
37            else {
38                result += 1;
39                secondLast = last;
40                last = end;
41            }
42        }
43      
44        return result;
45    }
46}
47

1class Solution {
2public:
3    int intersectionSizeTwo(vector<vector<int>>& intervals) {
4        // Sort intervals by ending point (ascending), 
5        // if ending points are same, sort by starting point (descending)
6        sort(intervals.begin(), intervals.end(), [](vector<int>& a, vector<int>& b) {
7            if (a[1] == b[1]) {
8                return a[0] > b[0];
9            }
10            return a[1] < b[1];
11        });
12      
13        int result = 0;
14        // Track the last two points in our intersection set
15        // Initialize to invalid values
16        int secondLastPoint = -1;
17        int lastPoint = -1;
18      
19        // Process each interval
20        for (auto& interval : intervals) {
21            int start = interval[0];
22            int end = interval[1];
23          
24            // Case 1: Both tracked points are already in current interval
25            // No new points needed
26            if (start <= secondLastPoint) {
27                continue;
28            }
29          
30            // Case 2: No tracked points are in current interval
31            // Need to add 2 new points (choose the rightmost two)
32            if (start > lastPoint) {
33                result += 2;
34                secondLastPoint = end - 1;
35                lastPoint = end;
36            } 
37            // Case 3: Only the last tracked point is in current interval
38            // Need to add 1 new point (choose the rightmost)
39            else {
40                result += 1;
41                secondLastPoint = lastPoint;
42                lastPoint = end;
43            }
44        }
45      
46        return result;
47    }
48};
49

1function intersectionSizeTwo(intervals: number[][]): number {
2    // Sort intervals by ending point (ascending)
3    // If ending points are the same, sort by starting point (descending)
4    // This ensures we process intervals in an optimal order for greedy selection
5    intervals.sort((a: number[], b: number[]) => {
6        if (a[1] === b[1]) {
7            return b[0] - a[0];
8        }
9        return a[1] - b[1];
10    });
11  
12    let result: number = 0;
13    // Track the last two points in our intersection set
14    // These represent the rightmost points we've selected so far
15    let secondLastPoint: number = -1;
16    let lastPoint: number = -1;
17  
18    // Process each interval in sorted order
19    for (const interval of intervals) {
20        const start: number = interval[0];
21        const end: number = interval[1];
22      
23        // Case 1: Both tracked points are already within the current interval
24        // No additional points needed to satisfy this interval
25        if (start <= secondLastPoint) {
26            continue;
27        }
28      
29        // Case 2: Neither tracked point is in the current interval
30        // We need to add 2 new points (choose the rightmost two: end-1 and end)
31        if (start > lastPoint) {
32            result += 2;
33            secondLastPoint = end - 1;
34            lastPoint = end;
35        } 
36        // Case 3: Only the last tracked point is in the current interval
37        // We need to add 1 new point (choose the rightmost: end)
38        else {
39            result += 1;
40            secondLastPoint = lastPoint;
41            lastPoint = end;
42        }
43    }
44  
45    return result;
46}
47

Time and Space Complexity

Time Complexity: O(n log n) where n is the number of intervals.

The sorting operation takes O(n log n) time
After sorting, we iterate through all intervals once, which takes O(n) time
Each operation inside the loop (comparisons and assignments) takes O(1) time
Therefore, the overall time complexity is dominated by the sorting step: O(n log n)

Space Complexity: O(1) or O(n) depending on the sorting algorithm implementation.

The algorithm uses only a constant amount of extra space for variables s, e, and ans
However, the sorting operation may require additional space:
- If Python's sort() uses Timsort (which it does), it requires O(n) space in the worst case
- If we consider only the extra space used by our algorithm excluding the sorting, it's O(1)
Therefore, the space complexity is O(n) when considering the sorting implementation, or O(1) for the algorithm's explicit space usage

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Sorting Logic

Pitfall: Using only endpoint sorting without considering the start point can lead to suboptimal solutions.

# WRONG: Only sorting by endpoint
intervals.sort(key=lambda x: x[1])

Why it fails: When two intervals have the same endpoint, we should process the wider interval first (one with smaller start point) to maximize point reuse. For example, with intervals [[1,3], [2,3]], if we process [2,3] first and choose points 2 and 3, then [1,3] already contains both points. But if we process [1,3] first, we might choose points that don't work as well for [2,3].

Solution: Sort by endpoint ascending, then by start point descending:

intervals.sort(key=lambda x: (x[1], -x[0]))

2. Choosing Points from the Wrong Side

Pitfall: Selecting points from the left side or middle of intervals instead of the rightmost positions.

# WRONG: Choosing from the left side
if start > last:
    result_size += 2
    second_last = start      # Wrong!
    last = start + 1         # Wrong!

Why it fails: Points from the right side of an interval have a better chance of being contained in subsequent intervals (since we're processing in order of endpoints). Choosing [start, start+1] instead of [end-1, end] reduces the opportunity for point reuse.

Solution: Always choose the rightmost points:

if start > last:
    result_size += 2
    second_last = end - 1
    last = end

3. Incorrect Overlap Detection

Pitfall: Using wrong comparison operators when checking interval overlap with tracked points.

# WRONG: Using < instead of <=
if start < second_last:  # Should be <=
    continue

Why it fails: If start == second_last, the interval still contains both second_last and last (since second_last < last). Using strict inequality would incorrectly treat this as needing additional points.

Solution: Use the correct comparison operators:

if start <= second_last:    # Interval contains both points
    continue
if start > last:            # Interval contains no points
    # Add 2 points
else:                       # Interval contains exactly one point (last)
    # Add 1 point

4. Not Maintaining Point Order Invariant

Pitfall: Forgetting to maintain second_last < last after updates.

# WRONG: Could violate the invariant
if start > last:
    result_size += 2
    second_last = end       # Wrong order!
    last = end - 1          # Wrong order!

Why it fails: The algorithm relies on second_last being smaller than last for correct overlap detection. Reversing this order breaks the logic.

Solution: Always ensure second_last < last:

if start > last:
    result_size += 2
    second_last = end - 1   # Smaller value
    last = end              # Larger value

5. Edge Case with Single-Point Intervals

Pitfall: Not handling intervals where start == end correctly.

Why it could fail: When an interval has only one point (e.g., [5,5]), we still need to ensure it contains at least 2 points from our set. This means we might need to add the same point twice conceptually, but our tracking should handle this naturally.

Solution: The algorithm handles this correctly by choosing end-1 and end. For [5,5], this becomes points 4 and 5, ensuring the interval [5,5] contains at least one of them (point 5). However, since the problem requires 2 points, the interval definition must allow for this (typically single-point intervals would be considered as containing the same point with multiplicity).

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which of the following problems can be solved with backtracking (select multiple)

Generating subsets

Generating random numbers

Generating permutations

Sorting integers