2015. Average Height of Buildings in Each Segment

MediumGreedyArraySortingHeap (Priority Queue)

Problem Description

In this problem, we are given an array of buildings, where each building is defined by a starting position (start_i), an ending position (end_i), and its height (height_i). The goal is to describe the street by providing a summary of building heights in the form of segments with average heights. Each segment covers a half-closed interval [left, right) and specifies the average height of all buildings within that interval. The array representing the street should be filled with the fewest possible segments while still accurately portraying the buildings' average heights.

Notice that the average height for a segment is calculated by (integer) dividing the total height of the buildings in the segment by the number of buildings in that segment.

This problem is essentially about merging intervals and calculating the weighted average of heights where buildings overlap.

Quick Interview Experience

Help others by sharing your interview experience

Have you seen this problem before?

Intuition

To solve this problem, we can:

Utilize a line sweep technique which involves breaking down the buildings' information at certain key points (i.e., the start and end of buildings).
Keep a running tally of the height and count of buildings as we "sweep" the street from left to right. This lets us keep track of changes in the average height as we encounter the beginnings and ends of buildings.
As we move along the street, we build segments based on the changes in the count and height of buildings. When a new building starts or an existing one ends, it's an opportunity to potentially start a new segment or modify the current one.

Here's more detail on the approach taken in the provided solution:

The cnt dictionary keeps track of how many buildings are present at each point on the number line.
The height dictionary keeps track of the cumulative height of the buildings at each point on the number line.
By iterating over the sorted keys of cnt, we are effectively scanning the number line from left to right.
Based on the height and count at each position, we calculate the average height for that position and use the information to create or extend the segments in the ans list.
If a new segment continues directly from the previous one and has the same average height, they are merged; otherwise, a new segment is added to ans.

The crucial insight is that the segments of the street should only change at the points where buildings start or end, resulting in a change of height or count. This approach ensures that we can capture all changes in heights across the street with minimal segments, as required.

Learn more about Greedy, Sorting and Heap (Priority Queue) patterns.

Solution Approach

The solution approach is implemented in Python using the following elements:

Dictionaries: Two defaultdict(int) instances are created, cnt for maintaining the count of how many buildings overlap at any given point and height for the cumulative height at every point.
Line Sweep Algorithm: This technique involves iterating over key points along the number line (in this case, the starts and ends of buildings) and updating cnt and height dictionaries.

The algorithm proceeds with these steps:

Initialization: For each building, update the cnt and height dictionaries at the start and end points of the building.

for s, e, h in buildings:
    cnt[s] += 1
    cnt[e] -= 1
    height[s] += h
    height[e] -= h

Processing the Sweeping Line: Before we begin processing, we initialize ans (answer list), i (current position), h (current total height), and n (current number of buildings).

ans = []
i = h = n = 0

We sort the keys of the cnt dictionary to get the points in increasing order and then iterate through these points. j represents the current key during the iteration. For each such key:

a. If there's at least one building overlapping (n > 0), we calculate the average height (h // n) and initialize a segment [i, j, avg height].

b. We check whether we can extend the previous segment in the ans list by comparing the current segment's start point and average height with the previous segment's end point and height. If they match, we extend the end point (ans[-1][1]), otherwise, we append the new segment to ans.

c. Finally, we update i, h, and n with the current point's details.

for j in sorted(cnt.keys()):
    if n:  # There's at least one building spanning the current segment.
        t = [i, j, h // n]
        if ans and ans[-1][1] == i and ans[-1][2] == t[-1]:
            ans[-1][1] = j  # Extend the last segment.
        else:
            ans.append(t)  # Start a new segment.
    i = j
    h += height[j]
    n += cnt[j]

Returning the Result: Finally, the ans list containing the minimum number of non-overlapping segments describing the buildings on the street is returned.

The correctness of this approach hinges on the efficiency of the line sweep algorithm, which elegantly handles the merging of intervals and calculation of average heights. The defaultdict(int) data structure is instrumental in maintaining the running count and cumulative height without the need for complex checks or initializations.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's go through the solution approach with a small example to illustrate how it works.

Suppose we have the following array of buildings: [(1, 5, 4), (2, 6, 6), (4, 7, 5)]. Each tuple represents a building with a start position, end position, and height. Using the solution approach, we want to create segments that describe the street with the minimum number of segments while accurately portraying the average building heights.

Initialization

First, we initialize our cnt and height dictionaries using the given buildings' details.

buildings = [(1, 5, 4), (2, 6, 6), (4, 7, 5)]
cnt = defaultdict(int)
height = defaultdict(int)

for s, e, h in buildings:
    cnt[s] += 1
    cnt[e] -= 1
    height[s] += h
    height[e] -= h

After this step, we have:

cnt: {1: 1, 5: -1, 2: 1, 6: -1, 4: 1, 7: -1}
height: {1: 4, 5: -4, 2: 6, 6: -6, 4: 5, 7: -5}

These dictionaries indicate how the count and height are changing at each key point.

Processing

We now initialize our variables for processing the sweeping line:

ans = []
i = h = n = 0

Next, we sort the keys (points of interest where buildings start and end) of the cnt dictionary and iterate through them, updating i, h, and n and creating segments:

for j in sorted(cnt.keys()):
    if n > 0:  # There's at least one building in the current segment
        avg_height = h // n
        segment = [i, j, avg_height]
        if ans and ans[-1][1] == i and ans[-1][2] == avg_height:
            ans[-1][1] = j  # Extend the last segment
        else:
            ans.append(segment)  # Start a new segment
    i = j
    h += height[j]
    n += cnt[j]

Walking through the sorted keys:

At point 1, i becomes 1, h becomes 4, and n becomes 1. We don't create a segment since we've just started.
At point 2, h becomes 10 (4+6), and n becomes 2 (1+1). The average height is 10//2 = 5. We create a segment [1,2,5].
At point 4, h becomes 15 (10+5), and n remains 2. The average height is 15//2 = 7. We create a segment [2,4,7].
At point 5, h becomes 11 (15-4), and n decreases to 1 (2-1). The average height is 11. We create a segment [4,5,11].
At point 6, h becomes 5 (11-6), and n becomes 0 (1-1). As n is 0, no buildings are spanning this segment, so no need to create a new segment.
At point 7, h remains 5 (5-0), and n remains 0 (0+0). Again, as n is 0, no buildings are spanning this segment, so we're done.

Finally, we have our ans: [[1, 2, 5], [2, 4, 7], [4, 5, 11]].

This represents the minimum number of segments that accurately describe the street profile in terms of average building heights.

The above walkthrough demonstrates how the line sweep algorithm effectively processes the building information to create segments. The running tally of height and count is kept, and wall segments are built based on changes in count and height. The key insight is that new segments are only created when needed, which ensures minimal segments are used while still accurately representing the street profile.

Solution Implementation

1from collections import defaultdict
2
3class Solution:
4    def averageHeightOfBuildings(self, buildings: List[List[int]]) -> List[List[int]]:
5        # Create dictionaries to track the total height and count of buildings at each point.
6        total_height = defaultdict(int)
7        building_count = defaultdict(int)
8      
9        # Iterate over each building's start and end points and update heights and counts.
10        for start, end, height in buildings:
11            building_count[start] += 1
12            building_count[end] -= 1
13            total_height[start] += height
14            total_height[end] -= height
15      
16        # Initialize the result list and variables to store the running tally.
17        result = []
18        current_position = 0
19        running_height = 0
20        running_count = 0
21      
22        # Iterate over the sorted keys of the count dictionary.
23        for position in sorted(building_count.keys()):
24            # If there's at least one building in the current running range,
25            # Calculate the average height and add it to the result list.
26            if running_count:
27                average_height = running_height // running_count
28                new_interval = [current_position, position, average_height]
29              
30                # If the current interval can be merged with the last one in the result list,
31                # Extend the last interval. Otherwise, add the new interval to the result list.
32                if result and result[-1][1] == current_position and result[-1][2] == average_height:
33                    result[-1][1] = position
34                else:
35                    result.append(new_interval)
36          
37            # Update the current position and the running tallies for the height and count.
38            current_position = position
39            running_height += total_height[position]
40            running_count += building_count[position]
41      
42        return result
43

1import java.util.ArrayList;
2import java.util.List;
3import java.util.TreeMap;
4
5public class Solution {
6    public int[][] averageHeightOfBuildings(int[][] buildings) {
7        // TreeMap to store the aggregate heights and count of buildings starting or ending at a point 
8        TreeMap<Integer, Integer> aggregateHeight = new TreeMap<>();
9        TreeMap<Integer, Integer> buildingCount = new TreeMap<>();
10
11        // Iterating over each building's [start, end, height] intervals
12        for (int[] building : buildings) {
13            int start = building[0];
14            int end = building[1];
15            int height = building[2];
16          
17            // Update the starting point's total height and building count
18            aggregateHeight.put(start, aggregateHeight.getOrDefault(start, 0) + height);
19            buildingCount.put(start, buildingCount.getOrDefault(start, 0) + 1);
20
21            // Update the ending point's total height and building count
22            aggregateHeight.put(end, aggregateHeight.getOrDefault(end, 0) - height);
23            buildingCount.put(end, buildingCount.getOrDefault(end, 0) - 1);
24        }
25
26        // Initialize variables for tracking start index, running height, and count
27        int startIndex = 0, runningHeight = 0, runningCount = 0;
28      
29        // List to hold the resulting intervals
30        List<int[]> result = new ArrayList<>();
31
32        // Iterate over the TreeMap keys to compute average heights
33        for (int index : buildingCount.keySet()) {
34            // If there is a building in the current range, calculate the average height
35            if (runningCount > 0) {
36                int[] temp = new int[] {startIndex, index, runningHeight / runningCount};
37              
38                // Check if we can merge the current interval with the previous one
39                int lastIndex = result.size() - 1;
40                if (lastIndex >= 0 && result.get(lastIndex)[1] == startIndex && 
41                    result.get(lastIndex)[2] == temp[2]) {
42                    result.get(lastIndex)[1] = index;
43                } else {
44                    result.add(temp);
45                }
46            }
47
48            // Update the running height and count for the next interval
49            runningHeight += aggregateHeight.get(index);
50            runningCount += buildingCount.get(index);
51          
52            // Update the starting index for the next interval
53            startIndex = index;
54        }
55
56        // Convert the result list to an array
57        int[][] answer = new int[result.size()][3];
58        for (int i = 0; i < answer.length; i++) {
59            answer[i] = result.get(i);
60        }
61      
62        // Return the final array of intervals with average heights
63        return answer;
64    }
65}
66

1#include <vector>
2#include <map>
3using namespace std;
4
5class Solution {
6public:
7    // Function to calculate the average height of overlapping buildings.
8    vector<vector<int>> averageHeightOfBuildings(vector<vector<int>>& buildings) {
9        // Create two maps to keep track of height changes and count of buildings at each position.
10        map<int, int> heightChanges, buildingCounts;
11      
12        // Fill the maps with the start and end positions of the buildings.
13        for (auto& building : buildings) {
14            int startPos = building[0], endPos = building[1], height = building[2];
15            buildingCounts[startPos]++;
16            buildingCounts[endPos]--;
17            heightChanges[startPos] += height;
18            heightChanges[endPos] -= height;
19        }
20      
21        // Prepare the answer vector for storing the average heights.
22        vector<vector<int>> averages;
23      
24        // Initialize variables to track the current position, total height, and number of buildings.
25        int currentPosition = 0, totalHeight = 0, buildingCount = 0;
26        // Process each position in the count map.
27        for (auto& countEntry : buildingCounts) {
28            int nextPosition = countEntry.first;
29            // If there are buildings at the current position, calculate the average height.
30            if (buildingCount > 0) {
31                int averageHeight = totalHeight / buildingCount;
32                vector<int> currentAverage = {currentPosition, nextPosition, averageHeight};
33              
34                // Merge with the previous range if the average height and end position are the same.
35                if (!averages.empty() && averages.back()[1] == currentPosition && averages.back()[2] == averageHeight) {
36                    averages.back()[1] = nextPosition;
37                } else {
38                    averages.push_back(currentAverage);
39                }
40            }
41            // Move to the next position and update the total height and building count.
42            currentPosition = nextPosition;
43            totalHeight += heightChanges[nextPosition];
44            buildingCount += buildingCounts[nextPosition];
45        }
46        return averages;
47    }
48};
49

1// Import necessary functionality for map-like operations
2import { Map } from "immutable";
3
4// Record type for a building, including start, end, and height
5type Building = [number, number, number];
6
7// Record type for the average height information, including start, end, and average height
8type AverageHeightInfo = [number, number, number];
9
10// Function to calculate the average height of overlapping buildings
11function averageHeightOfBuildings(buildings: Building[]): AverageHeightInfo[] {
12    // Maps to keep track of height changes and count of buildings at each position
13    const heightChanges: Map<number, number> = new Map();
14    const buildingCounts: Map<number, number> = new Map();
15
16    // Fill the maps with the start and end positions of the buildings
17    buildings.forEach(([startPos, endPos, height]) => {
18        heightChanges.update(startPos, (prev = 0) => prev + height);
19        heightChanges.update(endPos, (prev = 0) => prev - height);
20        buildingCounts.update(startPos, (prev = 0) => prev + 1);
21        buildingCounts.update(endPos, (prev = 0) => prev - 1);
22    });
23
24    // Prepare the answer array for storing the average heights
25    const averages: AverageHeightInfo[] = [];
26
27    // Initialize variables to track the current position, total height, and number of buildings
28    let currentPosition = 0, totalHeight = 0, buildingCount = 0;
29
30    // Sorted keys ensure we process positions in increasing order
31    const sortedPositions = Array.from(buildingCounts.keys()).sort((a, b) => a - b);
32
33    // Process each position in the sorted array of positions
34    sortedPositions.forEach(nextPosition => {
35        // If there are buildings at the current position, calculate the average height
36        if (buildingCount > 0) {
37            const averageHeight = totalHeight / buildingCount;
38            const currentAverage: AverageHeightInfo = [currentPosition, nextPosition, averageHeight];
39
40            // Merge with the previous range if the average height and end position are the same
41            if (averages.length > 0 && averages[averages.length - 1][1] === currentPosition && averages[averages.length - 1][2] === averageHeight) {
42                averages[averages.length - 1][1] = nextPosition;
43            } else {
44                averages.push(currentAverage);
45            }
46        }
47        // Move to the next position and update the total height and building count
48        currentPosition = nextPosition;
49        totalHeight += heightChanges.get(nextPosition) ?? 0;
50        buildingCount += buildingCounts.get(nextPosition) ?? 0;
51    });
52  
53    return averages;
54}
55

Time and Space Complexity

Time Complexity

The time complexity of the algorithm is determined by several factors:

The loop where each building's start and end points with the corresponding height changes are recorded. This loop runs once for every building, hence it has a time complexity of O(N), where N is the number of buildings.
The sorting of the keys of the cnt dictionary. This is the most expensive operation in the algorithm with a time complexity of O(M * log(M)), where M is the number of unique points (start or end of buildings).
The second loop, where the average height is computed between points. This loop iterates over every key in the sorted list of keys, which is O(M) operations. Inside this loop, calculations are done in constant time, O(1).

Combining these, the overall time complexity is driven by the sorting step and is O(M * log(M)) where M is the number of unique start or endpoints.

Space Complexity

The space complexity is determined by the auxiliary data structures used:

The height and cnt defaultdicts, which will hold as many entries as there are unique start and end points (at most 2N, because there could be a start and end for each building). This gives O(M) space complexity, where M is the number of unique points.
The ans list, which in the worst case, if every building has a unique start and end that doesn't overlap with any other, would also be O(M).

Therefore, the overall space complexity is also O(M).

Learn more about how to find time and space complexity quickly using problem constraints.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Problem: Given a list of tasks and a list of requirements, compute a sequence of tasks that can be performed, such that we complete every task once while satisfying all the requirements.

Which of the following method should we use to solve this problem?

Topological Sort

Uniform Cost Search

Monotonic Stack

Memoization

I don't know