Facebook Pixel

1478. Allocate Mailboxes

Problem Description

You are given an array houses where houses[i] represents the location of the i-th house along a street. You need to place exactly k mailboxes on this street.

The goal is to minimize the total distance between each house and its nearest mailbox. In other words, for each house, you want to find the closest mailbox to it, and the sum of all these distances should be as small as possible.

Key Points:

  • Houses are located at specific positions along a street (given as integers in the array)
  • You must place exactly k mailboxes somewhere on this street
  • Each house will be served by its nearest mailbox
  • You want to minimize the sum of distances from all houses to their nearest mailboxes
  • The answer is guaranteed to fit in a 32-bit integer

Example Understanding: If you have houses at positions [1, 4, 8, 10, 20] and you can place k = 3 mailboxes, you need to strategically place these 3 mailboxes so that when each house goes to its nearest mailbox, the total distance traveled by all houses combined is minimized.

The problem requires finding the optimal placement of mailboxes to achieve this minimum total distance.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

Let's think about this problem step by step. First, when we have a group of houses and want to place one mailbox to serve them all, where should we put it? The optimal position is at the median of those houses. This minimizes the total distance because moving the mailbox away from the median would increase the total distance.

Now, with k mailboxes for n houses, we need to partition the houses into k groups, where each group is served by one mailbox. Since we want to minimize distances, it makes sense to sort the houses first - this way, each mailbox will serve a contiguous segment of houses along the street.

This leads us to a dynamic programming approach. We can think of the problem as: "What's the minimum cost to serve the first i+1 houses using j mailboxes?"

For each state f[i][j], we need to decide where the j-1th mailbox ends its service. If it ends at house p, then:

  • Houses 0 to p are served by the first j-1 mailboxes (cost = f[p][j-1])
  • Houses p+1 to i are served by the jth mailbox (cost = g[p+1][i])

The key insight for calculating g[i][j] (cost of serving houses from i to j with one mailbox) is that when we place a mailbox at the median position, the total distance equals the sum of pairwise distances between houses at opposite ends. For sorted houses, this can be computed efficiently using the recurrence:

g[i][j] = g[i+1][j-1] + houses[j] - houses[i]

This works because adding houses i and j to an existing group adds exactly houses[j] - houses[i] to the total distance (they pair up with each other when the mailbox is at the median).

By building up from smaller subproblems to larger ones, we can find the optimal placement strategy for all k mailboxes.

Learn more about Math, Dynamic Programming and Sorting patterns.

Solution Approach

The implementation follows a dynamic programming approach with preprocessing for efficient distance calculations.

Step 1: Sort the houses

houses.sort()

Sorting ensures that each mailbox will serve a contiguous range of houses, simplifying our problem.

Step 2: Precompute distances using matrix g

g = [[0] * n for _ in range(n)]
for i in range(n - 2, -1, -1):
    for j in range(i + 1, n):
        g[i][j] = g[i + 1][j - 1] + houses[j] - houses[i]

g[i][j] represents the minimum total distance when placing one mailbox to serve houses from index i to j. We build this matrix bottom-up using the recurrence relation:

  • g[i][j] = g[i+1][j-1] + houses[j] - houses[i]

This formula works because when we add houses at positions i and j to a group, they contribute houses[j] - houses[i] to the total distance (since the optimal mailbox position is at the median).

Step 3: Dynamic Programming

f = [[inf] * (k + 1) for _ in range(n)]

f[i][j] represents the minimum total distance to serve houses 0 to i using j mailboxes.

Step 4: Base case initialization

for i in range(n):
    f[i][1] = g[0][i]

When using only 1 mailbox, the cost for houses 0 to i is simply g[0][i].

Step 5: Fill the DP table

for i in range(n):
    for j in range(2, min(k + 1, i + 2)):
        for p in range(i):
            f[i][j] = min(f[i][j], f[p][j - 1] + g[p + 1][i])

For each state f[i][j], we try all possible positions p where the (j-1)th mailbox ends its service:

  • Houses 0 to p are served by j-1 mailboxes: cost = f[p][j-1]
  • Houses p+1 to i are served by the jth mailbox: cost = g[p+1][i]
  • Total cost = f[p][j-1] + g[p+1][i]

We take the minimum over all possible splits.

Step 6: Return the result

return f[-1][k]

The answer is f[n-1][k], which represents the minimum distance to serve all n houses using exactly k mailboxes.

Time Complexity: O(n³) for preprocessing g and O(n²k) for the DP computation. Space Complexity: O(n²) for matrix g and O(nk) for the DP table f.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with houses at positions [1, 4, 8] and k = 2 mailboxes.

Step 1: Sort the houses Houses are already sorted: [1, 4, 8]

Step 2: Build the distance matrix g

We need to calculate g[i][j] - the cost of serving houses from index i to j with one mailbox placed at the median.

Starting from the bottom-right and working backwards:

  • g[1][2] (houses at 4 and 8): The median is between 4 and 8, and the total distance = 8 - 4 = 4
  • g[0][1] (houses at 1 and 4): The median is between 1 and 4, and the total distance = 4 - 1 = 3
  • g[0][2] (houses at 1, 4, and 8): Using the recurrence g[0][2] = g[1][1] + houses[2] - houses[0] = 0 + 8 - 1 = 7

So our g matrix is:

    0  1  2
0 [ 0  3  7 ]
1 [ 0  0  4 ]
2 [ 0  0  0 ]

Step 3: Initialize DP table f

Create f[i][j] where i represents serving houses 0 to i, and j represents using j mailboxes.

Step 4: Base case (j = 1) When using only 1 mailbox for all houses:

  • f[0][1] = g[0][0] = 0 (one house at position 1)
  • f[1][1] = g[0][1] = 3 (houses at 1 and 4, mailbox at median)
  • f[2][1] = g[0][2] = 7 (houses at 1, 4, 8, mailbox at 4)

Step 5: Fill DP for j = 2

For f[1][2] (2 houses, 2 mailboxes):

  • Split at house 0: f[0][1] + g[1][1] = 0 + 0 = 0
  • So f[1][2] = 0 (one mailbox per house)

For f[2][2] (3 houses, 2 mailboxes):

  • Split at house 0: f[0][1] + g[1][2] = 0 + 4 = 4
    • First mailbox serves house at 1 (cost = 0)
    • Second mailbox serves houses at 4 and 8 (cost = 4)
  • Split at house 1: f[1][1] + g[2][2] = 3 + 0 = 3
    • First mailbox serves houses at 1 and 4 (cost = 3)
    • Second mailbox serves house at 8 (cost = 0)
  • So f[2][2] = min(4, 3) = 3

Final DP table:

    j=0  j=1  j=2
i=0 [∞    0    ∞]
i=1 [∞    3    0]
i=2 [∞    7    3]

Step 6: Return result f[2][2] = 3 is our answer.

The optimal placement is: one mailbox serving houses at positions 1 and 4 (placed at position 2.5 or anywhere between them), and another mailbox at position 8. Total distance = |1-2.5| + |4-2.5| + |8-8| = 1.5 + 1.5 + 0 = 3.

Solution Implementation

1class Solution:
2    def minDistance(self, houses: List[int], k: int) -> int:
3        # Sort houses by position for optimal mailbox placement
4        houses.sort()
5        n = len(houses)
6      
7        # cost[i][j] = minimum cost to place one mailbox for houses[i] to houses[j]
8        # The optimal position for one mailbox is the median of the houses
9        cost = [[0] * n for _ in range(n)]
10      
11        # Build cost matrix from right-bottom to left-top
12        # For sorted houses, placing mailbox at median minimizes total distance
13        for i in range(n - 2, -1, -1):
14            for j in range(i + 1, n):
15                # Recurrence: extend the range by adding houses at both ends
16                # The new cost adds the distance between the new pair of houses
17                cost[i][j] = cost[i + 1][j - 1] + houses[j] - houses[i]
18      
19        # dp[i][j] = minimum distance for houses[0..i] using j mailboxes
20        dp = [[float('inf')] * (k + 1) for _ in range(n)]
21      
22        # Process each house position
23        for i in range(n):
24            # Base case: one mailbox for houses[0..i]
25            dp[i][1] = cost[0][i]
26          
27            # Try using 2 to min(k, i+1) mailboxes
28            # Can't use more mailboxes than houses up to position i
29            for j in range(2, min(k + 1, i + 2)):
30                # Try all possible positions for the (j-1)th mailbox
31                for p in range(i):
32                    # Split: houses[0..p] use j-1 mailboxes
33                    #         houses[p+1..i] use 1 mailbox
34                    dp[i][j] = min(dp[i][j], dp[p][j - 1] + cost[p + 1][i])
35      
36        # Return minimum distance for all houses using k mailboxes
37        return dp[-1][k]
38
1class Solution {
2    public int minDistance(int[] houses, int k) {
3        // Sort houses by position to enable optimal mailbox placement
4        Arrays.sort(houses);
5        int n = houses.length;
6      
7        // costMatrix[i][j] = minimum cost to place one mailbox for houses[i] to houses[j]
8        // The optimal position for one mailbox is the median position
9        int[][] costMatrix = new int[n][n];
10      
11        // Build cost matrix from bottom-right to top-left
12        // For houses from index i to j, the cost is calculated recursively
13        for (int i = n - 2; i >= 0; i--) {
14            for (int j = i + 1; j < n; j++) {
15                // Cost for range [i,j] = cost for inner range [i+1,j-1] + distance from edges
16                // This works because the median remains optimal and we add edge distances
17                costMatrix[i][j] = costMatrix[i + 1][j - 1] + houses[j] - houses[i];
18            }
19        }
20      
21        // dp[i][j] = minimum cost to place j mailboxes for houses[0] to houses[i]
22        int[][] dp = new int[n][k + 1];
23      
24        // Initialize with maximum value (infinity)
25        final int INFINITY = 1 << 30;
26        for (int[] row : dp) {
27            Arrays.fill(row, INFINITY);
28        }
29      
30        // Dynamic programming to find minimum cost
31        for (int houseIdx = 0; houseIdx < n; houseIdx++) {
32            // Base case: one mailbox for houses[0] to houses[houseIdx]
33            dp[houseIdx][1] = costMatrix[0][houseIdx];
34          
35            // Try placing 2 to k mailboxes (but not more than houses available)
36            for (int mailboxCount = 2; mailboxCount <= k && mailboxCount <= houseIdx + 1; mailboxCount++) {
37                // Try different split points for the last mailbox coverage
38                for (int splitPoint = 0; splitPoint < houseIdx; splitPoint++) {
39                    // Cost = cost of (mailboxCount-1) mailboxes for houses[0..splitPoint]
40                    //        + cost of 1 mailbox for houses[splitPoint+1..houseIdx]
41                    dp[houseIdx][mailboxCount] = Math.min(
42                        dp[houseIdx][mailboxCount], 
43                        dp[splitPoint][mailboxCount - 1] + costMatrix[splitPoint + 1][houseIdx]
44                    );
45                }
46            }
47        }
48      
49        // Return minimum cost for all houses with k mailboxes
50        return dp[n - 1][k];
51    }
52}
53
1class Solution {
2public:
3    int minDistance(vector<int>& houses, int k) {
4        int n = houses.size();
5      
6        // Sort houses by position for optimal mailbox placement
7        sort(houses.begin(), houses.end());
8      
9        // cost[i][j]: minimum cost to place one mailbox for houses[i..j]
10        // The optimal position for one mailbox is the median position
11        vector<vector<int>> cost(n, vector<int>(n, 0));
12      
13        // Precompute costs for placing one mailbox for any range [i, j]
14        // Working backwards to build up the cost table efficiently
15        for (int i = n - 2; i >= 0; --i) {
16            for (int j = i + 1; j < n; ++j) {
17                // For sorted houses, placing mailbox at median minimizes total distance
18                // This formula accumulates the cost incrementally
19                cost[i][j] = cost[i + 1][j - 1] + houses[j] - houses[i];
20            }
21        }
22      
23        // dp[i][j]: minimum distance sum for houses[0..i] using j mailboxes
24        const int INF = 0x3f3f3f3f;
25        vector<vector<int>> dp(n, vector<int>(k + 1, INF));
26      
27        // Dynamic programming to find optimal mailbox allocation
28        for (int i = 0; i < n; ++i) {
29            // Base case: one mailbox for houses[0..i]
30            dp[i][1] = cost[0][i];
31          
32            // Try using j mailboxes for houses[0..i]
33            for (int j = 2; j <= k && j <= i + 1; ++j) {
34                // Try placing the j-th mailbox to serve houses[lastHouse+1..i]
35                // and use j-1 mailboxes for houses[0..lastHouse]
36                for (int lastHouse = j - 2; lastHouse < i; ++lastHouse) {
37                    dp[i][j] = min(dp[i][j], 
38                                   dp[lastHouse][j - 1] + cost[lastHouse + 1][i]);
39                }
40            }
41        }
42      
43        // Return minimum distance for all houses with k mailboxes
44        return dp[n - 1][k];
45    }
46};
47
1/**
2 * Finds the minimum sum of distances from houses to their nearest mailboxes
3 * when placing k mailboxes optimally among the houses
4 * 
5 * @param houses - Array of house positions on a street
6 * @param k - Number of mailboxes to place
7 * @returns Minimum total distance from all houses to their nearest mailbox
8 */
9function minDistance(houses: number[], k: number): number {
10    // Sort houses by position to work with ordered positions
11    houses.sort((a, b) => a - b);
12    const houseCount: number = houses.length;
13  
14    // costMatrix[i][j] represents the cost of serving houses from index i to j with one mailbox
15    // The optimal position for one mailbox is the median position
16    const costMatrix: number[][] = Array.from(
17        { length: houseCount }, 
18        () => Array(houseCount).fill(0)
19    );
20
21    // Pre-calculate cost of serving any range of houses with a single mailbox
22    // Working backwards to build up the cost matrix efficiently
23    for (let startIdx = houseCount - 2; startIdx >= 0; startIdx--) {
24        for (let endIdx = startIdx + 1; endIdx < houseCount; endIdx++) {
25            // Cost accumulates from inner range plus the new pair of houses
26            costMatrix[startIdx][endIdx] = costMatrix[startIdx + 1][endIdx - 1] + 
27                                           houses[endIdx] - houses[startIdx];
28        }
29    }
30
31    const INFINITY: number = Number.POSITIVE_INFINITY;
32  
33    // dpTable[i][j] represents minimum cost to serve houses 0 to i using j mailboxes
34    const dpTable: number[][] = Array.from(
35        { length: houseCount }, 
36        () => Array(k + 1).fill(INFINITY)
37    );
38
39    // Base case: serving first i+1 houses with 1 mailbox
40    for (let houseIdx = 0; houseIdx < houseCount; houseIdx++) {
41        dpTable[houseIdx][1] = costMatrix[0][houseIdx];
42    }
43
44    // Fill DP table for increasing number of mailboxes
45    for (let mailboxCount = 2; mailboxCount <= k; mailboxCount++) {
46        // Need at least mailboxCount houses to place mailboxCount mailboxes
47        for (let lastHouseIdx = mailboxCount - 1; lastHouseIdx < houseCount; lastHouseIdx++) {
48            // Try all possible positions for the (mailboxCount-1)th mailbox
49            for (let splitPoint = lastHouseIdx - 1; splitPoint >= 0; splitPoint--) {
50                // Cost = cost of first splitPoint+1 houses with mailboxCount-1 mailboxes
51                //        + cost of remaining houses with 1 mailbox
52                dpTable[lastHouseIdx][mailboxCount] = Math.min(
53                    dpTable[lastHouseIdx][mailboxCount],
54                    dpTable[splitPoint][mailboxCount - 1] + costMatrix[splitPoint + 1][lastHouseIdx]
55                );
56            }
57        }
58    }
59
60    // Return minimum cost to serve all houses with k mailboxes
61    return dpTable[houseCount - 1][k];
62}
63

Time and Space Complexity

Time Complexity: O(n^2 × k)

The analysis breaks down as follows:

  • Sorting the houses takes O(n log n) time
  • Building the cost matrix g:
    • The nested loops iterate through all pairs (i, j) where i < j
    • This results in O(n^2) operations
  • Computing the DP table f:
    • The outer loop runs n times (for each house index i)
    • The middle loop runs up to k times (for each number of mailboxes j)
    • The inner loop runs up to i times (for each split point p)
    • In the worst case, this gives us O(n × k × n) = O(n^2 × k)

Since O(n^2 × k) dominates O(n log n) and O(n^2), the overall time complexity is O(n^2 × k).

Space Complexity: O(n^2)

The space usage consists of:

  • The cost matrix g: n × n array requiring O(n^2) space
  • The DP table f: n × (k+1) array requiring O(n × k) space

Since k ≤ n (we cannot have more mailboxes than houses), and typically k is much smaller than n, the O(n^2) term from matrix g dominates, giving us a total space complexity of O(n^2).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Cost Matrix Calculation for Single Mailbox

Pitfall: A common mistake is incorrectly calculating the cost matrix g[i][j] (or cost[i][j] in the code). Developers often try to compute this by iterating through all houses and calculating distances to a median position explicitly, which is both inefficient and error-prone.

Why it happens: The recurrence relation g[i][j] = g[i+1][j-1] + houses[j] - houses[i] is not intuitive. Many attempt to calculate the median position first and then sum distances, leading to:

# INCORRECT approach
median = houses[(i + j) // 2]
for h in range(i, j + 1):
    cost[i][j] += abs(houses[h] - median)

Solution: Trust the mathematical property that for sorted houses, when adding two houses at the ends of a range, they contribute exactly houses[j] - houses[i] to the total distance when the mailbox is at the median. The correct approach builds the cost matrix using the recurrence relation.

2. Off-by-One Errors in DP Boundaries

Pitfall: Incorrectly setting the loop boundaries, especially:

  • Using range(2, k + 1) instead of range(2, min(k + 1, i + 2)) for the number of mailboxes
  • This causes attempting to place more mailboxes than houses available up to position i

Why it happens: Not considering that you cannot place more mailboxes than the number of houses in the current range.

Solution: Always bound the number of mailboxes by both k and the number of available houses:

for j in range(2, min(k + 1, i + 2)):  # i+2 because we have houses 0 to i (inclusive)

3. Forgetting to Sort the Houses Array

Pitfall: Operating on the unsorted houses array, which breaks the fundamental assumption that each mailbox serves a contiguous range of houses.

Why it happens: The problem statement doesn't explicitly mention that houses need to be sorted, and developers might assume the input is already sorted or that order doesn't matter.

Solution: Always sort the houses array at the beginning:

houses.sort()  # Critical first step

4. Incorrect Base Case Initialization

Pitfall: Not properly initializing the base case where one mailbox serves houses from index 0 to i:

# INCORRECT - forgetting base case
dp[i][1] = 0  # Wrong!

# CORRECT
dp[i][1] = cost[0][i]

Why it happens: Confusion about what the base case represents - it's not zero cost, but the actual cost of placing one mailbox optimally for those houses.

Solution: Ensure the base case correctly uses the precomputed cost matrix for placing one mailbox.

5. Integer Overflow in Other Languages

Pitfall: While Python handles large integers automatically, implementing this in languages like Java or C++ might cause overflow when initializing with Integer.MAX_VALUE and performing additions.

Solution: Use appropriate data types (like long in Java) or check for overflow conditions before additions:

# In Python, we use float('inf') which handles this gracefully
dp = [[float('inf')] * (k + 1) for _ in range(n)]
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

What is an advantages of top-down dynamic programming vs bottom-up dynamic programming?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More