1478. Allocate Mailboxes
Problem Description
You are given an array houses
where houses[i]
represents the location of the i-th house along a street. You need to place exactly k
mailboxes on this street.
The goal is to minimize the total distance between each house and its nearest mailbox. In other words, for each house, you want to find the closest mailbox to it, and the sum of all these distances should be as small as possible.
Key Points:
- Houses are located at specific positions along a street (given as integers in the array)
- You must place exactly
k
mailboxes somewhere on this street - Each house will be served by its nearest mailbox
- You want to minimize the sum of distances from all houses to their nearest mailboxes
- The answer is guaranteed to fit in a 32-bit integer
Example Understanding:
If you have houses at positions [1, 4, 8, 10, 20]
and you can place k = 3
mailboxes, you need to strategically place these 3 mailboxes so that when each house goes to its nearest mailbox, the total distance traveled by all houses combined is minimized.
The problem requires finding the optimal placement of mailboxes to achieve this minimum total distance.
Intuition
Let's think about this problem step by step. First, when we have a group of houses and want to place one mailbox to serve them all, where should we put it? The optimal position is at the median of those houses. This minimizes the total distance because moving the mailbox away from the median would increase the total distance.
Now, with k
mailboxes for n
houses, we need to partition the houses into k
groups, where each group is served by one mailbox. Since we want to minimize distances, it makes sense to sort the houses first - this way, each mailbox will serve a contiguous segment of houses along the street.
This leads us to a dynamic programming approach. We can think of the problem as: "What's the minimum cost to serve the first i+1
houses using j
mailboxes?"
For each state f[i][j]
, we need to decide where the j-1
th mailbox ends its service. If it ends at house p
, then:
- Houses
0
top
are served by the firstj-1
mailboxes (cost =f[p][j-1]
) - Houses
p+1
toi
are served by thej
th mailbox (cost =g[p+1][i]
)
The key insight for calculating g[i][j]
(cost of serving houses from i
to j
with one mailbox) is that when we place a mailbox at the median position, the total distance equals the sum of pairwise distances between houses at opposite ends. For sorted houses, this can be computed efficiently using the recurrence:
g[i][j] = g[i+1][j-1] + houses[j] - houses[i]
This works because adding houses i
and j
to an existing group adds exactly houses[j] - houses[i]
to the total distance (they pair up with each other when the mailbox is at the median).
By building up from smaller subproblems to larger ones, we can find the optimal placement strategy for all k
mailboxes.
Learn more about Math, Dynamic Programming and Sorting patterns.
Solution Approach
The implementation follows a dynamic programming approach with preprocessing for efficient distance calculations.
Step 1: Sort the houses
houses.sort()
Sorting ensures that each mailbox will serve a contiguous range of houses, simplifying our problem.
Step 2: Precompute distances using matrix g
g = [[0] * n for _ in range(n)]
for i in range(n - 2, -1, -1):
for j in range(i + 1, n):
g[i][j] = g[i + 1][j - 1] + houses[j] - houses[i]
g[i][j]
represents the minimum total distance when placing one mailbox to serve houses from index i
to j
. We build this matrix bottom-up using the recurrence relation:
g[i][j] = g[i+1][j-1] + houses[j] - houses[i]
This formula works because when we add houses at positions i
and j
to a group, they contribute houses[j] - houses[i]
to the total distance (since the optimal mailbox position is at the median).
Step 3: Dynamic Programming
f = [[inf] * (k + 1) for _ in range(n)]
f[i][j]
represents the minimum total distance to serve houses 0
to i
using j
mailboxes.
Step 4: Base case initialization
for i in range(n):
f[i][1] = g[0][i]
When using only 1 mailbox, the cost for houses 0
to i
is simply g[0][i]
.
Step 5: Fill the DP table
for i in range(n):
for j in range(2, min(k + 1, i + 2)):
for p in range(i):
f[i][j] = min(f[i][j], f[p][j - 1] + g[p + 1][i])
For each state f[i][j]
, we try all possible positions p
where the (j-1)
th mailbox ends its service:
- Houses
0
top
are served byj-1
mailboxes: cost =f[p][j-1]
- Houses
p+1
toi
are served by thej
th mailbox: cost =g[p+1][i]
- Total cost =
f[p][j-1] + g[p+1][i]
We take the minimum over all possible splits.
Step 6: Return the result
return f[-1][k]
The answer is f[n-1][k]
, which represents the minimum distance to serve all n
houses using exactly k
mailboxes.
Time Complexity: O(n³)
for preprocessing g
and O(n²k)
for the DP computation.
Space Complexity: O(n²)
for matrix g
and O(nk)
for the DP table f
.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example with houses at positions [1, 4, 8]
and k = 2
mailboxes.
Step 1: Sort the houses
Houses are already sorted: [1, 4, 8]
Step 2: Build the distance matrix g
We need to calculate g[i][j]
- the cost of serving houses from index i
to j
with one mailbox placed at the median.
Starting from the bottom-right and working backwards:
g[1][2]
(houses at 4 and 8): The median is between 4 and 8, and the total distance = 8 - 4 = 4g[0][1]
(houses at 1 and 4): The median is between 1 and 4, and the total distance = 4 - 1 = 3g[0][2]
(houses at 1, 4, and 8): Using the recurrenceg[0][2] = g[1][1] + houses[2] - houses[0] = 0 + 8 - 1 = 7
So our g
matrix is:
0 1 2 0 [ 0 3 7 ] 1 [ 0 0 4 ] 2 [ 0 0 0 ]
Step 3: Initialize DP table f
Create f[i][j]
where i
represents serving houses 0 to i, and j
represents using j mailboxes.
Step 4: Base case (j = 1) When using only 1 mailbox for all houses:
f[0][1] = g[0][0] = 0
(one house at position 1)f[1][1] = g[0][1] = 3
(houses at 1 and 4, mailbox at median)f[2][1] = g[0][2] = 7
(houses at 1, 4, 8, mailbox at 4)
Step 5: Fill DP for j = 2
For f[1][2]
(2 houses, 2 mailboxes):
- Split at house 0:
f[0][1] + g[1][1] = 0 + 0 = 0
- So
f[1][2] = 0
(one mailbox per house)
For f[2][2]
(3 houses, 2 mailboxes):
- Split at house 0:
f[0][1] + g[1][2] = 0 + 4 = 4
- First mailbox serves house at 1 (cost = 0)
- Second mailbox serves houses at 4 and 8 (cost = 4)
- Split at house 1:
f[1][1] + g[2][2] = 3 + 0 = 3
- First mailbox serves houses at 1 and 4 (cost = 3)
- Second mailbox serves house at 8 (cost = 0)
- So
f[2][2] = min(4, 3) = 3
Final DP table:
j=0 j=1 j=2 i=0 [∞ 0 ∞] i=1 [∞ 3 0] i=2 [∞ 7 3]
Step 6: Return result
f[2][2] = 3
is our answer.
The optimal placement is: one mailbox serving houses at positions 1 and 4 (placed at position 2.5 or anywhere between them), and another mailbox at position 8. Total distance = |1-2.5| + |4-2.5| + |8-8| = 1.5 + 1.5 + 0 = 3.
Solution Implementation
1class Solution:
2 def minDistance(self, houses: List[int], k: int) -> int:
3 # Sort houses by position for optimal mailbox placement
4 houses.sort()
5 n = len(houses)
6
7 # cost[i][j] = minimum cost to place one mailbox for houses[i] to houses[j]
8 # The optimal position for one mailbox is the median of the houses
9 cost = [[0] * n for _ in range(n)]
10
11 # Build cost matrix from right-bottom to left-top
12 # For sorted houses, placing mailbox at median minimizes total distance
13 for i in range(n - 2, -1, -1):
14 for j in range(i + 1, n):
15 # Recurrence: extend the range by adding houses at both ends
16 # The new cost adds the distance between the new pair of houses
17 cost[i][j] = cost[i + 1][j - 1] + houses[j] - houses[i]
18
19 # dp[i][j] = minimum distance for houses[0..i] using j mailboxes
20 dp = [[float('inf')] * (k + 1) for _ in range(n)]
21
22 # Process each house position
23 for i in range(n):
24 # Base case: one mailbox for houses[0..i]
25 dp[i][1] = cost[0][i]
26
27 # Try using 2 to min(k, i+1) mailboxes
28 # Can't use more mailboxes than houses up to position i
29 for j in range(2, min(k + 1, i + 2)):
30 # Try all possible positions for the (j-1)th mailbox
31 for p in range(i):
32 # Split: houses[0..p] use j-1 mailboxes
33 # houses[p+1..i] use 1 mailbox
34 dp[i][j] = min(dp[i][j], dp[p][j - 1] + cost[p + 1][i])
35
36 # Return minimum distance for all houses using k mailboxes
37 return dp[-1][k]
38
1class Solution {
2 public int minDistance(int[] houses, int k) {
3 // Sort houses by position to enable optimal mailbox placement
4 Arrays.sort(houses);
5 int n = houses.length;
6
7 // costMatrix[i][j] = minimum cost to place one mailbox for houses[i] to houses[j]
8 // The optimal position for one mailbox is the median position
9 int[][] costMatrix = new int[n][n];
10
11 // Build cost matrix from bottom-right to top-left
12 // For houses from index i to j, the cost is calculated recursively
13 for (int i = n - 2; i >= 0; i--) {
14 for (int j = i + 1; j < n; j++) {
15 // Cost for range [i,j] = cost for inner range [i+1,j-1] + distance from edges
16 // This works because the median remains optimal and we add edge distances
17 costMatrix[i][j] = costMatrix[i + 1][j - 1] + houses[j] - houses[i];
18 }
19 }
20
21 // dp[i][j] = minimum cost to place j mailboxes for houses[0] to houses[i]
22 int[][] dp = new int[n][k + 1];
23
24 // Initialize with maximum value (infinity)
25 final int INFINITY = 1 << 30;
26 for (int[] row : dp) {
27 Arrays.fill(row, INFINITY);
28 }
29
30 // Dynamic programming to find minimum cost
31 for (int houseIdx = 0; houseIdx < n; houseIdx++) {
32 // Base case: one mailbox for houses[0] to houses[houseIdx]
33 dp[houseIdx][1] = costMatrix[0][houseIdx];
34
35 // Try placing 2 to k mailboxes (but not more than houses available)
36 for (int mailboxCount = 2; mailboxCount <= k && mailboxCount <= houseIdx + 1; mailboxCount++) {
37 // Try different split points for the last mailbox coverage
38 for (int splitPoint = 0; splitPoint < houseIdx; splitPoint++) {
39 // Cost = cost of (mailboxCount-1) mailboxes for houses[0..splitPoint]
40 // + cost of 1 mailbox for houses[splitPoint+1..houseIdx]
41 dp[houseIdx][mailboxCount] = Math.min(
42 dp[houseIdx][mailboxCount],
43 dp[splitPoint][mailboxCount - 1] + costMatrix[splitPoint + 1][houseIdx]
44 );
45 }
46 }
47 }
48
49 // Return minimum cost for all houses with k mailboxes
50 return dp[n - 1][k];
51 }
52}
53
1class Solution {
2public:
3 int minDistance(vector<int>& houses, int k) {
4 int n = houses.size();
5
6 // Sort houses by position for optimal mailbox placement
7 sort(houses.begin(), houses.end());
8
9 // cost[i][j]: minimum cost to place one mailbox for houses[i..j]
10 // The optimal position for one mailbox is the median position
11 vector<vector<int>> cost(n, vector<int>(n, 0));
12
13 // Precompute costs for placing one mailbox for any range [i, j]
14 // Working backwards to build up the cost table efficiently
15 for (int i = n - 2; i >= 0; --i) {
16 for (int j = i + 1; j < n; ++j) {
17 // For sorted houses, placing mailbox at median minimizes total distance
18 // This formula accumulates the cost incrementally
19 cost[i][j] = cost[i + 1][j - 1] + houses[j] - houses[i];
20 }
21 }
22
23 // dp[i][j]: minimum distance sum for houses[0..i] using j mailboxes
24 const int INF = 0x3f3f3f3f;
25 vector<vector<int>> dp(n, vector<int>(k + 1, INF));
26
27 // Dynamic programming to find optimal mailbox allocation
28 for (int i = 0; i < n; ++i) {
29 // Base case: one mailbox for houses[0..i]
30 dp[i][1] = cost[0][i];
31
32 // Try using j mailboxes for houses[0..i]
33 for (int j = 2; j <= k && j <= i + 1; ++j) {
34 // Try placing the j-th mailbox to serve houses[lastHouse+1..i]
35 // and use j-1 mailboxes for houses[0..lastHouse]
36 for (int lastHouse = j - 2; lastHouse < i; ++lastHouse) {
37 dp[i][j] = min(dp[i][j],
38 dp[lastHouse][j - 1] + cost[lastHouse + 1][i]);
39 }
40 }
41 }
42
43 // Return minimum distance for all houses with k mailboxes
44 return dp[n - 1][k];
45 }
46};
47
1/**
2 * Finds the minimum sum of distances from houses to their nearest mailboxes
3 * when placing k mailboxes optimally among the houses
4 *
5 * @param houses - Array of house positions on a street
6 * @param k - Number of mailboxes to place
7 * @returns Minimum total distance from all houses to their nearest mailbox
8 */
9function minDistance(houses: number[], k: number): number {
10 // Sort houses by position to work with ordered positions
11 houses.sort((a, b) => a - b);
12 const houseCount: number = houses.length;
13
14 // costMatrix[i][j] represents the cost of serving houses from index i to j with one mailbox
15 // The optimal position for one mailbox is the median position
16 const costMatrix: number[][] = Array.from(
17 { length: houseCount },
18 () => Array(houseCount).fill(0)
19 );
20
21 // Pre-calculate cost of serving any range of houses with a single mailbox
22 // Working backwards to build up the cost matrix efficiently
23 for (let startIdx = houseCount - 2; startIdx >= 0; startIdx--) {
24 for (let endIdx = startIdx + 1; endIdx < houseCount; endIdx++) {
25 // Cost accumulates from inner range plus the new pair of houses
26 costMatrix[startIdx][endIdx] = costMatrix[startIdx + 1][endIdx - 1] +
27 houses[endIdx] - houses[startIdx];
28 }
29 }
30
31 const INFINITY: number = Number.POSITIVE_INFINITY;
32
33 // dpTable[i][j] represents minimum cost to serve houses 0 to i using j mailboxes
34 const dpTable: number[][] = Array.from(
35 { length: houseCount },
36 () => Array(k + 1).fill(INFINITY)
37 );
38
39 // Base case: serving first i+1 houses with 1 mailbox
40 for (let houseIdx = 0; houseIdx < houseCount; houseIdx++) {
41 dpTable[houseIdx][1] = costMatrix[0][houseIdx];
42 }
43
44 // Fill DP table for increasing number of mailboxes
45 for (let mailboxCount = 2; mailboxCount <= k; mailboxCount++) {
46 // Need at least mailboxCount houses to place mailboxCount mailboxes
47 for (let lastHouseIdx = mailboxCount - 1; lastHouseIdx < houseCount; lastHouseIdx++) {
48 // Try all possible positions for the (mailboxCount-1)th mailbox
49 for (let splitPoint = lastHouseIdx - 1; splitPoint >= 0; splitPoint--) {
50 // Cost = cost of first splitPoint+1 houses with mailboxCount-1 mailboxes
51 // + cost of remaining houses with 1 mailbox
52 dpTable[lastHouseIdx][mailboxCount] = Math.min(
53 dpTable[lastHouseIdx][mailboxCount],
54 dpTable[splitPoint][mailboxCount - 1] + costMatrix[splitPoint + 1][lastHouseIdx]
55 );
56 }
57 }
58 }
59
60 // Return minimum cost to serve all houses with k mailboxes
61 return dpTable[houseCount - 1][k];
62}
63
Time and Space Complexity
Time Complexity: O(n^2 × k)
The analysis breaks down as follows:
- Sorting the houses takes
O(n log n)
time - Building the cost matrix
g
:- The nested loops iterate through all pairs
(i, j)
wherei < j
- This results in
O(n^2)
operations
- The nested loops iterate through all pairs
- Computing the DP table
f
:- The outer loop runs
n
times (for each house indexi
) - The middle loop runs up to
k
times (for each number of mailboxesj
) - The inner loop runs up to
i
times (for each split pointp
) - In the worst case, this gives us
O(n × k × n) = O(n^2 × k)
- The outer loop runs
Since O(n^2 × k)
dominates O(n log n)
and O(n^2)
, the overall time complexity is O(n^2 × k)
.
Space Complexity: O(n^2)
The space usage consists of:
- The cost matrix
g
:n × n
array requiringO(n^2)
space - The DP table
f
:n × (k+1)
array requiringO(n × k)
space
Since k ≤ n
(we cannot have more mailboxes than houses), and typically k
is much smaller than n
, the O(n^2)
term from matrix g
dominates, giving us a total space complexity of O(n^2)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Cost Matrix Calculation for Single Mailbox
Pitfall: A common mistake is incorrectly calculating the cost matrix g[i][j]
(or cost[i][j]
in the code). Developers often try to compute this by iterating through all houses and calculating distances to a median position explicitly, which is both inefficient and error-prone.
Why it happens: The recurrence relation g[i][j] = g[i+1][j-1] + houses[j] - houses[i]
is not intuitive. Many attempt to calculate the median position first and then sum distances, leading to:
# INCORRECT approach
median = houses[(i + j) // 2]
for h in range(i, j + 1):
cost[i][j] += abs(houses[h] - median)
Solution: Trust the mathematical property that for sorted houses, when adding two houses at the ends of a range, they contribute exactly houses[j] - houses[i]
to the total distance when the mailbox is at the median. The correct approach builds the cost matrix using the recurrence relation.
2. Off-by-One Errors in DP Boundaries
Pitfall: Incorrectly setting the loop boundaries, especially:
- Using
range(2, k + 1)
instead ofrange(2, min(k + 1, i + 2))
for the number of mailboxes - This causes attempting to place more mailboxes than houses available up to position
i
Why it happens: Not considering that you cannot place more mailboxes than the number of houses in the current range.
Solution: Always bound the number of mailboxes by both k
and the number of available houses:
for j in range(2, min(k + 1, i + 2)): # i+2 because we have houses 0 to i (inclusive)
3. Forgetting to Sort the Houses Array
Pitfall: Operating on the unsorted houses array, which breaks the fundamental assumption that each mailbox serves a contiguous range of houses.
Why it happens: The problem statement doesn't explicitly mention that houses need to be sorted, and developers might assume the input is already sorted or that order doesn't matter.
Solution: Always sort the houses array at the beginning:
houses.sort() # Critical first step
4. Incorrect Base Case Initialization
Pitfall: Not properly initializing the base case where one mailbox serves houses from index 0 to i:
# INCORRECT - forgetting base case dp[i][1] = 0 # Wrong! # CORRECT dp[i][1] = cost[0][i]
Why it happens: Confusion about what the base case represents - it's not zero cost, but the actual cost of placing one mailbox optimally for those houses.
Solution: Ensure the base case correctly uses the precomputed cost matrix for placing one mailbox.
5. Integer Overflow in Other Languages
Pitfall: While Python handles large integers automatically, implementing this in languages like Java or C++ might cause overflow when initializing with Integer.MAX_VALUE
and performing additions.
Solution: Use appropriate data types (like long
in Java) or check for overflow conditions before additions:
# In Python, we use float('inf') which handles this gracefully
dp = [[float('inf')] * (k + 1) for _ in range(n)]
What is an advantages of top-down dynamic programming vs bottom-up dynamic programming?
Recommended Readings
Math for Technical Interviews How much math do I need to know for technical interviews The short answer is about high school level math Computer science is often associated with math and some universities even place their computer science department under the math faculty However the reality is that you
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
Want a Structured Path to Master System Design Too? Don’t Miss This!