1168. Optimize Water Distribution in a Village
Problem Description
You have n
houses in a village that need water supply. To provide water to all houses, you can either build wells directly in houses or connect houses with pipes to share water from existing wells.
For each house i
, you have two options:
- Build a well directly inside it at a cost of
wells[i - 1]
(using 0-based indexing) - Connect it to another house that has water supply using pipes
The pipe connections are described in the pipes
array, where each element pipes[j] = [house1_j, house2_j, cost_j]
represents:
- A bidirectional pipe can be built between
house1_j
andhouse2_j
- The cost to build this pipe is
cost_j
- Multiple pipes with different costs can exist between the same two houses
Your goal is to find the minimum total cost to ensure all houses have water supply. Houses can get water either from their own well or through pipe connections to other houses that have water (directly from a well or indirectly through other pipes).
The solution transforms this problem into a minimum spanning tree problem by creating a virtual node (node 0) representing a "super well". Each house's well-building cost becomes an edge weight between that house and node 0. The algorithm then uses Kruskal's approach with Union-Find to find the minimum cost to connect all houses to a water source.
Intuition
The key insight is recognizing that this is actually a graph connectivity problem in disguise. Each house needs to be connected to a water source, either directly (by building a well) or indirectly (through pipes to other houses with water).
Think of it this way: if we build a well in a house, we're essentially "connecting" that house to a water source. If we lay pipes between houses, we're creating paths for water to flow. The challenge is that wells can be built in multiple houses, making it unclear which houses should have wells and which should rely on pipes.
The breakthrough comes from creating a virtual "super source" node (node 0) that represents all possible wells. Now, instead of deciding whether to build a well in house i
, we can model it as connecting house i
to this super source with an edge weight equal to wells[i-1]
. This transforms the problem elegantly:
- Building a well in house
i
→ Connecting housei
to node 0 with costwells[i-1]
- Laying a pipe between houses → Connecting two house nodes with the pipe cost
- Ensuring all houses have water → Ensuring all house nodes are connected to node 0
With this transformation, the problem becomes finding the minimum cost to connect all nodes in a graph - a classic Minimum Spanning Tree (MST) problem! We need exactly n
edges to connect n+1
nodes (n houses + 1 super source), and we want the minimum total cost.
Using Kruskal's algorithm with Union-Find is natural here: we sort all edges by cost and greedily pick the cheapest edges that connect different components until everything is connected. The beauty is that the algorithm automatically decides which houses get wells (direct edges to node 0) and which houses share water through pipes (edges between house nodes).
Learn more about Union Find, Graph, Minimum Spanning Tree and Heap (Priority Queue) patterns.
Solution Approach
The solution implements Kruskal's algorithm with Union-Find data structure to find the minimum spanning tree. Here's how the implementation works:
Step 1: Transform the Problem
for i, w in enumerate(wells, 1):
pipes.append([0, i, w])
We add virtual edges from node 0 (super source) to each house i
with cost wells[i-1]
. This transforms building a well into a graph edge problem.
Step 2: Sort Edges by Cost
pipes.sort(key=lambda x: x[2])
Kruskal's algorithm requires edges sorted by weight in ascending order. This ensures we consider cheaper connections first.
Step 3: Initialize Union-Find
p = list(range(n + 1))
Create a parent array where p[i] = i
initially, meaning each node (0 to n) is its own parent. Node 0 represents the super source, nodes 1 to n represent houses.
Step 4: Find Operation with Path Compression
def find(x: int) -> int:
if p[x] != x:
p[x] = find(p[x]) # Path compression
return p[x]
This finds the root of the component containing node x
. Path compression optimization makes subsequent finds faster by directly connecting nodes to their root.
Step 5: Process Edges Using Union-Find
ans = 0 for a, b, c in pipes: pa, pb = find(a), find(b) if pa != pb: p[pa] = pb # Union operation n -= 1 # One less component ans += c # Add edge cost if n == 0: return ans
For each edge (a, b, cost)
:
- Find the roots of both endpoints
- If they're in different components (
pa != pb
), merge them:- Union: Set one root as parent of the other
- Decrement
n
(number of components remaining) - Add the edge cost to our answer
- If we've connected all houses (
n == 0
), we're done
The algorithm greedily selects minimum-cost edges that connect different components. Since we need exactly n
edges to connect n+1
nodes into one component, we stop when n
reaches 0. The total cost accumulated is the minimum cost to supply water to all houses.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach.
Given:
- 3 houses in the village
wells = [1, 2, 2]
(costs to build wells in houses 1, 2, 3)pipes = [[1, 2, 1], [2, 3, 1]]
(pipe connections with costs)
Step 1: Transform the Problem
First, we create a virtual node 0 (super source) and add edges from it to each house:
- Edge (0, 1) with cost 1 (well in house 1)
- Edge (0, 2) with cost 2 (well in house 2)
- Edge (0, 3) with cost 2 (well in house 3)
Our complete edge list becomes:
pipes = [[1, 2, 1], [2, 3, 1], [0, 1, 1], [0, 2, 2], [0, 3, 2]]
Step 2: Sort Edges by Cost
After sorting by cost (third element):
pipes = [[1, 2, 1], [2, 3, 1], [0, 1, 1], [0, 2, 2], [0, 3, 2]]
Step 3: Initialize Union-Find
Parent array: p = [0, 1, 2, 3]
- Each node is its own parent initially
- We have 4 components (nodes 0, 1, 2, 3)
- Counter
n = 3
(we need to connect 3 houses to water)
Step 4: Process Edges with Kruskal's Algorithm
Edge 1: [1, 2, 1]
- Find parents:
find(1) = 1
,find(2) = 2
- Different components, so union them:
p[1] = 2
- Add cost:
ans = 0 + 1 = 1
- Decrement counter:
n = 2
- State: Houses 1 and 2 are connected
Edge 2: [2, 3, 1]
- Find parents:
find(2) = 2
,find(3) = 3
- Different components, so union them:
p[2] = 3
- Add cost:
ans = 1 + 1 = 2
- Decrement counter:
n = 1
- State: Houses 1, 2, and 3 are all connected
Edge 3: [0, 1, 1]
- Find parents:
find(0) = 0
,find(1) = 3
(through path compression) - Different components, so union them:
p[0] = 3
- Add cost:
ans = 2 + 1 = 3
- Decrement counter:
n = 0
- All houses connected to water source! Return ans = 3
Result Interpretation:
The minimum cost is 3, achieved by:
- Building a well in house 1 (edge 0→1, cost 1)
- Connecting house 1 to house 2 with a pipe (cost 1)
- Connecting house 2 to house 3 with a pipe (cost 1)
This gives all houses access to water at minimum total cost of 3.
Solution Implementation
1class Solution:
2 def minCostToSupplyWater(
3 self, n: int, wells: List[int], pipes: List[List[int]]
4 ) -> int:
5 # Union-Find helper function to find the root of a node
6 def find_root(node: int) -> int:
7 # Path compression: make each node point directly to root
8 if parent[node] != node:
9 parent[node] = find_root(parent[node])
10 return parent[node]
11
12 # Add wells as edges from virtual node 0 to each house
13 # wells[i-1] is the cost to build a well at house i
14 for house_id, well_cost in enumerate(wells, start=1):
15 pipes.append([0, house_id, well_cost])
16
17 # Sort all edges (pipes + wells) by cost for Kruskal's algorithm
18 pipes.sort(key=lambda edge: edge[2])
19
20 # Initialize Union-Find parent array
21 # parent[i] represents the parent of node i
22 parent = list(range(n + 1))
23
24 total_cost = 0
25 edges_added = 0
26
27 # Process edges in ascending order of cost (Kruskal's algorithm)
28 for source, destination, cost in pipes:
29 root_source = find_root(source)
30 root_destination = find_root(destination)
31
32 # If nodes are in different components, unite them
33 if root_source != root_destination:
34 # Union: merge the two components
35 parent[root_source] = root_destination
36 edges_added += 1
37 total_cost += cost
38
39 # Early termination: we need exactly n edges to connect n+1 nodes
40 if edges_added == n:
41 return total_cost
42
1class Solution {
2 // Parent array for Union-Find (Disjoint Set Union)
3 private int[] parent;
4
5 public int minCostToSupplyWater(int n, int[] wells, int[][] pipes) {
6 // Create a new edge array that includes both pipes and virtual edges from source (node 0) to each house
7 // The virtual edges represent the cost of building a well at each house
8 int[][] edges = Arrays.copyOf(pipes, pipes.length + n);
9
10 // Add virtual edges from node 0 (virtual source) to each house with cost = well cost
11 for (int i = 0; i < n; i++) {
12 edges[pipes.length + i] = new int[] {0, i + 1, wells[i]};
13 }
14
15 // Sort all edges by cost in ascending order (Kruskal's algorithm requirement)
16 Arrays.sort(edges, (a, b) -> a[2] - b[2]);
17
18 // Initialize parent array for Union-Find
19 // Index 0 represents the virtual source node, indices 1 to n represent houses
20 parent = new int[n + 1];
21 for (int i = 0; i <= n; i++) {
22 parent[i] = i;
23 }
24
25 int totalCost = 0;
26 int edgesUsed = 0;
27
28 // Process edges in order of increasing cost (Kruskal's algorithm)
29 for (int[] edge : edges) {
30 int nodeA = edge[0];
31 int nodeB = edge[1];
32 int cost = edge[2];
33
34 // Find the root parents of both nodes
35 int rootA = find(nodeA);
36 int rootB = find(nodeB);
37
38 // If nodes are in different components, connect them
39 if (rootA != rootB) {
40 // Union: merge the two components
41 parent[rootA] = rootB;
42 totalCost += cost;
43 edgesUsed++;
44
45 // Early termination: we need exactly n edges to connect n+1 nodes
46 if (edgesUsed == n) {
47 return totalCost;
48 }
49 }
50 }
51
52 return totalCost;
53 }
54
55 // Find operation with path compression for Union-Find
56 private int find(int x) {
57 if (parent[x] != x) {
58 // Path compression: make every node point directly to root
59 parent[x] = find(parent[x]);
60 }
61 return parent[x];
62 }
63}
64
1class Solution {
2public:
3 int minCostToSupplyWater(int n, vector<int>& wells, vector<vector<int>>& pipes) {
4 // Add virtual edges from a virtual node 0 to each house
5 // The cost represents building a well at that house
6 for (int i = 0; i < n; ++i) {
7 pipes.push_back({0, i + 1, wells[i]});
8 }
9
10 // Sort all edges (pipes) by cost in ascending order
11 sort(pipes.begin(), pipes.end(), [](const vector<int>& a, const vector<int>& b) {
12 return a[2] < b[2];
13 });
14
15 // Initialize parent array for Union-Find
16 // Index 0 represents the virtual source node
17 vector<int> parent(n + 1);
18 iota(parent.begin(), parent.end(), 0);
19
20 // Find function with path compression
21 function<int(int)> find = [&](int x) {
22 if (parent[x] != x) {
23 parent[x] = find(parent[x]); // Path compression
24 }
25 return parent[x];
26 };
27
28 // Kruskal's algorithm to find minimum spanning tree
29 int totalCost = 0;
30 int edgesUsed = 0;
31
32 for (const auto& edge : pipes) {
33 int house1 = edge[0];
34 int house2 = edge[1];
35 int cost = edge[2];
36
37 // Find the root parents of both houses
38 int root1 = find(house1);
39 int root2 = find(house2);
40
41 // Skip if already connected (would create a cycle)
42 if (root1 == root2) {
43 continue;
44 }
45
46 // Union the two components
47 parent[root1] = root2;
48 totalCost += cost;
49 edgesUsed++;
50
51 // Early termination: MST needs exactly n edges
52 // (n-1 edges to connect n houses + 1 edge to virtual node)
53 if (edgesUsed == n) {
54 break;
55 }
56 }
57
58 return totalCost;
59 }
60};
61
1/**
2 * Finds the minimum cost to supply water to all houses either by building wells or connecting pipes
3 * Uses Kruskal's algorithm with Union-Find to find the minimum spanning tree
4 *
5 * @param n - Number of houses
6 * @param wells - Cost to build a well at each house (wells[i] is cost for house i+1)
7 * @param pipes - Array of pipe connections [house1, house2, cost]
8 * @returns Minimum total cost to supply water to all houses
9 */
10function minCostToSupplyWater(n: number, wells: number[], pipes: number[][]): number {
11 // Create virtual node 0 and connect it to all houses with cost equal to well cost
12 // This transforms the problem into finding minimum spanning tree
13 for (let i = 0; i < n; ++i) {
14 pipes.push([0, i + 1, wells[i]]);
15 }
16
17 // Sort all edges (pipes) by cost in ascending order for Kruskal's algorithm
18 pipes.sort((a, b) => a[2] - b[2]);
19
20 // Initialize parent array for Union-Find data structure
21 // Each node is initially its own parent
22 const parent: number[] = Array(n + 1)
23 .fill(0)
24 .map((_, index) => index);
25
26 /**
27 * Find operation with path compression for Union-Find
28 * Finds the root parent of a node and compresses the path
29 *
30 * @param x - Node to find root parent for
31 * @returns Root parent of the node
32 */
33 const find = (x: number): number => {
34 if (parent[x] !== x) {
35 // Path compression: make every node point directly to root
36 parent[x] = find(parent[x]);
37 }
38 return parent[x];
39 };
40
41 let totalCost: number = 0;
42
43 // Process edges in ascending order of cost
44 for (const [houseA, houseB, cost] of pipes) {
45 const rootA: number = find(houseA);
46 const rootB: number = find(houseB);
47
48 // Skip if nodes are already in the same component (would create cycle)
49 if (rootA === rootB) {
50 continue;
51 }
52
53 // Union operation: merge two components
54 parent[rootA] = rootB;
55 totalCost += cost;
56
57 // Early termination: we need exactly n edges to connect n+1 nodes
58 if (--n === 0) {
59 break;
60 }
61 }
62
63 return totalCost;
64}
65
Time and Space Complexity
Time Complexity: O((m + n) × log(m + n))
The time complexity is dominated by the sorting operation. After adding all wells as virtual edges from node 0, we have m + n
total edges (where m
is the original number of pipes and n
is the number of wells). Sorting these m + n
edges takes O((m + n) × log(m + n))
time. The Union-Find operations (find and union) with path compression have an amortized time complexity of nearly O(1)
per operation, and we perform at most m + n
such operations, contributing O((m + n) × α(n))
where α
is the inverse Ackermann function, which is effectively constant for practical purposes. Therefore, the overall time complexity is O((m + n) × log(m + n))
.
Space Complexity: O(m + n)
The space complexity consists of several components:
- The modified
pipes
array after adding well connections:O(m + n)
space - The parent array
p
for Union-Find:O(n + 1) = O(n)
space - The recursion stack for the find operation with path compression:
O(n)
in the worst case
Since we're modifying the input pipes
array in-place by appending to it, the additional space used is O(n)
for the parent array and potential recursion stack. However, considering the total space including the modified input, it becomes O(m + n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect House Indexing When Adding Wells
A critical pitfall is mishandling the 1-based vs 0-based indexing when adding wells as edges. The problem states houses are numbered 1 to n, but the wells array uses 0-based indexing.
Incorrect Implementation:
# Wrong: This creates edges from node 0 to nodes 0,1,2...n-1
for i in range(n):
pipes.append([0, i, wells[i]])
Correct Implementation:
# Correct: Creates edges from node 0 to nodes 1,2,3...n
for i, w in enumerate(wells, 1):
pipes.append([0, i, w])
# OR
for i in range(n):
pipes.append([0, i + 1, wells[i]])
2. Forgetting to Account for the Virtual Node in Parent Array
The parent array must have size n + 1
to accommodate nodes 0 through n (virtual node 0 plus n houses).
Incorrect Implementation:
# Wrong: Only accounts for n houses, missing the virtual node
parent = list(range(n))
Correct Implementation:
# Correct: Includes space for nodes 0 to n
parent = list(range(n + 1))
3. Wrong Termination Condition
The algorithm needs exactly n edges to connect n+1 nodes (n houses + 1 virtual node). A common mistake is checking for n-1 edges or using the wrong counter.
Incorrect Implementation:
# Wrong: Checking for n-1 edges if edges_added == n - 1: return total_cost # Wrong: Using the component counter incorrectly components = n + 1 # Starting with n+1 components for source, destination, cost in pipes: if find(source) != find(destination): union(source, destination) components -= 1 if components == 1: # Wrong: should check if we've added n edges return total_cost
Correct Implementation:
# Correct: Need exactly n edges edges_added = 0 for source, destination, cost in pipes: if find(source) != find(destination): union(source, destination) edges_added += 1 total_cost += cost if edges_added == n: # Connected all n houses to virtual node return total_cost
4. Missing Path Compression in Find Operation
Without path compression, the find operation can degrade to O(n) time complexity in worst case, making the overall algorithm inefficient.
Incorrect Implementation:
# Wrong: No path compression
def find(x):
while parent[x] != x:
x = parent[x]
return x
Correct Implementation:
# Correct: With path compression
def find(x):
if parent[x] != x:
parent[x] = find(parent[x]) # Compress path
return parent[x]
5. Not Handling the Case When Input is Already Connected
While less common in this specific problem, if the pipes array is modified or pre-processed, ensure the algorithm handles cases where some houses might already be connected.
Solution: The Union-Find structure naturally handles this by checking if nodes are already in the same component before adding an edge, preventing redundant connections and incorrect cost calculations.
A heap is a ...?
Recommended Readings
Union Find Disjoint Set Union Data Structure Introduction Prerequisite Depth First Search Review problems dfs_intro Once we have a strong grasp of recursion and Depth First Search we can now introduce Disjoint Set Union DSU This data structure is motivated by the following problem Suppose we have sets of elements
https assets algo monster cover_photos graph svg Graph Fundamentals Tree with 0 cycle At this point you should be pretty familiar with trees A tree is a special kind of graph a connected acyclic cycle less graph A graph may contain cycle s and nodes could be disconnected A tree
Minimum Spanning Tree Forests The Umbristan Department of Forestry UDF is tackling a rather difficult problem and the Umbristan government has assigned you as one of its best workers to go resolve the issue When you arrive you are quickly briefed on the problem at hand Inside the Umbristan National Park there exists an
Want a Structured Path to Master System Design Too? Don’t Miss This!