1168. Optimize Water Distribution in a Village
Problem Description
This problem is about finding the most cost-effective way to supply water to a set of houses in a village using wells and pipes. Each house has two options for being supplied with water:
- Build a well inside it directly, with the cost varying for each house.
- Connect to another house that already has water supply via pipes, which might have different costs for different connections.
The houses and the options to supply water to them form a network where we need to choose the minimum cost strategy that connects all the houses to a water source. Since connections between houses can be done with pipes and are bidirectional, a house could potentially receive water from multiple other houses. The task is to calculate the minimum total cost that would supply water to all houses considering the cost of building wells and laying pipes.
Intuition
The solution to the problem is to use a variation of the Union Find algorithm, which helps efficiently determine whether two elements are in the same group and also merge groups if needed. Here's how we drive towards the solution:
-
Building a Graph: Consider each house as a node, and each connection (whether a well or a pipe) as an edge with a certain cost. Now we are looking for the minimum spanning tree (MST) of this graph, which connects all nodes with the minimum total edge weight.
-
Adding Wells as Edges: Treat wells as special edges that connect each house to a virtual "water source" node (index 0). This allows us to use the same logic for connecting houses with pipes and building wells inside houses.
-
Sorting by Cost: Sort all the edges (both well and pipe connections) by their cost. This is because in Kruskal's MST algorithm, we consider the lowest cost edges first to ensure we are building the MST.
-
Union-Find to Connect Houses: Initialize every house as its own group. For each edge in sorted order, if the two houses at the ends of the edge are not already in the same group, then this edge would be part of the MST and we should "union" the houses (i.e., connect them), and add the cost of this edge to the answer. Repeat this step until all houses have been connected.
-
Optimization Stop Condition: If we connect all the houses before going through all the connections, we stop the process early as we've already formed an MST that connects all the houses which are represented by separate groups being unified into a single group.
By continuously unioning the groups with the least costly edge available and halting when all houses are connected, we guarantee that the minimum total cost is achieved for supplying water to all houses.
Learn more about Union Find, Graph and Minimum Spanning Tree patterns.
Solution Approach
The provided solution uses the Union Find data structure to determine the minimum cost to supply water to all houses, and it is particularly suited to problems involving group connectivity such as this one. Here's a step-by-step walkthrough of the implementation:
-
Union Find Initialization: We create a parent list,
p
, where each index represents a house and stores the "parent" of that house. Initially, each house is its own parent, indicative of each being its own separate group. -
Define
find
Function: A helperfind
function is defined to find the root parent of any given house. This function is recursive and employs path compression by directly connecting each node to its root parent, thereby flattening the structure for efficiency.1def find(x: int) -> int: 2 if p[x] != x: 3 p[x] = find(p[x]) 4 return p[x]
-
Incorporating Wells: Wells are treated as edges connecting the virtual node 0 to each house, with the cost to build the well as the edge weight. These wells are then appended to the
pipes
list so they can be considered alongside actual pipes withenumerate
starting at index 1, owing to the 0-based indexing.1for i, w in enumerate(wells, 1): 2 pipes.append([0, i, w])
-
Sorting: All edges, which now include both wells and pipes, are sorted in ascending order based on their cost. This is the initial step in Kruskal's algorithm to ensure that we're always considering the lowest cost connection first.
1pipes.sort(key=lambda x: x[2])
-
Iterating Over Connections: We iterate over all these connections. For each connection consisting of two houses (or one house and the virtual node 0) and the cost, check if they belong to different groups.
1for i, j, c in pipes: 2 pa, pb = find(i), find(j)
-
Union and Add Cost: If the houses belong to different groups, we union them by setting the parent of one group to be the parent of the other (effectively merging the groups), and add the cost of this connection to the running total.
1if pa != pb: 2 p[pa] = pb 3 ans += c
-
Early Stopping Condition: If the number of separate groups (
n
) reaches zero, all houses have been connected and we can break out of the loop to avoid unnecessary work.1n -= 1 2if n == 0: 3 break
In the end, the accumulated cost ans
represents the minimum cost to connect all houses with water, which solves the problem.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a simple example. Suppose we have a village with 3 houses and the following costs to build wells inside the houses: [1, 2, 2]
. There are also pipes connecting the houses with the following costs: [[1, 2, 1], [2, 3, 1]]
, where each connection is represented as [house1, house2, cost]
.
Step-by-Step Solution
-
Union Find Initialization:
Create a parent list
p
wherep = [0, 1, 2, 3]
representing that house 1 is its own parent, house 2 is its own parent, and so on. -
Define
find
Function:We have the
find
function ready which will find the root parent of a given house with path compression. -
Incorporating Wells:
We add well costs as edges to the virtual node 0.
Initial pipes list:
[[1, 2, 1], [2, 3, 1]]
Adding wells as edges:
[[0, 1, 1], [0, 2, 2], [0, 3, 2]]
Updated pipes list:
[[0, 1, 1], [0, 2, 2], [0, 3, 2], [1, 2, 1], [2, 3, 1]]
-
Sorting:
Sort the connections by cost:
[[1, 2, 1], [2, 3, 1], [0, 1, 1], [0, 2, 2], [0, 3, 2]]
-
Iterating Over Connections:
Iterate over the sorted connections and use the
find
function to check if the connection is between two different groups. -
Union and Add Cost:
For the first pipe
[1, 2, 1]
, since house 1 and house 2 are in different groups (find(1)
is 1 andfind(2)
is 2), connect them and update the answerans = 1
.Now, for the second pipe
[2, 3, 1]
, house 2 and house 3 are also in different groups (find(2)
would now give us 1, butfind(3)
is still 3), connect them as well and update the answerans = 2
.Next, we consider the well
[0, 1, 1]
. However, since houses 1, 2, and 3 are already connected through pipes, we no longer need to build this well. We would continue checking the remaining connections, but they are not necessary as all houses are connected. -
Early Stopping Condition:
Once all houses are connected (
n
becomes 0), we stop the process. In this case, after then
is reduced from 3 to 1, we don't need any more connections. Thus, we don't need wells for house 2 or house 3, and we can stop iterating further.
That is how we would achieve the minimum cost (ans = 2
) to supply water to all houses in our example by connecting house 1 to house 2, and then house 2 to house 3 with pipes, completely avoiding the need to build additional wells.
Solution Implementation
1# Import typing List for type annotations
2from typing import List
3
4class Solution:
5 def minCostToSupplyWater(self, num_houses: int, cost_of_well: List[int], pipes: List[List[int]]) -> int:
6 # Helper function find used for finding the root of an element
7 def find(root: int) -> int:
8 if parents[root] != root:
9 parents[root] = find(parents[root]) # Path compression
10 return parents[root]
11
12 # Adding the cost of building a well at each house as a pipe from house 0 (imaginary).
13 for house_index, well_cost in enumerate(cost_of_well, 1):
14 pipes.append([0, house_index, well_cost])
15
16 # Sorting the pipes by their cost in ascending order
17 pipes.sort(key=lambda x: x[2])
18
19 # Initialize parents list, where each element is its own root at the beginning.
20 parents = list(range(num_houses + 1))
21
22 total_cost = 0
23 for pipe_start, pipe_end, pipe_cost in pipes:
24 # Find the roots of the nodes the pipe connects
25 root_a, root_b = find(pipe_start), find(pipe_end)
26
27 # If the roots are the same, nodes are already connected, so we skip this pipe.
28 if root_a == root_b:
29 continue
30
31 # Union: connect these nodes
32 parents[root_a] = root_b
33 total_cost += pipe_cost # Add cost to the answer
34
35 # Decrement the number of houses that need water connection
36 num_houses -= 1
37 # If all houses are connected, exit loop early
38 if num_houses == 0:
39 break
40
41 # Return the minimum cost to supply water to all houses
42 return total_cost
43
1class Solution {
2 private int[] parent; // Array to store the parent of each element in the disjoint-set.
3
4 // Method to calculate the minimum cost to supply water to all houses.
5 public int minCostToSupplyWater(int n, int[] wells, int[][] pipes) {
6 // Create an array to store all connections with their costs.
7 int[][] connections = new int[n + pipes.length][];
8 int index = 0; // Index to track the current position in the connections array.
9
10 // Add all the existing pipes to the connections array.
11 for (int[] pipe : pipes) {
12 connections[index++] = pipe;
13 }
14
15 // Add connections of wells to the network. Treat the well as a pipe from node 0 to the house.
16 for (int i = 0; i < n; ++i) {
17 connections[index++] = new int[]{0, i + 1, wells[i]};
18 }
19
20 // Sort the connections array by cost.
21 Arrays.sort(connections, (a, b) -> a[2] - b[2]);
22
23 // Initialize the parent array for disjoint-set operations.
24 parent = new int[n + 1];
25 for (int i = 1; i <= n; ++i) {
26 parent[i] = i;
27 }
28
29 int answer = 0; // Variable to store the total cost.
30
31 // Iterate over all connections.
32 for (int[] connection : connections) {
33 int parentA = find(connection[0]); // Find the root of the first node.
34 int parentB = find(connection[1]); // Find the root of the second node.
35
36 // If both nodes have the same root, they are already connected.
37 if (parentA == parentB) {
38 continue;
39 }
40
41 // If they are not connected, connect them and add the cost.
42 answer += connection[2];
43 parent[parentA] = parentB;
44
45 // If all nodes are connected, break out of the loop.
46 if (--n == 0) {
47 break;
48 }
49 }
50
51 // Return the total cost.
52 return answer;
53 }
54
55 // Method to find the root parent of a node using path compression.
56 private int find(int x) {
57 if (parent[x] != x) {
58 parent[x] = find(parent[x]); // Path compression by halving.
59 }
60 return parent[x];
61 }
62}
63
1#include <vector> // Include necessary header for vectors
2#include <numeric> // Include necessary header for iota
3#include <algorithm> // Include necessary header for sort
4#include <functional>// Include necessary header for function
5
6using namespace std;
7
8class Solution {
9public:
10 int minCostToSupplyWater(int n, vector<int>& wells, vector<vector<int>>& pipes) {
11 // Add the cost of drilling a well at each house as a pipe from source 0
12 for (int i = 0; i < n; ++i) {
13 pipes.push_back({0, i + 1, wells[i]});
14 }
15
16 // Sort the pipes based on their cost in ascending order
17 sort(pipes.begin(), pipes.end(), [](const vector<int>& a, const vector<int>& b) {
18 return a[2] < b[2];
19 });
20
21 // Parent array for union-find
22 int parents[n + 1];
23
24 // Initialize each element's parent to itself
25 iota(parents, parents + n + 1, 0);
26
27 // Find function using path compression for efficient union-find
28 function<int(int)> find = [&](int x) {
29 if (parents[x] != x) {
30 parents[x] = find(parents[x]);
31 }
32 return parents[x];
33 };
34
35 // Total cost to supply water to all houses
36 int totalCost = 0;
37
38 // Iterate over all pipes/edges
39 for (const auto& edge : pipes) {
40 int parentA = find(edge[0]);
41 int parentB = find(edge[1]);
42
43 // If the nodes are already connected, ignore this pipe
44 if (parentA == parentB) {
45 continue;
46 }
47
48 // Connect the components and add the cost of this pipe
49 parents[parentA] = parentB;
50 totalCost += edge[2];
51
52 // If all houses are connected, break early
53 if (--n == 0) {
54 break;
55 }
56 }
57 return totalCost;
58 }
59};
60
1function minCostToSupplyWater(houseCount: number, wellCosts: number[], pipeCosts: number[][]): number {
2 // Add the cost of connecting the wells to the houses to the list of pipe costs.
3 for (let i = 0; i < houseCount; ++i) {
4 pipeCosts.push([0, i + 1, wellCosts[i]]);
5 }
6
7 // Sort the pipe costs in ascending order based on their cost.
8 pipeCosts.sort((a, b) => a[2] - b[2]);
9
10 // Initialize a parent array for Union-Find, pointing each node to itself at the start.
11 const parent = new Array(houseCount + 1).fill(0).map((_, i) => i);
12
13 // Define the find function for Union-Find to find the root parent of a node.
14 const findRoot = (node: number): number => {
15 if (parent[node] !== node) {
16 parent[node] = findRoot(parent[node]);
17 }
18 return parent[node];
19 };
20
21 // Variable to track the total minimum cost to supply water to all houses.
22 let totalCost = 0;
23
24 // Iterate through all pipeCosts to connect the houses and find the minimum cost.
25 for (const [houseA, houseB, cost] of pipeCosts) {
26 const rootA = findRoot(houseA);
27 const rootB = findRoot(houseB);
28
29 // If the roots are the same, the houses are already connected; skip to the next pipe.
30 if (rootA === rootB) {
31 continue;
32 }
33
34 // Connect the two components and add the cost of the connection to the total cost.
35 parent[rootA] = rootB;
36 totalCost += cost;
37
38 // When we have connected all houses we can exit the loop early.
39 if (--houseCount === 0) {
40 break;
41 }
42 }
43
44 // Return the total minimum cost to supply water to all houses.
45 return totalCost;
46}
47
Time and Space Complexity
Time Complexity
The time complexity of the code can be broken down into the following parts:
-
Initial Sorting of pipes: The list of pipes is sorted based on the cost associated with connecting them, which has a time complexity of
O(E*log(E))
, whereE
is the number of pipes. -
Initialization of the array p: Initializing the array
p
to keep track of the parent nodes during Union-Find operations has a time complexity ofO(N)
, whereN
is the number of wells or vertices. -
Union-Find Operations: For each edge in the sorted pipes list, the 'find' function is called twice (once for each vertex) and possibly a union operation. The time complexity for each find operation can be considered as
O(1)
on average due to path compression. Therefore, all find and union operations together would have a time complexity ofO(E)
.
The overall time complexity would be dominated by the sorting step, hence the time complexity is O(E*log(E) + N)
.
Space Complexity
The space complexity of the code includes:
-
The array p: The array
p
has a space complexity ofO(N)
as it containsN+1
elements to represent each node and the virtual node '0'. -
The updated pipes list: The pipes list is updated to include the wells, increasing its length to
O(E + N)
.
Since the array p
does not grow with the addition of the wells to the pipes list, the space complexity is primarily dependent on the size of the updated pipes list. Therefore, the overall space complexity is O(E + N)
.
Learn more about how to find time and space complexity quickly using problem constraints.
What data structure does Breadth-first search typically uses to store intermediate states?
Recommended Readings
Union Find Disjoint Set Union Data Structure Introduction Prerequisite Depth First Search Review problems dfs_intro Once we have a strong grasp of recursion and Depth First Search we can now introduce Disjoint Set Union DSU This data structure is motivated by the following problem Suppose we have sets of elements
https algomonster s3 us east 2 amazonaws com cover_photos graph svg Graph Fundamentals Tree with 0 cycle At this point you should be pretty familiar with trees A tree is a special kind of graph a connected acyclic cycle less graph A graph may contain cycle s and nodes could
Minimum Spanning Tree Forests The Umbristan Department of Forestry UDF is tackling a rather difficult problem and the Umbristan government has assigned you as one of its best workers to go resolve the issue When you arrive you are quickly briefed on the problem at hand Inside the Umbristan National Park there exists an