1168. Optimize Water Distribution in a Village

Problem Description

This problem is about finding the most cost-effective way to supply water to a set of houses in a village using wells and pipes. Each house has two options for being supplied with water:

  1. Build a well inside it directly, with the cost varying for each house.
  2. Connect to another house that already has water supply via pipes, which might have different costs for different connections.

The houses and the options to supply water to them form a network where we need to choose the minimum cost strategy that connects all the houses to a water source. Since connections between houses can be done with pipes and are bidirectional, a house could potentially receive water from multiple other houses. The task is to calculate the minimum total cost that would supply water to all houses considering the cost of building wells and laying pipes.


The solution to the problem is to use a variation of the Union Find algorithm, which helps efficiently determine whether two elements are in the same group and also merge groups if needed. Here's how we drive towards the solution:

  • Building a Graph: Consider each house as a node, and each connection (whether a well or a pipe) as an edge with a certain cost. Now we are looking for the minimum spanning tree (MST) of this graph, which connects all nodes with the minimum total edge weight.

  • Adding Wells as Edges: Treat wells as special edges that connect each house to a virtual "water source" node (index 0). This allows us to use the same logic for connecting houses with pipes and building wells inside houses.

  • Sorting by Cost: Sort all the edges (both well and pipe connections) by their cost. This is because in Kruskal's MST algorithm, we consider the lowest cost edges first to ensure we are building the MST.

  • Union-Find to Connect Houses: Initialize every house as its own group. For each edge in sorted order, if the two houses at the ends of the edge are not already in the same group, then this edge would be part of the MST and we should "union" the houses (i.e., connect them), and add the cost of this edge to the answer. Repeat this step until all houses have been connected.

  • Optimization Stop Condition: If we connect all the houses before going through all the connections, we stop the process early as we've already formed an MST that connects all the houses which are represented by separate groups being unified into a single group.

By continuously unioning the groups with the least costly edge available and halting when all houses are connected, we guarantee that the minimum total cost is achieved for supplying water to all houses.

Solution Approach

The provided solution uses the Union Find data structure to determine the minimum cost to supply water to all houses, and it is particularly suited to problems involving group connectivity such as this one. Here's a step-by-step walkthrough of the implementation:

  1. Union Find Initialization: We create a parent list, p, where each index represents a house and stores the "parent" of that house. Initially, each house is its own parent, indicative of each being its own separate group.

  2. Define find Function: A helper find function is defined to find the root parent of any given house. This function is recursive and employs path compression by directly connecting each node to its root parent, thereby flattening the structure for efficiency.

    1def find(x: int) -> int:
    2    if p[x] != x:
    3        p[x] = find(p[x])
    4    return p[x]
  3. Incorporating Wells: Wells are treated as edges connecting the virtual node 0 to each house, with the cost to build the well as the edge weight. These wells are then appended to the pipes list so they can be considered alongside actual pipes with enumerate starting at index 1, owing to the 0-based indexing.

    1for i, w in enumerate(wells, 1):
    2    pipes.append([0, i, w])
  4. Sorting: All edges, which now include both wells and pipes, are sorted in ascending order based on their cost. This is the initial step in Kruskal's algorithm to ensure that we're always considering the lowest cost connection first.

    1pipes.sort(key=lambda x: x[2])
  5. Iterating Over Connections: We iterate over all these connections. For each connection consisting of two houses (or one house and the virtual node 0) and the cost, check if they belong to different groups.

    1for i, j, c in pipes:
    2    pa, pb = find(i), find(j)
  6. Union and Add Cost: If the houses belong to different groups, we union them by setting the parent of one group to be the parent of the other (effectively merging the groups), and add the cost of this connection to the running total.

    1if pa != pb:
    2    p[pa] = pb
    3    ans += c
  7. Early Stopping Condition: If the number of separate groups (n) reaches zero, all houses have been connected and we can break out of the loop to avoid unnecessary work.

    1n -= 1
    2if n == 0:
    3    break

In the end, the accumulated cost ans represents the minimum cost to connect all houses with water, which solves the problem.

Level Up Your
Algo Skills

Example Walkthrough

Let's illustrate the solution approach with a simple example. Suppose we have a village with 3 houses and the following costs to build wells inside the houses: [1, 2, 2]. There are also pipes connecting the houses with the following costs: [[1, 2, 1], [2, 3, 1]], where each connection is represented as [house1, house2, cost].

Step-by-Step Solution

  1. Union Find Initialization:

    Create a parent list p where p = [0, 1, 2, 3] representing that house 1 is its own parent, house 2 is its own parent, and so on.

  2. Define find Function:

    We have the find function ready which will find the root parent of a given house with path compression.

  3. Incorporating Wells:

    We add well costs as edges to the virtual node 0.

    Initial pipes list: [[1, 2, 1], [2, 3, 1]]

    Adding wells as edges: [[0, 1, 1], [0, 2, 2], [0, 3, 2]]

    Updated pipes list: [[0, 1, 1], [0, 2, 2], [0, 3, 2], [1, 2, 1], [2, 3, 1]]

  4. Sorting:

    Sort the connections by cost: [[1, 2, 1], [2, 3, 1], [0, 1, 1], [0, 2, 2], [0, 3, 2]]

  5. Iterating Over Connections:

    Iterate over the sorted connections and use the find function to check if the connection is between two different groups.

  6. Union and Add Cost:

    For the first pipe [1, 2, 1], since house 1 and house 2 are in different groups (find(1) is 1 and find(2) is 2), connect them and update the answer ans = 1.

    Now, for the second pipe [2, 3, 1], house 2 and house 3 are also in different groups (find(2) would now give us 1, but find(3) is still 3), connect them as well and update the answer ans = 2.

    Next, we consider the well [0, 1, 1]. However, since houses 1, 2, and 3 are already connected through pipes, we no longer need to build this well. We would continue checking the remaining connections, but they are not necessary as all houses are connected.

  7. Early Stopping Condition:

    Once all houses are connected (n becomes 0), we stop the process. In this case, after the n is reduced from 3 to 1, we don't need any more connections. Thus, we don't need wells for house 2 or house 3, and we can stop iterating further.

That is how we would achieve the minimum cost (ans = 2) to supply water to all houses in our example by connecting house 1 to house 2, and then house 2 to house 3 with pipes, completely avoiding the need to build additional wells.

Python Solution

1# Import typing List for type annotations
2from typing import List
4class Solution:
5    def minCostToSupplyWater(self, num_houses: int, cost_of_well: List[int], pipes: List[List[int]]) -> int:
6        # Helper function find used for finding the root of an element
7        def find(root: int) -> int:
8            if parents[root] != root:
9                parents[root] = find(parents[root])  # Path compression
10            return parents[root]
12        # Adding the cost of building a well at each house as a pipe from house 0 (imaginary).
13        for house_index, well_cost in enumerate(cost_of_well, 1):
14            pipes.append([0, house_index, well_cost])
16        # Sorting the pipes by their cost in ascending order
17        pipes.sort(key=lambda x: x[2])
19        # Initialize parents list, where each element is its own root at the beginning.
20        parents = list(range(num_houses + 1))
22        total_cost = 0
23        for pipe_start, pipe_end, pipe_cost in pipes:
24            # Find the roots of the nodes the pipe connects
25            root_a, root_b = find(pipe_start), find(pipe_end)
27            # If the roots are the same, nodes are already connected, so we skip this pipe.
28            if root_a == root_b:
29                continue
31            # Union: connect these nodes
32            parents[root_a] = root_b
33            total_cost += pipe_cost  # Add cost to the answer
35            # Decrement the number of houses that need water connection
36            num_houses -= 1
37            # If all houses are connected, exit loop early
38            if num_houses == 0:
39                break
41        # Return the minimum cost to supply water to all houses
42        return total_cost

Java Solution

1class Solution {
2    private int[] parent; // Array to store the parent of each element in the disjoint-set.
4    // Method to calculate the minimum cost to supply water to all houses.
5    public int minCostToSupplyWater(int n, int[] wells, int[][] pipes) {
6        // Create an array to store all connections with their costs.
7        int[][] connections = new int[n + pipes.length][];
8        int index = 0; // Index to track the current position in the connections array.
10        // Add all the existing pipes to the connections array.
11        for (int[] pipe : pipes) {
12            connections[index++] = pipe;
13        }
15        // Add connections of wells to the network. Treat the well as a pipe from node 0 to the house.
16        for (int i = 0; i < n; ++i) {
17            connections[index++] = new int[]{0, i + 1, wells[i]};
18        }
20        // Sort the connections array by cost.
21        Arrays.sort(connections, (a, b) -> a[2] - b[2]);
23        // Initialize the parent array for disjoint-set operations.
24        parent = new int[n + 1];
25        for (int i = 1; i <= n; ++i) {
26            parent[i] = i;
27        }
29        int answer = 0; // Variable to store the total cost.
31        // Iterate over all connections.
32        for (int[] connection : connections) {
33            int parentA = find(connection[0]); // Find the root of the first node.
34            int parentB = find(connection[1]); // Find the root of the second node.
36            // If both nodes have the same root, they are already connected.
37            if (parentA == parentB) {
38                continue;
39            }
41            // If they are not connected, connect them and add the cost.
42            answer += connection[2];
43            parent[parentA] = parentB;
45            // If all nodes are connected, break out of the loop.
46            if (--n == 0) {
47                break;
48            }
49        }
51        // Return the total cost.
52        return answer;
53    }
55    // Method to find the root parent of a node using path compression.
56    private int find(int x) {
57        if (parent[x] != x) {
58            parent[x] = find(parent[x]); // Path compression by halving.
59        }
60        return parent[x];
61    }

C++ Solution

1#include <vector>    // Include necessary header for vectors
2#include <numeric>   // Include necessary header for iota
3#include <algorithm> // Include necessary header for sort
4#include <functional>// Include necessary header for function
6using namespace std;
8class Solution {
10    int minCostToSupplyWater(int n, vector<int>& wells, vector<vector<int>>& pipes) {
11        // Add the cost of drilling a well at each house as a pipe from source 0
12        for (int i = 0; i < n; ++i) {
13            pipes.push_back({0, i + 1, wells[i]});
14        }
16        // Sort the pipes based on their cost in ascending order
17        sort(pipes.begin(), pipes.end(), [](const vector<int>& a, const vector<int>& b) {
18            return a[2] < b[2];
19        });
21        // Parent array for union-find
22        int parents[n + 1];
24        // Initialize each element's parent to itself
25        iota(parents, parents + n + 1, 0);
27        // Find function using path compression for efficient union-find
28        function<int(int)> find = [&](int x) {
29            if (parents[x] != x) {
30                parents[x] = find(parents[x]);
31            }
32            return parents[x];
33        };
35        // Total cost to supply water to all houses
36        int totalCost = 0;
38        // Iterate over all pipes/edges
39        for (const auto& edge : pipes) {
40            int parentA = find(edge[0]);
41            int parentB = find(edge[1]);
43            // If the nodes are already connected, ignore this pipe
44            if (parentA == parentB) {
45                continue;
46            }
48            // Connect the components and add the cost of this pipe
49            parents[parentA] = parentB;
50            totalCost += edge[2];
52            // If all houses are connected, break early
53            if (--n == 0) {
54                break;
55            }
56        }
57        return totalCost;
58    }

Typescript Solution

1function minCostToSupplyWater(houseCount: number, wellCosts: number[], pipeCosts: number[][]): number {
2  // Add the cost of connecting the wells to the houses to the list of pipe costs.
3  for (let i = 0; i < houseCount; ++i) {
4    pipeCosts.push([0, i + 1, wellCosts[i]]);
5  }
7  // Sort the pipe costs in ascending order based on their cost.
8  pipeCosts.sort((a, b) => a[2] - b[2]);
10  // Initialize a parent array for Union-Find, pointing each node to itself at the start.
11  const parent = new Array(houseCount + 1).fill(0).map((_, i) => i);
13  // Define the find function for Union-Find to find the root parent of a node.
14  const findRoot = (node: number): number => {
15    if (parent[node] !== node) {
16      parent[node] = findRoot(parent[node]);
17    }
18    return parent[node];
19  };
21  // Variable to track the total minimum cost to supply water to all houses.
22  let totalCost = 0;
24  // Iterate through all pipeCosts to connect the houses and find the minimum cost.
25  for (const [houseA, houseB, cost] of pipeCosts) {
26    const rootA = findRoot(houseA);
27    const rootB = findRoot(houseB);
29    // If the roots are the same, the houses are already connected; skip to the next pipe.
30    if (rootA === rootB) {
31      continue;
32    }
34    // Connect the two components and add the cost of the connection to the total cost.
35    parent[rootA] = rootB;
36    totalCost += cost;
38    // When we have connected all houses we can exit the loop early.
39    if (--houseCount === 0) {
40      break;
41    }
42  }
44  // Return the total minimum cost to supply water to all houses.
45  return totalCost;

Time and Space Complexity

Time Complexity

The time complexity of the code can be broken down into the following parts:

  1. Initial Sorting of pipes: The list of pipes is sorted based on the cost associated with connecting them, which has a time complexity of O(E*log(E)), where E is the number of pipes.

  2. Initialization of the array p: Initializing the array p to keep track of the parent nodes during Union-Find operations has a time complexity of O(N), where N is the number of wells or vertices.

  3. Union-Find Operations: For each edge in the sorted pipes list, the 'find' function is called twice (once for each vertex) and possibly a union operation. The time complexity for each find operation can be considered as O(1) on average due to path compression. Therefore, all find and union operations together would have a time complexity of O(E).

The overall time complexity would be dominated by the sorting step, hence the time complexity is O(E*log(E) + N).

Space Complexity

The space complexity of the code includes:

  1. The array p: The array p has a space complexity of O(N) as it contains N+1 elements to represent each node and the virtual node '0'.

  2. The updated pipes list: The pipes list is updated to include the wells, increasing its length to O(E + N).

Since the array p does not grow with the addition of the wells to the pipes list, the space complexity is primarily dependent on the size of the updated pipes list. Therefore, the overall space complexity is O(E + N).

Become an
Algo Monster

Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.

TA 👨‍🏫