Facebook Pixel

1168. Optimize Water Distribution in a Village

Problem Description

You have n houses in a village that need water supply. To provide water to all houses, you can either build wells directly in houses or connect houses with pipes to share water from existing wells.

For each house i, you have two options:

  • Build a well directly inside it at a cost of wells[i - 1] (using 0-based indexing)
  • Connect it to another house that has water supply using pipes

The pipe connections are described in the pipes array, where each element pipes[j] = [house1_j, house2_j, cost_j] represents:

  • A bidirectional pipe can be built between house1_j and house2_j
  • The cost to build this pipe is cost_j
  • Multiple pipes with different costs can exist between the same two houses

Your goal is to find the minimum total cost to ensure all houses have water supply. Houses can get water either from their own well or through pipe connections to other houses that have water (directly from a well or indirectly through other pipes).

The solution transforms this problem into a minimum spanning tree problem by creating a virtual node (node 0) representing a "super well". Each house's well-building cost becomes an edge weight between that house and node 0. The algorithm then uses Kruskal's approach with Union-Find to find the minimum cost to connect all houses to a water source.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is recognizing that this is actually a graph connectivity problem in disguise. Each house needs to be connected to a water source, either directly (by building a well) or indirectly (through pipes to other houses with water).

Think of it this way: if we build a well in a house, we're essentially "connecting" that house to a water source. If we lay pipes between houses, we're creating paths for water to flow. The challenge is that wells can be built in multiple houses, making it unclear which houses should have wells and which should rely on pipes.

The breakthrough comes from creating a virtual "super source" node (node 0) that represents all possible wells. Now, instead of deciding whether to build a well in house i, we can model it as connecting house i to this super source with an edge weight equal to wells[i-1]. This transforms the problem elegantly:

  • Building a well in house i → Connecting house i to node 0 with cost wells[i-1]
  • Laying a pipe between houses → Connecting two house nodes with the pipe cost
  • Ensuring all houses have water → Ensuring all house nodes are connected to node 0

With this transformation, the problem becomes finding the minimum cost to connect all nodes in a graph - a classic Minimum Spanning Tree (MST) problem! We need exactly n edges to connect n+1 nodes (n houses + 1 super source), and we want the minimum total cost.

Using Kruskal's algorithm with Union-Find is natural here: we sort all edges by cost and greedily pick the cheapest edges that connect different components until everything is connected. The beauty is that the algorithm automatically decides which houses get wells (direct edges to node 0) and which houses share water through pipes (edges between house nodes).

Learn more about Union Find, Graph, Minimum Spanning Tree and Heap (Priority Queue) patterns.

Solution Approach

The solution implements Kruskal's algorithm with Union-Find data structure to find the minimum spanning tree. Here's how the implementation works:

Step 1: Transform the Problem

for i, w in enumerate(wells, 1):
    pipes.append([0, i, w])

We add virtual edges from node 0 (super source) to each house i with cost wells[i-1]. This transforms building a well into a graph edge problem.

Step 2: Sort Edges by Cost

pipes.sort(key=lambda x: x[2])

Kruskal's algorithm requires edges sorted by weight in ascending order. This ensures we consider cheaper connections first.

Step 3: Initialize Union-Find

p = list(range(n + 1))

Create a parent array where p[i] = i initially, meaning each node (0 to n) is its own parent. Node 0 represents the super source, nodes 1 to n represent houses.

Step 4: Find Operation with Path Compression

def find(x: int) -> int:
    if p[x] != x:
        p[x] = find(p[x])  # Path compression
    return p[x]

This finds the root of the component containing node x. Path compression optimization makes subsequent finds faster by directly connecting nodes to their root.

Step 5: Process Edges Using Union-Find

ans = 0
for a, b, c in pipes:
    pa, pb = find(a), find(b)
    if pa != pb:
        p[pa] = pb  # Union operation
        n -= 1      # One less component
        ans += c    # Add edge cost
        if n == 0:
            return ans

For each edge (a, b, cost):

  • Find the roots of both endpoints
  • If they're in different components (pa != pb), merge them:
    • Union: Set one root as parent of the other
    • Decrement n (number of components remaining)
    • Add the edge cost to our answer
    • If we've connected all houses (n == 0), we're done

The algorithm greedily selects minimum-cost edges that connect different components. Since we need exactly n edges to connect n+1 nodes into one component, we stop when n reaches 0. The total cost accumulated is the minimum cost to supply water to all houses.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example to illustrate the solution approach.

Given:

  • 3 houses in the village
  • wells = [1, 2, 2] (costs to build wells in houses 1, 2, 3)
  • pipes = [[1, 2, 1], [2, 3, 1]] (pipe connections with costs)

Step 1: Transform the Problem

First, we create a virtual node 0 (super source) and add edges from it to each house:

  • Edge (0, 1) with cost 1 (well in house 1)
  • Edge (0, 2) with cost 2 (well in house 2)
  • Edge (0, 3) with cost 2 (well in house 3)

Our complete edge list becomes:

pipes = [[1, 2, 1], [2, 3, 1], [0, 1, 1], [0, 2, 2], [0, 3, 2]]

Step 2: Sort Edges by Cost

After sorting by cost (third element):

pipes = [[1, 2, 1], [2, 3, 1], [0, 1, 1], [0, 2, 2], [0, 3, 2]]

Step 3: Initialize Union-Find

Parent array: p = [0, 1, 2, 3]

  • Each node is its own parent initially
  • We have 4 components (nodes 0, 1, 2, 3)
  • Counter n = 3 (we need to connect 3 houses to water)

Step 4: Process Edges with Kruskal's Algorithm

Edge 1: [1, 2, 1]

  • Find parents: find(1) = 1, find(2) = 2
  • Different components, so union them: p[1] = 2
  • Add cost: ans = 0 + 1 = 1
  • Decrement counter: n = 2
  • State: Houses 1 and 2 are connected

Edge 2: [2, 3, 1]

  • Find parents: find(2) = 2, find(3) = 3
  • Different components, so union them: p[2] = 3
  • Add cost: ans = 1 + 1 = 2
  • Decrement counter: n = 1
  • State: Houses 1, 2, and 3 are all connected

Edge 3: [0, 1, 1]

  • Find parents: find(0) = 0, find(1) = 3 (through path compression)
  • Different components, so union them: p[0] = 3
  • Add cost: ans = 2 + 1 = 3
  • Decrement counter: n = 0
  • All houses connected to water source! Return ans = 3

Result Interpretation:

The minimum cost is 3, achieved by:

  • Building a well in house 1 (edge 0→1, cost 1)
  • Connecting house 1 to house 2 with a pipe (cost 1)
  • Connecting house 2 to house 3 with a pipe (cost 1)

This gives all houses access to water at minimum total cost of 3.

Solution Implementation

1class Solution:
2    def minCostToSupplyWater(
3        self, n: int, wells: List[int], pipes: List[List[int]]
4    ) -> int:
5        # Union-Find helper function to find the root of a node
6        def find_root(node: int) -> int:
7            # Path compression: make each node point directly to root
8            if parent[node] != node:
9                parent[node] = find_root(parent[node])
10            return parent[node]
11
12        # Add wells as edges from virtual node 0 to each house
13        # wells[i-1] is the cost to build a well at house i
14        for house_id, well_cost in enumerate(wells, start=1):
15            pipes.append([0, house_id, well_cost])
16
17        # Sort all edges (pipes + wells) by cost for Kruskal's algorithm
18        pipes.sort(key=lambda edge: edge[2])
19
20        # Initialize Union-Find parent array
21        # parent[i] represents the parent of node i
22        parent = list(range(n + 1))
23
24        total_cost = 0
25        edges_added = 0
26
27        # Process edges in ascending order of cost (Kruskal's algorithm)
28        for source, destination, cost in pipes:
29            root_source = find_root(source)
30            root_destination = find_root(destination)
31
32            # If nodes are in different components, unite them
33            if root_source != root_destination:
34                # Union: merge the two components
35                parent[root_source] = root_destination
36                edges_added += 1
37                total_cost += cost
38
39                # Early termination: we need exactly n edges to connect n+1 nodes
40                if edges_added == n:
41                    return total_cost
42
1class Solution {
2    // Parent array for Union-Find (Disjoint Set Union)
3    private int[] parent;
4
5    public int minCostToSupplyWater(int n, int[] wells, int[][] pipes) {
6        // Create a new edge array that includes both pipes and virtual edges from source (node 0) to each house
7        // The virtual edges represent the cost of building a well at each house
8        int[][] edges = Arrays.copyOf(pipes, pipes.length + n);
9
10        // Add virtual edges from node 0 (virtual source) to each house with cost = well cost
11        for (int i = 0; i < n; i++) {
12            edges[pipes.length + i] = new int[] {0, i + 1, wells[i]};
13        }
14
15        // Sort all edges by cost in ascending order (Kruskal's algorithm requirement)
16        Arrays.sort(edges, (a, b) -> a[2] - b[2]);
17
18        // Initialize parent array for Union-Find
19        // Index 0 represents the virtual source node, indices 1 to n represent houses
20        parent = new int[n + 1];
21        for (int i = 0; i <= n; i++) {
22            parent[i] = i;
23        }
24
25        int totalCost = 0;
26        int edgesUsed = 0;
27
28        // Process edges in order of increasing cost (Kruskal's algorithm)
29        for (int[] edge : edges) {
30            int nodeA = edge[0];
31            int nodeB = edge[1];
32            int cost = edge[2];
33
34            // Find the root parents of both nodes
35            int rootA = find(nodeA);
36            int rootB = find(nodeB);
37
38            // If nodes are in different components, connect them
39            if (rootA != rootB) {
40                // Union: merge the two components
41                parent[rootA] = rootB;
42                totalCost += cost;
43                edgesUsed++;
44
45                // Early termination: we need exactly n edges to connect n+1 nodes
46                if (edgesUsed == n) {
47                    return totalCost;
48                }
49            }
50        }
51
52        return totalCost;
53    }
54
55    // Find operation with path compression for Union-Find
56    private int find(int x) {
57        if (parent[x] != x) {
58            // Path compression: make every node point directly to root
59            parent[x] = find(parent[x]);
60        }
61        return parent[x];
62    }
63}
64
1class Solution {
2public:
3    int minCostToSupplyWater(int n, vector<int>& wells, vector<vector<int>>& pipes) {
4        // Add virtual edges from a virtual node 0 to each house
5        // The cost represents building a well at that house
6        for (int i = 0; i < n; ++i) {
7            pipes.push_back({0, i + 1, wells[i]});
8        }
9
10        // Sort all edges (pipes) by cost in ascending order
11        sort(pipes.begin(), pipes.end(), [](const vector<int>& a, const vector<int>& b) {
12            return a[2] < b[2];
13        });
14
15        // Initialize parent array for Union-Find
16        // Index 0 represents the virtual source node
17        vector<int> parent(n + 1);
18        iota(parent.begin(), parent.end(), 0);
19
20        // Find function with path compression
21        function<int(int)> find = [&](int x) {
22            if (parent[x] != x) {
23                parent[x] = find(parent[x]);  // Path compression
24            }
25            return parent[x];
26        };
27
28        // Kruskal's algorithm to find minimum spanning tree
29        int totalCost = 0;
30        int edgesUsed = 0;
31
32        for (const auto& edge : pipes) {
33            int house1 = edge[0];
34            int house2 = edge[1];
35            int cost = edge[2];
36
37            // Find the root parents of both houses
38            int root1 = find(house1);
39            int root2 = find(house2);
40
41            // Skip if already connected (would create a cycle)
42            if (root1 == root2) {
43                continue;
44            }
45
46            // Union the two components
47            parent[root1] = root2;
48            totalCost += cost;
49            edgesUsed++;
50
51            // Early termination: MST needs exactly n edges
52            // (n-1 edges to connect n houses + 1 edge to virtual node)
53            if (edgesUsed == n) {
54                break;
55            }
56        }
57
58        return totalCost;
59    }
60};
61
1/**
2 * Finds the minimum cost to supply water to all houses either by building wells or connecting pipes
3 * Uses Kruskal's algorithm with Union-Find to find the minimum spanning tree
4 *
5 * @param n - Number of houses
6 * @param wells - Cost to build a well at each house (wells[i] is cost for house i+1)
7 * @param pipes - Array of pipe connections [house1, house2, cost]
8 * @returns Minimum total cost to supply water to all houses
9 */
10function minCostToSupplyWater(n: number, wells: number[], pipes: number[][]): number {
11    // Create virtual node 0 and connect it to all houses with cost equal to well cost
12    // This transforms the problem into finding minimum spanning tree
13    for (let i = 0; i < n; ++i) {
14        pipes.push([0, i + 1, wells[i]]);
15    }
16
17    // Sort all edges (pipes) by cost in ascending order for Kruskal's algorithm
18    pipes.sort((a, b) => a[2] - b[2]);
19
20    // Initialize parent array for Union-Find data structure
21    // Each node is initially its own parent
22    const parent: number[] = Array(n + 1)
23        .fill(0)
24        .map((_, index) => index);
25
26    /**
27     * Find operation with path compression for Union-Find
28     * Finds the root parent of a node and compresses the path
29     *
30     * @param x - Node to find root parent for
31     * @returns Root parent of the node
32     */
33    const find = (x: number): number => {
34        if (parent[x] !== x) {
35            // Path compression: make every node point directly to root
36            parent[x] = find(parent[x]);
37        }
38        return parent[x];
39    };
40
41    let totalCost: number = 0;
42
43    // Process edges in ascending order of cost
44    for (const [houseA, houseB, cost] of pipes) {
45        const rootA: number = find(houseA);
46        const rootB: number = find(houseB);
47
48        // Skip if nodes are already in the same component (would create cycle)
49        if (rootA === rootB) {
50            continue;
51        }
52
53        // Union operation: merge two components
54        parent[rootA] = rootB;
55        totalCost += cost;
56
57        // Early termination: we need exactly n edges to connect n+1 nodes
58        if (--n === 0) {
59            break;
60        }
61    }
62
63    return totalCost;
64}
65

Time and Space Complexity

Time Complexity: O((m + n) × log(m + n))

The time complexity is dominated by the sorting operation. After adding all wells as virtual edges from node 0, we have m + n total edges (where m is the original number of pipes and n is the number of wells). Sorting these m + n edges takes O((m + n) × log(m + n)) time. The Union-Find operations (find and union) with path compression have an amortized time complexity of nearly O(1) per operation, and we perform at most m + n such operations, contributing O((m + n) × α(n)) where α is the inverse Ackermann function, which is effectively constant for practical purposes. Therefore, the overall time complexity is O((m + n) × log(m + n)).

Space Complexity: O(m + n)

The space complexity consists of several components:

  • The modified pipes array after adding well connections: O(m + n) space
  • The parent array p for Union-Find: O(n + 1) = O(n) space
  • The recursion stack for the find operation with path compression: O(n) in the worst case

Since we're modifying the input pipes array in-place by appending to it, the additional space used is O(n) for the parent array and potential recursion stack. However, considering the total space including the modified input, it becomes O(m + n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect House Indexing When Adding Wells

A critical pitfall is mishandling the 1-based vs 0-based indexing when adding wells as edges. The problem states houses are numbered 1 to n, but the wells array uses 0-based indexing.

Incorrect Implementation:

# Wrong: This creates edges from node 0 to nodes 0,1,2...n-1
for i in range(n):
    pipes.append([0, i, wells[i]])

Correct Implementation:

# Correct: Creates edges from node 0 to nodes 1,2,3...n
for i, w in enumerate(wells, 1):
    pipes.append([0, i, w])
# OR
for i in range(n):
    pipes.append([0, i + 1, wells[i]])

2. Forgetting to Account for the Virtual Node in Parent Array

The parent array must have size n + 1 to accommodate nodes 0 through n (virtual node 0 plus n houses).

Incorrect Implementation:

# Wrong: Only accounts for n houses, missing the virtual node
parent = list(range(n))

Correct Implementation:

# Correct: Includes space for nodes 0 to n
parent = list(range(n + 1))

3. Wrong Termination Condition

The algorithm needs exactly n edges to connect n+1 nodes (n houses + 1 virtual node). A common mistake is checking for n-1 edges or using the wrong counter.

Incorrect Implementation:

# Wrong: Checking for n-1 edges
if edges_added == n - 1:
    return total_cost

# Wrong: Using the component counter incorrectly
components = n + 1  # Starting with n+1 components
for source, destination, cost in pipes:
    if find(source) != find(destination):
        union(source, destination)
        components -= 1
        if components == 1:  # Wrong: should check if we've added n edges
            return total_cost

Correct Implementation:

# Correct: Need exactly n edges
edges_added = 0
for source, destination, cost in pipes:
    if find(source) != find(destination):
        union(source, destination)
        edges_added += 1
        total_cost += cost
        if edges_added == n:  # Connected all n houses to virtual node
            return total_cost

4. Missing Path Compression in Find Operation

Without path compression, the find operation can degrade to O(n) time complexity in worst case, making the overall algorithm inefficient.

Incorrect Implementation:

# Wrong: No path compression
def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

Correct Implementation:

# Correct: With path compression
def find(x):
    if parent[x] != x:
        parent[x] = find(parent[x])  # Compress path
    return parent[x]

5. Not Handling the Case When Input is Already Connected

While less common in this specific problem, if the pipes array is modified or pre-processed, ensure the algorithm handles cases where some houses might already be connected.

Solution: The Union-Find structure naturally handles this by checking if nodes are already in the same component before adding an edge, preventing redundant connections and incorrect cost calculations.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

A heap is a ...?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More