1135. Connecting Cities With Minimum Cost
Problem Description
The given problem scenario presents a set of cities along with possible connections between them. Each city is uniquely identified by a numeric label ranging from 1
to n
, and the goal is to find the most cost-effective way to ensure that there is at least one path between every pair of cities.
We are provided with the total number of cities (n
) and an array connections
, where each element of connections
is a triplet consisting of two city labels and the cost of connecting those two cities (for example, [xi, yi, costi]
). The connections are bidirectional, meaning the cost is the same regardless of the direction of travel between the two cities.
The task is to determine the minimum total cost required to connect all cities in such a way that every city is reachable from every other city. If this task is impossible with the given connections, the function should return -1
.
Intuition
To solve this problem, we need to connect all cities while minimizing the total cost. This is a classic example of a Minimum Spanning Tree (MST) problem which can be solved using algorithms like Kruskal's or Prim's algorithm.
Here, the Kruskal's algorithm is suitable because we start with edges (connections) and aim to create the MST by picking the shortest edges while avoiding cycles.
The intuition behind the solution is as follows:
-
Sorting Edges: We begin by sorting the connections by cost. This allows us to consider the cheapest connection available at any step, which is a core part of Kruskal's algorithm.
-
Union-Find Data Structure: We utilize a union-find data structure to keep track of which cities are already connected. This data structure helps us to efficiently find the root (representative) of each city and to unite two separate trees (cities that aren't yet connected).
-
Processing Connections: Iterate through each connection. Use the union-find data struture to determine whether the current connection will form a cycle. We only consider a connection if it connects two different trees (does not form a cycle).
-
Connecting Cities: When a valid connection is found (it doesn't form a cycle), we join the cities. The cost of this connection is added to the total minimum cost, and the number of disconnected components (
n
) is reduced by 1 because two components are now connected. -
Checking Early Termination: If at any moment the number of disconnected components becomes 1, it means we have connected all cities, and we can return the total minimum cost at this stage to improve efficiency.
-
Return Result: If we exit the loop and multiple disconnected components (
n
) are still present, this indicates it's impossible to connect all cities; hence, we return-1
.
The provided solution implements Kruskal's algorithm with a union-find data structure to connect cities in the least costly manner.
Learn more about Union Find, Graph, Minimum Spanning Tree and Heap (Priority Queue) patterns.
Solution Approach
The solution provided follows the Kruskal's algorithm approach for finding the minimum spanning tree (MST). Let's walk through the implementation details:
-
Union-Find Initialization: We initiate an array
p
which stands for the parent array, mostly used by union-find data structures. It's initialized such that each city is its own parent:p = list(range(n))
Here,
range(n)
is used instead ofrange(n+1)
because the cities are labeled from1
ton
, so an adjustment is made in the index while processing. -
Sorting the Connections: The connections are sorted in ascending order of their cost using the
sort
method with a lambda function specifying the sorting key:connections.sort(key=lambda x: x[2])
-
Find Function: This is a part of the union-find data structure to find the root parent of a city. The function is recursive, and it also performs path compression, updating the parent reference to the root parent, which saves time in subsequent find operations:
def find(x): if p[x] != x: p[x] = find(p[x]) return p[x]
-
Iterating Over Connections: For each connection
[x, y, cost]
, we first adjust the city labels to zero-based indexing (since the initial arrayp
was created with zero-based index):x, y = x - 1, y - 1
-
Checking and Union: For each connection, we check if both cities
x
andy
have the same root parent. If they do, we skip the connection as it would form a cycle:if find(x) == find(y): continue
Otherwise, we proceed to union (merge) the two components by setting one city's root parent to the other's, effectively connecting the two components:
p[find(x)] = find(y)
The cost of the connection is then added to the total cost
ans
:ans += cost
Following the union, we reduce
n
by 1. In this context,n
is being used to count the number of disconnected components, with the goal of having exactly 1 component once all cities are connected:n -= 1
-
Early Termination: If
n
becomes 1, that means all cities have been connected, and we can return the total cost at that point:if n == 1: return ans
-
Return -1 if Impossible: If the loop completes and there are still multiple disconnected components remaining (i.e.,
n
is not 1), we return-1
, indicating that it is not possible to connect all the cities with the given connections:return -1
This algorithm ensures all cities are connected with the minimum cost, or it returns -1
if they can't be connected with the given connections.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small example scenario. Suppose we have 4
cities with the following possible connections between them:
n = 4 connections = [[1, 2, 3], [3, 4, 4], [2, 3, 2], [1, 4, 5]]
Here's a step-by-step implementation of the provided solution approach:
-
Union-Find Initialization: We initiate the parent array with the cities being their own parent:
p = [0, 1, 2, 3] # Adjusted for zero-index based (subtracting 1 from each label)
-
Sorting the Connections: We sort the connections by their cost in ascending order:
connections.sort(key=lambda x: x[2]) # connections becomes [[2, 3, 2], [1, 2, 3], [3, 4, 4], [1, 4, 5]]
-
Processing Each Connection:
-
Look at the first connection
[2, 3, 2]
(zero-based indexing used:[1, 2, 2]
):- Both do not have the same root, apply union:
p[find(1)] = find(2) # p becomes [0, 2, 2, 3] ans += 2 n = 4 - 1 = 3
-
Next, connection
[1, 2, 3]
(zero-based indexing used:[0, 1, 3]
):- Roots are different, apply union:
p[find(0)] = find(1) # p becomes [2, 2, 2, 3] ans += 3 n = 3 - 1 = 2
-
Then, connection
[3, 4, 4]
(zero-based indexing:[2, 3, 4]
):- Roots are different, apply union:
p[find(2)] = find(3) # p becomes [2, 2, 3, 3] ans += 4 n = 2 - 1 = 1
After this operation, we have only one disconnected component because
n
becomes1
, so we don't need to process any more connections. The current total cost is2 + 3 + 4 = 9
.
-
-
Early Termination: Since
n
is now1
, we can terminate the process early and return the total costans
, which is9
.
In this example, the minimum total cost required to connect all the cities with the given connections is 9
, ensuring a path exists between every pair of cities.
Solution Implementation
1from typing import List
2
3class Solution:
4 def minimum_cost(self, n: int, connections: List[List[int]]) -> int:
5
6 # Helper function to find the parent of a given node using Path Compression technique
7 def find(parent: int) -> int:
8 if parents[parent] != parent:
9 parents[parent] = find(parents[parent])
10 return parents[parent]
11
12 # Sort the connections based on their cost in ascending order
13 connections.sort(key=lambda connection: connection[2])
14
15 # Initialize the parent array for Union-Find
16 parents = list(range(n))
17
18 # Initialize the total cost to 0
19 total_cost = 0
20
21 # Iterate over all the connections
22 for start, end, cost in connections:
23 # Adjusting indices to be zero-based
24 start, end = start - 1, end - 1
25
26 # Find the parents for the start and end nodes
27 start_parent = find(start)
28 end_parent = find(end)
29
30 # If the nodes have different parents, they belong to different sets, and we can connect them
31 if start_parent != end_parent:
32 # Union the sets by updating the parent
33 parents[start_parent] = end_parent
34
35 # Increment the total cost by the cost of this connection
36 total_cost += cost
37
38 # Decrement the count of disconnected components
39 n -= 1
40
41 # If all nodes are connected (i.e., only one disconnected component left), return total cost
42 if n == 1:
43 return total_cost
44
45 # If we have more than one disconnected component, it's not possible to connect all cities
46 return -1
47
48# Example usage:
49# sol = Solution()
50# print(sol.minimum_cost(3, [[1, 2, 5], [1, 3, 6], [2, 3, 1]])) # Output should be 6, connecting cities 2-3 and either 1-2 or 1-3
51
1class Solution {
2
3 // Parent array for Disjoint Set (Union Find) structure
4 private int[] parent;
5
6 /**
7 * Calculates the minimum cost to connect all points in a graph.
8 *
9 * @param n The number of points in the graph.
10 * @param connections The available connections between points with their costs.
11 * @return The minimum cost to connect all points. If not possible, returns -1.
12 */
13 public int minimumCost(int n, int[][] connections) {
14 // Sort the connections by cost using a lambda expression.
15 Arrays.sort(connections, Comparator.comparingInt(connection -> connection[2]));
16
17 // Initialize the parent array for the Disjoint Set.
18 parent = new int[n];
19 for (int i = 0; i < n; i++) {
20 parent[i] = i;
21 }
22
23 int totalCost = 0; // This will accumulate the total minimum cost to connect all points.
24
25 // Iterate over each connection.
26 for (int[] connection : connections) {
27 int point1 = connection[0] - 1; // Convert to zero-indexed.
28 int point2 = connection[1] - 1; // Convert to zero-indexed.
29 int cost = connection[2]; // The cost associated with this connection.
30
31 // Only consider the connection if the points are not already connected.
32 if (find(point1) != find(point2)) {
33 // Union the two sets.
34 parent[find(point1)] = find(point2);
35
36 // Add the cost of this connection to the total.
37 totalCost += cost;
38
39 // If only one set remains, all points are connected.
40 if (--n == 1) {
41 return totalCost;
42 }
43 }
44 }
45
46 // If after processing all connections not all points are connected return -1.
47 return -1;
48 }
49
50 /**
51 * Finds the representative of the set that the element belongs to.
52 *
53 * @param x The element to find the set representative for.
54 * @return The representative of the set.
55 */
56 private int find(int x) {
57 // Path Compression: Point directly to the representative to flatten the tree.
58 if (parent[x] != x) {
59 parent[x] = find(parent[x]);
60 }
61 return parent[x];
62 }
63}
64
1class Solution {
2public:
3 int minimumCost(int n, vector<vector<int>>& connections) {
4 // Parent array to represent the disjoint set (union-find structure).
5 vector<int> parent(n);
6 // Initialize the parent array, where each node is its own parent at the start.
7 iota(parent.begin(), parent.end(), 0);
8
9 // Sort the connections based on their cost in ascending order.
10 sort(connections.begin(), connections.end(), [](const auto& a, const auto& b) {
11 return a[2] < b[2];
12 });
13
14 // Variable to accumulate the total cost of connections used.
15 int totalCost = 0;
16
17 // Lambda function for the find operation in the union-find structure.
18 function<int(int)> find = [&](int vertex) -> int {
19 if (parent[vertex] != vertex) { // Path compression.
20 parent[vertex] = find(parent[vertex]);
21 }
22 return parent[vertex];
23 };
24
25 // Iterate over the connections.
26 for (const auto& edge : connections) {
27 int node1 = edge[0] - 1; // Adjusting for zero-based indexing.
28 int node2 = edge[1] - 1; // Adjusting for zero-based indexing.
29 int cost = edge[2]; // Cost of the connection.
30
31 // Union operation - connecting two subsets if they don't have the same parent.
32 if (find(node1) != find(node2)) {
33 parent[find(node1)] = find(node2);
34 totalCost += cost;
35 if (--n == 1) { // Check if all nodes are connected.
36 return totalCost; // Return the total cost if all nodes are connected.
37 }
38 }
39 }
40 return -1; // If not all nodes are connected, return -1 to indicate that it's not possible.
41 }
42};
43
1function minimumCost(cityCount: number, connections: number[][]): number {
2 // Parent array to represent the disjoint set (union-find structure)
3 const parent = new Array(cityCount);
4 // Initialize each city to be its own parent at the start
5 for (let i = 0; i < cityCount; ++i) {
6 parent[i] = i;
7 }
8
9 // Function to find the root parent of a city using path compression
10 const find = (city: number): number => {
11 if (parent[city] !== city) {
12 parent[city] = find(parent[city]); // Path compression step
13 }
14 return parent[city];
15 };
16
17 // Sort the connections based on the cost in ascending order
18 connections.sort((a, b) => a[2] - b[2]);
19
20 // Initialize the total cost to 0
21 let totalCost = 0;
22
23 // Iterate through the sorted connections
24 for (const [city1, city2, cost] of connections) {
25 // Find the root parents of the connected cities
26 if (find(city1 - 1) === find(city2 - 1)) {
27 // If the cities are already connected, ignore this connection
28 continue;
29 }
30 // Union operation: connect the two sets and update the cost
31 parent[find(city1 - 1)] = find(city2 - 1);
32 totalCost += cost; // Add the connection cost to the total cost
33
34 // If there is only one set left (all cities are connected), return the total cost
35 if (--cityCount === 1) {
36 return totalCost;
37 }
38 }
39
40 // If not all cities are connected, return -1
41 return -1;
42}
43
Time and Space Complexity
The code implements a variation of Kruskal's algorithm with path compression for the union-find operation.
Time Complexity
The time complexity of the code is as follows:
- Sorting the connections: The
sort()
function has a time complexity ofO(E log E)
, where E is the number of connections. - Iterating through sorted connections: In the worst case, each connection can involve a union operation. Each find operation takes
O(log N)
in practice due to path compression, although the exact complexity can be effectively considered asO(α(N))
, whereα(N)
is the inverse Ackermann function, which grows very slowly and is less than 5 for any practical input size. - The union operation takes
O(1)
after finding the roots.
Combining all the above steps, the total time complexity is O(E log E + E α(N))
. Since the sorting term is dominant for larger input sizes, we can approximate the time complexity to O(E log E)
.
Space Complexity
The space complexity of the code is as follows:
- The
p
array of sizeO(N)
, whereN
is the number of vertices. - Minimal space for local variables.
Thus, the space complexity is O(N)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Problem: Given a list of tasks and a list of requirements, compute a sequence of tasks that can be performed, such that we complete every task once while satisfying all the requirements.
Which of the following method should we use to solve this problem?
Recommended Readings
Union Find Disjoint Set Union Data Structure Introduction Prerequisite Depth First Search Review problems dfs_intro Once we have a strong grasp of recursion and Depth First Search we can now introduce Disjoint Set Union DSU This data structure is motivated by the following problem Suppose we have sets of elements
https algomonster s3 us east 2 amazonaws com cover_photos graph svg Graph Fundamentals Tree with 0 cycle At this point you should be pretty familiar with trees A tree is a special kind of graph a connected acyclic cycle less graph A graph may contain cycle s and nodes could
Minimum Spanning Tree Forests The Umbristan Department of Forestry UDF is tackling a rather difficult problem and the Umbristan government has assigned you as one of its best workers to go resolve the issue When you arrive you are quickly briefed on the problem at hand Inside the Umbristan National Park there exists an
Want a Structured Path to Master System Design Too? Don’t Miss This!