323. Number of Connected Components in an Undirected Graph


Problem Description

In this problem, we have a graph that consists of n nodes. We are given an integer n and an array edges where each element edges[i] = [a_i, b_i] represents an undirected edge between nodes a_i and b_i in the graph. The goal is to determine the number of connected components in the graph. A connected component is a set of nodes in a graph that are connected to each other by paths, and those nodes are not connected to any other nodes outside of the component. The task is to return the total count of such connected components.

Intuition

The objective is to count the number of separate groups of interconnected nodes where there are no connections between the groups. We can think of each node as starting in its own group, and each edge as a connection that, potentially, merges two groups into one larger group. Therefore, we are looking to find the total number of distinct sets that cannot be merged any further.

To address this problem, we use the Union-Find algorithm (also known as Disjoint Set Union or DSU). The Union-Find algorithm is efficient in keeping track of elements which are split into one or more disjoint sets. It has two primary operations: find, which determines the identity of the set containing a particular element, and union, which merges two sets together.

  1. Initialization: We start by initializing each node to be its own parent, representing that each node is the representative of its own set (group).

  2. Union Operation: We loop through the list of edges. For each edge, we apply the union operation. This means we find the parents (representatives) of the two nodes connected by the edge. If they have different parents, we set the parent of one to be the parent of the other, effectively merging the two sets.

    The function find(x) is a recursive function that finds the root (or parent) of the set to which x belongs. It includes path compression, which means that during the find operation, we make each looked-up node point directly to the root. Path compression improves efficiency, making subsequent find operations faster.

  3. Counting Components: After processing all the edges, we iterate through the nodes and count how many nodes are their own parent. This count is equal to the number of connected components since nodes that are their own parent represent the root of a connected component.

This solution's beauty is that it is relatively simple but powerful, allowing us to solve the connectivity problem in nearly linear time complexity, which is very efficient.

Learn more about Depth-First Search, Breadth-First Search, Union Find and Graph patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece๏ผš

Which algorithm is best for finding the shortest distance between two points in an unweighted graph?

Solution Approach

The implementation of the solution follows the Union-Find algorithm, using two primary functions: find and union. The key to understanding this approach is recognizing how these functions work together to determine the number of connected components.

Data Structure Used:

  • Parent Array: An array p is used to keep track of the representative (or parent) for each node in the graph. Initially, each node's representative is itself, meaning p[i] = i.

Algorithms and Patterns Used:

The solution has three main parts:

  1. The Find Function: The find function is used to find the root (or parent) of the node in the graph. When calling find(x), the function checks if p[x] != x. If this condition is true, it means that x is not its own parent, and there is a path to follow to find the root. This is done recursively until the root is found. The line p[x] = find(p[x]) applies path compression by directly connecting x to the root of its set, which speeds up future find operations.

    1def find(x):
    2    if p[x] != x:
    3        p[x] = find(p[x])
    4    return p[x]
  2. Union Operation: The union operation is not explicitly defined but is performed within the loop that iterates over the edges array. For each edge, the roots of the nodes a and b are found using the find operation. If they have different roots, it means they belong to different sets and should be connected. The line p[find(a)] = find(b) effectively merges the two sets by setting the parent of a's root to be b's root.

    1for a, b in edges:
    2    p[find(a)] = find(b)
  3. Counting Components: Finally, the number of connected components is determined by counting the number of nodes that are their own parents (i.e., the roots). This is done by iterating through each node i in the range 0 to n-1 and checking if i == find(i). If this condition is true, it means that i is the representative of its component, contributing to the total count.

    1return sum(i == find(i) for i in range(n))

The solution approach effectively groups nodes into sets based on the connections (edges) between them, and then it identifies the unique sets to arrive at the total number of connected components. Each connected component is represented by a unique root node, which is why counting these root nodes gives the correct count for the connected components.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which type of traversal does breadth first search do?

Example Walkthrough

Let's use a small example to illustrate the solution approach. Imagine we have n = 5 nodes and the following edges: edges = [[0,1], [1,2], [3,4]].

  1. Initialization: We start with each node being its own parent, hence p = [0, 1, 2, 3, 4].

  2. Union Operation: We process the edges one by one:

    • For the edge [0,1], we find the parents of 0 and 1, which are 0 and 1, respectively. Since they are different, we connect them by setting p[find(0)] = find(1) resulting in p = [1, 1, 2, 3, 4].
    • Next, for the edge [1,2], we find the parents of 1 and 2, which are 1 and 2. They are different, so we connect them by setting p[find(1)] = find(2), leading to p = [1, 1, 1, 3, 4].
    • Lastly, for the edge [3,4], we find the parents of 3 and 4, which are 3 and 4. Again, they are different, so we unite them, resulting in p = [1, 1, 1, 4, 4].
  3. Counting Components: Now we count the number of nodes that are their own parents, which corresponds to the root nodes:

    • 0 is not its own parent (its parent is 1).
    • 1 is the parent of itself, so it counts as a component.
    • 2 is not its own parent (its parent is 1).
    • 3 is not its own parent (its parent is 4).
    • 4 is the parent of itself, so it counts as a component.

In this example, we have two nodes (1 and 4) that are their own parents, so the total number of connected components is 2.

Solution Implementation

1class Solution:
2    def countComponents(self, n: int, edges: List[List[int]]) -> int:
3        # Function to find the root of a node using path compression
4        def find(node):
5            if parent[node] != node:
6                parent[node] = find(parent[node])
7            return parent[node]
8
9        # Initialization of parent list where each node is its own parent initially
10        parent = list(range(n))
11
12        # Union operation: join two components by pointing the parent of one's root to the other's root
13        for a, b in edges:
14            parent[find(a)] = find(b)
15
16        # Count the number of components by checking how many nodes are their own parents
17        return sum(i == find(i) for i in range(n))
18
1class Solution {
2    private int[] parent; // This array will hold the parent for each node representing the components
3
4    public int countComponents(int n, int[][] edges) {
5        // Initialize parent array, where initially each node is its own parent
6        parent = new int[n];
7        for (int i = 0; i < n; ++i) {
8            parent[i] = i;
9        }
10      
11        // For each edge, perform a union of the two vertices
12        for (int[] edge : edges) {
13            int vertex1 = edge[0], vertex2 = edge[1];
14            union(vertex1, vertex2);
15        }
16      
17        // Count the number of components by counting the nodes that are their own parents
18        int count = 0;
19        for (int i = 0; i < n; ++i) {
20            if (i == find(i)) { // If the node's parent is itself, it's the root of a component
21                count++;
22            }
23        }
24        return count; // Return the total count of connected components
25    }
26
27    // Find function with path compression
28    private int find(int node) {
29        if (parent[node] != node) {
30            parent[node] = find(parent[node]); // Path compression for efficiency
31        }
32        return parent[node]; // Return the root parent of the node
33    }
34  
35    // Union function to join two subsets into a single subset
36    private void union(int node1, int node2) {
37        int root1 = find(node1); // Find the root parent of the first node
38        int root2 = find(node2); // Find the root parent of the second node
39        parent[root1] = root2; /* Make one root parent the parent of the other */
40    }
41}
42
1#include <vector>
2#include <numeric>
3#include <functional>
4
5class Solution {
6public:
7    // Function to count the number of connected components in an undirected graph
8    int countComponents(int n, vector<vector<int>>& edges) {
9        // Parent array to represent disjoint sets
10        vector<int> parent(n);
11       
12        // Initialize each node to be its own parent, forming n separate sets
13        iota(parent.begin(), parent.end(), 0);
14      
15        // Lambda function to find the representative (leader) of a set
16        function<int(int)> find = [&](int x) -> int {
17            if (parent[x] != x) {
18                // Path compression: update the parent of x to be the parent of its current parent
19                parent[x] = find(parent[x]);
20            }
21            return parent[x];
22        };
23      
24        // Iterate through all edges to merge sets
25        for (auto& edge : edges) {
26            int nodeA = edge[0], nodeB = edge[1];
27            // Union by rank not used, simply attach the tree of `nodeA` to `nodeB`
28            parent[find(nodeA)] = find(nodeB);
29        }
30      
31        // Count the number of sets by counting the number of nodes that are self-parented
32        int componentCount = 0;
33        for (int i = 0; i < n; ++i) {
34            if (i == find(i)) { // If the node is the leader of a set
35                componentCount++;
36            }
37        }
38        return componentCount;
39    }
40};
41
1/**
2 * Counts the number of connected components in an undirected graph
3 * @param {number} n - The number of nodes in the graph
4 * @param {number[][]} edges - The edges of the graph
5 * @return {number} - The number of connected components
6 */
7function countComponents(n: number, edges: number[][]): number {
8    // Parent array to represent the disjoint set forest
9    let parent: number[] = new Array(n);
10    // Initialize each node to be its own parent
11    for (let i = 0; i < n; ++i) {
12        parent[i] = i;
13    }
14
15    // Function to find the root of the set that 'x' belongs to
16    function find(x: number): number {
17        if (parent[x] !== x) {
18            // Path compression for efficiency
19            parent[x] = find(parent[x]);
20        }
21        return parent[x];
22    }
23
24    // Union the sets that the edges connect
25    for (const [a, b] of edges) {
26        // Perform union by setting the parent of the representative of 'a' 
27        // to the representative of 'b'
28        parent[find(a)] = find(b);
29    }
30
31    // Count the number of connected components
32    // by counting the number of nodes that are their own parent
33    let count = 0;
34    for (let i = 0; i < n; ++i) {
35        if (i === find(i)) {
36            count++;
37        }
38    }
39    return count;
40}
41
Not Sure What to Study? Take the 2-min Quiz๏ผš

What data structure does Breadth-first search typically uses to store intermediate states?

Time and Space Complexity

Time Complexity

The time complexity of the given code is primarily derived from two parts: the union-find operation that involves both path compression in the find function and the union operation in the loop iterating through the edges.

  1. Path Compression (find function) - The recursive find function includes path compression which optimizes the time complexity of finding the root representative of a node to O(log n) on average and O(ฮฑ(n)) in the amortized case, where ฮฑ is the inverse Ackermann function, which grows very slowly. For most practical purposes, it can be considered almost constant.

  2. Union Operation (The for loop) - The time to traverse all edges and perform the union operation for each edge. There are 'e' edges, with e being the length of the edges list. As the path compression makes the union operation effectively near-constant time, this also takes O(e) operations.

  3. Summation Operation - The final sum iterates over all nodes to count the number of components, taking O(n) time.

Considering all parts, the overall time complexity is O(e + n).

Space Complexity

The space complexity consists of:

  1. Parent Array (p) - It's a list of length n, so it uses O(n) space.

  2. Recursive Stack Space - Because the find function is recursive (though with path compression), in the worst case, it could go as deep as O(n) before path compression is applied. However, with path compression, this is drastically reduced.

Given these considerations, the overall space complexity is O(n).

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

How many ways can you arrange the three letters A, B and C?


Recommended Readings


Got a question?ย Ask the Teaching Assistantย anything you don't understand.

Still not clear? Ask in the Forum, ย Discordย orย Submitย the part you don't understand to our editors.

โ†
โ†‘TA ๐Ÿ‘จโ€๐Ÿซ