1617. Count Subtrees With Max Distance Between Cities

HardBit ManipulationTreeDynamic ProgrammingBitmaskEnumeration
Leetcode Link

Problem Description

In this problem, we are given n cities that form a tree (a connected acyclic graph). These cities are connected with n-1 bidirectional edges, ensuring that there is a unique path between every pair of cities. A subtree is defined as a subset of these cities where any city is reachable from any other within the subset, through paths only involving cities in the subset. We are asked to find the number of distinct subtrees such that the longest path (or maximum distance) between any two cities in the subtree is exactly d, for each d from 1 to n-1. The distance is measured in the number of edges.

Our goal is to produce an array of n-1 elements where the dth element (1-indexed) shows the number of subtrees having a maximum distance of d between any two of its cities.

Intuition

The solution to this problem lies in understanding what defines a subtree in a tree and how to calculate the maximum distance between any two cities given a set of edges. A subtree is not just any subset of the nodes but a connected subset that contains no cycles.

To approach this problem, we can make use of bitwise operations to represent subsets and iterate over all possible subsets of cities (except the empty set and the sets with only one city, as they don't form edge paths).

For each subset, we do the following:

  1. Determine if the current subset represents a connected subtree by performing a Depth-First Search (DFS). If it's not connected or does not form a tree (has cycles), we ignore it.

  2. If the subset is a connected subtree, we then find the maximum distance within this subtree. To do this, we pick any city and perform a DFS to find the furthest city from it, which gives us one end of the maximum distance. We perform another DFS starting from this furthest city to find the maximum distance.

  3. We record the count of subtrees having each particular maximum distance.

In the code given, the function dfs is used to determine the furthest city from a given city (represented as nxt after the first DFS run) and to calculate the maximum distance (mx) within a subset of cities. The bit manipulation using a mask (msk) helps in iterating over subsets of cities and confirming connectivity. The variable mask represents the current subset, and each bit represents whether a city is included (1) or excluded (0) from the subset.

The loop for mask in range(1, 1 << n): iterates over all possible subsets of the cities. The condition mask & (mask - 1) == 0 checks if the subset has only one city by verifying if mask is a power of two (in which case, there would be no edges and hence no distance to count). The rest of the loop is for checking connectivity and maximum distance calculation.

The approach cleverly utilizes the properties of the tree to calculate the distances and uses bitmasking to efficiently examine all subsets. By counting the occurrences of each distance as the final step, the problem is solved comprehensively.

Learn more about Tree, Dynamic Programming and Bitmask patterns.

Solution Approach

The implementation of the solution involves several steps that make use of standard algorithms and data structures - notably Depth-First Search (DFS) and bit manipulation to handle combination generation.

Here's how the solution is implemented:

  1. Prepare Graph Data Structure: The tree of cities is represented using an adjacency list. A Python defaultdict of lists is used to create the graph g. Each list contains all the adjacent cities (nodes) for a given city. This data structure is populated in a loop where each edge is added for the corresponding cities.

  2. Define DFS Function: A Depth-First Search function dfs is used to explore the cities in the given subtree. It maintains the current distance d, the maximum distance found mx, and the furthest city found nxt. Using bitwise operations, it keeps track of the cities visited in the current subset by updating the mask msk.

  3. Iterate Over City Subsets: The outer loop for mask in range(1, 1 << n): iterates over all possible subsets of the cities (represented in binary form where each bit corresponds to the presence or absence of a city in the subset).

  4. Verify Subsets: Each subset must have more than one city to qualify as a subtree with a path. The condition mask & (mask - 1) == 0 is used to filter out single-city subsets.

  5. Check Subtree Connectivity: For every subset that passes the previous step, a DFS is performed from the last city in the subset indicated by the position of the most significant bit of the mask (cur). After the DFS is complete, if msk is zero, all cities in the subset were reached, which means the subset is a connected subtree.

  6. Calculate Maximum Distance: If the subset is connected, another DFS is performed starting from nxt (the furthest city found from the first DFS). The idea here is to find the diameter of the subtree (the longest path in the subtree). After this second DFS, mx contains the maximum distance in the subtree.

  7. Update the Answer Array: The ans array contains counters for how many times each possible distance d has been found. The index corresponds to d - 1 (1-indexed), and the value is incremented each time a subtree with a maximum distance of d is discovered.

  8. Complexity Analysis: The function iterates through 2^n - n possible subsets (since empty subsets and single-city subsets are skipped), and within each subset, it may perform up to two DFS operations, resulting in a worst-case time complexity close to O(2^n * n). However, many subsets will not form connected subtrees and be skipped earlier in the process, which can make the solution faster in practice.

This solution is a combination of combinatorics (to generate all possible subsets) and graph traversal (to check for connectivity and distances). By combining these techniques, the implementation efficiently solves the problem of counting subtrees with specific maximum distances across a tree graph.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's consider a tree with 4 cities (nodes) connected by 3 edges as follows:

   City 1
     |
   City 2
   /    \
City 3  City 4

Here, n = 4, representing the number of cities. We will use binary numbers (bitmasks) to represent all possible subsets of the cities and follow the steps provided in the solution approach to find the subtrees with maximum distances of 1, 2, and 3. The cities are 1-indexed, so bitmask 0b0001 represents a subset containing only City 1.

1. Prepare Graph Data Structure

The adjacency list, g, would be something like:

  • City 1: [City 2]
  • City 2: [City 1, City 3, City 4]
  • City 3: [City 2]
  • City 4: [City 2]

2. Define DFS Function

The dfs function will be used to explore connected cities and calculate the maximum distance in the subset.

3. Iterate Over City Subsets

We have 2^4 - 4 non-empty non-singleton subsets to consider. Let's explore some of them:

  • Subset 0b0110 (Cities 2 and 3 are included): Reachable as there's an edge between City 2 and City 3. The maximum distance here is 1.
  • Subset 0b1010 (Cities 1 and 3 are included): This does not represent a connected subtree because there is no direct path between City 1 and City 3 within the subset.

4. Verify Subsets

A bitmask like 0b0100 is skipped because it represents only City 3, forming no paths.

5. Check Subtree Connectivity

Check with DFS:

  • Starting with subset 0b0110, begin the DFS from City 2. After completing DFS, if all cities in the subset are reached (mask msk becomes 0), the subset is connected.

6. Calculate Maximum Distance

For connected subsets:

  • Using the subset 0b0110, perform another DFS starting from City 3 (furthest found from previous DFS). The maximum distance within this subset is 1.

7. Update the Answer Array

For the subset 0b0110, increment the counter in ans[1-1] since the maximum distance d found is 1.

8. Complexity Analysis

We would repeat these steps for all subsets ranging from 0b0011 to 0b1111 excluding subsets like 0b0001, 0b0010, 0b0100, and 0b1000 which are single-city subsets, and hence have no paths.

Through this solution approach, you can calculate the number of subtrees with different maximum distances. Here, simplistic examples such as 0b0110 and 0b1010 are used, but the problem will apply these steps to a full 4 city tree and consider all 2^4 - 4 = 12 possible city subsets.

Solution Implementation

1from collections import defaultdict
2from typing import List
3
4class Solution:
5    def countSubgraphsForEachDiameter(self, n: int, edges: List[List[int]]) -> List[int]:
6        # Perform a depth-first search from a given node, tracking the maximum distance and its corresponding node
7        def dfs(node: int, distance: int = 0):
8            nonlocal max_distance, next_node, mask
9            if max_distance < distance:  # Updating maximum distance and corresponding node
10                max_distance, next_node = distance, node
11            mask ^= 1 << node  # Remove the current node from the mask
12            for neighbor in graph[node]:
13                if mask >> neighbor & 1:  # Check if the neighbor is part of the current subgraph (mask)
14                    dfs(neighbor, distance + 1)
15      
16        # Create an adjacency list representation of the graph
17        graph = defaultdict(list)
18        for u, v in edges:
19            u, v = u - 1, v - 1  # Convert to 0-based indexing for easier bit manipulation
20            graph[u].append(v)
21            graph[v].append(u)
22      
23        # Initialize the result list for storing the count of subgraphs for each diameter
24        result = [0] * (n - 1)
25      
26        # Variables to store the node for next DFS and the maximum distance in current DFS
27        next_node = max_distance = 0
28      
29        # Iterate over all possible subgraphs represented by bitmasks (excluding empty subgraph and single node subgraphs)
30        for mask in range(1, 1 << n):
31            if mask & (mask - 1) == 0:  # Skip subgraphs of size 1 (only one bit set in the mask)
32                continue
33          
34            # Initialize the mask and max distance for current subgraph
35            current_mask, max_distance = mask, 0
36          
37            # Start DFS from the last bit set in the current_mask
38            current_node = current_mask.bit_length() - 1
39            dfs(current_node)
40          
41            if current_mask == 0:  # If all nodes in subgraph have been visited,
42                current_mask, max_distance = mask, 0  # Reset the mask and distance
43                dfs(next_node)  # Start new DFS to find the maximum distance
44                result[max_distance - 1] += 1  # Increment the count of subgraphs by found diameter
45      
46        # Return the counts of subgraphs for each possible diameter
47        return result
48
1import java.util.ArrayList;
2import java.util.Arrays;
3import java.util.List;
4
5class Solution {
6
7    // Graph represented as an adjacency list
8    private List<Integer>[] graph;
9  
10    // Mask to represent a subset of nodes
11    private int mask;
12  
13    // The starting point for the next DFS
14    private int nextNode;
15  
16    // The maximum distance found in current DFS
17    private int maxDistance;
18
19    // Method that calculates the count of subgraphs for each possible diameter
20    public int[] countSubgraphsForEachDiameter(int nodeCount, int[][] edges) {
21        // Initialize the adjacency list for the graph
22        graph = new List[nodeCount];
23        Arrays.setAll(graph, index -> new ArrayList<>());
24
25        // Add each edge to the undirected graph
26        for (int[] edge : edges) {
27            int from = edge[0] - 1;
28            int to = edge[1] - 1;
29            graph[from].add(to);
30            graph[to].add(from);
31        }
32
33        // Answer array to count the subgraphs for each diameter
34        int[] answer = new int[nodeCount - 1];
35      
36        // Iterate over all possible subsets of nodes
37        for (int subsetMask = 1; subsetMask < (1 << nodeCount); ++subsetMask) {
38            // Skip subsets that have only one bit set (single node)
39            if ((subsetMask & (subsetMask - 1)) == 0) {
40                continue;
41            }
42
43            // Initialize helper variables for the current subset
44            mask = subsetMask;
45            maxDistance = 0;
46          
47            // Find the highest-order bit that is set
48            int current = Integer.SIZE - 1 - Integer.numberOfLeadingZeros(mask);
49          
50            // Perform depth-first search from the highest-order bit
51            dfs(current, 0);
52          
53            // If all bits were cleared, we found a connected component
54            if (mask == 0) {
55                // Perform another DFS to find the diameter of the tree
56                mask = subsetMask;
57                maxDistance = 0;
58                dfs(nextNode, 0);
59              
60                // Increment the count for the diameter found
61                ++answer[maxDistance - 1];
62            }
63        }
64        return answer;
65    }
66
67    // Depth-first search to clear bits in the mask and find the maximum distance
68    private void dfs(int node, int distance) {
69        // Clear the bit for the current node, marking it as visited
70        mask ^= (1 << node);
71      
72        // Update maximum distance and record the furthest node
73        if (maxDistance < distance) {
74            maxDistance = distance;
75            nextNode = node;
76        }
77      
78        // Explore all neighbors
79        for (int neighbor : graph[node]) {
80            // Continue DFS if the neighbor is in the current subset and is not visited
81            if ((mask >> neighbor & 1) == 1) {
82                dfs(neighbor, distance + 1);
83            }
84        }
85    }
86}
87
1#include <vector>
2#include <functional>
3
4class Solution {
5public:
6    // Function to count the number of subgraphs for each diameter
7    vector<int> countSubgraphsForEachDiameter(int n, vector<vector<int>>& edges) {
8        // Convert edges into an adjacency list for a graph representation
9        vector<vector<int>> graph(n);
10        for (auto& edge : edges) {
11            int u = edge[0] - 1, v = edge[1] - 1; // Convert to zero-based indexing
12            graph[u].emplace_back(v);
13            graph[v].emplace_back(u);
14        }
15
16        // Answer array to store the counts of subgraphs for each diameter
17        vector<int> answer(n - 1, 0);
18        int nextNode = 0;
19        int mask = 0;
20        int maxDiameter = 0;
21
22        // Depth-first search function to explore the graph
23        function<void(int, int)> dfs = [&](int node, int distance) {
24            mask ^= 1 << node; // Toggle the node in the current mask
25            if (maxDiameter < distance) {
26                maxDiameter = distance; // Update the max diameter found
27                nextNode = node; // Update the node for the next DFS call
28            }
29            for (int neighbor : graph[node]) { // Visit all connected nodes
30                if (mask >> neighbor & 1) { // If the neighbor is in the current mask
31                    dfs(neighbor, distance + 1); // Recursive DFS
32                }
33            }
34        };
35
36        // Iterate over all possible subgraphs; represented by bit masks
37        for (int currentMask = 1; currentMask < (1 << n); ++currentMask) {
38            // Skip masks with single bit set (no edges in such subgraphs)
39            if ((currentMask & (currentMask - 1)) == 0) {
40                continue;
41            }
42
43            // Perform a DFS from the rightmost set bit in the current mask
44            mask = currentMask;
45            maxDiameter = 0;
46            int startNode = 31 - __builtin_clz(mask); // Find rightmost set bit
47            dfs(startNode, 0);
48
49            // If after the DFS the mask is 0, all nodes were visited
50            if (mask == 0) {
51                mask = currentMask;
52                maxDiameter = 0;
53                dfs(nextNode, 0); // Perform another DFS to find the diameter
54                ++answer[maxDiameter - 1]; // Increment the count for this diameter
55            }
56        }
57
58        return answer;
59    }
60};
61
1function countSubgraphsForEachDiameter(n: number, edges: number[][]): number[] {
2    // Create an adjacency list for the graph
3    const graph = Array.from({ length: n }, () => []);
4    for (const [u, v] of edges) {
5        graph[u - 1].push(v - 1);
6        graph[v - 1].push(u - 1);
7    }
8
9    // Initialize the answer array with zeros
10    const answer: number[] = new Array(n - 1).fill(0);
11
12    // Variables for tracking maximum diameter, the bitmask, and the next node
13    let [maxDiameter, bitmask, nextNode] = [0, 0, 0];
14
15    // Depth-first search to find the diameter of the subgraph
16    const dfs = (node: number, distance: number): void => {
17        if (maxDiameter < distance) {
18            maxDiameter = distance;
19            nextNode = node;
20        }
21        bitmask ^= 1 << node;
22        for (const neighbor of graph[node]) {
23            if ((bitmask >> neighbor) & 1) {
24                dfs(neighbor, distance + 1);
25            }
26        }
27    };
28
29    // Iterate over all possible subgraphs
30    for (let mask = 1; mask < (1 << n); ++mask) {
31        if ((mask & (mask - 1)) === 0) { // Skip if only one bit is set (not a valid subgraph)
32            continue;
33        }
34        // Reset variables and start dfs from the highest node within the mask
35        bitmask = mask;
36        maxDiameter = 0;
37        const current = 31 - numberOfLeadingZeros(bitmask);
38        dfs(current, 0);
39        // Check if all nodes in the subgraph were visited
40        if (bitmask === 0) {
41            bitmask = mask;
42            maxDiameter = 0;
43            dfs(nextNode, 0); // Run dfs again to find actual diameter starting from nextNode
44            ++answer[maxDiameter - 1]; // Increment count of subgraphs for this diameter
45        }
46    }
47    return answer;
48}
49
50// Function to count the number of leading zeros in a 32-bit integer
51function numberOfLeadingZeros(i: number): number {
52    if (i === 0) return 32;
53    let numZeros = 1;
54    if ((i >>> 16) === 0) {
55        numZeros += 16;
56        i <<= 16;
57    }
58    if ((i >>> 24) === 0) {
59        numZeros += 8;
60        i <<= 8;
61    }
62    if ((i >>> 28) === 0) {
63        numZeros += 4;
64        i <<= 4;
65    }
66    if ((i >>> 30) === 0) {
67        numZeros += 2;
68        i <<= 2;
69    }
70    numZeros -= i >>> 31;
71    return numZeros;
72}
73

Time and Space Complexity

Time Complexity

The time complexity of the provided code is primarily determined by the number of possible subsets of the vertices in the graph, which is 2^n, where n is the number of vertices. For each subset (excluding subsets of size 0 and 1), the code performs a Depth-First Search (DFS) to find the maximum diameter of the subgraph represented by the subset.

Each call to the dfs function has a time complexity of O(n), as in the worst case it visits all vertices in the graph. Since dfs is called twice for each subset that passes the initial checks, the time complexity for checking all subsets with DFS is O(2^n * n).

Therefore, the overall time complexity of the code is O(2^n * n).

Space Complexity

The space complexity is governed by the space needed to store the graph structure, the recursive calls during DFS, and the ans list.

  • The graph is stored using an adjacency list, which in the worst case (a complete graph) requires O(n^2) space.
  • The maximum depth of the recursive call stack for DFS is n, as the DFS might traverse all the vertices in the worst case. This adds a space complexity of O(n).
  • The ans list takes O(n) space.

Hence, the overall space complexity is O(n^2) due to the adjacency list storage, assuming it is the worst-case scenario.

Learn more about how to find time and space complexity quickly using problem constraints.


Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:
Question 1 out of 10

You are given an array of intervals where intervals[i] = [start_i, end_i] represent the start and end of the ith interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!