1617. Count Subtrees With Max Distance Between Cities
Problem Description
In this problem, we are given n
cities that form a tree (a connected acyclic graph). These cities are connected with n-1
bidirectional edges, ensuring that there is a unique path between every pair of cities. A subtree is defined as a subset of these cities where any city is reachable from any other within the subset, through paths only involving cities in the subset. We are asked to find the number of distinct subtrees such that the longest path (or maximum distance) between any two cities in the subtree is exactly d
, for each d
from 1 to n-1
. The distance is measured in the number of edges.
Our goal is to produce an array of n-1
elements where the d
th element (1-indexed) shows the number of subtrees having a maximum distance of d
between any two of its cities.
Intuition
The solution to this problem lies in understanding what defines a subtree in a tree and how to calculate the maximum distance between any two cities given a set of edges. A subtree is not just any subset of the nodes but a connected subset that contains no cycles.
To approach this problem, we can make use of bitwise operations to represent subsets and iterate over all possible subsets of cities (except the empty set and the sets with only one city, as they don't form edge paths).
For each subset, we do the following:
-
Determine if the current subset represents a connected subtree by performing a Depth-First Search (DFS). If it's not connected or does not form a tree (has cycles), we ignore it.
-
If the subset is a connected subtree, we then find the maximum distance within this subtree. To do this, we pick any city and perform a DFS to find the furthest city from it, which gives us one end of the maximum distance. We perform another DFS starting from this furthest city to find the maximum distance.
-
We record the count of subtrees having each particular maximum distance.
In the code given, the function dfs
is used to determine the furthest city from a given city (represented as nxt
after the first DFS run) and to calculate the maximum distance (mx
) within a subset of cities. The bit manipulation using a mask (msk
) helps in iterating over subsets of cities and confirming connectivity. The variable mask
represents the current subset, and each bit represents whether a city is included (1) or excluded (0) from the subset.
The loop for mask in range(1, 1 << n):
iterates over all possible subsets of the cities. The condition mask & (mask - 1) == 0
checks if the subset has only one city by verifying if mask
is a power of two (in which case, there would be no edges and hence no distance to count). The rest of the loop is for checking connectivity and maximum distance calculation.
The approach cleverly utilizes the properties of the tree to calculate the distances and uses bitmasking to efficiently examine all subsets. By counting the occurrences of each distance as the final step, the problem is solved comprehensively.
Learn more about Tree, Dynamic Programming and Bitmask patterns.
Solution Approach
The implementation of the solution involves several steps that make use of standard algorithms and data structures - notably Depth-First Search (DFS) and bit manipulation to handle combination generation.
Here's how the solution is implemented:
-
Prepare Graph Data Structure: The tree of cities is represented using an adjacency list. A Python
defaultdict
of lists is used to create the graphg
. Each list contains all the adjacent cities (nodes) for a given city. This data structure is populated in a loop where each edge is added for the corresponding cities. -
Define DFS Function: A Depth-First Search function
dfs
is used to explore the cities in the given subtree. It maintains the current distanced
, the maximum distance foundmx
, and the furthest city foundnxt
. Using bitwise operations, it keeps track of the cities visited in the current subset by updating the maskmsk
. -
Iterate Over City Subsets: The outer loop
for mask in range(1, 1 << n):
iterates over all possible subsets of the cities (represented in binary form where each bit corresponds to the presence or absence of a city in the subset). -
Verify Subsets: Each subset must have more than one city to qualify as a subtree with a path. The condition
mask & (mask - 1) == 0
is used to filter out single-city subsets. -
Check Subtree Connectivity: For every subset that passes the previous step, a DFS is performed from the last city in the subset indicated by the position of the most significant bit of the mask (
cur
). After the DFS is complete, ifmsk
is zero, all cities in the subset were reached, which means the subset is a connected subtree. -
Calculate Maximum Distance: If the subset is connected, another DFS is performed starting from
nxt
(the furthest city found from the first DFS). The idea here is to find the diameter of the subtree (the longest path in the subtree). After this second DFS,mx
contains the maximum distance in the subtree. -
Update the Answer Array: The
ans
array contains counters for how many times each possible distanced
has been found. The index corresponds tod - 1
(1-indexed
), and the value is incremented each time a subtree with a maximum distance ofd
is discovered. -
Complexity Analysis: The function iterates through
2^n - n
possible subsets (since empty subsets and single-city subsets are skipped), and within each subset, it may perform up to two DFS operations, resulting in a worst-case time complexity close toO(2^n * n)
. However, many subsets will not form connected subtrees and be skipped earlier in the process, which can make the solution faster in practice.
This solution is a combination of combinatorics (to generate all possible subsets) and graph traversal (to check for connectivity and distances). By combining these techniques, the implementation efficiently solves the problem of counting subtrees with specific maximum distances across a tree graph.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a tree with 4
cities (nodes) connected by 3
edges as follows:
City 1 | City 2 / \ City 3 City 4
Here, n = 4
, representing the number of cities. We will use binary numbers (bitmasks) to represent all possible subsets of the cities and follow the steps provided in the solution approach to find the subtrees with maximum distances of 1, 2, and 3. The cities are 1-indexed, so bitmask 0b0001
represents a subset containing only City 1.
1. Prepare Graph Data Structure
The adjacency list, g
, would be something like:
- City 1: [City 2]
- City 2: [City 1, City 3, City 4]
- City 3: [City 2]
- City 4: [City 2]
2. Define DFS Function
The dfs
function will be used to explore connected cities and calculate the maximum distance in the subset.
3. Iterate Over City Subsets
We have 2^4 - 4
non-empty non-singleton subsets to consider. Let's explore some of them:
- Subset
0b0110
(Cities 2 and 3 are included): Reachable as there's an edge between City 2 and City 3. The maximum distance here is1
. - Subset
0b1010
(Cities 1 and 3 are included): This does not represent a connected subtree because there is no direct path between City 1 and City 3 within the subset.
4. Verify Subsets
A bitmask like 0b0100
is skipped because it represents only City 3, forming no paths.
5. Check Subtree Connectivity
Check with DFS:
- Starting with subset
0b0110
, begin the DFS from City 2. After completing DFS, if all cities in the subset are reached (maskmsk
becomes0
), the subset is connected.
6. Calculate Maximum Distance
For connected subsets:
- Using the subset
0b0110
, perform another DFS starting from City 3 (furthest found from previous DFS). The maximum distance within this subset is1
.
7. Update the Answer Array
For the subset 0b0110
, increment the counter in ans[1-1]
since the maximum distance d
found is 1
.
8. Complexity Analysis
We would repeat these steps for all subsets ranging from 0b0011
to 0b1111
excluding subsets like 0b0001
, 0b0010
, 0b0100
, and 0b1000
which are single-city subsets, and hence have no paths.
Through this solution approach, you can calculate the number of subtrees with different maximum distances. Here, simplistic examples such as 0b0110
and 0b1010
are used, but the problem will apply these steps to a full 4
city tree and consider all 2^4 - 4 = 12
possible city subsets.
Solution Implementation
1from collections import defaultdict
2from typing import List
3
4class Solution:
5 def countSubgraphsForEachDiameter(self, n: int, edges: List[List[int]]) -> List[int]:
6 # Perform a depth-first search from a given node, tracking the maximum distance and its corresponding node
7 def dfs(node: int, distance: int = 0):
8 nonlocal max_distance, next_node, mask
9 if max_distance < distance: # Updating maximum distance and corresponding node
10 max_distance, next_node = distance, node
11 mask ^= 1 << node # Remove the current node from the mask
12 for neighbor in graph[node]:
13 if mask >> neighbor & 1: # Check if the neighbor is part of the current subgraph (mask)
14 dfs(neighbor, distance + 1)
15
16 # Create an adjacency list representation of the graph
17 graph = defaultdict(list)
18 for u, v in edges:
19 u, v = u - 1, v - 1 # Convert to 0-based indexing for easier bit manipulation
20 graph[u].append(v)
21 graph[v].append(u)
22
23 # Initialize the result list for storing the count of subgraphs for each diameter
24 result = [0] * (n - 1)
25
26 # Variables to store the node for next DFS and the maximum distance in current DFS
27 next_node = max_distance = 0
28
29 # Iterate over all possible subgraphs represented by bitmasks (excluding empty subgraph and single node subgraphs)
30 for mask in range(1, 1 << n):
31 if mask & (mask - 1) == 0: # Skip subgraphs of size 1 (only one bit set in the mask)
32 continue
33
34 # Initialize the mask and max distance for current subgraph
35 current_mask, max_distance = mask, 0
36
37 # Start DFS from the last bit set in the current_mask
38 current_node = current_mask.bit_length() - 1
39 dfs(current_node)
40
41 if current_mask == 0: # If all nodes in subgraph have been visited,
42 current_mask, max_distance = mask, 0 # Reset the mask and distance
43 dfs(next_node) # Start new DFS to find the maximum distance
44 result[max_distance - 1] += 1 # Increment the count of subgraphs by found diameter
45
46 # Return the counts of subgraphs for each possible diameter
47 return result
48
1import java.util.ArrayList;
2import java.util.Arrays;
3import java.util.List;
4
5class Solution {
6
7 // Graph represented as an adjacency list
8 private List<Integer>[] graph;
9
10 // Mask to represent a subset of nodes
11 private int mask;
12
13 // The starting point for the next DFS
14 private int nextNode;
15
16 // The maximum distance found in current DFS
17 private int maxDistance;
18
19 // Method that calculates the count of subgraphs for each possible diameter
20 public int[] countSubgraphsForEachDiameter(int nodeCount, int[][] edges) {
21 // Initialize the adjacency list for the graph
22 graph = new List[nodeCount];
23 Arrays.setAll(graph, index -> new ArrayList<>());
24
25 // Add each edge to the undirected graph
26 for (int[] edge : edges) {
27 int from = edge[0] - 1;
28 int to = edge[1] - 1;
29 graph[from].add(to);
30 graph[to].add(from);
31 }
32
33 // Answer array to count the subgraphs for each diameter
34 int[] answer = new int[nodeCount - 1];
35
36 // Iterate over all possible subsets of nodes
37 for (int subsetMask = 1; subsetMask < (1 << nodeCount); ++subsetMask) {
38 // Skip subsets that have only one bit set (single node)
39 if ((subsetMask & (subsetMask - 1)) == 0) {
40 continue;
41 }
42
43 // Initialize helper variables for the current subset
44 mask = subsetMask;
45 maxDistance = 0;
46
47 // Find the highest-order bit that is set
48 int current = Integer.SIZE - 1 - Integer.numberOfLeadingZeros(mask);
49
50 // Perform depth-first search from the highest-order bit
51 dfs(current, 0);
52
53 // If all bits were cleared, we found a connected component
54 if (mask == 0) {
55 // Perform another DFS to find the diameter of the tree
56 mask = subsetMask;
57 maxDistance = 0;
58 dfs(nextNode, 0);
59
60 // Increment the count for the diameter found
61 ++answer[maxDistance - 1];
62 }
63 }
64 return answer;
65 }
66
67 // Depth-first search to clear bits in the mask and find the maximum distance
68 private void dfs(int node, int distance) {
69 // Clear the bit for the current node, marking it as visited
70 mask ^= (1 << node);
71
72 // Update maximum distance and record the furthest node
73 if (maxDistance < distance) {
74 maxDistance = distance;
75 nextNode = node;
76 }
77
78 // Explore all neighbors
79 for (int neighbor : graph[node]) {
80 // Continue DFS if the neighbor is in the current subset and is not visited
81 if ((mask >> neighbor & 1) == 1) {
82 dfs(neighbor, distance + 1);
83 }
84 }
85 }
86}
87
1#include <vector>
2#include <functional>
3
4class Solution {
5public:
6 // Function to count the number of subgraphs for each diameter
7 vector<int> countSubgraphsForEachDiameter(int n, vector<vector<int>>& edges) {
8 // Convert edges into an adjacency list for a graph representation
9 vector<vector<int>> graph(n);
10 for (auto& edge : edges) {
11 int u = edge[0] - 1, v = edge[1] - 1; // Convert to zero-based indexing
12 graph[u].emplace_back(v);
13 graph[v].emplace_back(u);
14 }
15
16 // Answer array to store the counts of subgraphs for each diameter
17 vector<int> answer(n - 1, 0);
18 int nextNode = 0;
19 int mask = 0;
20 int maxDiameter = 0;
21
22 // Depth-first search function to explore the graph
23 function<void(int, int)> dfs = [&](int node, int distance) {
24 mask ^= 1 << node; // Toggle the node in the current mask
25 if (maxDiameter < distance) {
26 maxDiameter = distance; // Update the max diameter found
27 nextNode = node; // Update the node for the next DFS call
28 }
29 for (int neighbor : graph[node]) { // Visit all connected nodes
30 if (mask >> neighbor & 1) { // If the neighbor is in the current mask
31 dfs(neighbor, distance + 1); // Recursive DFS
32 }
33 }
34 };
35
36 // Iterate over all possible subgraphs; represented by bit masks
37 for (int currentMask = 1; currentMask < (1 << n); ++currentMask) {
38 // Skip masks with single bit set (no edges in such subgraphs)
39 if ((currentMask & (currentMask - 1)) == 0) {
40 continue;
41 }
42
43 // Perform a DFS from the rightmost set bit in the current mask
44 mask = currentMask;
45 maxDiameter = 0;
46 int startNode = 31 - __builtin_clz(mask); // Find rightmost set bit
47 dfs(startNode, 0);
48
49 // If after the DFS the mask is 0, all nodes were visited
50 if (mask == 0) {
51 mask = currentMask;
52 maxDiameter = 0;
53 dfs(nextNode, 0); // Perform another DFS to find the diameter
54 ++answer[maxDiameter - 1]; // Increment the count for this diameter
55 }
56 }
57
58 return answer;
59 }
60};
61
1function countSubgraphsForEachDiameter(n: number, edges: number[][]): number[] {
2 // Create an adjacency list for the graph
3 const graph = Array.from({ length: n }, () => []);
4 for (const [u, v] of edges) {
5 graph[u - 1].push(v - 1);
6 graph[v - 1].push(u - 1);
7 }
8
9 // Initialize the answer array with zeros
10 const answer: number[] = new Array(n - 1).fill(0);
11
12 // Variables for tracking maximum diameter, the bitmask, and the next node
13 let [maxDiameter, bitmask, nextNode] = [0, 0, 0];
14
15 // Depth-first search to find the diameter of the subgraph
16 const dfs = (node: number, distance: number): void => {
17 if (maxDiameter < distance) {
18 maxDiameter = distance;
19 nextNode = node;
20 }
21 bitmask ^= 1 << node;
22 for (const neighbor of graph[node]) {
23 if ((bitmask >> neighbor) & 1) {
24 dfs(neighbor, distance + 1);
25 }
26 }
27 };
28
29 // Iterate over all possible subgraphs
30 for (let mask = 1; mask < (1 << n); ++mask) {
31 if ((mask & (mask - 1)) === 0) { // Skip if only one bit is set (not a valid subgraph)
32 continue;
33 }
34 // Reset variables and start dfs from the highest node within the mask
35 bitmask = mask;
36 maxDiameter = 0;
37 const current = 31 - numberOfLeadingZeros(bitmask);
38 dfs(current, 0);
39 // Check if all nodes in the subgraph were visited
40 if (bitmask === 0) {
41 bitmask = mask;
42 maxDiameter = 0;
43 dfs(nextNode, 0); // Run dfs again to find actual diameter starting from nextNode
44 ++answer[maxDiameter - 1]; // Increment count of subgraphs for this diameter
45 }
46 }
47 return answer;
48}
49
50// Function to count the number of leading zeros in a 32-bit integer
51function numberOfLeadingZeros(i: number): number {
52 if (i === 0) return 32;
53 let numZeros = 1;
54 if ((i >>> 16) === 0) {
55 numZeros += 16;
56 i <<= 16;
57 }
58 if ((i >>> 24) === 0) {
59 numZeros += 8;
60 i <<= 8;
61 }
62 if ((i >>> 28) === 0) {
63 numZeros += 4;
64 i <<= 4;
65 }
66 if ((i >>> 30) === 0) {
67 numZeros += 2;
68 i <<= 2;
69 }
70 numZeros -= i >>> 31;
71 return numZeros;
72}
73
Time and Space Complexity
Time Complexity
The time complexity of the provided code is primarily determined by the number of possible subsets of the vertices in the graph, which is 2^n
, where n
is the number of vertices. For each subset (excluding subsets of size 0 and 1), the code performs a Depth-First Search (DFS) to find the maximum diameter of the subgraph represented by the subset.
Each call to the dfs
function has a time complexity of O(n)
, as in the worst case it visits all vertices in the graph. Since dfs
is called twice for each subset that passes the initial checks, the time complexity for checking all subsets with DFS is O(2^n * n)
.
Therefore, the overall time complexity of the code is O(2^n * n)
.
Space Complexity
The space complexity is governed by the space needed to store the graph structure, the recursive calls during DFS, and the ans
list.
- The graph is stored using an adjacency list, which in the worst case (a complete graph) requires
O(n^2)
space. - The maximum depth of the recursive call stack for DFS is
n
, as the DFS might traverse all the vertices in the worst case. This adds a space complexity ofO(n)
. - The
ans
list takesO(n)
space.
Hence, the overall space complexity is O(n^2)
due to the adjacency list storage, assuming it is the worst-case scenario.
Learn more about how to find time and space complexity quickly using problem constraints.
You are given an array of intervals where intervals[i] = [start_i, end_i]
represent the start and end of the ith
interval. You need to merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.
Recommended Readings
Everything About Trees A tree is a type of graph data structure composed of nodes and edges Its main properties are It is acyclic doesn't contain any cycles There exists a path from the root to any node Has N 1 edges where N is the number of nodes in the tree and
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Bitmask and Dynamic Programming Bit manipulation is a crucial aspect of computer programming and one of the most powerful tools for bit manipulation is bitmasks Let's first understand what a bit is A bit is a binary digit It's the smallest piece of data in a computer and can be
Want a Structured Path to Master System Design Too? Don’t Miss This!