834. Sum of Distances in Tree
Problem Description
In this problem, we are given a connected tree with n
nodes numbered from 0
to n - 1
and n - 1
edges. Our objective is to calculate for each node the sum of the distances from that node to all other nodes in the tree. To elucidate, the distance between two nodes is defined as the number of edges in the shortest path connecting them. The function needs to return an array answer
, where answer[i]
contains the sum of distances from node i
to every other node in the tree.
Flowchart Walkthrough
Let's analyze LeetCode problem 834 "Sum of Distances in Tree" using the Flowchart. Here is a comprehensive step-by-step walkthrough:
Is it a graph?
- Yes: The structure described in this problem is a tree, which is a specific type of graph.
Is it a tree?
- Yes: As explicitly mentioned in the problem, we're dealing with a tree structure where every node is connected in an undirected and acyclic manner.
Since the problem is directly identified as a tree-based problem, and no further branching questions apply directly from being a simple tree structure, we proceed to solve it with Depth-First Search (DFS). Trees are a natural fit for DFS because the algorithm helps in efficiently searching or traversing each node, especially useful in problems involving the exploration of hierarchical structures or paths, like calculating distances from each node.
Conclusion: Based on the problem being a tree and the efficient use of DFS for such structures, the flowchart directs us to implement the Depth-First Search pattern for solving the "Sum of Distances in Tree" problem.
Intuition
This problem can be broken down into two primary subtasks. First, we calculate the sum of distances from a reference node. Generally, this reference node is chosen to be the root of the tree, which can be any node, often taken as 0
. Secondly, we leverage the tree's connectivity property and the previously computed sum to determine the sums for all other nodes.
The intuition behind choosing a root node and computing the sum of distances to all other nodes is grounded in the tree's property of having no cycles, which means any node can be considered as the root of the tree.
To derive the solution:
-
We start by performing a Depth-First Search (DFS) traversal from the root node, which we assume to be
0
. During this traversal, we calculate the sum of distances from the root node to all other nodes. -
As we traverse the tree, we also maintain a count of the number of nodes (
size
) that are present in the subtree rooted at each node, including the node itself. -
Once we have the sum of distances from the root node, we perform another DFS to iteratively compute the sum of distances for all other nodes by adjusting the sum obtained from the root based on the subtree sizes.
The adjustment is based on the crucial observation that moving the root from one node to an adjacent node changes the sum by moving one step closer to the nodes in the subtree of the adjacent node while moving one step further from the other nodes. This leads to a difference of n - 2 * size[adjacent_node]
which can be used to update the sum for the adjacent node when moving the root to it.
This approach ensures that we can calculate the answer in O(n)
time, as each edge and node is visited only a couple of times during the two separate DFS traversals.
Learn more about Tree, Depth-First Search, Graph and Dynamic Programming patterns.
Solution Approach
The solution applies a two-pass Depth-First Search (DFS) algorithm on the tree to compute the desired sums. Here is a step-by-step breakdown of the approach:
-
First, we construct an adjacency list
g
to represent the tree structure from the givenedges
. Each entry ing
maps a node to its neighbors. This adjacency list is then used to traverse the tree. -
The
dfs1
function takes a nodei
, its parentfa
, and the distanced
. The purpose of this DFS is to calculate the initial sum of distances from a reference node (assuming it as root) and to fill out thesize
array with the subtree sizes. Theans
array's first element is used to accumulate the sum of distances. When callingdfs1
on a node, we visit all its children and update this sum by adding the distance to each child, and we recursively do the same for each child. At the same time, thesize
array is updated to reflect each subtree's size. -
After the first DFS is complete, the 0th index of the
ans
array has the sum of distances from the root node to all other nodes. Thesize
array has the size of the subtree rooted at each node. -
The second DFS,
dfs2
, is used to find the answer for the remaining nodes based on the answer and size calculated for the root. This function takes in the same parameters asdfs1
, but the additional parametert
represents the total distance sum calculated from the parent node. For each childj
of the current nodei
, we deduce thesize[j]
fromt
and add the number of nodes outside the subtree ofj
(which isn - size[j]
) to get the distance sum for nodej
.Here is the mathematical operation performed during this adjustment:
new_node_distance_sum = parent_node_distance_sum - size[child] + (n - size[child])
The intuition is that moving from a parent to its child node, all the nodes in the subtree of that child node will be 1 distance unit closer, hence we subtract
size[child]
, and all the other nodes will be 1 distance unit further, hence we addn - size[child]
. -
Lastly, we initialize arrays
ans
andsize
ton
zeroes. We calldfs1(0, -1, 0)
to compute the distance sum from the root and then calldfs2(0, -1, ans[0])
to compute the distance sum for the remaining nodes using the previously calculated values.
By using these two DFS traversals, we can compute the sum of distances from each node to all other nodes efficiently.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution using a tree with n = 4
nodes that form the following connected tree structure:
0 / \ 1 2 / 3
-
Construct the adjacency list for the graph:
g = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
-
Call the first DFS,
dfs1
, starting from the root node0
. We initialize theans
andsize
arrays:ans = [0, 0, 0, 0] size = [1, 1, 1, 1] # Every node contributes at least size 1 (itself)
- Visit node
0
, we travel to nodes1
and2
.- Update
ans[0]
by adding1
, which is the distance to node1
. - Visit node
1
, since it's a leaf, return to node0
. - Update
ans[0]
by adding1
, which is the distance to node2
. - Visit node
2
, then travel to node3
.- Update
ans[0]
by adding2
, which is the distance to node3
. - Visit node
3
, since it's a leaf, return to node2
. - Now, we know the size of the subtree rooted at
2
is2
(3
and2
itself), and we updatesize[2]
.
- Update
- Return to node
0
, and after visiting all children, thesize[0]
is updated to4
.
- Update
The result after
dfs1
is:ans = [4, 0, 0, 0] # Sum of distances from node 0 to other nodes size = [4, 1, 2, 1] # Size of each subtree rooted at the respective node
- Visit node
-
Now, we call the second DFS,
dfs2
, on the root node0
.- For each child:
- For node
1
, we takeans[0]
, subtract the size of the subtree rooted at node1
(which is just the node1
itself), and addn - size[1]
:ans[1] = ans[0] - size[1] + (n - size[1]) = 4 - 1 + (4 - 1) = 6
- For node
2
, do the similar operation:ans[2] = ans[0] - size[2] + (n - size[2]) = 4 - 2 + (4 - 2) = 4
- Now, we consider the children of node
2
, which is node3
.- Apply the formula:
ans[3] = ans[2] - size[3] + (n - size[3]) = 4 - 1 + (4 - 1) = 6
- Now, we consider the children of node
- For node
The final
ans
array after callingdfs2
gives us the sum of distances from each node to every other node:ans = [4, 6, 4, 6]
- For each child:
This result represents the sum of distances from each node to all others:
- For node
0
, the distances are{1-0, 2-0, 3-0}
-> sum is1+1+2 = 4
. - For node
1
, the distances are{1-0, 1-2, 1-3}
-> sum is1+2+3 = 6
. - For node
2
, the distances are{2-0, 2-1, 2-3}
-> sum is1+2+1 = 4
. - For node
3
, the distances are{3-0, 3-1, 3-2}
-> sum is2+3+1 = 6
.
In conclusion, by cleverly using the subtree sizes and adjusting the sums based on the position of nodes within the tree, the solution approach efficiently calculates the requested sums for all nodes using two passes of DFS, without computing distances from scratch for each node.
Solution Implementation
1from collections import defaultdict
2
3class Solution:
4 def sumOfDistancesInTree(self, n: int, edges: list[list[int]]) -> list[int]:
5 # Perform a depth-first search to calculate initial distances and subtree sizes
6 def dfs_calculate_dist_and_size(current: int, parent: int, depth: int) -> None:
7 total_distance[0] += depth
8 subtree_size[current] = 1
9 for neighbor in adjacency_list[current]:
10 if neighbor != parent:
11 dfs_calculate_dist_and_size(neighbor, current, depth + 1)
12 subtree_size[current] += subtree_size[neighbor]
13
14 # Perform a second DFS to calculate the answer for each node based on subtree re-rooting
15 def dfs_re_root(current: int, parent: int, total_dist: int) -> None:
16 # The new total distance is the parent total distance
17 # adjusted for moving the root from the parent to the current node
18 distances[current] = total_dist
19 for neighbor in adjacency_list[current]:
20 if neighbor != parent:
21 new_total_dist = total_dist - subtree_size[neighbor] + (n - subtree_size[neighbor])
22 dfs_re_root(neighbor, current, new_total_dist)
23
24 # Initialize the adjacency list to store the graph
25 adjacency_list = defaultdict(list)
26 # Store each pair of edges in both directions
27 for a, b in edges:
28 adjacency_list[a].append(b)
29 adjacency_list[b].append(a)
30
31 # Initialize list for distances and sizes
32 total_distance = [0]
33 subtree_size = [0] * n
34 distances = [0] * n
35
36 # First depth-first search: Calculate total distance and subtree sizes
37 dfs_calculate_dist_and_size(0, -1, 0)
38
39 # Second depth-first search: Calculate the answer for each node
40 dfs_re_root(0, -1, total_distance[0])
41
42 return distances
43
1import java.util.ArrayList;
2import java.util.Arrays;
3import java.util.List;
4
5class Solution {
6 private int numberOfNodes;
7 private int[] distanceSum;
8 private int[] subtreeSize;
9 private List<Integer>[] graph;
10
11 public int[] sumOfDistancesInTree(int n, int[][] edges) {
12 this.numberOfNodes = n;
13 this.graph = new List[n];
14 this.distanceSum = new int[n];
15 this.subtreeSize = new int[n];
16
17 // Initialize lists for each vertex.
18 Arrays.setAll(graph, k -> new ArrayList<>());
19
20 // Build the graph from the edges array.
21 for (int[] edge : edges) {
22 int nodeA = edge[0], nodeB = edge[1];
23 graph[nodeA].add(nodeB);
24 graph[nodeB].add(nodeA);
25 }
26
27 // First DFS to calculate the total distance and the size of subtrees.
28 dfsPostOrder(0, -1, 0);
29
30 // Second DFS to calculate the answer for all nodes based on root's answer.
31 dfsPreOrder(0, -1, distanceSum[0]);
32
33 return distanceSum;
34 }
35
36 private void dfsPostOrder(int node, int parentNode, int depth) {
37 // Add the depth to the total distance for the root.
38 distanceSum[0] += depth;
39 subtreeSize[node] = 1;
40
41 for (int child : graph[node]) {
42 if (child != parentNode) {
43 dfsPostOrder(child, node, depth + 1);
44 // Update subtree size after the child's size has been determined.
45 subtreeSize[node] += subtreeSize[child];
46 }
47 }
48 }
49
50 private void dfsPreOrder(int node, int parentNode, int totalDistance) {
51 // Set the current node's distance sum.
52 distanceSum[node] = totalDistance;
53
54 for (int child : graph[node]) {
55 if (child != parentNode) {
56 // Calculate the new total distance for the child node.
57 int childDistance = totalDistance - subtreeSize[child] + numberOfNodes - subtreeSize[child];
58 dfsPreOrder(child, node, childDistance);
59 }
60 }
61 }
62}
63
1class Solution {
2public:
3 vector<int> sumOfDistancesInTree(int n, vector<vector<int>>& edges) {
4 vector<vector<int>> graph(n); // Use a graph to represent the tree
5 // Build the graph from the edges input
6 for (auto& edge : edges) {
7 int node1 = edge[0], node2 = edge[1];
8 graph[node1].push_back(node2);
9 graph[node2].push_back(node1);
10 }
11 vector<int> answer(n); // This will hold the final answer
12 vector<int> subtreeSize(n); // This will hold the sizes of the subtrees
13
14 // Depth-First Search (DFS) for calculating initial distances and subtree sizes
15 function<void(int, int, int)> dfsCalculateDistances = [&](int node, int parent, int depth) {
16 answer[0] += depth; // Add the depth to the answer for the root
17 subtreeSize[node] = 1; // Initialize the size of this subtree
18 // Traverse the graph
19 for (int& neighbor : graph[node]) {
20 if (neighbor != parent) {
21 dfsCalculateDistances(neighbor, node, depth + 1);
22 subtreeSize[node] += subtreeSize[neighbor]; // Update the size of the subtree
23 }
24 }
25 };
26
27 // DFS for calculating answer for each node based on the root's answer
28 function<void(int, int, int)> dfsCalculateAnswer = [&](int node, int parent, int totalDistance) {
29 answer[node] = totalDistance; // Set the answer for this node
30 // Traverse the graph
31 for (int& neighbor : graph[node]) {
32 if (neighbor != parent) {
33 // Recalculate the total distance when moving the root from current node to the neighbor
34 int revisedDistance = totalDistance - subtreeSize[neighbor] + n - subtreeSize[neighbor];
35 dfsCalculateAnswer(neighbor, node, revisedDistance);
36 }
37 }
38 };
39
40 // Call the first DFS for the root node to initialize distances and subtree sizes
41 dfsCalculateDistances(0, -1, 0);
42 // Call the second DFS to calculate the answer for each node
43 dfsCalculateAnswer(0, -1, answer[0]);
44 return answer; // Return the final answer array
45 }
46};
47
1function sumOfDistancesInTree(n: number, edges: number[][]): number[] {
2 // Create a graph 'g' as an adjacency list representation of the tree.
3 const graph: number[][] = Array.from({ length: n }, () => []);
4 for (const [node1, node2] of edges) {
5 graph[node1].push(node2);
6 graph[node2].push(node1);
7 }
8
9 // Initialize an array to store the answer for each node.
10 const answer: number[] = new Array(n).fill(0);
11 // Initialize an array to store the subtree size for each node.
12 const subtreeSize: number[] = new Array(n).fill(0);
13
14 // DFS function to calculate the sum of distances to the root node.
15 const dfsCalculateDistances = (node: number, parent: number, distanceToRoot: number) => {
16 answer[0] += distanceToRoot;
17 subtreeSize[node] = 1;
18 for (const adjacentNode of graph[node]) {
19 if (adjacentNode !== parent) {
20 dfsCalculateDistances(adjacentNode, node, distanceToRoot + 1);
21 subtreeSize[node] += subtreeSize[adjacentNode];
22 }
23 }
24 };
25
26 // DFS function to redistribute the sum of distances from the root node to all other nodes.
27 const dfsRedistributeDistances = (node: number, parent: number, totalDistance: number) => {
28 answer[node] = totalDistance;
29 for (const adjacentNode of graph[node]) {
30 if (adjacentNode !== parent) {
31 const newDistance = totalDistance - subtreeSize[adjacentNode] + (n - subtreeSize[adjacentNode]);
32 dfsRedistributeDistances(adjacentNode, node, newDistance);
33 }
34 }
35 };
36
37 // Run the first DFS from node 0 with parent -1 and initial distance 0.
38 dfsCalculateDistances(0, -1, 0);
39 // Run the second DFS to calculate the final answer array.
40 dfsRedistributeDistances(0, -1, answer[0]);
41
42 return answer;
43}
44
Time and Space Complexity
Time Complexity
The time complexity of the given code can be analyzed by looking at the two depth-first search (DFS) functions, dfs1
and dfs2
, and the edges-to-graph conversion at the start.
-
The conversion of edges into a graph using a defaultdict takes
O(E)
, whereE
is the number of edges. Since the graph is a tree, we haveE = n - 1
edges. -
The function
dfs1
traverses each node exactly once to calculate the initial sum of distances and the size of each subtree. This takesO(V)
, whereV
is the number of vertices, and since it's a tree, we haveV = n
. -
The function
dfs2
again traverses the tree once to adjust the sum of distances for each node based on thedfs1
calculations. This is alsoO(V)
, whereV = n
.
Combining these, we have the total time complexity as O(E + 2V) = O(2n - 1 + 2n) = O(4n - 1)
. Simplifying, we get O(n)
, because the constant factor is dropped in Big O notation.
Space Complexity
The space complexity is determined by the additional space used aside from the input.
-
The graph
g
takesO(E)
space, which isO(2n - 1)
because each undirected edge contributes to two entries. -
The
ans
andsize
arrays each takeO(V)
space, which isO(n)
. -
The recursion stack for
dfs1
anddfs2
could go up toO(h)
, whereh
is the height of the tree. In the worst case of a skewed tree, this isO(n)
, but for a balanced tree, it'sO(log n)
.
Combining these, we have the total space complexity as O(E + V + V + h) = O(n + n + n + h)
. For the worst case, we consider h = n
, so the space complexity is O(4n)
, which simplifies to O(n)
.
In summary, for the given code, the time complexity is O(n)
and the space complexity is O(n)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following uses divide and conquer strategy?
Recommended Readings
Everything About Trees A tree is a type of graph data structure composed of nodes and edges Its main properties are It is acyclic doesn't contain any cycles There exists a path from the root to any node Has N 1 edges where N is the number of nodes in the tree and
https algomonster s3 us east 2 amazonaws com cover_photos dfs svg Depth First Search Prereqs Recursion Review problems recursion_intro Trees problems tree_intro With a solid understanding of recursion under our belts we are now ready to tackle one of the most useful techniques in coding interviews Depth First Search DFS
https algomonster s3 us east 2 amazonaws com cover_photos graph svg Graph Fundamentals Tree with 0 cycle At this point you should be pretty familiar with trees A tree is a special kind of graph a connected acyclic cycle less graph A graph may contain cycle s and nodes could
Want a Structured Path to Master System Design Too? Don’t Miss This!