3331. Find Subtree Sizes After Changes
Problem Description
You have a tree with n
nodes numbered from 0
to n - 1
, where node 0
is the root. The tree structure is given by an array parent
where parent[i]
indicates the parent of node i
. Since node 0
is the root, parent[0] = -1
.
Each node i
has a character s[i]
assigned to it from the string s
.
The problem asks you to perform a tree restructuring operation once and simultaneously for all nodes from 1
to n - 1
:
For each node x
(where 1 ≤ x ≤ n - 1
):
- Look for the closest ancestor node
y
ofx
that has the same character asx
(i.e.,s[x] == s[y]
) - If such a node
y
exists:- Remove the edge between
x
and its current parent - Make
y
the new parent ofx
by adding an edge between them
- Remove the edge between
- If no such ancestor exists, leave node
x
unchanged
After all these changes are applied simultaneously, you need to return an array answer
where answer[i]
represents the size of the subtree rooted at node i
in the final restructured tree.
Key Points:
- The restructuring happens simultaneously for all nodes, meaning you evaluate all nodes based on the original tree structure before making any changes
- An ancestor is a node that appears on the path from a node to the root
- The "closest" ancestor means the one with minimum distance when traversing up toward the root
- A subtree size includes the node itself plus all its descendants
Flowchart Walkthrough
First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:
Is it a graph?
- Yes: The problem involves a tree structure with nodes (0 to n-1) and edges (parent-child relationships). A tree is a special type of graph.
Is it a tree?
- Yes: The problem explicitly states we have a tree rooted at node 0, represented by a parent array where each node (except root) has exactly one parent.
DFS
- Conclusion: The flowchart leads us directly to DFS (Depth-First Search) for tree problems.
Why DFS is the Right Choice
The problem requires us to:
- Traverse the tree to identify ancestors with matching characters
- Restructure the tree based on these relationships
- Calculate subtree sizes in the final tree
DFS is particularly well-suited here because:
- We need to traverse from each node up to its ancestors (path to root)
- We need to calculate subtree sizes, which is naturally done bottom-up using DFS
- The restructuring requires understanding parent-child relationships throughout the tree
- DFS allows us to maintain a stack/path of ancestors as we traverse, making it easy to find the closest ancestor with a matching character
The solution uses DFS to:
- Traverse the tree while maintaining a stack of ancestors with their characters
- For each node, find the closest ancestor with the same character
- Accumulate subtree sizes from children to parents in the restructured tree
Conclusion: The flowchart correctly identifies this as a tree problem that should be solved using DFS, which aligns perfectly with the solution approach of using depth-first traversal to handle both the restructuring logic and subtree size calculation.
Intuition
The key insight is that we need to simulate the tree restructuring while calculating subtree sizes efficiently. Since all nodes restructure simultaneously based on the original tree, we can't actually modify the tree structure during traversal - we need to determine where each node's contribution (subtree size) should go in the final tree.
Think about what happens when we restructure: if node x
finds an ancestor y
with the same character, x
and all its descendants become part of y
's subtree instead of its original parent's subtree. This means we need to track which ancestor should "collect" each node's subtree size.
The clever approach is to use DFS to traverse the tree while maintaining a stack of ancestors. As we go deeper into the tree, we keep track of nodes with each character we've seen on the current path. When we visit a node, we can quickly find the closest ancestor with the same character by looking at the most recent occurrence of that character in our path.
Here's the critical observation: when we're at node i
with character s[i]
, we need:
- The current path from root to
i
(maintained implicitly through DFS) - For each character, the most recent ancestor with that character
Using a dictionary d
where d[char]
is a list of nodes with character char
in the current path gives us exactly what we need. When we visit node i
:
- We add
i
tod[s[i]]
- If
d[s[i]]
has more than one element, the second-to-last element is the closest ancestor with the same character (the last element isi
itself) - We accumulate
i
's subtree size to this ancestor instead of its direct parent
The beauty of this approach is that by using DFS and maintaining the character-to-nodes mapping only for the current path (pushing when entering a node, popping when leaving), we automatically handle the "simultaneous" restructuring requirement. Each node's subtree size gets added to the correct ancestor in the final tree structure without actually modifying any edges.
Learn more about Tree and Depth-First Search patterns.
Solution Approach
The implementation uses DFS with a path-tracking mechanism to efficiently handle the tree restructuring and subtree size calculation.
Data Structures Used:
- Adjacency List
g
: Store the tree structure whereg[i]
contains all children of nodei
- Dictionary
d
: Track nodes with each character on the current path from root - Answer Array
ans
: Store the final subtree sizes for each node
Algorithm Steps:
-
Build the Tree Structure: Convert the parent array into an adjacency list representation for easier traversal:
g = [[] for _ in range(n)] for i in range(1, n): g[parent[i]].append(i)
-
DFS Traversal with Path Tracking: The core DFS function
dfs(i, fa)
wherei
is the current node andfa
is its parent:-
Initialize node's subtree size: Start with
ans[i] = 1
(counting the node itself) -
Track current node in path: Add node
i
to the list of nodes with characters[i]
:d[s[i]].append(i)
-
Process all children: Recursively visit all children to calculate their subtree sizes:
for j in g[i]: dfs(j, i)
-
Find the target parent: Determine where this node's subtree size should be added:
k = fa # Default to original parent if len(d[s[i]]) > 1: k = d[s[i]][-2] # Get closest ancestor with same character
The logic here is crucial: if
d[s[i]]
has more than one element, the last element is the current nodei
, and the second-to-last is the closest ancestor with the same character. -
Accumulate subtree size: Add the current node's subtree size to its target parent:
if k != -1: ans[k] += ans[i]
-
Clean up path tracking: Remove the current node from the path as we backtrack:
d[s[i]].pop()
-
-
Execute DFS from Root: Start the traversal from node 0 with parent -1:
dfs(0, -1)
Why This Works:
- The DFS naturally maintains the ancestor path through its call stack
- By adding/removing nodes from
d
as we enter/exit them, we ensured[char]
always contains only ancestors of the current node - Processing children before determining where to accumulate the subtree size ensures we calculate sizes bottom-up
- The simultaneous restructuring is achieved because we don't modify the tree structure during traversal - we just redirect where subtree sizes accumulate based on the original tree's ancestor relationships
The time complexity is O(n)
as we visit each node once, and the space complexity is O(n)
for the recursion stack and data structures.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach.
Input:
parent = [-1, 0, 0, 1, 1, 2]
s = "abacbe"
n = 6
Initial Tree Structure:
0(a) / \ 1(b) 2(a) / \ \ 3(c) 4(b) 5(e)
Step 1: Build Adjacency List
g = [[], [], [], [], [], []] g[0] = [1, 2] # Node 0's children g[1] = [3, 4] # Node 1's children g[2] = [5] # Node 2's children
Step 2: DFS Traversal
Let's trace through the DFS starting from node 0:
Visit Node 0 (char='a'):
ans[0] = 1
d = {'a': [0]}
- Process children 1 and 2
Visit Node 1 (char='b'):
ans[1] = 1
d = {'a': [0], 'b': [1]}
- Process children 3 and 4
Visit Node 3 (char='c'):
ans[3] = 1
d = {'a': [0], 'b': [1], 'c': [3]}
- No children to process
- Find target parent:
d['c']
has only 1 element, sok = fa = 1
ans[1] += ans[3]
→ans[1] = 2
- Pop 3 from
d['c']
Visit Node 4 (char='b'):
ans[4] = 1
d = {'a': [0], 'b': [1, 4], 'c': []}
- No children to process
- Find target parent:
d['b']
has 2 elements[1, 4]
- Closest ancestor with 'b' is
d['b'][-2] = 1
- So
k = 1
(which happens to be the same as original parent)
- Closest ancestor with 'b' is
ans[1] += ans[4]
→ans[1] = 3
- Pop 4 from
d['b']
Back to Node 1:
d = {'a': [0], 'b': [1], 'c': []}
- Find target parent:
d['b']
has only 1 element, sok = fa = 0
ans[0] += ans[1]
→ans[0] = 4
- Pop 1 from
d['b']
Visit Node 2 (char='a'):
ans[2] = 1
d = {'a': [0, 2], 'b': [], 'c': []}
- Process child 5
Visit Node 5 (char='e'):
ans[5] = 1
d = {'a': [0, 2], 'b': [], 'c': [], 'e': [5]}
- No children to process
- Find target parent:
d['e']
has only 1 element, sok = fa = 2
ans[2] += ans[5]
→ans[2] = 2
- Pop 5 from
d['e']
Back to Node 2:
d = {'a': [0, 2], 'b': [], 'c': [], 'e': []}
- Find target parent:
d['a']
has 2 elements[0, 2]
- Closest ancestor with 'a' is
d['a'][-2] = 0
- So
k = 0
(which is the same as original parent)
- Closest ancestor with 'a' is
ans[0] += ans[2]
→ans[0] = 6
- Pop 2 from
d['a']
Final Result:
ans = [6, 3, 2, 1, 1, 1]
Restructured Tree Explanation: In this example, the restructuring didn't change the tree structure because:
- Node 3 (char='c') has no ancestor with 'c', stays with parent 1
- Node 4 (char='b') finds ancestor 1 with 'b', but 1 is already its parent
- Node 5 (char='e') has no ancestor with 'e', stays with parent 2
- Node 2 (char='a') finds ancestor 0 with 'a', but 0 is already its parent
The algorithm correctly computes subtree sizes by accumulating each node's contribution to its target parent in the final tree structure.
Solution Implementation
1from typing import List
2from collections import defaultdict
3
4class Solution:
5 def findSubtreeSizes(self, parent: List[int], s: str) -> List[int]:
6 """
7 Find the size of subtrees for each node in a tree with special character-based rules.
8
9 Args:
10 parent: List where parent[i] is the parent of node i (parent[0] = -1 for root)
11 s: String where s[i] is the character associated with node i
12
13 Returns:
14 List where result[i] is the subtree size for node i
15 """
16
17 def dfs(node: int, parent_node: int) -> None:
18 """
19 Depth-first search to calculate subtree sizes.
20
21 Args:
22 node: Current node being processed
23 parent_node: Parent of the current node (-1 for root)
24 """
25 # Initialize the subtree size for current node
26 subtree_sizes[node] = 1
27
28 # Add current node to the stack of nodes with the same character
29 char_stack[s[node]].append(node)
30
31 # Process all children of the current node
32 for child in adjacency_list[node]:
33 dfs(child, node)
34
35 # Determine which ancestor should receive this subtree's size
36 target_ancestor = parent_node
37
38 # If there are multiple nodes with the same character in the path,
39 # use the second-to-last one (the previous occurrence)
40 if len(char_stack[s[node]]) > 1:
41 target_ancestor = char_stack[s[node]][-2]
42
43 # Add current subtree size to the target ancestor
44 if target_ancestor != -1:
45 subtree_sizes[target_ancestor] += subtree_sizes[node]
46
47 # Remove current node from the character stack after processing
48 char_stack[s[node]].pop()
49
50 # Initialize variables
51 n = len(s)
52
53 # Build adjacency list representation of the tree
54 adjacency_list = [[] for _ in range(n)]
55 for i in range(1, n):
56 adjacency_list[parent[i]].append(i)
57
58 # Dictionary to track nodes with each character during DFS
59 char_stack = defaultdict(list)
60
61 # Initialize result array for subtree sizes
62 subtree_sizes = [0] * n
63
64 # Start DFS from root node (node 0)
65 dfs(0, -1)
66
67 return subtree_sizes
68
1class Solution {
2 private List<Integer>[] adjacencyList; // Tree structure as adjacency list
3 private List<Integer>[] charToNodeStack; // Stack of nodes for each character (a-z)
4 private char[] characters; // Character array from input string
5 private int[] subtreeSizes; // Result array storing subtree sizes
6
7 public int[] findSubtreeSizes(int[] parent, String s) {
8 int nodeCount = s.length();
9
10 // Initialize data structures
11 adjacencyList = new List[nodeCount];
12 charToNodeStack = new List[26]; // 26 letters in alphabet
13 this.characters = s.toCharArray();
14
15 // Create empty lists for each node and character
16 Arrays.setAll(adjacencyList, index -> new ArrayList<>());
17 Arrays.setAll(charToNodeStack, index -> new ArrayList<>());
18
19 // Build adjacency list from parent array (skip root at index 0)
20 for (int childNode = 1; childNode < nodeCount; ++childNode) {
21 adjacencyList[parent[childNode]].add(childNode);
22 }
23
24 // Initialize result array and perform DFS
25 subtreeSizes = new int[nodeCount];
26 dfs(0, -1); // Start DFS from root (node 0) with no parent (-1)
27
28 return subtreeSizes;
29 }
30
31 private void dfs(int currentNode, int parentNode) {
32 // Initialize current node's subtree size to 1 (itself)
33 subtreeSizes[currentNode] = 1;
34
35 // Get character index (0-25 for a-z)
36 int charIndex = characters[currentNode] - 'a';
37
38 // Push current node onto the stack for its character
39 charToNodeStack[charIndex].add(currentNode);
40
41 // Recursively process all children
42 for (int childNode : adjacencyList[currentNode]) {
43 dfs(childNode, currentNode);
44 }
45
46 // Find the closest ancestor with the same character
47 // If stack has more than 1 element, the second-to-last is the closest ancestor
48 // Otherwise, use the parent node
49 int closestAncestorWithSameChar = charToNodeStack[charIndex].size() > 1 ?
50 charToNodeStack[charIndex].get(charToNodeStack[charIndex].size() - 2) :
51 parentNode;
52
53 // Add current node's subtree size to its closest ancestor with same character
54 if (closestAncestorWithSameChar >= 0) {
55 subtreeSizes[closestAncestorWithSameChar] += subtreeSizes[currentNode];
56 }
57
58 // Pop current node from the character stack (backtrack)
59 charToNodeStack[charIndex].remove(charToNodeStack[charIndex].size() - 1);
60 }
61}
62
1class Solution {
2public:
3 vector<int> findSubtreeSizes(vector<int>& parent, string s) {
4 int n = s.size();
5
6 // Build adjacency list for the tree (children for each node)
7 vector<vector<int>> children(n);
8 for (int i = 1; i < n; ++i) {
9 children[parent[i]].push_back(i);
10 }
11
12 // Track the stack of ancestors for each character (a-z)
13 // ancestors[i] contains the stack of nodes with character 'a' + i
14 vector<vector<int>> ancestors(26);
15
16 // Result array to store subtree sizes
17 vector<int> subtreeSizes(n);
18
19 // DFS function to traverse the tree and calculate subtree sizes
20 auto dfs = [&](this auto&& dfs, int node, int parentNode) -> void {
21 // Initialize current node's subtree size to 1 (itself)
22 subtreeSizes[node] = 1;
23
24 // Get the character index (0-25 for 'a'-'z')
25 int charIndex = s[node] - 'a';
26
27 // Add current node to the ancestor stack for this character
28 ancestors[charIndex].push_back(node);
29
30 // Recursively process all children
31 for (int child : children[node]) {
32 dfs(child, node);
33 }
34
35 // Find the nearest ancestor with the same character
36 // If there are at least 2 nodes with this character in the stack,
37 // the second-to-last one is the nearest ancestor (excluding current node)
38 int nearestAncestor = ancestors[charIndex].size() > 1 ?
39 ancestors[charIndex][ancestors[charIndex].size() - 2] :
40 parentNode;
41
42 // Add current node's subtree size to its nearest valid ancestor
43 if (nearestAncestor >= 0) {
44 subtreeSizes[nearestAncestor] += subtreeSizes[node];
45 }
46
47 // Remove current node from the ancestor stack after processing
48 ancestors[charIndex].pop_back();
49 };
50
51 // Start DFS from root node (node 0) with parent -1
52 dfs(0, -1);
53
54 return subtreeSizes;
55 }
56};
57
1function findSubtreeSizes(parent: number[], s: string): number[] {
2 const nodeCount = parent.length;
3
4 // Build adjacency list for the tree (children of each node)
5 const childrenGraph: number[][] = Array.from({ length: nodeCount }, () => []);
6
7 // Stack to track nodes by character (26 letters a-z)
8 // Each array stores the path of nodes with that character during DFS
9 const characterStacks: number[][] = Array.from({ length: 26 }, () => []);
10
11 // Build the tree structure (skip root at index 0)
12 for (let i = 1; i < nodeCount; ++i) {
13 childrenGraph[parent[i]].push(i);
14 }
15
16 // Initialize answer array where each node starts with size 1 (itself)
17 const subtreeSizes: number[] = Array(nodeCount).fill(1);
18
19 const dfs = (currentNode: number, parentNode: number): void => {
20 // Get the character index for current node (a=0, b=1, ..., z=25)
21 const charIndex = s.charCodeAt(currentNode) - 97;
22
23 // Push current node to its character's stack
24 characterStacks[charIndex].push(currentNode);
25
26 // Recursively process all children
27 for (const childNode of childrenGraph[currentNode]) {
28 dfs(childNode, currentNode);
29 }
30
31 // Find the ancestor to attach this subtree to
32 // If there's a previous node with the same character, use it
33 // Otherwise, use the parent node
34 const targetAncestor = characterStacks[charIndex].length > 1
35 ? characterStacks[charIndex].at(-2)! // Second to last element (previous same character)
36 : parentNode;
37
38 // Add current subtree size to the target ancestor
39 if (targetAncestor >= 0) {
40 subtreeSizes[targetAncestor] += subtreeSizes[currentNode];
41 }
42
43 // Remove current node from character stack (backtrack)
44 characterStacks[charIndex].pop();
45 };
46
47 // Start DFS from root (node 0) with no parent (-1)
48 dfs(0, -1);
49
50 return subtreeSizes;
51}
52
Time and Space Complexity
Time Complexity: O(n)
The algorithm performs a depth-first search (DFS) traversal of the tree. Each node is visited exactly once during the DFS. For each node i
, the main operations are:
- Adding the current node to
d[s[i]]
list:O(1)
- Recursively visiting all children
- Finding the second-to-last element in
d[s[i]]
if the list has more than one element:O(1)
- Updating the answer array:
O(1)
- Popping from
d[s[i]]
:O(1)
Since each node is processed once with O(1)
operations per node (excluding the recursive calls), and there are n
nodes in total, the overall time complexity is O(n)
.
Space Complexity: O(n)
The space complexity consists of several components:
- The adjacency list
g
:O(n)
space to store all parent-child relationships - The dictionary
d
which maps characters to lists of node indices: In the worst case, all nodes could have the same character, resulting in a list of sizen
, soO(n)
space - The answer array
ans
:O(n)
space - The recursion call stack: In the worst case (a skewed tree), the depth could be
O(n)
The total space complexity is O(n)
as all components are bounded by O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall 1: Misunderstanding the Simultaneous Restructuring
The Problem: A common mistake is thinking that we need to actually modify the tree structure during traversal. Developers might attempt to:
- Update the adjacency list while traversing
- Create a new tree structure and then calculate subtree sizes
- Process nodes sequentially rather than based on the original tree
Why It's Wrong: The problem states that restructuring happens "simultaneously" for all nodes. This means all nodes evaluate their closest ancestor based on the original tree structure, not a partially modified one.
The Correct Approach: Instead of modifying the tree structure, we simulate the restructuring by redirecting where subtree sizes accumulate. The tree traversal remains based on the original structure, but we add each node's subtree size to its "would-be" parent after restructuring.
Pitfall 2: Incorrect Ancestor Tracking
The Problem: When looking for the closest ancestor with the same character, developers might:
# WRONG: Trying to find ancestors by traversing up the parent array
def find_ancestor(node, target_char):
current = parent[node]
while current != -1:
if s[current] == target_char:
return current
current = parent[current]
return -1
Why It's Wrong: This approach requires multiple traversals up the tree for each node, leading to O(n²) complexity in the worst case. Additionally, it doesn't naturally integrate with the DFS traversal needed for subtree size calculation.
The Correct Approach:
Use the DFS call stack to maintain the current path. By adding nodes to char_stack
when entering and removing when exiting, we always have access to all ancestors on the current path:
char_stack[s[node]].append(node) # Enter node
# ... process children ...
if len(char_stack[s[node]]) > 1:
target = char_stack[s[node]][-2] # Second-to-last is closest ancestor
char_stack[s[node]].pop() # Exit node
Pitfall 3: Forgetting to Clean Up the Path Stack
The Problem:
Forgetting to remove the current node from char_stack
after processing:
# WRONG: Missing the pop operation
def dfs(node, parent_node):
subtree_sizes[node] = 1
char_stack[s[node]].append(node)
for child in adjacency_list[node]:
dfs(child, node)
# ... calculate target ancestor ...
# MISSING: char_stack[s[node]].pop()
Why It's Wrong: Without removing nodes from the stack when backtracking, the stack will contain nodes that are not ancestors of the current node. This leads to incorrect ancestor identification and wrong subtree size calculations.
The Correct Approach:
Always clean up the stack when exiting a node to maintain the invariant that char_stack
only contains ancestors:
def dfs(node, parent_node):
# ... enter node ...
char_stack[s[node]].append(node)
# ... process children ...
# Clean up before returning
char_stack[s[node]].pop()
Pitfall 4: Incorrect Subtree Size Accumulation Order
The Problem: Adding the subtree size before processing children:
# WRONG: Accumulating before children are processed
def dfs(node, parent_node):
subtree_sizes[node] = 1
# Wrong timing - children haven't been processed yet
if parent_node != -1:
subtree_sizes[parent_node] += subtree_sizes[node]
for child in adjacency_list[node]:
dfs(child, node)
Why It's Wrong: The subtree size of a node includes all its descendants. We can't know the full subtree size until after processing all children.
The Correct Approach: Process all children first to calculate their subtree sizes, then accumulate to the appropriate parent:
def dfs(node, parent_node):
subtree_sizes[node] = 1
# Process children first
for child in adjacency_list[node]:
dfs(child, node)
# Now subtree_sizes[node] includes all descendants
if target_ancestor != -1:
subtree_sizes[target_ancestor] += subtree_sizes[node]
Consider the classic dynamic programming of longest increasing subsequence:
Find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.
For example, the length of LIS for [50, 3, 10, 7, 40, 80]
is 4
and LIS is
[3, 7, 40, 80]
.
What is the recurrence relation?
Recommended Readings
Everything About Trees A tree is a type of graph data structure composed of nodes and edges Its main properties are It is acyclic doesn't contain any cycles There exists a path from the root to any node Has N 1 edges where N is the number of nodes in the tree and
https assets algo monster cover_photos dfs svg Depth First Search Prereqs Recursion Review problems recursion_intro Trees problems tree_intro With a solid understanding of recursion under our belts we are now ready to tackle one of the most useful techniques in coding interviews Depth First Search DFS As the name suggests
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Want a Structured Path to Master System Design Too? Don’t Miss This!