Facebook Pixel

3331. Find Subtree Sizes After Changes

Problem Description

You have a tree with n nodes numbered from 0 to n - 1, where node 0 is the root. The tree structure is given by an array parent where parent[i] indicates the parent of node i. Since node 0 is the root, parent[0] = -1.

Each node i has a character s[i] assigned to it from the string s.

The problem asks you to perform a tree restructuring operation once and simultaneously for all nodes from 1 to n - 1:

For each node x (where 1 ≤ x ≤ n - 1):

  1. Look for the closest ancestor node y of x that has the same character as x (i.e., s[x] == s[y])
  2. If such a node y exists:
    • Remove the edge between x and its current parent
    • Make y the new parent of x by adding an edge between them
  3. If no such ancestor exists, leave node x unchanged

After all these changes are applied simultaneously, you need to return an array answer where answer[i] represents the size of the subtree rooted at node i in the final restructured tree.

Key Points:

  • The restructuring happens simultaneously for all nodes, meaning you evaluate all nodes based on the original tree structure before making any changes
  • An ancestor is a node that appears on the path from a node to the root
  • The "closest" ancestor means the one with minimum distance when traversing up toward the root
  • A subtree size includes the node itself plus all its descendants

Flowchart Walkthrough

First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:

Is it a graph?

  • Yes: The problem involves a tree structure with nodes (0 to n-1) and edges (parent-child relationships). A tree is a special type of graph.

Is it a tree?

  • Yes: The problem explicitly states we have a tree rooted at node 0, represented by a parent array where each node (except root) has exactly one parent.

DFS

  • Conclusion: The flowchart leads us directly to DFS (Depth-First Search) for tree problems.

Why DFS is the Right Choice

The problem requires us to:

  1. Traverse the tree to identify ancestors with matching characters
  2. Restructure the tree based on these relationships
  3. Calculate subtree sizes in the final tree

DFS is particularly well-suited here because:

  • We need to traverse from each node up to its ancestors (path to root)
  • We need to calculate subtree sizes, which is naturally done bottom-up using DFS
  • The restructuring requires understanding parent-child relationships throughout the tree
  • DFS allows us to maintain a stack/path of ancestors as we traverse, making it easy to find the closest ancestor with a matching character

The solution uses DFS to:

  1. Traverse the tree while maintaining a stack of ancestors with their characters
  2. For each node, find the closest ancestor with the same character
  3. Accumulate subtree sizes from children to parents in the restructured tree

Conclusion: The flowchart correctly identifies this as a tree problem that should be solved using DFS, which aligns perfectly with the solution approach of using depth-first traversal to handle both the restructuring logic and subtree size calculation.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we need to simulate the tree restructuring while calculating subtree sizes efficiently. Since all nodes restructure simultaneously based on the original tree, we can't actually modify the tree structure during traversal - we need to determine where each node's contribution (subtree size) should go in the final tree.

Think about what happens when we restructure: if node x finds an ancestor y with the same character, x and all its descendants become part of y's subtree instead of its original parent's subtree. This means we need to track which ancestor should "collect" each node's subtree size.

The clever approach is to use DFS to traverse the tree while maintaining a stack of ancestors. As we go deeper into the tree, we keep track of nodes with each character we've seen on the current path. When we visit a node, we can quickly find the closest ancestor with the same character by looking at the most recent occurrence of that character in our path.

Here's the critical observation: when we're at node i with character s[i], we need:

  1. The current path from root to i (maintained implicitly through DFS)
  2. For each character, the most recent ancestor with that character

Using a dictionary d where d[char] is a list of nodes with character char in the current path gives us exactly what we need. When we visit node i:

  • We add i to d[s[i]]
  • If d[s[i]] has more than one element, the second-to-last element is the closest ancestor with the same character (the last element is i itself)
  • We accumulate i's subtree size to this ancestor instead of its direct parent

The beauty of this approach is that by using DFS and maintaining the character-to-nodes mapping only for the current path (pushing when entering a node, popping when leaving), we automatically handle the "simultaneous" restructuring requirement. Each node's subtree size gets added to the correct ancestor in the final tree structure without actually modifying any edges.

Learn more about Tree and Depth-First Search patterns.

Solution Approach

The implementation uses DFS with a path-tracking mechanism to efficiently handle the tree restructuring and subtree size calculation.

Data Structures Used:

  1. Adjacency List g: Store the tree structure where g[i] contains all children of node i
  2. Dictionary d: Track nodes with each character on the current path from root
  3. Answer Array ans: Store the final subtree sizes for each node

Algorithm Steps:

  1. Build the Tree Structure: Convert the parent array into an adjacency list representation for easier traversal:

    g = [[] for _ in range(n)]
    for i in range(1, n):
        g[parent[i]].append(i)
  2. DFS Traversal with Path Tracking: The core DFS function dfs(i, fa) where i is the current node and fa is its parent:

    • Initialize node's subtree size: Start with ans[i] = 1 (counting the node itself)

    • Track current node in path: Add node i to the list of nodes with character s[i]:

      d[s[i]].append(i)
    • Process all children: Recursively visit all children to calculate their subtree sizes:

      for j in g[i]:
          dfs(j, i)
    • Find the target parent: Determine where this node's subtree size should be added:

      k = fa  # Default to original parent
      if len(d[s[i]]) > 1:
          k = d[s[i]][-2]  # Get closest ancestor with same character

      The logic here is crucial: if d[s[i]] has more than one element, the last element is the current node i, and the second-to-last is the closest ancestor with the same character.

    • Accumulate subtree size: Add the current node's subtree size to its target parent:

      if k != -1:
          ans[k] += ans[i]
    • Clean up path tracking: Remove the current node from the path as we backtrack:

      d[s[i]].pop()
  3. Execute DFS from Root: Start the traversal from node 0 with parent -1:

    dfs(0, -1)

Why This Works:

  • The DFS naturally maintains the ancestor path through its call stack
  • By adding/removing nodes from d as we enter/exit them, we ensure d[char] always contains only ancestors of the current node
  • Processing children before determining where to accumulate the subtree size ensures we calculate sizes bottom-up
  • The simultaneous restructuring is achieved because we don't modify the tree structure during traversal - we just redirect where subtree sizes accumulate based on the original tree's ancestor relationships

The time complexity is O(n) as we visit each node once, and the space complexity is O(n) for the recursion stack and data structures.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example to illustrate the solution approach.

Input:

  • parent = [-1, 0, 0, 1, 1, 2]
  • s = "abacbe"
  • n = 6

Initial Tree Structure:

      0(a)
      /  \
    1(b)  2(a)
    /  \    \
  3(c) 4(b)  5(e)

Step 1: Build Adjacency List

g = [[], [], [], [], [], []]
g[0] = [1, 2]  # Node 0's children
g[1] = [3, 4]  # Node 1's children  
g[2] = [5]     # Node 2's children

Step 2: DFS Traversal

Let's trace through the DFS starting from node 0:

Visit Node 0 (char='a'):

  • ans[0] = 1
  • d = {'a': [0]}
  • Process children 1 and 2

Visit Node 1 (char='b'):

  • ans[1] = 1
  • d = {'a': [0], 'b': [1]}
  • Process children 3 and 4

Visit Node 3 (char='c'):

  • ans[3] = 1
  • d = {'a': [0], 'b': [1], 'c': [3]}
  • No children to process
  • Find target parent: d['c'] has only 1 element, so k = fa = 1
  • ans[1] += ans[3] → ans[1] = 2
  • Pop 3 from d['c']

Visit Node 4 (char='b'):

  • ans[4] = 1
  • d = {'a': [0], 'b': [1, 4], 'c': []}
  • No children to process
  • Find target parent: d['b'] has 2 elements [1, 4]
    • Closest ancestor with 'b' is d['b'][-2] = 1
    • So k = 1 (which happens to be the same as original parent)
  • ans[1] += ans[4] → ans[1] = 3
  • Pop 4 from d['b']

Back to Node 1:

  • d = {'a': [0], 'b': [1], 'c': []}
  • Find target parent: d['b'] has only 1 element, so k = fa = 0
  • ans[0] += ans[1] → ans[0] = 4
  • Pop 1 from d['b']

Visit Node 2 (char='a'):

  • ans[2] = 1
  • d = {'a': [0, 2], 'b': [], 'c': []}
  • Process child 5

Visit Node 5 (char='e'):

  • ans[5] = 1
  • d = {'a': [0, 2], 'b': [], 'c': [], 'e': [5]}
  • No children to process
  • Find target parent: d['e'] has only 1 element, so k = fa = 2
  • ans[2] += ans[5] → ans[2] = 2
  • Pop 5 from d['e']

Back to Node 2:

  • d = {'a': [0, 2], 'b': [], 'c': [], 'e': []}
  • Find target parent: d['a'] has 2 elements [0, 2]
    • Closest ancestor with 'a' is d['a'][-2] = 0
    • So k = 0 (which is the same as original parent)
  • ans[0] += ans[2] → ans[0] = 6
  • Pop 2 from d['a']

Final Result: ans = [6, 3, 2, 1, 1, 1]

Restructured Tree Explanation: In this example, the restructuring didn't change the tree structure because:

  • Node 3 (char='c') has no ancestor with 'c', stays with parent 1
  • Node 4 (char='b') finds ancestor 1 with 'b', but 1 is already its parent
  • Node 5 (char='e') has no ancestor with 'e', stays with parent 2
  • Node 2 (char='a') finds ancestor 0 with 'a', but 0 is already its parent

The algorithm correctly computes subtree sizes by accumulating each node's contribution to its target parent in the final tree structure.

Solution Implementation

1from typing import List
2from collections import defaultdict
3
4class Solution:
5    def findSubtreeSizes(self, parent: List[int], s: str) -> List[int]:
6        """
7        Find the size of subtrees for each node in a tree with special character-based rules.
8      
9        Args:
10            parent: List where parent[i] is the parent of node i (parent[0] = -1 for root)
11            s: String where s[i] is the character associated with node i
12      
13        Returns:
14            List where result[i] is the subtree size for node i
15        """
16      
17        def dfs(node: int, parent_node: int) -> None:
18            """
19            Depth-first search to calculate subtree sizes.
20          
21            Args:
22                node: Current node being processed
23                parent_node: Parent of the current node (-1 for root)
24            """
25            # Initialize the subtree size for current node
26            subtree_sizes[node] = 1
27          
28            # Add current node to the stack of nodes with the same character
29            char_stack[s[node]].append(node)
30          
31            # Process all children of the current node
32            for child in adjacency_list[node]:
33                dfs(child, node)
34          
35            # Determine which ancestor should receive this subtree's size
36            target_ancestor = parent_node
37          
38            # If there are multiple nodes with the same character in the path,
39            # use the second-to-last one (the previous occurrence)
40            if len(char_stack[s[node]]) > 1:
41                target_ancestor = char_stack[s[node]][-2]
42          
43            # Add current subtree size to the target ancestor
44            if target_ancestor != -1:
45                subtree_sizes[target_ancestor] += subtree_sizes[node]
46          
47            # Remove current node from the character stack after processing
48            char_stack[s[node]].pop()
49      
50        # Initialize variables
51        n = len(s)
52      
53        # Build adjacency list representation of the tree
54        adjacency_list = [[] for _ in range(n)]
55        for i in range(1, n):
56            adjacency_list[parent[i]].append(i)
57      
58        # Dictionary to track nodes with each character during DFS
59        char_stack = defaultdict(list)
60      
61        # Initialize result array for subtree sizes
62        subtree_sizes = [0] * n
63      
64        # Start DFS from root node (node 0)
65        dfs(0, -1)
66      
67        return subtree_sizes
68
1class Solution {
2    private List<Integer>[] adjacencyList;  // Tree structure as adjacency list
3    private List<Integer>[] charToNodeStack;  // Stack of nodes for each character (a-z)
4    private char[] characters;  // Character array from input string
5    private int[] subtreeSizes;  // Result array storing subtree sizes
6  
7    public int[] findSubtreeSizes(int[] parent, String s) {
8        int nodeCount = s.length();
9      
10        // Initialize data structures
11        adjacencyList = new List[nodeCount];
12        charToNodeStack = new List[26];  // 26 letters in alphabet
13        this.characters = s.toCharArray();
14      
15        // Create empty lists for each node and character
16        Arrays.setAll(adjacencyList, index -> new ArrayList<>());
17        Arrays.setAll(charToNodeStack, index -> new ArrayList<>());
18      
19        // Build adjacency list from parent array (skip root at index 0)
20        for (int childNode = 1; childNode < nodeCount; ++childNode) {
21            adjacencyList[parent[childNode]].add(childNode);
22        }
23      
24        // Initialize result array and perform DFS
25        subtreeSizes = new int[nodeCount];
26        dfs(0, -1);  // Start DFS from root (node 0) with no parent (-1)
27      
28        return subtreeSizes;
29    }
30  
31    private void dfs(int currentNode, int parentNode) {
32        // Initialize current node's subtree size to 1 (itself)
33        subtreeSizes[currentNode] = 1;
34      
35        // Get character index (0-25 for a-z)
36        int charIndex = characters[currentNode] - 'a';
37      
38        // Push current node onto the stack for its character
39        charToNodeStack[charIndex].add(currentNode);
40      
41        // Recursively process all children
42        for (int childNode : adjacencyList[currentNode]) {
43            dfs(childNode, currentNode);
44        }
45      
46        // Find the closest ancestor with the same character
47        // If stack has more than 1 element, the second-to-last is the closest ancestor
48        // Otherwise, use the parent node
49        int closestAncestorWithSameChar = charToNodeStack[charIndex].size() > 1 ? 
50            charToNodeStack[charIndex].get(charToNodeStack[charIndex].size() - 2) : 
51            parentNode;
52      
53        // Add current node's subtree size to its closest ancestor with same character
54        if (closestAncestorWithSameChar >= 0) {
55            subtreeSizes[closestAncestorWithSameChar] += subtreeSizes[currentNode];
56        }
57      
58        // Pop current node from the character stack (backtrack)
59        charToNodeStack[charIndex].remove(charToNodeStack[charIndex].size() - 1);
60    }
61}
62
1class Solution {
2public:
3    vector<int> findSubtreeSizes(vector<int>& parent, string s) {
4        int n = s.size();
5      
6        // Build adjacency list for the tree (children for each node)
7        vector<vector<int>> children(n);
8        for (int i = 1; i < n; ++i) {
9            children[parent[i]].push_back(i);
10        }
11      
12        // Track the stack of ancestors for each character (a-z)
13        // ancestors[i] contains the stack of nodes with character 'a' + i
14        vector<vector<int>> ancestors(26);
15      
16        // Result array to store subtree sizes
17        vector<int> subtreeSizes(n);
18      
19        // DFS function to traverse the tree and calculate subtree sizes
20        auto dfs = [&](this auto&& dfs, int node, int parentNode) -> void {
21            // Initialize current node's subtree size to 1 (itself)
22            subtreeSizes[node] = 1;
23          
24            // Get the character index (0-25 for 'a'-'z')
25            int charIndex = s[node] - 'a';
26          
27            // Add current node to the ancestor stack for this character
28            ancestors[charIndex].push_back(node);
29          
30            // Recursively process all children
31            for (int child : children[node]) {
32                dfs(child, node);
33            }
34          
35            // Find the nearest ancestor with the same character
36            // If there are at least 2 nodes with this character in the stack,
37            // the second-to-last one is the nearest ancestor (excluding current node)
38            int nearestAncestor = ancestors[charIndex].size() > 1 ? 
39                                  ancestors[charIndex][ancestors[charIndex].size() - 2] : 
40                                  parentNode;
41          
42            // Add current node's subtree size to its nearest valid ancestor
43            if (nearestAncestor >= 0) {
44                subtreeSizes[nearestAncestor] += subtreeSizes[node];
45            }
46          
47            // Remove current node from the ancestor stack after processing
48            ancestors[charIndex].pop_back();
49        };
50      
51        // Start DFS from root node (node 0) with parent -1
52        dfs(0, -1);
53      
54        return subtreeSizes;
55    }
56};
57
1function findSubtreeSizes(parent: number[], s: string): number[] {
2    const nodeCount = parent.length;
3  
4    // Build adjacency list for the tree (children of each node)
5    const childrenGraph: number[][] = Array.from({ length: nodeCount }, () => []);
6  
7    // Stack to track nodes by character (26 letters a-z)
8    // Each array stores the path of nodes with that character during DFS
9    const characterStacks: number[][] = Array.from({ length: 26 }, () => []);
10  
11    // Build the tree structure (skip root at index 0)
12    for (let i = 1; i < nodeCount; ++i) {
13        childrenGraph[parent[i]].push(i);
14    }
15  
16    // Initialize answer array where each node starts with size 1 (itself)
17    const subtreeSizes: number[] = Array(nodeCount).fill(1);
18  
19    const dfs = (currentNode: number, parentNode: number): void => {
20        // Get the character index for current node (a=0, b=1, ..., z=25)
21        const charIndex = s.charCodeAt(currentNode) - 97;
22      
23        // Push current node to its character's stack
24        characterStacks[charIndex].push(currentNode);
25      
26        // Recursively process all children
27        for (const childNode of childrenGraph[currentNode]) {
28            dfs(childNode, currentNode);
29        }
30      
31        // Find the ancestor to attach this subtree to
32        // If there's a previous node with the same character, use it
33        // Otherwise, use the parent node
34        const targetAncestor = characterStacks[charIndex].length > 1 
35            ? characterStacks[charIndex].at(-2)!  // Second to last element (previous same character)
36            : parentNode;
37      
38        // Add current subtree size to the target ancestor
39        if (targetAncestor >= 0) {
40            subtreeSizes[targetAncestor] += subtreeSizes[currentNode];
41        }
42      
43        // Remove current node from character stack (backtrack)
44        characterStacks[charIndex].pop();
45    };
46  
47    // Start DFS from root (node 0) with no parent (-1)
48    dfs(0, -1);
49  
50    return subtreeSizes;
51}
52

Time and Space Complexity

Time Complexity: O(n)

The algorithm performs a depth-first search (DFS) traversal of the tree. Each node is visited exactly once during the DFS. For each node i, the main operations are:

  • Adding the current node to d[s[i]] list: O(1)
  • Recursively visiting all children
  • Finding the second-to-last element in d[s[i]] if the list has more than one element: O(1)
  • Updating the answer array: O(1)
  • Popping from d[s[i]]: O(1)

Since each node is processed once with O(1) operations per node (excluding the recursive calls), and there are n nodes in total, the overall time complexity is O(n).

Space Complexity: O(n)

The space complexity consists of several components:

  • The adjacency list g: O(n) space to store all parent-child relationships
  • The dictionary d which maps characters to lists of node indices: In the worst case, all nodes could have the same character, resulting in a list of size n, so O(n) space
  • The answer array ans: O(n) space
  • The recursion call stack: In the worst case (a skewed tree), the depth could be O(n)

The total space complexity is O(n) as all components are bounded by O(n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Misunderstanding the Simultaneous Restructuring

The Problem: A common mistake is thinking that we need to actually modify the tree structure during traversal. Developers might attempt to:

  • Update the adjacency list while traversing
  • Create a new tree structure and then calculate subtree sizes
  • Process nodes sequentially rather than based on the original tree

Why It's Wrong: The problem states that restructuring happens "simultaneously" for all nodes. This means all nodes evaluate their closest ancestor based on the original tree structure, not a partially modified one.

The Correct Approach: Instead of modifying the tree structure, we simulate the restructuring by redirecting where subtree sizes accumulate. The tree traversal remains based on the original structure, but we add each node's subtree size to its "would-be" parent after restructuring.

Pitfall 2: Incorrect Ancestor Tracking

The Problem: When looking for the closest ancestor with the same character, developers might:

# WRONG: Trying to find ancestors by traversing up the parent array
def find_ancestor(node, target_char):
    current = parent[node]
    while current != -1:
        if s[current] == target_char:
            return current
        current = parent[current]
    return -1

Why It's Wrong: This approach requires multiple traversals up the tree for each node, leading to O(n²) complexity in the worst case. Additionally, it doesn't naturally integrate with the DFS traversal needed for subtree size calculation.

The Correct Approach: Use the DFS call stack to maintain the current path. By adding nodes to char_stack when entering and removing when exiting, we always have access to all ancestors on the current path:

char_stack[s[node]].append(node)  # Enter node
# ... process children ...
if len(char_stack[s[node]]) > 1:
    target = char_stack[s[node]][-2]  # Second-to-last is closest ancestor
char_stack[s[node]].pop()  # Exit node

Pitfall 3: Forgetting to Clean Up the Path Stack

The Problem: Forgetting to remove the current node from char_stack after processing:

# WRONG: Missing the pop operation
def dfs(node, parent_node):
    subtree_sizes[node] = 1
    char_stack[s[node]].append(node)
  
    for child in adjacency_list[node]:
        dfs(child, node)
  
    # ... calculate target ancestor ...
    # MISSING: char_stack[s[node]].pop()

Why It's Wrong: Without removing nodes from the stack when backtracking, the stack will contain nodes that are not ancestors of the current node. This leads to incorrect ancestor identification and wrong subtree size calculations.

The Correct Approach: Always clean up the stack when exiting a node to maintain the invariant that char_stack only contains ancestors:

def dfs(node, parent_node):
    # ... enter node ...
    char_stack[s[node]].append(node)
  
    # ... process children ...
  
    # Clean up before returning
    char_stack[s[node]].pop()

Pitfall 4: Incorrect Subtree Size Accumulation Order

The Problem: Adding the subtree size before processing children:

# WRONG: Accumulating before children are processed
def dfs(node, parent_node):
    subtree_sizes[node] = 1
  
    # Wrong timing - children haven't been processed yet
    if parent_node != -1:
        subtree_sizes[parent_node] += subtree_sizes[node]
  
    for child in adjacency_list[node]:
        dfs(child, node)

Why It's Wrong: The subtree size of a node includes all its descendants. We can't know the full subtree size until after processing all children.

The Correct Approach: Process all children first to calculate their subtree sizes, then accumulate to the appropriate parent:

def dfs(node, parent_node):
    subtree_sizes[node] = 1
  
    # Process children first
    for child in adjacency_list[node]:
        dfs(child, node)
  
    # Now subtree_sizes[node] includes all descendants
    if target_ancestor != -1:
        subtree_sizes[target_ancestor] += subtree_sizes[node]
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Consider the classic dynamic programming of longest increasing subsequence:

Find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.

For example, the length of LIS for [50, 3, 10, 7, 40, 80] is 4 and LIS is [3, 7, 40, 80].

What is the recurrence relation?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More