Facebook Pixel

1519. Number of Nodes in the Sub-Tree With the Same Label

Problem Description

You are given a tree with n nodes numbered from 0 to n - 1, where node 0 is the root. The tree is represented by n - 1 edges, making it a connected graph with no cycles. Each node has a label, which is a lowercase letter specified in the string labels - node i has the label labels[i].

The edges are provided as pairs [ai, bi], indicating an undirected edge between nodes ai and bi.

Your task is to return an array of size n where ans[i] represents the count of nodes in the subtree rooted at node i that have the same label as node i itself.

A subtree of node i includes node i and all its descendant nodes in the tree structure.

For example, if node 2 has label 'a', and its subtree contains 3 nodes with label 'a' (including itself), then ans[2] = 3.

The solution uses a DFS traversal starting from the root. It maintains a counter for each label and uses a clever technique: before entering a subtree, it subtracts the current count of the node's label, then after processing all descendants, it adds back the updated count. This difference gives the exact number of nodes with the matching label in that subtree.

The algorithm builds an adjacency list from the edges since the tree is undirected, then performs DFS while tracking parent nodes to avoid revisiting them. The cnt counter keeps track of label frequencies as we traverse, and the ans array stores the final results for each node.

Flowchart Walkthrough

First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:

Is it a graph?

  • Yes: The problem explicitly states we have a tree, which is a special type of graph (connected, undirected graph with no cycles). We have n nodes and n-1 edges.

Is it a tree?

  • Yes: The problem clearly states "You are given a tree" with n nodes and exactly n-1 edges, making it a connected acyclic graph.

DFS

  • Since we answered "yes" to the tree question, the flowchart leads us directly to DFS as the recommended approach.

Conclusion: The flowchart suggests using DFS (Depth-First Search) for this problem.

This makes perfect sense because:

  1. We need to explore each node's subtree to count nodes with matching labels
  2. A subtree is defined as a node and all its descendants, which naturally aligns with DFS traversal
  3. DFS allows us to process each subtree completely before returning to the parent, making it ideal for aggregating counts from child nodes
  4. The recursive nature of DFS matches the recursive structure of tree subtrees

The DFS approach in the solution traverses from the root (node 0), visits all descendants of each node, and cleverly uses a counter to track label frequencies. By calculating the difference in label counts before and after processing a subtree, it determines exactly how many nodes in that subtree have the same label as the current node.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When we need to count nodes with a specific label in each subtree, the natural approach is to traverse each subtree and count matching labels. However, doing this naively for every node would involve redundant traversals.

The key insight is that during a single DFS traversal from the root, we can calculate the answer for all nodes simultaneously. As we traverse down the tree, we're visiting every node exactly once, and we can maintain a running count of how many times we've seen each label.

Here's the clever part: when we enter a node's subtree, we can record the current count of that node's label. After we've finished exploring all descendants and return back to this node, the count will have increased by exactly the number of matching labels in the subtree.

For example, if we're at node i with label 'a':

  • Before exploring: we've seen 'a' appear 5 times in ancestors
  • After exploring the entire subtree: we've seen 'a' appear 8 times total
  • Therefore, the subtree contains 8 - 5 = 3 nodes with label 'a'

This difference technique works because:

  1. DFS naturally processes all descendants before returning to a parent
  2. Any labels counted during the subtree exploration must belong to that subtree
  3. The global counter accumulates counts as we go deeper and preserves them as we backtrack

The algorithm essentially asks: "How much did the count of my label increase while I was exploring my subtree?" This increase is exactly the number of nodes in my subtree (including myself) that share my label.

This approach is elegant because it computes all answers in a single O(n) traversal, using the DFS call stack to naturally define subtree boundaries.

Learn more about Tree, Depth-First Search and Breadth-First Search patterns.

Solution Approach

The solution implements a DFS traversal with a clever counting mechanism. Let's walk through the implementation step by step:

1. Build the Graph Representation:

g = defaultdict(list)
for a, b in edges:
    g[a].append(b)
    g[b].append(a)

Since the tree edges are undirected, we create an adjacency list where each edge is added in both directions. This allows traversal from any node to its neighbors.

2. Initialize Data Structures:

cnt = Counter()  # Tracks frequency of each label during traversal
ans = [0] * n    # Stores the final answer for each node
  • cnt: A counter that maintains the running count of each label (character) as we traverse
  • ans: The result array where ans[i] will store the count for node i

3. DFS Function Implementation:

def dfs(i, fa):
    ans[i] -= cnt[labels[i]]  # Record count before exploring subtree
    cnt[labels[i]] += 1        # Include current node in count
    for j in g[i]:
        if j != fa:            # Avoid going back to parent
            dfs(j, i)          # Recursively explore child
    ans[i] += cnt[labels[i]]   # Record count after exploring subtree

The DFS function works as follows:

  • Before exploring the subtree at node i:

    • Subtract the current count of labels[i] from ans[i] (initially 0, so this makes it negative)
    • Increment the count for the current node's label
  • During exploration:

    • Recursively visit all children (neighbors except parent fa)
    • Each recursive call will update the global cnt counter
  • After exploring the entire subtree:

    • Add the current count of labels[i] to ans[i]
    • The difference between after and before gives us the exact count in the subtree

4. Start DFS from Root:

dfs(0, -1)  # Start from root node 0, with -1 as parent (no parent)

Why This Works:

Consider node i with label 'a'. If before entering the subtree, we've seen 'a' appear x times, and after processing the entire subtree, we've seen 'a' appear y times, then the subtree contains exactly y - x nodes with label 'a'.

The formula ans[i] = cnt[labels[i]]_after - cnt[labels[i]]_before is implemented as:

  • ans[i] -= cnt[labels[i]] (subtract before value)
  • Process subtree (cnt gets updated)
  • ans[i] += cnt[labels[i]] (add after value)

This approach efficiently computes all answers in a single O(n) traversal, where each node is visited exactly once.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example to illustrate the solution approach.

Example Tree:

    0(a)
   / \
  1(b) 2(a)
       |
      3(a)
  • Edges: [[0,1], [0,2], [2,3]]
  • Labels: "abaa"
  • Expected answer: [3, 1, 2, 1] (node 0 has 3 'a's in subtree, node 1 has 1 'b', node 2 has 2 'a's, node 3 has 1 'a')

Step-by-step DFS Traversal:

Initial State:

  • cnt = {} (empty counter)
  • ans = [0, 0, 0, 0]

1. Visit Node 0 (label 'a'):

  • Before subtree: cnt['a'] = 0, so ans[0] -= 0 → ans[0] = 0
  • Increment: cnt['a'] = 1
  • Explore children (nodes 1 and 2)

2. Visit Node 1 (label 'b'):

  • Before subtree: cnt['b'] = 0, so ans[1] -= 0 → ans[1] = 0
  • Increment: cnt['b'] = 1
  • No children to explore
  • After subtree: cnt['b'] = 1, so ans[1] += 1 → ans[1] = 1
  • Return to node 0

3. Visit Node 2 (label 'a'):

  • Before subtree: cnt['a'] = 1, so ans[2] -= 1 → ans[2] = -1
  • Increment: cnt['a'] = 2
  • Explore child (node 3)

4. Visit Node 3 (label 'a'):

  • Before subtree: cnt['a'] = 2, so ans[3] -= 2 → ans[3] = -2
  • Increment: cnt['a'] = 3
  • No children to explore
  • After subtree: cnt['a'] = 3, so ans[3] += 3 → ans[3] = -2 + 3 = 1
  • Return to node 2

5. Back at Node 2:

  • After subtree: cnt['a'] = 3, so ans[2] += 3 → ans[2] = -1 + 3 = 2
  • Return to node 0

6. Back at Node 0:

  • After subtree: cnt['a'] = 3, so ans[0] += 3 → ans[0] = 0 + 3 = 3

Final Result: ans = [3, 1, 2, 1]

The key insight illustrated here:

  • Node 0: Started with cnt['a']=0, ended with cnt['a']=3, difference = 3 'a's in subtree
  • Node 2: Started with cnt['a']=1, ended with cnt['a']=3, difference = 2 'a's in subtree
  • Node 3: Started with cnt['a']=2, ended with cnt['a']=3, difference = 1 'a' in subtree

Each node's answer equals how much its label's count increased during its subtree exploration.

Solution Implementation

1from typing import List
2from collections import defaultdict, Counter
3
4class Solution:
5    def countSubTrees(self, n: int, edges: List[List[int]], labels: str) -> List[int]:
6        """
7        Count nodes in each subtree that have the same label as the subtree's root.
8      
9        Args:
10            n: Number of nodes in the tree
11            edges: List of edges representing the tree structure
12            labels: String where labels[i] is the label of node i
13          
14        Returns:
15            List where result[i] is the count of nodes with label[i] in subtree rooted at i
16        """
17      
18        def dfs(node: int, parent: int) -> None:
19            """
20            Depth-first search to count label occurrences in subtrees.
21          
22            Uses a clever technique: 
23            1. Subtract current count of node's label before traversing subtree
24            2. Traverse all children and update global counter
25            3. Add back the count after traversal to get subtree count
26          
27            Args:
28                node: Current node being processed
29                parent: Parent node to avoid revisiting in undirected graph
30            """
31            # Store the count of current node's label before exploring subtree
32            result[node] -= label_counter[labels[node]]
33          
34            # Increment counter for current node's label
35            label_counter[labels[node]] += 1
36          
37            # Recursively process all children
38            for neighbor in adjacency_list[node]:
39                if neighbor != parent:
40                    dfs(neighbor, node)
41          
42            # After processing subtree, the difference gives us the count in this subtree
43            result[node] += label_counter[labels[node]]
44      
45        # Build adjacency list for the undirected tree
46        adjacency_list = defaultdict(list)
47        for node_a, node_b in edges:
48            adjacency_list[node_a].append(node_b)
49            adjacency_list[node_b].append(node_a)
50      
51        # Initialize counter for tracking label occurrences during DFS
52        label_counter = Counter()
53      
54        # Initialize result array to store counts for each node
55        result = [0] * n
56      
57        # Start DFS from root node (node 0) with no parent (-1)
58        dfs(0, -1)
59      
60        return result
61
1class Solution {
2    // Adjacency list representation of the tree
3    private List<Integer>[] adjacencyList;
4    // String containing labels for each node
5    private String nodeLabels;
6    // Result array storing count of matching labels in each subtree
7    private int[] result;
8    // Global counter array for each letter (26 letters in alphabet)
9    private int[] letterCount;
10
11    public int[] countSubTrees(int n, int[][] edges, String labels) {
12        // Initialize adjacency list for n nodes
13        adjacencyList = new List[n];
14        Arrays.setAll(adjacencyList, index -> new ArrayList<>());
15      
16        // Build undirected tree from edges
17        for (int[] edge : edges) {
18            int nodeA = edge[0];
19            int nodeB = edge[1];
20            adjacencyList[nodeA].add(nodeB);
21            adjacencyList[nodeB].add(nodeA);
22        }
23      
24        // Initialize instance variables
25        this.nodeLabels = labels;
26        result = new int[n];
27        letterCount = new int[26];
28      
29        // Start DFS from root node (0) with no parent (-1)
30        dfs(0, -1);
31      
32        return result;
33    }
34
35    private void dfs(int currentNode, int parentNode) {
36        // Get the letter index for current node (0-25 for 'a'-'z')
37        int letterIndex = nodeLabels.charAt(currentNode) - 'a';
38      
39        // Store the count of this letter before processing subtree
40        // Subtract it now, will add back the final count after processing children
41        result[currentNode] -= letterCount[letterIndex];
42      
43        // Increment count for current node's letter
44        letterCount[letterIndex]++;
45      
46        // Recursively process all children (excluding parent to avoid cycles)
47        for (int childNode : adjacencyList[currentNode]) {
48            if (childNode != parentNode) {
49                dfs(childNode, currentNode);
50            }
51        }
52      
53        // Add the final count of this letter in the subtree rooted at currentNode
54        // This includes the current node and all its descendants
55        result[currentNode] += letterCount[letterIndex];
56    }
57}
58
1class Solution {
2public:
3    vector<int> countSubTrees(int n, vector<vector<int>>& edges, string labels) {
4        // Build adjacency list representation of the tree
5        vector<vector<int>> adjacencyList(n);
6        for (auto& edge : edges) {
7            int nodeA = edge[0];
8            int nodeB = edge[1];
9            adjacencyList[nodeA].push_back(nodeB);
10            adjacencyList[nodeB].push_back(nodeA);
11        }
12      
13        // Result array to store count of matching labels in each subtree
14        vector<int> result(n);
15      
16        // Global counter array for each character (a-z)
17        int charCount[26] = {0};
18      
19        // DFS function to traverse the tree and count matching labels
20        function<void(int, int)> dfs = [&](int currentNode, int parentNode) {
21            // Get the character index for current node's label (0-25 for a-z)
22            int charIndex = labels[currentNode] - 'a';
23          
24            // Store the count of this character before processing subtree
25            // Subtract to get the initial count before visiting this subtree
26            result[currentNode] -= charCount[charIndex];
27          
28            // Increment the count for current node's character
29            charCount[charIndex]++;
30          
31            // Recursively visit all children (except parent to avoid cycles)
32            for (int& childNode : adjacencyList[currentNode]) {
33                if (childNode != parentNode) {
34                    dfs(childNode, currentNode);
35                }
36            }
37          
38            // After visiting all children, add the final count to get
39            // the total occurrences of this character in the subtree
40            result[currentNode] += charCount[charIndex];
41        };
42      
43        // Start DFS from node 0 with no parent (-1)
44        dfs(0, -1);
45      
46        return result;
47    }
48};
49
1/**
2 * Counts the number of nodes in each subtree that have the same label as the subtree's root
3 * @param n - Number of nodes in the tree
4 * @param edges - Array of edges representing the tree structure
5 * @param labels - String where labels[i] is the label of node i
6 * @returns Array where result[i] is the count of nodes with same label as node i in its subtree
7 */
8function countSubTrees(n: number, edges: number[][], labels: string): number[] {
9    // Initialize result array to store counts for each node
10    const result: number[] = new Array(n).fill(0);
11  
12    // Array to track count of each character (a-z) during DFS traversal
13    const characterCount: number[] = new Array(26).fill(0);
14  
15    // Build adjacency list representation of the tree
16    const adjacencyList: number[][] = Array.from({ length: n }, () => []);
17    for (const [nodeA, nodeB] of edges) {
18        adjacencyList[nodeA].push(nodeB);
19        adjacencyList[nodeB].push(nodeA);
20    }
21  
22    /**
23     * Performs depth-first search to count labels in subtrees
24     * @param currentNode - Current node being visited
25     * @param parentNode - Parent of the current node (-1 for root)
26     */
27    const dfs = (currentNode: number, parentNode: number): void => {
28        // Get the character index (0-25) for current node's label
29        const charIndex: number = labels.charCodeAt(currentNode) - 97;
30      
31        // Store count before processing subtree (will be subtracted to get subtree count)
32        result[currentNode] -= characterCount[charIndex];
33      
34        // Increment count for current character
35        characterCount[charIndex]++;
36      
37        // Recursively visit all children (excluding parent to avoid cycles)
38        for (const neighbor of adjacencyList[currentNode]) {
39            if (neighbor !== parentNode) {
40                dfs(neighbor, currentNode);
41            }
42        }
43      
44        // Add final count for this character (includes current node and all descendants)
45        result[currentNode] += characterCount[charIndex];
46    };
47  
48    // Start DFS from node 0 with no parent (-1)
49    dfs(0, -1);
50  
51    return result;
52}
53

Time and Space Complexity

Time Complexity: O(n)

The algorithm performs a depth-first search (DFS) traversal of the tree. During the traversal:

  • Each node is visited exactly once in the DFS
  • At each node, we perform constant time operations: updating the counter cnt[labels[i]] and updating ans[i]
  • The adjacency list iteration at each node visits each edge twice (once from each endpoint) across the entire traversal

Since the tree has n nodes and n-1 edges, the total number of operations is proportional to n. Therefore, the time complexity is O(n).

Space Complexity: O(n)

The space complexity consists of several components:

  • The adjacency list g stores all edges, requiring O(n) space (since there are n-1 edges in a tree, and each edge is stored twice)
  • The counter cnt can store at most 26 entries (one for each lowercase letter), which is O(1) in terms of the alphabet size, but O(min(n, 26)) = O(1) for practical purposes
  • The answer array ans stores n elements, requiring O(n) space
  • The recursive call stack for DFS can go as deep as the height of the tree, which in the worst case (skewed tree) is O(n)

Therefore, the overall space complexity is O(n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Forgetting the Tree is Undirected

Pitfall: A common mistake is treating the edges as directed and only adding them in one direction to the adjacency list. This would prevent proper traversal of the tree.

# WRONG - treats edges as directed
g = defaultdict(list)
for a, b in edges:
    g[a].append(b)  # Only adds edge from a to b

Solution: Always add edges in both directions for undirected trees:

# CORRECT - treats edges as undirected
g = defaultdict(list)
for a, b in edges:
    g[a].append(b)
    g[b].append(a)

2. Not Tracking the Parent Node During DFS

Pitfall: Without tracking the parent, the DFS will revisit nodes infinitely in an undirected graph, causing stack overflow or infinite recursion.

# WRONG - no parent tracking
def dfs(i):
    ans[i] -= cnt[labels[i]]
    cnt[labels[i]] += 1
    for j in g[i]:
        dfs(j)  # Will revisit parent node!
    ans[i] += cnt[labels[i]]

Solution: Pass and check the parent node to avoid revisiting:

# CORRECT - tracks parent to avoid cycles
def dfs(i, fa):
    ans[i] -= cnt[labels[i]]
    cnt[labels[i]] += 1
    for j in g[i]:
        if j != fa:  # Skip parent node
            dfs(j, i)
    ans[i] += cnt[labels[i]]

3. Using Local Counter Instead of Global

Pitfall: Creating a new counter for each recursive call or trying to merge counters from children breaks the algorithm's logic.

# WRONG - creates new counter for each call
def dfs(i, fa):
    cnt = Counter()  # Local counter - won't accumulate properly
    cnt[labels[i]] += 1
    # ... rest of the function

Solution: Use a single global counter that accumulates counts throughout the entire traversal:

# CORRECT - uses global counter
cnt = Counter()  # Global counter defined outside DFS

def dfs(i, fa):
    ans[i] -= cnt[labels[i]]  # Works with global counter
    cnt[labels[i]] += 1
    # ... rest of the function

4. Misunderstanding the Subtraction-Addition Pattern

Pitfall: Attempting to directly count nodes in the subtree without using the before/after difference technique, leading to incorrect counts or complex logic.

# WRONG - tries to count directly without difference technique
def dfs(i, fa):
    count = 1 if labels[i] == target_label else 0
    for j in g[i]:
        if j != fa:
            count += dfs(j, i)  # Complex return value handling
    ans[i] = count

Solution: Stick to the subtraction-before, addition-after pattern:

# CORRECT - uses difference technique
def dfs(i, fa):
    ans[i] -= cnt[labels[i]]  # Subtract before
    cnt[labels[i]] += 1
    for j in g[i]:
        if j != fa:
            dfs(j, i)
    ans[i] += cnt[labels[i]]  # Add after

5. Incorrect Initialization of Result Array

Pitfall: Initializing the result array with wrong values or size can lead to index errors or incorrect results.

# WRONG - incorrect size
ans = [0] * len(edges)  # Should be n, not len(edges)

# WRONG - non-zero initialization interferes with algorithm
ans = [1] * n  # Should start with zeros

Solution: Initialize with zeros and correct size:

# CORRECT
ans = [0] * n  # n zeros, one for each node
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

How many times is a tree node visited in a depth first search?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More