3004. Maximum Subtree of the Same Color 🔒
Problem Description
You are given a tree with n
nodes numbered from 0
to n - 1
, rooted at node 0
. The tree structure is represented by a 2D integer array edges
, where edges[i] = [ui, vi]
indicates there is an edge between nodes ui
and vi
.
Each node has a color assigned to it, given in a 0-indexed integer array colors
of size n
, where colors[i]
represents the color of node i
.
Your task is to find a node v
such that every node in the subtree rooted at v
(including v
itself) has the same color. Among all such valid subtrees, you need to return the size (number of nodes) of the largest one.
In other words, you need to:
- Identify all subtrees where all nodes share the same color
- Return the maximum number of nodes among these valid subtrees
For example, if a subtree rooted at node v
contains nodes {v, a, b, c}
and all four nodes have the same color, this would be a valid subtree of size 4. You need to find the largest such valid subtree in the entire tree.
Flowchart Walkthrough
First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:
Is it a graph?
- Yes: The problem explicitly mentions a tree structure with nodes and edges. A tree is a special type of graph (connected and acyclic).
Is it a tree?
- Yes: The problem clearly states we have a tree with
n
nodes numbered from0
ton-1
, rooted at node0
. The edges array represents the tree structure whereedges[i] = [ui, vi]
indicates connections between nodes.
DFS
- According to the flowchart, when we have a tree structure, we should use DFS (Depth-First Search).
Why DFS is appropriate for this problem:
- We need to explore subtrees - DFS naturally traverses subtrees completely before moving to siblings
- We need to check if all nodes in a subtree have the same color - this requires visiting all descendants of a node
- We need to calculate subtree sizes - DFS allows us to aggregate information from children nodes as we backtrack
- The problem requires examining every possible subtree rooted at each node - DFS systematically visits each node and can treat it as a potential subtree root
Conclusion: The flowchart correctly leads us to use DFS for this tree-based problem. DFS allows us to traverse each subtree, check color consistency within subtrees, calculate subtree sizes, and track the maximum valid subtree size - all in a single traversal of the tree.
Intuition
When dealing with tree problems where we need to examine properties of subtrees, DFS is a natural choice because it completely explores one branch before moving to another. This problem asks us to find the largest subtree where all nodes share the same color.
The key insight is that we can check each node as a potential root of a valid same-color subtree. For any node to be the root of a valid subtree:
- All its descendants must have the same color as itself
- Each of its child subtrees must also be valid (all same color)
This suggests a bottom-up approach: start from the leaves and work our way up. Why? Because:
- Leaf nodes trivially form valid single-node subtrees
- For any internal node, we can determine if it forms a valid subtree by checking if all its children have the same color as itself AND if each child's subtree is also valid
- As we traverse back up, we can accumulate the size of each subtree
The DFS traversal naturally gives us this bottom-up processing through recursion. When we visit a node:
- First, we recursively process all its children (going deep)
- Then, we check if the current node and all its children have matching colors
- If yes, we've found a valid subtree - update our answer if it's the largest so far
- Return whether this subtree is valid to help parent nodes make their decision
The beauty of this approach is that we only need one pass through the tree. Each node is visited once, and we make our decision about each potential subtree based on information gathered from its children. The recursive nature of DFS perfectly matches the recursive structure of the tree, making it elegant to check color consistency while simultaneously calculating subtree sizes.
Learn more about Tree, Depth-First Search and Dynamic Programming patterns.
Solution Approach
The implementation follows the DFS approach we identified, with several key components:
Data Structure Setup:
- Build an adjacency list
g
whereg[a]
stores all nodes connected to nodea
. This allows efficient traversal of the tree. - Initialize a
size
array wheresize[a]
tracks the number of nodes in the subtree rooted at nodea
. Initially, each node has size 1 (itself).
DFS Function Design:
The core logic is in the dfs(a, fa)
function, where:
a
is the current node being processedfa
is the parent of nodea
(to avoid revisiting in an undirected tree)- Returns a boolean indicating whether the subtree rooted at
a
has all nodes of the same color
DFS Processing Steps:
-
Initialize tracking variable: Start with
ok = True
, assuming the subtree is valid until proven otherwise. -
Process all children: For each neighbor
b
of nodea
:- Skip if
b
is the parentfa
(avoid going backward in the tree) - Recursively call
dfs(b, a)
to check if childb
's subtree is valid - Update
ok
using the logical AND of three conditions:- Current
ok
status colors[a] == colors[b]
(parent and child have same color)t
(the child's subtree is valid)
- Current
- Accumulate subtree size:
size[a] += size[b]
- Skip if
-
Update answer if valid: If
ok
remainsTrue
after checking all children, we've found a valid same-color subtree. Update the global answer:ans = max(ans, size[a])
-
Return validity: Return
ok
to inform the parent whether this subtree is valid.
Algorithm Execution:
- Start DFS from the root node:
dfs(0, -1)
where-1
indicates no parent - The DFS will explore the entire tree, checking every possible subtree
- The maximum valid subtree size encountered is stored in
ans
Time Complexity: O(n)
where n
is the number of nodes, as each node is visited exactly once.
Space Complexity: O(n)
for the adjacency list, size array, and recursion stack in the worst case (skewed tree).
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a concrete example to illustrate the solution approach.
Consider a tree with 5 nodes and the following structure:
0 (color: 2) / \ 1 2 (color: 2) (color: 2) / \ 3 4 (color: 2) (color: 1)
Given:
edges = [[0,1], [0,2], [2,3], [2,4]]
colors = [2, 2, 2, 2, 1]
Step-by-step DFS execution:
Initial Setup:
- Build adjacency list:
g = {0: [1,2], 1: [0], 2: [0,3,4], 3: [2], 4: [2]}
- Initialize
size = [1, 1, 1, 1, 1]
(each node starts with size 1) ans = 0
(tracks maximum valid subtree size)
DFS Traversal from root (node 0):
- Visit node 0 (parent = -1)
-
Process child 1: Call
dfs(1, 0)
- Node 1 has no children (except parent 0), so
ok = True
- Node 1 forms valid subtree of size 1
- Update
ans = max(0, 1) = 1
- Return
True
to node 0
- Node 1 has no children (except parent 0), so
-
Back at node 0: Check
colors[0] == colors[1]
→2 == 2
✓ -
Update
size[0] = 1 + 1 = 2
-
Process child 2: Call
dfs(2, 0)
-
Process child 3: Call
dfs(3, 2)
- Node 3 has no children, forms valid subtree of size 1
- Update
ans = 1
(no change) - Return
True
to node 2
-
Check
colors[2] == colors[3]
→2 == 2
✓ -
Update
size[2] = 1 + 1 = 2
-
Process child 4: Call
dfs(4, 2)
- Node 4 has no children, forms valid subtree of size 1
- Update
ans = 1
(no change) - Return
True
to node 2
-
Check
colors[2] == colors[4]
→2 == 1
✗ -
ok
becomesFalse
for node 2's subtree -
size[2] = 2 + 1 = 3
(still accumulate size) -
Since
ok = False
, don't updateans
-
Return
False
to node 0
-
-
Back at node 0: Child 2's subtree is not valid (
False
) -
ok
becomesFalse
for node 0's subtree -
size[0] = 2 + 3 = 5
-
Since
ok = False
, don't updateans
-
Valid subtrees found:
- Node 1: size 1, all nodes have color 2 ✓
- Node 3: size 1, all nodes have color 2 ✓
- Node 4: size 1, all nodes have color 1 ✓
- Node 2's subtree: size 3, but mixed colors (2 and 1) ✗
- Node 0's subtree: size 5, but mixed colors (2 and 1) ✗
Result: The maximum valid subtree size is 1.
If we changed node 4's color to 2, then:
- Node 2's subtree would be valid with size 3 (nodes 2, 3, 4 all color 2)
- Node 0's subtree would be valid with size 5 (all nodes color 2)
- The answer would be 5
This example demonstrates how the algorithm:
- Traverses the tree depth-first
- Checks color consistency between parent and children
- Propagates validity information upward
- Tracks the maximum size of valid subtrees encountered
Solution Implementation
1class Solution:
2 def maximumSubtreeSize(self, edges: List[List[int]], colors: List[int]) -> int:
3 # Build adjacency list representation of the tree
4 num_nodes = len(edges) + 1
5 adjacency_list = [[] for _ in range(num_nodes)]
6
7 # Initialize subtree sizes (each node starts with size 1)
8 subtree_sizes = [1] * num_nodes
9
10 # Build bidirectional edges
11 for node_a, node_b in edges:
12 adjacency_list[node_a].append(node_b)
13 adjacency_list[node_b].append(node_a)
14
15 # Track the maximum valid subtree size found
16 max_subtree_size = 0
17
18 def dfs(current_node: int, parent_node: int) -> bool:
19 """
20 Performs DFS to check if subtree rooted at current_node is valid
21 (all nodes have same color) and calculates subtree sizes.
22
23 Args:
24 current_node: Current node being processed
25 parent_node: Parent of current node (-1 if root)
26
27 Returns:
28 True if the subtree rooted at current_node has all nodes
29 with the same color, False otherwise
30 """
31 nonlocal max_subtree_size
32
33 # Track if current subtree is valid (all same color)
34 is_valid_subtree = True
35
36 # Explore all children (neighbors except parent)
37 for neighbor in adjacency_list[current_node]:
38 if neighbor != parent_node:
39 # Recursively check child subtree
40 child_is_valid = dfs(neighbor, current_node)
41
42 # Check if child has same color and its subtree is valid
43 is_valid_subtree = (is_valid_subtree and
44 colors[current_node] == colors[neighbor] and
45 child_is_valid)
46
47 # Add child's subtree size to current node's subtree size
48 subtree_sizes[current_node] += subtree_sizes[neighbor]
49
50 # If current subtree is valid, update maximum size
51 if is_valid_subtree:
52 max_subtree_size = max(max_subtree_size, subtree_sizes[current_node])
53
54 return is_valid_subtree
55
56 # Start DFS from node 0 as root (assuming 0-indexed tree)
57 dfs(0, -1)
58
59 return max_subtree_size
60
1class Solution {
2 // Adjacency list representation of the tree
3 private List<Integer>[] adjacencyList;
4
5 // Array storing color of each node
6 private int[] nodeColors;
7
8 // Array storing size of subtree rooted at each node
9 private int[] subtreeSize;
10
11 // Maximum size of a valid subtree found so far
12 private int maxSubtreeSize;
13
14 public int maximumSubtreeSize(int[][] edges, int[] colors) {
15 int nodeCount = edges.length + 1;
16
17 // Initialize data structures
18 adjacencyList = new List[nodeCount];
19 subtreeSize = new int[nodeCount];
20 this.nodeColors = colors;
21
22 // Initialize all subtree sizes to 1 (each node counts itself)
23 Arrays.fill(subtreeSize, 1);
24
25 // Create empty adjacency lists for each node
26 Arrays.setAll(adjacencyList, i -> new ArrayList<>());
27
28 // Build the undirected tree from edges
29 for (int[] edge : edges) {
30 int nodeA = edge[0];
31 int nodeB = edge[1];
32 adjacencyList[nodeA].add(nodeB);
33 adjacencyList[nodeB].add(nodeA);
34 }
35
36 // Start DFS from node 0 with no parent (-1)
37 dfs(0, -1);
38
39 return maxSubtreeSize;
40 }
41
42 /**
43 * Performs DFS to find the maximum size of a subtree where all nodes have the same color
44 *
45 * @param currentNode The current node being processed
46 * @param parentNode The parent of the current node (-1 if root)
47 * @return true if the subtree rooted at currentNode has all nodes of the same color
48 */
49 private boolean dfs(int currentNode, int parentNode) {
50 // Flag to track if all nodes in current subtree have same color
51 boolean isValidSubtree = true;
52
53 // Process all children of current node
54 for (int childNode : adjacencyList[currentNode]) {
55 // Skip the parent to avoid revisiting
56 if (childNode != parentNode) {
57 // Recursively check child subtree
58 boolean isChildSubtreeValid = dfs(childNode, currentNode);
59
60 // Current subtree is valid only if:
61 // 1. It was valid so far (isValidSubtree)
62 // 2. Current node and child have same color
63 // 3. Child's subtree is also valid
64 isValidSubtree = isValidSubtree &&
65 nodeColors[currentNode] == nodeColors[childNode] &&
66 isChildSubtreeValid;
67
68 // Add child's subtree size to current node's subtree size
69 subtreeSize[currentNode] += subtreeSize[childNode];
70 }
71 }
72
73 // If current subtree has all nodes of same color, update maximum
74 if (isValidSubtree) {
75 maxSubtreeSize = Math.max(maxSubtreeSize, subtreeSize[currentNode]);
76 }
77
78 return isValidSubtree;
79 }
80}
81
1class Solution {
2public:
3 int maximumSubtreeSize(vector<vector<int>>& edges, vector<int>& colors) {
4 // Calculate the total number of nodes in the tree
5 int nodeCount = edges.size() + 1;
6
7 // Build adjacency list representation of the tree
8 vector<vector<int>> adjacencyList(nodeCount);
9
10 // Track the size of subtree rooted at each node
11 vector<int> subtreeSize(nodeCount, 1);
12
13 // Construct the bidirectional graph from edges
14 for (const auto& edge : edges) {
15 int nodeA = edge[0];
16 int nodeB = edge[1];
17 adjacencyList[nodeA].push_back(nodeB);
18 adjacencyList[nodeB].push_back(nodeA);
19 }
20
21 // Track the maximum size of valid subtree found
22 int maxSubtreeSize = 0;
23
24 // DFS function to check if subtree rooted at current node has all same colors
25 // Returns true if the subtree rooted at 'currentNode' has all nodes with same color
26 function<bool(int, int)> dfs = [&](int currentNode, int parentNode) -> bool {
27 // Flag to track if current subtree is valid (all nodes have same color)
28 bool isValidSubtree = true;
29
30 // Traverse all children of current node
31 for (int childNode : adjacencyList[currentNode]) {
32 // Skip the parent to avoid revisiting
33 if (childNode != parentNode) {
34 // Recursively check if child's subtree is valid
35 bool isChildSubtreeValid = dfs(childNode, currentNode);
36
37 // Current subtree is valid only if:
38 // 1. Current node and child have same color
39 // 2. Child's subtree is also valid
40 // 3. Previous children were also valid
41 isValidSubtree = isValidSubtree &&
42 (colors[currentNode] == colors[childNode]) &&
43 isChildSubtreeValid;
44
45 // Add child's subtree size to current node's subtree size
46 subtreeSize[currentNode] += subtreeSize[childNode];
47 }
48 }
49
50 // If current subtree is valid, update the maximum size
51 if (isValidSubtree) {
52 maxSubtreeSize = max(maxSubtreeSize, subtreeSize[currentNode]);
53 }
54
55 return isValidSubtree;
56 };
57
58 // Start DFS from node 0 with no parent (-1)
59 dfs(0, -1);
60
61 return maxSubtreeSize;
62 }
63};
64
1/**
2 * Finds the maximum size of a subtree where all nodes have the same color
3 * @param edges - Array of edges representing the tree structure
4 * @param colors - Array of colors for each node
5 * @returns The maximum size of a valid subtree
6 */
7function maximumSubtreeSize(edges: number[][], colors: number[]): number {
8 // Calculate total number of nodes (edges + 1 for a tree)
9 const nodeCount: number = edges.length + 1;
10
11 // Build adjacency list representation of the tree
12 const adjacencyList: number[][] = Array.from({ length: nodeCount }, () => []);
13 for (const [nodeA, nodeB] of edges) {
14 adjacencyList[nodeA].push(nodeB);
15 adjacencyList[nodeB].push(nodeA);
16 }
17
18 // Track the size of each subtree rooted at each node
19 const subtreeSize: number[] = Array(nodeCount).fill(1);
20
21 // Track the maximum valid subtree size found
22 let maxSubtreeSize: number = 0;
23
24 /**
25 * Performs depth-first search to find valid subtrees
26 * @param currentNode - Current node being processed
27 * @param parentNode - Parent of the current node (-1 for root)
28 * @returns True if the subtree rooted at currentNode has all nodes with same color
29 */
30 const dfs = (currentNode: number, parentNode: number): boolean => {
31 // Assume the current subtree is valid initially
32 let isValidSubtree: boolean = true;
33
34 // Explore all neighbors (children in the tree)
35 for (const childNode of adjacencyList[currentNode]) {
36 // Skip the parent to avoid revisiting
37 if (childNode !== parentNode) {
38 // Recursively check if child subtree is valid
39 const isChildSubtreeValid: boolean = dfs(childNode, currentNode);
40
41 // Current subtree is valid only if:
42 // 1. Child subtree is valid
43 // 2. Child has the same color as current node
44 isValidSubtree = isValidSubtree && isChildSubtreeValid && colors[currentNode] === colors[childNode];
45
46 // Add child subtree size to current subtree
47 subtreeSize[currentNode] += subtreeSize[childNode];
48 }
49 }
50
51 // If current subtree is valid, update the maximum size
52 if (isValidSubtree) {
53 maxSubtreeSize = Math.max(maxSubtreeSize, subtreeSize[currentNode]);
54 }
55
56 return isValidSubtree;
57 };
58
59 // Start DFS from node 0 with no parent (-1)
60 dfs(0, -1);
61
62 return maxSubtreeSize;
63}
64
Time and Space Complexity
Time Complexity: O(n)
The algorithm performs a depth-first search (DFS) traversal of the tree. Each node is visited exactly once during the DFS traversal. For each node, the algorithm:
- Iterates through all its adjacent nodes (children in the tree)
- Performs constant-time operations (comparisons and updates)
Since the input represents a tree with n
nodes and n-1
edges, the total number of edges traversed is 2(n-1)
(each edge is traversed twice, once from each direction). Therefore, the overall time complexity is O(n)
.
Space Complexity: O(n)
The space complexity consists of:
- The adjacency list
g
:O(n)
space to store all nodes and their connections - The
size
array:O(n)
space to store the size of each subtree - The recursion call stack:
O(h)
whereh
is the height of the tree, which in the worst case (skewed tree) can beO(n)
The dominant factor is O(n)
, making the overall space complexity O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrectly Handling Tree Structure Assumptions
Pitfall: The code assumes the tree is always rooted at node 0, but the problem statement might not guarantee this. Additionally, the code assumes nodes are numbered from 0 to n-1 consecutively, which could fail if there are gaps in node numbering.
Solution: Validate the tree structure and handle edge cases:
def maximumSubtreeSize(self, edges: List[List[int]], colors: List[int]) -> int:
if not edges:
return 1 # Single node tree
# Find all unique nodes to handle non-consecutive numbering
nodes = set()
for u, v in edges:
nodes.add(u)
nodes.add(v)
num_nodes = len(nodes)
# Verify we have the right number of colors
if len(colors) != num_nodes:
raise ValueError("Mismatch between number of nodes and colors")
# Continue with adjacency list building...
2. Short-Circuit Evaluation Breaking Subtree Size Calculation
Pitfall: Using short-circuit evaluation incorrectly when checking validity could prevent proper subtree size accumulation:
# WRONG: This might skip size calculation if is_valid_subtree becomes False if is_valid_subtree: child_is_valid = dfs(neighbor, current_node) is_valid_subtree = colors[current_node] == colors[neighbor] and child_is_valid subtree_sizes[current_node] += subtree_sizes[neighbor]
Solution: Always perform DFS and size calculation, then update validity:
# CORRECT: Always calculate sizes regardless of validity child_is_valid = dfs(neighbor, current_node) subtree_sizes[current_node] += subtree_sizes[neighbor] # Always accumulate size is_valid_subtree = (is_valid_subtree and colors[current_node] == colors[neighbor] and child_is_valid)
3. Not Considering Disconnected Components
Pitfall: The code assumes a connected tree, but if the input has disconnected components or is a forest, only the component containing node 0 will be processed.
Solution: Check for connectivity or process all components:
def maximumSubtreeSize(self, edges: List[List[int]], colors: List[int]) -> int:
num_nodes = len(colors) # Use colors length as node count
if num_nodes == 0:
return 0
if num_nodes == 1:
return 1
# Check if we have a valid tree (n-1 edges for n nodes)
if len(edges) != num_nodes - 1:
# Handle forest case - process each component separately
visited = [False] * num_nodes
max_subtree_size = 0
for start_node in range(num_nodes):
if not visited[start_node]:
component_max = self.process_component(start_node, visited, ...)
max_subtree_size = max(max_subtree_size, component_max)
return max_subtree_size
4. Integer Overflow in Large Trees
Pitfall: While Python handles arbitrary precision integers, in other languages or when porting this solution, integer overflow could occur when accumulating subtree sizes.
Solution: Use appropriate data types and add bounds checking:
# Add validation for tree size if num_nodes > 10**5: # Typical constraint raise ValueError("Tree too large") # In languages like Java/C++, use long/long long for size arrays
5. Forgetting to Handle Single Node Trees
Pitfall: When edges
is empty (single node tree), the code might fail or return 0 instead of 1.
Solution: Add explicit handling for edge cases:
def maximumSubtreeSize(self, edges: List[List[int]], colors: List[int]) -> int:
# Handle single node case
if not edges and len(colors) == 1:
return 1
if not edges or not colors:
return 0
# Rest of the implementation...
How many ways can you arrange the three letters A, B and C?
Recommended Readings
Everything About Trees A tree is a type of graph data structure composed of nodes and edges Its main properties are It is acyclic doesn't contain any cycles There exists a path from the root to any node Has N 1 edges where N is the number of nodes in the tree and
https assets algo monster cover_photos dfs svg Depth First Search Prereqs Recursion Review problems recursion_intro Trees problems tree_intro With a solid understanding of recursion under our belts we are now ready to tackle one of the most useful techniques in coding interviews Depth First Search DFS As the name suggests
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Want a Structured Path to Master System Design Too? Don’t Miss This!