2003. Smallest Missing Genetic Value in Each Subtree
Problem Description
You are given a family tree represented as a rooted tree with n
nodes numbered from 0
to n - 1
, where node 0
is the root. The tree structure is defined by an array parents
where parents[i]
indicates the parent of node i
. Since node 0
is the root, parents[0] == -1
.
Each node in the tree has a unique genetic value from the range [1, 10^5]
. These genetic values are given in an array nums
where nums[i]
represents the genetic value of node i
. All genetic values in nums
are distinct.
Your task is to find, for each node i
, the smallest positive genetic value that is missing from the subtree rooted at that node. A subtree rooted at node x
includes node x
itself and all of its descendants.
Return an array ans
of length n
where ans[i]
is the smallest missing genetic value from the subtree rooted at node i
.
For example, if a subtree contains nodes with genetic values [1, 2, 4, 5]
, then the smallest missing value would be 3
. If a subtree contains genetic values [2, 3, 4]
, then the smallest missing value would be 1
.
Flowchart Walkthrough
First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:
Is it a graph?
- Yes: The problem involves a tree structure where nodes are connected through parent-child relationships. A tree is a special type of graph (specifically, a connected acyclic graph).
Is it a tree?
- Yes: The problem explicitly states we have a family tree rooted at node
0
, with parent-child relationships defined by theparents
array. Each node (except the root) has exactly one parent, which confirms this is a tree structure.
DFS
- According to the flowchart, when we have a tree structure, the natural algorithm choice is DFS (Depth-First Search).
Conclusion: The flowchart suggests using DFS for this problem. This makes sense because:
- We need to explore subtrees (which requires traversing from each node to all its descendants)
- For each node, we need to collect information about all genetic values present in its subtree
- DFS is ideal for tree traversal where we need to process complete subtrees before returning results to parent nodes
- The solution requires us to traverse from specific nodes (like the node with genetic value 1) upward to the root, collecting subtree information along the way
The DFS pattern is particularly effective here as we can:
- Traverse the subtree rooted at each node to identify which genetic values are present
- Mark visited nodes to avoid redundant processing
- Efficiently find the smallest missing genetic value by tracking which values appear in each subtree
Intuition
The key insight is that since all genetic values are unique and we're looking for the smallest missing value, most nodes will have an answer of 1
- unless their subtree contains the genetic value 1
.
Think about it: if a subtree doesn't contain genetic value 1
, then 1
is the smallest missing value for that subtree. This means we only need to do complex calculations for nodes whose subtrees contain 1
.
So where is genetic value 1
? Since each value appears exactly once, there's only one node with genetic value 1
. Let's call this node idx
. Now, only the nodes on the path from idx
to the root can possibly have subtrees containing 1
(because idx
is in their subtree). All other nodes will have answer 1
.
For the nodes on the path from idx
to root, we need to find the actual smallest missing value. Here's where it gets clever:
- Start at node
idx
and use DFS to mark all genetic values present in its subtree - Find the smallest positive integer not in this set - this is the answer for node
idx
- Move to the parent of
idx
and repeat
The brilliant optimization is that as we move up the tree, we don't need to restart from scratch. Each parent's subtree contains all the genetic values from its children's subtrees plus potentially more from other branches. So we can keep accumulating the genetic values we've seen and continue our search for the smallest missing value from where we left off.
For example, if node idx
has smallest missing value 5
, then its parent's smallest missing value must be at least 5
(since the parent's subtree contains everything idx
's subtree has). This means we can start checking from 5
instead of from 1
again.
This approach is efficient because:
- We only do expensive DFS traversals for nodes on one path (from the node with value
1
to root) - We reuse information as we move up the tree
- The smallest missing value can only increase as we go up (never decrease)
Learn more about Tree, Depth-First Search, Union Find and Dynamic Programming patterns.
Solution Approach
The implementation follows the intuition by focusing on the path from the node containing genetic value 1
to the root:
1. Initialize Data Structures:
- Create an answer array
ans
initialized with all1
s (since most nodes will have answer1
) - Build an adjacency list
g
to represent the tree structure from the parent array - Find the node
idx
that has genetic value1
(if no such node exists, all answers remain1
)
2. Set Up Tracking Arrays:
vis[i]
: Boolean array to track visited nodes during DFS to avoid revisitinghas[i]
: Boolean array wherehas[i] = True
means genetic valuei
exists in the current subtree being explored- Initialize a counter
i = 2
to track the next potential missing value
3. DFS Function:
def dfs(i: int):
if vis[i]:
return
vis[i] = True
if nums[i] < len(has):
has[nums[i]] = True
for j in g[i]:
dfs(j)
This function marks node i
as visited, records its genetic value in has
, and recursively visits all children.
4. Main Algorithm:
Starting from node idx
(the node with genetic value 1
), traverse up to the root:
while idx != -1: dfs(idx) # Mark all genetic values in subtree rooted at idx while has[i]: # Find smallest missing value starting from i i += 1 ans[idx] = i # Set the answer for current node idx = parents[idx] # Move to parent
Key Optimizations:
- The variable
i
is never reset - it only increases. This works because as we move up the tree, the smallest missing value can only stay the same or increase (parent subtrees contain all values from child subtrees) - We only visit each node once across all DFS calls due to the
vis
array - The
has
array accumulates genetic values as we move up, so we don't lose information from previous subtrees
Time Complexity: O(n) where n is the number of nodes, since each node is visited at most once during all DFS traversals combined.
Space Complexity: O(n) for the adjacency list, visited array, and has array.
The algorithm efficiently solves the problem by recognizing that only nodes on one specific path (from the node with value 1
to root) need detailed computation, while all other nodes have a trivial answer of 1
.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach.
Example Input:
parents = [-1, 0, 0, 1, 1, 2]
nums = [5, 4, 1, 3, 6, 2]
This creates the following tree:
0 (val=5) / \ 1 2 / \ \ 3 4 5 (val=4)(val=3)(val=6)(val=2)
Step 1: Initialize
ans = [1, 1, 1, 1, 1, 1]
(all nodes start with answer 1)- Build adjacency list:
g = {0: [1,2], 1: [3,4], 2: [5], 3: [], 4: [], 5: []}
- Find node with value 1:
idx = 2
(sincenums[2] = 1
)
Step 2: Process path from node 2 to root
We'll traverse: node 2 → node 0 (root)
Processing Node 2:
- Start DFS from node 2
- Visit node 2: mark
has[1] = True
(value 1) - Visit child node 5: mark
has[2] = True
(value 2) - Current
has = [False, True, True, False, False, ...]
- Start checking from
i = 2
:has[2]
is True, increment to 3 - Check
i = 3
:has[3]
is False ans[2] = 3
(smallest missing value is 3)- Move to parent:
idx = 0
Processing Node 0 (root):
- Continue DFS from node 0 (accumulating previous values)
- Visit node 0: mark
has[5] = True
(value 5) - Visit child node 1: mark
has[4] = True
(value 4) - Visit grandchild node 3: mark
has[3] = True
(value 3) - Visit grandchild node 4: mark
has[6] = True
(value 6) - Current
has = [False, True, True, True, True, True, True, ...]
- Continue checking from
i = 3
: all of 3, 4, 5, 6 are True - Check
i = 7
:has[7]
is False ans[0] = 7
(smallest missing value is 7)- Move to parent:
idx = -1
(done, as 0 is the root)
Final Result:
ans = [7, 1, 3, 1, 1, 1]
Verification:
- Node 0's subtree has values [5,4,1,3,6,2] → missing 7 ✓
- Node 1's subtree has values [4,3,6] → missing 1 ✓
- Node 2's subtree has values [1,2] → missing 3 ✓
- Node 3's subtree has values [3] → missing 1 ✓
- Node 4's subtree has values [6] → missing 1 ✓
- Node 5's subtree has values [2] → missing 1 ✓
The key insight illustrated here: only nodes 2 and 0 (on the path from the node with value 1 to root) needed detailed computation. All other nodes simply have answer 1 because their subtrees don't contain the value 1.
Solution Implementation
1class Solution:
2 def smallestMissingValueSubtree(
3 self, parents: List[int], nums: List[int]
4 ) -> List[int]:
5 """
6 Find the smallest missing positive integer in each subtree.
7
8 Args:
9 parents: Parent array where parents[i] is the parent of node i
10 nums: Array where nums[i] is the genetic value at node i
11
12 Returns:
13 Array where result[i] is the smallest missing value in subtree rooted at i
14 """
15
16 def mark_subtree_values(node: int) -> None:
17 """
18 DFS to mark all genetic values present in the subtree rooted at 'node'.
19
20 Args:
21 node: Current node to process
22 """
23 # Skip if already visited
24 if visited[node]:
25 return
26
27 # Mark node as visited
28 visited[node] = True
29
30 # Mark the genetic value of current node as present
31 if nums[node] < len(value_exists):
32 value_exists[nums[node]] = True
33
34 # Recursively process all children
35 for child in children[node]:
36 mark_subtree_values(child)
37
38 n = len(nums)
39 # Initialize result array - default is 1 (smallest missing value)
40 result = [1] * n
41
42 # Build adjacency list for tree (parent -> children)
43 children = [[] for _ in range(n)]
44 node_with_value_one = -1
45
46 for node_id, parent_id in enumerate(parents):
47 # Skip root node (parent is -1)
48 if node_id > 0:
49 children[parent_id].append(node_id)
50 # Find the node containing genetic value 1
51 if nums[node_id] == 1:
52 node_with_value_one = node_id
53
54 # If no node has value 1, all subtrees miss value 1
55 if node_with_value_one == -1:
56 return result
57
58 # Track visited nodes during DFS
59 visited = [False] * n
60 # Track which genetic values exist in current path
61 value_exists = [False] * (n + 2)
62
63 # Start checking from value 2 (since we know 1 exists)
64 current_missing = 2
65 current_node = node_with_value_one
66
67 # Process path from node with value 1 to root
68 while current_node != -1:
69 # Mark all values in subtree rooted at current_node
70 mark_subtree_values(current_node)
71
72 # Find the next missing value
73 while value_exists[current_missing]:
74 current_missing += 1
75
76 # Set result for current node
77 result[current_node] = current_missing
78
79 # Move to parent node
80 current_node = parents[current_node]
81
82 return result
83
1class Solution {
2 // Graph adjacency list where g[i] contains all children of node i
3 private List<Integer>[] graph;
4 // Tracks whether a node has been visited during DFS
5 private boolean[] visited;
6 // Tracks which values (from node values) have been seen in current subtree
7 private boolean[] hasValue;
8 // Original node values array
9 private int[] nodeValues;
10
11 public int[] smallestMissingValueSubtree(int[] parents, int[] nums) {
12 int n = nums.length;
13 this.nodeValues = nums;
14
15 // Initialize data structures
16 graph = new List[n];
17 visited = new boolean[n];
18 hasValue = new boolean[n + 2]; // Extra space to handle boundary cases
19
20 // Create adjacency list for the tree
21 Arrays.setAll(graph, i -> new ArrayList<>());
22
23 // Find the node with value 1 (if exists) and build the graph
24 int nodeWithOne = -1;
25 for (int i = 0; i < n; ++i) {
26 // Add edge from parent to child (skip root node at index 0)
27 if (i > 0) {
28 graph[parents[i]].add(i);
29 }
30 // Track which node has value 1
31 if (nums[i] == 1) {
32 nodeWithOne = i;
33 }
34 }
35
36 // Initialize answer array with 1 (default smallest missing value)
37 int[] answer = new int[n];
38 Arrays.fill(answer, 1);
39
40 // If no node has value 1, all subtrees have smallest missing value = 1
41 if (nodeWithOne == -1) {
42 return answer;
43 }
44
45 // Process path from node with value 1 to root
46 // Only nodes on this path can have smallest missing value > 1
47 int currentMissing = 2;
48 for (int currentNode = nodeWithOne; currentNode != -1; currentNode = parents[currentNode]) {
49 // Mark all values in the subtree rooted at currentNode
50 dfs(currentNode);
51
52 // Find the smallest missing positive integer
53 while (hasValue[currentMissing]) {
54 currentMissing++;
55 }
56
57 // Set the answer for current node
58 answer[currentNode] = currentMissing;
59 }
60
61 return answer;
62 }
63
64 /**
65 * Depth-first search to mark all values present in the subtree rooted at node i
66 * @param nodeIndex The current node being processed
67 */
68 private void dfs(int nodeIndex) {
69 // Skip if already visited (avoid reprocessing)
70 if (visited[nodeIndex]) {
71 return;
72 }
73
74 // Mark node as visited
75 visited[nodeIndex] = true;
76
77 // Mark the value at this node as present (if within bounds)
78 if (nodeValues[nodeIndex] < hasValue.length) {
79 hasValue[nodeValues[nodeIndex]] = true;
80 }
81
82 // Recursively process all children
83 for (int child : graph[nodeIndex]) {
84 dfs(child);
85 }
86 }
87}
88
1class Solution {
2public:
3 vector<int> smallestMissingValueSubtree(vector<int>& parents, vector<int>& nums) {
4 int n = nums.size();
5
6 // Build adjacency list for the tree (children of each node)
7 vector<vector<int>> children(n);
8 for (int i = 1; i < n; ++i) {
9 children[parents[i]].push_back(i);
10 }
11
12 // Find the node with value 1 (if it exists)
13 int nodeWithOne = -1;
14 for (int i = 0; i < n; ++i) {
15 if (nums[i] == 1) {
16 nodeWithOne = i;
17 break;
18 }
19 }
20
21 // Initialize result array with 1 (default smallest missing value)
22 vector<int> result(n, 1);
23
24 // If there's no node with value 1, all subtrees miss 1
25 if (nodeWithOne == -1) {
26 return result;
27 }
28
29 // Track visited nodes and present values in explored subtrees
30 vector<bool> visited(n, false);
31 vector<bool> valuePresent(n + 2, false);
32
33 // DFS to mark all values present in a subtree rooted at node
34 function<void(int)> dfs = [&](int node) {
35 if (visited[node]) {
36 return;
37 }
38
39 visited[node] = true;
40
41 // Mark the value at this node as present
42 if (nums[node] <= n + 1) {
43 valuePresent[nums[node]] = true;
44 }
45
46 // Recursively visit all children
47 for (int child : children[node]) {
48 dfs(child);
49 }
50 };
51
52 // Process path from node with value 1 to root
53 int currentMissing = 2; // Start checking from 2 since 1 is present
54 for (int currentNode = nodeWithOne; currentNode != -1; currentNode = parents[currentNode]) {
55 // Explore subtree rooted at currentNode
56 dfs(currentNode);
57
58 // Find the smallest missing positive integer
59 while (valuePresent[currentMissing]) {
60 ++currentMissing;
61 }
62
63 result[currentNode] = currentMissing;
64 }
65
66 return result;
67 }
68};
69
1/**
2 * Finds the smallest missing value in each subtree of a tree
3 * @param parents - Array where parents[i] is the parent of node i (parents[0] = -1 for root)
4 * @param nums - Array where nums[i] is the value at node i
5 * @returns Array where result[i] is the smallest missing positive integer in subtree rooted at i
6 */
7function smallestMissingValueSubtree(parents: number[], nums: number[]): number[] {
8 const nodeCount: number = nums.length;
9
10 // Build adjacency list representation of the tree
11 const adjacencyList: number[][] = Array.from({ length: nodeCount }, () => []);
12
13 // Track visited nodes during DFS traversal
14 const visited: boolean[] = Array(nodeCount).fill(false);
15
16 // Track which values have been seen in the current path
17 const valueExists: boolean[] = Array(nodeCount + 2).fill(false);
18
19 // Initialize result array with 1 (default smallest missing value)
20 const result: number[] = Array(nodeCount).fill(1);
21
22 // Find the node containing value 1 (if exists)
23 let nodeWithOne: number = -1;
24
25 // Build the tree and find node with value 1
26 for (let i = 0; i < nodeCount; ++i) {
27 // Add edge from parent to child (skip root node)
28 if (i !== 0) {
29 adjacencyList[parents[i]].push(i);
30 }
31
32 // Track the node containing value 1
33 if (nums[i] === 1) {
34 nodeWithOne = i;
35 }
36 }
37
38 // If no node contains 1, all subtrees have smallest missing value = 1
39 if (nodeWithOne === -1) {
40 return result;
41 }
42
43 /**
44 * Performs DFS to mark all values in the subtree as existing
45 * @param currentNode - Current node being processed
46 */
47 const dfs = (currentNode: number): void => {
48 // Skip if already visited
49 if (visited[currentNode]) {
50 return;
51 }
52
53 // Mark current node as visited
54 visited[currentNode] = true;
55
56 // Mark the value at current node as existing
57 if (nums[currentNode] < valueExists.length) {
58 valueExists[nums[currentNode]] = true;
59 }
60
61 // Recursively process all children
62 for (const child of adjacencyList[currentNode]) {
63 dfs(child);
64 }
65 };
66
67 // Process path from node with value 1 to root
68 let smallestMissing: number = 2;
69 for (let currentNode = nodeWithOne; currentNode !== -1; currentNode = parents[currentNode]) {
70 // Collect all values in the subtree rooted at current node
71 dfs(currentNode);
72
73 // Find the smallest missing value
74 while (valueExists[smallestMissing]) {
75 ++smallestMissing;
76 }
77
78 // Store the result for current node
79 result[currentNode] = smallestMissing;
80 }
81
82 return result;
83}
84
Time and Space Complexity
Time Complexity: O(n)
The algorithm performs a depth-first search starting from the node containing value 1 and traversing up to the root. Key observations:
- The while loop that finds
idx
(node with value 1) runs at most once through all nodes:O(n)
- The main while loop traverses from the node with value 1 up to the root, visiting at most
n
nodes along the path - The DFS function
dfs(i)
visits each node at most once due to thevis[i]
check. Across all DFS calls, every node is visited exactly once:O(n)
total - The inner while loop
while has[i]: i += 1
incrementsi
from 2 to at mostn+1
. Sincei
never decreases, this contributesO(n)
total across all iterations
Therefore, the total time complexity is O(n) + O(n) + O(n) = O(n)
.
Space Complexity: O(n)
The algorithm uses the following data structures:
ans
: array of sizen
storing the result:O(n)
g
: adjacency list representation of the tree withn
nodes:O(n)
vis
: boolean array of sizen
for tracking visited nodes:O(n)
has
: boolean array of sizen+2
for tracking which values exist in subtrees:O(n)
- Recursive call stack for DFS: In the worst case (skewed tree), the depth can be
O(n)
The total space complexity is O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Assumption About Value Range
Pitfall: A common mistake is assuming that genetic values are limited to the range [1, n]
where n
is the number of nodes. The problem states values can be up to 10^5
, which may be larger than n
.
Example Issue:
# Incorrect: assumes values are bounded by n
value_exists = [False] * (n + 2) # This could cause index out of bounds!
# When marking values:
if nums[node] < len(value_exists): # This check prevents crashes but silently ignores large values
value_exists[nums[node]] = True
Solution: Size the value_exists
array based on the maximum possible genetic value:
# Correct approach:
max_value = 10**5 + 2 # Based on problem constraints
value_exists = [False] * max_value
# Or dynamically based on actual values:
max_genetic_value = max(nums)
value_exists = [False] * (max_genetic_value + 2)
2. Resetting the Missing Value Counter
Pitfall: Developers might instinctively reset current_missing
back to 1 or 2 for each node on the path, thinking each subtree needs independent computation.
Example Issue:
while current_node != -1: current_missing = 2 # WRONG: Resetting for each node mark_subtree_values(current_node) while value_exists[current_missing]: current_missing += 1 result[current_node] = current_missing current_node = parents[current_node]
Why It's Wrong: As we move up the tree, parent nodes contain all values from their children's subtrees. The smallest missing value can only increase or stay the same - it never decreases. Resetting would cause unnecessary recomputation and incorrect results.
Solution: Keep current_missing
persistent across iterations:
current_missing = 2 # Initialize once outside the loop while current_node != -1: # Don't reset current_missing here mark_subtree_values(current_node) while value_exists[current_missing]: current_missing += 1 result[current_node] = current_missing current_node = parents[current_node]
3. Building the Tree Structure Incorrectly
Pitfall: Confusing parent-to-child vs child-to-parent relationships when building the adjacency list.
Example Issue:
# Incorrect: Adding parent to child's list
for node_id, parent_id in enumerate(parents):
if node_id > 0:
children[node_id].append(parent_id) # WRONG direction!
Solution: Ensure you're adding children to their parent's adjacency list:
# Correct: Adding child to parent's list
for node_id, parent_id in enumerate(parents):
if node_id > 0: # Skip root (parent_id would be -1)
children[parent_id].append(node_id)
4. Not Handling the Case When Value 1 Doesn't Exist
Pitfall: Forgetting to handle the edge case where no node has genetic value 1, leading to attempting to traverse from a non-existent node.
Example Issue:
# Find node with value 1
node_with_value_one = -1
for i, val in enumerate(nums):
if val == 1:
node_with_value_one = i
break
# Forgetting to check if it was found
current_node = node_with_value_one # Could be -1
while current_node != -1: # This would immediately exit without processing anything
# ...
Solution: Explicitly check and handle this case:
if node_with_value_one == -1: # All subtrees are missing value 1 return result # result is already initialized with all 1s
What is the best way of checking if an element exists in a sorted array once in terms of time complexity? Select the best that applies.
Recommended Readings
Everything About Trees A tree is a type of graph data structure composed of nodes and edges Its main properties are It is acyclic doesn't contain any cycles There exists a path from the root to any node Has N 1 edges where N is the number of nodes in the tree and
https assets algo monster cover_photos dfs svg Depth First Search Prereqs Recursion Review problems recursion_intro Trees problems tree_intro With a solid understanding of recursion under our belts we are now ready to tackle one of the most useful techniques in coding interviews Depth First Search DFS As the name suggests
Union Find Disjoint Set Union Data Structure Introduction Prerequisite Depth First Search Review problems dfs_intro Once we have a strong grasp of recursion and Depth First Search we can now introduce Disjoint Set Union DSU This data structure is motivated by the following problem Suppose we have sets of elements
Want a Structured Path to Master System Design Too? Don’t Miss This!