Facebook Pixel

508. Most Frequent Subtree Sum

Problem Description

You are given the root of a binary tree. Your task is to find the most frequent subtree sum in the tree.

A subtree sum is calculated by taking any node in the tree and computing the sum of all node values in the subtree rooted at that node. This includes the node itself plus all its descendants.

For example, if you have a node with value 5, and its left subtree has a sum of 2 and its right subtree has a sum of -3, then the subtree sum for this node would be 5 + 2 + (-3) = 4.

You need to calculate the subtree sum for every node in the tree, then determine which sum(s) appear most frequently. If multiple sums have the same highest frequency (a tie), return all of them in any order.

The solution uses a depth-first search (DFS) approach combined with a hash table to track frequencies. The dfs function recursively calculates the subtree sum for each node by:

  1. Returning 0 for null nodes
  2. Recursively getting the sum of left and right subtrees
  3. Adding the current node's value to these sums
  4. Recording this sum in the cnt hash table
  5. Returning the calculated sum for use by parent nodes

After traversing the entire tree and recording all subtree sums, the code finds the maximum frequency and returns all sums that appear with this maximum frequency.

Flowchart Walkthrough

First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:

Is it a graph?

  • Yes: A binary tree is a special type of graph where each node has at most two children and there are no cycles.

Is it a tree?

  • Yes: The problem explicitly states we're working with a binary tree structure.

DFS

  • Following the "yes" path from the tree node, we arrive at DFS (Depth-First Search).

Conclusion: The flowchart suggests using DFS (Depth-First Search) for this problem.

This makes perfect sense for the Most Frequent Subtree Sum problem because:

  1. We need to visit every node in the tree to calculate its subtree sum
  2. For each node, we need to first calculate the sums of its left and right subtrees before we can calculate the node's own subtree sum
  3. DFS naturally provides this bottom-up calculation approach through its recursive nature
  4. The post-order traversal aspect of DFS ensures we process children before parents, which is exactly what we need to accumulate subtree sums from leaves up to the root
Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that we need to calculate a value for every single node in the tree - specifically, the sum of all nodes in its subtree. This immediately suggests we need to visit every node, which points us toward a tree traversal approach.

When we think about how to calculate a subtree sum, we realize that for any given node, its subtree sum equals node.val + sum of left subtree + sum of right subtree. This means we need to know the subtree sums of the children before we can calculate the parent's subtree sum. This dependency naturally leads us to a bottom-up approach.

DFS with recursion perfectly fits this need because:

  • In the recursive calls, we naturally reach the leaf nodes first (base case)
  • As the recursion unwinds, we move from children back to parents
  • Each recursive call can return its subtree sum to its parent

To track frequencies, we need a data structure that counts occurrences. A hash table (or Counter in Python) is ideal because:

  • We can increment counts in O(1) time
  • We don't know beforehand what sums will appear or how many different sums there will be
  • Finding the maximum frequency and filtering results is straightforward with a hash table

The elegant part of this solution is how the DFS serves a dual purpose:

  1. It calculates the subtree sum for the current node
  2. It returns that sum to be used by the parent node's calculation

This way, each node is processed exactly once, and we build up our frequency map as we traverse the tree. Once we've visited all nodes, we simply need to scan our frequency map to find which sum(s) appeared most often.

Learn more about Tree, Depth-First Search and Binary Tree patterns.

Solution Approach

The solution uses a Hash Table + DFS approach to efficiently calculate and track subtree sums.

Data Structures Used:

  • Counter (hash table): Stores the frequency of each subtree sum
  • Binary Tree: The input structure we're traversing

Implementation Walkthrough:

  1. Initialize the frequency counter:

    cnt = Counter()

    This creates an empty hash table to track how many times each sum appears.

  2. Define the DFS function:

    def dfs(root: Optional[TreeNode]) -> int:

    This recursive function calculates the subtree sum for a given node and updates the frequency counter.

  3. Handle the base case:

    if root is None:
        return 0

    Empty subtrees contribute a sum of 0.

  4. Recursively calculate child subtree sums:

    l, r = dfs(root.left), dfs(root.right)

    First process the left and right children to get their subtree sums.

  5. Calculate current subtree sum:

    s = l + r + root.val

    The subtree sum for the current node is its value plus the sums of both subtrees.

  6. Update the frequency counter:

    cnt[s] += 1

    Record that we've seen this sum one more time.

  7. Return the sum for parent's use:

    return s

    This allows the parent node to use this subtree's sum in its own calculation.

  8. Start the traversal:

    dfs(root)

    Begin the DFS from the root, which will process the entire tree.

  9. Find the maximum frequency:

    mx = max(cnt.values())

    Determine what the highest frequency is among all sums.

  10. Return all sums with maximum frequency:

    return [k for k, v in cnt.items() if v == mx]

    Filter the hash table to return only those sums that appear with the maximum frequency.

Time Complexity: O(n) where n is the number of nodes, as we visit each node exactly once.

Space Complexity: O(n) for the recursion stack in the worst case (skewed tree) and the hash table storing at most n different sums.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a concrete example to understand how the solution works.

Consider this binary tree:

       5
      / \
     2   -3

Step-by-step execution:

  1. Start DFS from root (5)

    • We need to calculate the subtree sum for node 5
    • First, we recursively call dfs(root.left) for node 2
  2. Process left child (2)

    • Node 2 has no children, so dfs(None) returns 0 for both left and right
    • Calculate subtree sum: s = 0 + 0 + 2 = 2
    • Update counter: cnt[2] = 1 (we've seen sum 2 once)
    • Return 2 to parent
  3. Process right child (-3)

    • Node -3 has no children, so dfs(None) returns 0 for both left and right
    • Calculate subtree sum: s = 0 + 0 + (-3) = -3
    • Update counter: cnt[-3] = 1 (we've seen sum -3 once)
    • Return -3 to parent
  4. Complete root node (5)

    • Now we have left subtree sum = 2 and right subtree sum = -3
    • Calculate subtree sum: s = 2 + (-3) + 5 = 4
    • Update counter: cnt[4] = 1 (we've seen sum 4 once)
    • Return 4 (though not used since this is the root)
  5. Find most frequent sums

    • Our counter now contains: {2: 1, -3: 1, 4: 1}
    • Maximum frequency: mx = 1 (all sums appear once)
    • Return all sums with frequency 1: [2, -3, 4]

Another example with tied frequencies:

       5
      / \
     2   -5
    / \
   3  -1

Processing order and calculations:

  • Node 3: sum = 3, cnt[3] = 1
  • Node -1: sum = -1, cnt[-1] = 1
  • Node 2: sum = 3 + (-1) + 2 = 4, cnt[4] = 1
  • Node -5: sum = -5, cnt[-5] = 1
  • Node 5: sum = 4 + (-5) + 5 = 4, cnt[4] = 2

Final counter: {3: 1, -1: 1, 4: 2, -5: 1} Maximum frequency: 2 Result: [4] (only sum 4 appears twice)

This demonstrates how the DFS calculates sums bottom-up and tracks frequencies to find the most common subtree sum(s).

Solution Implementation

1# Definition for a binary tree node.
2# class TreeNode:
3#     def __init__(self, val=0, left=None, right=None):
4#         self.val = val
5#         self.left = left
6#         self.right = right
7
8from typing import Optional, List
9from collections import Counter
10
11class Solution:
12    def findFrequentTreeSum(self, root: Optional[TreeNode]) -> List[int]:
13        """
14        Find the most frequent subtree sum(s) in a binary tree.
15      
16        Args:
17            root: Root node of the binary tree
18          
19        Returns:
20            List of the most frequent subtree sum(s)
21        """
22      
23        def calculate_subtree_sum(node: Optional[TreeNode]) -> int:
24            """
25            Recursively calculate the sum of the subtree rooted at the given node.
26            Update the frequency counter for each subtree sum encountered.
27          
28            Args:
29                node: Current node being processed
30              
31            Returns:
32                Sum of the subtree rooted at this node
33            """
34            # Base case: empty subtree has sum of 0
35            if node is None:
36                return 0
37          
38            # Recursively calculate left and right subtree sums
39            left_sum = calculate_subtree_sum(node.left)
40            right_sum = calculate_subtree_sum(node.right)
41          
42            # Calculate current subtree sum
43            current_subtree_sum = left_sum + right_sum + node.val
44          
45            # Update frequency counter for this sum
46            sum_frequency[current_subtree_sum] += 1
47          
48            return current_subtree_sum
49      
50        # Initialize frequency counter for subtree sums
51        sum_frequency = Counter()
52      
53        # Perform DFS to calculate all subtree sums
54        calculate_subtree_sum(root)
55      
56        # Find the maximum frequency
57        max_frequency = max(sum_frequency.values())
58      
59        # Return all sums that have the maximum frequency
60        return [sum_value for sum_value, frequency in sum_frequency.items() 
61                if frequency == max_frequency]
62
1/**
2 * Definition for a binary tree node.
3 * public class TreeNode {
4 *     int val;
5 *     TreeNode left;
6 *     TreeNode right;
7 *     TreeNode() {}
8 *     TreeNode(int val) { this.val = val; }
9 *     TreeNode(int val, TreeNode left, TreeNode right) {
10 *         this.val = val;
11 *         this.left = left;
12 *         this.right = right;
13 *     }
14 * }
15 */
16class Solution {
17    // HashMap to store the frequency of each subtree sum
18    private Map<Integer, Integer> sumFrequencyMap = new HashMap<>();
19  
20    // Variable to track the maximum frequency among all subtree sums
21    private int maxFrequency;
22
23    /**
24     * Finds all subtree sums that appear with the highest frequency
25     * @param root The root of the binary tree
26     * @return Array containing all subtree sums with maximum frequency
27     */
28    public int[] findFrequentTreeSum(TreeNode root) {
29        // Perform DFS to calculate all subtree sums and their frequencies
30        calculateSubtreeSum(root);
31      
32        // List to store all sums that have the maximum frequency
33        List<Integer> mostFrequentSums = new ArrayList<>();
34      
35        // Iterate through the frequency map to find sums with maximum frequency
36        for (Map.Entry<Integer, Integer> entry : sumFrequencyMap.entrySet()) {
37            if (entry.getValue() == maxFrequency) {
38                mostFrequentSums.add(entry.getKey());
39            }
40        }
41      
42        // Convert the list to an integer array and return
43        return mostFrequentSums.stream().mapToInt(Integer::intValue).toArray();
44    }
45
46    /**
47     * Recursively calculates the sum of each subtree and updates frequency map
48     * @param node The current node being processed
49     * @return The sum of the subtree rooted at the current node
50     */
51    private int calculateSubtreeSum(TreeNode node) {
52        // Base case: if node is null, return 0
53        if (node == null) {
54            return 0;
55        }
56      
57        // Calculate the sum of current subtree:
58        // current node value + sum of left subtree + sum of right subtree
59        int currentSubtreeSum = node.val + calculateSubtreeSum(node.left) + calculateSubtreeSum(node.right);
60      
61        // Update the frequency of this sum in the map and get the new frequency
62        // merge() adds 1 to existing frequency or sets it to 1 if key doesn't exist
63        int updatedFrequency = sumFrequencyMap.merge(currentSubtreeSum, 1, Integer::sum);
64      
65        // Update the maximum frequency if current frequency is higher
66        maxFrequency = Math.max(maxFrequency, updatedFrequency);
67      
68        // Return the sum of current subtree for parent's calculation
69        return currentSubtreeSum;
70    }
71}
72
1/**
2 * Definition for a binary tree node.
3 * struct TreeNode {
4 *     int val;
5 *     TreeNode *left;
6 *     TreeNode *right;
7 *     TreeNode() : val(0), left(nullptr), right(nullptr) {}
8 *     TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
9 *     TreeNode(int x, TreeNode *left, TreeNode *right) : val(x), left(left), right(right) {}
10 * };
11 */
12class Solution {
13public:
14    vector<int> findFrequentTreeSum(TreeNode* root) {
15        // Map to store frequency of each subtree sum
16        unordered_map<int, int> sumFrequency;
17      
18        // Track the maximum frequency encountered
19        int maxFrequency = 0;
20      
21        // Lambda function to calculate subtree sum using post-order traversal
22        function<int(TreeNode*)> calculateSubtreeSum = [&](TreeNode* node) -> int {
23            // Base case: empty node contributes 0 to sum
24            if (!node) {
25                return 0;
26            }
27          
28            // Calculate sum of current subtree:
29            // current node value + left subtree sum + right subtree sum
30            int leftSum = calculateSubtreeSum(node->left);
31            int rightSum = calculateSubtreeSum(node->right);
32            int currentSubtreeSum = node->val + leftSum + rightSum;
33          
34            // Update frequency count for this sum and track maximum frequency
35            sumFrequency[currentSubtreeSum]++;
36            maxFrequency = max(maxFrequency, sumFrequency[currentSubtreeSum]);
37          
38            // Return the subtree sum for parent's calculation
39            return currentSubtreeSum;
40        };
41      
42        // Perform DFS to calculate all subtree sums
43        calculateSubtreeSum(root);
44      
45        // Collect all sums that appear with maximum frequency
46        vector<int> result;
47        for (const auto& [sum, frequency] : sumFrequency) {
48            if (frequency == maxFrequency) {
49                result.push_back(sum);
50            }
51        }
52      
53        return result;
54    }
55};
56
1/**
2 * Definition for a binary tree node.
3 * class TreeNode {
4 *     val: number
5 *     left: TreeNode | null
6 *     right: TreeNode | null
7 *     constructor(val?: number, left?: TreeNode | null, right?: TreeNode | null) {
8 *         this.val = (val===undefined ? 0 : val)
9 *         this.left = (left===undefined ? null : left)
10 *         this.right = (right===undefined ? null : right)
11 *     }
12 * }
13 */
14
15/**
16 * Finds the most frequent subtree sum(s) in a binary tree
17 * @param root - The root node of the binary tree
18 * @returns An array of the most frequent subtree sum(s)
19 */
20function findFrequentTreeSum(root: TreeNode | null): number[] {
21    // Map to store frequency count of each subtree sum
22    const sumFrequencyMap = new Map<number, number>();
23  
24    // Track the maximum frequency encountered
25    let maxFrequency = 0;
26  
27    /**
28     * Performs depth-first search to calculate subtree sums
29     * @param node - Current node being processed
30     * @returns The sum of the current subtree
31     */
32    const calculateSubtreeSum = (node: TreeNode | null): number => {
33        // Base case: empty node contributes 0 to the sum
34        if (!node) {
35            return 0;
36        }
37      
38        // Recursively calculate left and right subtree sums
39        const leftSubtreeSum = calculateSubtreeSum(node.left);
40        const rightSubtreeSum = calculateSubtreeSum(node.right);
41      
42        // Calculate current subtree sum
43        const currentSubtreeSum = node.val + leftSubtreeSum + rightSubtreeSum;
44      
45        // Update frequency map for current sum
46        const currentFrequency = (sumFrequencyMap.get(currentSubtreeSum) ?? 0) + 1;
47        sumFrequencyMap.set(currentSubtreeSum, currentFrequency);
48      
49        // Update maximum frequency if needed
50        maxFrequency = Math.max(maxFrequency, currentFrequency);
51      
52        return currentSubtreeSum;
53    };
54  
55    // Start DFS traversal from root
56    calculateSubtreeSum(root);
57  
58    // Filter and return sums that have maximum frequency
59    return Array.from(sumFrequencyMap.entries())
60        .filter(([sum, frequency]) => frequency === maxFrequency)
61        .map(([sum, frequency]) => sum);
62}
63

Time and Space Complexity

Time Complexity: O(n)

The algorithm performs a depth-first search (DFS) traversal of the binary tree. Each node is visited exactly once during the traversal. At each node, the operations performed are:

  • Computing the sum s = l + r + root.val - O(1)
  • Updating the counter cnt[s] += 1 - O(1) on average for dictionary operations
  • Returning the sum - O(1)

After the DFS completes, finding the maximum frequency takes O(n) in the worst case (when all subtree sums are unique), and filtering the results also takes O(n). Therefore, the overall time complexity is O(n) + O(n) + O(n) = O(n), where n is the number of nodes in the binary tree.

Space Complexity: O(n)

The space complexity consists of:

  • Recursion call stack: In the worst case (skewed tree), the recursion depth can be O(n). In the best case (balanced tree), it would be O(log n).
  • Counter dictionary: In the worst case, each subtree could have a unique sum, resulting in n entries in the counter, requiring O(n) space.
  • Output list: In the worst case, all subtree sums have the same frequency (the maximum), so the output list could contain up to n elements, requiring O(n) space.

Therefore, the overall space complexity is O(n), where n is the number of nodes in the binary tree.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Forgetting to Handle Empty Tree Input

One common pitfall is not properly handling the case where the input tree is completely empty (root = None).

Problem Code:

def findFrequentTreeSum(self, root: Optional[TreeNode]) -> List[int]:
    sum_frequency = Counter()
    calculate_subtree_sum(root)
    max_frequency = max(sum_frequency.values())  # Error: max() on empty sequence
    return [sum_value for sum_value, frequency in sum_frequency.items() 
            if frequency == max_frequency]

Issue: When root is None, the sum_frequency Counter remains empty. Calling max() on an empty sequence raises a ValueError.

Solution:

def findFrequentTreeSum(self, root: Optional[TreeNode]) -> List[int]:
    # Handle empty tree edge case
    if root is None:
        return []
  
    sum_frequency = Counter()
    calculate_subtree_sum(root)
    max_frequency = max(sum_frequency.values())
    return [sum_value for sum_value, frequency in sum_frequency.items() 
            if frequency == max_frequency]

2. Integer Overflow in Other Languages

While Python handles arbitrarily large integers, implementing this solution in languages like Java or C++ could lead to integer overflow for large subtree sums.

Problem Example: A deep tree with large positive values could cause the sum to exceed INT_MAX.

Solution for Other Languages:

  • Use long or long long data types for sum calculations
  • In Java: use Long instead of Integer for the HashMap keys
  • In C++: use long long for sum calculations

3. Modifying the Counter During Iteration

A less common but possible pitfall is attempting to modify the frequency counter while iterating over it.

Problem Code:

# Attempting to filter in-place (incorrect approach)
for k, v in cnt.items():
    if v != mx:
        del cnt[k]  # RuntimeError: dictionary changed size during iteration
return list(cnt.keys())

Solution: Always create a new list for the result using list comprehension:

return [k for k, v in cnt.items() if v == mx]

4. Assuming Single Maximum Frequency

Some might incorrectly assume there's only one sum with maximum frequency and return it immediately.

Problem Code:

# Wrong: Returns only the first sum with max frequency
for k, v in cnt.items():
    if v == mx:
        return [k]  # Exits too early!

Solution: Collect all sums with maximum frequency:

return [k for k, v in cnt.items() if v == mx]
Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement priority queue?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More