508. Most Frequent Subtree Sum


Problem Description

The problem is about finding the most frequently occurring sums of all the nodes in each subtree within a binary tree. A subtree sum is the sum of the values of all nodes in a subtree which includes the root node of that subtree. A binary tree is a tree data structure where each node has at most two children, referred to as the left child and the right child.

Given the root node of a binary tree, the task is to compute the sum for every possible subtree and then find out which sum(s) occur most frequently. If there is more than one sum with the highest frequency, all such sums should be returned. The return format is a list of integers in any order.

An example of a subtree sum would be taking a node, adding its value to the sum of all node values in its left subtree, and then adding the sum of all node values in its right subtree.

Intuition

To solve this problem, we need to traverse the binary tree and calculate the subtree sum for each node. This is a classic case for a depth-first search (DFS) traversal, in which we go as deep as possible down one path before backing up and trying a different one.

For each node visited, we calculate the subtree sum by summing:

  1. The value of the current node.
  2. The subtree sum of the left child.
  3. The subtree sum of the right child.

After calculating the sum, we update a histogram (counter) that records how many times each possible sum occurs. A Python Counter is handy for this purpose as it allows us to maintain a running count of each subtree sum encountered during the DFS.

Once the entire tree has been traversed and all subtree sums have been computed and counted, we determine the maximum frequency among them. This tells us how many times the most frequent sum occurs.

Finally, we iterate over the items in our counter to find all sums that have this maximum frequency, collecting them into a list. This list is what will be returned as the solution. Since the counter is a dictionary with subtree sum as keys and their frequencies as values, we can easily list the keys for which the values match the maximum frequency. This gives us all the most frequent subtree sums.

Learn more about Tree, Depth-First Search and Binary Tree patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece๏ผš

The three-steps of Depth First Search are:

  1. Identify states;
  2. Draw the state-space tree;
  3. DFS on the state-space tree.

Solution Approach

The solution involves a combination of a depth-first search (DFS) algorithm and a Counter object from Python's collections module to keep track of subtree sum frequencies. Here's a step-by-step approach explaining the above-provided solution code:

  1. Define a recursive helper function dfs which will be used to perform a depth-first search of the binary tree. This function takes a single argument, the root of the current subtree (node).

  2. In the DFS function, the base case is to return a sum of 0 when a None node is encountered, meaning we've reached a leaf node's child.

  3. The function then recursively calls itself on both the left and right children of the current node to calculate their subtree sums. These results are stored in variables left and right, respectively.

  4. With the left and right subtree sums, we can now calculate the current node's total subtree sum as s = root.val + left + right. This is the sum of the value stored in the current node plus the sum of all node values in both subtrees.

  5. We use a Counter to keep track of how frequently each subtree sum occurs. This Counter (counter) object is then updated with the current subtree sum, incrementing the count of s. This is done outside the DFS function in the global scope of the findFrequentTreeSum method.

  6. After the DFS traversal is complete, we can infer which subtree sums are most frequent. The maximum frequency of occurrence (mx) is obtained by applying the max() function to the values of the Counter object.

  7. The final step involves iterating over the items in the Counter and selecting only those keys (k) whose values (v) match the maximum frequency (mx). This is done using a list comprehension that filters and collects all the keys with the highest frequency.

  8. These keys represent the most frequent subtree sums, and they are returned as a list.

The solution nicely ties together the traversal of the binary tree to compute subtree sums and the use of a Counter for frequency tracking. The combination of recursive DFS and the Counter strategy efficiently solves the problem with clear and understandable logic.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which of the two traversal algorithms (BFS and DFS) can be used to find whether two nodes are connected?

Example Walkthrough

Let's walk through a small example to illustrate the solution approach.

Consider the following binary tree:

1    5
2   / \
3  2   -5

Using the provided solution approach, we'll perform these steps:

  1. Call the dfs function on the root node (value 5).

  2. Since node 5 is not None, we recursively call dfs on its left child (value 2). The left child is a leaf node, so it has no children. The dfs function returns 2, as there are no further nodes to sum.

  3. Next, we recursively call dfs on the right child of node 5 (value -5). This is also a leaf node, so the function returns -5.

  4. We now calculate the subtree sum for the root node (node 5). The sum s is equal to the root's value plus the left and right subtree sums: s = 5 + 2 + (-5) = 2.

  5. Update the Counter with the subtree sum of 2 for the root node. The Counter now looks like this: {2: 1}.

  6. Now we move up and track the subtree sums of the leaf nodes. The Counter is updated with the subtree sum 2 from the left child and -5 from the right child, leading to: {2: 2, -5: 1}.

  7. After finishing the DFS traversal and recording all subtree sums, we determine the most frequent sums. The maximum frequency mx from the Counter is 2 (since the subtree sum 2 occurred twice).

  8. We now filter the keys in the Counter to find those with a frequency matching the maximum frequency mx. In this case, only the subtree sum 2 matches, so it will be collected in the resulting list.

The final list of most frequent subtree sums to be returned is [2], as this is the sum that occurred most frequently in our example binary tree.

Solution Implementation

1from collections import Counter
2from typing import List
3
4# Definition for a binary tree node.
5class TreeNode:
6    def __init__(self, val=0, left=None, right=None):
7        self.val = val
8        self.left = left
9        self.right = right
10
11class Solution:
12    def findFrequentTreeSum(self, root: TreeNode) -> List[int]:
13        """
14        Find all subtree sums that occur the most frequently in the binary tree.
15      
16        :param root: TreeNode, the root of the binary tree
17        :return: List[int], a list of most frequent subtree sums
18        """
19      
20        def dfs(node):
21            """
22            Perform a depth-first search to calculate the sum of each subtree.
23          
24            :param node: TreeNode, the current node in the binary tree
25            :return: int, the sum of the current subtree
26            """
27            if not node:
28                return 0
29            # Recursively find the sum of left and right subtrees
30            left_sum = dfs(node.left)
31            right_sum = dfs(node.right)
32            # Calculate the sum of the current subtree
33            subtree_sum = node.val + left_sum + right_sum
34            # Increment the counter for the subtree sum
35            subtree_sum_counter[subtree_sum] += 1
36            return subtree_sum
37
38        # Initialize a Counter to keep track of the frequency of each subtree sum
39        subtree_sum_counter = Counter()
40        # Start the DFS traversal from the root to fill the subtree_sum_counter
41        dfs(root)
42        # Find the maximum frequency among the subtree sums
43        max_frequency = max(subtree_sum_counter.values())
44        # Collect all subtree sums with the maximum frequency
45        return [subtree_sum for subtree_sum, frequency in subtree_sum_counter.items() if frequency == max_frequency]
46
1import java.util.ArrayList;
2import java.util.HashMap;
3import java.util.List;
4import java.util.Map;
5
6// Definition for a binary tree node.
7class TreeNode {
8    int val;
9    TreeNode left;
10    TreeNode right;
11    TreeNode() {}
12    TreeNode(int val) {
13        this.val = val;
14    }
15    TreeNode(int val, TreeNode left, TreeNode right) {
16        this.val = val;
17        this.left = left;
18        this.right = right;
19    }
20}
21
22class Solution {
23    // A map to keep track of the sum occurrences.
24    private Map<Integer, Integer> sumFrequency;
25  
26    // A variable to keep track of the maximum frequency of the sum.
27    private int maxFrequency;
28
29    // Method to find the tree sums that appear most frequently.
30    public int[] findFrequentTreeSum(TreeNode root) {
31        sumFrequency = new HashMap<>();
32        maxFrequency = Integer.MIN_VALUE;
33      
34        // Recursively find all subtree sums.
35        calculateSubtreeSum(root);
36      
37        // Store the sums that have the highest frequency.
38        List<Integer> frequentSums = new ArrayList<>();
39        for (Map.Entry<Integer, Integer> entry : sumFrequency.entrySet()) {
40            if (entry.getValue() == maxFrequency) {
41                frequentSums.add(entry.getKey());
42            }
43        }
44      
45        // Convert the result to an array.
46        int[] result = new int[frequentSums.size()];
47        for (int i = 0; i < frequentSums.size(); i++) {
48            result[i] = frequentSums.get(i);
49        }
50      
51        return result;
52    }
53
54    // Helper method to perform a depth-first search and calculate subtree sums.
55    private int calculateSubtreeSum(TreeNode node) {
56        // If the node is null, return sum as 0.
57        if (node == null) {
58            return 0;
59        }
60      
61        // Calculate the sum including the current node and its left and right subtrees.
62        int sum = node.val + calculateSubtreeSum(node.left) + calculateSubtreeSum(node.right);
63      
64        // Update the frequency of the sum in the map.
65        sumFrequency.put(sum, sumFrequency.getOrDefault(sum, 0) + 1);
66      
67        // Update the maximum frequency if necessary.
68        maxFrequency = Math.max(maxFrequency, sumFrequency.get(sum));
69      
70        // Return the sum of the subtree rooted at the current node.
71        return sum;
72    }
73}
74
1#include <unordered_map>
2#include <vector>
3#include <climits> // For INT_MIN
4
5// Definition for a binary tree node.
6struct TreeNode {
7    int val;
8    TreeNode* left;
9    TreeNode* right;
10    TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
11    TreeNode(int x, TreeNode* left, TreeNode* right) : val(x), left(left), right(right) {}
12};
13
14class Solution {
15public:
16    // Using an unordered map to keep track of the sum frequencies
17    std::unordered_map<int, int> sumFrequency;
18    // Variable to keep track of the highest frequency encountered
19    int maxFrequency;
20
21    // Constructor initializes the maximum frequency to the minimum integer value
22    Solution() : maxFrequency(INT_MIN) {}
23
24    // Function to find the most frequent subtree sums
25    vector<int> findFrequentTreeSum(TreeNode* root) {
26        // Reset the max frequency for each call
27        maxFrequency = INT_MIN;
28        // Calculate the subtree sums starting from the root
29        depthFirstSearch(root);
30        vector<int> result;
31        // Iterate over the counter map and select the sums with frequency equal to maxFrequency
32        for (auto& entry : sumFrequency) {
33            if (entry.second == maxFrequency) {
34                result.push_back(entry.first);
35            }
36        }
37        return result;
38    }
39
40    // Helper function to perform a depth-first search and calculate subtree sums
41    int depthFirstSearch(TreeNode* node) {
42        // If the node is null, return 0 as it contributes nothing to the sum
43        if (!node) {
44            return 0;
45        }
46        // Calculate the sum of the current subtree
47        int sum = node->val + depthFirstSearch(node->left) + depthFirstSearch(node->right);
48        // Increment the frequency count for the current sum
49        ++sumFrequency[sum];
50        // Update the max frequency if the current sum's frequency is higher
51        maxFrequency = std::max(maxFrequency, sumFrequency[sum]);
52        // Return the sum of this subtree to be used by its parent
53        return sum;
54    }
55};
56
1// TypeScript function to find all subtree sums occurring with the highest frequency in a binary tree.
2
3// TreeNode class definition
4class TreeNode {
5    val: number;
6    left: TreeNode | null;
7    right: TreeNode | null;
8    constructor(val?: number, left?: TreeNode | null, right?: TreeNode | null) {
9        this.val = (val === undefined ? 0 : val);
10        this.left = (left === undefined ? null : left);
11        this.right = (right === undefined ? null : right);
12    }
13}
14
15// Function to calculate the subtree sums with the highest frequency.
16function findFrequentTreeSum(root: TreeNode | null): number[] {
17    // Map to store subtree sum frequencies
18    const sumFrequencyMap = new Map<number, number>();
19
20    // Variable to keep track of the highest frequency of subtree sum
21    let maxFrequency = 0;
22
23    // Depth-first search function to explore the tree and calculate the sums
24    const calculateSubtreeSum = (node: TreeNode | null): number => {
25        if (node === null) {
26            return 0;
27        }
28        const leftSum = calculateSubtreeSum(node.left);
29        const rightSum = calculateSubtreeSum(node.right);
30        const subtreeSum = node.val + leftSum + rightSum;
31      
32        // Update the frequency map
33        const currentFrequency = (sumFrequencyMap.get(subtreeSum) ?? 0) + 1;
34        sumFrequencyMap.set(subtreeSum, currentFrequency);
35
36        // Update the max frequency if the current one is greater
37        maxFrequency = Math.max(maxFrequency, currentFrequency);
38
39        return subtreeSum;
40    };
41
42    // Start the subtree sum calculation from the root
43    calculateSubtreeSum(root);
44
45    // Array to store the most frequent subtree sums
46    const mostFrequentSums = [];
47
48    // Find all sums that have the max frequency
49    for (const [sum, frequency] of sumFrequencyMap) {
50        if (frequency === maxFrequency) {
51            mostFrequentSums.push(sum);
52        }
53    }
54
55    // Return the most frequent subtree sums
56    return mostFrequentSums;
57}
58
Not Sure What to Study? Take the 2-min Quiz๏ผš

What is the best way of checking if an element exists in a sorted array once in terms of time complexity? Select the best that applies.

Time and Space Complexity

Time Complexity

The time complexity of the code is mainly determined by the DFS traversal of the tree and the operations performed on the Counter object that is used as a hash map.

  1. DFS Traversal: Every node in the given binary tree is visited exactly once. If the tree has N nodes, the DFS traversal takes O(N) time.

  2. Counter Operations:

    • Update: The update of the Counter counter for each subtree sum happens N times (once for each node). This operation can be considered O(1) for each update since Counter uses a hash table for storage.
    • Max Operation: The operation max(counter.values()) is performed once and takes O(N) time in the worst case because it scans through all values in the counter.

In total, we have O(N) for DFS and O(N) for the max operation, which is linear. Thus, the overall time complexity is O(N).

Space Complexity

The space complexity of the code is a combination of the space used by the recursion stack during the DFS traversal and the space used by the Counter.

  1. DFS Recursion Stack: In the worst case, i.e., when the tree is completely unbalanced, the height of the tree could be N, leading to a recursion stack depth of O(N). In the best case, i.e., when the tree is perfectly balanced, the height of the tree would be log(N), leading to a recursion stack depth of O(log(N)). Thus, the space used by the recursion stack can vary between O(log(N)) and O(N).

  2. Counter: The Counter object counter can potentially store a distinct sum for each subtree of the binary tree. In the worst case, this could be O(N) distinct sums if every subtree has a unique sum.

Therefore, the overall space complexity of the algorithm is O(N) in the worst case, which includes the space for the Counter and the recursion stack.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

What are the two properties the problem needs to have for dynamic programming to be applicable? (Select 2)


Recommended Readings


Got a question?ย Ask the Teaching Assistantย anything you don't understand.

Still not clear? Ask in the Forum, ย Discordย orย Submitย the part you don't understand to our editors.

โ†
โ†‘TA ๐Ÿ‘จโ€๐Ÿซ