2570. Merge Two 2D Arrays by Summing Values


Problem Description

In this problem, we are given two 2D arrays, nums1 and nums2. Each array consists of subarrays where the subarray's first element represents an identifier (id) and the second element represents a value associated with that id. Both nums1 and nums2 are sorted in ascending order by the ids, and each id is unique within its own array. The goal is to merge nums1 and nums2 into a new array while following these rules:

  1. Incorporate only the ids present in at least one of the input arrays into the result.
  2. Each id should appear only once in the merged array.
  3. The value corresponding to each id in the merged array should be the sum of the values associated with the id from both arrays. If an id is present in only one array, its value in the other array is assumed to be 0.

The result should be an array of [id, value] pairs, sorted in ascending order by id.

Intuition

To solve this problem, one intuitive approach is to treat it as a frequency counting problem combined with summation for each unique id. Since we want to sum the values associated with an id across both arrays, we can iterate through each array, and for each [id, value] pair encountered, increase the counter for that id by the value. This approach immediately takes care of cases where an id is present in only one array or both arrays; if it's present in both, the sum in the counter will naturally reflect the total sum across both arrays.

The Python Counter class from the collections module is well-suited for this task. It allows us to keep track of the sum of values for each unique id as we iterate through nums1 + nums2. This way, we aggregate all the [id, value] pairs into a single Counter object which behaves like a dictionary where each key is the id and the value is the calculated sum.

After processing all the ids, the solution involves converting the Counter object into a list of [id, value] pairs and sorting that list by id. Sorting is necessary because the Counter does not maintain the original order of the input, and the final result needs to be sorted by id.

The sorted(cnt.items()) function call transforms the Counter object into a sorted list of tuples based on the ids, directly producing the desired output format for this problem, thereby completing the problem's requirements in an efficient and concise manner.

Learn more about Two Pointers patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Consider the classic dynamic programming of fibonacci numbers, what is the recurrence relation?

Solution Approach

The implementation of the solution is fairly straightforward once the intuition is understood. The solution makes use of the Counter class from Python's collections module, which provides a convenient way to tally counts for each unique id. Here is a step-by-step breakdown of the code implementation:

  1. Initial Counter Creation: A Counter object is initialized with no arguments, which will be used to keep a running total of the values associated with each id.

    1cnt = Counter()
  2. Accumulating Counts: The solution then iterates over the concatenation of nums1 + nums2 arrays. This effectively allows the iteration over all id-value pairs in both arrays as if they were in a single list.

    1for i, v in nums1 + nums2:
    2    cnt[i] += v

    During each iteration, the code accesses the value v associated with the id i and adds it to the count already present for i in the Counter. If i is not present, Counter automatically starts counting from zero.

  3. Returning the Sorted Result: Since the Counter object cnt is not ordered, and the problem expects the result to be sorted by id, we must sort cnt.items(). The items() method returns a list of (id, sum_value) tuples.

    1return sorted(cnt.items())

    The sorted() function by default sorts the list of tuples by the first element of each tuple, which is the id in our case.

The use of the Counter here is a clever application of a frequency counting pattern which, combined with the properties of the Counter class (like returning 0 for missing items instead of raising a KeyError), makes the accumulation of sums across both arrays efficient.

The algorithm effectively operates in O(N log N) time due to the sorting operation at the end, where N is the total number of id-value pairs in the concatenated list. The linear iteration over the concatenated list to populate the Counter has a time complexity of O(M) where M is the size of the concatenated array (i.e., the sum of the lengths of nums1 and nums2), but since M ≤ 2N and sorting dominates the complexity, O(N log N) is the practical complexity of the entire operation.

As such, this approach is an effective and elegant solution to the problem, neatly covering all edge cases thanks to the properties of the Counter class and the sorted list of items.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which data structure is used in a depth first search?

Example Walkthrough

Let's consider two sample arrays to walk through the solution approach:

1nums1 = [[1, 3], [2, 2]]
2nums2 = [[1, 1], [3, 3]]

Both nums1 and nums2 are sorted by ids and each id is unique within its array. The goal is to merge the arrays such that we end up with a new array that includes the sum of values for each unique id.

  1. Creating a Counter: We create an empty Counter to collect the sums:

    1cnt = Counter()  # This is empty initially
  2. Accumulate Counts: Next, iterate over both arrays and use the Counter to add the values associated with each id:

    1# For nums1
    2cnt[1] += 3  # The counter is now Counter({1: 3})
    3cnt[2] += 2  # The counter is now Counter({1: 3, 2: 2})
    4
    5# For nums2
    6cnt[1] += 1  # Adds to the existing id 1, so Counter({1: 4, 2: 2})
    7cnt[3] += 3  # Adds a new id 3, so Counter({1: 4, 2: 2, 3: 3})

    By doing this for all id-value pairs in both nums1 and nums2, we store the sum of values for each id present in either or both of the arrays.

  3. Returning Sorted Result: Finally, we sort cnt.items() to get the sorted list by id:

    1sorted_result = sorted(cnt.items())  # This gives [(1, 4), (2, 2), (3, 3)]

Now we have the merged array sorted_result which holds the unique ids with their summed values, sorted by id.

As a result, the output will be:

1[[1, 4], [2, 2], [3, 3]]

This output represents the sums of values associated with each id from both input arrays. Id 1 has a total value of 4 (since it's in both nums1 and nums2), id 2 has a value of 2 (only in nums1), and id 3 has a value of 3 (only in nums2). This walkthrough illustrates how the algorithm effectively combines the values associated with each id and sorts them to satisfy the problem requirements.

Solution Implementation

1from typing import List
2from collections import Counter
3
4class Solution:
5    def mergeArrays(self, nums1: List[List[int]], nums2: List[List[int]]) -> List[List[int]]:
6        # Initialize a Counter object to keep track of the frequencies of integers
7        count = Counter()
8      
9        # Iterate through each pair in the concatenated list of nums1 and nums2
10        for index, value in nums1 + nums2:
11            # Sum the values for each unique index entry
12            count[index] += value
13      
14        # Convert the counter back to a sorted list of lists. Sorting by the first element of each pair (the index)
15        return sorted(count.items())
16
1class Solution {
2    public int[][] mergeArrays(int[][] nums1, int[][] nums2) {
3        int[] count = new int[1001]; // Array to hold the frequency of numbers, assuming the range is 0-1000.
4
5        // Iterate over the first array of pairs and update the count for each number.
6        for (int[] pair : nums1) {
7            count[pair[0]] += pair[1];
8        }
9
10        // Iterate over the second array of pairs and similarly update the count.
11        for (int[] pair : nums2) {
12            count[pair[0]] += pair[1];
13        }
14
15        // Determine the size of the resulting merged array by counting non-zero frequencies.
16        int size = 0;
17        for (int freq : count) {
18            if (freq > 0) {
19                size++;
20            }
21        }
22
23        // Create the answer array based on the calculated size.
24        int[][] merged = new int[size][2];
25
26        // Populate the answer array with the numbers and their frequencies.
27        for (int i = 0, index = 0; i < count.length; ++i) {
28            if (count[i] > 0) {
29                merged[index++] = new int[] {i, count[i]};// Using 'index' to fill the 'merged' array.
30            }
31        }
32
33        return merged; // Return the merged array with numbers and corresponding frequencies.
34    }
35}
36
1#include <vector>
2
3using namespace std;
4
5class Solution {
6public:
7    // Function to merge two arrays of key-value pairs where the keys are integers 
8    // and the values are the counts of those integers.
9    vector<vector<int>> mergeArrays(vector<vector<int>>& nums1, vector<vector<int>>& nums2) {
10        // Initialize a frequency array to store the counts for each number
11        // as the maximum possible key is 1000 according to the problem statement.
12        int frequency[1001] = {0};
13
14        // Increment the counts for the numbers in the first array
15        for (auto& pair : nums1) {
16            frequency[pair[0]] += pair[1];
17        }
18
19        // Increment the counts for the numbers in the second array
20        for (auto& pair : nums2) {
21            frequency[pair[0]] += pair[1];
22        }
23
24        // The answer array to hold the merged key-value pairs.
25        vector<vector<int>> merged;
26
27        // Iterate over the frequency array.
28        for (int i = 0; i < 1001; ++i) {
29            // Check if there is a non-zero count for the current number.
30            if (frequency[i] > 0) {
31                // Push the key-value pair to the merged list, 
32                // where the first element is the number (key)
33                // and the second element is the count (value).
34                merged.push_back({i, frequency[i]});
35            }
36        }
37
38        // Return the merged key-value pairs as a sorted array since we iterated in order.
39        return merged;
40    }
41};
42
1function mergeArrays(nums1: number[][], nums2: number[][]): number[][] {
2    // Define the maximum value for the index (1001 assumed as per the original code)
3    const maxIndex = 1001; 
4    // Initialize a count array with zeroes based on the maximum index
5    const countArray = new Array(maxIndex).fill(0); 
6  
7    // Iterate over the first array of number pairs
8    for (const [index, value] of nums1) {
9        // Increment the count for this index by the value
10        countArray[index] += value;
11    }
12  
13    // Iterate over the second array of number pairs
14    for (const [index, value] of nums2) {
15        // Increment the count for this index by the value
16        countArray[index] += value;
17    }
18  
19    // Initialize an array to hold the results
20    const mergedArray: number[][] = [];
21  
22    // Iterate through the count array to construct the mergedArray
23    for (let i = 0; i < maxIndex; ++i) {
24        if (countArray[i] > 0) { // If any count exists for this index
25            // Add a pair of [index, count] to the mergedArray
26            mergedArray.push([i, countArray[i]]);
27        }
28    }
29  
30    // Return the mergedArray containing the sum of pairs from both input arrays
31    return mergedArray;
32}
33
Not Sure What to Study? Take the 2-min Quiz:

How many ways can you arrange the three letters A, B and C?

Time and Space Complexity

The time complexity of the code provided is O(N log N), where N is the total number of elements in the nums1 and nums2 arrays combined. This is because, while the for-loop runs in O(N) and counter operations are on average O(1) for each, the dominant factor is the sorting operation. Sorting in Python typically uses Timsort, which has a time complexity of O(N log N) in the worst case.

The space complexity of the code is O(M), where M is the number of unique keys across both nums1 and nums2. A counter object is used to track the sums associated with each unique key. Although the lists are merged, there is no duplication of the original lists' contents, so the space required is related only to the number of unique keys. Each update to the counter is O(1) space, thus the overall space complexity depends on the number of unique keys rather than the total number of elements.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which algorithm should you use to find a node that is close to the root of the tree?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫