2363. Merge Similar Items
Problem Description
You are given two 2D integer arrays, items1
and items2
, representing two sets of items. Each array contains items where each item is represented as [value, weight]
:
value
represents the unique identifier/value of the itemweight
represents the weight of that item- All values within each array are unique
Your task is to merge these two arrays by combining items with the same value. When items from both arrays have the same value, you need to add their weights together.
The output should be a 2D array where:
- Each element is
[value, total_weight]
total_weight
is the sum of all weights for items with that specific value across both input arrays- The result must be sorted in ascending order by value
For example:
- If
items1
has[1, 5]
anditems2
has[1, 3]
, the merged result would include[1, 8]
(weights 5 + 3 = 8) - If a value appears in only one array, include it with its original weight
- The final result should list all unique values with their combined weights, sorted by value
Intuition
The key insight is that we need to group items by their value and sum up their weights. This is a classic aggregation problem where we need to combine data from two sources based on a common key (the value).
Think of it like merging inventory from two warehouses - if both warehouses have the same product (identified by value), we want to know the total quantity (weight) across both locations.
A hash table (or dictionary/counter) is the natural data structure for this task because:
- We need to quickly look up whether we've seen a particular value before
- We need to accumulate weights for the same value
- Hash tables provide O(1) average time complexity for both insertion and lookup
The solution becomes straightforward:
- Iterate through both arrays
- For each item
[value, weight]
, add the weight to our running total for that value - Using a
Counter
in Python makes this even simpler as it automatically handles missing keys by starting at 0
Once we have all the aggregated weights by value, we just need to convert our hash table back to a list of [value, weight]
pairs and sort them by value. The sorted()
function on dictionary items naturally sorts by the first element (key/value) in ascending order, which is exactly what we need.
The beauty of this approach is that we don't need to worry about which array an item comes from or whether it appears in one or both arrays - we simply process all items uniformly and let the hash table handle the aggregation.
Learn more about Sorting patterns.
Solution Approach
The implementation uses a hash table to efficiently merge and aggregate items from both arrays. Here's how the solution works step by step:
1. Initialize a Counter (Hash Table)
cnt = Counter()
We use Python's Counter
from the collections module, which is a specialized dictionary that automatically initializes missing keys with 0. This eliminates the need to check if a key exists before adding to it.
2. Iterate Through Both Arrays
for v, w in chain(items1, items2): cnt[v] += w
The chain()
function from itertools concatenates items1
and items2
into a single iterable sequence. This allows us to process all items in one loop without writing separate loops for each array.
For each item [v, w]
(value and weight):
- We add the weight
w
to the running total for valuev
- If
v
doesn't exist in the counter yet, it starts at 0 and then addsw
- If
v
already exists, we simply addw
to the existing total
3. Sort and Return the Result
return sorted(cnt.items())
The cnt.items()
method returns key-value pairs as tuples (value, total_weight)
. The sorted()
function:
- Converts these tuples into a list
- Sorts them by the first element (value) in ascending order by default
- Returns a list of lists in the format
[[value1, weight1], [value2, weight2], ...]
Time Complexity: O((m + n) log(m + n)) where m and n are the lengths of items1
and items2
. The iteration takes O(m + n) time, and sorting takes O(k log k) where k is the number of unique values (at most m + n).
Space Complexity: O(m + n) for storing the counter with at most m + n unique values.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a concrete example to illustrate how the solution works:
Given:
items1 = [[1, 3], [2, 2], [4, 5]]
items2 = [[1, 1], [3, 4], [4, 2]]
Step 1: Initialize the Counter
We start with an empty counter: cnt = {}
Step 2: Process items from both arrays
Processing items1
:
[1, 3]
: Add value 1 with weight 3 βcnt = {1: 3}
[2, 2]
: Add value 2 with weight 2 βcnt = {1: 3, 2: 2}
[4, 5]
: Add value 4 with weight 5 βcnt = {1: 3, 2: 2, 4: 5}
Processing items2
:
[1, 1]
: Value 1 already exists, add weight 1 to existing 3 βcnt = {1: 4, 2: 2, 4: 5}
[3, 4]
: Add value 3 with weight 4 βcnt = {1: 4, 2: 2, 3: 4, 4: 5}
[4, 2]
: Value 4 already exists, add weight 2 to existing 5 βcnt = {1: 4, 2: 2, 3: 4, 4: 7}
Step 3: Convert to sorted list
- Extract items:
[(1, 4), (2, 2), (3, 4), (4, 7)]
- Sort by value (already sorted in this case)
- Convert to list format:
[[1, 4], [2, 2], [3, 4], [4, 7]]
Final Result: [[1, 4], [2, 2], [3, 4], [4, 7]]
Note how:
- Value 1 appeared in both arrays with weights 3 and 1, resulting in total weight 4
- Value 2 appeared only in items1 with weight 2
- Value 3 appeared only in items2 with weight 4
- Value 4 appeared in both arrays with weights 5 and 2, resulting in total weight 7
Solution Implementation
1from typing import List
2from collections import Counter
3from itertools import chain
4
5
6class Solution:
7 def mergeSimilarItems(
8 self, items1: List[List[int]], items2: List[List[int]]
9 ) -> List[List[int]]:
10 """
11 Merge two lists of items where each item is [value, weight].
12 Items with the same value have their weights summed.
13 Returns a sorted list of [value, total_weight] pairs.
14
15 Args:
16 items1: First list of [value, weight] pairs
17 items2: Second list of [value, weight] pairs
18
19 Returns:
20 Sorted list of [value, total_weight] pairs
21 """
22 # Initialize a counter to store cumulative weights for each value
23 weight_counter = Counter()
24
25 # Iterate through both lists combined and accumulate weights by value
26 for value, weight in chain(items1, items2):
27 weight_counter[value] += weight
28
29 # Convert counter items to list and sort by value (key)
30 # Counter.items() returns (key, value) pairs which become [value, total_weight]
31 return sorted(weight_counter.items())
32
1class Solution {
2 public List<List<Integer>> mergeSimilarItems(int[][] items1, int[][] items2) {
3 // Create an array to store the total weight for each value
4 // Index represents the value, element represents the total weight
5 // Size 1010 assumes values are in range [1, 1000]
6 int[] weightCount = new int[1010];
7
8 // Add weights from items1 to the count array
9 // Each item is [value, weight]
10 for (int[] item : items1) {
11 weightCount[item[0]] += item[1];
12 }
13
14 // Add weights from items2 to the count array
15 for (int[] item : items2) {
16 weightCount[item[0]] += item[1];
17 }
18
19 // Create result list to store merged items
20 List<List<Integer>> result = new ArrayList<>();
21
22 // Iterate through the count array to build the result
23 for (int value = 0; value < weightCount.length; value++) {
24 // If there's a positive weight for this value, add it to result
25 if (weightCount[value] > 0) {
26 result.add(List.of(value, weightCount[value]));
27 }
28 }
29
30 return result;
31 }
32}
33
1class Solution {
2public:
3 vector<vector<int>> mergeSimilarItems(vector<vector<int>>& items1, vector<vector<int>>& items2) {
4 // Array to store the sum of weights for each value (index represents the value)
5 // Size 1010 based on problem constraints (values range from 1 to 1000)
6 int weightSum[1010]{};
7
8 // Accumulate weights from the first list of items
9 // Each item is [value, weight]
10 for (auto& item : items1) {
11 weightSum[item[0]] += item[1];
12 }
13
14 // Accumulate weights from the second list of items
15 for (auto& item : items2) {
16 weightSum[item[0]] += item[1];
17 }
18
19 // Result vector to store merged items
20 vector<vector<int>> result;
21
22 // Iterate through all possible values and add non-zero entries to result
23 for (int value = 0; value < 1010; ++value) {
24 if (weightSum[value] > 0) {
25 result.push_back({value, weightSum[value]});
26 }
27 }
28
29 return result;
30 }
31};
32
1/**
2 * Merges two arrays of items where each item is [value, weight].
3 * Items with the same value have their weights summed.
4 * @param items1 - First array of [value, weight] pairs
5 * @param items2 - Second array of [value, weight] pairs
6 * @returns Sorted array of merged items with combined weights
7 */
8function mergeSimilarItems(items1: number[][], items2: number[][]): number[][] {
9 // Initialize array to store weight sums for each value (index represents value)
10 // Assuming values range from 0 to 1000
11 const weightSums: number[] = new Array(1001).fill(0);
12
13 // Add weights from first array of items
14 for (const [value, weight] of items1) {
15 weightSums[value] += weight;
16 }
17
18 // Add weights from second array of items
19 for (const [value, weight] of items2) {
20 weightSums[value] += weight;
21 }
22
23 // Convert array to entries and filter out zero-weight items
24 // entries() returns [index, value] pairs where index is the item value
25 // and value is the total weight
26 return [...weightSums.entries()].filter(([_, totalWeight]) => totalWeight !== 0);
27}
28
Time and Space Complexity
The time complexity is O((n + m) * log(n + m))
where n
and m
are the lengths of items1
and items2
respectively.
- Iterating through both lists using
chain(items1, items2)
takesO(n + m)
time - Building the Counter by adding values takes
O(n + m)
time - The
sorted()
function oncnt.items()
sorts at mostn + m
unique items, which takesO((n + m) * log(n + m))
time
The overall time complexity is dominated by the sorting step: O((n + m) * log(n + m))
The space complexity is O(n + m)
where n
and m
are the lengths of items1
and items2
respectively.
- The Counter
cnt
stores at mostn + m
key-value pairs (in the worst case where all values are unique) - The
sorted()
function returns a new list containing at mostn + m
items
Therefore, the space complexity is O(n + m)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Using a Regular Dictionary Instead of Counter
A common mistake is using a regular dictionary without properly handling missing keys:
Incorrect approach:
def mergeSimilarItems(self, items1, items2):
weight_dict = {}
for value, weight in chain(items1, items2):
weight_dict[value] += weight # KeyError if value doesn't exist!
return sorted(weight_dict.items())
Why it fails: This will raise a KeyError
when encountering a value for the first time since weight_dict[value]
doesn't exist yet.
Solutions:
- Use
Counter()
as shown in the original solution (automatically handles missing keys) - Or manually check for key existence:
weight_dict = {} for value, weight in chain(items1, items2): if value in weight_dict: weight_dict[value] += weight else: weight_dict[value] = weight
- Or use
defaultdict(int)
:
from collections import defaultdict
weight_dict = defaultdict(int)
for value, weight in chain(items1, items2):
weight_dict[value] += weight
2. Forgetting to Return a List of Lists
The problem expects the output as a 2D array (list of lists), but sorted(cnt.items())
returns a list of tuples:
Potential issue:
return sorted(cnt.items()) # Returns [(1, 5), (2, 3), ...] instead of [[1, 5], [2, 3], ...]
Why it might matter: While Python often treats tuples and lists interchangeably, some test systems or strict type checkers might expect the exact format List[List[int]]
.
Solution if strict list format is required:
return [[value, weight] for value, weight in sorted(cnt.items())]
3. Incorrectly Handling the Sorting
Some might try to sort by weight instead of value, or forget to sort entirely:
Incorrect approaches:
# Sorting by weight (second element) instead of value
return sorted(cnt.items(), key=lambda x: x[1])
# Forgetting to sort at all
return list(cnt.items())
Solution: Always sort by value (the first element), which sorted()
does by default when no key is specified.
4. Inefficient Merging with Nested Loops
Attempting to merge using nested loops to find matching values:
Inefficient approach:
def mergeSimilarItems(self, items1, items2):
result = []
processed = set()
# Process items1
for v1, w1 in items1:
if v1 not in processed:
total_weight = w1
# Search for matching value in items2
for v2, w2 in items2:
if v1 == v2:
total_weight += w2
result.append([v1, total_weight])
processed.add(v1)
# Add remaining items from items2
for v2, w2 in items2:
if v2 not in processed:
result.append([v2, w2])
return sorted(result)
Why it's problematic: This has O(m*n) time complexity for the merging phase alone, making it inefficient for large inputs.
Solution: Use a hash table approach as in the original solution for O(m+n) merging time.
Which data structure is used to implement priority queue?
Recommended Readings
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Donβt Miss This!