677. Map Sum Pairs
Problem Description
The problem requires designing a data structure that can map strings to integer values and efficiently calculate the sum of values for strings that start with a certain prefix. This data structure must support two operations:
- Insert a key-value pair. If the key already exists, update its value.
- Compute the sum of values for all keys that begin with a specific prefix.
This is useful in scenarios where you need to store key-value pairs and perform prefix sum queries, like in an autocomplete feature where you might want to know the aggregate score of suggestions based on a certain prefix.
Intuition
The intuition behind this solution involves using two main concepts: a hash map and a trie (prefix tree). The hash map (defaultdict
) is used to store key-value pairs, ensuring that we can update the value associated with a key quickly, in constant time (O(1)).
However, a hash map alone is not efficient for prefix-related queries, which is where the trie comes in. A trie is a tree-like data structure particularly suited for handling prefix related operations. Each node in a trie represents a letter, and each path from the root to a node represents a prefix formed by the corresponding letters.
Here's how the two structures work together in the solution:
-
Whenever we insert a new key-value pair, we first calculate the difference (
x
) between the new value and the existing value associated with the key in the hash map (which is 0 if the key is new). This difference is then used to update values in the trie. We update every node in the path of the key in the trie by this difference so that every prefix that includes this key has the correct sum. -
The
insert
operation updates the hash map with the new value for the key and traverses the trie to reflect these changes across all relevant prefixes. -
The
sum
operation simply traverses the trie along the path formed by the prefix to find the sum of all values associated with the keys that start with that prefix. Theval
stored at each trie node is used to calculate this sum, which is the cumulative value of all strings passing through that node.
This approach leads to an efficient solution since the insert operation is only as slow as the length of the word being inserted, and the sum operation is only as slow as the length of the prefix we're querying, both being O(m) where m is the length of the respective strings.
Learn more about Trie patterns.
Solution Approach
The solution is implemented using a combination of a Trie and a hash map. Here's how the main components of the solution work:
-
Trie Node: The
Trie
class represents nodes of the trie with two properties:children
: An array of 26 elements representing the 26 lowercase English letters.val
: An integer value storing the cumulative sum at the current node.
-
Trie Insertion:
insert
method of theTrie
class is used to add a stringw
along with an integerx
indicating the difference to be added at each node. For each character inw
, we calculate its index and proceed down the trie, creating new nodes if necessary. Theval
at each node is incremented byx
. -
Trie Search:
search
method is for computing the sum for a given prefix. It traverses the trie using each character of the prefix. If at any point the next character of the prefix does not exist in the trie, the method returns 0. If the prefix is found in the trie, it returns theval
at the last node of the prefix. -
MapSum Class: This class utilizes a hash map (
self.d
) to keep track of the most recent values for each key. -
Initialization: In the constructor of
MapSum
, we initialize the hash map (self.d
) and the trie (self.tree
). -
Inserting Key-Value Pair: The
insert
method first calculates the differencex
between the new value and the current value of the key in the hash map. It then updates the hash map with the new value for the key. Lastly, it calls theinsert
method on the trie to adjust the cumulative sums for the nodes in the path of the inserted key. -
Sum of Prefix: The
sum
method uses thesearch
method on the trie to return the sum of all values in the map that have a key with the given prefix.
This solution ensures that values are only updated in relevant paths of the trie and that the trie reflects the correct cumulative sums for each prefix at any given time. The insert
and sum
operations take O(m) time where m is the length of the input string, and this makes the operations efficient, especially considering multiple queries for different prefixes.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's demonstrate the solution approach using a small example. Suppose we are developing an autocomplete system and we want to use our data structure to store the strings along with their associated scores.
First, we initialize our data structure.
# Initialization of data structure mapsum = MapSum()
- We insert the key-value pair ('apple', 3) into our
MapSum
.
# Insert key-value pair ('apple', 3) mapsum.insert('apple', 3)
In the insert
method, since 'apple' is a new key, its difference x
is 3 (since the old value is implicitly 0). The hash map self.d
is updated to store {'apple': 3}, and the trie nodes for 'a', 'p', 'p', 'l', and 'e' are created or updated with the cumulative sum so far. Each of the nodes along the path will have a value of 3 added to their val
.
- Now, let's insert another key-value pair ('app', 2).
# Insert key-value pair ('app', 2) mapsum.insert('app', 2)
The difference x
this time is 2. The hash map now stores {'apple': 3, 'app': 2}. The trie is updated such that the nodes for the path 'app' now have an additional 2 added to their val
.
- If we now query the sum of keys that start with the prefix 'ap',
# Calculate the sum of values for keys starting with 'ap'
total_sum = mapsum.sum('ap') # Should return 5, since 'apple' and 'app' start with 'ap'
The trie is traversed through the path of 'a' to 'p' summing up the values. Since 'apple' and 'app' both start with 'ap', their values are added, which is 3 + 2 = 5. Thus, total_sum
would be 5.
- Let's update the value for 'app' by inserting a new value for it.
# Update the existing key 'app' with a new value mapsum.insert('app', 4)
Initially, the value in the hash map for 'app' was 2, so the difference x
is now 4 - 2 = 2. The hash map is updated to {'apple': 3, 'app': 4}, and the trie nodes on the path for 'app' have an additional 2 added to their val
. The node for 'apple' does not change since we're only updating the prefix 'app'.
- Querying the sum of keys with the prefix 'ap' again,
# Re-calculate the sum of values for keys starting with 'ap' after the update
total_sum = mapsum.sum('ap') # Should return 7, because we now have 'apple': 3 and 'app': 4
Upon calling the sum
method of MapSum
, the traversal in the trie for 'ap' now sums up to 7, as both 'apple' and 'app' are updated.
Through this example, we can see how the combination of a hash map and a trie makes it efficient to not only insert and update key-value pairs, but also compute sums based on prefixes with time complexity tied to the length of the input strings.
Solution Implementation
1from collections import defaultdict
2from typing import List
3
4
5class TrieNode:
6 def __init__(self):
7 # Initialize children for each letter of the alphabet
8 self.children: List[TrieNode | None] = [None] * 26
9 # Value to be accumulated at each node
10 self.value: int = 0
11
12 def insert(self, word: str, val: int):
13 # Start at the root node
14 node = self
15 # Iterate over each character in the word
16 for char in word:
17 index = ord(char) - ord('a') # Calculate index in the alphabet
18 # If the child doesn't exist, create a new TrieNode
19 if node.children[index] is None:
20 node.children[index] = TrieNode()
21 node = node.children[index]
22 # Add the value to the node's value
23 node.value += val
24
25 def search(self, word: str) -> int:
26 # Start at the root node
27 node = self
28 # Iterate over each character in the word
29 for char in word:
30 index = ord(char) - ord('a') # Calculate index in the alphabet
31 # If a child doesn't exist, return 0 as we have no match
32 if node.children[index] is None:
33 return 0
34 node = node.children[index]
35 # Return the accumulated value
36 return node.value
37
38
39class MapSum:
40 def __init__(self):
41 # Dictionary to keep track of inserted keys and their values
42 self.dictionary = defaultdict(int)
43 # Trie data structure to keep track of the prefixes and cumulative sum
44 self.trie_root = TrieNode()
45
46 def insert(self, key: str, val: int) -> None:
47 # Calculate the difference from the current value to update
48 delta = val - self.dictionary[key]
49 # Update or set the key's value
50 self.dictionary[key] = val
51 # Insert the key with the updated difference into the trie
52 self.trie_root.insert(key, delta)
53
54 def sum(self, prefix: str) -> int:
55 # Find the sum for the prefix in the trie
56 return self.trie_root.search(prefix)
57
58
59# The MapSum object usage
60# map_sum = MapSum()
61# map_sum.insert(key, value)
62# total = map_sum.sum(prefix)
63
1class Trie {
2 // Each Trie node keeps an array of its children for each letter of the alphabet
3 private Trie[] children = new Trie[26];
4 // Value to store the sum of the inserted values for words that pass through the node
5 private int value;
6
7 // Inserts a word with its corresponding value into the Trie
8 public void insert(String word, int valueToAdd) {
9 Trie node = this;
10 for (int i = 0; i < word.length(); ++i) {
11 int index = word.charAt(i) - 'a'; // Convert char to index (0-25)
12 if (node.children[index] == null) {
13 node.children[index] = new Trie();
14 }
15 node = node.children[index];
16 node.value += valueToAdd; // Update the value at each node in the path
17 }
18 }
19
20 // Searches the Trie for a word and returns its value
21 public int search(String word) {
22 Trie node = this;
23 for (int i = 0; i < word.length(); ++i) {
24 int index = word.charAt(i) - 'a';
25 if (node.children[index] == null) {
26 return 0;
27 }
28 node = node.children[index];
29 }
30 return node.value;
31 }
32}
33
34class MapSum {
35 // Dictionary to keep track of the original key-value pairs
36 private Map<String, Integer> dictionary = new HashMap<>();
37 // Trie structure to store prefixes and their corresponding summed values
38 private Trie trie = new Trie();
39
40 public MapSum() {
41 // Constructor is empty because no initialization is needed
42 }
43
44 // Inserts a key-value pair into the MapSum
45 public void insert(String key, int value) {
46 // Compute the difference of the new value and the previous value of the key
47 int delta = value - dictionary.getOrDefault(key, 0);
48 dictionary.put(key, value); // Update the dictionary
49 trie.insert(key, delta); // Insert the delta to update values along the path in Trie
50 }
51
52 // Returns the sum of all values that have a key with the prefix specified
53 public int sum(String prefix) {
54 return trie.search(prefix);
55 }
56}
57
58/**
59 * The MapSum class can be instantiated and used as shown in the example below:
60 *
61 * MapSum mapSum = new MapSum();
62 * mapSum.insert("key", 3);
63 * int sum = mapSum.sum("ke");
64 *
65 * This will return the sum of all the values corresponding to keys that start with "ke".
66 */
67
1#include <vector>
2#include <string>
3#include <unordered_map>
4
5// Trie node implementation for storing the string as key and the sum as value.
6class TrieNode {
7public:
8 TrieNode() : children(26, nullptr), value(0) {
9 }
10
11 // Inserts a string into the Trie and updates the nodes with the given value.
12 void insert(const std::string& word, int valueToAdd) {
13 TrieNode* node = this;
14 for(char ch : word) {
15 ch -= 'a';
16 if (!node->children[ch]) {
17 node->children[ch] = new TrieNode();
18 }
19 node = node->children[ch];
20 node->value += valueToAdd;
21 }
22 }
23
24 // Searches for a string in the Trie and returns the sum of all values of nodes corresponding to the prefix.
25 int search(const std::string& word) {
26 TrieNode* node = this;
27 for (char ch : word) {
28 ch -= 'a';
29 if (!node->children[ch]) {
30 return 0;
31 }
32 node = node->children[ch];
33 }
34 return node->value;
35 }
36
37private:
38 std::vector<TrieNode*> children; // Children of this node in the trie.
39 int value; // Value associated with the node.
40};
41
42// MapSum class using TrieNode to store keys and values, and support increment of values.
43class MapSum {
44public:
45 // Initializes the MapSum object.
46 MapSum(): root(new TrieNode()) {
47 }
48
49 // Inserts the key-value pair into the map. If the key already exists, value is replaced.
50 void insert(const std::string& key, int val) {
51 int delta = val;
52 if (keysCount.find(key) != keysCount.end()) {
53 delta -= keysCount[key];
54 }
55 keysCount[key] = val;
56 root->insert(key, delta);
57 }
58
59 // Returns the sum of all values that have the given prefix.
60 int sum(const std::string& prefix) {
61 return root->search(prefix);
62 }
63
64private:
65 std::unordered_map<std::string, int> keysCount; // Tracks the latest value of keys to calculate deltas.
66 TrieNode* root; // The trie's root node.
67};
68
69/**
70 * Your MapSum object will be instantiated and called as such:
71 * MapSum* obj = new MapSum();
72 * obj->insert(key, val);
73 * int param_2 = obj->sum(prefix);
74 */
75
1interface TrieNode {
2 children: (TrieNode | null)[];
3 value: number;
4}
5
6// A Trie data structure for storing strings and their values
7const trieRoot: TrieNode = {
8 children: new Array(26).fill(null),
9 value: 0
10};
11
12// Adds a word to the Trie and updates the values along the path
13function insertIntoTrie(word: string, value: number, node: TrieNode): void {
14 for (const char of word) {
15 const index = char.charCodeAt(0) - 'a'.charCodeAt(0);
16 if (node.children[index] === null) {
17 node.children[index] = { children: new Array(26).fill(null), value: 0 };
18 }
19 node = node.children[index]!;
20 node.value += value;
21 }
22}
23
24// Searches the Trie and returns the total value for the given prefix
25function searchTrie(prefix: string, node: TrieNode): number {
26 for (const char of prefix) {
27 const index = char.charCodeAt(0) - 'a'.charCodeAt(0);
28 if (!node.children[index]) {
29 return 0;
30 }
31 node = node.children[index]!;
32 }
33 return node.value;
34}
35
36// A map to store the key-value pairs for MapSum operations
37const mapSumDictionary: Map<string, number> = new Map<string, number>();
38
39// Inserts a key with its value into MapSum and updates the Trie
40function insertMapSum(key: string, value: number): void {
41 const diff = value - (mapSumDictionary.get(key) || 0);
42 mapSumDictionary.set(key, value);
43 insertIntoTrie(key, diff, trieRoot);
44}
45
46// Computes the sum of all values in MapSum that have the given prefix
47function sumMapSum(prefix: string): number {
48 return searchTrie(prefix, trieRoot);
49}
50
51// Usage example:
52// insertMapSum("apple", 3);
53// console.log(sumMapSum("ap")); // Output will be 3
54// insertMapSum("app", 2);
55// console.log(sumMapSum("ap")); // Output will be 5
56
Time and Space Complexity
Trie Class
- The
Trie
classinsert
method has a time complexity ofO(L)
, whereL
is the length of the wordw
. This is because it touches each character in the word once. - The space complexity of the
insert
method is alsoO(L)
, considering the worst-case scenario where each letter of the word corresponds to a new Trie node. - The
search
method in theTrie
class has a time complexity ofO(L)
as well, as it only traverses the word once.
MapSum Class
-
The
insert
method in theMapSum
class has a time complexity ofO(L)
for the same reasons as theTrie
insert
method, whereL
is the length of the key. The updating of the dictionaryself.d
has a constant time complexityO(1)
, but since we also call theTrie
insert method, the overall time complexity remainsO(L)
. -
The space complexity of
insert
depends on whether new Trie nodes are created. In the worst case, it isO(L)
, whereL
is the length of the key. The dictionaryself.d
has a space complexity ofO(N)
, whereN
is the number of unique keys inserted since each key-value pair is stored in the dictionary. -
The
sum
method has a time complexity ofO(L)
, which is required to traverse the Trie. After reaching the end of the prefix in the Trie, the value is obtained inO(1)
time. There is no significant space complexity for thesum
operation, so it isO(1)
.
Overall, the space complexity of the MapSum
class as a whole is driven by the space used to store all the unique keys in the dictionary and Trie nodes, which sums up to O(N * L)
where N
is the number of unique keys and L
is the average length of the keys.
Learn more about how to find time and space complexity quickly using problem constraints.
What data structure does Breadth-first search typically uses to store intermediate states?
Recommended Readings
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!