981. Time Based Key-Value Store
Problem Description
The problem is about creating a data structure that allows storing multiple values for the same key but at different timestamps, similar to a versioning system. The TimeMap
should support two operations:
set
: This operation stores the key with the value at the given timestamp.get
: This operation retrieves the value associated with a key at a specific timestamp. The catch here is that if the exact timestamp doesn't exist for that key, we need to provide the value with the closest previous timestamp. If there are no earlier timestamps, it should return an empty string.
To understand it better, imagine you are building a history tracking system for document edits. Each time someone edits a document, you record the new version of the document with a timestamp. Later, someone might want to see the version at a particular time. If a version is not available at the exact time requested, you show the latest version before the requested time.
Intuition
The key to solving this problem lies in efficiently managing the history of each key's values along with their timestamps. The intuition is to use a hash map to store keys but with a twist: each key has a list of pairs, each containing a timestamp and the corresponding value at that timestamp.
Given this structure, implementing the set
function becomes straightforward: simply append the (timestamp, value)
pair to the list associated with the key.
However, the challenge is in the get
function, where you're asked to retrieve a value that is the closest to, but not greater than, the given timestamp. This is where binary search comes into play. The list of (timestamp, value) pairs for each key can be considered as sorted by timestamp. Using binary search, we that the closest previous timestamp to the given timestamp without going over it.
The binary search is implemented using the bisect_right
function from Python's bisect
module, which returns the index where an element should be inserted to maintain the sorted order. We search for a tuple where the first element is the given timestamp, and the second element is a dummy character chosen to be larger than any possible value (in this case, chr(127)
which is the last ASCII character). If bisect_right
provides a non-zero index, we step one index back and return the associated value. If the list is empty or the index is 0, it means there are no valid timestamps before the given timestamp, and as per problem description, we return an empty string.
Learn more about Binary Search patterns.
Solution Approach
The solution is implemented with the following key ideas:
-
Use of Defaultdict: The
TimeMap
class leverages a defaultdict with lists as its default value. This is a choice of data structure from thecollections
module in Python which helps in automatically initializing a list for each new key without checking for the key's existence. The structureself.ktv
is a dictionary where each key has a list of(timestamp, value)
tuples. -
Storing Values with TimeStamps:
- The
set
method appends a tuple consisting of thetimestamp
and thevalue
to the list associated with thekey
. Sinceset
is always called with increasing timestamp order, the list of values for each key naturally remains sorted by timestamp.
- The
-
Retrieving Values:
- The
get
method uses a binary search algorithm to quickly find the correct value for a given timestamp. - It checks whether the
key
exists inself.ktv
. If not, it returns an empty string since there are no entries for thatkey
. - If the key exists, a binary search is performed using the
bisect_right
function, which finds the insertion point in the list of values to maintain the sorted order. - The target of the binary search is a tuple where the first element is the
timestamp
and the second is a very high-value character.chr(127)
works as the dummy character becausechr(127)
will always be greater than any string value, ensuringbisect_right
returns the position for the latest timestamp that is not greater than the given timestamp. - If
bisect_right
returns zero, it means there were no timestamps less than or equal to thetimestamp
that we passed, so we return an empty string. - Otherwise, we subtract one from the index that
bisect_right
returned to find the largest timestamp less than or equal to thetimestamp
requested. Then we return the value associated with this timestamp.
- The
Here's the encapsulated code structure:
- Initialization:
def __init__(self): self.ktv = defaultdict(list)
- set:
def set(self, key: str, value: str, timestamp: int) -> None: self.ktv[key].append((timestamp, value))
- get:
def get(self, key: str, timestamp: int) -> str: if key not in self.ktv: return '' tv = self.ktv[key] i = bisect_right(tv, (timestamp, chr(127))) return tv[i - 1][1] if i else ''
Overall, the use of binary search ensures that the retrieval of values is efficient, even when the number of timestamps grows large. It allows the get
operation to have a time complexity of O(log n), where n is the number of timestamps associated with a key, making the solution scalable.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the process with a simple example to illustrate the solution approach.
Suppose we are using the TimeMap
to record the status of a project at different times.
- At time 1, the status is "Started"
- At time 3, the status updates to "In Progress"
- At time 5, the status finally changes to "Completed"
We would perform the following operations on our TimeMap
:
set("projectStatus", "Started", 1)
set("projectStatus", "In Progress", 3)
set("projectStatus", "Completed", 5)
Now, our TimeMap
data structure, self.ktv
, has an entry for "projectStatus", which is a list of timestamp and value tuples like the following:
"projectStatus": [(1, "Started"), (3, "In Progress"), (5, "Completed")]
Next, let's retrieve the status of the project at different times using the get
operation:
get("projectStatus", 2)
: This should return "Started" since it's the latest status before time 2.get("projectStatus", 4)
: This should return "In Progress" for the same reason.get("projectStatus", 5)
: This should return "Completed" as it matches the timestamp exactly.get("projectStatus", 0)
: This should return an empty string since there are no statuses recorded before time 1.
Let's walk through the first get
operation step-by-step:
- Look up "projectStatus" in
self.ktv
and find the list[(1, "Started"), (3, "In Progress"), (5, "Completed")]
. - Using
bisect_right
, find the index where we would place(2, chr(127))
to maintain sorted order.- The sorted position would be index 1 (between
(1, "Started")
and(3, "In Progress")
).
- The sorted position would be index 1 (between
- Subtract one from the index:
1 - 1 = 0
. - We return the value associated with this index, so
("projectStatus", 2)
returns"Started"
.
Now, imagine checking for get("projectStatus", 0)
:
- Look up "projectStatus" and find the same list as above.
- With
bisect_right
, we find the index where(0, chr(127))
would fit, which gives us index 0 because 0 is less than any existing timestamps in the list. - Since the index is 0, it would be incorrect to subtract 1 to find the previous timestamp, as this would lead to a negative index. Instead, we return an empty string, as there's no status before time 1.
Therefore, ("projectStatus", 0)
returns an empty string, which aligns with the functionality described in the problem. This example confirms that the binary search method provides an efficient and effective way to retrieve the correct value, complying with the problem's requirements.
Solution Implementation
1from collections import defaultdict
2from bisect import bisect_right
3
4class TimeMap:
5 def __init__(self):
6 # Initialize a dictionary to store the lists of (timestamp, value) tuples for each key.
7 self.key_time_value = defaultdict(list)
8
9 def set(self, key: str, value: str, timestamp: int) -> None:
10 # Append the (timestamp, value) tuple to the list corresponding to the key.
11 self.key_time_value[key].append((timestamp, value))
12
13 def get(self, key: str, timestamp: int) -> str:
14 # If the key does not exist in the dictionary, return an empty string.
15 if key not in self.key_time_value:
16 return ''
17
18 # Retrieve the list of (timestamp, value) tuples for the given key.
19 time_value_pairs = self.key_time_value[key]
20
21 # Use bisect_right to find the index where the timestamp would be inserted
22 # to maintain the list in sorted order. Since the list is sorted by timestamp,
23 # and we want to find the largest timestamp less than or equal to the given timestamp,
24 # we use (timestamp, chr(127)) as a "high" value to ensure we can locate
25 # timestamps that are equal to the given one.
26 index = bisect_right(time_value_pairs, (timestamp, chr(127)))
27
28 # If index is 0, there is no timestamp less than or equal to the given timestamp,
29 # so we return an empty string. If not, we return the value corresponding to
30 # the timestamp immediately before the insertion point, which would be index - 1.
31 return time_value_pairs[index - 1][1] if index else ''
32
33# Your TimeMap object will be instantiated and called as such:
34# obj = TimeMap()
35# obj.set(key,value,timestamp)
36# param_2 = obj.get(key,timestamp)
37
1import java.util.HashMap;
2import java.util.Map;
3import java.util.TreeMap;
4
5class TimeMap {
6 // Using a Map where each key is a string and its value is a TreeMap that associates timestamps with values
7 private Map<String, TreeMap<Integer, String>> keyTimeValueMap = new HashMap<>();
8
9 /**
10 * Initializes the TimeMap object.
11 */
12 public TimeMap() {
13 // Constructor does not need to initialize anything since the HashMap is already initialized.
14 }
15
16 /**
17 * Stores the key with the value along with the given timestamp.
18 *
19 * @param key the key to store
20 * @param value the value associated with the key
21 * @param timestamp the timestamp at which the value is set for the key
22 */
23 public void set(String key, String value, int timestamp) {
24 // Compute if absent will insert a new TreeMap if the key is not already present
25 // It then puts the given timestamp and value into the TreeMap for the given key
26 keyTimeValueMap.computeIfAbsent(key, k -> new TreeMap<>()).put(timestamp, value);
27 }
28
29 /**
30 * Retrieves the value for the key at the given timestamp or the last value set before the timestamp.
31 *
32 * @param key the key to retrieve the value for
33 * @param timestamp the timestamp to retrieve the value at
34 * @return the value set for the key at the given timestamp or the closest previous timestamp
35 */
36 public String get(String key, int timestamp) {
37 // If the key does not exist, return an empty string
38 if (!keyTimeValueMap.containsKey(key)) {
39 return "";
40 }
41 // Retrieve the TreeMap for the given key
42 TreeMap<Integer, String> timeValueMap = keyTimeValueMap.get(key);
43 // Find the greatest timestamp less than or equal to the given timestamp
44 Integer closestTimestamp = timeValueMap.floorKey(timestamp);
45 // If there is no timestamp that is less than or equal to the given timestamp, return an empty string
46 // Otherwise, return the associated value
47 return closestTimestamp == null ? "" : timeValueMap.get(closestTimestamp);
48 }
49}
50
1#include <string>
2#include <vector>
3#include <unordered_map>
4#include <algorithm>
5using namespace std;
6
7class TimeMap {
8public:
9 /** Initialize your data structure here. */
10 TimeMap() {
11 // Constructor remains empty as there's no initialization needed beyond that of member variables.
12 }
13
14 /**
15 * Stores the key with its value along with the timestamp in the map.
16 * The values associated with each key are stored in a vector of pairs, sorted by their timestamps.
17 * @param key The string key of the value.
18 * @param value The value to associate with the key.
19 * @param timestamp The time at which the value was set.
20 */
21 void set(string key, string value, int timestamp) {
22 // Appending the pair <timestamp, value> to the vector associated with the key.
23 keyTimeValueMap[key].emplace_back(timestamp, value);
24 }
25
26 /**
27 * Retrieves a value associated with the key at a particular timestamp.
28 * If the specific timestamp doesn't exist, the value at the most recent timestamp before that is returned.
29 * @param key The key whose value needs to be retrieved.
30 * @param timestamp The timestamp at which the value was needed.
31 * @return The value at the given timestamp or the last known value before the given timestamp.
32 */
33 string get(string key, int timestamp) {
34 // Reference to the vector of pairs for the given key.
35 auto& pairs = keyTimeValueMap[key];
36
37 // Constructing a dummy pair up to the current timestamp to use in searching.
38 pair<int, string> dummyPair = {timestamp, string({127})}; // 127 is used as the end of the range character.
39
40 // Finding the upper bound, which will point at the first element that is greater than the dummy pair.
41 auto it = upper_bound(pairs.begin(), pairs.end(), dummyPair);
42
43 // If the iterator is at the beginning, there are no elements less than or equal to the timestamp,
44 // hence return an empty string. Otherwise, decrement iterator to get to the element which has
45 // the latest time less than or equal to the given timestamp.
46 return it == pairs.begin() ? "" : (prev(it))->second;
47 }
48
49private:
50 // Map that associates each key with a vector of pairs, where each pair consists of a timestamp
51 // and the value associated with that timestamp.
52 unordered_map<string, vector<pair<int, string>>> keyTimeValueMap;
53};
54
55/**
56 * Your TimeMap object will be instantiated and called as such:
57 * TimeMap* obj = new TimeMap();
58 * obj->set(key, value, timestamp);
59 * string val = obj->get(key, timestamp);
60 */
61
1type KeyValuePair = { timestamp: number; value: string };
2const keyTimeValueMap: Record<string, KeyValuePair[]> = {};
3
4function set(key: string, value: string, timestamp: number): void {
5 if (!keyTimeValueMap[key]) {
6 keyTimeValueMap[key] = [];
7 }
8 // Append the pair {timestamp, value} to the array associated with the key.
9 keyTimeValueMap[key].push({ timestamp, value });
10}
11
12function get(key: string, timestamp: number): string {
13 const pairs = keyTimeValueMap[key];
14 if (!pairs) {
15 return '';
16 }
17 // Use binary search to find the smallest index i such that pairs[i].timestamp is greater than the timestamp.
18 let low = 0, high = pairs.length;
19 while (low < high) {
20 const mid = Math.floor((low + high) / 2);
21 if (pairs[mid].timestamp <= timestamp) {
22 low = mid + 1;
23 } else {
24 high = mid;
25 }
26 }
27 // If we found an element whose timestamp is greater than our target or there are no
28 // elements with a timestamp <= target, 'low' will be the length of the array or 0, respectively.
29 if (low === 0) {
30 return '';
31 }
32 // Return the last known value before the given timestamp.
33 return pairs[low - 1].value;
34}
35
Time and Space Complexity
Time Complexity
The set
method has a time complexity of O(1)
for each operation, as it appends the (timestamp, value)
tuple to the list corresponding to the key in the defaultdict
.
The get
method's time complexity is O(log N)
for each operation, where N
is the number of entries associated with the specific key. This is because it performs a binary search (bisect_right
) to find the position where the given timestamp
would fit in the sorted order of timestamps stored.
Space Complexity
The space complexity is O(K + T)
, where K
is the number of unique keys and T
is the total number of set
calls or the total number of timestamp and value pairs. This is because all values with their corresponding timestamps for all keys are stored in the defaultdict
.
Learn more about how to find time and space complexity quickly using problem constraints.
Problem: Given a list of tasks and a list of requirements, compute a sequence of tasks that can be performed, such that we complete every task once while satisfying all the requirements.
Which of the following method should we use to solve this problem?
Recommended Readings
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!