A self-balancing binary search tree. In modern languages, this is normally implemented as a Red-black tree.
Operation | Average | Worst Case |
---|---|---|
Search | O(log(n)) | O(log(n)) |
Insert | Amortized O(1) | O(log(n)) |
Insert | Amortized O(1) | O(log(n)) |
Java has collection library support (java.util.TreeMap) for TreeMap.
C++'s std::map is a red-black tree TreeMap Implementation. Note that name is rather tricky here. std::map is TreeMap(ordered) and the more common hashmap is actually std::unordered_map.
Not in the standard library. There are third-party libraries such as http://www.grantjenks.com/docs/sortedcontainers/
TreeMap comes in handy when you need
If the requirements are only 1) and 2), we can get away with a hashmap and a list/array and that's exactly what most interview problems ask.
Here's an example:
The problem asks us to implement a data structure that maps one (key, timestamp) to one value and being able to look up using key and timestamp. We can solve this using a hashmap that stores key to (timestamp, value) pairs. When we receive a new data point, we append the (timestamp, value) pair. Because timestamps are strictly increasing, the list is always sorted and we can use binary search to answer the get query quickly.
from collections import defaultdict
class TimeMap:
def __init__(self):
self.memo = defaultdict(list)
def set(self, key: str, value: str, timestamp: int) -> None:
# store a timestamp, value pair in the list
self.memo[key].append([timestamp, value])
def get(self, key: str, timestamp: int) -> str:
# binary search on timestamp in the list
index = bisect.bisect(self.memo[key], [timestamp+1]) -1
if index < 0:
return ""
return self.memo[key][index][1]
This gives a time complexity of O(1) set and O(log(n)) get.
It's important to note that this problem has two unique conditions
If the timestamps are not strictly increasing, then to keep the (timestamp, value) pairs sorted, we need to find and insert into the position a new data point belongs to. This requires moving each element after the insertion point by one position which is an O(n) operation. Similarly, if we need to delete a timestamp, we need find, delete and move which is also an O(n) operation.
In this case, if python had TreeMap, we could have used it instead of the hashmap + binary search and achieve O(log(n)) for both get and set.
Other problems that are best solved by TreeMap but can also be solved using alternative methods:
If the problem involves range queries, we can implement a segment tree, which is much simpler than the red-black tree that backs TreeMap.
A segment Tree is a special data structure that allows quick range queries. It's not nearly as hard to implement correctly as red-black tree and is something you could do in an interview. It's a great option when you need to do range query and do not have access to builtin treemap. A quintessential question is Range Max.
In summary,
Now that we have covered Treemap, let's talk about Treeset.
A TreeSet
is a set-based data structure that provides logarithmic time performance for common operations like add
, remove
, and contains
. This efficiency arises from the fact that, under the hood, a TreeSet
is typically backed by a balanced binary search tree, such as a Red-Black Tree.
Unlike regular sets, which don't guarantee any specific order of elements, TreeSet
guarantees the elements to be in ascending order, according to their natural ordering or based on a provided comparator.
Python's standard library does not include a TreeSet
implementation out-of-the-box. Python developers typically leverage the built-in set
for most use-cases or use sorted containers, like sortedcontainers.SortedSet
, for situations where sorted order is essential.
However, the idea behind TreeSet
can be found in many programming languages, with Java being the most notable proponent of the TreeSet
data structure in its standard library.
nums[i]
and nums[j]
are at most t
apart and i and j are at most k
apart. A TreeSet
can help maintain a window of k
numbers sorted, making it easy to check the conditions.Max Sum of Rectangle No Larger Than K
Data Stream as Disjoint Intervals
TreeSet
or a sorted set structure can help in merging overlapping intervals.And that's a wrap for Python treemap and treeset. Check out the other pattens you probably should know for your next interview.