Leetcode 635. Design Log Storage System

Problem Explanation:

In the given problem, a unique log is provided which contains identifiers and timestamps. For example: (1, "2017:01:01:23:59:59"). We're asked to make a log storage system to store and retrieve the logs. The timestamp in the log is the format Year:Month:Day:Hour:Minute:Second and all domains (year/month/day/hour/minute/second) are zero-padded decimal numbers. The system should be capable of performing the following functions:

  1. Put(int id, string timestamp)- To add a log by id and timestamp to the system.

  2. Retrieve(string start, string end, string granularity)- To retrieve the id(s) of logs falling between a given start and end timestamp, considering the given level of granularity. For example, if the start timestamp="2017:01:01:23:59:59", end timestamp="2018:01:02:23:59:59", and the granularity is "Year", then the logs belonging to the years 2017 & 2018 should be returned without taking into consideration the month, day, hour, minute, or second.

At most 300 operations of Put or Retrieve can be performed. The hour ranges from [00,23] and year is in the range [2000,2017]. The output for Retrieve isn't required to be sorted.

Solution Approach:

The approach that we use here to solve this problem is quite simple. In this problem a hashmap is used to maintain the correspondence between each granularity and its index in the timestamp. All logs are stored in a dual pair vector. For instance, we assign 2017 as "Year" and its index is 4, similarly we set up the index of other granularities (Month, Day, Hour, Minute, and Second). We would then iterate through all logs to identify those that are within the given range, disregarding specific information based on the granularity level.

For example there are 3 logs: (1, "2017:01:01:23:59:59"), (2, "2017:01:01:22:59:59") and (3, "2016:01:01:00:00:00"). Now if we call the function retrieve("2016:01:01:01:01:01","2017:01:01:23:00:00","Year") it would return [1,2,3] because we are only considering the "Year" granularity and all logs are in the years 2016 and 2017. However, retrieve("2016:01:01:01:01:01","2017:01:01:23:00:00","Hour") would return [1,2] as these logs from 2016:01:01:01 to 2017:01:01:23. The 3rd log isn't considered as it falls outside the hour range.

Python Solution:

1
2python
3class LogSystem(object):
4
5    def __init__(self):
6        self.logs = []
7        self.granularityToIndices = {'Year':4, 'Month':7, 'Day':10, 'Hour':13, 'Minute':16, 'Second':19}
8
9    def put(self, id, timestamp):
10        """
11        :type id: int
12        :type timestamp: str
13        :rtype: void
14        """
15        self.logs.append((id, timestamp))
16
17    def retrieve(self, start, end, granularity):
18        """
19        :type start: str
20        :type end: str
21        :type granularity: str
22        :rtype: List[int]
23        """
24        index = self.granularityToIndices[granularity]
25        return [id  for (id, timestamp) in self.logs if start[:index] <= timestamp[:index] <= end[:index]]

Java Solution:

1
2java
3class LogSystem {
4    private final static HashMap<String, Integer> map = new HashMap<String, Integer>() {
5        {
6            put("Year", 4);
7            put("Month", 7);
8            put("Day", 10);
9            put("Hour", 13);
10            put("Minute", 16);
11            put("Second", 19);
12        }
13    };
14    private List<String[]> timestamps = new ArrayList<>();
15
16    public void put(int id, String timestamp) { timestamps.add(new String[]{Integer.toString(id), timestamp}); }
17
18    public List<Integer> retrieve(String s, String e, String gra) {
19        List<Integer> res = new ArrayList<>();
20        int x = map.get(gra);
21        for (String[] timestamp : timestamps) {
22            if ( timestamp[1].substring(0, x).compareTo(s.substring(0, x)) >= 0 &&  
23                 timestamp[1].substring(0, x).compareTo(e.substring(0, x)) <= 0) res.add(Integer.parseInt(timestamp[0]));
24        }
25        return res;
26    }
27}

Javascript Solution:

1
2javascript
3var LogSystem = function() {
4    this.logs = [];
5    this.granularityToIndices = {'Year':4, 'Month':7, 'Day':10, 'Hour':13, 'Minute':16, 'Second':19};
6};
7
8LogSystem.prototype.put = function(id, timestamp) {
9    this.logs.push([id, timestamp]);
10};
11
12LogSystem.prototype.retrieve = function(start, end, granularity) {
13    let index = this.granularityToIndices[granularity];
14    return this.logs
15        .filter(([id, timestamp]) =>  start.slice(0, index)<=timestamp.slice(0,index) && timestamp.slice(0,index)<= end.slice(0, index))
16        .map(([id,_])=>id);
17};

C++ Solution:

1
2c++
3class LogSystem {
4public:
5    LogSystem() {
6        granularityToIndices["Year"] = 4;
7        granularityToIndices["Month"] = 7;
8        granularityToIndices["Day"] = 10;
9        granularityToIndices["Hour"] = 13;
10        granularityToIndices["Minute"] = 16;
11        granularityToIndices["Second"] = 19;
12    }
13    void put(int id, string timestamp) {
14        idAndTimestamps.push_back({id, timestamp});
15    }
16    vector<int> retrieve(string start, string end, string granularity) {
17        vector<int> ans;
18        int index = granularityToIndices[granularity];
19        string s = start.substr(0, index);
20        string e = end.substr(0, index);
21        for (auto log : idAndTimestamps) {
22            string t = log.second.substr(0, index);
23            if (s <= t && t <= e)
24                ans.push_back(log.first);
25        }
26        return ans;
27    }
28
29private:
30    unordered_map<string, int> granularityToIndices;
31    vector<pair<int, string>> idAndTimestamps;
32};

C# Solution:

1
2csharp
3public class LogSystem {
4    private List<int> idlog = new List<int>();
5    private List<string> timelog = new List<string>();
6
7    private Dictionary<string, int> index = new Dictionary<string, int> { { "Year", 4 }, { "Month", 7}, { "Day", 10 }, { "Hour", 13 }, { "Minute", 16}, { "Second", 19 } };
8
9    public LogSystem() { }
10
11    public void Put(int id, string timestamp) {
12        idlog.Add(id);
13        timelog.Add(timestamp);
14    }
15
16    public IList<int> Retrieve(string s, string e, string gra) {
17        int x = index[gra];
18        List<int> res = new List<int>();
19        for (int i = 0; i < timelog.Count; i++) {
20	     string timestamp = timelog[i];
21	     if(String.Compare(s.Substring(0, x), timestamp.Substring(0, x)) <= 0 &&
22            String.Compare(timestamp.Substring(0, x), e.Substring(0, x)) <= 0)
23                res.Add(idlog[i]);
24	     }
25         return res;
26    }     
27}

Each language solution represents the same logic: We are storing the logs assuming each id and timestamp as an object. For retrieval, we are just filtering the logs based on the start and end timestamp and granularity passed.For granularity, we are just considering a section of timestamp string, for example, if granularity is "Hour", we are just considering the first 13 characters of timestamp string to compare.

Time Complexity Analysis:

The time complexity for the 'put' operation is O(1) as we are directly appending the data or hashmap.

But for retrieve operation, we are traversing through logs and comparing the filtered timestamp, so the time complexity will be O(n), where n is the number of logs.

In the worst case, this will need to traverse all the logs, thus resulting in an O(n) time complexity. Furthermore, if there are more logs, it may take more time. To optimize this, we can leverage binary search or another sorting algorithm in case logs are sorted based on timestamps. However, it's not necessary as the problem mentions a maximum of 300 operations.

Space Complexity Analysis:

The space complexity is also O(n), where n is the number of logs. Each operation of put increases the space by a factor of one which is a constant space hence the space complexity is linear.

Conclusion

In this problem solving, we learned how to use the concept of hashmaps and filtering on a string. We applied these concepts to build an efficient log storage system. We implemented solutions in different languages like Python, JavaScript, C++, Java and C#. All these implementations follow the same approach but with syntax changes with respect to specific languages.

This problem can be extended to consider the logs to be sorted and applying binary search algorithm to reduce time complexity. This implementation can also be scaled from a single machine storage system to a distributed one by just extending the principles used in this problem.


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫