642. Design Search Autocomplete System
Problem Description
The problem requires building an autocomplete system much like those found in modern search engines. When a user types a sentence, the system should suggest the top 3 historical sentences that share the same prefix as the input. These suggestions are based on two key factors:
- The frequency of past inputs, reflecting how often each sentence has been previously typed to gauge its 'hotness' (popularity).
- The lexicographical order, which acts as a tiebreaker when sentences have the same frequency.
The system should update its suggestions with each new character typed, except when the special character '#' is entered, indicating the end of the sentence. At that point, the input should be recorded in the system to influence future suggestions, and the output should be an empty array.
To manage this process, we'll need to implement two main components:
AutocompleteSystem(sentences, times)
: Construct an autocomplete system with pre-recorded sentences and their corresponding frequencies.input(c)
: Process each character input by the user. If the character is '#', update the system with the new sentence. Otherwise, find and return the top 3 suggestions.
Intuition
To efficiently manage the input sentences and find the top suggestions, we can implement a Trie Data structure (prefix tree). A Trie is particularly well-suited for this task since it can quickly lookup suggestions based on input prefixes.
For the implementation, we'll follow these steps:
-
Initialize the Trie with preloaded sentences and their frequencies. For this, implement an insert function in the Trie that will store the sentence frequencies as well as the sentences themselves.
-
For each input character, navigate through the Trie to find the node corresponding to the prefix formed by the typed characters up to that point.
-
From that node, perform a depth-first search (DFS) traversing to gather all possible sentence suggestions from that prefix. Collect their frequencies and sentences themselves.
-
After gathering all possible suggestions, sort them based on the two criteria mentioned (frequency and lexicographical order).
-
Return the top 3 suggestions based on the sorting. If fewer than 3 suggestions are available, return all that exist.
-
If the input character is '#', ensure to insert the completed sentence into the Trie, which can then influence future suggestions.
By structuring the Trie nodes to hold values representing sentence frequency (v
) and the sentence itself (w
), and by leveraging sorting on the collected results, we accomplish the desired autocomplete functionality.
The key to the solution is efficient Trie traversal and effective use of sorting to identify the hottest sentences quickly. This approach minimizes the computation needed for each character input and ensures that each new entry dynamically updates the Trie for future suggestions.
Learn more about Trie and Data Stream patterns.
Solution Approach
The core of the autocomplete system is built on a Trie data structure, which allows for efficient insertion and searching of prefixes for the given sentences. The goal of the Trie is to provide a means of retrieving all sentences starting with a particular prefix that the user has entered up until that moment. Here is an outline of the implementation pattern used:
-
Trie Construction: The Trie node is initialized with 27 child nodes, one for each letter of the alphabet plus a space character. The Trie node also contains two properties:
v
to hold the frequency of the sentence, andw
to store the sentence itself. -
Insert Function in Trie:
- The
insert
method adds a sentence to the Trie. As it moves through each character in the sentence, it determines the appropriate child node, creating a new Trie node if necessary. - The index of the child node is determined by converting the character to its alphabetical index, with a special case for the space character, which is mapped to 26.
- When the end of the sentence is reached, the node's frequency or 'hotness' value
v
is incremented by the provided frequencyt
, and the sentencew
is stored in the node.
- The
-
Search Function in Trie:
- The
search
method navigates through the Trie based on the input prefix. If a character isn't found, the method returnsNone
, indicating no suggestions available for this prefix.
- The
-
Depth-First Search (DFS) for Suggestions:
- The
dfs
auxiliary function recursively traverses all children nodes starting from the node that matches the current input prefix to find all completed sentences stemming from it. - Each completed sentence and its frequency are collected in a results list
res
.
- The
-
Input Handling:
- When
input(c)
is called with a characterc
, it appends the character to a temporary listt
representing the current partial sentence being entered. - If
c
is '#', the user has finished typing, and this completed sentence is inserted into the Trie with a frequency increment of1
. An empty list is returned, and the temporary listt
is reset. - If
c
is not '#', the current prefix is used to search the Trie, and thendfs
is called to collect all possible sentences from this prefix.
- When
-
Sorting and Obtaining Top 3 Results:
- After the DFS is complete, the list of collected suggestions is sorted first by frequency in decreasing order and then by lexical order.
- The top 3 suggestions are extracted using array slicing (
res[:3]
), ensuring that if fewer suggestions are available, all are returned.
By combining the Trie structure for storing sentences efficiently and carefully designed DFS traversal with precise sorting logic, the implementation effectively provides the needed autocomplete suggestions. It's important to note that the Trie data structure is selected due to its ability to greatly optimize the lookup time for suggested sentences compared to simpler list-based or brute-force search approaches.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example. Imagine we have the following historical sentences with their corresponding frequencies:
- "i love you", frequency = 5
- "island", frequency = 3
- "i love leetcode", frequency = 2
- "ironman", frequency = 1
We initialize the AutocompleteSystem
with these sentences. Now, let's consider a user starts typing the string "i".
-
Trie Construction: Our Trie has been initialized with the given sentences and their frequencies. The structure will hold nodes for "i love you", "island", "i love leetcode", and "ironman" under nodes corresponding to "i → l" and "i → s" and "i → r".
-
Processing Input 'i': The Trie is traversed to the node corresponding to "i". Since there is no sentence completion yet, we move to the next step.
-
Depth-First Search (DFS) for Suggestions: From the "i" node, we do a DFS to collect all potential sentences. The system finds "i love you", "i love leetcode", "island", and "ironman".
-
Sorting Candidates: We sort these sentences by their frequencies and lexicographically:
- "i love you" – frequency 5
- "i love leetcode" – frequency 2
- "island" – frequency 3
- "ironman" – frequency 1
Sorted results: ["i love you", "island", "i love leetcode", "ironman"]
-
Return Top 3 Suggestions:
- After sorting, the top 3 are "i love you", "island", and "i love leetcode". So those are returned as suggestions to the user.
-
User Completes the Sentence: Suppose the user types "i love leetcode#" to complete the sentence.
-
Updating Trie: The input "i love leetcode" is now added to the Trie, updating its frequency from 2 to 3 because the user has entered it once more.
Now, if the user starts a new input with "i", the system will perform the same steps and will suggest: "i love you" (frequency 5), "i love leetcode" (updated frequency 3), and "island" (frequency 3). Since "i love leetcode" and "island" have the same frequency, lexicographic sorting decides the order, hence "i love leetcode" comes before "island".
Solution Implementation
1class TrieNode:
2 def __init__(self):
3 # Initialize 27 children for each letter in the alphabet plus the space character
4 self.children = [None] * 27
5 self.value = 0 # Frequency of word ending at this node
6 self.word = '' # The word ending at this node, if any
7
8 def insert(self, word, frequency):
9 # Inserts a word into the trie along with its frequency
10 node = self
11 for char in word:
12 index = 26 if char == ' ' else ord(char) - ord('a') # Mapping 'a'-'z' to 0-25, ' ' to 26
13 if node.children[index] is None:
14 node.children[index] = TrieNode()
15 node = node.children[index]
16 node.value += frequency
17 node.word = word
18
19 def search(self, prefix):
20 # Searches for a node in the trie that corresponds to the given prefix
21 node = self
22 for char in prefix:
23 index = 26 if char == ' ' else ord(char) - ord('a')
24 if node.children[index] is None:
25 return None
26 node = node.children[index]
27 return node
28
29class AutocompleteSystem:
30 def __init__(self, sentences, times):
31 self.trie = TrieNode()
32 for sentence, frequency in zip(sentences, times):
33 self.trie.insert(sentence, frequency)
34 self.typed_characters = [] # Keeps track of characters typed by the user
35
36 def input(self, character):
37 # Returns autocomplete suggestions based on the characters inputted so far
38
39 def dfs(node):
40 # Perform a depth-first search to find all words with their frequencies
41 if node is None:
42 return
43 if node.value:
44 results.append((node.value, node.word))
45 for next_node in node.children:
46 dfs(next_node)
47
48 if character == '#':
49 # The user finished typing a word; update the trie and reset the input
50 current_sentence = ''.join(self.typed_characters)
51 self.trie.insert(current_sentence, 1) # Increment the frequency of the word
52 self.typed_characters = []
53 return []
54
55 results = []
56 self.typed_characters.append(character)
57 current_prefix = ''.join(self.typed_characters)
58 node = self.trie.search(current_prefix)
59 # If the prefix doesn't exist in the trie, return an empty list
60 if node is None:
61 return []
62 # Otherwise, find and return autocomplete suggestions
63 dfs(node)
64 # Sort the results in descending order of frequency, and alphabetically for ties
65 results.sort(key=lambda x: (-x[0], x[1]))
66 # Return the top 3 suggestions, or fewer if there aren't enough
67 return [entry[1] for entry in results[:3]]
68
69# The AutocompleteSystem class can be instantiated and used as follows:
70# autocomplete_system = AutocompleteSystem(sentences, times)
71# suggestions = autocomplete_system.input(character)
72
1import java.util.ArrayList;
2import java.util.List;
3import java.util.PriorityQueue;
4
5class Trie {
6 Trie[] children = new Trie[27]; // 26 English letters and 1 space
7 int frequency;
8 String word = "";
9
10 // Inserts a word into the trie with its associated frequency
11 void insert(String word, int frequency) {
12 Trie node = this;
13 for (char ch : word.toCharArray()) {
14 int index = ch == ' ' ? 26 : ch - 'a'; // Convert char to index, space represented by 26
15 if (node.children[index] == null) {
16 node.children[index] = new Trie();
17 }
18 node = node.children[index];
19 }
20 node.frequency += frequency; // Update the frequency of the node (word-end)
21 node.word = word;
22 }
23
24 // Searches for a node with a given prefix in the trie
25 Trie search(String prefix) {
26 Trie node = this;
27 for (char ch : prefix.toCharArray()) {
28 int index = ch == ' ' ? 26 : ch - 'a';
29 if (node.children[index] == null) {
30 return null;
31 }
32 node = node.children[index];
33 }
34 return node;
35 }
36}
37
38class AutocompleteSystem {
39 private Trie rootTrie = new Trie(); // Trie root
40 private StringBuilder currentInput = new StringBuilder();
41
42 // Constructor which takes an array of sentences and their corresponding
43 // frequencies to build the trie
44 public AutocompleteSystem(String[] sentences, int[] times) {
45 for (int i = 0; i < sentences.length; i++) {
46 rootTrie.insert(sentences[i], times[i]);
47 }
48 }
49
50 // Processes input character by character to autocomplete
51 public List<String> input(char inputChar) {
52 List<String> autocompleteResult = new ArrayList<>();
53 if (inputChar == '#') {
54 rootTrie.insert(currentInput.toString(), 1); // Insert the current input into the trie
55 currentInput = new StringBuilder(); // Reset builder for next input
56 return autocompleteResult; // Autocomplete list is empty for new input
57 }
58 currentInput.append(inputChar); // Adding new character to current input
59 Trie node = rootTrie.search(currentInput.toString());
60 if (node == null) {
61 return autocompleteResult; // If no node found, return empty list
62 }
63 PriorityQueue<Trie> minHeap
64 = new PriorityQueue<>((a, b) -> a.frequency == b.frequency ? b.word.compareTo(a.word) : a.frequency - b.frequency);
65 depthFirstSearch(node, minHeap);
66 while (!minHeap.isEmpty()) {
67 autocompleteResult.add(0, minHeap.poll().word);
68 }
69 return autocompleteResult; // Return the list of autocompleted words
70 }
71
72 // DFS to find all words with the same prefix, use a priority queue to sort them based on frequency and lexicographical order
73 private void depthFirstSearch(Trie node, PriorityQueue<Trie> minHeap) {
74 if (node == null || minHeap.size() > 3 && node.frequency < minHeap.peek().frequency) {
75 return; // Early return condition to prune the search
76 }
77 if (node.frequency > 0) {
78 minHeap.offer(node); // If it's a word, offer to the min-heap
79 if (minHeap.size() > 3) {
80 minHeap.poll(); // Keep the size of min-heap no larger than 3
81 }
82 }
83 for (Trie childNode : node.children) {
84 depthFirstSearch(childNode, minHeap); // Recursively search children
85 }
86 }
87}
88
89/**
90 * Your AutocompleteSystem object will be instantiated and called as such:
91 * AutocompleteSystem obj = new AutocompleteSystem(sentences, times);
92 * List<String> param_1 = obj.input(c);
93 */
94
1#include <vector>
2#include <string>
3#include <unordered_map>
4#include <queue>
5
6using namespace std;
7
8// Trie node structure
9class Trie {
10public:
11 unordered_map<char, Trie*> children;
12 int frequency;
13 string word;
14
15 // Constructor
16 Trie() : frequency(0), word("") {}
17
18 // Destructor
19 ~Trie() {
20 for(auto child : children) {
21 delete child.second;
22 }
23 }
24
25 // Inserts a word into the trie with its associated frequency
26 void insert(string word, int frequency) {
27 Trie* node = this;
28 for (char ch : word) {
29 if (node->children.find(ch) == node->children.end())
30 node->children[ch] = new Trie();
31
32 node = node->children[ch];
33 }
34 node->frequency += frequency; // Update the frequency of the node (word-end)
35 node->word = word;
36 }
37
38 // Searches for a node with a given prefix in the trie
39 Trie* search(string prefix) {
40 Trie* node = this;
41 for (char ch : prefix) {
42 if (node->children.find(ch) == node->children.end())
43 return nullptr;
44
45 node = node->children[ch];
46 }
47 return node;
48 }
49};
50
51// AutocompleteSystem class to provide autocomplete functionality
52class AutocompleteSystem {
53private:
54 Trie* root Trie;
55 string currentInput;
56
57 // Comparator for priority queue
58 struct TrieComparator {
59 bool operator()(Trie* a, Trie* b) {
60 if(a->frequency == b->frequency) return a->word > b->word;
61 return a->frequency < b->frequency;
62 }
63 };
64
65public:
66 // Constructor which takes a vector of sentences and their corresponding frequencies to build the trie
67 AutocompleteSystem(vector<string>& sentences, vector<int>& times) {
68 rootTrie = new Trie();
69 for (int i = 0; i < sentences.size(); i++) {
70 rootTrie->insert(sentences[i], times[i]);
71 }
72 }
73
74 // Destructor to safely delete the Trie
75 ~AutocompleteSystem() {
76 delete rootTrie;
77 }
78
79 // Processes input character by character to autocomplete
80 vector<string> input(char inputChar) {
81 if (inputChar == '#') {
82 rootTrie->insert(currentInput, 1); // Insert the current input into the trie
83 currentInput.clear(); // Reset builder for next input
84 return vector<string>(); // Return empty vector
85 }
86
87 currentInput += inputChar; // Adding new character to current input
88 Trie* node = rootTrie->search(currentInput);
89 if (!node) {
90 return vector<string>(); // If no node found, return empty list
91 }
92
93 priority_queue<Trie*, vector<Trie*>, TrieComparator> minHeap;
94 depthFirstSearch(node, minHeap);
95
96 vector<string> autocompleteResult;
97 while (!minHeap.empty() && autocompleteResult.size() < 3) {
98 autocompleteResult.insert(autocompleteResult.begin(), minHeap.top()->word);
99 minHeap.pop();
100 }
101 return autocompleteResult; // Return the list of autocompleted words
102 }
103
104private:
105 // DFS to find all words with the same prefix, use a priority queue to sort them based on frequency and lexical order
106 void depthFirstSearch(Trie* node, priority_queue<Trie*, vector<Trie*>, TrieComparator>& minHeap) {
107 if (!node) return;
108 if (node->frequency > 0) {
109 minHeap.push(node); // If it's a word, add it to the min-heap
110 if (minHeap.size() > 3) minHeap.pop(); // If more than 3 elements, remove the one with the lowest frequency
111 }
112 for (auto pair : node->children) {
113 depthFirstSearch(pair.second, minHeap); // Recursively search children
114 }
115 }
116};
117
118// Usage example (not part of the class definition)
119/*
120int main() {
121 vector<string> sentences = {"i love you", "island", "iroman", "i love leetcode"};
122 vector<int> times = {5, 3, 2, 2};
123 AutocompleteSystem obj(sentences, times);
124 vector<string> param_1 = obj.input('i');
125 // Continue processing input characters...
126 return 0;
127}
128*/
129
1// Define the structure of a Trie node
2type TrieNode = {
3 children: TrieNode[];
4 frequency: number;
5 word: string;
6};
7
8// Initialize the root Trie node with 27 children for 26 letters and one space
9let rootTrie: TrieNode = createTrieNode();
10
11// Function to create a new Trie node
12function createTrieNode(): TrieNode {
13 return {
14 children: new Array(27).fill(null), // Initialize an array of 27 null children
15 frequency: 0,
16 word: ""
17 };
18}
19
20// Inserts a word into the trie with its associated frequency
21function insert(word: string, frequency: number): void {
22 let node = rootTrie;
23 for (let ch of word) {
24 const index = ch === ' ' ? 26 : ch.charCodeAt(0) - 'a'.charCodeAt(0);
25 if (node.children[index] === null) {
26 node.children[index] = createTrieNode();
27 }
28 node = node.children[index];
29 }
30 node.frequency += frequency;
31 node.word = word;
32}
33
34// Searches for a node with a given prefix in the trie
35function search(prefix: string): TrieNode | null {
36 let node = rootTrie;
37 for (let ch of prefix) {
38 const index = ch === ' ' ? 26 : ch.charCodeAt(0) - 'a'.charCodeAt(0);
39 if (node.children[index] === null) {
40 return null;
41 }
42 node = node.children[index];
43 }
44 return node;
45}
46
47// Current input buffer
48let currentInput: string[] = [];
49
50// Autocomplete system constructor mimicking
51function autocompleteSystem(sentences: string[], times: number[]): void {
52 for (let i = 0; i < sentences.length; i++) {
53 insert(sentences[i], times[i]);
54 }
55}
56
57// Processes input character by character to autocomplete
58function input(inputChar: string): string[] {
59 let autocompleteResult: string[] = [];
60 if (inputChar === '#') {
61 insert(currentInput.join(''), 1); // Construct the current input string and insert it into the trie
62 currentInput = []; // Reset current input for next input
63 return autocompleteResult; // Autocomplete list is empty for new input
64 }
65 currentInput.push(inputChar); // Add new character to current input
66 const node = search(currentInput.join(''));
67 if (node === null) {
68 return autocompleteResult; // If no node found, return empty list
69 }
70
71 // Create a min-heap priority queue to store TrieNode elements
72 const minHeap: TrieNode[] = [];
73 depthFirstSearch(node, minHeap);
74 while (minHeap.length) {
75 const nextItem = minHeap.shift();
76 if (nextItem) autocompleteResult.push(nextItem.word);
77 }
78 return autocompleteResult; // Return the list of autocompleted words
79}
80
81// DFS to find all words with the same prefix
82function depthFirstSearch(node: TrieNode, minHeap: TrieNode[]): void {
83 if (!node || (minHeap.length > 2 && node.frequency < minHeap[0].frequency)) {
84 return; // Stop searching if conditions are met
85 }
86 if (node.frequency > 0) {
87 minHeap.push(node); // If the node is a word, push it to the min-heap
88 minHeap.sort((a, b) => b.frequency - a.frequency || a.word.localeCompare(b.word)); // Sort by frequency and lex order
89 if (minHeap.length > 3) {
90 minHeap.shift(); // Remove the lowest frequency node if the heap size is greater than 3
91 }
92 }
93 for (let childNode of node.children) {
94 if (childNode) {
95 depthFirstSearch(childNode, minHeap); // Recursively search all children
96 }
97 }
98}
99
100/**
101 * Example usage:
102 * autocompleteSystem(["i love you", "island", "ironman", "i love leetcode"], [5, 3, 2, 2]);
103 * console.log(input('i')); // Expected autocomplete results for prefix 'i'
104 * console.log(input('#')); // Completing the input and expecting empty result for new input
105 */
106
Time and Space Complexity
Trie Insertion - insert
method
-
Time Complexity: The time complexity of inserting a word
w
with lengthn
into the trie isO(n)
. This is because we have to visit each character in the word and check/create a node in the trie for each character. -
Space Complexity: The space complexity for inserting a word is
O(n)
. In the worst-case scenario, where no prefix of the word is shared with previously inserted words, we create a new node for each character.
Trie Search - search
method
-
Time Complexity: The time complexity for searching a prefix
pref
of lengthm
isO(m)
, as we need to visit each character's node in the trie to check if the prefix exists. -
Space Complexity: The space complexity is
O(1)
for searching, since we're only iterating through existing nodes without using additional space proportional to the size of the input.
Autocomplete Input - input
method
-
Time Complexity: The
input
function has several components:-
Appending a character to the current input, which is
O(1)
. -
Searching for the current input in the trie, which is
O(m)
wherem
is the length of the current input string. -
Depth-First Search (DFS) to find all words with the given prefix, which can take up to
O(p + q * k)
wherep
is the total number of trie-nodes,q
is the number of nodes that contains words, andk
is the average length of the words stored. -
Sorting the results, if there are
x
completions found by DFS, the complexity of sorting isO(x * log(x))
.
Adding all of that together and considering the worst-case scenario, the time complexity for the input method is
O(m + p + q * k + x * log(x))
. -
-
Space Complexity: The space complexity here is dominated by the DFS operation, which stores all the potential completions. In the worst case, it stores
O(q)
completions, and the space required for the sorting operation. Therefore, the space complexity isO(q * k)
, wherek
is the average length of the words stored.
Note: List[str]
and times: List[int]
are Python-specific type hints. The time complexity does not change, but for clarity and context, they indicate the types of the arguments sentences
and times
respectively. The sentences
is a list of strings, and times
is a list of integers.
Learn more about how to find time and space complexity quickly using problem constraints.
In a binary min heap, the minimum element can be found in:
Recommended Readings
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
Median of Data Stream Given a stream of numbers find the median number at any given time accurate to 1 decimal place Example add_number 1 add_number 2 add_number 3 get_median 2 0 add_number 4 get_median 2 5 Try it yourself Explanation Intuition Brute force way is to sort the entire list every time we get a new number This would be O N log N for each add_number operation One step up is to notice that the list is sorted before we add any new number to it So every
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Want a Structured Path to Master System Design Too? Don’t Miss This!