745. Prefix and Suffix Search
Problem Description
The problem requires the design of a specialized dictionary that allows for searching words by both a prefix and a suffix. Here's how the class should work:
- The
WordFilter
class should be initialized with an array ofwords
. These are the words that are going to be added to the dictionary. - The class should have a function
f(pref, suff)
that, given a prefixpref
and a suffixsuff
, returns the highest index of any word that starts withpref
and ends withsuff
. If there are no words that meet these criteria, the function should return -1.
This problem is about efficiently managing a set of strings so that they can be queried by their prefixes and suffixes. Implementing such a structure with fast lookup speeds can be challenging due to the complexity of string comparisons in a potentially large dataset.
Intuition
To solve this problem, we can take advantage of a [trie](https://en.wikipedia.org/wiki/Trie), which is a tree-like data structure that stores a dynamic set of strings in a way that makes it easy to retrieve all the keys with a common prefix. For this problem, we actually need two tries: one for prefixes and one for suffixes. Here's why this approach is effective:
- By inserting each word into the prefix trie normally, and into the suffix trie in reversed order, we efficiently prepare the data for the specific queries we're about to receive.
- Each node in the trie keeps a list of all the indexes of the words that pass through that node, which allows us to find all the words that have a specific prefix or suffix.
- Upon querying, we look up the prefix in the prefix trie to get a list of word indexes with that prefix, and we look up the reversed suffix in the suffix trie to get a list of word indexes with that suffix.
- Once we have both lists of indexes for a given prefix and suffix, we start from the end of each list (since we want the largest index) and compare the indexes. We step backwards through the lists, looking for a match. Because both lists are sorted, we are able to efficiently find the greatest index that appears in both lists, or decide that no such index exists if we reach the beginning of either list.
Given that we are dealing with words which are essentially sequences of characters, a trie allows us to narrow down our search space as we progress each character further into the word, making it a suitable data structure for both time and space efficiency in this scenario.
Learn more about Trie patterns.
Solution Approach
The solution approach involves the construction and utilization of two trie data structures:
-
Prefix Trie (
self.p
): This trie stores all the words of the dictionary by their prefix. When a word is inserted, each character of the word is traversed, creating a tree pathway if it doesn't already exist. At every node, the index of the word in the original dictionary is appended to theindexes
list. This enables the trie to keep track of all the dictionary indexes that share the same prefix up to that point. -
Suffix Trie (
self.s
): The suffix trie holds the words in reversed order to facilitate suffix searching. The process of insertion is similar to the prefix trie, except that the word is inserted backwards. Thus, when searching for a particular suffix, we search the trie with the reversed suffix.
Both tries are filled with each word's respective prefixes and reversed suffixes, along with their dictionary index. When performing a lookup with the f
function:
-
We first search the prefix trie with the given prefix,
pref
. This results in a list of indexes,a
, containing all the words that start with that prefix. -
Similarly, we search the suffix trie with the reversed suffix,
suff[::-1]
. This yields a list of indexes,b
, containing all the words that end with the specified suffix. -
With both lists
a
andb
, we now need to find the largest index that is common to both. Since the indexes are stored in ascending order within the trie, we comparea
andb
's values starting from their ends (the largest indexes). -
We iterate backwards through both lists simultaneously comparing their elements:
-
If the indexes at our current pointers match (
a[i] == b[j]
), we've found the highest index of a word that satisfies both the prefix and suffix requirements. We return this index. -
If
a[i]
is larger thanb[j]
, we decrementi
because we need to find a smaller index ina
that might match an index inb
. -
Conversely, if
b[j]
is larger thana[i]
, we decrementj
to try and find a smaller index inb
that matches an index ina
.
-
-
The iteration continues until either list has been fully traversed (indicated by
~i
or~j
becomingFalse
when the pointers become negative). If we haven't found a matching index by this point, we return-1
.
The clever use of tries allows us to optimize the lookup times for both prefixes and suffixes. Moreover, by keeping track of the indexes at each node of the trie, the solution enables an efficient way to determine the highest index where the prefix and suffix conditions overlap. This approach significantly reduces the complexity compared to brute force methods which would otherwise involve comparing each possible prefix and suffix pair in the dictionary.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's go through an example to illustrate the solution approach using two words: "apple" and "apply". These words will be used to initialize our WordFilter
class.
-
Initializing
WordFilter
: We create aWordFilter
object and pass the words["apple", "apply"]
to the constructor. -
Building Prefix Trie (
self.p
): Here is a simplified visual of the prefix trie after inserting "apple" and "apply":1root 2 | 3 a 4 | 5 p 6/ \
p l
| |
l y
|
e
1As each word is inserted, each character forms a path in the trie if it doesn't exist already. The index of the word is stored at the end of the path for that word. The prefix trie would have the indices of the words as they appear in the list: index 0 for "apple" and index 1 for "apply".
2
33. **Building Suffix Trie (`self.s`):** The suffix trie looks different, as it is constructed using the reversed words. After inserting the reversed words "elppa" and "ylppa", it will look like:
root
|
a
|
p
/
p l
| |
l e
|
y
1Like the prefix trie, the indices are kept at the end of each path - index 0 for "apple" and index 1 for "apply".
2
34. **Querying with `f` Function:** Suppose we want to find a word that has the prefix "ap" and the suffix "le". We call `f("ap", "le")`.
4
55. **Search the Prefix Trie:** We search for "ap" in the prefix trie (`self.p`). This gives us the list `[0, 1]` since both "apple" and "apply" have the prefix "ap".
6
76. **Search the Suffix Trie:** Now, we look up the reversed suffix "le" which becomes "el" in the suffix trie (`self.s`). The list obtained here is `[0]` because only "apple" has the suffix "le".
8
97. **Find Largest Common Index:** We look for the largest index that is common between the two lists:
10- The list from the prefix trie is `[0, 1]`.
11- The list from the suffix trie is `[0]`.
12
13We compare from the ends, but since both lists contain the index `0` and that is the largest value in the suffix list, we return it. Hence, `f("ap", "le")` returns `0`, indicating "apple" is the word that matches these criteria with the highest index.
14
15This approach quickly finds the highest index common to both the specified prefix and suffix without the need to compare each word in the dictionary. It efficiently narrows down the search space and leads to significant time savings, especially when dealing with a large number of queries or a large dictionary.
Solution Implementation
1class TrieNode:
2 def __init__(self):
3 # Initialize the children to be a list of 26 elements for each letter in the alphabet
4 self.children = [None] * 26
5 # Initialize an empty list to hold the indexes at which the words appear
6 self.indexes = []
7
8 def insert(self, word, index):
9 node = self
10 for char in word:
11 char_index = ord(char) - ord('a') # Find the index of the character
12 if node.children[char_index] is None:
13 node.children[char_index] = TrieNode() # Create a TrieNode if it doesn't exist
14 node = node.children[char_index]
15 node.indexes.append(index) # Appends index to indexes list at each node along the path
16
17 def search(self, prefix):
18 node = self
19 for char in prefix:
20 char_index = ord(char) - ord('a') # Convert char to index
21 if node.children[char_index] is None:
22 return [] # Return empty list if character path doesn’t exist
23 node = node.children[char_index]
24 return node.indexes # Return the indexes list if the prefix is found
25
26
27class WordFilter:
28 def __init__(self, words):
29 self.prefix_trie = TrieNode() # Trie to store the prefixes
30 self.suffix_trie = TrieNode() # Trie to store the reverse of the suffixes for easier matching
31 for index, word in enumerate(words):
32 self.prefix_trie.insert(word, index)
33 # Insert reversed word into suffix trie to handle suffix searches
34 self.suffix_trie.insert(word[::-1], index)
35
36 def f(self, prefix, suffix):
37 prefix_indexes = self.prefix_trie.search(prefix)
38 suffix_indexes = self.suffix_trie.search(suffix[::-1]) # Reverse suffix to match with our suffix trie
39 if not prefix_indexes or not suffix_indexes:
40 return -1
41
42 # Use two pointers to find the largest index which is common in both prefix and suffix indexes lists
43 i, j = len(prefix_indexes) - 1, len(suffix_indexes) - 1
44 while i >= 0 and j >= 0:
45 if prefix_indexes[i] == suffix_indexes[j]:
46 return prefix_indexes[i]
47 if prefix_indexes[i] > suffix_indexes[j]:
48 i -= 1
49 else:
50 j -= 1
51 return -1 # Return -1 if no common index is found
52
53
54# The WordFilter object will be instantiated and called as such:
55# word_filter = WordFilter(words)
56# result = word_filter.f(prefix, suffix)
57
1import java.util.ArrayList;
2import java.util.Collections;
3import java.util.List;
4
5// Trie class for storing characters and their associated indexes for easy string search.
6class Trie {
7 private Trie[] children = new Trie[26];
8 private List<Integer> indexes = new ArrayList<>();
9
10 // Insert a word into the trie and map it with its index.
11 void insert(String word, int index) {
12 Trie node = this;
13 for (char ch : word.toCharArray()) {
14 int charIndex = ch - 'a';
15 if (node.children[charIndex] == null) {
16 node.children[charIndex] = new Trie();
17 }
18 node = node.children[charIndex];
19 node.indexes.add(index);
20 }
21 }
22
23 // Search for a prefix in the trie and return a list of mapped indexes.
24 List<Integer> search(String prefix) {
25 Trie node = this;
26 for (char ch : prefix.toCharArray()) {
27 int charIndex = ch - 'a';
28 if (node.children[charIndex] == null) {
29 return Collections.emptyList();
30 }
31 node = node.children[charIndex];
32 }
33 return node.indexes;
34 }
35}
36
37// WordFilter class for prefix and suffix search functionality.
38class WordFilter {
39 private Trie prefixTrie = new Trie();
40 private Trie suffixTrie = new Trie();
41
42 // Constructor that takes an array of words and builds prefix and suffix tries.
43 public WordFilter(String[] words) {
44 for (int i = 0; i < words.length; ++i) {
45 String word = words[i];
46 prefixTrie.insert(word, i);
47 // Reverse the word for suffix operations and insert into the suffix trie.
48 suffixTrie.insert(new StringBuilder(word).reverse().toString(), i);
49 }
50 }
51
52 // Method to find the max index of a word that has the given prefix and suffix.
53 public int f(String prefix, String suffix) {
54 // Reverse suffix as we stored the words in reversed form in the suffix trie.
55 suffix = new StringBuilder(suffix).reverse().toString();
56 // Get the list of indexes for both prefix and suffix
57 List<Integer> prefixIndexes = prefixTrie.search(prefix);
58 List<Integer> suffixIndexes = suffixTrie.search(suffix);
59 // Return -1 if either list is empty.
60 if (prefixIndexes.isEmpty() || suffixIndexes.isEmpty()) {
61 return -1;
62 }
63 // Use two pointers to find the common max index in both lists.
64 int i = prefixIndexes.size() - 1;
65 int j = suffixIndexes.size() - 1;
66 while (i >= 0 && j >= 0) {
67 int prefixIndex = prefixIndexes.get(i);
68 int suffixIndex = suffixIndexes.get(j);
69 if (prefixIndex == suffixIndex) {
70 // If indexes match, return the index as it has the required prefix and suffix.
71 return prefixIndex;
72 }
73 // Move pointers based on which pointer has the larger index.
74 if (prefixIndex > suffixIndex) {
75 --i;
76 } else {
77 --j;
78 }
79 }
80 // If no common index is found return -1.
81 return -1;
82 }
83}
84
85/**
86 * The following comments would be used to explain how the class should be used:
87 *
88 * // Instantiate a WordFilter object.
89 * WordFilter obj = new WordFilter(words);
90 *
91 * // Find the maximum index of a word with the given prefix and suffix.
92 * int maxIndex = obj.f(pref, suff);
93 */
94
1#include <vector>
2#include <string>
3#include <unordered_map>
4using namespace std;
5
6class WordFilter {
7private:
8 // Dictionary to store the mapping between the concatenated prefix and suffix to the index of the word
9 unordered_map<string, int> prefixSuffixMap;
10
11public:
12 // Constructor that precomputes the prefix and suffix combinations
13 WordFilter(vector<string>& words) {
14 // Loop over each word to compute prefix and suffix combinations
15 for (int wordIndex = 0; wordIndex < words.size(); ++wordIndex) {
16 string word = words[wordIndex];
17 int wordLength = word.size();
18
19 // Create all possible prefix and suffix combinations for current word
20 for (int prefixLength = 0; prefixLength <= wordLength; ++prefixLength) {
21 string prefix = word.substr(0, prefixLength);
22 for (int suffixStart = 0; suffixStart <= wordLength; ++suffixStart) {
23 string suffix = word.substr(suffixStart, wordLength - suffixStart);
24 // Store combination in the dictionary, using dot as a separator
25 prefixSuffixMap[prefix + "." + suffix] = wordIndex;
26 }
27 }
28 }
29 }
30
31 // Function to find the index of the word with given prefix and suffix
32 int f(string prefix, string suffix) {
33 // Create the combined key as was done during precomputation
34 string key = prefix + "." + suffix;
35 // If the combination exists in the map, return the index
36 if (prefixSuffixMap.count(key)) {
37 return prefixSuffixMap[key];
38 }
39 // If the combination is not found, return -1
40 return -1;
41 }
42};
43
44/**
45 * Your WordFilter object will be instantiated and called as such:
46 * WordFilter* obj = new WordFilter(words);
47 * int param_1 = obj->f(pref, suff);
48 */
49
1// Store the dictionary globally to map the concatenated prefix and suffix to the word's index
2const prefixSuffixMap: Record<string, number> = {};
3
4// Function to precompute the prefix and suffix combinations for an array of words
5function initializeWordFilter(words: string[]): void {
6 // Iterate over each word to compute prefix and suffix combinations
7 words.forEach((word, wordIndex) => {
8 const wordLength = word.length;
9 // Generate all possible prefix and suffix combinations for the current word
10 for (let prefixLength = 0; prefixLength <= wordLength; ++prefixLength) {
11 const prefix = word.substring(0, prefixLength);
12 for (let suffixStart = 0; suffixStart <= wordLength; ++suffixStart) {
13 const suffix = word.substring(suffixStart);
14 // Store the combination in the global dictionary, using a dot as a separator
15 prefixSuffixMap[`${prefix}.${suffix}`] = wordIndex;
16 }
17 }
18 });
19}
20
21// Function to find the index of the word with a given prefix and suffix
22function f(prefix: string, suffix: string): number {
23 // Create the combined key as was done during precomputation
24 const key = `${prefix}.${suffix}`;
25 // Return the index if the combination exists in the map, otherwise return -1
26 return prefixSuffixMap.hasOwnProperty(key) ? prefixSuffixMap[key] : -1;
27}
28```
29
30Usage:
31
32```typescript
33// Initialize with a list of words
34initializeWordFilter(["apple", "banana", "cherry"]);
35
36// Find the index of the word with a given prefix and suffix
37const index = f("a", "e"); // Returns the index of "apple" if present
38console.log(index); // Output should be the index of the word "apple" or -1 if not found
39
Time and Space Complexity
Time Complexity:
-
Trie
Construction:- O(m * n)
, wherem
is the average length of the words, andn
is the total number of words. We process each word letter by letter and at each letter of each word, we may potentially create a new Trie node. -
WordFilter
Constructor:- O(m * n)
. The constructor itself calls theinsert
method of theTrie
class for each word and its reverse once, which is2 * m * n
. -
insert
method:- O(m)
. We need to traverse or create Trie nodes for each character in the word and push the index intoindexes
array which can take up toO(m)
operations wherem
is the length of the word. -
search
method:- O(m + k)
, wherek
is the number of entries in the index list. For each character in the prefix or suffix, we traverse the Trie, and then we traverse the list of indexes at the final Trie node. -
f
function: It depends on how many indexes are there to process. If the lengths of both lists area_i
andb_i
for prefix and suffix respectively, then the complexity is- O(a_i + b_i)
, as we are iterating through both lists at most once. In the worst case scenario, it could be- O(n)
, wheren
is the total number of words since every single word could potentially match the prefix or suffix.
Space Complexity:
-
Trie
Storage:- O(26 * m * n)
, each node could potentially have up to 26 children and we could have a unique path for every single letter in every word. However, due to common prefixes, the space complexity might be less in practice. -
indexes
Arrays Storage:- O(n * m)
, each index could be stored in the list of potentially every node in the path of a word in the worst case, wherem
is the average length of the words andn
is the number of words.
Overall, for the WordFilter
data structure, the space complexity can still be approximated as - O(26 * m * n)
due to the Trie nodes being the dominating factor.
Learn more about how to find time and space complexity quickly using problem constraints.
Depth first search is equivalent to which of the tree traversal order?
Recommended Readings
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
Patterns The Shortest Path Algorithm for Coding Interviews The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we can determine the
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Got a question? Ask the Monster Assistant anything you don't understand.
Still not clear? Submit the part you don't understand to our editors. Or join our Discord and ask the community.