1032. Stream of Characters
Problem Description
The problem asks us to design an algorithm that continuously receives characters and needs to check whether any suffix (the end part of the characters received so far) is a string that appears in a given list of strings. Essentially, the algorithm has to be able to say whether the characters seen most recently form a word that we are looking for.
This is an example of a problem where we have to deal with a stream of data, meaning the input is not fixed, but keeps coming in over time (imagine typing on a keyboard, where each keystroke is a new character in the stream).
Suppose we are given a set of words like ["hello", "leet", "code"]
. Now imagine characters start coming in: ['c', 'o', 'd', 'e']
. After the last character ('e') arrives, the algorithm must recognize that 'code' is one of the words it has to detect and return true.
Intuition
To solve this problem effectively, we should use a data structure that allows for quick searches of prefixes or suffixes of words, such as a Trie (also known as a prefix tree). In typical usage, Tries are built with prefixes of words, but in this problem, because we are interested in suffixes, the Trie will be built backwards with words inserted in reverse order.
As characters stream in, we add them to a buffer (a list, for instance) that represents the current stream state. After each new character is added to the stream, we want to check if there exists any word that ends with the sequence of the letters currently in the buffer. Therefore, we check the Trie with the characters in the buffer in reverse order since our Trie is built with words in reverse order.
To optimize the search, we can keep the length of the search within the longest word in our list. There's no need to search further back in the stream because no word will be longer than the maximum word length in the list.
The Trie class has standard functionalities: it can insert
a reversed word so that each node in the Trie represents a character, and it has a flag is_end
to indicate if the current node is the end of a inserted word. The Trie also has a search
method that checks if a given list of characters (in reverse) matches any word that ends with those characters. If it does, it returns true
immediately upon finding a word's end. Otherwise, if it goes through all characters without finding a match, it returns false
.
The StreamChecker
class maintains an instance of the Trie. With each query, the StreamChecker
adds the new character to the buffer and performs a search in the Trie for the recently added characters up to the limit. If a word is found, it returns true
. Otherwise, it returns false
.
In conclusion, each query
operation involves a search in the Trie, which is efficient in terms of time complexity, making it well-suited for handling streams of characters where suffix matching is needed.
Learn more about Trie and Data Stream patterns.
Solution Approach
To implement the solution for the StreamChecker
problem, we make use of Trie data structure because of its ability to handle queries related to prefixes and, in this case, suffixes of words. Here's how the solution is implemented:
-
Trie Class: First, we define a
Trie
class which will be used to store the words in a way that makes it efficient to search for suffixes.-
The Trie consists of nodes, where each node contains an array of 26 elements (representing the 26 lowercase English letters) and a boolean flag
is_end
which indicates whether the node corresponds to the end of a word. -
insert
method: This inserts a word into the Trie. However, since we need to check for suffixes, each word is inserted in reverse. For instance, if the word is"abc"
, the Trie is updated with the letters 'c', 'b', 'a' in that order, starting from the Trie root. -
search
method: Takes a list of characters (representing the current stream) and checks if any suffix is present in the Trie. We pass the stream list sliced to the lastlimit
characters to maintain efficiency and only check the recent part of the stream. This search is performed in reverse since words have been stored in reverse in the Trie.
-
-
StreamChecker Class: This class uses an instance of the Trie to check the stream.
-
The constructor
__init__
: Initializes a new Trie and inserts all the words in reverse. It also initializes a listcs
which will be used to hold the characters from the stream. It sets alimit
to the maximum number of characters to consider for a query which can be optimized based on the maximum length of the words provided. -
query
method: Accepts a character and appends it to thecs
list which represents the current state of the stream; then, it calls thesearch
method of the Trie and passes a slice of the stream list (to reduce the unnecessary searches beyond the longest word's length).
-
In the query
method, it is important to note that the characters are used from the end of the list cs[-self.limit:]
, because the Trie has been constructed with words in reverse. This makes sure that we are always looking for a matching suffix in the Trie that corresponds with the most recently added characters to the stream.
With this implementation, whenever a new character arrives, we are able to query the Trie and get a response in efficient time. This solution ensures that even with a continuously growing stream, the query operations remain tractable due to the nature of the Trie data structure.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example using the set of words ["hello", "leet", "code"]
that our StreamChecker
should watch out for. Here's how it will work step by step:
-
Building the Trie: We begin by inserting the reverse of each word from our given list into the Trie. After this process, the Trie would have nodes representing the letters 'e', 't', 'o', 'c' (from "code"), 't', 'e', 'e', 'l' (from "leet"), and 'o', 'l', 'l', 'e', 'h' (from "hello").
-
Initializing
StreamChecker
: We create an instance ofStreamChecker
and during its initialization, the above words are inserted in reverse into the Trie. Thecs
buffer is initialized to store the stream of characters andlimit
is set to 5, which is the length of the longest word "hello". -
Processing the Stream: Characters start to arrive one by one. Let's consider the stream
['c', 'o', 'd', 'e', 'b', 'l', 'e', 'e', 't']
.-
When 'c' arrives, we add it to
cs
and check the Trie. No suffix formed yet, so it returnsfalse
. -
We keep adding subsequent characters: 'o', 'd'. Still, no recognized suffix, so return
false
each time. -
Upon adding 'e', we check again. This time, the Trie recognizes the suffix 'e', 'd', 'o', 'c' (which is "code" in reverse), and returns
true
, as "code" is one of the words we were looking for. -
The next letters are 'b', 'l', 'e', and 'e', all return
false
since they do not form a recognized suffix. -
Finally, when 't' is added, we check the subarray
cs[-5:]
which contains ['b', 'l', 'e', 'e', 't']. We reverse it and now look for 't', 'e', 'e', 'l', 'b' in the Trie. It recognizes the suffix 't', 'e', 'e', 'l' as "leet" in reverse and returnstrue
.
-
At each step, the query
method is efficiently checking the last limit
characters in the stream, which guarantees that as the character stream grows, we are only ever considering a manageable slice of it. This is why the Trie is an excellent choice for this type of problem; it keeps queries fast, even as more data comes in.
Solution Implementation
1class Trie:
2 def __init__(self):
3 # Initialize a list for the children, representing 26 lowercase English letters.
4 self.children = [None] * 26
5 # Mark this as the end of a word in the trie.
6 self.is_end_of_word = False
7
8 def insert(self, word: str):
9 # Insert a word into the trie with reversed order.
10 node = self
11 for char in word[::-1]:
12 index = ord(char) - ord('a') # Map the character to an index.
13 if node.children[index] is None:
14 node.children[index] = Trie() # Create a new Trie node if it does not exist.
15 node = node.children[index] # Move to the next node.
16 node.is_end_of_word = True # Mark the end of a word.
17
18 def search(self, word: list[str]) -> bool:
19 # Search for a word in the trie and returns True if it exists.
20 node = self
21 for char in word[::-1]: # Check each character in reversed order.
22 index = ord(char) - ord('a')
23 if node.children[index] is None:
24 return False # Return False if character path does not exist.
25 node = node.children[index]
26 if node.is_end_of_word:
27 return True # Return True if the current node marks the end of a word.
28 return False # Return False if the end of the word was not encountered.
29
30
31class StreamChecker:
32 def __init__(self, words: list[str]):
33 # Initialize the StreamChecker with a Trie and a list for incoming characters.
34 self.trie = Trie()
35 self.stream = [] # List to hold streamed letters.
36 self.limit = 200 # Maximum length of a word to consider in the trie.
37 for word in words:
38 self.trie.insert(word)
39
40 def query(self, letter: str) -> bool:
41 # Process a new letter from the stream, and check for matching words.
42 # Append the letter to the stream.
43 self.stream.append(letter)
44 # Return True if the added letter completes any word in the trie.
45 return self.trie.search(self.stream[-self.limit:])
46
47
48# Usage example:
49# stream_checker = StreamChecker(words)
50# result = stream_checker.query(letter)
51
1class Trie {
2 // Initialize trie children for each letter of the alphabet
3 private Trie[] children = new Trie[26];
4 // Flag to indicate the end of a word
5 private boolean isEndOfWord = false;
6
7 // Method to insert a word into the trie
8 public void insert(String word) {
9 // Start with the root of the trie
10 Trie node = this;
11 // Iterate through the word in reverse order to insert it into the trie
12 for (int i = word.length() - 1; i >= 0; --i) {
13 int index = word.charAt(i) - 'a';
14 // If the corresponding child does not exist, create it
15 if (node.children[index] == null) {
16 node.children[index] = new Trie();
17 }
18 // Move to the child node
19 node = node.children[index];
20 }
21 // Mark the end of the word
22 node.isEndOfWord = true;
23 }
24
25 // Method to query the trie with a given string from a stream
26 public boolean query(StringBuilder stringBuilder) {
27 Trie node = this;
28 // Iterate through the stringBuilder in reverse order
29 for (int i = stringBuilder.length() - 1; i >= 0; --i) {
30 int index = stringBuilder.charAt(i) - 'a';
31 // If the corresponding child does not exist, the query fails
32 if (node.children[index] == null) {
33 return false;
34 }
35 // Move to the child node
36 node = node.children[index];
37 // If a word is found, return true
38 if (node.isEndOfWord) {
39 return true;
40 }
41 }
42 // No word found at the end of the query
43 return false;
44 }
45}
46
47class StreamChecker {
48 // StringBuilder to maintain the sequence of queried letters
49 private StringBuilder queryStringBuilder = new StringBuilder();
50 // Trie object to store the words for query
51 private Trie trie = new Trie();
52
53 // Constructor
54 public StreamChecker(String[] words) {
55 // Insert each word into the trie
56 for (String word : words) {
57 trie.insert(word);
58 }
59 }
60
61 // Query a single character and append it to the stream
62 public boolean query(char letter) {
63 // Append the letter to the query string
64 queryStringBuilder.append(letter);
65 // Query the trie with the updated stream
66 return trie.query(queryStringBuilder);
67 }
68}
69
1#include <vector>
2#include <string>
3#include <algorithm>
4
5// Class for Trie Node
6class Trie {
7private:
8 std::vector<Trie*> children; // Children nodes representing each alphabet letter
9 bool isEnd; // Flag indicating whether the node marks the end of a word
10
11public:
12 // Constructor initializes children for each letter of the alphabet and sets isEnd to false
13 Trie() : children(26, nullptr), isEnd(false) {}
14
15 // Inserts a word into the trie in reverse order
16 void insert(const std::string& word) {
17 Trie* node = this; // Start from the root node
18 for (int i = word.length() - 1; i >= 0; --i) {
19 int index = word[i] - 'a';
20 // If the child node does not exist, create it
21 if (!node->children[index]) {
22 node->children[index] = new Trie();
23 }
24 // Move to the child node
25 node = node->children[index];
26 }
27 // Mark the end of the word
28 node->isEnd = true;
29 }
30
31 // Searches for a word in reverse in the trie
32 bool search(const std::string& word) {
33 Trie* node = this;
34 for (int i = word.length() - 1; i >= 0; --i) {
35 int index = word[i] - 'a';
36 // If the child node does not exist, the word is not present
37 if (!node->children[index]) {
38 return false;
39 }
40 // Move to the child node
41 node = node->children[index];
42 // If we find a word ending here, we can return true
43 if (node->isEnd) {
44 return true;
45 }
46 }
47 // If no word ending is encountered during traversal, return false
48 return node->isEnd;
49 }
50};
51
52// Class for checking if a string contains any word as a suffix
53class StreamChecker {
54private:
55 Trie* trie; // Trie node used to store and query words
56 std::string streamSoFar; // String to store the characters queried so far
57
58public:
59 // Constructor that initializes the trie with the given words
60 StreamChecker(const std::vector<std::string>& words) : trie(new Trie()) {
61 for (const std::string& word: words) {
62 trie->insert(word);
63 }
64 }
65
66 // Queries the current character and returns true if any word is found as a suffix
67 bool query(char letter) {
68 // Add the current character to the stream
69 streamSoFar += letter;
70 // Search in the trie for the current stream
71 return trie->search(streamSoFar);
72 }
73
74 // Destructor to clean up the dynamically allocated Trie
75 ~StreamChecker() {
76 delete trie; // Properly delete the Trie to avoid memory leaks
77 }
78};
79
80/**
81 * Usage
82 * ----
83 * StreamChecker streamChecker(words); // Initialize the StreamChecker with a vector of words
84 * bool result = streamChecker.query(letter); // Will return true if the stream results in any suffix that is present in the word list
85 */
86
1// Global variables
2let children: TrieNode[] = [];
3let isEnd: boolean = false;
4let trie: TrieNode | null = null;
5let streamSoFar: string = "";
6
7// Interface for a Trie Node
8interface TrieNode {
9 children: TrieNode[];
10 isEnd: boolean;
11}
12
13// Function to create a trie node
14function createTrieNode(): TrieNode {
15 return {
16 children: new Array(26).fill(null),
17 isEnd: false
18 };
19}
20
21// Function to insert a word into the trie in reverse order
22
23function insert(word: string): void {
24 let node: TrieNode | null = trie;
25 for (let i = word.length - 1; i >= 0; --i) {
26 const index: number = word.charCodeAt(i) - 'a'.charCodeAt(0);
27 if (!node.children[index]) {
28 node.children[index] = createTrieNode();
29 }
30 node = node.children[index];
31 }
32 node.isEnd = true;
33}
34
35// Function to search for a word in reverse in the trie
36
37function search(word: string): boolean {
38 let node: TrieNode | null = trie;
39 for (let i = word.length - 1; i >= 0; --i) {
40 const index: number = word.charCodeAt(i) - 'a'.charCodeAt(0);
41 if (!node.children[index]) {
42 return false;
43 }
44 node = node.children[index];
45 if (node.isEnd) {
46 return true;
47 }
48 }
49 return node.isEnd;
50}
51
52// Function to initialize the trie with the given words
53// This replaces the StreamChecker constructor
54function initializeTrie(words: string[]): void {
55 trie = createTrieNode();
56 for (const word of words) {
57 insert(word);
58 }
59}
60
61// Function to query the current character and return true if any word is found as a suffix
62// This replaces StreamChecker.query
63function query(letter: string): boolean {
64 streamSoFar += letter;
65 return search(streamSoFar);
66}
67
68// Example of usage
69// Here you would first call initializeTrie with the array of words, then you can
70// repeatedly call query with different letters to simulate the stream of letters.
71
Time and Space Complexity
Time Complexity
-
Trie.insert
method:- The insertion of a word into the trie takes
O(n)
time, wheren
is the length of the word. This is because we process each character, inserting new nodes as necessary.
- The insertion of a word into the trie takes
-
Trie.search
method:- The search operation in the trie takes
O(m)
time in the worst case, wherem
is the minimum between the length of the queried stringw
and thelimit
. The search reverses the stringw
and checks each character.
- The search operation in the trie takes
-
StreamChecker.__init__
method:- During the initialization of a
StreamChecker
object, we insert several words into the trie. Ifl
is the average length of the words andk
is the total number of words, the overall time complexity for the initialization isO(k*l)
.
- During the initialization of a
-
StreamChecker.query
method:- In each query, we add a letter to
self.cs
and perform a search. The time complexity of adding a letter isO(1)
. The search operates on a substring of at mostself.limit
characters, so the time complexity for the search part isO(self.limit)
. Therefore, the time complexity for each query isO(self.limit)
.
- In each query, we add a letter to
In summary, the time complexity for insertions is O(k*l)
, the search/query operations run in O(self.limit)
time.
Space Complexity
-
Trie
storage:- The worst-case space complexity of a trie depends on the number of nodes and the size of the alphabet. In this implementation, each node can have up to 26 children. If we have
n
words of average lengthl
, the worst-case space required would beO(26 * n * l)
. However, due to common prefixes, Tries usually use less space than this upper bound.
- The worst-case space complexity of a trie depends on the number of nodes and the size of the alphabet. In this implementation, each node can have up to 26 children. If we have
-
StreamChecker
object:- The StreamChecker maintains a list
self.cs
that holds the characters of the stream queried so far, and it's trimmed to the lastself.limit
elements. Thus, it takesO(self.limit)
space.
- The StreamChecker maintains a list
In summary, the space complexity is dominated by the Trie structure, which is O(26 * n * l)
for storing the nodes, and O(self.limit)
for storing the stream of characters, resulting in a total space complexity of O(26 * n * l + self.limit)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of these properties could exist for a graph but not a tree?
Recommended Readings
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
Median of Data Stream Given a stream of numbers find the median number at any given time accurate to 1 decimal place Example add_number 1 add_number 2 add_number 3 get_median 2 0 add_number 4 get_median 2 5 Try it yourself Explanation Intuition Brute force way is to sort the entire list every time we get a new number This would be O N log N for each add_number operation One step up is to notice that the list is sorted before we add any new number to it So every
Patterns The Shortest Path Algorithm for Coding Interviews The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we can determine the
Got a question? Ask the Monster Assistant anything you don't understand.
Still not clear? Submit the part you don't understand to our editors. Or join our Discord and ask the community.