211. Design Add and Search Words Data Structure


Problem Description

The problem is to design a data structure named WordDictionary that allows for the addition of new words and the checking if a specific string matches any word that was previously added to the structure. This data structure should support two main operations:

  1. addWord(word): This operation should take a string word consisting of lowercase English letters and add it to the data structure to be available for future searches.

  2. search(word): This operation should accept a string word that may contain lowercase English letters and dots (.). Each dot can match any letter. The function should return true if the string matches any word in the data structure, and false otherwise. Importantly, the word provided in the search function can have at most 2 dots.

The problem provides an example to illustrate how the data structure should work once the operations are implemented. When some words are added, and various searches are performed, we can observe the behavior. For instance, a search for "pad" returns false since it is not present in the dictionary, while a search for "bad" returns true. When there are dots in the search string such as in ".ad" or "b..", these could represent any letter, and since there are words in the dictionary that satisfy this pattern, the searches return true.

The constraints stipulate that any word added will have a length between 1 and 25, there will be at most 2 dots in each search query, and there will be no more than 10^4 total calls to both addWord and search.

Intuition

To solve this problem, a data structure known as a Trie (prefix tree) is used. A Trie is an efficient information retrieval data structure that allows for the storage of a dynamic set or associative array where the keys are usually strings. In the context of this problem:

  1. Adding words: Each node in the Trie represents a character from a given word. When adding a new word to the Trie, we go through each character in the word, and for each character, we check if there's already an existing node (child) representing it. If there isn't, we create a new node and continue traversing. Once we reach the end of the word, we mark the current node as the end of a word to differentiate complete words from prefixes.

  2. Searching for words: To search for a word in the Trie, we traverse it starting from the root, following the links corresponding to each character of the word we're searching for. If we encounter a dot, we must check all possible sub-paths (since the dot can represent any character). We do this by recursively searching for the rest of the word in all non-null child nodes. If at any point there's no node corresponding to the current character (and it's not a dot), we can return false. If we successfully reach the end of the word and the last node is marked as end of a word, we've found a match and can return true.

The main advantage of using a Trie for this particular problem is its ability to search for words efficiently, even with wildcards (represented by dots). As the search function may need to handle up to 2 wildcards, using a depth-first search approach with backtracking allows for checking all possible combinations that these wildcards could represent.

Learn more about Depth-First Search and Trie patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which of the following uses divide and conquer strategy?

Solution Approach

The solution uses the Trie data structure to implement the WordDictionary class with addWord and search functions as follows:

  1. Trie Data Structure: A Trie node is represented by a class Trie which contains an array children of size 26 (representing 26 lowercase English letters) and a boolean is_end. Each element in children can either hold another Trie node or be None.

  2. Initializing WordDictionary: The WordDictionary is initiated with a root [Trie](/problems/trie_intro) node. This root is the starting point for all operations performed on the data structure.

  3. addWord Method:

    • The word to be added is processed character by character.
    • Each character's index is determined by ord(c) - ord('a').
    • If a [Trie](/problems/trie_intro) node for a character does not exist in the current node's children, a new Trie node is created at that index.
    • The node pointer is then updated to the newly created node or the existing child node.
    • After inserting all characters, the is_end flag of the final node is set to True.
  4. search Method:

    • The search function uses a recursive helper function that takes the word and the current node as arguments.
    • The recursion proceeds character by character through the word.
    • If a character is not a dot (.), then it finds the corresponding index and checks if there is a corresponding child [Trie](/problems/trie_intro) node. If not, it returns False.
    • If a character is a dot (.), the function checks all non-null children nodes. It calls itself recursively with the remaining substring (the part after the dot). If any of these recursive calls return True, then True is returned.
    • The base case occurs when we reach the end of the word. If the current node has its is_end set to True, it means we have found a complete match for the word, and True is returned.

This implementation efficiently handles both the addition of words and the searching of words, even with wildcards, by utilizing the Trie's structure. The recursion during search allows us to handle the wildcard character by trying all possibilities.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which data structure is used in a depth first search?

Example Walkthrough

Let's illustrate the solution approach with an example:

Suppose we initialize an empty WordDictionary and add the following words:

  1. addWord("bad")
  2. addWord("dad")
  3. addWord("mad")

Now, let's step through the search queries one by one.

Search Query 1: search("pad")

  • We start at the root of the Trie and look for the first character 'p'.
  • There is no child node corresponding to 'p' since we only added words beginning with 'b' and 'd'.
  • The search immediately returns false since the path for "pad" doesn't exist in our Trie.

Search Query 2: search("bad")

  • We check each character in the word "bad" sequentially.
  • For 'b', 'a', and 'd', there are corresponding child nodes in the Trie, as the word "bad" was added.
  • We arrive at a node that is marked as the end of a word.
  • The search returns true.

Search Query 3: search(".ad")

  • The first character is a dot ., which means we need to check all possible child nodes at this level, corresponding to 'b', 'd', and any other letters we might have.
  • For each valid non-null child node, we recursively continue the search for "ad".
  • We find that both "bad" and "dad" match the pattern ".ad", so the search returns true.

Search Query 4: search("b..")

  • We match the first character 'b' to the corresponding child node.
  • For the subsequent dots, we must explore all possibilities for each dot.
  • We will have to check all children of the 'b' node, then all children of those nodes.
  • For the word "bad" in the Trie, 'b' matches, and we have any character in the next two positions, so "b.." matches "bad," and the search returns true.

Search Query 5: search("b.d")

  • We match 'b' and then encounter the dot ., which again means any character.
  • At this point, there is only one child node for the 'b' with the 'a', so we go down that path.
  • Finally, we look for 'd', which successfully matches, and since we are at the end of the word and the end of word flag is true, we return true.

These examples walk through how the WordDictionary would handle word addition and search operations using the Trie data structure and its nodes to represent the letters and word endings.

Solution Implementation

1class TrieNode:
2    """ A node in the trie structure """
3    def __init__(self):
4        self.children = [None] * 26  # Array to hold 26 lowercase alphabet characters
5        self.is_end_of_word = False   # Flag to check the end of a word
6
7
8class WordDictionary:
9    """ Class to implement a word dictionary using a trie """
10
11    def __init__(self):
12        """ Initializes the word dictionary with a root trie node """
13        self.root = TrieNode()
14
15    def addWord(self, word: str) -> None:
16        """ Adds a word into the dictionary """
17        node = self.root
18        for char in word:
19            index = ord(char) - ord('a')  # Calculate the index for the relevant character
20            # If the current character path does not exist, create a new node
21            if node.children[index] is None:
22                node.children[index] = TrieNode()
23            node = node.children[index]
24        node.is_end_of_word = True  # Mark the end of a word
25
26    def search(self, word: str) -> bool:
27        """ Searches the dictionary for a word, can contain "." as a wildcard character """
28        def dfs(index, node):
29            """
30            Depth-First Search helper function to recursively search
31            for the word in the trie
32            """
33            for i in range(index, len(word)):
34                char = word[i]
35                char_index = ord(char) - ord('a')
36                # If the current character is a wildcard
37                if char == '.':
38                    # Check all possible child paths
39                    return any(child is not None and dfs(i + 1, child) for child in node.children)
40                elif node.children[char_index] is None:
41                    # Character path doesn't exist
42                    return False
43                # Move on to the next trie node
44                node = node.children[char_index]
45            return node.is_end_of_word  # Check if it's the end of a word
46
47        return dfs(0, self.root)  # Start the depth-first search from the root
48
49
50# Usage:
51# obj = WordDictionary()
52# obj.addWord("word")
53# param_2 = obj.search("word")
54# param_3 = obj.search("wor.")
55# param_4 = obj.search("wor..")
56
1class TrieNode {
2    // Each TrieNode contains an array of children TrieNodes to represent each letter of the alphabet
3    TrieNode[] children = new TrieNode[26];
4    // Indicates if a word ends at this node
5    boolean isEndOfWord;
6}
7
8class WordDictionary {
9    private TrieNode root;
10
11    /** Initialize your data structure here. */
12    public WordDictionary() {
13        root = new TrieNode();
14    }
15
16    // Adds a word into the data structure
17    public void addWord(String word) {
18        TrieNode node = root;
19        for (char c : word.toCharArray()) {
20            int index = c - 'a';
21            if (node.children[index] == null) {
22                // If there is no TrieNode for this letter, create a new TrieNode
23                node.children[index] = new TrieNode();
24            }
25            // Move to the next node
26            node = node.children[index];
27        }
28        // Mark this node as the end of a word
29        node.isEndOfWord = true;
30    }
31
32    // Searches for a word in the data structure and can handle '.' as a wildcard character
33    public boolean search(String word) {
34        return searchInNode(word, root);
35    }
36
37    private boolean searchInNode(String word, TrieNode node) {
38        for (int i = 0; i < word.length(); ++i) {
39            char c = word.charAt(i);
40            if (c == '.') {
41                // If it's a wildcard, we need to check all possible paths
42                for (TrieNode child : node.children) {
43                    if (child != null && searchInNode(word.substring(i + 1), child)) {
44                        return true;
45                    }
46                }
47                return false; // If no paths find a match, return false
48            } else {
49                int index = c - 'a';
50                // If the specific child TrieNode does not exist, the word does not exist
51                if (node.children[index] == null) {
52                    return false;
53                }
54                // Move to the next node
55                node = node.children[index];
56            }
57        }
58        // After processing all characters, check if it is the end of a word
59        return node.isEndOfWord;
60    }
61}
62
63/**
64 * Your WordDictionary object will be instantiated and called as such:
65 * WordDictionary obj = new WordDictionary();
66 * obj.addWord(word);
67 * boolean param_2 = obj.search(word);
68 */
69
1class TrieNode {
2public:
3    vector<TrieNode*> children; // Children nodes for each character
4    bool isWordEnd; // Flag to check if a node represents the end of a word
5
6    TrieNode() : children(26, nullptr), isWordEnd(false) {
7        // Constructor initializes children pointers to nullptrs and isWordEnd to false
8    }
9
10    void insert(const string& word) {
11        TrieNode* current = this;
12        for (char ch : word) {
13            int index = ch - 'a'; // Calculate the index of the character 'a' to 'z'
14            if (current->children[index] == nullptr) {
15                // If there is no TrieNode at index, create a new one
16                current->children[index] = new TrieNode;
17            }
18            current = current->children[index]; // Move to the next node
19        }
20        current->isWordEnd = true; // Mark the end of the word
21    }
22};
23
24class WordDictionary {
25private:
26    TrieNode* root; // Root node of the Trie
27
28public:
29    WordDictionary() : root(new TrieNode()) {
30        // Constructor initializes the root node
31    }
32
33    void addWord(string word) {
34        // Public method to add a word to the Trie
35        root->insert(word);
36    }
37
38    bool search(string word) {
39        // Public method to search for a word in the Trie
40        return dfsSearch(word, 0, root);
41    }
42
43private:
44    bool dfsSearch(const string& word, int index, TrieNode* node) {
45        if (index == word.size()) {
46            // If end of word is reached, return true if current node isWordEnd
47            return node->isWordEnd;
48        }
49
50        char ch = word[index]; // Get the character at the current index
51        if (ch != '.') {
52            // If the character is not a wildcard
53            TrieNode* child = node->children[ch - 'a']; // Get the corresponding child node
54            if (child && dfsSearch(word, index + 1, child)) {
55                // Recurse for the next index if child is not null
56                return true;
57            }
58        } else {
59            // If the character is a wildcard, check all possible nodes
60            for (TrieNode* child : node->children) {
61                if (child && dfsSearch(word, index + 1, child)) {
62                    // Recurse with each existing child node and move to the next index
63                    return true;
64                }
65            }
66        }
67        return false; // Return false if the word does not match or is not found
68    }
69};
70
71/**
72 * Use of WordDictionary class:
73 * WordDictionary* obj = new WordDictionary();
74 * obj->addWord(word);
75 * bool param_2 = obj->search(word);
76 */
77
1// Define a TrieNode type with children array and isWordEnd flag
2type TrieNode = {
3    children: (TrieNode | null)[];
4    isWordEnd: boolean;
5};
6
7// Function to create a new TrieNode with initialized values
8function createTrieNode(): TrieNode {
9    return {
10        children: new Array(26).fill(null),
11        isWordEnd: false
12    };
13}
14
15// Global root node of the Trie
16const root: TrieNode = createTrieNode();
17
18// Function to insert a word into the Trie
19function insert(word: string): void {
20    let current = root;
21    for (const ch of word) {
22        const index = ch.charCodeAt(0) - 'a'.charCodeAt(0);
23        if (current.children[index] === null) {
24            current.children[index] = createTrieNode();
25        }
26        current = current.children[index] as TrieNode;
27    }
28    current.isWordEnd = true;
29}
30
31// Function to add a word to the Trie
32function addWord(word: string): void {
33    insert(word);
34}
35
36// Depth-first search helper function for wildcard search
37function dfsSearch(word: string, index: number, node: TrieNode): boolean {
38    if (index === word.length) {
39        return node.isWordEnd;
40    }
41
42    const ch = word[index];
43    if (ch !== '.') {
44        const child = node.children[ch.charCodeAt(0) - 'a'.charCodeAt(0)];
45        if (child && dfsSearch(word, index + 1, child)) {
46            return true;
47        }
48    } else {
49        for (const child of node.children) {
50            if (child && dfsSearch(word, index + 1, child)) {
51                return true;
52            }
53        }
54    }
55    return false;
56}
57
58// Function to search for a word in the Trie, supporting '.' as a wildcard character
59function search(word: string): boolean {
60    return dfsSearch(word, 0, root);
61}
62
63// Usage:
64// addWord('example');
65// const found = search('example'); // returns true
66
Not Sure What to Study? Take the 2-min Quiz:

Which of the following problems can be solved with backtracking (select multiple)

Time and Space Complexity

Time Complexity

addWord Method

  • For each word inserted, the addWord method goes through each character of the word. Let n be the length of the word.
  • For each character, it may create a new Trie node if none exists, which requires constant time, and then moves to the next Trie node.
  • Therefore, the time complexity for addWord is O(n) for adding a word of length n.

search Method

  • The search method is a bit more complex than addWord due to the possible presence of the wildcard character '.'.
  • In the worst-case scenario, every character of the word being searched is a '.'. Let n be the number of characters in the word, and m be the number of possible characters (in this case, 26).
  • If the word contains only dots or if most of the characters are dots, this could result in a branching factor of up to 26 (the size of the alphabet), leading to a worst-case time complexity of O(m^n) (where n is the length of the word).
  • For words without dots, the time complexity is O(n) just as in addWord, as it simply follows the chain of nodes corresponding to the characters in the word.

Space Complexity

Trie Storage

  • The space complexity is dominated by the space used to store the Trie nodes.
  • A Trie node's children is an array of pointers to 26 possible next nodes (corresponding to the 26 letters of the alphabet), plus a boolean to indicate the end of a word.
  • The worst-case space complexity for storing all inserted words is O(m * l) where m is the size of the alphabet (26 in this case), and l is the total number of letters in all words inserted into the WordDictionary.
  • However, since the same node can be shared among prefixes of different words, the actual space usage can be less than this worst-case scenario.

Recursive search Calls

  • For the search method, because recursion is used, there is also space complexity involved with the call stack.
  • In the worst case, when searching for a word that consists solely of dots, and with a completely filled Trie, the recursion could branch 26 times for each dot. Given n dots, the call stack size could be O(m^n) in the worst case.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

What's the relationship between a tree and a graph?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫