208. Implement Trie (Prefix Tree)

MediumDesignTrieHash TableString
Leetcode Link

Problem Description

The problem requires implementing a Trie, also known as a prefix tree, which is a special type of tree used to store associative data structures. A trie stores keys in a way that makes it fast and efficient to retrieve them by their prefix. It can be used for various applications, notably for the efficient retrieval and storage of strings within a set for tasks like autocomplete or spell checking.

The task is to create a Trie class with the following operations:

  • [Trie](/problems/trie_intro)(): A method to initialize the trie object.
  • void insert(String word): This method inserts the string word into the trie.
  • boolean search(String word): This method checks if the string word exists in the trie (i.e., it has been inserted before), returning true if it exists and false otherwise.
  • boolean startsWith(String prefix): This method checks if there is any word in the trie that starts with the given prefix, returning true if such a word exists and false otherwise.

Intuition

The intuition for building a trie starts with the basic necessity of storing strings in a way that common prefixes are shared to save space and make search operations efficient. A Trie is a rooted tree where each node represents a common prefix of some strings. Each node might have up to 26 children (in the case of lowercase alphabetic characters) and a boolean flag indicating whether a word ends at this node.

Here's how we arrive at the solution:

  1. Trie Structure: The Trie class is visualized as a tree where each node contains an array of 26 elements representing the possible lowercase alphanumeric children ('a' to 'z'), and a boolean marker that signifies the end of a word (is_end).

  2. Insertion: To insert a word, we iterate through each character of the word, map it to the appropriate child index (by subtracting the ASCII value of 'a' from the character's ASCII value to keep it within a 0-25 range), and move to the corresponding child node, creating a new node if it doesn't exist. When the end of the word is reached, we mark is_end as true at the final node.

  3. Search: Searching for a whole word is similar to insertion, but instead of creating new nodes, we just traverse the nodes corresponding to each character. If we reach the end of the word, we check the is_end flag to see if the word is a complete and distinct entry in our trie.

  4. Prefix Search: To search for a prefix, the process is almost identical to the word search, but we don't need to check the is_end flag; any reachable node after traversing the prefix indicates that a word with that prefix exists in the trie.

The auxiliary method _search_prefix encapsulates the common logic of traversing the trie that is shared between searching for a complete word and checking for a word that starts with a given prefix.

Using such a structure allows operations like insert, search and prefix search to be performed in O(m) time complexity, where m is the length of the word or prefix being processed.

Learn more about Trie patterns.

Solution Approach

The implementation of the [Trie](/problems/trie_intro) class is based on the fundamental concepts of the trie data structure, which stores strings in such a way that all the descendants of a node share a common prefix associated with that node, with the root being an empty string. Here is a step-by-step breakdown of the implementation process and the algorithm used:

  1. Initialization (__init__): Our trie is initialized with 26 children (one for each lowercase letter) set to None, indicating that there are no child nodes yet. Additionally, a boolean is_end is initialized to False, signifying that initially, no word ends at this node.

    self.children = [None] * 26
    self.is_end = False
  2. Insert Method (insert): To insert a word, we iterate over each character in the word. We calculate the index for the children array by subtracting the ASCII value of 'a' from the character's ASCII value. If the child at this index is None, we create a new [Trie](/problems/trie_intro) node there. After processing each character, we set the is_end flag at the node corresponding to the last character to True, indicating the end of the word.

    for c in word:
        idx = ord(c) - ord('a')
        if node.children[idx] is None:
            node.children[idx] = [Trie](/problems/trie_intro)()
        node = node.children[idx]
    node.is_end = True
  3. Search Method (search): The search method makes use of an internal method _search_prefix that is common between the search and startsWith methods. This method attempts to traverse the trie according to the given word. If traversal is successful and the final node's is_end is True, it means the word exists in the trie.

    node = self._search_prefix(word)
    return node is not None and node.is_end
  4. Prefix Search Method (startsWith): Similar to the search method, startsWith uses _search_prefix. If we can traverse all the characters of the prefix in the trie, it means a word with that prefix exists, regardless of the is_end value.

    node = self._search_prefix(prefix)
    return node is not None
  5. Searching Prefix (_search_prefix): This is an internal method that traverses the tree according to the given prefix or word. If any character is not found, it immediately returns None. If the traversal is completed, it returns the last node accessed, which represents the final character in the prefix or word.

    for c in prefix:
        idx = ord(c) - ord('a')
        if node.children[idx] is None:
            return None
        node = node.children[idx]
    return node

By following this approach, the operations of inserting a word or searching for a word/prefix are done in O(m) time, where m represents the length of the word or prefix. This effective utilization of a trie allows for fast insertions and searches while optimizing space as common prefixes are shared among entries.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example using the solution approach outlined for the implementation of a Trie:

Suppose we want to execute the following operations:

  1. Initialize the Trie.
  2. Insert the word "apple".
  3. Search for the word "apple".
  4. Search for the word "app" to check if it exists as a complete word.
  5. Check if a prefix "app" exists.

Let's begin with step-by-step operations on the Trie:

  1. Initialization:

    We create a Trie object.

    trie = Trie()
    • At this point, the trie is empty, with only the root node, and no words are added.
  2. Inserting the word "apple":

    We call insert("apple") on our trie.

    trie.insert("apple")
    • We iterate through each character in "apple".
      • For 'a', calculate the index (0) and go to the child node at index 0. As it is None, create a new Trie node there.
      • Repeat the process for 'p', 'p', 'l', and 'e'.
    • At the end of insertion, mark the is_end as True at the node representing 'e'.
  3. Searching for the word "apple":

    We call search("apple").

    trie.search("apple") # Should return True
    • Internally _search_prefix method is called that goes through each character in "apple".
    • Since we just inserted "apple", each of these nodes exist, and the final node's is_end is True.
    • Therefore, it returns True as "apple" exists in the trie.
  4. Searching for the word "app":

    We call search("app").

    trie.search("app") # Should return False
    • _search_prefix method is used to traverse for "app".
    • While 'a' and 'p' exist, the is_end at the second 'p' is False because we inserted "apple" and not "app".
    • Hence, it returns False, indicating "app" is not registered as a distinct word in the trie.
  5. Checking if prefix "app" exists:

    We call startsWith("app").

    trie.startsWith("app") # Should return True
    • _search_prefix method is used to traverse the trie for "app".
    • Since nodes for 'a' and 'p' exist, and we do not care about is_end when checking for a prefix,
    • It returns True as there is a word in the trie that starts with "app" (which is "apple").

Through this example, you can see how each of the operations work and how the methods _insert, _search, and _search_prefix interact with each other to perform trie operations efficiently.

Solution Implementation

1class Trie:
2    def __init__(self):
3        # Initialize the root of the trie with 26 children for each letter of the alphabet.
4        self.children = [None] * 26
5        # A boolean flag to check if the node represents the end of a word.
6        self.is_end_of_word = False
7
8    def insert(self, word: str) -> None:
9        """Inserts a word into the trie."""
10        node = self
11        for char in word:
12            index = ord(char) - ord('a')  # Map character to trie node index.
13            if node.children[index] is None:
14                node.children[index] = Trie()  # Create a new node if none exists.
15            node = node.children[index]
16        node.is_end_of_word = True  # Mark the end of the word.
17
18    def search(self, word: str) -> bool:
19        """Searches for a word in the trie."""
20        node = self._search_prefix(word)
21        # Word is found if we found a node where the prefix ends and is_end_of_word is True.
22        return node is not None and node.is_end_of_word
23
24    def startsWith(self, prefix: str) -> bool:
25        """Returns if there is any word in the trie that starts with the given prefix."""
26        node = self._search_prefix(prefix)
27        # If we found a node, the prefix exists in the trie.
28        return node is not None
29
30    def _search_prefix(self, prefix: str):
31        """Searches for a node in the trie that represents the given prefix."""
32        node = self
33        for char in prefix:
34            index = ord(char) - ord('a')  # Map character to trie node index.
35            if node.children[index] is None:
36                return None  # If a node doesn't exist for a character, return None.
37            node = node.children[index]
38        return node  # Return either the node that represents the prefix or None.
39
40# Example usage:
41# trie = Trie()
42# trie.insert("apple")
43# search_apple = trie.search("apple")  # Returns True
44# search_app = trie.search("app")      # Returns False
45# prefix_app = trie.startsWith("app")  # Returns True
46# trie.insert("app")
47# search_app = trie.search("app")      # Returns True
48
1class Trie {
2    // Each Trie node can have at most 26 children for each letter of the alphabet.
3    private Trie[] children;
4    // isEndOfWord is true if the node represents the end of a word.
5    private boolean isEndOfWord;
6
7    // Constructor initializes the Trie node with an empty array for child nodes.
8    public Trie() {
9        children = new Trie[26];
10        isEndOfWord = false;
11    }
12
13    // Inserts a word into the trie.
14    public void insert(String word) {
15        Trie currentNode = this;
16        for (char letter : word.toCharArray()) {
17            int index = letter - 'a'; // Calculate the index corresponding to the letter.
18            // If the letter doesn't have a corresponding Trie node, create one.
19            if (currentNode.children[index] == null) {
20                currentNode.children[index] = new Trie();
21            }
22            // Move to the next Trie node.
23            currentNode = currentNode.children[index];
24        }
25        // Mark the end of a word.
26        currentNode.isEndOfWord = true;
27    }
28
29    // Searches if the word exists in the trie.
30    public boolean search(String word) {
31        Trie node = searchPrefix(word);
32        // If the node is not null and isEndOfWord is true, the word exists in the trie.
33        return node != null && node.isEndOfWord;
34    }
35
36    // Checks if there is any word in the trie that starts with the given prefix.
37    public boolean startsWith(String prefix) {
38        Trie node = searchPrefix(prefix);
39        // If the node is not null, there is a word that starts with the prefix.
40        return node != null;
41    }
42
43    // Helper method to search for a prefix or a whole word in the trie and return the node where the search ends.
44    private Trie searchPrefix(String s) {
45        Trie currentNode = this;
46        for (char letter : s.toCharArray()) {
47            int index = letter - 'a';
48            if (currentNode.children[index] == null) {
49                // If the next node doesn't exist, return null indicating the prefix doesn't exist.
50                return null;
51            }
52            // Move to the next Trie node.
53            currentNode = currentNode.children[index];
54        }
55        return currentNode; // The node where the search ended, could represent the prefix or the whole word.
56    }
57}
58
1#include <vector>
2#include <string>
3
4class Trie {
5private:
6    std::vector<Trie*> children; // Array representing the alphabet for trie nodes
7    bool isEndOfWord;            // Flag indicating end of word
8
9    // Helper function to traverse the trie up to the end of the given prefix
10    Trie* searchPrefix(const std::string& prefix) {
11        Trie* node = this;
12        for (char c : prefix) {
13            int index = c - 'a'; // Calculate the index of the child
14            if (!node->children[index]) return nullptr; // Prefix not found
15            node = node->children[index]; // Move to next node
16        }
17        return node; // Return pointer to the last node in the prefix
18    }
19
20public:
21    // Constructor initializes children vector containing 26 elements for alphabet letters 
22    // and sets isEndOfWord to false for the root
23    Trie() : children(26, nullptr), isEndOfWord(false) {}
24
25    // Insert a word into the trie data structure
26    void insert(const std::string& word) {
27        Trie* node = this;
28        for (char c : word) {
29            int index = c - 'a'; // Calculate the index of the child
30            // If child doesn't exist, create a new Trie node
31            if (!node->children[index]) node->children[index] = new Trie();
32            node = node->children[index]; // Move to next node
33        }
34        node->isEndOfWord = true; // Mark the end of the word
35    }
36
37    // Search for a word in the trie and return true if the word exists
38    bool search(const std::string& word) {
39        Trie* node = searchPrefix(word);
40        return node != nullptr && node->isEndOfWord; // True if word found and it is an end of a word
41    }
42
43    // Check if there is any word in the trie that starts with the given prefix
44    bool startsWith(const std::string& prefix) {
45        Trie* node = searchPrefix(prefix);
46        return node != nullptr; // True if prefix found
47    }
48};
49
50/**
51 * Usage:
52 * Trie* trie = new Trie();
53 * trie->insert("word");
54 * bool isFound = trie->search("word");
55 * bool startsWithPrefix = trie->startsWith("prefix");
56 */
57
1// Global trie node structure
2const trieNodes: TrieNode[] = [];
3const isEndOfWord: boolean[] = [];
4let rootIndex: number;
5
6interface TrieNode {
7    childrenIndices: number[];
8}
9
10// Initialize a Trie
11function initializeTrie() {
12    rootIndex = createNode();
13}
14
15// Create a trie node and return its index
16function createNode(): number {
17    const nodeIndex = trieNodes.length;
18    trieNodes.push({
19        childrenIndices: new Array(26).fill(-1)  // Initialize all children indices to -1
20    });
21    isEndOfWord.push(false);  // By default, a node is not the end of a word
22    return nodeIndex;
23}
24
25// Insert a word into the trie
26function insert(word: string): void {
27    let nodeIndex = rootIndex;
28    for (let char of word) {
29        let charIndex = char.charCodeAt(0) - 97; // Convert char to an index (0-25)
30        if (trieNodes[nodeIndex].childrenIndices[charIndex] === -1) {
31            trieNodes[nodeIndex].childrenIndices[charIndex] = createNode(); // Create the child node if it doesn't exist
32        }
33        nodeIndex = trieNodes[nodeIndex].childrenIndices[charIndex];
34    }
35    isEndOfWord[nodeIndex] = true; // Mark the end of the word
36}
37
38// Search for a word in the trie
39function search(word: string): boolean {
40    const nodeIndex = searchPrefix(word);
41    return nodeIndex !== undefined && isEndOfWord[nodeIndex];
42}
43
44// Search for a prefix in the trie
45function startsWith(prefix: string): boolean {
46    return searchPrefix(prefix) !== undefined;
47}
48
49// Helper method to search for a prefix and return the index of the last node of the prefix
50function searchPrefix(prefix: string): number | undefined {
51    let nodeIndex = rootIndex;
52    for (let char of prefix) {
53        let charIndex = char.charCodeAt(0) - 97; // Convert char to an index (0-25)
54        if (trieNodes[nodeIndex].childrenIndices[charIndex] === -1) return undefined; // If the child doesn't exist, prefix not found
55        nodeIndex = trieNodes[nodeIndex].childrenIndices[charIndex];
56    }
57    return nodeIndex;
58}
59
60// Initialization call to set up the trie
61initializeTrie();
62

Time and Space Complexity

Time Complexity

  • insert method: The time complexity is O(n), where n is the length of the word being inserted. This is because we iterate through each character of the word and perform constant time operations for each one.

  • search method: The worst-case time complexity is also O(n), assuming that n is the length of the word being searched. This is similar to insert, as the method traverses the Trie, which takes time proportional to the length of the word.

  • startsWith method: The worst-case time complexity is O(m), where m is the length of the prefix. This is due to traversing the Trie up to the depth of the prefix length.

  • _search_prefix method (used by both search and startsWith): The time complexity here is O(p), where p is the length of the prefix. It is traversed only once for each method call.

Space Complexity

  • The overall space complexity of the Trie is O(k * l), where k is the maximum number of children (26 in this case, for each letter of the English alphabet) and l is the average length of the word. This is due to storing a number of nodes proportional to the length of the word times the number of children each node can have (which is 26 here, for lowercase English letters).

  • Each node contains an array of 26 pointers and a boolean flag, and the total space used grows with the number of nodes created, which is based on the number of words inserted and their lengths.

  • Note that the space complexity does not scale with the number of insert/search operations, but with the number of unique characters in the Trie. This makes Tries space-efficient when storing large sets of words with overlapping prefixes.

Learn more about how to find time and space complexity quickly using problem constraints.


Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:
Question 1 out of 10

A heap is a ...?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!