676. Implement Magic Dictionary

MediumDepth-First SearchDesignTrieHash TableString
Leetcode Link

Problem Description

The problem requires designing a data structure named MagicDictionary that can do two things: first, it should be able to store a list of distinct words. Second, it should provide a function to determine whether we can change exactly one character in a given string so that it matches any of the words stored in the dictionary.

When creating this data structure, we need to perform the following operations:

  • buildDict: This method initializes the dictionary with an array of unique strings.
  • search: This method takes a string searchWord and returns true if changing exactly one character in searchWord results in a word that is in the dictionary. It returns false otherwise.

The challenge lies in determining an efficient way to validate if a word with one character changed is in the dictionary, considering the dictionary can contain any number of words, and the search function might be called multiple times.

Intuition

The solution is to preprocess the dictionary in such a way that searching is efficient. Rather than checking every word in the dictionary during each search, the idea is to generate a pattern for each word by replacing each character in turn with a placeholder (e.g., an asterisk '*'). The patterns are then stored in a data structure with quick access time. By using this pattern-matching strategy:

  1. We create all possible patterns from each word in the dictionary by replacing each character with a '*'. This yields a generalized form of the words.
  2. When searching for a match of searchWord, we generate the same patterns for it.
  3. We then verify if these patterns match any patterns from the dictionary:
    • If a pattern has a count greater than 1, we can be certain that searchWord can match a different word in the dictionary (since we store only unique words).
    • If a pattern has a count of 1, and the searchWord is not already in the set of dictionary words, we know that there is exactly one different character.
  4. If any of the generated patterns of searchWord satisfy the conditions above, we can return true. Otherwise, false.

Utilizing a counter to track the number of occurrences of patterns and a set to keep track of the words ensures a quick look-up and decision during the search operation.

Learn more about Depth-First Search and Trie patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which of the following is the prefix sum of array [1, 2, 3, 4, 5]?

Solution Approach

To implement the MagicDictionary, we use a set and a Counter from Python's collections library. The set, self.s, stores the words in the dictionary to ensure the uniqueness of the elements and to provide quick access for checks. The Counter, self.cnt, tracks the count of all the possible patterns formed by replacing each character in the words with the placeholder asterisk '*'. Here's a breakdown of how each part of the implementation contributes to the solution:

  • The __init__ method initializes the data structures (self.s for the set and self.cnt for the Counter) used to hold the words and patterns.
  • The gen method is a helper function that generates all possible patterns for a given word by replacing each character with ''. For example, for the word "hello", it will produce ["ello", "hllo", "helo", "helo", "hell"].
  • The buildDict method constructs the set with the given dictionary words and uses the gen method to create and count the patterns of each word. By iterating over the words in the dictionary and their generated patterns, we populate self.cnt.
  • In the search method, for a given searchWord, we generate all possible patterns and then loop over them to verify if we can find a match:
    • If self.cnt[p] > 1, it means that there is more than one word in the dictionary that can match the pattern, hence we can have a match with exactly one character different from searchWord.
    • If self.cnt[p] == 1 and the searchWord is not in self.s, the pattern is unique to one word in the dictionary, and since searchWord is not that word, this confirms we can match by changing exactly one character.

In terms of algorithms and patterns, this implementation uses a clever form of pattern matching that simplifies the search space. Instead of comparing the searchWord to every word in the dictionary, it only needs to check the generated patterns, significantly improving efficiency, especially when the search operation is called multiple times.

By using these data structures and algorithms, the MagicDictionary is able to quickly and efficiently determine whether there exists a word in the dictionary that can be formed by changing exactly one character in the searchWord.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement priority queue?

Example Walkthrough

Let's illustrate the solution approach using a simple example. Suppose we want to initialize our MagicDictionary with the words ["hello", "leetcode"]. Here's how the MagicDictionary would be built and utilized:

  1. Initialization: The MagicDictionary is created with an empty set and counter.

  2. Building the Dictionary:

    • The buildDict function is called with ["hello", "leetcode"].
    • The gen helper function generates patterns by replacing each character with an asterisk '*'.
      • For "hello", the generated patterns would be ["*ello", "h*llo", "he*lo", "hel*o", "hell*"].
      • For "leetcode", the generated patterns would be ["*eetcode", "l*etcode", "le*tcode", "lee*code", "leet*ode", "leetc*de", "leetcod*"].
    • These patterns are added to self.cnt counter, and the original words are stored in the set self.s.
  3. Search Operation:

    • Now, let's say we want to search for the word "hullo" using the search function.
    • The generated patterns for "hullo" would be ["*ullo", "h*llo", "hu*lo", "hul*o", "hull*"].
    • Next, the algorithm goes through each of these patterns:
      • It finds that the pattern "h*llo" has a count of 1 in self.cnt, and since "hullo" is not in the set self.s (which contains "hello"), this means there is exactly one character different in a unique word in the dictionary that matches the pattern "h*llo".
    • Since all conditions for a successful match are met, the search function will return true for the word "hullo".
  4. Result:

    • The search operation effectively demonstrates the ability to determine if "hullo" can be transformed into a word in the dictionary by changing exactly one letter. It returns true without having to compare "hullo" against every word in the dictionary, showcasing the efficiency of the pattern matching approach.
Not Sure What to Study? Take the 2-min Quiz:

In a binary min heap, the minimum element can be found in:

Python Solution

1from collections import Counter
2
3class MagicDictionary:
4  
5    def __init__(self):
6        # Initialize attributes for storing words and counts of generic patterns
7        self.words_set = set()
8        self.pattern_count = Counter()
9
10    def generate_patterns(self, word):
11        # Generate all potential patterns by replacing each character with a '*'
12        return [word[:i] + '*' + word[i + 1:] for i in range(len(word))]
13
14    def buildDict(self, dictionary: List[str]) -> None:
15        # Build the dictionary from a list of words and count the occurences of the generated patterns
16        self.words_set = set(dictionary)
17        self.pattern_count = Counter(pattern for word in dictionary for pattern in self.generate_patterns(word))
18
19    def search(self, searchWord: str) -> bool:
20        # Search if there is any word in the dictionary that can be obtained by changing exactly one character of the searchWord
21        for pattern in self.generate_patterns(searchWord):
22            count = self.pattern_count[pattern]
23            # Check if more than one word matches the pattern 
24            # or if one word matches the pattern but it's not the searchWord itself
25            if count > 1 or (count == 1 and searchWord not in self.words_set):
26                return True
27        return False
28
29
30# Example of using the MagicDictionary class
31# obj = MagicDictionary()
32# obj.buildDict(["hello", "leetcode"])
33# param_2 = obj.search("hhllo")  # Should return True, as hello is in the dictionary after changing one character
34# param_3 = obj.search("hello")  # Should return False, as the word is in the dictionary and does not require a change
35

Java Solution

1import java.util.ArrayList;
2import java.util.HashMap;
3import java.util.HashSet;
4import java.util.List;
5import java.util.Map;
6import java.util.Set;
7
8class MagicDictionary {
9    // Using a set to store the actual words to ensure no duplicates
10    private Set<String> wordSet = new HashSet<>();
11    // Using a map to store placeholders for words with a character replaced by '*'
12    private Map<String, Integer> placeholderCountMap = new HashMap<>();
13
14    /** Constructor for MagicDictionary **/
15    public MagicDictionary() {
16        // nothing to initialize
17    }
18
19    /**
20     * Builds a dictionary through a list of words
21     *
22     * @param dictionary An array of words to be added to the MagicDictionary
23     */
24    public void buildDict(String[] dictionary) {
25        for (String word : dictionary) {
26            wordSet.add(word); // add the word to the set
27            // get placeholders for the word and update their count in the map
28            for (String placeholder : generatePlaceholders(word)) {
29                placeholderCountMap.put(placeholder, placeholderCountMap.getOrDefault(placeholder, 0) + 1);
30            }
31        }
32    }
33
34    /**
35     * Search if there is any word in the dictionary that can be obtained by changing
36     * exactly one character of the searchWord
37     *
38     * @param searchWord The word to search for in the dictionary
39     * @return True if such word exists, False otherwise
40     */
41    public boolean search(String searchWord) {
42        // generate placeholders for the search word
43        for (String placeholder : generatePlaceholders(searchWord)) {
44            int count = placeholderCountMap.getOrDefault(placeholder, 0);
45            // if count is more than 1 or the word itself is not in the set and count is 1
46            if (count > 1 || (count == 1 && !wordSet.contains(searchWord))) {
47                return true;
48            }
49        }
50        return false;
51    }
52
53    /**
54     * Generate all possible placeholders for a word by replacing each
55     * character with '*'
56     *
57     * @param word The word to generate placeholders for
58     * @return A list of placeholders
59     */
60    private List<String> generatePlaceholders(String word) {
61        List<String> placeholders = new ArrayList<>();
62        char[] chars = word.toCharArray(); // convert word to char array for manipulation
63        for (int i = 0; i < chars.length; ++i) {
64            char originalChar = chars[i];
65            chars[i] = '*'; // replace the i-th character with '*'
66            placeholders.add(new String(chars)); // add to the list of placeholders
67            chars[i] = originalChar; // revert the change to original character
68        }
69        return placeholders;
70    }
71}
72
73// Sample usage is shown in the comment below:
74/*
75MagicDictionary obj = new MagicDictionary();
76obj.buildDict(dictionary);
77boolean param_2 = obj.search(searchWord);
78*/
79

C++ Solution

1#include <vector>
2#include <string>
3#include <unordered_set>
4#include <unordered_map>
5using namespace std;
6
7class MagicDictionary {
8public:
9    /** Initialize your data structure here. */
10    MagicDictionary() {
11        // Constructor is currently empty; no initialization is needed.
12    }
13
14    /** Builds the dictionary from a list of words. */
15    void buildDict(vector<string> dictionary) {
16        for (const string& word : dictionary) {
17            // Insert the word into the set.
18            wordsSet.insert(word);
19
20            // For each possible generic representation with one letter replaced by a '*', update its count.
21            for (const string& genericWord : generateGenericWords(word)) {
22                genericWordCounts[genericWord]++;
23            }
24        }
25    }
26
27    /** Searches if there is any word in the dictionary that can be matched to searchWord after modifying exactly one character. */
28    bool search(string searchWord) {
29        for (const string& genericWord : generateGenericWords(searchWord)) {
30            if (genericWordCounts[genericWord] > 1 || 
31                (genericWordCounts[genericWord] == 1 && wordsSet.count(searchWord) == 0)) {
32                return true;
33            }
34        }
35        return false;
36    }
37
38private:
39    unordered_set<string> wordsSet; // Set to store words.
40    unordered_map<string, int> genericWordCounts; // Map to store the counts of all possible generic representations.
41
42    /** Generates all possible generic words by replacing each letter of the word with a '*' one at a time. */
43    vector<string> generateGenericWords(string word) {
44        vector<string> genericWords;
45        for (int i = 0; i < word.size(); ++i) {
46            char originalChar = word[i];
47            word[i] = '*';
48            genericWords.push_back(word);
49            word[i] = originalChar;
50        }
51        return genericWords;
52    }
53};
54
55// The MagicDictionary class can be used as follows:
56// MagicDictionary* obj = new MagicDictionary();
57// obj->buildDict(dictionary);
58// bool result = obj->search(searchWord);
59

Typescript Solution

1// Imports skipped since global definitions don't require import statements
2
3// Global variables to store words and counts of their generic representations
4const wordsSet: Set<string> = new Set();
5const genericWordCounts: Map<string, number> = new Map();
6
7/** Builds the dictionary from a list of words. */
8function buildDict(dictionary: string[]): void {
9    dictionary.forEach(word => {
10        // Insert the word into the set
11        wordsSet.add(word);
12
13        // For each possible generic representation with one letter replaced by '*', update its count
14        const genericWords = generateGenericWords(word);
15        genericWords.forEach(genericWord => {
16            const count = genericWordCounts.get(genericWord) || 0;
17            genericWordCounts.set(genericWord, count + 1);
18        });
19    });
20}
21
22/** Searches if there is any word in the dictionary that can be matched to searchWord after modifying exactly one character. */
23function search(searchWord: string): boolean {
24    const genericWords = generateGenericWords(searchWord);
25
26    for (const genericWord of genericWords) {
27        const count = genericWordCounts.get(genericWord) || 0;
28        if (count > 1 || (count === 1 && !wordsSet.has(searchWord))) {
29            return true;
30        }
31    }
32
33    return false;
34}
35
36/** Generates all possible generic words by replacing each letter of the word with '*' one at a time. */
37function generateGenericWords(word: string): string[] {
38    const genericWords: string[] = [];
39
40    for (let i = 0; i < word.length; i++) {
41        const originalChar = word[i];
42        // Replace one letter with '*'
43        word = word.substring(0, i) + '*' + word.substring(i + 1);
44        genericWords.push(word);
45
46        // Restore the original letter
47        word = word.substring(0, i) + originalChar + word.substring(i + 1);
48    }
49
50    return genericWords;
51}
52
53// Example usage:
54// buildDict(["hello", "leetcode"]);
55// const result: boolean = search("hhllo"); // Should return true
56
Fast Track Your Learning with Our Quick Skills Quiz:

Which algorithm should you use to find a node that is close to the root of the tree?

Time and Space Complexity

Time Complexity

  • __init__: O(1) since no operation is performed in the initialization process.
  • gen: The time complexity is O(N) where N is the length of the word since we generate a new string for each character by omitting that character.
  • buildDict:
    • The time complexity of constructing the set s is O(M * K) where M is the number of words in the dictionary and K is the average length of the words since we have to copy each word into the set.
    • The time complexity of creating the counter is O(M * N * K) where N is the average length of the words in the dictionary. For each word, we create N different patterns (by the gen function) and for each pattern we incur the cost of insertion into cnt. Note that insertion into a counter (which is basically a hash map) is typically O(1), so the main cost is in generating the patterns and iterating over the words.
  • search:
    • The search function involves generating patterns from the searchWord whose time complexity is O(N), where N is the length of searchWord.
    • The time complexity of searching in the counter is potentially O(1) for each pattern. Since there are N patterns, the total complexity in the worst-case for searching is O(N).
    • The searchWord not in self.s check also has a time complexity of O(1) on average due to the set's underlying hash table.
    • Combining these, the total time complexity for search is O(N).

In summary, if N is the maximum length of a word, M is the number of words in the dictionary, and searchWord is the word to search for with length N, then the time complexity for buildDict is O(M * N * K) and for search is O(N).

Space Complexity

  • self.s: Takes O(M * K) space to store each word in the dictionary.
  • self.cnt: The counter could at most hold O(M * N) unique patterns (if every generated pattern from every word is unique).
  • For gen function: Since it creates a list of strings, in the worst case, it could take O(N^2) space to store these strings if we consider the space used by all intermediate strings. However, in most practical scenarios where strings are immutable and the language runtime optimizes string storage, the actual additional space complexity incurred by gen is closer to O(N).

Overall, the space complexity is O(M * N + M * K) which is attributable to the size of the words in the dictionary and the patterns generated.

Note: The actual space used by some data structures like hash maps can be larger than the number of elements due to the need for a low load factor to maintain their performance characteristics. The complexities mentioned do not account for these implementation details.

Learn more about how to find time and space complexity quickly.


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫