966. Vowel Spellchecker
Problem Description
In this problem, we are tasked with creating a spellchecker system that takes a list of valid words, called wordlist
, and a list of query words. The goal is to correct each query word based on specific spelling mistake categories and rules. There are two categories of spelling mistakes the system needs to handle:
-
Capitalization: If the query word differs from valid words only by capital letters, it should be treated as a correct match. The correction should return the word from
wordlist
in the same case as it appears there.- For instance, if
wordlist = ["apple", "Banana"]
, queries like"APPLE"
and"banana"
should be corrected to"apple"
and"Banana"
respectively.
- For instance, if
-
Vowel Errors: If the query word differs from valid words only by the vowels ('a', 'e', 'i', 'o', 'u'), it should be treated as a correct match. The vowels in the query word can be replaced with any other vowel to find a match.
- For example,
wordlist = ["apple", "Banana"]
, and the query"appla"
could be corrected to"apple"
because changing 'a' to 'e' inquery
makes it matchwordlist
.
- For example,
It's important to note that the spellchecker should follow a specific order of precedence while correcting:
- An exact case-sensitive match should be returned as is.
- If there is a case-insensitive match, the first such match in the
wordlist
should be returned. - If there is a match allowing for vowel errors (case-insensitive), the first such match should be returned.
- If no match is found, return an empty string.
Given some queries
, we return a list of words answer
, where answer[i]
is the correct word for query = queries[i]
.
Intuition
To approach this problem efficiently, we need a way to check for each type of correction quickly. We'll preprocess the wordlist
to handle different cases and vowel variations, and then we'll use these preprocessed structures to correct each query
.
Here's how we break down the solution approach:
-
Exact Matches: We'll first create a set of all words from the
wordlist
to check for exact case-sensitive matches quickly. This allows us to find exact matches in O(1) time. -
Case-Insensitive Matches: We'll create a dictionary to store the words in lowercase form. If a word in
wordlist
happens to be in different cases, we only need to store the first occurrence as per our rule. The first occurrence can be heuristically considered the most common or correct form. -
Vowel Errors: Handling vowel errors involves creating a pattern where all vowels are replaced with a placeholder (like an asterisk ''). For example, "apple" and "epple" would both match the pattern "ppl". A dictionary is created to store these patterns with corresponding words from
wordlist
. As before, we only store the first occurrence of each pattern.
When checking a query
:
- If the
query
is in the exact match set, it is already correct. - Next, we convert the
query
to lowercase and look it up in the case-insensitive dictionary. If a match is found, we use the word stored there. - If there's no case-insensitive match, we replace vowels in the lowercase
query
with the placeholder and look for this pattern in the vowel error dictionary. - If all else fails, we append an empty string indicating no correction was found.
This solution ensures we can process each query
effectively, leveraging the different lookup structures for each type of mistake, while respecting the precedence rules.
Solution Approach
In the provided solution, several data structures are used to pre-process the wordlist and to provide an efficient lookup for each type of query correction needed:
-
A
set
nameds
is used to store all the words from thewordlist
. This allows for quick O(1) checks for exact matches, which is our highest priority according to the problem's rule set. -
Two dictionaries
low
andpat
are used to prepare for case-insensitive and vowel error matches. Thelow
(short for "lowercase") dictionary maps a lowercase version of each word to the first occurrence of it in the wordlist, ensuring we respect the order of occurrence rule. Likewise, thepat
(short for "pattern") dictionary is for patterns created by replacing vowels with a placeholder. -
The function
f(w)
takes a wordw
, converts it to lowercase, and then replaces each vowel with a"*"
to create a standardized pattern for vowel errors. Using this function we generate the keys for thepat
dictionary which helps us lookup words with a vowel placement mistake. -
The main spellchecker function iterates over each query and applies the following rules in order of precedence:
- First, it checks for an exact match by seeing if the query is in the
s
set. If it is, the query is immediately appended to the answers list. - Second, the query is converted to lowercase and we check the
low
dictionary to find a case-insensitive match. If found, the corresponding word is appended to the answers list. - Third, for vowel errors, we apply the
f
function on the query to create a pattern and then check thepat
dictionary. If found, we append the word from thepat
dictionary that corresponds to the pattern created by the query. - Finally, if none of the above conditions result in a match, an empty string is appended to the answers list, indicating that no correction was found for the query.
- First, it checks for an exact match by seeing if the query is in the
This extended processing for each type of mistake allows the function to handle all cases described in the problem statement efficiently and in a manner that respects the precedence of spell correction rules. Vowel error checks are only made if no case-sensitive or case-insensitive matches are found, thus adhering to the order specified, and ensuring the most accurate spell correction is provided.
Each query is handled in O(1) time lookups thanks to the data structures used, making the entire solution highly efficient even for large lists of words and queries.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small example to illustrate the solution approach described earlier.
Assuming we have a wordlist
as ["Kiwi", "banana", "Apple", "apricot", "Berry"]
and our queries
are ["berry", "BANANA", "apPle", "Apricot", "kwi", "Berri"]
. We want to spellcheck each query and return the correct word based on the rules.
Let's go through each step:
-
Exact Matches: We create a set
s
containing each word from thewordlist
exactly as it is, which will allow us to quickly identify any exact matches.s
:{"Kiwi", "banana", "Apple", "apricot", "Berry"}
-
Case-Insensitive Matches: We create a dictionary
low
where each lowercase word from thewordlist
is a key, and its value is the first occurrence of that word from thewordlist
.low
:{"kiwi": "Kiwi", "banana": "banana", "apple": "Apple", "apricot": "apricot", "berry": "Berry"}
-
Vowel Errors: Next, we create a dictionary
pat
where we replace the vowels in each lowercase word from thewordlist
with a"*"
to generate patterns that help us match words with incorrect vowels.pat
:{"k*w*": "Kiwi", "b*n*n*": "banana", "ppl*": "Apple", "pr*c*t": "apricot", "b*rry": "Berry"}
Now, we move on to checking the queries
.
- For
query = "berry"
, we first look for an exact match in sets
. It's not there, so we check the case-insensitive match inlow
and find"berry"
which returns"Berry"
as the correction. - For
query = "BANANA"
, again, no exact match, but the lowercase"banana"
is inlow
and we get"banana"
as the correction. - For
query = "apPle"
, no exact match, but inlow
, we find"apple"
giving us"Apple"
as the correction. - For
query = "Apricot"
, no exact match. We lowercase the query to"apricot"
, and find it inlow
, returning"apricot"
. - For
query = "kwi"
, no exact match, no case-insensitive match, we create a pattern"kw*"
which doesn't match any pattern inpat
, so we return an empty string. - For
query = "Berri"
, no exact match, no case-insensitive match, we create the pattern"b*rr*"
, no matching pattern is found inpat
, so we return an empty string as well.
Hence, our spellchecker returns the list: ["Berry", "banana", "Apple", "apricot", "", ""]
.
In this example, only when the exact and case-insensitive matches are not found do we consider vowel errors. Each step respects the order of precedence and uses efficient data structures for quick lookups, demonstrating the effectiveness of the solution approach.
Solution Implementation
1class Solution:
2 def spellchecker(self, wordlist, queries):
3 # Helper function to replace vowels in a word with '*'
4 def mask_vowels(word):
5 return ''.join('*' if char.lower() in 'aeiou' else char for char in word)
6
7 # Original set of words for exact match
8 exact_matches = set(wordlist)
9 # Dictionary for case-insensitive matches
10 case_insensitive_matches = {}
11 # Dictionary for vowel-error matches
12 vowel_error_matches = {}
13
14 # Preprocess wordlist to fill the above dictionaries
15 for word in wordlist:
16 lowercase_word = word.lower()
17 masked_word = mask_vowels(lowercase_word)
18
19 # For case-insensitive matches, store the first occurrence
20 if lowercase_word not in case_insensitive_matches:
21 case_insensitive_matches[lowercase_word] = word
22
23 # For vowel-error matches, store the first occurrence
24 if masked_word not in vowel_error_matches:
25 vowel_error_matches[masked_word] = word
26
27 # List for storing spellchecked results
28 spellchecked_queries = []
29
30 # Check each query against exact, case-insensitive, and vowel-error matches
31 for query in queries:
32 # Check for exact match
33 if query in exact_matches:
34 spellchecked_queries.append(query)
35 continue
36
37 # Convert the query to lowercase for case-insensitive and vowel-error comparisons
38 lowercase_query = query.lower()
39 # Check for case-insensitive match
40 if lowercase_query in case_insensitive_matches:
41 spellchecked_queries.append(case_insensitive_matches[lowercase_query])
42 continue
43
44 # Mask the vowels in the query
45 masked_query = mask_vowels(lowercase_query)
46 # Check for vowel-error match
47 if masked_query in vowel_error_matches:
48 spellchecked_queries.append(vowel_error_matches[masked_query])
49 continue
50
51 # No match found
52 spellchecked_queries.append("")
53
54 return spellchecked_queries
55
1class Solution {
2 // This function returns an array of spellcheck results for the given queries,
3 // based on the provided wordlist.
4 public String[] spellChecker(String[] wordList, String[] queries) {
5 // Initialize a set to hold the wordlist for fast look-up
6 Set<String> wordSet = new HashSet<>();
7 // Maps to hold lowercase and vowel-less variations of words
8 Map<String, String> wordLowerCaseMap = new HashMap<>();
9 Map<String, String> wordPatternMap = new HashMap<>();
10
11 // Populate the data structures with the word list
12 for (String word : wordList) {
13 // Add the exact word to the set
14 wordSet.add(word);
15 // Convert to lowercase
16 String wordLowerCase = word.toLowerCase();
17 // If the lowercase word isn't in the map, put it with the original word
18 wordLowerCaseMap.putIfAbsent(wordLowerCase, word);
19 // Create and put the vowel-less pattern if absent
20 wordPatternMap.putIfAbsent(transformPattern(wordLowerCase), word);
21 }
22
23 // Prepare an array to hold the answers
24 int queryCount = queries.length;
25 String[] answers = new String[queryCount];
26
27 // Process each query
28 for (int i = 0; i < queryCount; ++i) {
29 String query = queries[i];
30
31 // Check if the exact word exists in the word set
32 if (wordSet.contains(query)) {
33 answers[i] = query;
34 continue;
35 }
36
37 // Convert to lowercase and check if it exists in the lowercase map
38 query = query.toLowerCase();
39 if (wordLowerCaseMap.containsKey(query)) {
40 answers[i] = wordLowerCaseMap.get(query);
41 continue;
42 }
43
44 // Apply the vowel-less pattern and check if it exists in the pattern map
45 query = transformPattern(query);
46 if (wordPatternMap.containsKey(query)) {
47 answers[i] = wordPatternMap.get(query);
48 continue;
49 }
50
51 // If none of the conditions are met, the answer is an empty string
52 answers[i] = "";
53 }
54
55 return answers;
56 }
57
58 // This function creates a version of the string where all vowels are replaced by an asterisk.
59 private String transformPattern(String word) {
60 char[] characters = word.toCharArray();
61 for (int i = 0; i < characters.length; ++i) {
62 if ("aeiou".indexOf(characters[i]) >= 0) {
63 characters[i] = '*';
64 }
65 }
66 return new String(characters);
67 }
68}
69
1#include <vector>
2#include <string>
3#include <unordered_set>
4#include <unordered_map>
5#include <algorithm>
6
7class Solution {
8public:
9 std::vector<std::string> spellchecker(std::vector<std::string>& wordList, std::vector<std::string>& queries) {
10 // Create a set from the wordList for quick existence checks.
11 std::unordered_set<std::string> wordsSet(wordList.begin(), wordList.end());
12
13 // Mapping to store lowercase versions of words.
14 std::unordered_map<std::string, std::string> lowercaseMap;
15
16 // Mapping to store patternized versions of words (vowels replaced with '*').
17 std::unordered_map<std::string, std::string> patternMap;
18
19 // Function to replace vowels in a word with '*'.
20 auto replaceVowels = [](std::string& word) {
21 std::string result;
22 for (char& c : word) {
23 if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u') {
24 result += '*';
25 } else {
26 result += c;
27 }
28 }
29 return result;
30 };
31
32 // Populate lowercaseMap and patternMap from the words in wordList.
33 for (auto& word : wordList) {
34 std::string tempWord = word;
35 // Convert to lowercase.
36 std::transform(tempWord.begin(), tempWord.end(), tempWord.begin(), ::tolower);
37 if (!lowercaseMap.count(tempWord)) {
38 lowercaseMap[tempWord] = word;
39 }
40
41 // Convert to vowel-patterned form.
42 tempWord = replaceVowels(tempWord);
43 if (!patternMap.count(tempWord)) {
44 patternMap[tempWord] = word;
45 }
46 }
47
48 // Vector to store the answers to the queries.
49 std::vector<std::string> answers;
50
51 // Process each query.
52 for (auto& query : queries) {
53 // Exact match check.
54 if (wordsSet.count(query)) {
55 answers.emplace_back(query);
56 continue;
57 }
58
59 // Convert query to lowercase for case-insensitive match.
60 std::transform(query.begin(), query.end(), query.begin(), ::tolower);
61
62 // Check for case-insensitive match.
63 if (lowercaseMap.count(query)) {
64 answers.emplace_back(lowercaseMap[query]);
65 continue;
66 }
67
68 // Replace vowels in query to '*' for vowel-insensitive match.
69 query = replaceVowels(query);
70
71 // Check for vowel-insensitive match.
72 if (patternMap.count(query)) {
73 answers.emplace_back(patternMap[query]);
74 continue;
75 }
76
77 // If no matches were found, add an empty string.
78 answers.emplace_back("");
79 }
80
81 // Return the final answers vector.
82 return answers;
83 }
84};
85
1// Import required TypeScript modules.
2import { Set, Map } from "typescript-collections";
3
4// Function to replace vowels in a word with '*'.
5const replaceVowels = (word: string): string => {
6 // Define a regex pattern for vowels (a, e, i, o, u).
7 const vowelPattern = /[aeiou]/gi;
8 // Replace all vowels in the word with '*' and return the result.
9 return word.replace(vowelPattern, '*');
10};
11
12// Function to spell check words based on a given word list.
13const spellChecker = (wordList: string[], queries: string[]): string[] => {
14 // Create a set from the wordList for quick existence checks.
15 const wordsSet = new Set<string>();
16 wordList.forEach(word => wordsSet.add(word));
17
18 // Mapping to store lowercase versions of words.
19 const lowercaseMap = new Map<string, string>();
20
21 // Mapping to store patternized versions of words (vowels replaced with '*').
22 const patternMap = new Map<string, string>();
23
24 // Populate lowercaseMap and patternMap from the words in wordList.
25 wordList.forEach(word => {
26 const lowerWord = word.toLowerCase();
27 if (!lowercaseMap.containsKey(lowerWord)) {
28 lowercaseMap.setValue(lowerWord, word);
29 }
30
31 const vowelLessWord = replaceVowels(lowerWord);
32 if (!patternMap.containsKey(vowelLessWord)) {
33 patternMap.setValue(vowelLessWord, word);
34 }
35 });
36
37 // Vector to store the answers to the queries.
38 const answers: string[] = [];
39
40 // Process each query.
41 queries.forEach(query => {
42 // Exact match check.
43 if (wordsSet.contains(query)) {
44 answers.push(query);
45 return; // Continue to the next iteration.
46 }
47
48 // Convert query to lowercase for case-insensitive match.
49 const lowerQuery = query.toLowerCase();
50
51 // Check for case-insensitive match.
52 if (lowercaseMap.containsKey(lowerQuery)) {
53 answers.push(lowercaseMap.getValue(lowerQuery)!);
54 return; // Continue to the next iteration.
55 }
56
57 // Replace vowels in query to '*' for vowel-insensitive match.
58 const patternQuery = replaceVowels(lowerQuery);
59
60 // Check for vowel-insensitive match.
61 if (patternMap.containsKey(patternQuery)) {
62 answers.push(patternMap.getValue(patternQuery)!);
63 return; // Continue to the next iteration.
64 }
65
66 // If no matches were found, add an empty string.
67 answers.push("");
68 });
69
70 // Return the final answers array.
71 return answers;
72};
73
Time and Space Complexity
Time Complexity
The time complexity of the code can be analyzed in the following steps:
-
The function
f(w)
takesO(n)
time for transforming a word by replacing vowels with*
, wheren
is the length of the word. -
The first loop where it iterates over the
wordlist
:- For each word
w
it computest = w.lower()
which takesO(m)
time, wherem
is the length of the word. - It calls
f(t)
which takesO(m)
time. - So, each iteration through the
wordlist
isO(m)
and since it goes through the entirewordlist
once, the complexity for this loop isO(L * m)
whereL
is the length of thewordlist
.
- For each word
-
The second loop where it iterates over the
queries
does the following:- Checks if
q
is in sets
which is anO(1)
operation sinces
is a set. - Converts
q
to lowercase and check if it's inlow
which is anO(k)
operation for each query, wherek
is the length of the query. - Calls
f(q)
which is anO(k)
operation, and checks against thepat
dictionary. - Similar to before, each iteration’s complexity is
O(k)
and since it goes through each query once, the total complexity isO(Q * k)
whereQ
is the number of queries.
- Checks if
Therefore, combining these complexities, the overall time complexity of the function spellchecker
is O(L * m + Q * k)
.
Space Complexity
The space complexity of the code can be analyzed by considering the additional data structures being used:
-
The
s
set stores all the words as they are, taking upO(L)
space. -
The
low
andpat
dictionaries store keys that are at most as long as the words and an associated word for each key. They will have at mostL
keys each (duplicates are not stored again becausesetdefault
won't replace existing keys), bringing the space needed for these toO(L * m)
, wherem
is the average length of the words.
Combining the complexities for storing s
, low
, and pat
, the overall space complexity is O(L * m)
, assuming the average length of the words m
is not negligible compared to the number of words L
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following array represent a max heap?
Recommended Readings
Patterns The Shortest Path Algorithm for Coding Interviews The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we can determine the
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Got a question? Ask the Monster Assistant anything you don't understand.
Still not clear? Submit the part you don't understand to our editors. Or join our Discord and ask the community.