524. Longest Word in Dictionary through Deleting
Problem Description
You are given a string s
and an array of strings called dictionary
. Your task is to find the longest string from the dictionary that can be formed by deleting some characters from string s
(while maintaining the relative order of remaining characters).
The key requirements are:
- The result string must be formable by deleting characters from
s
(it must be a subsequence ofs
) - If multiple valid strings have the same maximum length, return the one that comes first lexicographically (alphabetically)
- If no string from the dictionary can be formed, return an empty string
For example, if s = "abpcplea"
and dictionary = ["ale", "apple", "monkey", "plea"]
:
"ale"
can be formed by deleting characters froms
(keeping 'a', 'l', 'e' in order)"apple"
can be formed by deleting characters froms
(keeping 'a', 'p', 'p', 'l', 'e' in order)"monkey"
cannot be formed froms
"plea"
can be formed by deleting characters froms
Among the valid options, "apple"
is the longest with length 5, so it would be the answer.
The solution uses a two-pointer technique to check if each dictionary word is a subsequence of s
, then keeps track of the best valid word based on length and lexicographical order.
Intuition
The core insight is recognizing that this problem is about finding subsequences. A string can be formed by deleting characters from another string if and only if it's a subsequence of that string. This means we need to check if each dictionary word is a subsequence of s
.
To check if a string t
is a subsequence of s
, we can use the two-pointer technique. We traverse through s
character by character, and whenever we find a character that matches the current character we're looking for in t
, we move to the next character in t
. If we successfully match all characters of t
while traversing s
, then t
is a subsequence.
Why does this greedy matching work? Because we want to preserve the relative order of characters. When we find a matching character in s
, we can safely "consume" it for our subsequence without worrying about needing it later - if there's a valid subsequence, using the first occurrence of each character will always work.
For finding the best answer among all valid subsequences, we need to compare them based on two criteria:
- Length (longer is better)
- Lexicographical order (smaller is better when lengths are equal)
We can iterate through all dictionary words, check if each is a subsequence of s
, and keep track of the best one we've found so far. Whenever we find a valid word that's either longer than our current best, or has the same length but comes earlier alphabetically, we update our answer.
This straightforward approach works because we're essentially doing a filtered search - first filtering for valid subsequences, then selecting the optimal one based on the given criteria.
Learn more about Two Pointers and Sorting patterns.
Solution Approach
The solution implements a two-step process: checking subsequences and finding the optimal result.
Step 1: Subsequence Checking Function
The check(s, t)
function determines if string s
is a subsequence of string t
using two pointers:
- Initialize pointers
i = 0
for strings
andj = 0
for stringt
- Iterate through string
t
with pointerj
- When
s[i] == t[j]
, incrementi
to look for the next character ins
- Always increment
j
to continue scanning throught
- Return
True
ifi == len(s)
, meaning all characters ofs
were found in order
Step 2: Finding the Best Word
Initialize ans = ""
to store the best word found so far. Then iterate through each word t
in the dictionary:
- Check if
t
is a subsequence ofs
by callingcheck(t, s)
- If it is a subsequence, determine if it's better than the current
ans
:- Update
ans
iflen(t) > len(ans)
(found a longer word) - Update
ans
iflen(t) == len(ans)
andt < ans
lexicographically (found an equally long but alphabetically smaller word)
- Update
The comparison ans > t
works because Python compares strings lexicographically by default.
Time Complexity: O(n ร m)
where n
is the number of words in the dictionary and m
is the length of string s
. For each word, we potentially scan through the entire string s
.
Space Complexity: O(1)
as we only use a constant amount of extra space for pointers and the answer string (not counting the input).
The elegance of this solution lies in its simplicity - it directly checks each candidate and keeps track of the best one, without needing complex data structures or preprocessing.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example with s = "abpcplea"
and dictionary = ["ale", "apple", "monkey"]
.
Initial Setup:
ans = ""
(empty string to start)- We'll check each word in the dictionary
Checking "ale":
- Use two pointers to check if "ale" is a subsequence of "abpcplea"
i=0
(for "ale"),j=0
(for "abpcplea")j=0
: 'a' == 'a', match! Moveiโ1
,jโ1
j=1
: 'b' != 'l', no match. Movejโ2
j=2
: 'p' != 'l', no match. Movejโ3
j=3
: 'c' != 'l', no match. Movejโ4
j=4
: 'p' != 'l', no match. Movejโ5
j=5
: 'l' == 'l', match! Moveiโ2
,jโ6
j=6
: 'e' == 'e', match! Moveiโ3
,jโ7
i == 3 == len("ale")
, so "ale" is a subsequence- Compare with
ans=""
: len("ale")=3 > len("")=0, updateans = "ale"
Checking "apple":
- Two pointers:
i=0
(for "apple"),j=0
(for "abpcplea") j=0
: 'a' == 'a', match! Moveiโ1
,jโ1
j=1
: 'b' != 'p', no match. Movejโ2
j=2
: 'p' == 'p', match! Moveiโ2
,jโ3
j=3
: 'c' != 'p', no match. Movejโ4
j=4
: 'p' == 'p', match! Moveiโ3
,jโ5
j=5
: 'l' == 'l', match! Moveiโ4
,jโ6
j=6
: 'e' == 'e', match! Moveiโ5
,jโ7
i == 5 == len("apple")
, so "apple" is a subsequence- Compare with
ans="ale"
: len("apple")=5 > len("ale")=3, updateans = "apple"
Checking "monkey":
- Two pointers:
i=0
(for "monkey"),j=0
(for "abpcplea") j=0
: 'a' != 'm', no match. Movejโ1
j=1
: 'b' != 'm', no match. Movejโ2
- Continue through entire string... no 'm' found
- Reach end of "abpcplea" with
i=0
(never found 'm') i != len("monkey")
, so "monkey" is NOT a subsequence- Don't update
ans
Final Result: ans = "apple"
(the longest valid subsequence found)
Solution Implementation
1class Solution:
2 def findLongestWord(self, s: str, dictionary: List[str]) -> str:
3 def is_subsequence(word: str, source: str) -> bool:
4 """
5 Check if 'word' is a subsequence of 'source'.
6 A subsequence means all characters of 'word' appear in 'source'
7 in the same order, but not necessarily consecutively.
8 """
9 word_len, source_len = len(word), len(source)
10 word_idx = source_idx = 0
11
12 # Iterate through both strings
13 while word_idx < word_len and source_idx < source_len:
14 # If characters match, move word pointer forward
15 if word[word_idx] == source[source_idx]:
16 word_idx += 1
17 # Always move source pointer forward
18 source_idx += 1
19
20 # Return True if all characters in word were found
21 return word_idx == word_len
22
23 # Initialize result with empty string
24 result = ""
25
26 # Check each word in the dictionary
27 for word in dictionary:
28 # Check if current word is a subsequence of s
29 if is_subsequence(word, s):
30 # Update result if current word is:
31 # 1. Longer than current result, OR
32 # 2. Same length but lexicographically smaller
33 if len(result) < len(word) or (len(result) == len(word) and result > word):
34 result = word
35
36 return result
37
1class Solution {
2 /**
3 * Finds the longest word from dictionary that can be formed by deleting
4 * some characters from string s. If multiple valid words have the same length,
5 * returns the lexicographically smallest one.
6 *
7 * @param s The source string
8 * @param dictionary List of words to check
9 * @return The longest valid word, or empty string if none found
10 */
11 public String findLongestWord(String s, List<String> dictionary) {
12 String longestWord = "";
13
14 // Iterate through each word in the dictionary
15 for (String currentWord : dictionary) {
16 int longestLength = longestWord.length();
17 int currentLength = currentWord.length();
18
19 // Check if current word is a subsequence of s and is better than current best
20 if (isSubsequence(currentWord, s) &&
21 (longestLength < currentLength ||
22 (longestLength == currentLength && currentWord.compareTo(longestWord) < 0))) {
23 longestWord = currentWord;
24 }
25 }
26
27 return longestWord;
28 }
29
30 /**
31 * Checks if word is a subsequence of source string.
32 * A subsequence means word can be formed by deleting some characters
33 * from source without changing the order of remaining characters.
34 *
35 * @param word The word to check
36 * @param source The source string
37 * @return true if word is a subsequence of source, false otherwise
38 */
39 private boolean isSubsequence(String word, String source) {
40 int wordLength = word.length();
41 int sourceLength = source.length();
42 int wordPointer = 0;
43
44 // Two-pointer approach: iterate through source string
45 for (int sourcePointer = 0; wordPointer < wordLength && sourcePointer < sourceLength; sourcePointer++) {
46 // If characters match, move word pointer forward
47 if (word.charAt(wordPointer) == source.charAt(sourcePointer)) {
48 wordPointer++;
49 }
50 }
51
52 // Word is a subsequence if all characters were matched
53 return wordPointer == wordLength;
54 }
55}
56
1class Solution {
2public:
3 string findLongestWord(string s, vector<string>& dictionary) {
4 string result = "";
5
6 // Lambda function to check if word can be formed by deleting characters from str
7 // Returns true if word is a subsequence of str
8 auto isSubsequence = [&](const string& word, const string& str) {
9 int wordLength = word.size();
10 int strLength = str.size();
11 int wordIndex = 0;
12
13 // Iterate through str and match characters with word
14 for (int strIndex = 0; wordIndex < wordLength && strIndex < strLength; ++strIndex) {
15 if (word[wordIndex] == str[strIndex]) {
16 ++wordIndex;
17 }
18 }
19
20 // Check if all characters in word were matched
21 return wordIndex == wordLength;
22 };
23
24 // Iterate through each word in the dictionary
25 for (auto& currentWord : dictionary) {
26 int resultLength = result.size();
27 int currentWordLength = currentWord.size();
28
29 // Check if current word is a subsequence of s and satisfies the conditions:
30 // 1. It's longer than the current result, OR
31 // 2. It has the same length but is lexicographically smaller
32 if (isSubsequence(currentWord, s) &&
33 (resultLength < currentWordLength ||
34 (resultLength == currentWordLength && result > currentWord))) {
35 result = currentWord;
36 }
37 }
38
39 return result;
40 }
41};
42
1/**
2 * Finds the longest word in dictionary that can be formed by deleting characters from string s
3 * @param s - The source string to form words from
4 * @param dictionary - Array of words to check against
5 * @returns The longest word that can be formed, or lexicographically smallest if there's a tie
6 */
7function findLongestWord(s: string, dictionary: string[]): string {
8 /**
9 * Checks if word can be formed as a subsequence of targetString
10 * @param word - The word to check if it's a subsequence
11 * @param targetString - The string to check against
12 * @returns True if word is a subsequence of targetString
13 */
14 const isSubsequence = (word: string, targetString: string): boolean => {
15 const wordLength = word.length;
16 const targetLength = targetString.length;
17
18 let wordIndex = 0;
19
20 // Iterate through target string and match characters with word
21 for (let targetIndex = 0; wordIndex < wordLength && targetIndex < targetLength; targetIndex++) {
22 if (word[wordIndex] === targetString[targetIndex]) {
23 wordIndex++;
24 }
25 }
26
27 // Check if all characters in word were matched
28 return wordIndex === wordLength;
29 };
30
31 let longestWord: string = '';
32
33 // Check each word in the dictionary
34 for (const currentWord of dictionary) {
35 const longestWordLength = longestWord.length;
36 const currentWordLength = currentWord.length;
37
38 // Update longest word if current word is:
39 // 1. A subsequence of s, AND
40 // 2. Longer than current longest, OR same length but lexicographically smaller
41 if (isSubsequence(currentWord, s) &&
42 (longestWordLength < currentWordLength ||
43 (longestWordLength === currentWordLength && longestWord > currentWord))) {
44 longestWord = currentWord;
45 }
46 }
47
48 return longestWord;
49}
50
Time and Space Complexity
The time complexity is O(d ร (m + n))
, where:
d
is the number of words in the dictionary (length of the string list)m
is the average length of strings in the dictionaryn
is the length of strings
For each word in the dictionary, the check
function performs a two-pointer traversal. In the worst case, it needs to iterate through all characters of both the dictionary word (length m
) and the input string s
(length n
), resulting in O(m + n)
time for each check. Since we check all d
words in the dictionary, the total time complexity is O(d ร (m + n))
.
The space complexity is O(1)
. The algorithm only uses a constant amount of extra space for variables like i
, j
, and ans
. The ans
variable stores at most one string from the dictionary, but this is considered O(1)
additional space as it doesn't scale with the input size - it merely holds a reference to an existing string from the input dictionary.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Lexicographical Comparison Logic
The Pitfall: A common mistake is using the wrong comparison operator or logic when checking lexicographical order. Developers often write:
word < result
instead ofresult > word
- Or forget to check lexicographical order entirely when lengths are equal
This leads to returning the wrong word when multiple words have the same maximum length.
Example of the mistake:
# INCORRECT - will not properly handle lexicographical ordering
if len(word) > len(result):
result = word
# Missing the case when lengths are equal!
The Solution: Always use a compound condition that checks both length and lexicographical order:
if len(result) < len(word) or (len(result) == len(word) and result > word):
result = word
2. Early Termination Optimization Gone Wrong
The Pitfall: Some developers try to optimize by breaking early when they think they've found the best answer, but this can miss better solutions:
# INCORRECT - might miss lexicographically better words
for word in sorted(dictionary, key=len, reverse=True):
if is_subsequence(word, s):
return word # This might not be lexicographically smallest!
The Solution: If you want to optimize with sorting, you must sort by both length (descending) and lexicographical order (ascending):
# Sort by length (descending) then lexicographically (ascending)
for word in sorted(dictionary, key=lambda x: (-len(x), x)):
if is_subsequence(word, s):
return word
3. Inefficient Subsequence Check Implementation
The Pitfall: Using string slicing or find operations repeatedly, which creates new strings and degrades performance:
# INEFFICIENT - creates many substring copies
def is_subsequence(word, source):
for char in word:
idx = source.find(char)
if idx == -1:
return False
source = source[idx + 1:] # Creates new string each time!
return True
The Solution: Use the two-pointer approach without creating new strings:
def is_subsequence(word, source):
word_idx = 0
for char in source:
if word_idx < len(word) and char == word[word_idx]:
word_idx += 1
return word_idx == len(word)
4. Not Handling Edge Cases
The Pitfall: Forgetting to handle cases like:
- Empty dictionary โ should return ""
- No valid subsequences โ should return ""
- Empty string
s
โ only empty string from dictionary (if present) is valid
The Solution:
The provided solution naturally handles these cases by initializing result = ""
and only updating when a valid subsequence is found. No special edge case handling is needed.
What is the best way of checking if an element exists in an unsorted array once in terms of time complexity? Select the best that applies.
Recommended Readings
Tech Interview Pattern Two Pointers Introduction If you prefer videos here's a super quick introduction to Two Pointers div class responsive iframe iframe src https www youtube com embed xZ4AfXHQ1VQ title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture allowfullscreen iframe div Two pointers is a common interview
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Want a Structured Path to Master System Design Too? Donโt Miss This!