524. Longest Word in Dictionary through Deleting

Problem Description

You are given a string s and an array of strings called dictionary. Your task is to find the longest string from the dictionary that can be formed by deleting some characters from string s (while maintaining the relative order of remaining characters).

The key requirements are:

The result string must be formable by deleting characters from s (it must be a subsequence of s)
If multiple valid strings have the same maximum length, return the one that comes first lexicographically (alphabetically)
If no string from the dictionary can be formed, return an empty string

For example, if s = "abpcplea" and dictionary = ["ale", "apple", "monkey", "plea"]:

"ale" can be formed by deleting characters from s (keeping 'a', 'l', 'e' in order)
"apple" can be formed by deleting characters from s (keeping 'a', 'p', 'p', 'l', 'e' in order)
"monkey" cannot be formed from s
"plea" can be formed by deleting characters from s

Among the valid options, "apple" is the longest with length 5, so it would be the answer.

The solution uses a two-pointer technique to check if each dictionary word is a subsequence of s, then keeps track of the best valid word based on length and lexicographical order.

Quick Interview Experience

Help others by sharing your interview experience

Have you seen this problem before?

Intuition

The core insight is recognizing that this problem is about finding subsequences. A string can be formed by deleting characters from another string if and only if it's a subsequence of that string. This means we need to check if each dictionary word is a subsequence of s.

To check if a string t is a subsequence of s, we can use the two-pointer technique. We traverse through s character by character, and whenever we find a character that matches the current character we're looking for in t, we move to the next character in t. If we successfully match all characters of t while traversing s, then t is a subsequence.

Why does this greedy matching work? Because we want to preserve the relative order of characters. When we find a matching character in s, we can safely "consume" it for our subsequence without worrying about needing it later - if there's a valid subsequence, using the first occurrence of each character will always work.

For finding the best answer among all valid subsequences, we need to compare them based on two criteria:

Length (longer is better)
Lexicographical order (smaller is better when lengths are equal)

We can iterate through all dictionary words, check if each is a subsequence of s, and keep track of the best one we've found so far. Whenever we find a valid word that's either longer than our current best, or has the same length but comes earlier alphabetically, we update our answer.

This straightforward approach works because we're essentially doing a filtered search - first filtering for valid subsequences, then selecting the optimal one based on the given criteria.

Learn more about Two Pointers and Sorting patterns.

Solution Approach

The solution implements a two-step process: checking subsequences and finding the optimal result.

Step 1: Subsequence Checking Function

The check(s, t) function determines if string s is a subsequence of string t using two pointers:

Initialize pointers i = 0 for string s and j = 0 for string t
Iterate through string t with pointer j
When s[i] == t[j], increment i to look for the next character in s
Always increment j to continue scanning through t
Return True if i == len(s), meaning all characters of s were found in order

Step 2: Finding the Best Word

Initialize ans = "" to store the best word found so far. Then iterate through each word t in the dictionary:

Check if t is a subsequence of s by calling check(t, s)
If it is a subsequence, determine if it's better than the current ans:
- Update ans if len(t) > len(ans) (found a longer word)
- Update ans if len(t) == len(ans) and t < ans lexicographically (found an equally long but alphabetically smaller word)

The comparison ans > t works because Python compares strings lexicographically by default.

Time Complexity: O(n × m) where n is the number of words in the dictionary and m is the length of string s. For each word, we potentially scan through the entire string s.

Space Complexity: O(1) as we only use a constant amount of extra space for pointers and the answer string (not counting the input).

The elegance of this solution lies in its simplicity - it directly checks each candidate and keeps track of the best one, without needing complex data structures or preprocessing.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with s = "abpcplea" and dictionary = ["ale", "apple", "monkey"].

Initial Setup:

ans = "" (empty string to start)
We'll check each word in the dictionary

Checking "ale":

Use two pointers to check if "ale" is a subsequence of "abpcplea"
i=0 (for "ale"), j=0 (for "abpcplea")
j=0: 'a' == 'a', match! Move i→1, j→1
j=1: 'b' != 'l', no match. Move j→2
j=2: 'p' != 'l', no match. Move j→3
j=3: 'c' != 'l', no match. Move j→4
j=4: 'p' != 'l', no match. Move j→5
j=5: 'l' == 'l', match! Move i→2, j→6
j=6: 'e' == 'e', match! Move i→3, j→7
i == 3 == len("ale"), so "ale" is a subsequence
Compare with ans="": len("ale")=3 > len("")=0, update ans = "ale"

Checking "apple":

Two pointers: i=0 (for "apple"), j=0 (for "abpcplea")
j=0: 'a' == 'a', match! Move i→1, j→1
j=1: 'b' != 'p', no match. Move j→2
j=2: 'p' == 'p', match! Move i→2, j→3
j=3: 'c' != 'p', no match. Move j→4
j=4: 'p' == 'p', match! Move i→3, j→5
j=5: 'l' == 'l', match! Move i→4, j→6
j=6: 'e' == 'e', match! Move i→5, j→7
i == 5 == len("apple"), so "apple" is a subsequence
Compare with ans="ale": len("apple")=5 > len("ale")=3, update ans = "apple"

Checking "monkey":

Two pointers: i=0 (for "monkey"), j=0 (for "abpcplea")
j=0: 'a' != 'm', no match. Move j→1
j=1: 'b' != 'm', no match. Move j→2
Continue through entire string... no 'm' found
Reach end of "abpcplea" with i=0 (never found 'm')
i != len("monkey"), so "monkey" is NOT a subsequence
Don't update ans

Final Result: ans = "apple" (the longest valid subsequence found)

Solution Implementation

1class Solution:
2    def findLongestWord(self, s: str, dictionary: List[str]) -> str:
3        def is_subsequence(word: str, source: str) -> bool:
4            """
5            Check if 'word' is a subsequence of 'source'.
6            A subsequence means all characters of 'word' appear in 'source' 
7            in the same order, but not necessarily consecutively.
8            """
9            word_len, source_len = len(word), len(source)
10            word_idx = source_idx = 0
11          
12            # Iterate through both strings
13            while word_idx < word_len and source_idx < source_len:
14                # If characters match, move word pointer forward
15                if word[word_idx] == source[source_idx]:
16                    word_idx += 1
17                # Always move source pointer forward
18                source_idx += 1
19          
20            # Return True if all characters in word were found
21            return word_idx == word_len
22      
23        # Initialize result with empty string
24        result = ""
25      
26        # Check each word in the dictionary
27        for word in dictionary:
28            # Check if current word is a subsequence of s
29            if is_subsequence(word, s):
30                # Update result if current word is:
31                # 1. Longer than current result, OR
32                # 2. Same length but lexicographically smaller
33                if len(result) < len(word) or (len(result) == len(word) and result > word):
34                    result = word
35      
36        return result
37

1class Solution {
2    /**
3     * Finds the longest word from dictionary that can be formed by deleting 
4     * some characters from string s. If multiple valid words have the same length,
5     * returns the lexicographically smallest one.
6     * 
7     * @param s The source string
8     * @param dictionary List of words to check
9     * @return The longest valid word, or empty string if none found
10     */
11    public String findLongestWord(String s, List<String> dictionary) {
12        String longestWord = "";
13      
14        // Iterate through each word in the dictionary
15        for (String currentWord : dictionary) {
16            int longestLength = longestWord.length();
17            int currentLength = currentWord.length();
18          
19            // Check if current word is a subsequence of s and is better than current best
20            if (isSubsequence(currentWord, s) && 
21                (longestLength < currentLength || 
22                 (longestLength == currentLength && currentWord.compareTo(longestWord) < 0))) {
23                longestWord = currentWord;
24            }
25        }
26      
27        return longestWord;
28    }
29
30    /**
31     * Checks if word is a subsequence of source string.
32     * A subsequence means word can be formed by deleting some characters 
33     * from source without changing the order of remaining characters.
34     * 
35     * @param word The word to check
36     * @param source The source string
37     * @return true if word is a subsequence of source, false otherwise
38     */
39    private boolean isSubsequence(String word, String source) {
40        int wordLength = word.length();
41        int sourceLength = source.length();
42        int wordPointer = 0;
43      
44        // Two-pointer approach: iterate through source string
45        for (int sourcePointer = 0; wordPointer < wordLength && sourcePointer < sourceLength; sourcePointer++) {
46            // If characters match, move word pointer forward
47            if (word.charAt(wordPointer) == source.charAt(sourcePointer)) {
48                wordPointer++;
49            }
50        }
51      
52        // Word is a subsequence if all characters were matched
53        return wordPointer == wordLength;
54    }
55}
56

1class Solution {
2public:
3    string findLongestWord(string s, vector<string>& dictionary) {
4        string result = "";
5      
6        // Lambda function to check if word can be formed by deleting characters from str
7        // Returns true if word is a subsequence of str
8        auto isSubsequence = [&](const string& word, const string& str) {
9            int wordLength = word.size();
10            int strLength = str.size();
11            int wordIndex = 0;
12          
13            // Iterate through str and match characters with word
14            for (int strIndex = 0; wordIndex < wordLength && strIndex < strLength; ++strIndex) {
15                if (word[wordIndex] == str[strIndex]) {
16                    ++wordIndex;
17                }
18            }
19          
20            // Check if all characters in word were matched
21            return wordIndex == wordLength;
22        };
23      
24        // Iterate through each word in the dictionary
25        for (auto& currentWord : dictionary) {
26            int resultLength = result.size();
27            int currentWordLength = currentWord.size();
28          
29            // Check if current word is a subsequence of s and satisfies the conditions:
30            // 1. It's longer than the current result, OR
31            // 2. It has the same length but is lexicographically smaller
32            if (isSubsequence(currentWord, s) && 
33                (resultLength < currentWordLength || 
34                 (resultLength == currentWordLength && result > currentWord))) {
35                result = currentWord;
36            }
37        }
38      
39        return result;
40    }
41};
42

1/**
2 * Finds the longest word in dictionary that can be formed by deleting characters from string s
3 * @param s - The source string to form words from
4 * @param dictionary - Array of words to check against
5 * @returns The longest word that can be formed, or lexicographically smallest if there's a tie
6 */
7function findLongestWord(s: string, dictionary: string[]): string {
8    /**
9     * Checks if word can be formed as a subsequence of targetString
10     * @param word - The word to check if it's a subsequence
11     * @param targetString - The string to check against
12     * @returns True if word is a subsequence of targetString
13     */
14    const isSubsequence = (word: string, targetString: string): boolean => {
15        const wordLength = word.length;
16        const targetLength = targetString.length;
17      
18        let wordIndex = 0;
19      
20        // Iterate through target string and match characters with word
21        for (let targetIndex = 0; wordIndex < wordLength && targetIndex < targetLength; targetIndex++) {
22            if (word[wordIndex] === targetString[targetIndex]) {
23                wordIndex++;
24            }
25        }
26      
27        // Check if all characters in word were matched
28        return wordIndex === wordLength;
29    };
30  
31    let longestWord: string = '';
32  
33    // Check each word in the dictionary
34    for (const currentWord of dictionary) {
35        const longestWordLength = longestWord.length;
36        const currentWordLength = currentWord.length;
37      
38        // Update longest word if current word is:
39        // 1. A subsequence of s, AND
40        // 2. Longer than current longest, OR same length but lexicographically smaller
41        if (isSubsequence(currentWord, s) && 
42            (longestWordLength < currentWordLength || 
43             (longestWordLength === currentWordLength && longestWord > currentWord))) {
44            longestWord = currentWord;
45        }
46    }
47  
48    return longestWord;
49}
50

Time and Space Complexity

The time complexity is O(d × (m + n)), where:

d is the number of words in the dictionary (length of the string list)
m is the average length of strings in the dictionary
n is the length of string s

For each word in the dictionary, the check function performs a two-pointer traversal. In the worst case, it needs to iterate through all characters of both the dictionary word (length m) and the input string s (length n), resulting in O(m + n) time for each check. Since we check all d words in the dictionary, the total time complexity is O(d × (m + n)).

The space complexity is O(1). The algorithm only uses a constant amount of extra space for variables like i, j, and ans. The ans variable stores at most one string from the dictionary, but this is considered O(1) additional space as it doesn't scale with the input size - it merely holds a reference to an existing string from the input dictionary.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Lexicographical Comparison Logic

The Pitfall: A common mistake is using the wrong comparison operator or logic when checking lexicographical order. Developers often write:

word < result instead of result > word
Or forget to check lexicographical order entirely when lengths are equal

This leads to returning the wrong word when multiple words have the same maximum length.

Example of the mistake:

# INCORRECT - will not properly handle lexicographical ordering
if len(word) > len(result):
    result = word
# Missing the case when lengths are equal!

The Solution: Always use a compound condition that checks both length and lexicographical order:

if len(result) < len(word) or (len(result) == len(word) and result > word):
    result = word

2. Early Termination Optimization Gone Wrong

The Pitfall: Some developers try to optimize by breaking early when they think they've found the best answer, but this can miss better solutions:

# INCORRECT - might miss lexicographically better words
for word in sorted(dictionary, key=len, reverse=True):
    if is_subsequence(word, s):
        return word  # This might not be lexicographically smallest!

The Solution: If you want to optimize with sorting, you must sort by both length (descending) and lexicographical order (ascending):

# Sort by length (descending) then lexicographically (ascending)
for word in sorted(dictionary, key=lambda x: (-len(x), x)):
    if is_subsequence(word, s):
        return word

3. Inefficient Subsequence Check Implementation

The Pitfall: Using string slicing or find operations repeatedly, which creates new strings and degrades performance:

# INEFFICIENT - creates many substring copies
def is_subsequence(word, source):
    for char in word:
        idx = source.find(char)
        if idx == -1:
            return False
        source = source[idx + 1:]  # Creates new string each time!
    return True

The Solution: Use the two-pointer approach without creating new strings:

def is_subsequence(word, source):
    word_idx = 0
    for char in source:
        if word_idx < len(word) and char == word[word_idx]:
            word_idx += 1
    return word_idx == len(word)

4. Not Handling Edge Cases

The Pitfall: Forgetting to handle cases like:

Empty dictionary → should return ""
No valid subsequences → should return ""
Empty string s → only empty string from dictionary (if present) is valid

The Solution: The provided solution naturally handles these cases by initializing result = "" and only updating when a valid subsequence is found. No special edge case handling is needed.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Consider the classic dynamic programming of fibonacci numbers, what is the recurrence relation?

dp[i] = dp[i - 1] + dp[i - 2]

dp[i] = dp[i - 1] + 1

dp[i] = dp[i - 1] + dp[i - 2] + dp[i - 3]

dp[i] = dp[i - 1]

I don't know