Facebook Pixel

1048. Longest String Chain

Problem Description

You have an array of words where each word contains only lowercase English letters.

A word A is considered a predecessor of word B if you can insert exactly one letter anywhere in word A (without changing the order of other characters) to make it equal to word B.

For example:

  • "abc" is a predecessor of "abac" (insert 'a' between 'b' and 'c')
  • "cba" is NOT a predecessor of "bcad" (would need more than one insertion or rearrangement)

A word chain is a sequence of words [word₁, word₂, ..., wordₖ] where:

  • k ≥ 1
  • Each word is a predecessor of the next word in the sequence
  • word₁ is a predecessor of word₂, word₂ is a predecessor of word₃, and so on

Note that a single word by itself forms a valid word chain of length 1.

Your task is to find the length of the longest possible word chain that can be formed using words from the given array. You can use each word at most once and pick any subset of words to form the chain.

The solution uses dynamic programming where:

  1. Words are sorted by length
  2. For each word, it checks all shorter words to see if they can be predecessors
  3. The check function verifies if one word can be transformed into another by inserting exactly one character
  4. dp[i] stores the length of the longest chain ending at word i
  5. The maximum value in the dp array gives the answer
Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When looking at this problem, we need to find the longest chain where each word is formed by adding one letter to the previous word. This naturally suggests thinking about the problem in terms of building chains incrementally.

The key insight is that if we want to find the longest chain ending at a particular word, we need to know the longest chains ending at all possible predecessor words. This screams dynamic programming - we can build up our solution by solving smaller subproblems first.

Why sort by length? A word can only be a predecessor of another word if it's exactly one character shorter. By sorting words by length, we ensure that when we're processing a word of length n, all potential predecessors (words of length n-1) have already been processed. This gives us the correct order to build our dynamic programming solution.

For each word, we ask: "What's the longest chain I can form ending at this word?" To answer this, we check all shorter words to see which ones could be predecessors. If word A is a predecessor of word B, then the longest chain ending at B could be the longest chain ending at A plus one more word (B itself).

The check function implements a two-pointer approach to verify the predecessor relationship. We traverse both strings simultaneously - when characters match, we advance both pointers; when they don't match, we only advance the pointer in the longer word (this represents where the extra character was inserted). If we can traverse the entire shorter word with at most one mismatch, we've found a valid predecessor relationship.

This bottom-up approach ensures we consider all possible chains and find the maximum length among them.

Learn more about Two Pointers, Dynamic Programming and Sorting patterns.

Solution Approach

The solution uses dynamic programming with sorting to build word chains from shorter to longer words.

Step 1: Sort the words by length

words.sort(key=lambda x: len(x))

This ensures we process shorter words before longer ones, allowing us to build chains in the correct order.

Step 2: Initialize DP array

dp = [1] * (n + 1)

dp[i] represents the length of the longest word chain ending at words[i]. Initially, each word forms a chain of length 1 by itself.

Step 3: Check predecessor relationship The check(w1, w2) function determines if w1 is a predecessor of w2:

  • First verify that w2 is exactly one character longer than w1
  • Use two pointers i and j to traverse w1 and w2 respectively
  • When characters match (w1[i] == w2[j]), advance both pointers
  • When characters don't match, only advance j (this represents the inserted character)
  • Count mismatches with cnt
  • Return True if we've traversed all of w1 (i == len(w1)) with less than 2 mismatches

Step 4: Build the DP solution

for i in range(1, n):
    for j in range(i):
        if check(words[j], words[i]):
            dp[i] = max(dp[i], dp[j] + 1)
    res = max(res, dp[i])

For each word at position i:

  • Check all previous words at positions j < i (which are shorter due to sorting)
  • If words[j] is a predecessor of words[i], update dp[i] to be the maximum of its current value or dp[j] + 1
  • Track the overall maximum chain length in res

Time Complexity: O(n² × L) where n is the number of words and L is the average word length. We have nested loops over words and the check function takes O(L) time.

Space Complexity: O(n) for the DP array.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with a small example: words = ["a", "ba", "bda", "bca", "bdca"]

Step 1: Sort by length After sorting: ["a", "ba", "bca", "bda", "bdca"]

  • Length 1: ["a"]
  • Length 2: ["ba"]
  • Length 3: ["bca", "bda"]
  • Length 4: ["bdca"]

Step 2: Initialize DP array dp = [1, 1, 1, 1, 1] (each word forms a chain of length 1 by itself)

Step 3: Process each word

Processing index 0 ("a"):

  • No previous words to check
  • dp[0] = 1, res = 1

Processing index 1 ("ba"):

  • Check if "a" is predecessor of "ba":
    • Using two pointers: i=0 (for "a"), j=0 (for "ba")
    • "a"[0] ≠ "ba"[0] ('a' ≠ 'b'), mismatch, advance j: j=1, cnt=1
    • "a"[0] = "ba"[1] ('a' = 'a'), match, advance both: i=1, j=2
    • i reached end of "a", cnt=1 < 2, so "a" IS a predecessor
  • Update: dp[1] = max(1, dp[0] + 1) = max(1, 2) = 2
  • res = 2

Processing index 2 ("bca"):

  • Check "a" → "bca": Not a predecessor (would need to insert 'b' and 'c')
  • Check "ba" → "bca":
    • Two pointers: "ba" vs "bca"
    • 'b' = 'b', advance both: i=1, j=1
    • 'a' ≠ 'c', mismatch: j=2, cnt=1
    • 'a' = 'a', advance both: i=2, j=3
    • "ba" IS a predecessor
  • Update: dp[2] = max(1, dp[1] + 1) = 3
  • res = 3

Processing index 3 ("bda"):

  • Check "a" → "bda": Not a predecessor
  • Check "ba" → "bda":
    • 'b' = 'b', advance both
    • 'a' ≠ 'd', mismatch, advance j
    • 'a' = 'a', advance both
    • "ba" IS a predecessor
  • Update: dp[3] = max(1, dp[1] + 1) = 3
  • res = 3

Processing index 4 ("bdca"):

  • Check "a" → "bdca": Not a predecessor
  • Check "ba" → "bdca": Not a predecessor (would need 'd' and 'c')
  • Check "bca" → "bdca":
    • 'b' = 'b', advance both
    • 'c' ≠ 'd', mismatch, advance j in "bdca"
    • 'c' = 'c', advance both
    • 'a' = 'a', advance both
    • "bca" IS a predecessor
  • Check "bda" → "bdca":
    • 'b' = 'b', 'd' = 'd', advance both
    • 'a' ≠ 'c', mismatch, advance j
    • 'a' = 'a', advance both
    • "bda" IS a predecessor
  • Update: dp[4] = max(1, dp[2] + 1, dp[3] + 1) = max(1, 4, 4) = 4
  • res = 4

Final Result: The longest word chain has length 4. One such chain is: "a" → "ba" → "bca" → "bdca"

Solution Implementation

1class Solution:
2    def longestStrChain(self, words: List[str]) -> int:
3        def is_predecessor(word1: str, word2: str) -> bool:
4            """
5            Check if word1 is a predecessor of word2.
6            word1 is a predecessor if we can insert exactly one letter to word1 to make it equal to word2.
7            """
8            # Words must differ by exactly one character in length
9            if len(word2) - len(word1) != 1:
10                return False
11          
12            i = 0  # Pointer for word1
13            j = 0  # Pointer for word2
14            mismatch_count = 0  # Count of characters in word2 that don't match word1
15          
16            # Compare characters of both words
17            while i < len(word1) and j < len(word2):
18                if word1[i] != word2[j]:
19                    # Found a mismatch, this should be the inserted character
20                    mismatch_count += 1
21                else:
22                    # Characters match, move pointer in word1
23                    i += 1
24                # Always move pointer in word2
25                j += 1
26          
27            # Valid predecessor if:
28            # - At most 1 mismatch (insertion) occurred
29            # - All characters in word1 were matched
30            return mismatch_count < 2 and i == len(word1)
31      
32        n = len(words)
33        # dp[i] represents the longest chain ending at words[i]
34        dp = [1] * n
35      
36        # Sort words by length to ensure we process shorter words first
37        words.sort(key=lambda x: len(x))
38      
39        max_chain_length = 1
40      
41        # Build chains by checking each word against all previous shorter words
42        for i in range(1, n):
43            for j in range(i):
44                # If words[j] is a predecessor of words[i], update chain length
45                if is_predecessor(words[j], words[i]):
46                    dp[i] = max(dp[i], dp[j] + 1)
47            # Update the maximum chain length found so far
48            max_chain_length = max(max_chain_length, dp[i])
49      
50        return max_chain_length
51
1class Solution {
2    public int longestStrChain(String[] words) {
3        // Sort words by length in ascending order
4        Arrays.sort(words, Comparator.comparingInt(String::length));
5      
6        // Variable to track the maximum chain length found
7        int maxChainLength = 0;
8      
9        // Map to store each word and its longest chain length ending at that word
10        Map<String, Integer> wordToChainLength = new HashMap<>();
11      
12        // Process each word in order of increasing length
13        for (String currentWord : words) {
14            // Initialize the chain length for current word as 1 (the word itself)
15            int currentChainLength = 1;
16          
17            // Try removing each character to find potential predecessors
18            for (int i = 0; i < currentWord.length(); ++i) {
19                // Create predecessor by removing character at position i
20                String predecessor = currentWord.substring(0, i) + currentWord.substring(i + 1);
21              
22                // Update chain length if this predecessor exists and gives a longer chain
23                currentChainLength = Math.max(currentChainLength, 
24                                             wordToChainLength.getOrDefault(predecessor, 0) + 1);
25            }
26          
27            // Store the longest chain length ending at current word
28            wordToChainLength.put(currentWord, currentChainLength);
29          
30            // Update the global maximum chain length
31            maxChainLength = Math.max(maxChainLength, currentChainLength);
32        }
33      
34        return maxChainLength;
35    }
36}
37
1class Solution {
2public:
3    int longestStrChain(vector<string>& words) {
4        // Sort words by length in ascending order
5        sort(words.begin(), words.end(), [](const string& a, const string& b) { 
6            return a.size() < b.size(); 
7        });
8      
9        int maxChainLength = 0;
10        // Map to store the longest chain ending at each word
11        unordered_map<string, int> chainLengthMap;
12      
13        // Process each word in order of increasing length
14        for (const string& currentWord : words) {
15            int currentChainLength = 1;  // Minimum chain length is 1 (word itself)
16          
17            // Try removing each character to find potential predecessors
18            for (int i = 0; i < currentWord.size(); ++i) {
19                // Create predecessor by removing character at position i
20                string predecessor = currentWord.substr(0, i) + currentWord.substr(i + 1);
21              
22                // Update chain length if this predecessor exists in map
23                if (chainLengthMap.count(predecessor)) {
24                    currentChainLength = max(currentChainLength, chainLengthMap[predecessor] + 1);
25                }
26            }
27          
28            // Store the longest chain length ending at current word
29            chainLengthMap[currentWord] = currentChainLength;
30          
31            // Update the global maximum chain length
32            maxChainLength = max(maxChainLength, currentChainLength);
33        }
34      
35        return maxChainLength;
36    }
37};
38
1/**
2 * Finds the longest string chain where each word is formed by adding exactly one letter to the previous word
3 * @param words - Array of words to form the chain
4 * @returns The length of the longest possible word chain
5 */
6function longestStrChain(words: string[]): number {
7    // Sort words by length to process shorter words first
8    words.sort((a: string, b: string) => a.length - b.length);
9  
10    // Track the maximum chain length found
11    let maxChainLength: number = 0;
12  
13    // Map to store the longest chain ending at each word
14    const chainLengthMap: Map<string, number> = new Map<string, number>();
15  
16    // Process each word in ascending order of length
17    for (const currentWord of words) {
18        // Initialize chain length for current word as 1 (word itself)
19        let currentChainLength: number = 1;
20      
21        // Try removing each character to find potential predecessors
22        for (let i = 0; i < currentWord.length; i++) {
23            // Create predecessor by removing character at index i
24            const predecessor: string = currentWord.substring(0, i) + currentWord.substring(i + 1);
25          
26            // Update chain length if this predecessor exists and forms a longer chain
27            const predecessorChainLength: number = chainLengthMap.get(predecessor) || 0;
28            currentChainLength = Math.max(currentChainLength, predecessorChainLength + 1);
29        }
30      
31        // Store the longest chain length ending at current word
32        chainLengthMap.set(currentWord, currentChainLength);
33      
34        // Update global maximum chain length
35        maxChainLength = Math.max(maxChainLength, currentChainLength);
36    }
37  
38    return maxChainLength;
39}
40

Time and Space Complexity

Time Complexity: O(n² × L) where n is the number of words and L is the average length of words.

  • Sorting the words array takes O(n log n) time
  • The nested loops iterate through all pairs where i > j, giving us O(n²) iterations
  • For each pair, the check function is called which takes O(L) time in the worst case, where it needs to compare characters of two strings
  • Therefore, the overall time complexity is O(n log n + n² × L) = O(n² × L) since n² × L dominates for typical inputs

Space Complexity: O(n)

  • The dp array uses O(n) space to store the longest chain length ending at each word
  • The sorting operation might use O(log n) space for the recursive call stack in typical implementations
  • The check function uses O(1) additional space with only a few integer variables
  • Overall space complexity is O(n) as the dp array dominates the space usage

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Predecessor Validation Logic

The most common pitfall is incorrectly implementing the predecessor check function. Many implementations fail to handle edge cases properly:

Problematic Implementation:

def is_predecessor_wrong(word1, word2):
    if len(word2) - len(word1) != 1:
        return False
  
    for i in range(len(word2)):
        # Simply removing one character and checking equality
        if word2[:i] + word2[i+1:] == word1:
            return True
    return False

Issue: While this works, it's inefficient with O(L²) complexity due to string slicing and comparison.

Another Common Mistake:

def is_predecessor_buggy(word1, word2):
    if len(word2) - len(word1) != 1:
        return False
  
    i, j = 0, 0
    while i < len(word1) and j < len(word2):
        if word1[i] == word2[j]:
            i += 1
            j += 1
        else:
            j += 1  # Skip the inserted character
  
    return i == len(word1)  # Bug: doesn't track number of mismatches

Issue: This doesn't properly count mismatches. It would incorrectly accept cases where multiple characters are inserted, like considering "ac" as a predecessor of "abcd" (inserting both 'b' and 'd').

Correct Solution:

def is_predecessor(word1, word2):
    if len(word2) - len(word1) != 1:
        return False
  
    i, j = 0, 0
    mismatch_count = 0
  
    while i < len(word1) and j < len(word2):
        if word1[i] == word2[j]:
            i += 1
        else:
            mismatch_count += 1
            if mismatch_count > 1:  # Early termination optimization
                return False
        j += 1
  
    # Must have traversed all of word1 and have at most 1 mismatch
    return i == len(word1) and mismatch_count <= 1

2. Forgetting to Sort Words by Length

Problematic Code:

def longestStrChain_wrong(words):
    n = len(words)
    dp = [1] * n
    # Missing: words.sort(key=lambda x: len(x))
  
    for i in range(1, n):
        for j in range(i):
            if is_predecessor(words[j], words[i]):
                dp[i] = max(dp[i], dp[j] + 1)
  
    return max(dp)

Issue: Without sorting, we might try to build chains in the wrong order. For example, if the array is ["bdca", "bda", "ba", "a"], we'd process "bdca" first and miss the chain "a" -> "ba" -> "bda" -> "bdca".

Solution: Always sort words by length before processing to ensure shorter words are processed before longer ones.

3. Using HashMap Instead of DP Array (Alternative Approach)

While using a HashMap with word-to-chain-length mapping is valid, a common mistake is not handling the deletion properly:

Problematic HashMap Approach:

def longestStrChain_hashmap_buggy(words):
    words.sort(key=len)
    dp = {}
  
    for word in words:
        dp[word] = 1
        for i in range(len(word)):
            # Create predecessor by removing character at position i
            predecessor = word[:i] + word[i+1:]
            if predecessor in dp:
                dp[word] = max(dp[word], dp[predecessor] + 1)
  
    return max(dp.values())

Issue: This approach works backwards (removing characters from the current word to find predecessors) which is correct, but many people confuse the direction and try to add characters to find successors, which doesn't work with the sorted order.

Better Implementation with Clear Logic:

def longestStrChain_hashmap(words):
    words.sort(key=len)
    dp = {}
    max_length = 1
  
    for word in words:
        # Start with chain length 1 for this word
        dp[word] = 1
      
        # Try removing each character to find potential predecessors
        for i in range(len(word)):
            predecessor = word[:i] + word[i+1:]
            if predecessor in dp:
                # Extend the chain from predecessor
                dp[word] = max(dp[word], dp[predecessor] + 1)
      
        max_length = max(max_length, dp[word])
  
    return max_length

This HashMap approach is actually more efficient (O(n × L²) vs O(n² × L)) when the number of words is large but word lengths are small.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Consider the classic dynamic programming of longest increasing subsequence:

Find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.

For example, the length of LIS for [50, 3, 10, 7, 40, 80] is 4 and LIS is [3, 7, 40, 80].

What is the recurrence relation?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More