1048. Longest String Chain
Problem Description
You have an array of words where each word contains only lowercase English letters.
A word A is considered a predecessor of word B if you can insert exactly one letter anywhere in word A (without changing the order of other characters) to make it equal to word B.
For example:
"abc"
is a predecessor of"abac"
(insert 'a' between 'b' and 'c')"cba"
is NOT a predecessor of"bcad"
(would need more than one insertion or rearrangement)
A word chain is a sequence of words [word₁, word₂, ..., wordₖ]
where:
k ≥ 1
- Each word is a predecessor of the next word in the sequence
word₁
is a predecessor ofword₂
,word₂
is a predecessor ofword₃
, and so on
Note that a single word by itself forms a valid word chain of length 1.
Your task is to find the length of the longest possible word chain that can be formed using words from the given array. You can use each word at most once and pick any subset of words to form the chain.
The solution uses dynamic programming where:
- Words are sorted by length
- For each word, it checks all shorter words to see if they can be predecessors
- The
check
function verifies if one word can be transformed into another by inserting exactly one character dp[i]
stores the length of the longest chain ending at wordi
- The maximum value in the dp array gives the answer
Intuition
When looking at this problem, we need to find the longest chain where each word is formed by adding one letter to the previous word. This naturally suggests thinking about the problem in terms of building chains incrementally.
The key insight is that if we want to find the longest chain ending at a particular word, we need to know the longest chains ending at all possible predecessor words. This screams dynamic programming - we can build up our solution by solving smaller subproblems first.
Why sort by length? A word can only be a predecessor of another word if it's exactly one character shorter. By sorting words by length, we ensure that when we're processing a word of length n
, all potential predecessors (words of length n-1
) have already been processed. This gives us the correct order to build our dynamic programming solution.
For each word, we ask: "What's the longest chain I can form ending at this word?" To answer this, we check all shorter words to see which ones could be predecessors. If word A is a predecessor of word B, then the longest chain ending at B could be the longest chain ending at A plus one more word (B itself).
The check
function implements a two-pointer approach to verify the predecessor relationship. We traverse both strings simultaneously - when characters match, we advance both pointers; when they don't match, we only advance the pointer in the longer word (this represents where the extra character was inserted). If we can traverse the entire shorter word with at most one mismatch, we've found a valid predecessor relationship.
This bottom-up approach ensures we consider all possible chains and find the maximum length among them.
Learn more about Two Pointers, Dynamic Programming and Sorting patterns.
Solution Approach
The solution uses dynamic programming with sorting to build word chains from shorter to longer words.
Step 1: Sort the words by length
words.sort(key=lambda x: len(x))
This ensures we process shorter words before longer ones, allowing us to build chains in the correct order.
Step 2: Initialize DP array
dp = [1] * (n + 1)
dp[i]
represents the length of the longest word chain ending at words[i]
. Initially, each word forms a chain of length 1 by itself.
Step 3: Check predecessor relationship
The check(w1, w2)
function determines if w1
is a predecessor of w2
:
- First verify that
w2
is exactly one character longer thanw1
- Use two pointers
i
andj
to traversew1
andw2
respectively - When characters match (
w1[i] == w2[j]
), advance both pointers - When characters don't match, only advance
j
(this represents the inserted character) - Count mismatches with
cnt
- Return
True
if we've traversed all ofw1
(i == len(w1)
) with less than 2 mismatches
Step 4: Build the DP solution
for i in range(1, n):
for j in range(i):
if check(words[j], words[i]):
dp[i] = max(dp[i], dp[j] + 1)
res = max(res, dp[i])
For each word at position i
:
- Check all previous words at positions
j < i
(which are shorter due to sorting) - If
words[j]
is a predecessor ofwords[i]
, updatedp[i]
to be the maximum of its current value ordp[j] + 1
- Track the overall maximum chain length in
res
Time Complexity: O(n² × L)
where n
is the number of words and L
is the average word length. We have nested loops over words and the check
function takes O(L)
time.
Space Complexity: O(n)
for the DP array.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with a small example: words = ["a", "ba", "bda", "bca", "bdca"]
Step 1: Sort by length
After sorting: ["a", "ba", "bca", "bda", "bdca"]
- Length 1:
["a"]
- Length 2:
["ba"]
- Length 3:
["bca", "bda"]
- Length 4:
["bdca"]
Step 2: Initialize DP array
dp = [1, 1, 1, 1, 1]
(each word forms a chain of length 1 by itself)
Step 3: Process each word
Processing index 0 ("a"):
- No previous words to check
dp[0] = 1
,res = 1
Processing index 1 ("ba"):
- Check if "a" is predecessor of "ba":
- Using two pointers:
i=0
(for "a"),j=0
(for "ba") - "a"[0] ≠ "ba"[0] ('a' ≠ 'b'), mismatch, advance j:
j=1
,cnt=1
- "a"[0] = "ba"[1] ('a' = 'a'), match, advance both:
i=1
,j=2
i
reached end of "a",cnt=1 < 2
, so "a" IS a predecessor
- Using two pointers:
- Update:
dp[1] = max(1, dp[0] + 1) = max(1, 2) = 2
res = 2
Processing index 2 ("bca"):
- Check "a" → "bca": Not a predecessor (would need to insert 'b' and 'c')
- Check "ba" → "bca":
- Two pointers: "ba" vs "bca"
- 'b' = 'b', advance both:
i=1
,j=1
- 'a' ≠ 'c', mismatch:
j=2
,cnt=1
- 'a' = 'a', advance both:
i=2
,j=3
- "ba" IS a predecessor
- Update:
dp[2] = max(1, dp[1] + 1) = 3
res = 3
Processing index 3 ("bda"):
- Check "a" → "bda": Not a predecessor
- Check "ba" → "bda":
- 'b' = 'b', advance both
- 'a' ≠ 'd', mismatch, advance j
- 'a' = 'a', advance both
- "ba" IS a predecessor
- Update:
dp[3] = max(1, dp[1] + 1) = 3
res = 3
Processing index 4 ("bdca"):
- Check "a" → "bdca": Not a predecessor
- Check "ba" → "bdca": Not a predecessor (would need 'd' and 'c')
- Check "bca" → "bdca":
- 'b' = 'b', advance both
- 'c' ≠ 'd', mismatch, advance j in "bdca"
- 'c' = 'c', advance both
- 'a' = 'a', advance both
- "bca" IS a predecessor
- Check "bda" → "bdca":
- 'b' = 'b', 'd' = 'd', advance both
- 'a' ≠ 'c', mismatch, advance j
- 'a' = 'a', advance both
- "bda" IS a predecessor
- Update:
dp[4] = max(1, dp[2] + 1, dp[3] + 1) = max(1, 4, 4) = 4
res = 4
Final Result: The longest word chain has length 4.
One such chain is: "a" → "ba" → "bca" → "bdca"
Solution Implementation
1class Solution:
2 def longestStrChain(self, words: List[str]) -> int:
3 def is_predecessor(word1: str, word2: str) -> bool:
4 """
5 Check if word1 is a predecessor of word2.
6 word1 is a predecessor if we can insert exactly one letter to word1 to make it equal to word2.
7 """
8 # Words must differ by exactly one character in length
9 if len(word2) - len(word1) != 1:
10 return False
11
12 i = 0 # Pointer for word1
13 j = 0 # Pointer for word2
14 mismatch_count = 0 # Count of characters in word2 that don't match word1
15
16 # Compare characters of both words
17 while i < len(word1) and j < len(word2):
18 if word1[i] != word2[j]:
19 # Found a mismatch, this should be the inserted character
20 mismatch_count += 1
21 else:
22 # Characters match, move pointer in word1
23 i += 1
24 # Always move pointer in word2
25 j += 1
26
27 # Valid predecessor if:
28 # - At most 1 mismatch (insertion) occurred
29 # - All characters in word1 were matched
30 return mismatch_count < 2 and i == len(word1)
31
32 n = len(words)
33 # dp[i] represents the longest chain ending at words[i]
34 dp = [1] * n
35
36 # Sort words by length to ensure we process shorter words first
37 words.sort(key=lambda x: len(x))
38
39 max_chain_length = 1
40
41 # Build chains by checking each word against all previous shorter words
42 for i in range(1, n):
43 for j in range(i):
44 # If words[j] is a predecessor of words[i], update chain length
45 if is_predecessor(words[j], words[i]):
46 dp[i] = max(dp[i], dp[j] + 1)
47 # Update the maximum chain length found so far
48 max_chain_length = max(max_chain_length, dp[i])
49
50 return max_chain_length
51
1class Solution {
2 public int longestStrChain(String[] words) {
3 // Sort words by length in ascending order
4 Arrays.sort(words, Comparator.comparingInt(String::length));
5
6 // Variable to track the maximum chain length found
7 int maxChainLength = 0;
8
9 // Map to store each word and its longest chain length ending at that word
10 Map<String, Integer> wordToChainLength = new HashMap<>();
11
12 // Process each word in order of increasing length
13 for (String currentWord : words) {
14 // Initialize the chain length for current word as 1 (the word itself)
15 int currentChainLength = 1;
16
17 // Try removing each character to find potential predecessors
18 for (int i = 0; i < currentWord.length(); ++i) {
19 // Create predecessor by removing character at position i
20 String predecessor = currentWord.substring(0, i) + currentWord.substring(i + 1);
21
22 // Update chain length if this predecessor exists and gives a longer chain
23 currentChainLength = Math.max(currentChainLength,
24 wordToChainLength.getOrDefault(predecessor, 0) + 1);
25 }
26
27 // Store the longest chain length ending at current word
28 wordToChainLength.put(currentWord, currentChainLength);
29
30 // Update the global maximum chain length
31 maxChainLength = Math.max(maxChainLength, currentChainLength);
32 }
33
34 return maxChainLength;
35 }
36}
37
1class Solution {
2public:
3 int longestStrChain(vector<string>& words) {
4 // Sort words by length in ascending order
5 sort(words.begin(), words.end(), [](const string& a, const string& b) {
6 return a.size() < b.size();
7 });
8
9 int maxChainLength = 0;
10 // Map to store the longest chain ending at each word
11 unordered_map<string, int> chainLengthMap;
12
13 // Process each word in order of increasing length
14 for (const string& currentWord : words) {
15 int currentChainLength = 1; // Minimum chain length is 1 (word itself)
16
17 // Try removing each character to find potential predecessors
18 for (int i = 0; i < currentWord.size(); ++i) {
19 // Create predecessor by removing character at position i
20 string predecessor = currentWord.substr(0, i) + currentWord.substr(i + 1);
21
22 // Update chain length if this predecessor exists in map
23 if (chainLengthMap.count(predecessor)) {
24 currentChainLength = max(currentChainLength, chainLengthMap[predecessor] + 1);
25 }
26 }
27
28 // Store the longest chain length ending at current word
29 chainLengthMap[currentWord] = currentChainLength;
30
31 // Update the global maximum chain length
32 maxChainLength = max(maxChainLength, currentChainLength);
33 }
34
35 return maxChainLength;
36 }
37};
38
1/**
2 * Finds the longest string chain where each word is formed by adding exactly one letter to the previous word
3 * @param words - Array of words to form the chain
4 * @returns The length of the longest possible word chain
5 */
6function longestStrChain(words: string[]): number {
7 // Sort words by length to process shorter words first
8 words.sort((a: string, b: string) => a.length - b.length);
9
10 // Track the maximum chain length found
11 let maxChainLength: number = 0;
12
13 // Map to store the longest chain ending at each word
14 const chainLengthMap: Map<string, number> = new Map<string, number>();
15
16 // Process each word in ascending order of length
17 for (const currentWord of words) {
18 // Initialize chain length for current word as 1 (word itself)
19 let currentChainLength: number = 1;
20
21 // Try removing each character to find potential predecessors
22 for (let i = 0; i < currentWord.length; i++) {
23 // Create predecessor by removing character at index i
24 const predecessor: string = currentWord.substring(0, i) + currentWord.substring(i + 1);
25
26 // Update chain length if this predecessor exists and forms a longer chain
27 const predecessorChainLength: number = chainLengthMap.get(predecessor) || 0;
28 currentChainLength = Math.max(currentChainLength, predecessorChainLength + 1);
29 }
30
31 // Store the longest chain length ending at current word
32 chainLengthMap.set(currentWord, currentChainLength);
33
34 // Update global maximum chain length
35 maxChainLength = Math.max(maxChainLength, currentChainLength);
36 }
37
38 return maxChainLength;
39}
40
Time and Space Complexity
Time Complexity: O(n² × L)
where n
is the number of words and L
is the average length of words.
- Sorting the words array takes
O(n log n)
time - The nested loops iterate through all pairs where
i > j
, giving usO(n²)
iterations - For each pair, the
check
function is called which takesO(L)
time in the worst case, where it needs to compare characters of two strings - Therefore, the overall time complexity is
O(n log n + n² × L) = O(n² × L)
sincen² × L
dominates for typical inputs
Space Complexity: O(n)
- The
dp
array usesO(n)
space to store the longest chain length ending at each word - The sorting operation might use
O(log n)
space for the recursive call stack in typical implementations - The
check
function usesO(1)
additional space with only a few integer variables - Overall space complexity is
O(n)
as the dp array dominates the space usage
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Predecessor Validation Logic
The most common pitfall is incorrectly implementing the predecessor check function. Many implementations fail to handle edge cases properly:
Problematic Implementation:
def is_predecessor_wrong(word1, word2):
if len(word2) - len(word1) != 1:
return False
for i in range(len(word2)):
# Simply removing one character and checking equality
if word2[:i] + word2[i+1:] == word1:
return True
return False
Issue: While this works, it's inefficient with O(L²) complexity due to string slicing and comparison.
Another Common Mistake:
def is_predecessor_buggy(word1, word2):
if len(word2) - len(word1) != 1:
return False
i, j = 0, 0
while i < len(word1) and j < len(word2):
if word1[i] == word2[j]:
i += 1
j += 1
else:
j += 1 # Skip the inserted character
return i == len(word1) # Bug: doesn't track number of mismatches
Issue: This doesn't properly count mismatches. It would incorrectly accept cases where multiple characters are inserted, like considering "ac" as a predecessor of "abcd" (inserting both 'b' and 'd').
Correct Solution:
def is_predecessor(word1, word2):
if len(word2) - len(word1) != 1:
return False
i, j = 0, 0
mismatch_count = 0
while i < len(word1) and j < len(word2):
if word1[i] == word2[j]:
i += 1
else:
mismatch_count += 1
if mismatch_count > 1: # Early termination optimization
return False
j += 1
# Must have traversed all of word1 and have at most 1 mismatch
return i == len(word1) and mismatch_count <= 1
2. Forgetting to Sort Words by Length
Problematic Code:
def longestStrChain_wrong(words):
n = len(words)
dp = [1] * n
# Missing: words.sort(key=lambda x: len(x))
for i in range(1, n):
for j in range(i):
if is_predecessor(words[j], words[i]):
dp[i] = max(dp[i], dp[j] + 1)
return max(dp)
Issue: Without sorting, we might try to build chains in the wrong order. For example, if the array is ["bdca", "bda", "ba", "a"]
, we'd process "bdca" first and miss the chain "a" -> "ba" -> "bda" -> "bdca"
.
Solution: Always sort words by length before processing to ensure shorter words are processed before longer ones.
3. Using HashMap Instead of DP Array (Alternative Approach)
While using a HashMap with word-to-chain-length mapping is valid, a common mistake is not handling the deletion properly:
Problematic HashMap Approach:
def longestStrChain_hashmap_buggy(words):
words.sort(key=len)
dp = {}
for word in words:
dp[word] = 1
for i in range(len(word)):
# Create predecessor by removing character at position i
predecessor = word[:i] + word[i+1:]
if predecessor in dp:
dp[word] = max(dp[word], dp[predecessor] + 1)
return max(dp.values())
Issue: This approach works backwards (removing characters from the current word to find predecessors) which is correct, but many people confuse the direction and try to add characters to find successors, which doesn't work with the sorted order.
Better Implementation with Clear Logic:
def longestStrChain_hashmap(words):
words.sort(key=len)
dp = {}
max_length = 1
for word in words:
# Start with chain length 1 for this word
dp[word] = 1
# Try removing each character to find potential predecessors
for i in range(len(word)):
predecessor = word[:i] + word[i+1:]
if predecessor in dp:
# Extend the chain from predecessor
dp[word] = max(dp[word], dp[predecessor] + 1)
max_length = max(max_length, dp[word])
return max_length
This HashMap approach is actually more efficient (O(n × L²) vs O(n² × L)) when the number of words is large but word lengths are small.
Consider the classic dynamic programming of longest increasing subsequence:
Find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.
For example, the length of LIS for [50, 3, 10, 7, 40, 80]
is 4
and LIS is
[3, 7, 40, 80]
.
What is the recurrence relation?
Recommended Readings
Tech Interview Pattern Two Pointers Introduction If you prefer videos here's a super quick introduction to Two Pointers div class responsive iframe iframe src https www youtube com embed xZ4AfXHQ1VQ title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture allowfullscreen iframe div Two pointers is a common interview
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
Want a Structured Path to Master System Design Too? Don’t Miss This!