Facebook Pixel

139. Word Break

Problem Description

You are given a string s and a list of words called wordDict. Your task is to determine if the string s can be broken down into a sequence of words where each word exists in wordDict.

The key points to understand:

  • The string s needs to be completely segmented (no characters left over)
  • Each segment must be a valid word from wordDict
  • Words from the dictionary can be used multiple times
  • The segments, when joined together without spaces, should form the original string s

For example:

  • If s = "leetcode" and wordDict = ["leet", "code"], the answer is true because "leetcode" can be segmented as "leet" + "code"
  • If s = "applepenapple" and wordDict = ["apple", "pen"], the answer is true because it can be segmented as "apple" + "pen" + "apple"
  • If s = "catsandog" and wordDict = ["cats", "dog", "sand", "and", "cat"], the answer is false because there's no valid way to segment the entire string using the dictionary words

The solution uses dynamic programming where f[i] represents whether the substring s[0:i] can be segmented using words from the dictionary. For each position i, it checks all possible previous positions j to see if s[0:j] can be segmented (i.e., f[j] is true) and if s[j:i] is a valid word in the dictionary. If both conditions are met for any j, then f[i] is set to true.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When we need to check if a string can be broken into valid words, we face a decision-making problem at each position: where should we make the next cut? This naturally suggests a recursive approach - we could try cutting at different positions and check if both parts work out.

However, naive recursion would lead to repeated calculations. For instance, if we're checking "catsanddog", we might check "cat" + "sanddog" and also "cats" + "anddog". Both paths would eventually need to verify if "dog" at the end is valid, leading to redundant work.

This observation points us toward dynamic programming. We can build up our solution from smaller subproblems to larger ones. The key insight is: if we know whether all prefixes of the string can be segmented, we can determine if the entire string can be segmented.

Think of it this way: to know if s[0:i] can be segmented, we need to find a split point j where:

  1. The prefix s[0:j] can be segmented (we've already computed this)
  2. The remaining part s[j:i] is a valid word in our dictionary

If we can find such a split point, then s[0:i] is valid. We start with an empty string (which is trivially segmentable) and gradually build up to the full string.

The beauty of this approach is that we solve each subproblem exactly once. By the time we need to check if a longer substring can be segmented, we already have all the information about shorter substrings stored in our f array. This transforms what could be an exponential-time problem into a polynomial-time solution with O(n²) complexity, where we check all possible split points for each position.

Solution Approach

The implementation uses dynamic programming with a bottom-up approach. Let's walk through the solution step by step:

Data Structure Setup:

  • Convert wordDict to a set called words for O(1) lookup time
  • Create a boolean array f of size n+1 where f[i] represents whether s[0:i] can be segmented
  • Initialize f[0] = True (empty string is always valid) and all other positions to False

Core Algorithm:

f = [True] + [False] * n

This creates our DP array where index i corresponds to the substring s[0:i].

Building the Solution: For each position i from 1 to n:

for i in range(1, n + 1):
    f[i] = any(f[j] and s[j:i] in words for j in range(i))

This line does the heavy lifting:

  • For each position i, we check all possible split points j from 0 to i-1
  • For each split point j, we verify two conditions:
    • f[j] is True: the prefix s[0:j] can be segmented
    • s[j:i] in words: the substring from j to i is a valid dictionary word
  • If any split point satisfies both conditions, we set f[i] = True

Example Walkthrough: Let's trace through s = "leetcode" with wordDict = ["leet", "code"]:

  • f[0] = True (base case)
  • i = 1: Check if "l" is in dictionary → f[1] = False
  • i = 2: Check if "le" is in dictionary → f[2] = False
  • i = 3: Check if "lee" is in dictionary → f[3] = False
  • i = 4: Check if "leet" is in dictionary → f[4] = True (found valid word)
  • i = 5: Check substrings ending at position 5 → f[5] = False
  • i = 6: Check substrings ending at position 6 → f[6] = False
  • i = 7: Check substrings ending at position 7 → f[7] = False
  • i = 8: Check "code" (from position 4 to 8, since f[4] = True) → f[8] = True

Final Result: Return f[n], which tells us whether the entire string can be segmented.

The time complexity is O(n² × m) where n is the length of the string and m is the average length of substring checks. The space complexity is O(n) for the DP array.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with s = "catdog" and wordDict = ["cat", "dog", "catdo"].

Initial Setup:

  • Convert wordDict to set: words = {"cat", "dog", "catdo"}
  • String length n = 6
  • Create DP array: f = [True, False, False, False, False, False, False]
    • f[0] = True (empty string base case)

Step-by-step DP computation:

i = 1 (checking "c"):

  • j = 0: Check if f[0] is True AND "c" is in words
    • f[0] = True, but "c" ∉ words
  • Result: f[1] = False

i = 2 (checking "ca"):

  • j = 0: Check if f[0] is True AND "ca" is in words
    • f[0] = True, but "ca" ∉ words
  • j = 1: Check if f[1] is True AND "a" is in words
    • f[1] = False, skip
  • Result: f[2] = False

i = 3 (checking "cat"):

  • j = 0: Check if f[0] is True AND "cat" is in words
    • f[0] = True ✓ and "cat" ∈ words ✓
  • Result: f[3] = True (found valid segmentation!)

i = 4 (checking "catd"):

  • j = 0: Check if f[0] is True AND "catd" is in words
    • f[0] = True, but "catd" ∉ words
  • j = 1: Check if f[1] is True AND "atd" is in words
    • f[1] = False, skip
  • j = 2: Check if f[2] is True AND "td" is in words
    • f[2] = False, skip
  • j = 3: Check if f[3] is True AND "d" is in words
    • f[3] = True, but "d" ∉ words
  • Result: f[4] = False

i = 5 (checking "catdo"):

  • j = 0: Check if f[0] is True AND "catdo" is in words
    • f[0] = True ✓ and "catdo" ∈ words ✓
  • Result: f[5] = True (found valid segmentation!)

i = 6 (checking "catdog"):

  • j = 0: Check if f[0] is True AND "catdog" is in words
    • f[0] = True, but "catdog" ∉ words
  • j = 1 through j = 2: Skip (f[1] and f[2] are False)
  • j = 3: Check if f[3] is True AND "dog" is in words
    • f[3] = True ✓ and "dog" ∈ words ✓
  • Result: f[6] = True (found valid segmentation!)

Final DP array: f = [True, False, False, True, False, True, True]

Answer: f[6] = True, meaning "catdog" can be segmented as "cat" + "dog".

The algorithm correctly identified that the string can be broken at position 3, where "cat" (positions 0-3) is valid and "dog" (positions 3-6) is also valid.

Solution Implementation

1class Solution:
2    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
3        # Convert word list to set for O(1) lookup
4        word_set = set(wordDict)
5        n = len(s)
6      
7        # dp[i] represents whether s[0:i] can be segmented into words from dictionary
8        # dp[0] = True means empty string can always be segmented
9        dp = [True] + [False] * n
10      
11        # Build up the dp table
12        for i in range(1, n + 1):
13            # Check if s[0:i] can be segmented
14            # For each position j from 0 to i-1:
15            #   - dp[j] checks if s[0:j] can be segmented
16            #   - s[j:i] checks if the substring from j to i is in dictionary
17            # If both conditions are true for any j, then s[0:i] can be segmented
18            dp[i] = any(dp[j] and s[j:i] in word_set for j in range(i))
19      
20        # Return whether the entire string can be segmented
21        return dp[n]
22
1class Solution {
2    public boolean wordBreak(String s, List<String> wordDict) {
3        // Convert word list to HashSet for O(1) lookup
4        Set<String> wordSet = new HashSet<>(wordDict);
5      
6        // Get the length of the input string
7        int length = s.length();
8      
9        // dp[i] represents whether substring s[0...i-1] can be segmented into dictionary words
10        // dp[0] is true (empty string can always be segmented)
11        boolean[] dp = new boolean[length + 1];
12        dp[0] = true;
13      
14        // Iterate through each position in the string
15        for (int endIndex = 1; endIndex <= length; endIndex++) {
16            // Check all possible starting positions for the current ending position
17            for (int startIndex = 0; startIndex < endIndex; startIndex++) {
18                // If substring s[0...startIndex-1] can be segmented (dp[startIndex] is true)
19                // AND substring s[startIndex...endIndex-1] exists in the dictionary
20                // Then substring s[0...endIndex-1] can also be segmented
21                if (dp[startIndex] && wordSet.contains(s.substring(startIndex, endIndex))) {
22                    dp[endIndex] = true;
23                    break; // No need to check other starting positions
24                }
25            }
26        }
27      
28        // Return whether the entire string can be segmented
29        return dp[length];
30    }
31}
32
1class Solution {
2public:
3    bool wordBreak(string s, vector<string>& wordDict) {
4        // Convert word dictionary to hash set for O(1) lookup
5        unordered_set<string> wordSet(wordDict.begin(), wordDict.end());
6      
7        int n = s.size();
8      
9        // dp[i] represents whether substring s[0...i-1] can be segmented into dictionary words
10        // dp[0] = true means empty string can always be segmented
11        vector<bool> dp(n + 1, false);
12        dp[0] = true;
13      
14        // Iterate through each position in the string
15        for (int i = 1; i <= n; ++i) {
16            // Check all possible split positions before index i
17            for (int j = 0; j < i; ++j) {
18                // If s[0...j-1] can be segmented (dp[j] is true)
19                // AND s[j...i-1] exists in the dictionary
20                // Then s[0...i-1] can also be segmented
21                if (dp[j] && wordSet.count(s.substr(j, i - j))) {
22                    dp[i] = true;
23                    break;  // No need to check other split positions
24                }
25            }
26        }
27      
28        // Return whether the entire string can be segmented
29        return dp[n];
30    }
31};
32
1/**
2 * Determines if a string can be segmented into words from a dictionary
3 * @param s - The input string to be segmented
4 * @param wordDict - Array of words that can be used for segmentation
5 * @returns true if the string can be segmented, false otherwise
6 */
7function wordBreak(s: string, wordDict: string[]): boolean {
8    // Convert word dictionary to a Set for O(1) lookup
9    const wordSet: Set<string> = new Set(wordDict);
10  
11    // Get the length of the input string
12    const stringLength: number = s.length;
13  
14    // Dynamic programming array where canBreak[i] represents whether
15    // substring s[0...i-1] can be segmented into dictionary words
16    const canBreak: boolean[] = new Array(stringLength + 1).fill(false);
17  
18    // Base case: empty string can always be segmented
19    canBreak[0] = true;
20  
21    // Iterate through each position in the string
22    for (let endIndex = 1; endIndex <= stringLength; ++endIndex) {
23        // Check all possible starting positions for the current substring
24        for (let startIndex = 0; startIndex < endIndex; ++startIndex) {
25            // If s[0...startIndex-1] can be segmented AND
26            // s[startIndex...endIndex-1] exists in the dictionary,
27            // then s[0...endIndex-1] can also be segmented
28            if (canBreak[startIndex] && wordSet.has(s.substring(startIndex, endIndex))) {
29                canBreak[endIndex] = true;
30                break; // No need to check other partitions for this endIndex
31            }
32        }
33    }
34  
35    // Return whether the entire string can be segmented
36    return canBreak[stringLength];
37}
38

Time and Space Complexity

Time Complexity: O(n³)

The algorithm uses dynamic programming where f[i] represents whether the substring s[0:i] can be segmented into words from the dictionary.

  • The outer loop runs from 1 to n+1, contributing O(n) iterations
  • For each position i, the inner loop (within the any() function) iterates through all possible split positions j from 0 to i-1, contributing up to O(n) iterations in the worst case
  • For each split position j, we check if s[j:i] is in the word set, which involves creating a substring of length up to O(n) and checking membership in a hash set O(1) on average
  • The substring creation s[j:i] takes O(i-j) time, which can be up to O(n) in the worst case

Therefore, the total time complexity is O(n) × O(n) × O(n) = O(n³)

Space Complexity: O(n + m·k)

  • The words set stores all words from wordDict, requiring O(m·k) space where m is the number of words and k is the average length of words
  • The DP array f has length n+1, requiring O(n) space
  • The substring operations create temporary strings, but they don't accumulate, so they only add O(n) space at most for a single substring
  • The total space complexity is O(n + m·k), which can be simplified to O(n + S) where S is the total size of the word dictionary

Common Pitfalls

1. Not Converting wordDict to a Set

One of the most common mistakes is directly using the list wordDict for membership checking instead of converting it to a set first.

Problem Code:

def wordBreak(self, s: str, wordDict: List[str]) -> bool:
    n = len(s)
    dp = [True] + [False] * n
  
    for i in range(1, n + 1):
        # Using list instead of set - O(k) lookup where k is wordDict length
        dp[i] = any(dp[j] and s[j:i] in wordDict for j in range(i))
  
    return dp[n]

Why it's problematic: Checking membership in a list takes O(k) time where k is the number of words, while set lookup is O(1). This increases time complexity from O(n²) to O(n² × k).

Solution:

word_set = set(wordDict)  # Convert to set first
# Then use word_set for O(1) lookups

2. Incorrect DP Array Initialization

Another frequent error is incorrectly sizing or initializing the DP array.

Problem Code:

def wordBreak(self, s: str, wordDict: List[str]) -> bool:
    word_set = set(wordDict)
    n = len(s)
  
    # Wrong: Creating array of size n instead of n+1
    dp = [False] * n
    dp[0] = True  # This will cause IndexError or wrong logic
  
    for i in range(1, n):  # Wrong range
        dp[i] = any(dp[j] and s[j:i] in word_set for j in range(i))
  
    return dp[n-1]  # Wrong index

Why it's problematic: The DP array needs n+1 elements because dp[i] represents whether substring s[0:i] can be segmented. We need dp[0] for the empty string base case and dp[n] for the full string.

Solution:

dp = [True] + [False] * n  # Correct: size n+1 with dp[0] = True
# Or alternatively:
dp = [False] * (n + 1)
dp[0] = True

3. Checking All Substrings Instead of Valid Splits

A subtle but performance-impacting mistake is checking substrings that can never form valid segments.

Problem Code:

def wordBreak(self, s: str, wordDict: List[str]) -> bool:
    word_set = set(wordDict)
    n = len(s)
    dp = [True] + [False] * n
  
    for i in range(1, n + 1):
        # Checking all j values even when dp[j] is False
        for j in range(i):
            if s[j:i] in word_set:  # Missing dp[j] check
                dp[i] = True
                break
  
    return dp[n]

Why it's problematic: This ignores whether the prefix s[0:j] can be segmented. Even if s[j:i] is a valid word, it doesn't matter if we can't reach position j with valid words.

Solution:

for i in range(1, n + 1):
    for j in range(i):
        if dp[j] and s[j:i] in word_set:  # Check both conditions
            dp[i] = True
            break

4. Not Optimizing with Early Termination

While the one-liner with any() is elegant, you can optimize by breaking early once a valid segmentation is found.

Optimized Solution:

def wordBreak(self, s: str, wordDict: List[str]) -> bool:
    word_set = set(wordDict)
    n = len(s)
    dp = [True] + [False] * n
  
    # Optional optimization: find max word length
    max_len = max(len(word) for word in wordDict) if wordDict else 0
  
    for i in range(1, n + 1):
        # Only check positions within max_len distance
        for j in range(max(0, i - max_len), i):
            if dp[j] and s[j:i] in word_set:
                dp[i] = True
                break  # Early termination
  
    return dp[n]

This optimization limits the search space by only checking positions within the maximum word length distance, significantly improving performance for large strings with small dictionary words.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which of these pictures shows the visit order of a depth-first search?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More