1639. Number of Ways to Form a Target String Given a Dictionary


Problem Description

In this problem, you are given a list of strings called words, where each string is of the same length, and a target string target. Your goal is to construct the target string character by character, from left to right, using characters from the strings in words. The key rules to follow while forming the target string are:

  • You may select the character at position k from the jth string in words to form the ith character of target, if and only if target[i] matches words[j][k].
  • After using the kth character from jth string of words, all characters at or to the left of position k in any string of words can no longer be used. This means you have to choose subsequent characters from positions greater than k.
  • You can repeat this process of selecting characters from words until you completely form the target string.
  • It is allowed to use multiple characters from the same string in words as long as they meet the above conditions.

The problem asks for the number of distinct ways to form the target from words. As the total number could be very large, you are required to return this number modulo 10^9 + 7.

Intuition

To solve this problem, we make use of dynamic programming or a depth-first search approach to count all possible ways to construct the target.

Firstly, we process the words array to calculate the frequency of each character at each position (count array cnt). This preprocessing step simplifies the dynamic programming or DFS implementation by giving quick access to the number of times a particular character appears in each position across all strings.

The next step is defining a recursive function, let's call it dfs(i, j), representing the number of ways to construct the substring of target starting at index i with the starting character position in words being j. Our base cases are when we've successfully formed the target string (all characters are chosen) or when we can no longer choose characters from words (out of bounds).

The transitions depend on our choices at each step:

  • We can decide not to pick a character at position j in any string of words, in which case we simply move to the next character position in words (dfs(i, j + 1)).
  • Alternatively, we can pick the character if it matches the current target character we want to form, and then move to the next character in both target and words, while also multiplying by the count of the selected character (dfs(i + 1, j + 1) * count).

Finally, we initiate our search/recursion from the beginning of the target string and the words array (dfs(0, 0)), and continuously add the number of ways at each new choice we can make.

In the case of dynamic programming, we employ a bottom-up approach to store and reuse the results of subproblems, using a 2D table f[i][j] to represent the number of ways to build up to the ith character of target considering characters upto jth position in the strings of words.

Overall, the problem is approached via breaking down the complex task of string construction into smaller subproblems, calculating the number of ways for each subproblem, and then using these to build up to the final solution.

Learn more about Dynamic Programming patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece

A person thinks of a number between 1 and 1000. You may ask any number questions to them, provided that the question can be answered with either "yes" or "no".

What is the minimum number of questions you needed to ask so that you are guaranteed to know the number that the person is thinking?

Solution Approach

The problem lends itself nicely to a depth-first search (DFS) with memoization or dynamic programming to avoid redundant computations. In both the recursive (DFS) and iterative (dynamic programming) approach, a crucial step is preprocessing, and an essential data structure used is the count array cnt.

Preprocessing

Before we dive into building the target string, we calculate the frequency of each character at every position across all strings in words. We create a 2D array cnt of size n × 26, where n is the length of the strings in words. cnt[j][c] keeps track of how many times the character c appears in the jth position in all strings of words.

Recursive DFS with Memoization

In the DFS approach, we define a function dfs(i, j) that computes the number of ways to construct substring target[i...] given that we are currently looking to match characters starting from position j in the strings of words.

The recursion has two base cases:

  • When i >= m, which means the entire target is formed, the number of ways is 1.
  • When j >= n, which implies we can't select more characters from words and haven't formed the target, the number of ways is 0.

For other cases, we have two choices:

  • We can opt not to choose a character from position j in words, so we explore further with dfs(i, j + 1).
  • If we pick the character, we use the count of that character at position j and proceed to the next characters in both target and words (dfs(i + 1, j + 1) * cnt[j][target[i] - 'a']).

The results of these calls are added together to get the final answer for dfs(i, j) and memoized to optimize overlapping subproblems. The modulo operation ensures the answer does not overflow the limits.

Dynamic Programming

In dynamic programming, we also take advantage of the count array cnt. The dp array f[i][j] represents the number of ways to construct the first i characters of target while restricted to using up to the first j characters from each string in words.

We initialize f[0][j] = 1 for all 0 ≀ j ≀ n because there's always one way to construct an empty target regardless of words.

The dp transitions are straightforward: for each cell f[i][j], we consider two cases:

  • Not selecting the j-th character from words: we just carry over the previous count f[i][j - 1].
  • Selecting the j-th character: the number of ways increases by the number of ways to make the prefix target[...i - 1] while considering characters only up to j - 1th position multiplied by the count of the target[i - 1] at this position cnt[j - 1][target[i - 1] - 'a'].

The total number of ways for f[i][j] is the sum of the counts from the two cases, and we must again apply the modulo here. Once all cells are calculated, f[m][n] will contain the answer.

Both recursive and dynamic programming solutions ensure that we do not perform the same calculations repeatedly, and both require O(m * n) time and space, where m is the length of the target and n is the length of the strings in words.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

What's the relationship between a tree and a graph?

Example Walkthrough

Let's go through an example to illustrate the solution approach described.

Suppose we have the following inputs:

  • words = ["ax", "by", "cz"]
  • target = "abc"

Our target is "abc", and our words list contains strings of the same length, i.e., 2.

Preprocessing Step

Firstly, we create the count array cnt from words. Since all strings in words are of length 2, we will have a 2 by 26 (number of letters in the alphabet) count matrix.

cnt would be as follows:

  • For the first position (index 0) in our strings:
    • a appears once (in "ax"),
    • b appears once (in "by"),
    • c appears once (in "cz").
  • For the second position (index 1) in our strings:
    • x, y, and z each appear once.

Our cnt array after processing:

1cnt[0][0] = 1  // 'a' at position 0
2cnt[0][1] = 1  // 'b' at position 0
3cnt[0][2] = 1  // 'c' at position 0
4...            // other letters at position 0 would be 0
5cnt[1][23] = 1 // 'x' at position 1
6cnt[1][24] = 1 // 'y' at position 1
7cnt[1][25] = 1 // 'z' at position 1
8...            // other letters at position 1 would be 0

Recursive DFS with Memoization

We define our dfs function to begin constructing the target.

For target[0] = 'a', the DFS explores:

  • dfs(1, 1) since 'a' can be chosen from words[0] = "ax" at position 0. The count is cnt[0][0] = 1.

Next, for target[1] = 'b', from the new starting point j = 1, the DFS finds:

  • dfs(2, 2) this time choosing 'b' from words[1] = "by" at position 1 (since 'b' doesn't appear at position 1 in "ax"). The count is cnt[1][1] = 1.

Finally, for target[2] = 'c':

  • We would choose 'c' from words[2] = "cz" at position 2, which again has a count of cnt[2][2] = 1.

Since each step gives us a count of 1, and we can form the target with one way at each step, the DFS would end with a total of 1 way to create the target "abc".

Dynamic Programming

Using dynamic programming, we construct a DP table f that has m rows (target length + 1) and n columns (words string length + 1).

The base case is f[0][j] = 1 for all j, indicating that there is one way to construct an empty target.

The table is filled according to the discussed transitions. By examining choices at each step, the DP table for our example would look like:

1f[0][0] = 1
2f[0][1] = 1
3f[0][2] = 1
4
5From the transitions, we would find:
6f[1][1] = f[0][0] * cnt[0]['a' - 'a'] = 1
7f[1][2] = f[1][1]                  = 1
8f[2][2] = f[1][1] * cnt[1]['b' - 'a'] = 1
9

Since f[2][2] gives the number of ways to form "ab" using up to the second position in strings of words and cnt[2]['c' - 'a'] = 1, the final count f[3][3] for forming the entire "abc" target will be f[2][2] * cnt[2]['c' - 'a'] = 1 * 1 = 1.

Hence, there is 1 distinct way to form target = "abc" using the strings in words. This result matches our DFS calculation.

Solution Implementation

1from typing import List
2from functools import lru_cache
3
4class Solution:
5    def numWays(self, words: List[str], target: str) -> int:
6        # Define a dynamic programming function with memoization using LRU (Least Recently Used) cache
7        @lru_cache(maxsize=None)
8        def dfs(target_index: int, word_index: int) -> int:
9            # Base case: if all characters of the target are matched, return 1
10            if target_index >= target_length:
11                return 1
12            # Base case: if end of the words character columns are reached, return 0
13            if word_index >= num_columns:
14                return 0
15            # Count the ways if including the current character from the target
16            count_including = dfs(target_index + 1, word_index + 1) * char_frequency[word_index][ord(target[target_index]) - ord('a')]
17            # Count the ways if not including the current character from the target
18            count_excluding = dfs(target_index, word_index + 1)
19            # Add the ways from including and excluding the current character and take modulo
20            answer = (count_including + count_excluding) % modulo
21
22            return answer
23
24        # Calculate the length of target and the number of columns from the first word (all words have the same length by assumption)
25        target_length = len(target)
26        num_columns = len(words[0])
27        modulo = 10**9 + 7  # Define the modulo constant
28
29        # Initialize character frequency matrix with 26 columns for each alphabet character and rows equalling the number of columns in words
30        char_frequency = [[0] * 26 for _ in range(num_columns)]
31        # Fill the character frequency matrix with the frequency of each character at each column in the words
32        for word in words:
33            for column_index, char in enumerate(word):
34                char_frequency[column_index][ord(char) - ord('a')] += 1
35
36        # Start the DFS traversal with the first character of the target and the first column of the words
37        return dfs(0, 0)
38
1class Solution {
2    private int targetLength;   // Length of the target string
3    private int wordLength;     // Length of the words
4    private String target;      // The target string
5    private Integer[][] memo;   // Memoization array
6    private int[][] charCount;  // Counts of each character at each position across all words
7    private final int MOD = (int) 1e9 + 7; // The modulus value for avoiding overflow
8
9    public int numWays(String[] words, String target) {
10        targetLength = target.length();
11        wordLength = words[0].length();
12        memo = new Integer[targetLength][wordLength];
13        this.target = target;
14        charCount = new int[wordLength][26]; // 26 for 'a' to 'z'
15      
16        // Count the occurrence of each character in each column
17        for (String word : words) {
18            for (int j = 0; j < wordLength; ++j) {
19                charCount[j][word.charAt(j) - 'a']++;
20            }
21        }
22      
23        // Start the depth-first search from the first character of target 
24        // and first character position in words
25        return dfs(0, 0);
26    }
27
28    private int dfs(int i, int j) {
29        // If we have matched all characters of target return 1 way to match
30        if (i >= targetLength) {
31            return 1;
32        }
33        // If we run out of positions in words before matching target, return 0 ways
34        if (j >= wordLength) {
35            return 0;
36        }
37        // If we have already computed this state, return its value
38        if (memo[i][j] != null) {
39            return memo[i][j];
40        }
41      
42        // Case 1: Skip the current position in words and move to the next
43        long ans = dfs(i, j + 1);
44      
45        // Case 2: Use the current position in words if it matches the current character in target
46        ans += (long) dfs(i + 1, j + 1) * charCount[j][target.charAt(i) - 'a'];
47
48        // Modulus operation to avoid integer overflow
49        ans %= MOD;
50      
51        // Save and return the computed value
52        return memo[i][j] = (int) ans;
53    }
54}
55
1// Class to solve the problem of counting the number of ways to form a target string from a vertical sequence of characters in a list of strings
2class Solution {
3public:
4    int numWays(vector<string>& words, string target) {
5        const int MOD = 1e9 + 7; // Define the modulus to keep the numbers within the integer limit
6        int targetLength = target.size(), wordLength = words[0].size(); // Size of the target string and length of each word in the array
7        vector<vector<int>> count(wordLength, vector<int>(26)); // 2D vector to store the count of each character at each position
8      
9        // Loop to count the frequency of each character at each column in the word list
10        for (auto& word : words) {
11            for (int j = 0; j < wordLength; ++j) {
12                ++count[j][word[j] - 'a']; // Increment the count of the character at the relevant position
13            }
14        }
15
16        // DP array initialized with -1 to indicate that the value hasn't been computed yet
17        int dp[targetLength][wordLength];
18        memset(dp, -1, sizeof(dp));
19      
20        // Recursive lambda function to compute the number of ways using Depth First Search (DFS)
21        function<int(int, int)> dfs = [&](int i, int j) -> int {
22            if (i >= targetLength) {
23                return 1; // If the whole target is found, return 1 way
24            }
25            if (j >= wordLength) {
26                return 0; // If the end of the word is reached, no more ways can be found, return 0
27            }
28            if (dp[i][j] != -1) {
29                return dp[i][j]; // If the value is already computed, return the cached result
30            }
31            int ways = dfs(i, j + 1); // Recursive call to check for ways without using the current position
32            ways = (ways + 1LL * dfs(i + 1, j + 1) * count[j][target[i] - 'a']) % MOD; // Add the ways using the current character and proceed
33            return dp[i][j] = ways; // Store the computed ways in the DP array for memoization
34        };
35      
36        // Call the DFS function starting from the first character of the target and the word
37        return dfs(0, 0);
38    }
39};
40
1function numWays(words: string[], target: string): number {
2    // m is the length of the target string.
3    const targetLength = target.length;
4    // n is the length of the words assuming all words have the same length.
5    const wordLength = words[0].length;
6    // f is the dynamic programming table with dimensions (targetLength+1) x (wordLength+1).
7    const dp = new Array(targetLength + 1).fill(0).map(() => new Array(wordLength + 1).fill(0));
8    // mod is the value for modulo operation to avoid integer overflow.
9    const MOD = 1e9 + 7;
10
11    // Initialize the first row of the dp table to 1s.
12    for (let j = 0; j <= wordLength; ++j) {
13        dp[0][j] = 1;
14    }
15
16    // cnt is a 2D array that holds the counts of each character at each position in words.
17    const charCount = new Array(wordLength).fill(0).map(() => new Array(26).fill(0));
18  
19    // Populate charCount with the frequency of each character at each position.
20    for (const word of words) {
21        for (let j = 0; j < wordLength; ++j) {
22            ++charCount[j][word.charCodeAt(j) - 'a'.charCodeAt(0)];
23        }
24    }
25
26    // Build the dp table using the character frequencies.
27    for (let i = 1; i <= targetLength; ++i) {
28        for (let j = 1; j <= wordLength; ++j) {
29            dp[i][j] = (
30                dp[i][j - 1] + dp[i - 1][j - 1] * charCount[j - 1][target.charCodeAt(i - 1) - 'a'.charCodeAt(0)]
31            ) % MOD;
32        }
33    }
34
35    // The bottom-right cell of the dp table contains the number of ways to form the target.
36    return dp[targetLength][wordLength];
37}
38
Not Sure What to Study? Take the 2-min Quiz

Which of these pictures shows the visit order of a depth-first search?

Time and Space Complexity

The time complexity of the given code is O(m * n * k), where m is the length of the target string, n is the length of the strings in the words array, and k is the number of recursive calls made, which does not exceed the length of the target string. Each function call iterates through the length of words[0], and there are m possible values for i and n possible values for j. The memoization through @cache ensures that each (i, j) pair's computation is done only once.

The space complexity is O(m * n) for the memoization cache, which stores a result for each pair (i, j). Since there are m possible values for i and n possible values for j, we have m * n pairs. Additionally, the cnt array uses O(n * 26) space, which simplifies to O(n) since 26 is a constant. As a result, the space complexity does not change due to the cnt array. The overall space complexity includes the stack space used by recursion calls, which could be O(m) in the worst case. Hence, the total space complexity is O(m * n) when considering the memoization and recursion stack space, where n is the number of indices in each word and m is the number of characters in the target.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Consider the classic dynamic programming of fibonacci numbers, what is the recurrence relation?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.

←
↑TA đŸ‘šâ€đŸ«