115. Distinct Subsequences
Problem Description
Given two strings s
and t
, you need to find the number of distinct subsequences of string s
that equal string t
.
A subsequence is a sequence that can be derived from the original string by deleting some (or no) characters without changing the order of the remaining characters. For example, "ace" is a subsequence of "aecace".
The task is to count how many different ways you can form string t
by selecting characters from string s
in their original order.
For example:
- If
s = "rabbbit"
andt = "rabbit"
, there are 3 distinct ways to form "rabbit" from "rabbbit" by choosing different combinations of the letter 'b'. - If
s = "babgbag"
andt = "bag"
, there are 5 distinct ways to form "bag" from "babgbag".
The problem uses dynamic programming where f[i][j]
represents the number of ways to form the first j
characters of string t
using the first i
characters of string s
.
The base case is f[i][0] = 1
for all i
, meaning there's exactly one way to form an empty string (by selecting nothing).
For each position, if the characters match (s[i-1] == t[j-1]
), we have two choices:
- Use the current character from
s
: add the count fromf[i-1][j-1]
- Skip the current character from
s
: take the count fromf[i-1][j]
If the characters don't match, we can only skip the current character from s
, so f[i][j] = f[i-1][j]
.
The final answer is f[m][n]
, where m
and n
are the lengths of strings s
and t
respectively.
Intuition
The key insight is to think about this problem as making choices at each character position. When we're trying to match string t
using characters from string s
, we're essentially asking: "How many ways can I select characters from s
to form t
?"
Let's think step by step. Imagine we're building t
character by character from s
. At any point, we've processed some portion of both strings. The critical observation is that when we encounter a character in s
, we face a decision:
-
If the current character in
s
matches the current character we need int
, we have two options:- Use this character as part of our subsequence
- Skip it and look for another matching character later
-
If the characters don't match, we have no choice but to skip the current character in
s
.
This naturally leads to a dynamic programming approach. We can build up our solution by considering smaller subproblems first. The question "How many ways can we form the first j
characters of t
using the first i
characters of s
?" can be answered by looking at smaller versions of the same problem.
The beauty of this approach is that it captures the overlapping nature of the problem. When we have multiple occurrences of the same character in s
(like multiple 'b's in "rabbbit"), each occurrence creates a different path to form t
. By systematically considering each character position and accumulating the counts, we avoid counting the same subsequence twice while ensuring we count all possible distinct subsequences.
The base case makes intuitive sense too: there's exactly one way to form an empty string from any string - by selecting nothing. This gives us our starting point to build up the solution for longer substrings.
Solution Approach
We implement the solution using a 2D dynamic programming table f[i][j]
where i
ranges from 0
to m
(length of string s
) and j
ranges from 0
to n
(length of string t
). This table stores the number of ways to form the first j
characters of t
using the first i
characters of s
.
Initialization:
- Create a 2D array
f
with dimensions(m+1) × (n+1)
, initialized with zeros - Set
f[i][0] = 1
for alli
from0
tom
, representing that there's exactly one way to form an empty string
Filling the DP Table:
We iterate through each character of s
(index i
from 1
to m
) and for each position, we iterate through each character of t
(index j
from 1
to n
).
For each cell f[i][j]
, we apply the state transition formula:
Breaking this down:
f[i-1][j]
represents not using the current characters[i-1]
, which is always an option- When
s[i-1] == t[j-1]
, we additionally addf[i-1][j-1]
, which represents using the current character to match
Implementation Details:
# Initialize base cases
for i in range(m + 1):
f[i][0] = 1
# Fill the DP table
for i, a in enumerate(s, 1): # Start from index 1
for j, b in enumerate(t, 1): # Start from index 1
f[i][j] = f[i - 1][j] # Don't use s[i-1]
if a == b: # Characters match
f[i][j] += f[i - 1][j - 1] # Also add the option of using s[i-1]
The enumerate function with starting value 1 elegantly handles the 1-based indexing for the DP table while iterating through 0-based string indices.
Result:
The final answer is stored in f[m][n]
, representing the total number of distinct subsequences of the entire string s
that equal the entire string t
.
Time and Space Complexity:
- Time Complexity:
O(m × n)
as we fill each cell of the DP table exactly once - Space Complexity:
O(m × n)
for storing the DP table
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example with s = "babgbag"
and t = "bag"
to see how the dynamic programming solution works.
We'll build a DP table where f[i][j]
represents the number of ways to form the first j
characters of t
using the first i
characters of s
.
Step 1: Initialize the table
- Create a table with dimensions (8 × 4) since len(s) = 7 and len(t) = 3
- Set
f[i][0] = 1
for all rows (one way to form empty string)
Initial table:
"" b a g "" 1 0 0 0 b 1 0 0 0 a 1 0 0 0 b 1 0 0 0 g 1 0 0 0 b 1 0 0 0 a 1 0 0 0 g 1 0 0 0
Step 2: Fill the table row by row
For i=1, s[0]='b'
:
j=1, t[0]='b'
: Characters match!f[1][1] = f[0][1] + f[0][0] = 0 + 1 = 1
j=2, t[1]='a'
: No match.f[1][2] = f[0][2] = 0
j=3, t[2]='g'
: No match.f[1][3] = f[0][3] = 0
For i=2, s[1]='a'
:
j=1, t[0]='b'
: No match.f[2][1] = f[1][1] = 1
j=2, t[1]='a'
: Match!f[2][2] = f[1][2] + f[1][1] = 0 + 1 = 1
j=3, t[2]='g'
: No match.f[2][3] = f[1][3] = 0
For i=3, s[2]='b'
:
j=1, t[0]='b'
: Match!f[3][1] = f[2][1] + f[2][0] = 1 + 1 = 2
j=2, t[1]='a'
: No match.f[3][2] = f[2][2] = 1
j=3, t[2]='g'
: No match.f[3][3] = f[2][3] = 0
Continuing this process for all positions...
Final table:
"" b a g "" 1 0 0 0 b 1 1 0 0 a 1 1 1 0 b 1 2 1 0 g 1 2 1 1 b 1 3 1 1 a 1 3 4 1 g 1 3 4 5
Step 3: Read the answer
The value at f[7][3] = 5
tells us there are 5 distinct ways to form "bag" from "babgbag".
These 5 ways correspond to:
- babgbag (using 1st 'b', 2nd 'a', 4th 'g')
- babgbag (using 1st 'b', 2nd 'a', 7th 'g')
- babgbag (using 1st 'b', 3rd 'a', 7th 'g')
- babgbag (using 3rd 'b', 6th 'a', 7th 'g')
- babgbag (using 5th 'b', 6th 'a', 7th 'g')
The key insight: at each position where characters match, we accumulate counts from two sources:
- Not using the current character (inherit from
f[i-1][j]
) - Using the current character (add
f[i-1][j-1]
)
This ensures we count all distinct subsequences exactly once.
Solution Implementation
1class Solution:
2 def numDistinct(self, s: str, t: str) -> int:
3 """
4 Count the number of distinct subsequences of s that equal t.
5 Dynamic Programming approach.
6
7 Args:
8 s: The source string
9 t: The target string to match as subsequence
10
11 Returns:
12 Number of distinct subsequences of s that equal t
13 """
14 source_len, target_len = len(s), len(t)
15
16 # Create DP table where dp[i][j] represents the number of ways
17 # to form t[0:j] using s[0:i]
18 # dp[i][j] = number of distinct subsequences of s[0:i] that equals t[0:j]
19 dp = [[0] * (target_len + 1) for _ in range(source_len + 1)]
20
21 # Base case: empty target string can be formed in exactly one way
22 # (by selecting nothing from source string)
23 for i in range(source_len + 1):
24 dp[i][0] = 1
25
26 # Fill the DP table
27 for i, source_char in enumerate(s, 1):
28 for j, target_char in enumerate(t, 1):
29 # Case 1: Don't use current character from source string
30 # Inherit the count from previous source position
31 dp[i][j] = dp[i - 1][j]
32
33 # Case 2: If characters match, we can also use current character
34 # Add the count from diagonal (both strings reduced by 1)
35 if source_char == target_char:
36 dp[i][j] += dp[i - 1][j - 1]
37
38 # Return the final result: ways to form entire t using entire s
39 return dp[source_len][target_len]
40
1class Solution {
2 /**
3 * Count the number of distinct subsequences of s that equal t.
4 * Uses dynamic programming to build up the solution.
5 *
6 * @param s The source string to find subsequences in
7 * @param t The target string to match
8 * @return The number of distinct subsequences of s that equal t
9 */
10 public int numDistinct(String s, String t) {
11 int sourceLength = s.length();
12 int targetLength = t.length();
13
14 // dp[i][j] represents the number of distinct subsequences
15 // of s[0...i-1] that equal t[0...j-1]
16 int[][] dp = new int[sourceLength + 1][targetLength + 1];
17
18 // Initialize base case: empty target string can be formed
19 // in exactly one way from any source string (by selecting nothing)
20 for (int i = 0; i <= sourceLength; i++) {
21 dp[i][0] = 1;
22 }
23
24 // Fill the dp table
25 for (int i = 1; i <= sourceLength; i++) {
26 for (int j = 1; j <= targetLength; j++) {
27 // Case 1: Don't use the current character from source string
28 // The count remains the same as without this character
29 dp[i][j] = dp[i - 1][j];
30
31 // Case 2: If characters match, we can also use the current character
32 // Add the count of subsequences formed by matching both characters
33 if (s.charAt(i - 1) == t.charAt(j - 1)) {
34 dp[i][j] += dp[i - 1][j - 1];
35 }
36 }
37 }
38
39 // Return the final result: number of subsequences of entire s that equal entire t
40 return dp[sourceLength][targetLength];
41 }
42}
43
1class Solution {
2public:
3 int numDistinct(string s, string t) {
4 int sourceLen = s.size();
5 int targetLen = t.size();
6
7 // dp[i][j] represents the number of distinct subsequences of s[0...i-1] that equals t[0...j-1]
8 // Using unsigned long long to handle potential overflow for large results
9 unsigned long long dp[sourceLen + 1][targetLen + 1];
10 memset(dp, 0, sizeof(dp));
11
12 // Base case: empty target string can be matched by any source string in exactly one way (delete all characters)
13 for (int i = 0; i <= sourceLen; ++i) {
14 dp[i][0] = 1;
15 }
16
17 // Fill the dp table
18 for (int i = 1; i <= sourceLen; ++i) {
19 for (int j = 1; j <= targetLen; ++j) {
20 // Case 1: Skip the current character in source string
21 // The number of ways remains the same as without this character
22 dp[i][j] = dp[i - 1][j];
23
24 // Case 2: If characters match, we can also use this character
25 // Add the number of ways when both strings exclude current characters
26 if (s[i - 1] == t[j - 1]) {
27 dp[i][j] += dp[i - 1][j - 1];
28 }
29 }
30 }
31
32 // Return the number of distinct subsequences of entire s that equals entire t
33 return dp[sourceLen][targetLen];
34 }
35};
36
1/**
2 * Counts the number of distinct subsequences of string s that equal string t
3 * Uses dynamic programming approach
4 *
5 * @param s - The source string to search for subsequences
6 * @param t - The target string to match
7 * @returns The number of distinct subsequences of s that equal t
8 */
9function numDistinct(s: string, t: string): number {
10 const sourceLength: number = s.length;
11 const targetLength: number = t.length;
12
13 // Create a 2D DP table where dp[i][j] represents the number of distinct subsequences
14 // of s[0...i-1] that equal t[0...j-1]
15 const dp: number[][] = new Array(sourceLength + 1)
16 .fill(0)
17 .map(() => new Array(targetLength + 1).fill(0));
18
19 // Base case: empty target string can be formed by any source string in exactly one way
20 // (by selecting nothing)
21 for (let i = 0; i <= sourceLength; i++) {
22 dp[i][0] = 1;
23 }
24
25 // Fill the DP table
26 for (let i = 1; i <= sourceLength; i++) {
27 for (let j = 1; j <= targetLength; j++) {
28 // Case 1: Don't use the current character from source string
29 // Inherit the count from previous source position
30 dp[i][j] = dp[i - 1][j];
31
32 // Case 2: If characters match, we can also use the current character
33 // Add the count from diagonal (both strings reduced by one character)
34 if (s[i - 1] === t[j - 1]) {
35 dp[i][j] += dp[i - 1][j - 1];
36 }
37 }
38 }
39
40 // Return the final result: number of ways to form entire target from entire source
41 return dp[sourceLength][targetLength];
42}
43
Time and Space Complexity
Time Complexity: O(m * n)
, where m
is the length of string s
and n
is the length of string t
. The algorithm uses two nested loops that iterate through all characters of both strings, resulting in m * n
iterations. Each iteration performs constant time operations (array access and addition).
Space Complexity: O(m * n)
for the current implementation, which uses a 2D array f
of size (m + 1) × (n + 1)
to store the dynamic programming states.
Space Optimization: As noted in the reference answer, since the calculation of f[i][j]
only depends on values from the previous row (f[i-1][j]
and f[i-1][j-1]
), the space complexity can be optimized to O(n)
by using a 1D array instead of a 2D array. This would involve maintaining only the current and previous rows' values, or using a single row with careful updating from right to left or using temporary variables.
Common Pitfalls
1. Integer Overflow in Languages with Fixed Integer Size
The Pitfall:
When dealing with large strings, the number of distinct subsequences can grow exponentially. For instance, if s
contains many repeated characters that match characters in t
, the count can become extremely large. In languages like Java or C++ with fixed integer sizes (int: 32-bit, long: 64-bit), this can lead to integer overflow, producing incorrect results.
Example scenario:
s = "a" * 1000 # 1000 'a's t = "a" * 100 # 100 'a's # The result would be C(1000, 100) which is astronomically large
Solution:
- In Python, integers have arbitrary precision, so overflow isn't an issue
- In Java/C++, use appropriate data types (long long in C++, Long in Java)
- Some problems may ask for the result modulo a prime number (e.g., 10^9 + 7)
- If modulo is required, apply it at each step:
MOD = 10**9 + 7 if source_char == target_char: dp[i][j] = (dp[i][j] + dp[i - 1][j - 1]) % MOD
2. Incorrect Base Case Initialization
The Pitfall:
A common mistake is forgetting to initialize dp[0][0] = 1
or only initializing dp[i][0] = 1
for i > 0
, missing the case when both strings are empty. This leads to incorrect calculations throughout the entire DP table.
Incorrect initialization:
# Wrong: Starting from 1 instead of 0
for i in range(1, source_len + 1):
dp[i][0] = 1 # Misses dp[0][0] = 1
Solution: Always initialize from index 0:
# Correct: Include all rows from 0 to source_len
for i in range(source_len + 1):
dp[i][0] = 1
3. Off-by-One Errors in Index Mapping
The Pitfall:
Confusion between DP table indices (1-based) and string indices (0-based) often leads to accessing wrong characters. Using s[i]
instead of s[i-1]
when i
represents the DP table index is a frequent error.
Incorrect character access:
# Wrong: Using i and j directly as string indices if s[i] == t[j]: # Should be s[i-1] == t[j-1] dp[i][j] += dp[i - 1][j - 1]
Solution: Use enumerate with start=1 for cleaner code:
# Clear and correct: enumerate handles the offset
for i, source_char in enumerate(s, 1):
for j, target_char in enumerate(t, 1):
if source_char == target_char:
dp[i][j] += dp[i - 1][j - 1]
4. Space Optimization Pitfall
The Pitfall: When optimizing space from O(m×n) to O(n) using a 1D array, updating values in the wrong order can overwrite values needed for subsequent calculations.
Incorrect space optimization:
# Wrong: Left-to-right iteration overwrites needed values
dp = [1] + [0] * target_len
for source_char in s:
for j in range(1, target_len + 1): # Wrong direction!
if source_char == t[j - 1]:
dp[j] += dp[j - 1] # dp[j-1] might be already updated
Solution: Iterate from right to left when using 1D array:
# Correct: Right-to-left preserves needed values
dp = [1] + [0] * target_len
for source_char in s:
for j in range(target_len, 0, -1): # Reverse iteration
if source_char == t[j - 1]:
dp[j] += dp[j - 1] # dp[j-1] still has value from previous row
How many times is a tree node visited in a depth first search?
Recommended Readings
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!