Leetcode 940. Distinct Subsequences II

Problem Explanation:

Given a string S, the task is to find the number of distinct, non-empty subsequences of S. A subsequence is defined as a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements. For instance, the word 'cat' has subsequences like 'c', 'a', 'ca', 'at', 'cat' and so on.

Remember that the result may be a very large number, so you should return the answer modulo 10^9 + 7.

For example, if the input string S is "abc", then the output will be 7. The distinct subsequences are "a", "b", "c", "ab", "ac", "bc", and "abc".

Solution Approach:

Let's take the string "abc". It has a total of 7 distinct subsequences. But if we add a new letter "d" making the string "abcd", the total number of distinct subsequences will be double of the previous total plus one (for the new letter itself). In short, each new letter in the string generates twice the number of previous subsequences, plus one for itself.

But what if there are repeated letters? In this case, we need to make sure we do not count the duplicate subsequences. To handle this situation, we maintain an array endsWith, where endsWith[i] keeps the count of subsequences ending with char(i + 'a').

For every character in the string, we double the total subsequences (for the previous subsequences plus the new ones generated by appending this character) and subtract the count of previous subsequences ending with this character (to remove duplicate subsequences), finally adding 1 for the character itself.

Python Solution:

3class Solution:
4    def distinctSubseqII(self, s: str) -> int:
5        modulo = 10**9 + 7
6        # create a array with length 26 to keep track of each character in the string
7        dp = [0]*26
8        # iterate over each character in the string
9        for c in s:
10            # add the subsequences generated by the current character and subtract the previous count of this character (to remove duplicate subsequences)
11            dp[ord(c) - ord('a')] = sum(dp) * 2 - dp[ord(c) - ord('a')] + 1
12        # finally take modulo to handle large numbers and return the sum of all distinct subsequences
13        return sum(dp) % modulo

Java Solution:

3class Solution {
4    public int distinctSubseqII(String s) {
5        long[] endsWith = new long[26];
6        long mod = (long)1e9 + 7;
8        for (char c : s.toCharArray()) {
9            endsWith[c-'a'] = Arrays.stream(endsWith).sum()* 2 - endsWith[c-'a'] + 1;
10            endsWith[c-'a'] %= mod;
11        }
12        long res = Arrays.stream(endsWith).sum();
13        return (int)(res % mod);
14    }

JavaScript Solution:

3var distinctSubseqII = function(s) {
4    let dp = Array(26).fill(0);
5    const mod = 1e9+7;
6    for(let c of s) {
7        index = c.charCodeAt() - 97;
8        dp[index] = dp.reduce((a,b) => a+b) * 2 - dp[index] + 1;
9        dp[index] %= mod;
10    }
11    return dp.reduce((a,b) => a+b) % mod;

C++ Solution:

3class Solution {
5    int distinctSubseqII(string s) {
6        const int mod = 1e9+7;
7        vector<int> endsWith(26, 0);
8        for (char c : s) {
9            endsWith[c - 'a'] = accumulate(endsWith.begin(), endsWith.end(), 1L) % mod;
10        }
11        return accumulate(endsWith.begin(), endsWith.end(), 0L) % mod;
12    }

C# Solution:

3public class Solution {
4    public int DistinctSubseqII(string s) {
5        int[] dp = new int[26];
6        int mod = (int) Math.Pow(10, 9) + 7;
7        for(int i = 0; i < s.Length; i++){
8            dp[s[i] - 'a'] = (dp.Sum() * 2 - dp[s[i] - 'a'] + 1) %(mod);
9        }
10        return (dp.Sum() % mod + mod) % mod;
11    }

This is a great problem to study when preparing for coding interviews, as it involves understanding and applying concepts from string manipulation and dynamic programming.

In conclusion, it is apparent that the most efficient way to approach this problem is by using dynamic programming. We represent the problem in terms of smaller sub-problems by keeping track of the number of distinct subsequences that end with each character as we iterate through the string. This approach allows us to efficiently calculate the total number of subsequences for any length of the string by leveraging the results of smaller sub-problems.

This problem can be adapted in various ways - for example, finding the number of distinct subsequences that start with a particular character, or finding the number of distinct subsequences within a certain length range. By understanding the core approach of this problem you can more easily tackle related problems in your coding interviews.

We hope this analysis has helped you understand how to approach this problem and implement solutions in different programming languages such as Python, Java, JavaScript, C++, and C#. Happy coding!

Please note: All of the solutions mentioned above have a Time Complexity of O(N) where N is the length of the string S, and a Space Complexity of O(1) because we are using an array (endsWith or dp) of constant size (26) to keep track of distinct subsequences ending with each character.

Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.

TA 👨‍🏫