# 2842. Count K-Subsequences of a String With Maximum Beauty

## Problem Description

In this problem, we are given a string `s` and an integer `k`. Our goal is to find the number of subsequences of length `k` from `s`, such that all characters in the subsequence are unique, and the sum of the occurrences of these characters in `s` is the maximum possible. This sum is known as the "beauty" of the subsequence.

A `k-subsequence` is defined as a subsequence of `s` that has exactly `k` characters, which are all unique. We calculate the beauty of a k-subsequence by summing the number of times each of its characters appears in the original string `s` (not in the subsequence itself). For example, if `s ="abcb"`, the beauty of the subsequence "ac" would be `f('a') + f('c')`, which is `1 + 1 = 2` in this case.

The problem asks us to return the count of all `k-subsequences` that have the maximum beauty amongst all possible `k-subsequences`. The result should be returned modulo `10^9 + 7` to handle potentially large numbers.

## Intuition

To solve this problem effectively, we note a couple of key points:

1. The beauty of a `k-subsequence` depends solely on the characters included in it, not their order. Hence, we want to select the `k` characters that have the highest frequencies in `s`.

2. Once we determine the `k` highest frequency characters, the problem boils down to counting all unique combinations of these characters that form valid `k-subsequences`.

Now, let's build on this intuition:

1. We start by counting the frequency of each character in `s` using a Counter.

2. Next, if there are fewer than `k` unique characters in the string, there cannot be any valid `k-subsequences`, so we return `0`.

3. We sort the frequencies in descending order because we are interested in the `k` characters with the highest frequencies.

4. The beauty of any `k-subsequence` is at least the `k`-th highest frequency (represented by `val` in the solution). Thus, we find how many characters have this `k`-th highest frequency (`x`).

5. The solution then calculates the number of ways we can pick the remaining `k-1` characters from those that have a higher frequency than `val`.

6. Finally, it uses the combinatorial function to calculate the number of ways to choose the `k` characters with the `k`-th highest frequency `val`. It multiplies this with the previously calculated value and takes the whole product modulo `10^9 + 7`.

By the end of these steps, we have the total count of maximum beauty k-subsequences modulo `10^9 + 7`.

## Solution Approach

The solution to the problem implements the following steps, utilizing a few Python-specific features and utility functions:

1. Counting Character Frequencies:

• The solution uses `Counter` from Python's `collections` library to count the occurrences of each character in the string `s`. This gives us a dictionary where keys are characters and values are their respective frequencies.
2. Shortcut for Impossibility:

• If the unique character count is less than `k`, we cannot form a `k-subsequence`, so we return 0.
3. Sorting and Selecting Frequencies:

• We sort the frequencies in descending order to focus on the highest frequencies first. The `k-th` highest frequency is found at index `k - 1` because of zero-based indexing. This value is important because it sets the minimum frequency a character must have to be considered for the maximum beauty.
4. Calculating Combinations for Choosing Characters:

• To use the characters with higher frequencies than the `k-th` highest frequency, we multiply the ways of picking these characters cumulatively. For each such character, we multiply `ans` by its frequency modulo `mod`.
5. Handling Characters with Equal Frequency to the `k-th`:

• Next, we find how many characters, `x`, have the same frequency as the `k-th` highest frequency `val`. We need to choose `k` from these `x` characters. However, as some characters with higher frequency have already been chosen (`k` is decremented accordingly), we must choose the remaining from these `x` characters.
6. Calculations Involving Modular Arithmetic:

• We calculate `comb(x, k)` - the number of ways to choose `k` remaining characters from `x` characters with the exact frequency `val`.
• The `pow(val, k, mod)` calculates `val^k` (val to the power of k) modulo `mod`, representing multiplying the beauty combination by the frequency `val`, raised to the power of how many characters we are choosing (`k`).
7. Final Calculation and Modulo Operation:

• The final answer is calculated by multiplying these values and taking the result modulo `mod` (`10^9 + 7`) to handle large numbers according to the problem statement.

Here's how the solution implements these steps:

``````1class Solution:
2    def countKSubsequencesWithMaxBeauty(self, s: str, k: int) -> int:
3        f = Counter(s) # step 1
4        if len(f) < k: # step 2
5            return 0
6        mod = 10**9 + 7
7        vs = sorted(f.values(), reverse=True) # step 3
8        val = vs[k - 1] # this gives us the k-th highest frequency
9        x = vs.count(val) # step 5
10        ans = 1
11        for v in vs: # step 4
12            if v == val:
13                break
14            k -= 1
15            ans = ans * v % mod
16        ans = ans * comb(x, k) * pow(val, k, mod) % mod # step 6 and 7
17        return ans``````

This approach efficiently calculates the number of `k-subsequences` with maximum beauty by combining frequency analysis, sorting, combinatorial mathematics, and modular arithmetic.

### Example Walkthrough

Let us consider a small example to illustrate the solution approach. Assume we have the string `s = "aabbc"` and `k = 2`.

Following the steps of the solution:

1. Counting Character Frequencies:

• Using `Counter(s)`, we determine that the frequency of 'a' is 2, 'b' is 2, and 'c' is 1. We have our frequency dictionary: `{'a': 2, 'b': 2, 'c': 1}`.
2. Shortcut for Impossibility:

• There are more than `k` unique characters; hence a `k-subsequence` is possible.
3. Sorting and Selecting Frequencies:

• We sort the frequencies in descending order: `[2, 2, 1]`.
• The `k-th` highest frequency is at index `k - 1`, which is `1`. So, `val = vs[1] = 2`.
4. Calculating Combinations for Choosing Characters:

• As the first two frequencies are equal and are the highest, we have no higher frequency characters to choose from. Therefore, `ans` remains 1.
5. Handling Characters with Equal Frequency to the `k-th`:

• There are two characters ('a' and 'b') with the same frequency as `val`, so `x = 2`.
• Since we have no characters with a higher frequency, we want to choose all `k` characters from these `x` characters with the highest frequency, meaning we want to choose 2 characters from a set of 2 characters.
6. Calculations Involving Modular Arithmetic:

• We calculate `comb(x, k)` which is `comb(2, 2)`. Since choosing 2 characters out of 2 is only possible in one way, `comb(2, 2) = 1`.
• The `pow(val, k, mod)` term is basically `2^2 % mod` which equals 4 modulo `10^9 + 7`.
7. Final Calculation and Modulo Operation:

• The final answer is `ans * comb(x, k) * pow(val, k, mod) % mod`, which simplifies to `1 * 1 * 4 % mod` = 4.

Therefore, for the string `s = "aabbc"` with `k = 2`, there are 4 `k-subsequences` with the maximum beauty. This is because the beautiful subsequences are: `"aa"`, `"ab"`, `"ba"`, and `"bb"`, all having a total frequency sum of 4 in the string `s`.

## Python Solution

``````1from collections import Counter
2from math import comb
3
4class Solution:
5    def countKSubsequencesWithMaxBeauty(self, string: str, k: int) -> int:
6        # Create a frequency counter for all characters in the string
7        frequency_counter = Counter(string)
8
9        # If there are fewer types of characters than k, return 0
10        if len(frequency_counter) < k:
11            return 0
12
13        # Define modulo for result to prevent integer overflow
14        mod = 10**9 + 7
15
16        # Sort the frequencies in descending order
17        sorted_frequencies = sorted(frequency_counter.values(), reverse=True)
18
19        # Get the k-th largest frequency, this will be the max beauty
20        max_beauty = sorted_frequencies[k - 1]
21
22        # Count how many times this frequency (max beauty) appears
23        count_of_max_beauty = sorted_frequencies.count(max_beauty)
24
25        # Initialize answer as 1 (multiplicative identity)
27
28        # Multiplying the count of characters by which we can not form subsequences (as they are
29        # more frequent than the k-th frequent character) modulo mod to keep the number manageable
30        for freq in sorted_frequencies:
31            if freq == max_beauty:
32                # Once we reach characters with the max beauty value, stop
33                break
34            k -= 1
36
37        # Multiply by the number of combinations of choosing k characters from 'count_of_max_beauty'
38        # and the max_beauty^(k) to calculate the beauty value, then take the modulo of the result
39        answer = answer * comb(count_of_max_beauty, k) * pow(max_beauty, k, mod) % mod
40
41        # Return the final answer
43``````

## Java Solution

``````1import java.util.Arrays;
2
3class Solution {
4    private static final int MOD = (int) 1e9 + 7;
5
6    public int countKSubsequencesWithMaxBeauty(String s, int k) {
7        int[] frequency = new int[26];
8        int n = s.length();
9        int uniqueCharCount = 0;
10
11        // Count frequency of each character in the string s
12        for (int i = 0; i < n; i++) {
13            if (++frequency[s.charAt(i) - 'a'] == 1) {
14                uniqueCharCount++;
15            }
16        }
17
18        // If there are not enough unique characters to form a subsequence of length k
19        if (uniqueCharCount < k) {
20            return 0;
21        }
22
23        Integer[] values = new Integer[uniqueCharCount];
24        for (int i = 0, j = 0; i < 26; i++) {
25            if (frequency[i] > 0) {
26                values[j++] = frequency[i];
27            }
28        }
29
30        // Sort the frequency values in descending order
31        Arrays.sort(values, (a, b) -> b - a);
32
34        int minValue = values[k - 1];
35        int minFreqCount = 0;
36
37        // Calculate the frequency of the k-th largest unique character
38        for (int v : values) {
39            if (v == minValue) {
40                minFreqCount++;
41            }
42        }
43
44        // Calculate product of first k-1 largest frequencies
45        for (int v : values) {
46            if (v == minValue) {
47                break;
48            }
49            k--;
51        }
52
53        int[][] combinations = new int[minFreqCount + 1][minFreqCount + 1];
54
55        // Initialize the combinations (Pascal's triangle for binomial coefficients)
56        for (int i = 0; i <= minFreqCount; i++) {
57            combinations[i][0] = 1;
58            for (int j = 1; j <= i; j++) {
59                combinations[i][j] = (combinations[i - 1][j - 1] + combinations[i - 1][j]) % MOD;
60            }
61        }
62
63        // Multiply with the number of ways to choose k elements from minFreqCount
64        // and raise minValue to power k
65        answer = (((answer * combinations[minFreqCount][k]) % MOD) * quickPow(minValue, k)) % MOD;
67    }
68
69    // Helper method to perform fast exponentiation modulo MOD
70    private long quickPow(long base, int exponent) {
71        long result = 1;
72        for (; exponent > 0; exponent >>= 1) {
73            if ((exponent & 1) == 1) {
74                result = result * base % MOD;
75            }
76            base = base * base % MOD;
77        }
78        return result;
79    }
80}
81``````

## C++ Solution

``````1class Solution {
2public:
3    // Function to count the number of k subsequences with maximum beauty
4    int countKSubsequencesWithMaxBeauty(string s, int k) {
5        // Array to count frequency of letters
6        int frequency[26]{};
7        // Variable to store the distinct letter count
8        int distinctCount = 0;
9
10        // Loop to calculate the frequency of each character in the string
11        for (char& c : s) {
12            if (++frequency[c - 'a'] == 1) {
13                ++distinctCount;
14            }
15        }
16
17        // If there are fewer distinct letters than k, no valid subsequence exists
18        if (distinctCount < k) {
19            return 0;
20        }
21
22        // Vector to store frequencies of distinct letters
23        vector<int> frequencies(distinctCount);
24
25        // Populate the vector with the frequency counts of the letters
26        for (int i = 0, j = 0; i < 26; ++i) {
27            if (frequency[i]) {
28                frequencies[j++] = frequency[i];
29            }
30        }
31
32        // Sort the frequencies in descending order
33        sort(frequencies.rbegin(), frequencies.rend());
34
35        // Modulo for the result to avoid overflow
36        const int mod = 1e9 + 7;
37
38        // Variable to store the answer
39        long long answer = 1;
40
41        // Value of the k-th maximum frequency
42        int kthValue = frequencies[k - 1];
43
44        // Count how many letters have the k-th maximum frequency
45        int equalCount = 0;
46        for (int freq : frequencies) {
47            equalCount += freq == kthValue;
48        }
49
50        // Calculate part of the answer before considering combinations
51        for (int freq : frequencies) {
52            if (freq == kthValue) {
53                break;
54            }
55            --k;
57        }
58
59        // Combinations array (Use Pascal's triangle)
60        int combinations[equalCount + 1][equalCount + 1];
61        memset(combinations, 0, sizeof(combinations));
62
63        for (int i = 0; i <= equalCount; ++i) {
64            combinations[i][0] = 1;
65            for (int j = 1; j <= i; ++j) {
66                combinations[i][j] = (combinations[i - 1][j] + combinations[i - 1][j - 1]) % mod;
67            }
68        }
69
70        // Quick power function to handle large exponents
71        auto quickPower = [&](long long base, int exponent) {
72            long long result = 1;
73            for (; exponent; exponent >>= 1) {
74                if (exponent & 1) {
75                    result = result * base % mod;
76                }
77                base = base * base % mod;
78            }
79            return result;
80        };
81
82        // Finalize the answer with combinations and power calculation
83        answer = (answer * combinations[equalCount][k] % mod) * quickPower(kthValue, k) % mod;
84
86    }
87};
88``````

## Typescript Solution

``````1function countKSubsequencesWithMaxBeauty(s: string, k: number): number {
2    // Allocate an array to keep counts of each letter appearing in the string.
3    // Initialize an array of length 26 (for each letter of the alphabet) with zeros.
4    const frequency: number[] = new Array(26).fill(0);
5
6    // Initialize a count of unique characters present in the string.
7    let uniqueCharCount = 0;
8
9    // Count the frequency of each character in the input string.
10    for (const char of s) {
11        // Calculate the index by subtracting ASCII of 'a' to get a range from 0 to 25.
12        const index = char.charCodeAt(0) - 97;
13        // Increase the count for this character and increment unique character count if this is the first occurrence.
14        if (++frequency[index] === 1) {
15            ++uniqueCharCount;
16        }
17    }
18
19    // If the unique count is less than k, it is not possible to have k subsequences.
20    if (uniqueCharCount < k) {
21        return 0;
22    }
23
24    // Define modulus as per the problem constraints for large number handling.
25    const mod = BigInt(10 ** 9 + 7);
26
27    // Filter out zero frequency characters and sort the frequencies in descending order.
28    const validFrequencies: number[] = frequency.filter(value => value > 0).sort((a, b) => b - a);
29
30    // The value is the frequency of the k-th most frequent character after sorting.
31    const value = validFrequencies[k - 1];
32
33    // Count how many characters have the same frequency as the value.
34    const countWithValue = validFrequencies.filter(value => value === val).length;
35
36    // Initialize answer as a big integer.
38
39    // Multiply answer by the frequency of each character more frequent than the k-th one.
40    for (const freq of validFrequencies) {
41        if (freq === value) {
42            break;
43        }
44        --k;
46    }
47
48    // Initialize combination array for calculating combinations.
49    const combinations: number[][] = new Array(countWithValue + 1).fill(0).map(() => new Array(k + 1).fill(0));
50
51    // Populate the combination array using Pascal Triangle approach.
52    for (let i = 0; i <= countWithValue; ++i) {
53        combinations[i][0] = 1;
54        for (let j = 1; j <= Math.min(i, k); ++j) {
55            combinations[i][j] = (combinations[i - 1][j] + combinations[i - 1][j - 1]) % Number(mod);
56        }
57    }
58
59    // Define quick power function to calculate a^n % mod efficiently.
60    const quickPower = (a: bigint, n: number): bigint => {
61        let ans = 1n;
62        for (; n; n >>>= 1) {
63            if (n & 1) {
64                ans = (ans * a) % mod;
65            }
66            a = (a * a) % mod;
67        }
68        return ans;
69    };
70
71    // Complete the answer calculation by incorporating the combinations and quick power results.
72    answer = (((answer * BigInt(combinations[countWithValue][k])) % mod) * quickPower(BigInt(value), k)) % mod;
73
74    // Return the answer converted to a regular number.
76}
77``````

## Time and Space Complexity

### Time Complexity

The overall time complexity of the function `countKSubsequencesWithMaxBeauty` is dominated by a few key operations:

1. Constructing the frequency counter (`f = Counter(s)`) with respect to the characters of the string `s`, which has a complexity of `O(n)` where `n` is the length of the input string `s`.

2. Checking the length of the unique elements in the frequency counter (`if len(f) < k`), which is `O(1)` because the length operation for a dictionary is constant time.

3. Sorting the values of the frequency counter (`vs = sorted(f.values(), reverse=True)`), which will take `O(m log m)` where `m` is the number of distinct characters in `s`. This is typically less than or equal to `n`, but we consider it separately in case `s` is very long but has few distinct characters.

4. Counting the frequency of the `k-th` highest value (`x = vs.count(val)`), which has a complexity of `O(m)` since it requires traversing the sorted list of frequencies.

5. We have a loop to compute the product of values and modulo `ans = ans * v % mod`, but the loop runs at most `m` times, so this is `O(m)`.

6. Calculating the combination (`comb(x, k)`) which typically can be computed in `O(k)` time, and calculating the power (`pow(val, k, mod)`) which can be done in `O(log k)` time using fast exponentiation.

Since `m <= n`, we can say the time complexity is `O(n + n log n + k + log k) = O(n log n)` since the `log n` term from the sorting will dominate as `n` grows large. `k` is bounded by `n`, hence it doesn't change the overall complexity.

### Space Complexity

The space complexity is as follows:

1. The frequency counter `f` which will have at most `m` elements where `m` is the number of unique characters in `s`, giving us `O(m)` space.

2. The sorted list `vs`, which will also be of size `m`, giving us another `O(m)` space.

3. Constant space for auxiliary variables like `mod`, `val`, `x`, `ans` and a few others during iterations and calculations.

Therefore, the space complexity of the code is `O(m)`, which can be `O(n)` in the worst case when all the characters in `s` are unique.

Got a question?ย Ask the Teaching Assistantย anything you don't understand.

Still not clear? Ask in the Forum, ย Discordย orย Submitย the part you don't understand to our editors.

โ
โTA ๐จโ๐ซ