2842. Count K-Subsequences of a String With Maximum Beauty
Problem Description
In this problem, we are given a string s
and an integer k
. Our goal is to find the number of subsequences of length k
from s
, such that all characters in the subsequence are unique, and the sum of the occurrences of these characters in s
is the maximum possible. This sum is known as the "beauty" of the subsequence.
A k-subsequence
is defined as a subsequence of s
that has exactly k
characters, which are all unique. We calculate the beauty of a k-subsequence by summing the number of times each of its characters appears in the original string s
(not in the subsequence itself). For example, if s ="abcb"
, the beauty of the subsequence "ac" would be f('a') + f('c')
, which is 1 + 1 = 2
in this case.
The problem asks us to return the count of all k-subsequences
that have the maximum beauty amongst all possible k-subsequences
. The result should be returned modulo 10^9 + 7
to handle potentially large numbers.
Intuition
To solve this problem effectively, we note a couple of key points:
-
The beauty of a
k-subsequence
depends solely on the characters included in it, not their order. Hence, we want to select thek
characters that have the highest frequencies ins
. -
Once we determine the
k
highest frequency characters, the problem boils down to counting all unique combinations of these characters that form validk-subsequences
.
Now, let's build on this intuition:
-
We start by counting the frequency of each character in
s
using a Counter. -
Next, if there are fewer than
k
unique characters in the string, there cannot be any validk-subsequences
, so we return0
. -
We sort the frequencies in descending order because we are interested in the
k
characters with the highest frequencies. -
The beauty of any
k-subsequence
is at least thek
-th highest frequency (represented byval
in the solution). Thus, we find how many characters have thisk
-th highest frequency (x
). -
The solution then calculates the number of ways we can pick the remaining
k-1
characters from those that have a higher frequency thanval
. -
Finally, it uses the combinatorial function to calculate the number of ways to choose the
k
characters with thek
-th highest frequencyval
. It multiplies this with the previously calculated value and takes the whole product modulo10^9 + 7
.
By the end of these steps, we have the total count of maximum beauty k-subsequences modulo 10^9 + 7
.
Learn more about Greedy, Math and Combinatorics patterns.
Solution Approach
The solution to the problem implements the following steps, utilizing a few Python-specific features and utility functions:
-
Counting Character Frequencies:
- The solution uses
Counter
from Python'scollections
library to count the occurrences of each character in the strings
. This gives us a dictionary where keys are characters and values are their respective frequencies.
- The solution uses
-
Shortcut for Impossibility:
- If the unique character count is less than
k
, we cannot form ak-subsequence
, so we return 0.
- If the unique character count is less than
-
Sorting and Selecting Frequencies:
- We sort the frequencies in descending order to focus on the highest frequencies first. The
k-th
highest frequency is found at indexk - 1
because of zero-based indexing. This value is important because it sets the minimum frequency a character must have to be considered for the maximum beauty.
- We sort the frequencies in descending order to focus on the highest frequencies first. The
-
Calculating Combinations for Choosing Characters:
- To use the characters with higher frequencies than the
k-th
highest frequency, we multiply the ways of picking these characters cumulatively. For each such character, we multiplyans
by its frequency modulomod
.
- To use the characters with higher frequencies than the
-
Handling Characters with Equal Frequency to the
k-th
:- Next, we find how many characters,
x
, have the same frequency as thek-th
highest frequencyval
. We need to choosek
from thesex
characters. However, as some characters with higher frequency have already been chosen (k
is decremented accordingly), we must choose the remaining from thesex
characters.
- Next, we find how many characters,
-
Calculations Involving Modular Arithmetic:
- We calculate
comb(x, k)
- the number of ways to choosek
remaining characters fromx
characters with the exact frequencyval
. - The
pow(val, k, mod)
calculatesval^k
(val to the power of k) modulomod
, representing multiplying the beauty combination by the frequencyval
, raised to the power of how many characters we are choosing (k
).
- We calculate
-
Final Calculation and Modulo Operation:
- The final answer is calculated by multiplying these values and taking the result modulo
mod
(10^9 + 7
) to handle large numbers according to the problem statement.
- The final answer is calculated by multiplying these values and taking the result modulo
Here's how the solution implements these steps:
class Solution:
def countKSubsequencesWithMaxBeauty(self, s: str, k: int) -> int:
f = Counter(s) # step 1
if len(f) < k: # step 2
return 0
mod = 10**9 + 7
vs = sorted(f.values(), reverse=True) # step 3
val = vs[k - 1] # this gives us the k-th highest frequency
x = vs.count(val) # step 5
ans = 1
for v in vs: # step 4
if v == val:
break
k -= 1
ans = ans * v % mod
ans = ans * comb(x, k) * pow(val, k, mod) % mod # step 6 and 7
return ans
This approach efficiently calculates the number of k-subsequences
with maximum beauty by combining frequency analysis, sorting, combinatorial mathematics, and modular arithmetic.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let us consider a small example to illustrate the solution approach. Assume we have the string s = "aabbc"
and k = 2
.
Following the steps of the solution:
-
Counting Character Frequencies:
- Using
Counter(s)
, we determine that the frequency of 'a' is 2, 'b' is 2, and 'c' is 1. We have our frequency dictionary:{'a': 2, 'b': 2, 'c': 1}
.
- Using
-
Shortcut for Impossibility:
- There are more than
k
unique characters; hence ak-subsequence
is possible.
- There are more than
-
Sorting and Selecting Frequencies:
- We sort the frequencies in descending order:
[2, 2, 1]
. - The
k-th
highest frequency is at indexk - 1
, which is1
. So,val = vs[1] = 2
.
- We sort the frequencies in descending order:
-
Calculating Combinations for Choosing Characters:
- As the first two frequencies are equal and are the highest, we have no higher frequency characters to choose from. Therefore,
ans
remains 1.
- As the first two frequencies are equal and are the highest, we have no higher frequency characters to choose from. Therefore,
-
Handling Characters with Equal Frequency to the
k-th
:- There are two characters ('a' and 'b') with the same frequency as
val
, sox = 2
. - Since we have no characters with a higher frequency, we want to choose all
k
characters from thesex
characters with the highest frequency, meaning we want to choose 2 characters from a set of 2 characters.
- There are two characters ('a' and 'b') with the same frequency as
-
Calculations Involving Modular Arithmetic:
- We calculate
comb(x, k)
which iscomb(2, 2)
. Since choosing 2 characters out of 2 is only possible in one way,comb(2, 2) = 1
. - The
pow(val, k, mod)
term is basically2^2 % mod
which equals 4 modulo10^9 + 7
.
- We calculate
-
Final Calculation and Modulo Operation:
- The final answer is
ans * comb(x, k) * pow(val, k, mod) % mod
, which simplifies to1 * 1 * 4 % mod
= 4.
- The final answer is
Therefore, for the string s = "aabbc"
with k = 2
, there are 4 k-subsequences
with the maximum beauty. This is because the beautiful subsequences are: "aa"
, "ab"
, "ba"
, and "bb"
, all having a total frequency sum of 4 in the string s
.
Solution Implementation
1from collections import Counter
2from math import comb
3
4class Solution:
5 def countKSubsequencesWithMaxBeauty(self, string: str, k: int) -> int:
6 # Create a frequency counter for all characters in the string
7 frequency_counter = Counter(string)
8
9 # If there are fewer types of characters than k, return 0
10 if len(frequency_counter) < k:
11 return 0
12
13 # Define modulo for result to prevent integer overflow
14 mod = 10**9 + 7
15
16 # Sort the frequencies in descending order
17 sorted_frequencies = sorted(frequency_counter.values(), reverse=True)
18
19 # Get the k-th largest frequency, this will be the max beauty
20 max_beauty = sorted_frequencies[k - 1]
21
22 # Count how many times this frequency (max beauty) appears
23 count_of_max_beauty = sorted_frequencies.count(max_beauty)
24
25 # Initialize answer as 1 (multiplicative identity)
26 answer = 1
27
28 # Multiplying the count of characters by which we can not form subsequences (as they are
29 # more frequent than the k-th frequent character) modulo mod to keep the number manageable
30 for freq in sorted_frequencies:
31 if freq == max_beauty:
32 # Once we reach characters with the max beauty value, stop
33 break
34 k -= 1
35 answer = answer * freq % mod
36
37 # Multiply by the number of combinations of choosing k characters from 'count_of_max_beauty'
38 # and the max_beauty^(k) to calculate the beauty value, then take the modulo of the result
39 answer = answer * comb(count_of_max_beauty, k) * pow(max_beauty, k, mod) % mod
40
41 # Return the final answer
42 return answer
43
1import java.util.Arrays;
2
3class Solution {
4 private static final int MOD = (int) 1e9 + 7;
5
6 public int countKSubsequencesWithMaxBeauty(String s, int k) {
7 int[] frequency = new int[26];
8 int n = s.length();
9 int uniqueCharCount = 0;
10
11 // Count frequency of each character in the string s
12 for (int i = 0; i < n; i++) {
13 if (++frequency[s.charAt(i) - 'a'] == 1) {
14 uniqueCharCount++;
15 }
16 }
17
18 // If there are not enough unique characters to form a subsequence of length k
19 if (uniqueCharCount < k) {
20 return 0;
21 }
22
23 Integer[] values = new Integer[uniqueCharCount];
24 for (int i = 0, j = 0; i < 26; i++) {
25 if (frequency[i] > 0) {
26 values[j++] = frequency[i];
27 }
28 }
29
30 // Sort the frequency values in descending order
31 Arrays.sort(values, (a, b) -> b - a);
32
33 long answer = 1;
34 int minValue = values[k - 1];
35 int minFreqCount = 0;
36
37 // Calculate the frequency of the k-th largest unique character
38 for (int v : values) {
39 if (v == minValue) {
40 minFreqCount++;
41 }
42 }
43
44 // Calculate product of first k-1 largest frequencies
45 for (int v : values) {
46 if (v == minValue) {
47 break;
48 }
49 k--;
50 answer = answer * v % MOD;
51 }
52
53 int[][] combinations = new int[minFreqCount + 1][minFreqCount + 1];
54
55 // Initialize the combinations (Pascal's triangle for binomial coefficients)
56 for (int i = 0; i <= minFreqCount; i++) {
57 combinations[i][0] = 1;
58 for (int j = 1; j <= i; j++) {
59 combinations[i][j] = (combinations[i - 1][j - 1] + combinations[i - 1][j]) % MOD;
60 }
61 }
62
63 // Multiply with the number of ways to choose k elements from minFreqCount
64 // and raise minValue to power k
65 answer = (((answer * combinations[minFreqCount][k]) % MOD) * quickPow(minValue, k)) % MOD;
66 return (int) answer;
67 }
68
69 // Helper method to perform fast exponentiation modulo MOD
70 private long quickPow(long base, int exponent) {
71 long result = 1;
72 for (; exponent > 0; exponent >>= 1) {
73 if ((exponent & 1) == 1) {
74 result = result * base % MOD;
75 }
76 base = base * base % MOD;
77 }
78 return result;
79 }
80}
81
1class Solution {
2public:
3 // Function to count the number of k subsequences with maximum beauty
4 int countKSubsequencesWithMaxBeauty(string s, int k) {
5 // Array to count frequency of letters
6 int frequency[26]{};
7 // Variable to store the distinct letter count
8 int distinctCount = 0;
9
10 // Loop to calculate the frequency of each character in the string
11 for (char& c : s) {
12 if (++frequency[c - 'a'] == 1) {
13 ++distinctCount;
14 }
15 }
16
17 // If there are fewer distinct letters than k, no valid subsequence exists
18 if (distinctCount < k) {
19 return 0;
20 }
21
22 // Vector to store frequencies of distinct letters
23 vector<int> frequencies(distinctCount);
24
25 // Populate the vector with the frequency counts of the letters
26 for (int i = 0, j = 0; i < 26; ++i) {
27 if (frequency[i]) {
28 frequencies[j++] = frequency[i];
29 }
30 }
31
32 // Sort the frequencies in descending order
33 sort(frequencies.rbegin(), frequencies.rend());
34
35 // Modulo for the result to avoid overflow
36 const int mod = 1e9 + 7;
37
38 // Variable to store the answer
39 long long answer = 1;
40
41 // Value of the k-th maximum frequency
42 int kthValue = frequencies[k - 1];
43
44 // Count how many letters have the k-th maximum frequency
45 int equalCount = 0;
46 for (int freq : frequencies) {
47 equalCount += freq == kthValue;
48 }
49
50 // Calculate part of the answer before considering combinations
51 for (int freq : frequencies) {
52 if (freq == kthValue) {
53 break;
54 }
55 --k;
56 answer = answer * freq % mod;
57 }
58
59 // Combinations array (Use Pascal's triangle)
60 int combinations[equalCount + 1][equalCount + 1];
61 memset(combinations, 0, sizeof(combinations));
62
63 for (int i = 0; i <= equalCount; ++i) {
64 combinations[i][0] = 1;
65 for (int j = 1; j <= i; ++j) {
66 combinations[i][j] = (combinations[i - 1][j] + combinations[i - 1][j - 1]) % mod;
67 }
68 }
69
70 // Quick power function to handle large exponents
71 auto quickPower = [&](long long base, int exponent) {
72 long long result = 1;
73 for (; exponent; exponent >>= 1) {
74 if (exponent & 1) {
75 result = result * base % mod;
76 }
77 base = base * base % mod;
78 }
79 return result;
80 };
81
82 // Finalize the answer with combinations and power calculation
83 answer = (answer * combinations[equalCount][k] % mod) * quickPower(kthValue, k) % mod;
84
85 return answer;
86 }
87};
88
1function countKSubsequencesWithMaxBeauty(s: string, k: number): number {
2 // Allocate an array to keep counts of each letter appearing in the string.
3 // Initialize an array of length 26 (for each letter of the alphabet) with zeros.
4 const frequency: number[] = new Array(26).fill(0);
5
6 // Initialize a count of unique characters present in the string.
7 let uniqueCharCount = 0;
8
9 // Count the frequency of each character in the input string.
10 for (const char of s) {
11 // Calculate the index by subtracting ASCII of 'a' to get a range from 0 to 25.
12 const index = char.charCodeAt(0) - 97;
13 // Increase the count for this character and increment unique character count if this is the first occurrence.
14 if (++frequency[index] === 1) {
15 ++uniqueCharCount;
16 }
17 }
18
19 // If the unique count is less than k, it is not possible to have k subsequences.
20 if (uniqueCharCount < k) {
21 return 0;
22 }
23
24 // Define modulus as per the problem constraints for large number handling.
25 const mod = BigInt(10 ** 9 + 7);
26
27 // Filter out zero frequency characters and sort the frequencies in descending order.
28 const validFrequencies: number[] = frequency.filter(value => value > 0).sort((a, b) => b - a);
29
30 // The value is the frequency of the k-th most frequent character after sorting.
31 const value = validFrequencies[k - 1];
32
33 // Count how many characters have the same frequency as the value.
34 const countWithValue = validFrequencies.filter(value => value === val).length;
35
36 // Initialize answer as a big integer.
37 let answer = 1n;
38
39 // Multiply answer by the frequency of each character more frequent than the k-th one.
40 for (const freq of validFrequencies) {
41 if (freq === value) {
42 break;
43 }
44 --k;
45 answer = (answer * BigInt(freq)) % mod;
46 }
47
48 // Initialize combination array for calculating combinations.
49 const combinations: number[][] = new Array(countWithValue + 1).fill(0).map(() => new Array(k + 1).fill(0));
50
51 // Populate the combination array using Pascal Triangle approach.
52 for (let i = 0; i <= countWithValue; ++i) {
53 combinations[i][0] = 1;
54 for (let j = 1; j <= Math.min(i, k); ++j) {
55 combinations[i][j] = (combinations[i - 1][j] + combinations[i - 1][j - 1]) % Number(mod);
56 }
57 }
58
59 // Define quick power function to calculate a^n % mod efficiently.
60 const quickPower = (a: bigint, n: number): bigint => {
61 let ans = 1n;
62 for (; n; n >>>= 1) {
63 if (n & 1) {
64 ans = (ans * a) % mod;
65 }
66 a = (a * a) % mod;
67 }
68 return ans;
69 };
70
71 // Complete the answer calculation by incorporating the combinations and quick power results.
72 answer = (((answer * BigInt(combinations[countWithValue][k])) % mod) * quickPower(BigInt(value), k)) % mod;
73
74 // Return the answer converted to a regular number.
75 return Number(answer);
76}
77
Time and Space Complexity
Time Complexity
The overall time complexity of the function countKSubsequencesWithMaxBeauty
is dominated by a few key operations:
-
Constructing the frequency counter (
f = Counter(s)
) with respect to the characters of the strings
, which has a complexity ofO(n)
wheren
is the length of the input strings
. -
Checking the length of the unique elements in the frequency counter (
if len(f) < k
), which isO(1)
because the length operation for a dictionary is constant time. -
Sorting the values of the frequency counter (
vs = sorted(f.values(), reverse=True)
), which will takeO(m log m)
wherem
is the number of distinct characters ins
. This is typically less than or equal ton
, but we consider it separately in cases
is very long but has few distinct characters. -
Counting the frequency of the
k-th
highest value (x = vs.count(val)
), which has a complexity ofO(m)
since it requires traversing the sorted list of frequencies. -
We have a loop to compute the product of values and modulo
ans = ans * v % mod
, but the loop runs at mostm
times, so this isO(m)
. -
Calculating the combination (
comb(x, k)
) which typically can be computed inO(k)
time, and calculating the power (pow(val, k, mod)
) which can be done inO(log k)
time using fast exponentiation.
Since m <= n
, we can say the time complexity is O(n + n log n + k + log k) = O(n log n)
since the log n
term from the sorting will dominate as n
grows large. k
is bounded by n
, hence it doesn't change the overall complexity.
Space Complexity
The space complexity is as follows:
-
The frequency counter
f
which will have at mostm
elements wherem
is the number of unique characters ins
, giving usO(m)
space. -
The sorted list
vs
, which will also be of sizem
, giving us anotherO(m)
space. -
Constant space for auxiliary variables like
mod
,val
,x
,ans
and a few others during iterations and calculations.
Therefore, the space complexity of the code is O(m)
, which can be O(n)
in the worst case when all the characters in s
are unique.
Learn more about how to find time and space complexity quickly using problem constraints.
Which algorithm is best for finding the shortest distance between two points in an unweighted graph?
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Math for Technical Interviews How much math do I need to know for technical interviews The short answer is about high school level math Computer science is often associated with math and some universities even place their computer science department under the math faculty However the reality is that you
Backtracking Template Prereq DFS with States problems dfs_with_states Combinatorial search problems Combinatorial search problems involve finding groupings and assignments of objects that satisfy certain conditions Finding all permutations combinations subsets and solving Sudoku are classic combinatorial problems The time complexity of combinatorial problems often grows rapidly with the size of
Want a Structured Path to Master System Design Too? Don’t Miss This!