2842. Count K-Subsequences of a String With Maximum Beauty
Problem Description
You are given a string s
and an integer k
. Your task is to find k-subsequences with maximum beauty value.
A k-subsequence is defined as:
- A subsequence of string
s
with exactly lengthk
- All characters in the subsequence must be unique (each character appears only once in the subsequence)
The beauty of a k-subsequence is calculated as the sum of f(c)
for every character c
in that subsequence, where f(c)
is the total number of times character c
appears in the original string s
(not in the subsequence).
For example, with s = "abbbdd"
and k = 2
:
- Character frequencies:
f('a') = 1
,f('b') = 3
,f('d') = 2
- Some possible k-subsequences:
"ab"
has beauty =f('a') + f('b') = 1 + 3 = 4
"ad"
has beauty =f('a') + f('d') = 1 + 2 = 3
"bd"
has beauty =f('b') + f('d') = 3 + 2 = 5
You need to:
- Find the maximum possible beauty value among all k-subsequences
- Count how many k-subsequences achieve this maximum beauty
- Return the count modulo
10^9 + 7
Important notes:
- Two k-subsequences are considered different if they use different indices from the original string, even if they form the same character sequence
- If it's impossible to form a k-subsequence (when there are fewer than
k
distinct characters ins
), return0
The challenge involves identifying which characters to include in the k-subsequence to maximize beauty, then counting all possible ways to form such subsequences from the original string positions.
Intuition
To maximize the beauty of a k-subsequence, we need to choose characters that have the highest frequencies in the original string. Think of it this way: since beauty is the sum of f(c)
for each character in our subsequence, we want to include characters with the largest f(c)
values.
The greedy approach works here - we should always pick the k
characters with the highest frequencies. If we have characters with frequencies [5, 4, 3, 2, 1]
and need to pick k=3
characters, we'd pick those with frequencies [5, 4, 3]
to get maximum beauty of 5 + 4 + 3 = 12
.
However, there's a complication when multiple characters have the same frequency. For example, if frequencies are [5, 4, 3, 3, 3, 2]
and k=4
, we must pick the characters with frequencies [5, 4]
and then choose 2 out of the 3 characters that have frequency 3.
This leads us to the key insight:
- First, sort all character frequencies in descending order
- The k-th largest frequency (
val
) determines our cutoff point - All characters with frequency greater than
val
must be included - For characters with frequency exactly equal to
val
, we need to choose some of them
The counting part requires combinatorial mathematics:
- For characters with frequency >
val
: each occurrence in the string can be chosen independently, giving usfrequency
choices per character - For characters with frequency =
val
: we need to choose exactlyremaining_k
characters fromx
available characters (wherex
is the count of characters with this frequency), giving usC(x, remaining_k)
ways - Each selected character with frequency
val
contributesval
choices for positions in the original string
The final answer is the product of all these choices: ans = (product of frequencies > val) × C(x, remaining_k) × val^remaining_k
This approach ensures we find all k-subsequences with maximum beauty by systematically counting all valid combinations of the highest-frequency characters.
Learn more about Greedy, Math and Combinatorics patterns.
Solution Approach
The implementation follows these key steps:
-
Character Frequency Counting: Use a Counter (hash table)
f
to count the occurrence of each character in strings
. This gives usf[c]
for each character. -
Early Termination Check: If the number of distinct characters (
len(f)
) is less thank
, it's impossible to form a k-subsequence with unique characters, so return0
. -
Sort Frequencies: Extract all frequency values from
f
and sort them in descending order into arrayvs
. This allows us to greedily select characters with highest frequencies. -
Find Cutoff Frequency: The k-th largest frequency
val = vs[k-1]
determines our cutoff. Characters with this frequency may or may not be fully included in the optimal subsequences. -
Count Characters at Cutoff: Count how many characters have frequency exactly equal to
val
. Store this asx = vs.count(val)
. -
Calculate Answer in Two Parts:
Part 1 - Characters with frequency > val:
- Iterate through
vs
and for each frequencyv > val
:- Multiply answer by
v
(each occurrence can be chosen) - Decrement
k
to track remaining slots
- Multiply answer by
- After this loop,
k
represents how many characters with frequencyval
we need to select
Part 2 - Characters with frequency = val:
- We need to choose
k
characters fromx
available characters: multiply bycomb(x, k)
- Each selected character contributes
val
possible positions: multiply bypow(val, k, mod)
- Iterate through
-
Modular Arithmetic: All multiplications are done modulo
10^9 + 7
to prevent overflow.
The final formula becomes:
ans = (∏ v for all v > val) × C(x, k) × val^k (mod 10^9 + 7)
This approach efficiently counts all k-subsequences with maximum beauty by:
- Using greedy selection to identify which characters must be included
- Using combinatorics to count different ways to select characters at the cutoff frequency
- Using modular exponentiation for efficient computation of large powers
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through an example with s = "aabbccd"
and k = 3
.
Step 1: Count character frequencies
f('a') = 2
f('b') = 2
f('c') = 2
f('d') = 1
Step 2: Check if k-subsequence is possible
- We have 4 distinct characters, and k = 3, so it's possible (4 ≥ 3) ✓
Step 3: Sort frequencies in descending order
vs = [2, 2, 2, 1]
Step 4: Find the cutoff frequency
- The k-th (3rd) largest frequency is
val = vs[k-1] = vs[2] = 2
- This means we need at least characters with frequency ≥ 2
Step 5: Count characters at cutoff
- Characters with frequency exactly 2:
'a'
,'b'
,'c'
→ countx = 3
- Characters with frequency > 2: none
Step 6: Calculate the answer
Since all top frequencies equal val = 2
, we skip Part 1 and go directly to Part 2:
- We need to choose k = 3 characters from x = 3 available characters
- Number of ways to choose:
C(3, 3) = 1
- Each chosen character has 2 occurrences in the string, so each contributes factor of 2
- Total combinations:
1 × 2^3 = 1 × 8 = 8
Verification of the 8 subsequences: The maximum beauty is 2 + 2 + 2 = 6 (choosing any 3 characters from {a, b, c})
Since we must choose all three characters {a, b, c}, and each appears twice in the string:
- For 'a': can choose index 0 or 1 (2 ways)
- For 'b': can choose index 2 or 3 (2 ways)
- For 'c': can choose index 4 or 5 (2 ways)
- Total: 2 × 2 × 2 = 8 different k-subsequences
All 8 k-subsequences have the same character set "abc" but use different index combinations from the original string, each achieving the maximum beauty of 6.
Solution Implementation
1from collections import Counter
2from math import comb
3
4class Solution:
5 def countKSubsequencesWithMaxBeauty(self, s: str, k: int) -> int:
6 # Count frequency of each character in the string
7 char_frequency = Counter(s)
8
9 # If there are fewer unique characters than k, no valid subsequence exists
10 if len(char_frequency) < k:
11 return 0
12
13 MOD = 10**9 + 7
14
15 # Sort character frequencies in descending order
16 frequency_values = sorted(char_frequency.values(), reverse=True)
17
18 # Find the k-th largest frequency value (threshold frequency)
19 threshold_freq = frequency_values[k - 1]
20
21 # Count how many characters have the threshold frequency
22 num_chars_with_threshold = frequency_values.count(threshold_freq)
23
24 # Initialize result to 1 (multiplicative identity)
25 result = 1
26
27 # Process all frequencies greater than the threshold
28 # These characters must be included in every valid subsequence
29 for freq in frequency_values:
30 if freq == threshold_freq:
31 break
32 k -= 1 # Reduce remaining slots to fill
33 result = (result * freq) % MOD
34
35 # For characters with threshold frequency, we need to choose k from num_chars_with_threshold
36 # Multiply by combinations * (threshold_freq ^ k) for the remaining slots
37 result = (result * comb(num_chars_with_threshold, k) * pow(threshold_freq, k, MOD)) % MOD
38
39 return result
40
1class Solution {
2 private final int MOD = (int) 1e9 + 7;
3
4 public int countKSubsequencesWithMaxBeauty(String s, int k) {
5 // Count frequency of each character
6 int[] frequency = new int[26];
7 int stringLength = s.length();
8 int distinctCharCount = 0;
9
10 // Calculate frequency and count distinct characters
11 for (int i = 0; i < stringLength; i++) {
12 int charIndex = s.charAt(i) - 'a';
13 if (++frequency[charIndex] == 1) {
14 distinctCharCount++;
15 }
16 }
17
18 // If we don't have enough distinct characters, return 0
19 if (distinctCharCount < k) {
20 return 0;
21 }
22
23 // Create array of non-zero frequencies
24 Integer[] frequencies = new Integer[distinctCharCount];
25 int index = 0;
26 for (int i = 0; i < 26; i++) {
27 if (frequency[i] > 0) {
28 frequencies[index++] = frequency[i];
29 }
30 }
31
32 // Sort frequencies in descending order
33 Arrays.sort(frequencies, (a, b) -> b - a);
34
35 // Initialize result
36 long result = 1;
37
38 // Get the k-th largest frequency value (boundary value)
39 int boundaryFrequency = frequencies[k - 1];
40
41 // Count how many characters have the boundary frequency
42 int boundaryFrequencyCount = 0;
43 for (int freq : frequencies) {
44 if (freq == boundaryFrequency) {
45 boundaryFrequencyCount++;
46 }
47 }
48
49 // Multiply result by frequencies greater than boundary
50 for (int freq : frequencies) {
51 if (freq == boundaryFrequency) {
52 break;
53 }
54 k--;
55 result = result * freq % MOD;
56 }
57
58 // Build Pascal's triangle for combination calculation
59 int[][] combinations = new int[boundaryFrequencyCount + 1][boundaryFrequencyCount + 1];
60 for (int i = 0; i <= boundaryFrequencyCount; i++) {
61 combinations[i][0] = 1;
62 for (int j = 1; j <= i; j++) {
63 combinations[i][j] = (combinations[i - 1][j - 1] + combinations[i - 1][j]) % MOD;
64 }
65 }
66
67 // Calculate final result:
68 // result * C(boundaryFrequencyCount, k) * boundaryFrequency^k
69 result = ((result * combinations[boundaryFrequencyCount][k]) % MOD) *
70 quickPower(boundaryFrequency, k) % MOD;
71
72 return (int) result;
73 }
74
75 /**
76 * Calculate (base^exponent) % MOD using binary exponentiation
77 */
78 private long quickPower(long base, int exponent) {
79 long result = 1;
80 while (exponent > 0) {
81 // If current bit is 1, multiply result by base
82 if ((exponent & 1) == 1) {
83 result = result * base % MOD;
84 }
85 // Square the base for next bit
86 base = base * base % MOD;
87 // Move to next bit
88 exponent >>= 1;
89 }
90 return result;
91 }
92}
93
1class Solution {
2public:
3 int countKSubsequencesWithMaxBeauty(string s, int k) {
4 // Count frequency of each character
5 int frequency[26]{};
6 int distinctChars = 0;
7
8 for (char& c : s) {
9 if (++frequency[c - 'a'] == 1) {
10 distinctChars++;
11 }
12 }
13
14 // If distinct characters less than k, impossible to form subsequence
15 if (distinctChars < k) {
16 return 0;
17 }
18
19 // Collect non-zero frequencies
20 vector<int> frequencies(distinctChars);
21 for (int i = 0, j = 0; i < 26; ++i) {
22 if (frequency[i]) {
23 frequencies[j++] = frequency[i];
24 }
25 }
26
27 // Sort frequencies in descending order
28 sort(frequencies.rbegin(), frequencies.rend());
29
30 const int MOD = 1e9 + 7;
31 long long result = 1;
32
33 // Find the k-th largest frequency value
34 int kthFrequency = frequencies[k - 1];
35
36 // Count how many characters have the k-th frequency
37 int countWithKthFreq = 0;
38 for (int freq : frequencies) {
39 countWithKthFreq += (freq == kthFrequency);
40 }
41
42 // Multiply frequencies greater than k-th frequency
43 for (int freq : frequencies) {
44 if (freq == kthFrequency) {
45 break;
46 }
47 k--;
48 result = result * freq % MOD;
49 }
50
51 // Build Pascal's triangle for combination calculation
52 int combinations[countWithKthFreq + 1][countWithKthFreq + 1];
53 memset(combinations, 0, sizeof(combinations));
54
55 for (int i = 0; i <= countWithKthFreq; ++i) {
56 combinations[i][0] = 1;
57 for (int j = 1; j <= i; ++j) {
58 combinations[i][j] = (combinations[i - 1][j] + combinations[i - 1][j - 1]) % MOD;
59 }
60 }
61
62 // Fast exponentiation function
63 auto quickPower = [&](long long base, int exponent) {
64 long long answer = 1;
65 while (exponent > 0) {
66 if (exponent & 1) {
67 answer = answer * base % MOD;
68 }
69 base = base * base % MOD;
70 exponent >>= 1;
71 }
72 return answer;
73 };
74
75 // Calculate final result: result * C(countWithKthFreq, k) * kthFrequency^k
76 result = (result * combinations[countWithKthFreq][k] % MOD) * quickPower(kthFrequency, k) % MOD;
77
78 return result;
79 }
80};
81
1/**
2 * Counts k-subsequences with maximum beauty from string s
3 * Beauty is defined as the product of character frequencies in the subsequence
4 * @param s - Input string
5 * @param k - Number of distinct characters to select
6 * @returns Number of k-subsequences with maximum beauty modulo 10^9 + 7
7 */
8function countKSubsequencesWithMaxBeauty(s: string, k: number): number {
9 // Initialize frequency array for 26 lowercase letters
10 const charFrequencies: number[] = new Array(26).fill(0);
11 let distinctCharCount = 0;
12
13 // Count frequency of each character and track distinct characters
14 for (const char of s) {
15 const charIndex = char.charCodeAt(0) - 97; // Convert 'a'-'z' to 0-25
16 if (++charFrequencies[charIndex] === 1) {
17 ++distinctCharCount;
18 }
19 }
20
21 // If we don't have enough distinct characters, return 0
22 if (distinctCharCount < k) {
23 return 0;
24 }
25
26 const MOD = BigInt(10 ** 9 + 7);
27
28 // Get non-zero frequencies and sort in descending order
29 const nonZeroFrequencies: number[] = charFrequencies
30 .filter(freq => freq > 0)
31 .sort((a, b) => b - a);
32
33 // Find the k-th largest frequency value
34 const kthLargestFreq = nonZeroFrequencies[k - 1];
35
36 // Count how many characters have the k-th largest frequency
37 const countOfKthFreq = nonZeroFrequencies.filter(freq => freq === kthLargestFreq).length;
38
39 let result = 1n;
40
41 // Multiply frequencies larger than the k-th largest
42 for (const freq of nonZeroFrequencies) {
43 if (freq === kthLargestFreq) {
44 break;
45 }
46 --k;
47 result = (result * BigInt(freq)) % MOD;
48 }
49
50 // Build Pascal's triangle for combinations C(n, r)
51 const combinations: number[][] = new Array(countOfKthFreq + 1)
52 .fill(0)
53 .map(() => new Array(k + 1).fill(0));
54
55 for (let n = 0; n <= countOfKthFreq; ++n) {
56 combinations[n][0] = 1;
57 for (let r = 1; r <= Math.min(n, k); ++r) {
58 combinations[n][r] = (combinations[n - 1][r] + combinations[n - 1][r - 1]) % Number(MOD);
59 }
60 }
61
62 // Calculate final result: result * C(countOfKthFreq, k) * kthLargestFreq^k
63 result = (((result * BigInt(combinations[countOfKthFreq][k])) % MOD) *
64 quickPower(BigInt(kthLargestFreq), k)) % MOD;
65
66 return Number(result);
67}
68
69/**
70 * Performs fast modular exponentiation
71 * @param base - Base number
72 * @param exponent - Exponent value
73 * @returns (base^exponent) % MOD
74 */
75function quickPower(base: bigint, exponent: number): bigint {
76 const MOD = BigInt(10 ** 9 + 7);
77 let result = 1n;
78
79 while (exponent > 0) {
80 // If current bit is 1, multiply result by base
81 if (exponent & 1) {
82 result = (result * base) % MOD;
83 }
84 // Square the base for next bit
85 base = (base * base) % MOD;
86 // Right shift to process next bit
87 exponent >>>= 1;
88 }
89
90 return result;
91}
92
Time and Space Complexity
The time complexity is O(n + |Σ| log |Σ|)
, where n
is the length of the string and |Σ|
is the size of the character set (at most 26 for lowercase letters).
- Creating the frequency counter takes
O(n)
time as we iterate through the string once - Sorting the frequency values takes
O(|Σ| log |Σ|)
time, which isO(26 log 26) = O(1)
for lowercase letters - Counting occurrences of
val
takesO(|Σ|)
time - The first loop iterates at most
k
times, which is at most|Σ|
times, takingO(|Σ|)
time - The combination calculation
comb(x, k)
takesO(min(k, x-k))
time, which is at mostO(|Σ|)
- The modular exponentiation
pow(val, k, mod)
takesO(log k)
time
Since |Σ| = 26
is a constant for lowercase letters, the overall time complexity simplifies to O(n)
.
The space complexity is O(|Σ|)
, where |Σ|
is the size of the character set.
- The Counter object stores at most
|Σ|
key-value pairs - The sorted values list
vs
contains at most|Σ|
elements - Other variables use
O(1)
space
For lowercase letters where |Σ| = 26
, this is effectively O(1)
constant space relative to the input size.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Handling of Integer Overflow
The most critical pitfall is forgetting to apply modulo operations consistently throughout the calculation. The combination value comb(num_chars_with_threshold, k)
can become extremely large, and multiplying it with other values without proper modulo handling will cause integer overflow.
Problem Code:
# This will overflow for large values
result = (result * comb(num_chars_with_threshold, k) * pow(threshold_freq, k, MOD)) % MOD
Solution: Apply modulo to the combination result before multiplication:
combination_value = comb(num_chars_with_threshold, k) % MOD
result = (result * combination_value % MOD * pow(threshold_freq, k, MOD)) % MOD
2. Misunderstanding the Beauty Calculation
A common mistake is thinking that beauty is calculated based on character frequency within the subsequence itself, rather than the frequency in the original string.
Wrong Interpretation:
- Counting how many times 'b' appears in subsequence "bd"
- This would give beauty = 1 + 1 = 2
Correct Interpretation:
- Using original string frequencies: f('b') = 3, f('d') = 2
- Beauty of "bd" = 3 + 2 = 5
3. Off-by-One Error in Finding Threshold
Using the wrong index when finding the k-th largest frequency can lead to incorrect results.
Problem Code:
# Wrong: using k instead of k-1 threshold_freq = frequency_values[k] # This gets the (k+1)-th element
Solution:
# Correct: arrays are 0-indexed threshold_freq = frequency_values[k - 1]
4. Not Handling the Case Where All Characters Have Same Frequency
When all characters have the same frequency and k equals the number of distinct characters, the algorithm might incorrectly calculate combinations.
Example: s = "abc", k = 3
- All frequencies are 1
- The loop for frequencies > threshold won't execute
- Must ensure k is properly maintained for the combination calculation
Solution: Track k carefully through the algorithm and ensure the final combination calculation uses the correct remaining value of k.
5. Confusion Between Character Count and Occurrence Count
Mixing up the number of distinct characters with a certain frequency versus the frequency value itself.
Problem Code:
# Wrong: using frequency value instead of count of characters num_chars_with_threshold = threshold_freq # This is the frequency, not the count
Solution:
# Correct: count how many characters have this frequency num_chars_with_threshold = frequency_values.count(threshold_freq)
6. Inefficient Power Calculation
Using regular Python exponentiation without modulo can cause performance issues and overflow.
Problem Code:
# Inefficient and may overflow result = (result * (threshold_freq ** k)) % MOD
Solution:
# Use built-in modular exponentiation
result = (result * pow(threshold_freq, k, MOD)) % MOD
Which two pointer techniques do you use to check if a string is a palindrome?
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Math for Technical Interviews How much math do I need to know for technical interviews The short answer is about high school level math Computer science is often associated with math and some universities even place their computer science department under the math faculty However the reality is that you
Backtracking Template Prereq DFS with States problems dfs_with_states Combinatorial search problems Combinatorial search problems involve finding groupings and assignments of objects that satisfy certain conditions Finding all permutations combinations subsets and solving Sudoku are classic combinatorial problems The time complexity of combinatorial problems often grows rapidly with the size of
Want a Structured Path to Master System Design Too? Don’t Miss This!