3085. Minimum Deletions to Make String K-Special
Problem Description
You are given a string word
containing only lowercase letters and an integer k
.
A string is considered k-special if the difference between the frequencies of any two characters in the string is at most k
. In other words, for every pair of characters in the string, the absolute difference between their occurrence counts should not exceed k
.
Mathematically, word
is k-special if |freq(word[i]) - freq(word[j])| <= k
for all indices i
and j
, where freq(x)
represents how many times character x
appears in the string.
Your task is to find the minimum number of characters you need to delete from word
to make it k-special.
For example, if you have a string where one character appears 10 times and another appears 2 times, and k = 3
, the difference is |10 - 2| = 8
, which is greater than 3. So you would need to delete some characters to reduce this gap to at most 3.
The solution works by:
- Counting the frequency of each character in the string
- Trying different possible minimum frequencies
v
from 0 to the length of the string - For each minimum frequency
v
, calculating how many deletions are needed:- Characters with frequency less than
v
must be completely removed - Characters with frequency greater than
v + k
must be reduced tov + k
- Characters with frequency less than
- Returning the minimum number of deletions across all possible values of
v
Intuition
The key insight is that in a k-special string, all character frequencies must fall within a range of size k
. Think of it as finding a "window" of width k
on a number line where we want to fit all our character frequencies.
Since we can only delete characters (not add them), we can only reduce frequencies. This means we need to find an optimal range [v, v+k]
where v
is some minimum frequency value, and adjust all character frequencies to fit within this range.
For any character with frequency less than v
, we have no choice but to delete all occurrences since we can't increase its frequency. For characters with frequency greater than v+k
, we need to delete just enough to bring them down to v+k
.
The challenge is finding the best value of v
. If we set v
too high, we'll need to delete many characters that have low frequencies. If we set v
too low, we might need to delete many occurrences from high-frequency characters to bring them down to v+k
.
The solution elegantly handles this by trying all possible values of v
from 0 to n
(the length of the string). For each candidate value of v
, we calculate the total deletions needed using a simple formula:
- For frequencies below
v
: delete all (cost =x
) - For frequencies above
v+k
: delete excess (cost =x - v - k
) - For frequencies in range
[v, v+k]
: no deletions needed
By checking all possible values of v
and taking the minimum, we guarantee finding the optimal solution. This brute force approach works efficiently because we only have at most 26 distinct character frequencies (one for each lowercase letter), making the computation fast regardless of the string length.
Solution Approach
The implementation follows a counting and enumeration strategy:
Step 1: Count Character Frequencies
First, we use Python's Counter
to count the frequency of each character in the string and extract just the frequency values into nums
:
nums = Counter(word).values()
This gives us a collection of frequencies, one for each unique character in the string. For example, if word = "aabbbc"
, then nums
would contain [2, 3, 1]
(frequencies of 'a', 'b', and 'c').
Step 2: Define the Cost Function
The helper function f(v)
calculates the minimum deletions needed when we choose v
as the minimum frequency in our target range [v, v+k]
:
def f(v: int) -> int:
ans = 0
for x in nums:
if x < v:
ans += x # Delete all occurrences
elif x > v + k:
ans += x - v - k # Delete excess to bring down to v+k
return ans
The logic is straightforward:
- If a character's frequency
x
is less thanv
, we must delete allx
occurrences - If
x
is greater thanv + k
, we deletex - (v + k)
characters to bring it down to exactlyv + k
- If
x
is already in the range[v, v+k]
, no deletions are needed
Step 3: Find the Minimum
We enumerate all possible values of v
from 0 to the length of the string:
return min(f(v) for v in range(len(word) + 1))
Why this range? The minimum frequency v
can be at most n
(if we keep all occurrences of a single character). Values beyond n
don't make sense since no character can appear more than n
times in a string of length n
.
Time Complexity: O(n ร m)
where n
is the length of the string and m
is the number of unique characters (at most 26). For each value of v
(n+1 iterations), we check all character frequencies (at most 26).
Space Complexity: O(m)
for storing the frequency counts, where m
is at most 26.
The elegance of this solution lies in its simplicity - by trying all possible minimum frequencies and calculating the cost for each, we guarantee finding the optimal solution without complex optimization techniques.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with a concrete example:
Input: word = "aaabbbcc"
, k = 1
Step 1: Count Character Frequencies
- 'a' appears 3 times
- 'b' appears 3 times
- 'c' appears 2 times
- So
nums = [3, 3, 2]
Step 2: Check if Already k-special
- Maximum difference:
|3 - 2| = 1 โค k
, so the string is already k-special! - But let's see how the algorithm confirms this.
Step 3: Try All Possible Minimum Frequencies
For each value v
from 0 to 8 (length of string), calculate deletions needed:
-
v = 0: Range is [0, 1]
- 'a' (freq=3): 3 > 1, delete 3-1 = 2 characters
- 'b' (freq=3): 3 > 1, delete 3-1 = 2 characters
- 'c' (freq=2): 2 > 1, delete 2-1 = 1 character
- Total deletions: 2 + 2 + 1 = 5
-
v = 1: Range is [1, 2]
- 'a' (freq=3): 3 > 2, delete 3-2 = 1 character
- 'b' (freq=3): 3 > 2, delete 3-2 = 1 character
- 'c' (freq=2): 2 is in range, delete 0
- Total deletions: 1 + 1 + 0 = 2
-
v = 2: Range is [2, 3]
- 'a' (freq=3): 3 is in range, delete 0
- 'b' (freq=3): 3 is in range, delete 0
- 'c' (freq=2): 2 is in range, delete 0
- Total deletions: 0 + 0 + 0 = 0 โ
-
v = 3: Range is [3, 4]
- 'a' (freq=3): 3 is in range, delete 0
- 'b' (freq=3): 3 is in range, delete 0
- 'c' (freq=2): 2 < 3, delete all 2 characters
- Total deletions: 0 + 0 + 2 = 2
-
v โฅ 4: All frequencies are below v, so we'd delete everything (8 deletions)
Step 4: Return Minimum The minimum deletions across all values is 0 (at v=2), confirming the string is already k-special.
Let's trace through a second example where deletions are needed:
Input: word = "aaaaabbc"
, k = 1
- Frequencies: 'a'=5, 'b'=2, 'c'=1, so
nums = [5, 2, 1]
- Maximum difference:
|5 - 1| = 4 > 1
, so we need deletions
Testing key values of v:
- v = 1: Range [1, 2] โ Delete 3 'a's (5โ2), keep 'b's, keep 'c' โ Total: 3
- v = 2: Range [2, 3] โ Delete 2 'a's (5โ3), keep 'b's, delete all 'c' โ Total: 3
The minimum is 3 deletions, resulting in a string like "aabbc" where all frequencies are within range [1, 2].
Solution Implementation
1from collections import Counter
2from typing import List
3
4class Solution:
5 def minimumDeletions(self, word: str, k: int) -> int:
6 """
7 Find minimum deletions to make all character frequencies differ by at most k.
8
9 Args:
10 word: Input string
11 k: Maximum allowed difference between any two character frequencies
12
13 Returns:
14 Minimum number of characters to delete
15 """
16
17 def calculate_deletions_for_target(target_freq: int) -> int:
18 """
19 Calculate deletions needed if we set minimum frequency to target_freq.
20 All frequencies must be in range [target_freq, target_freq + k].
21
22 Args:
23 target_freq: The target minimum frequency for characters
24
25 Returns:
26 Number of deletions required
27 """
28 deletions = 0
29
30 for current_freq in character_frequencies:
31 if current_freq < target_freq:
32 # Delete all occurrences of this character
33 deletions += current_freq
34 elif current_freq > target_freq + k:
35 # Delete excess occurrences to bring frequency to target_freq + k
36 deletions += current_freq - target_freq - k
37 # If frequency is in [target_freq, target_freq + k], no deletion needed
38
39 return deletions
40
41 # Get frequency count for each character in the word
42 character_frequencies = list(Counter(word).values())
43
44 # Try all possible target frequencies from 0 to word length
45 # and find the minimum deletions required
46 min_deletions = min(
47 calculate_deletions_for_target(target_freq)
48 for target_freq in range(len(word) + 1)
49 )
50
51 return min_deletions
52
1class Solution {
2 // List to store non-zero character frequencies
3 private List<Integer> characterFrequencies = new ArrayList<>();
4
5 /**
6 * Finds minimum deletions to make all character frequencies differ by at most k
7 * @param word The input string
8 * @param k Maximum allowed difference between any two character frequencies
9 * @return Minimum number of characters to delete
10 */
11 public int minimumDeletions(String word, int k) {
12 // Array to count frequency of each letter (a-z)
13 int[] frequencyCount = new int[26];
14 int wordLength = word.length();
15
16 // Count frequency of each character in the word
17 for (int i = 0; i < wordLength; i++) {
18 frequencyCount[word.charAt(i) - 'a']++;
19 }
20
21 // Store only non-zero frequencies in the list
22 for (int frequency : frequencyCount) {
23 if (frequency > 0) {
24 characterFrequencies.add(frequency);
25 }
26 }
27
28 // Try all possible target frequencies from 0 to wordLength
29 int minimumDeletions = wordLength;
30 for (int targetFrequency = 0; targetFrequency <= wordLength; targetFrequency++) {
31 minimumDeletions = Math.min(minimumDeletions, calculateDeletions(targetFrequency, k));
32 }
33
34 return minimumDeletions;
35 }
36
37 /**
38 * Calculates deletions needed if we want all frequencies in range [targetFreq, targetFreq + k]
39 * @param targetFrequency The minimum frequency we want to maintain
40 * @param k Maximum allowed difference from targetFrequency
41 * @return Number of deletions needed for this target frequency
42 */
43 private int calculateDeletions(int targetFrequency, int k) {
44 int deletionsNeeded = 0;
45
46 for (int currentFrequency : characterFrequencies) {
47 if (currentFrequency < targetFrequency) {
48 // Delete all occurrences if frequency is below target
49 deletionsNeeded += currentFrequency;
50 } else if (currentFrequency > targetFrequency + k) {
51 // Delete excess occurrences if frequency exceeds target + k
52 deletionsNeeded += currentFrequency - targetFrequency - k;
53 }
54 // If frequency is in range [targetFrequency, targetFrequency + k], no deletions needed
55 }
56
57 return deletionsNeeded;
58 }
59}
60
1class Solution {
2public:
3 int minimumDeletions(string word, int k) {
4 // Count frequency of each character
5 int charFrequency[26]{};
6 for (char& c : word) {
7 ++charFrequency[c - 'a'];
8 }
9
10 // Collect non-zero frequencies into a vector
11 vector<int> frequencies;
12 for (int freq : charFrequency) {
13 if (freq > 0) {
14 frequencies.push_back(freq);
15 }
16 }
17
18 int totalLength = word.size();
19 int minDeletions = totalLength;
20
21 // Lambda function to calculate deletions needed for a target minimum frequency
22 auto calculateDeletions = [&](int targetMinFreq) {
23 int deletions = 0;
24 for (int freq : frequencies) {
25 if (freq < targetMinFreq) {
26 // Delete all occurrences if frequency is less than target
27 deletions += freq;
28 } else if (freq > targetMinFreq + k) {
29 // Delete excess occurrences to maintain max difference of k
30 deletions += freq - targetMinFreq - k;
31 }
32 // If freq is in range [targetMinFreq, targetMinFreq + k], no deletions needed
33 }
34 return deletions;
35 };
36
37 // Try all possible minimum frequencies from 0 to totalLength
38 for (int minFreq = 0; minFreq <= totalLength; ++minFreq) {
39 minDeletions = min(minDeletions, calculateDeletions(minFreq));
40 }
41
42 return minDeletions;
43 }
44};
45
1/**
2 * Calculates the minimum number of character deletions needed to make the word "k-special"
3 * A word is k-special if the absolute difference between the frequencies of any two characters is at most k
4 * @param word - The input string to process
5 * @param k - The maximum allowed difference between character frequencies
6 * @returns The minimum number of deletions required
7 */
8function minimumDeletions(word: string, k: number): number {
9 // Initialize frequency array for 26 lowercase letters
10 const charFrequencies: number[] = Array(26).fill(0);
11
12 // Count the frequency of each character in the word
13 for (const char of word) {
14 const charIndex: number = char.charCodeAt(0) - 97; // Convert 'a'-'z' to 0-25
15 charFrequencies[charIndex]++;
16 }
17
18 // Filter out only the non-zero frequencies (characters that actually appear)
19 const actualFrequencies: number[] = charFrequencies.filter(frequency => frequency > 0);
20
21 /**
22 * Calculate the minimum deletions needed if we set the minimum frequency to targetMinFreq
23 * @param targetMinFreq - The target minimum frequency for characters we keep
24 * @returns Number of deletions needed for this target minimum frequency
25 */
26 const calculateDeletionsForTargetFrequency = (targetMinFreq: number): number => {
27 let totalDeletions: number = 0;
28
29 for (const currentFreq of actualFrequencies) {
30 if (currentFreq < targetMinFreq) {
31 // Delete all occurrences of characters with frequency less than target
32 totalDeletions += currentFreq;
33 } else if (currentFreq > targetMinFreq + k) {
34 // Delete excess occurrences to bring frequency within the allowed range
35 totalDeletions += currentFreq - targetMinFreq - k;
36 }
37 // Characters with frequency in range [targetMinFreq, targetMinFreq + k] need no deletions
38 }
39
40 return totalDeletions;
41 };
42
43 // Try all possible minimum frequencies from 0 to word length and find the minimum deletions
44 const possibleDeletions: number[] = Array.from(
45 { length: word.length + 1 },
46 (_, minFreq) => calculateDeletionsForTargetFrequency(minFreq)
47 );
48
49 return Math.min(...possibleDeletions);
50}
51
Time and Space Complexity
The time complexity is O(n ร |ฮฃ|)
, where n
is the length of the string word
and |ฮฃ|
represents the size of the character set (which is 26 for lowercase English letters).
Breaking down the analysis:
Counter(word).values()
takesO(n)
time to count character frequencies- The outer loop
for v in range(len(word) + 1)
iterates up ton + 1
times - For each value of
v
, the functionf(v)
iterates through all unique character frequencies innums
, which is at most|ฮฃ|
(since there can be at most 26 unique characters) - Therefore, the total time complexity is
O(n) + O(n ร |ฮฃ|) = O(n ร |ฮฃ|)
The space complexity is O(|ฮฃ|)
:
- The
Counter
object andnums
store at most|ฮฃ|
frequency values (one for each unique character in the string) - No other significant auxiliary space is used
- Thus, the space complexity is
O(|ฮฃ|)
, which equalsO(26) = O(1)
for this specific problem with lowercase English letters
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrectly Assuming We Need to Match a Specific Frequency
A common misconception is thinking we need to make all characters have the same frequency or match some existing frequency in the string. This leads to solutions that only consider existing frequencies as targets.
Wrong Approach:
def minimumDeletions(self, word: str, k: int) -> int:
frequencies = list(Counter(word).values())
min_deletions = float('inf')
# WRONG: Only considering existing frequencies as targets
for target in frequencies:
deletions = 0
for freq in frequencies:
if abs(freq - target) > k:
# This doesn't correctly handle the range [v, v+k]
deletions += abs(freq - target) - k
min_deletions = min(min_deletions, deletions)
return min_deletions
Why it's wrong: The problem allows frequencies to be in a range [v, v+k]
, not centered around a specific value. We need to consider all possible minimum values v
, not just existing frequencies.
2. Misunderstanding the Frequency Range Constraint
Another pitfall is misinterpreting how the k-special condition works, thinking characters must be within k
distance of some central frequency rather than all being within a range of size k
.
Wrong Approach:
def minimumDeletions(self, word: str, k: int) -> int:
frequencies = list(Counter(word).values())
avg_freq = sum(frequencies) // len(frequencies)
deletions = 0
for freq in frequencies:
# WRONG: Trying to bring everything close to average
if freq < avg_freq - k:
deletions += avg_freq - k - freq
elif freq > avg_freq + k:
deletions += freq - avg_freq - k
return deletions
Why it's wrong: The solution doesn't require centering around an average or median. We need to find an optimal range [v, v+k]
that minimizes deletions.
3. Off-by-One Errors in Range Iteration
A subtle but critical error is using the wrong upper bound when iterating through possible minimum frequencies.
Wrong Approach:
def minimumDeletions(self, word: str, k: int) -> int:
frequencies = list(Counter(word).values())
# WRONG: Should be range(len(word) + 1), not range(max(frequencies) + 1)
return min(calculate_cost(v) for v in range(max(frequencies) + 1))
Why it's wrong: While iterating only up to max(frequencies)
might seem logical, we could miss optimal solutions where v
is higher than the current maximum frequency. The correct upper bound is len(word) + 1
because that's the theoretical maximum any character could appear.
4. Incorrect Deletion Calculation for Out-of-Range Frequencies
When a frequency exceeds v + k
, some might incorrectly calculate the deletions needed.
Wrong Approach:
def calculate_deletions(v: int) -> int:
deletions = 0
for freq in frequencies:
if freq < v:
deletions += freq
elif freq > v + k:
# WRONG: Should delete (freq - v - k), not just k
deletions += k
return deletions
Correct Approach:
When frequency > v + k, we need to delete exactly freq - (v + k)
characters to bring it down to v + k
, not some fixed amount.
Solution to Avoid These Pitfalls:
- Always enumerate all possible minimum frequencies from 0 to the string length
- Understand the range constraint means all frequencies must fit within a window of size k
- Carefully calculate deletions:
- Below range: delete all
- Above range: delete excess to reach upper bound
- Within range: keep as is
- Test with edge cases like single character strings, k=0, and strings where all characters have the same frequency
In a binary min heap, the minimum element can be found in:
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Want a Structured Path to Master System Design Too? Donโt Miss This!