3662. Filter Characters by Frequency 🔒
Problem Description
You are given a string s
containing only lowercase English letters and an integer k
.
Your task is to create a new string by filtering characters from s
. A character should be included in the new string if and only if it appears fewer than k
times in the entire original string s
.
The key requirements are:
- Count how many times each character appears in the entire string
s
- Include a character in the result only if its total count is less than
k
- Every occurrence of qualifying characters must be kept (not just one copy)
- The relative order of characters must be preserved from the original string
For example, if s = "aabbcc"
and k = 3
:
- 'a' appears 2 times (less than 3) → include all 'a's
- 'b' appears 2 times (less than 3) → include all 'b's
- 'c' appears 2 times (less than 3) → include all 'c's
- Result:
"aabbcc"
If s = "aaabbc"
and k = 3
:
- 'a' appears 3 times (not less than 3) → exclude all 'a's
- 'b' appears 2 times (less than 3) → include all 'b's
- 'c' appears 1 time (less than 3) → include all 'c's
- Result:
"bbc"
Return the filtered string. If no characters have a count less than k
, return an empty string.
Intuition
The problem asks us to filter characters based on their frequency in the string. To determine which characters to keep, we need to know how many times each character appears in total. This immediately suggests a two-pass approach.
Think about it this way: when we encounter a character at any position in the string, we can't decide whether to include it or not without knowing its total count across the entire string. For instance, if we see an 'a' at position 0, we don't know if there are more 'a's later that would push the total count to k
or beyond.
This leads us to the natural solution structure:
-
First pass: Count everything. We need to scan the entire string once to build a frequency map. This gives us complete information about how many times each character appears.
-
Second pass: Make decisions. Now that we know the total count for each character, we can go through the string again and decide for each character whether to include it in our result. If its total count (which we now know from step 1) is less than
k
, we include it; otherwise, we skip it.
The key insight is that we must preserve the original order of characters. This means we can't just iterate through our frequency map and reconstruct the string - we need to iterate through the original string in order, checking each character against our frequency map to decide inclusion.
Using a Counter
(or hash table) for the frequency map gives us O(1)
lookups during the second pass, making the overall solution efficient at O(n)
time complexity where n
is the length of the string.
Solution Approach
The solution follows the counting approach we identified in our intuition. Let's walk through the implementation step by step:
Step 1: Count Character Frequencies
cnt = Counter(s)
We use Python's Counter
from the collections module to count the frequency of each character in the string s
. This creates a dictionary-like object where:
- Keys are the unique characters in
s
- Values are the number of times each character appears
For example, if s = "aabbcc"
, then cnt
would be {'a': 2, 'b': 2, 'c': 2}
.
Step 2: Build the Result String
ans = [] for c in s: if cnt[c] < k: ans.append(c)
We iterate through the original string s
character by character. For each character c
:
- We check if its total count
cnt[c]
is less thank
- If yes, we append this character to our result list
ans
- If no, we skip it and move to the next character
This approach naturally preserves the original order of characters since we're iterating through s
from left to right.
Step 3: Return the Result
return "".join(ans)
Finally, we join all characters in the ans
list into a single string and return it. If no characters qualified (all had frequency ≥ k
), the list would be empty and we'd return an empty string.
Time and Space Complexity:
- Time Complexity:
O(n)
wheren
is the length of the string. We make two passes through the string - one for counting and one for filtering. - Space Complexity:
O(1)
for the counter (at most 26 lowercase English letters) plusO(n)
for the result list in the worst case where all characters qualify.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the solution with s = "programming"
and k = 2
.
Step 1: Count Character Frequencies
First, we count how many times each character appears in "programming":
- 'p': 1 time
- 'r': 2 times
- 'o': 1 time
- 'g': 2 times
- 'a': 1 time
- 'm': 2 times
- 'i': 1 time
- 'n': 1 time
So our frequency map is: {'p': 1, 'r': 2, 'o': 1, 'g': 2, 'a': 1, 'm': 2, 'i': 1, 'n': 1}
Step 2: Filter Characters Based on Count
Now we go through "programming" character by character and check if each character's count is less than k = 2
:
- 'p': count is 1 < 2 ✓ Include it → result: "p"
- 'r': count is 2 ≮ 2 ✗ Skip it → result: "p"
- 'o': count is 1 < 2 ✓ Include it → result: "po"
- 'g': count is 2 ≮ 2 ✗ Skip it → result: "po"
- 'r': count is 2 ≮ 2 ✗ Skip it → result: "po"
- 'a': count is 1 < 2 ✓ Include it → result: "poa"
- 'm': count is 2 ≮ 2 ✗ Skip it → result: "poa"
- 'm': count is 2 ≮ 2 ✗ Skip it → result: "poa"
- 'i': count is 1 < 2 ✓ Include it → result: "poai"
- 'n': count is 1 < 2 ✓ Include it → result: "poain"
- 'g': count is 2 ≮ 2 ✗ Skip it → result: "poain"
Step 3: Return Result
The final filtered string is "poain"
.
Notice how:
- Characters 'r', 'g', and 'm' all appear exactly 2 times (not less than k=2), so ALL occurrences of these characters are excluded
- Characters 'p', 'o', 'a', 'i', 'n' each appear once (less than k=2), so they are all included
- The relative order from the original string is preserved: 'p' came before 'o', which came before 'a', and so on
Solution Implementation
1from collections import Counter
2
3class Solution:
4 def filterCharacters(self, s: str, k: int) -> str:
5 # Count the frequency of each character in the string
6 char_count = Counter(s)
7
8 # Initialize result list to store characters that appear less than k times
9 result = []
10
11 # Iterate through each character in the original string
12 for char in s:
13 # Keep characters that appear less than k times
14 if char_count[char] < k:
15 result.append(char)
16
17 # Join the filtered characters into a string and return
18 return "".join(result)
19
1class Solution {
2 /**
3 * Filters characters from a string based on their frequency.
4 * Returns a new string containing only characters that appear less than k times.
5 *
6 * @param s the input string containing lowercase English letters
7 * @param k the frequency threshold
8 * @return a string with characters that appear less than k times in original order
9 */
10 public String filterCharacters(String s, int k) {
11 // Array to store frequency count of each lowercase letter (a-z)
12 int[] frequencyCount = new int[26];
13
14 // Count the frequency of each character in the string
15 for (char character : s.toCharArray()) {
16 // Map character to array index (0-25) and increment count
17 frequencyCount[character - 'a']++;
18 }
19
20 // StringBuilder to efficiently build the result string
21 StringBuilder result = new StringBuilder();
22
23 // Iterate through the original string to maintain character order
24 for (char character : s.toCharArray()) {
25 // Include character if its frequency is less than threshold k
26 if (frequencyCount[character - 'a'] < k) {
27 result.append(character);
28 }
29 }
30
31 // Convert StringBuilder to String and return
32 return result.toString();
33 }
34}
35
1class Solution {
2public:
3 string filterCharacters(string s, int k) {
4 // Initialize frequency array for 26 lowercase letters
5 int charFrequency[26] = {0};
6
7 // Count the frequency of each character in the string
8 for (char ch : s) {
9 charFrequency[ch - 'a']++;
10 }
11
12 // Build result string with characters that appear less than k times
13 string result;
14 for (char ch : s) {
15 if (charFrequency[ch - 'a'] < k) {
16 result.push_back(ch);
17 }
18 }
19
20 return result;
21 }
22};
23
1/**
2 * Filters characters from a string, keeping only those that appear less than k times
3 * @param s - The input string to filter
4 * @param k - The threshold for character frequency
5 * @returns A new string containing only characters that appear less than k times
6 */
7function filterCharacters(s: string, k: number): string {
8 // Create a frequency map to count occurrences of each character
9 const characterFrequency: Record<string, number> = {};
10
11 // Count the frequency of each character in the string
12 for (const character of s) {
13 characterFrequency[character] = (characterFrequency[character] || 0) + 1;
14 }
15
16 // Array to store characters that meet the filter criteria
17 const filteredCharacters: string[] = [];
18
19 // Iterate through the original string and keep characters with frequency < k
20 for (const character of s) {
21 if (characterFrequency[character] < k) {
22 filteredCharacters.push(character);
23 }
24 }
25
26 // Join the filtered characters into a single string and return
27 return filteredCharacters.join('');
28}
29
Time and Space Complexity
The time complexity is O(n)
, where n
is the length of the string s
. This is because:
- Creating the Counter object requires one pass through the string:
O(n)
- The for loop iterates through each character in the string once:
O(n)
- Inside the loop, checking
cnt[c]
and appending to the list are bothO(1)
operations - The final
join
operation isO(n)
as it processes all characters that were added to the list - Total:
O(n) + O(n) + O(n) = O(n)
The space complexity is O(|Σ| + n)
, where |Σ|
is the size of the character set (number of distinct characters in s
) and n
is the length of the string. This is because:
- The Counter object stores at most
|Σ|
key-value pairs:O(|Σ|)
- The
ans
list can store at mostn
characters in the worst case (when all characters have frequency less thank
):O(n)
- However, if we consider only the auxiliary space excluding the output, the space complexity is
O(|Σ|)
, which matches the reference answer
Common Pitfalls
Pitfall 1: Filtering Only Unique Characters Instead of All Occurrences
The Mistake: A common misunderstanding is to include only one instance of each qualifying character in the result, rather than preserving all occurrences.
Incorrect Implementation:
def filterCharacters(self, s: str, k: int) -> str:
char_count = Counter(s)
result = []
# WRONG: Only adds each character once
for char in char_count:
if char_count[char] < k:
result.append(char)
return "".join(result)
Why It's Wrong:
If s = "aabbcc"
and k = 3
, this would return "abc"
instead of "aabbcc"
. The problem requires keeping every occurrence of qualifying characters.
Correct Approach: Iterate through the original string and check each character individually, preserving all occurrences:
for char in s: # Iterate through original string, not the counter if char_count[char] < k: result.append(char)
Pitfall 2: Misunderstanding the Comparison (< vs <=)
The Mistake:
Using "less than or equal to" instead of strictly "less than" when comparing with k
.
Incorrect Implementation:
def filterCharacters(self, s: str, k: int) -> str:
char_count = Counter(s)
result = []
for char in s:
# WRONG: Should be < not <=
if char_count[char] <= k:
result.append(char)
return "".join(result)
Why It's Wrong:
If s = "aaabbc"
and k = 3
, using <=
would incorrectly include the three 'a's in the result, giving "aaabbc"
instead of "bbc"
.
Correct Approach:
Use strict inequality (<
) to ensure characters appearing exactly k
times are excluded:
if char_count[char] < k: # Strictly less than k result.append(char)
Pitfall 3: Not Preserving Original Order
The Mistake: Reconstructing the string based on the order of characters in the counter or sorted order, rather than maintaining the original sequence.
Incorrect Implementation:
def filterCharacters(self, s: str, k: int) -> str:
char_count = Counter(s)
result = []
# WRONG: Iterates in arbitrary/sorted order
for char in sorted(char_count.keys()):
if char_count[char] < k:
result.extend([char] * char_count[char])
return "".join(result)
Why It's Wrong:
If s = "bcaba"
and k = 2
, this would return "abc"
(sorted order) instead of "bcb"
(original order preserved).
Correct Approach: Always iterate through the original string to maintain the relative positions:
for char in s: # Maintains original order if char_count[char] < k: result.append(char)
How would you design a stack which has a function min
that returns the minimum element in the stack, in addition to push
and pop
? All push
, pop
, min
should have running time O(1)
.
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!