3662. Filter Characters by Frequency

EasyHash TableStringCounting

Problem Description

You are given a string s containing only lowercase English letters and an integer k.

Your task is to create a new string by filtering characters from s. A character should be included in the new string if and only if it appears fewer than k times in the entire original string s.

The key requirements are:

Count how many times each character appears in the entire string s
Include a character in the result only if its total count is less than k
Every occurrence of qualifying characters must be kept (not just one copy)
The relative order of characters must be preserved from the original string

For example, if s = "aabbcc" and k = 3:

'a' appears 2 times (less than 3) → include all 'a's
'b' appears 2 times (less than 3) → include all 'b's
'c' appears 2 times (less than 3) → include all 'c's
Result: "aabbcc"

If s = "aaabbc" and k = 3:

'a' appears 3 times (not less than 3) → exclude all 'a's
'b' appears 2 times (less than 3) → include all 'b's
'c' appears 1 time (less than 3) → include all 'c's
Result: "bbc"

Return the filtered string. If no characters have a count less than k, return an empty string.

Quick Interview Experience

Help others by sharing your interview experience

Have you seen this problem before?

Intuition

The problem asks us to filter characters based on their frequency in the string. To determine which characters to keep, we need to know how many times each character appears in total. This immediately suggests a two-pass approach.

Think about it this way: when we encounter a character at any position in the string, we can't decide whether to include it or not without knowing its total count across the entire string. For instance, if we see an 'a' at position 0, we don't know if there are more 'a's later that would push the total count to k or beyond.

This leads us to the natural solution structure:

First pass: Count everything. We need to scan the entire string once to build a frequency map. This gives us complete information about how many times each character appears.
Second pass: Make decisions. Now that we know the total count for each character, we can go through the string again and decide for each character whether to include it in our result. If its total count (which we now know from step 1) is less than k, we include it; otherwise, we skip it.

The key insight is that we must preserve the original order of characters. This means we can't just iterate through our frequency map and reconstruct the string - we need to iterate through the original string in order, checking each character against our frequency map to decide inclusion.

Using a Counter (or hash table) for the frequency map gives us O(1) lookups during the second pass, making the overall solution efficient at O(n) time complexity where n is the length of the string.

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def filterCharacters(self, s: str, k: int) -> str:
5        # Count the frequency of each character in the string
6        char_count = Counter(s)
7
8        # Initialize result list to store characters that appear less than k times
9        result = []
10
11        # Iterate through each character in the original string
12        for char in s:
13            # Keep characters that appear less than k times
14            if char_count[char] < k:
15                result.append(char)
16
17        # Join the filtered characters into a string and return
18        return "".join(result)
19

1class Solution {
2    /**
3     * Filters characters from a string based on their frequency.
4     * Returns a new string containing only characters that appear less than k times.
5     *
6     * @param s the input string containing lowercase English letters
7     * @param k the frequency threshold
8     * @return a string with characters that appear less than k times in original order
9     */
10    public String filterCharacters(String s, int k) {
11        // Array to store frequency count of each lowercase letter (a-z)
12        int[] frequencyCount = new int[26];
13
14        // Count the frequency of each character in the string
15        for (char character : s.toCharArray()) {
16            // Map character to array index (0-25) and increment count
17            frequencyCount[character - 'a']++;
18        }
19
20        // StringBuilder to efficiently build the result string
21        StringBuilder result = new StringBuilder();
22
23        // Iterate through the original string to maintain character order
24        for (char character : s.toCharArray()) {
25            // Include character if its frequency is less than threshold k
26            if (frequencyCount[character - 'a'] < k) {
27                result.append(character);
28            }
29        }
30
31        // Convert StringBuilder to String and return
32        return result.toString();
33    }
34}
35

1class Solution {
2public:
3    string filterCharacters(string s, int k) {
4        // Initialize frequency array for 26 lowercase letters
5        int charFrequency[26] = {0};
6
7        // Count the frequency of each character in the string
8        for (char ch : s) {
9            charFrequency[ch - 'a']++;
10        }
11
12        // Build result string with characters that appear less than k times
13        string result;
14        for (char ch : s) {
15            if (charFrequency[ch - 'a'] < k) {
16                result.push_back(ch);
17            }
18        }
19
20        return result;
21    }
22};
23

1/**
2 * Filters characters from a string, keeping only those that appear less than k times
3 * @param s - The input string to filter
4 * @param k - The threshold for character frequency
5 * @returns A new string containing only characters that appear less than k times
6 */
7function filterCharacters(s: string, k: number): string {
8    // Create a frequency map to count occurrences of each character
9    const characterFrequency: Record<string, number> = {};
10
11    // Count the frequency of each character in the string
12    for (const character of s) {
13        characterFrequency[character] = (characterFrequency[character] || 0) + 1;
14    }
15
16    // Array to store characters that meet the filter criteria
17    const filteredCharacters: string[] = [];
18
19    // Iterate through the original string and keep characters with frequency < k
20    for (const character of s) {
21        if (characterFrequency[character] < k) {
22            filteredCharacters.push(character);
23        }
24    }
25
26    // Join the filtered characters into a single string and return
27    return filteredCharacters.join('');
28}
29

Solution Approach

The solution follows the counting approach we identified in our intuition. Let's walk through the implementation step by step:

Step 1: Count Character Frequencies

cnt = Counter(s)

We use Python's Counter from the collections module to count the frequency of each character in the string s. This creates a dictionary-like object where:

Keys are the unique characters in s
Values are the number of times each character appears

For example, if s = "aabbcc", then cnt would be {'a': 2, 'b': 2, 'c': 2}.

Step 2: Build the Result String

ans = []
for c in s:
    if cnt[c] < k:
        ans.append(c)

We iterate through the original string s character by character. For each character c:

We check if its total count cnt[c] is less than k
If yes, we append this character to our result list ans
If no, we skip it and move to the next character

This approach naturally preserves the original order of characters since we're iterating through s from left to right.

Step 3: Return the Result

return "".join(ans)

Finally, we join all characters in the ans list into a single string and return it. If no characters qualified (all had frequency ≥ k), the list would be empty and we'd return an empty string.

Time and Space Complexity:

Time Complexity: O(n) where n is the length of the string. We make two passes through the string - one for counting and one for filtering.
Space Complexity: O(1) for the counter (at most 26 lowercase English letters) plus O(n) for the result list in the worst case where all characters qualify.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the solution with s = "programming" and k = 2.

Step 1: Count Character Frequencies

First, we count how many times each character appears in "programming":

'p': 1 time
'r': 2 times
'o': 1 time
'g': 2 times
'a': 1 time
'm': 2 times
'i': 1 time
'n': 1 time

So our frequency map is: {'p': 1, 'r': 2, 'o': 1, 'g': 2, 'a': 1, 'm': 2, 'i': 1, 'n': 1}

Step 2: Filter Characters Based on Count

Now we go through "programming" character by character and check if each character's count is less than k = 2:

'p': count is 1 < 2 ✓ Include it → result: "p"
'r': count is 2 ≮ 2 ✗ Skip it → result: "p"
'o': count is 1 < 2 ✓ Include it → result: "po"
'g': count is 2 ≮ 2 ✗ Skip it → result: "po"
'r': count is 2 ≮ 2 ✗ Skip it → result: "po"
'a': count is 1 < 2 ✓ Include it → result: "poa"
'm': count is 2 ≮ 2 ✗ Skip it → result: "poa"
'm': count is 2 ≮ 2 ✗ Skip it → result: "poa"
'i': count is 1 < 2 ✓ Include it → result: "poai"
'n': count is 1 < 2 ✓ Include it → result: "poain"
'g': count is 2 ≮ 2 ✗ Skip it → result: "poain"

Step 3: Return Result

The final filtered string is "poain".

Notice how:

Characters 'r', 'g', and 'm' all appear exactly 2 times (not less than k=2), so ALL occurrences of these characters are excluded
Characters 'p', 'o', 'a', 'i', 'n' each appear once (less than k=2), so they are all included
The relative order from the original string is preserved: 'p' came before 'o', which came before 'a', and so on

Time and Space Complexity

The time complexity is O(n), where n is the length of the string s. This is because:

Creating the Counter object requires one pass through the string: O(n)
The for loop iterates through each character in the string once: O(n)
Inside the loop, checking cnt[c] and appending to the list are both O(1) operations
The final join operation is O(n) as it processes all characters that were added to the list
Total: O(n) + O(n) + O(n) = O(n)

The space complexity is O(|Σ| + n), where |Σ| is the size of the character set (number of distinct characters in s) and n is the length of the string. This is because:

The Counter object stores at most |Σ| key-value pairs: O(|Σ|)
The ans list can store at most n characters in the worst case (when all characters have frequency less than k): O(n)
However, if we consider only the auxiliary space excluding the output, the space complexity is O(|Σ|), which matches the reference answer

Common Pitfalls

Pitfall 1: Filtering Only Unique Characters Instead of All Occurrences

The Mistake: A common misunderstanding is to include only one instance of each qualifying character in the result, rather than preserving all occurrences.

Incorrect Implementation:

def filterCharacters(self, s: str, k: int) -> str:
    char_count = Counter(s)
    result = []

    # WRONG: Only adds each character once
    for char in char_count:
        if char_count[char] < k:
            result.append(char)

    return "".join(result)

Why It's Wrong: If s = "aabbcc" and k = 3, this would return "abc" instead of "aabbcc". The problem requires keeping every occurrence of qualifying characters.

Correct Approach: Iterate through the original string and check each character individually, preserving all occurrences:

for char in s:  # Iterate through original string, not the counter
    if char_count[char] < k:
        result.append(char)

Pitfall 2: Misunderstanding the Comparison (< vs <=)

The Mistake: Using "less than or equal to" instead of strictly "less than" when comparing with k.

Incorrect Implementation:

def filterCharacters(self, s: str, k: int) -> str:
    char_count = Counter(s)
    result = []

    for char in s:
        # WRONG: Should be < not <=
        if char_count[char] <= k:
            result.append(char)

    return "".join(result)

Why It's Wrong: If s = "aaabbc" and k = 3, using <= would incorrectly include the three 'a's in the result, giving "aaabbc" instead of "bbc".

Correct Approach: Use strict inequality (<) to ensure characters appearing exactly k times are excluded:

if char_count[char] < k:  # Strictly less than k
    result.append(char)

Pitfall 3: Not Preserving Original Order

The Mistake: Reconstructing the string based on the order of characters in the counter or sorted order, rather than maintaining the original sequence.

Incorrect Implementation:

def filterCharacters(self, s: str, k: int) -> str:
    char_count = Counter(s)
    result = []

    # WRONG: Iterates in arbitrary/sorted order
    for char in sorted(char_count.keys()):
        if char_count[char] < k:
            result.extend([char] * char_count[char])

    return "".join(result)

Why It's Wrong: If s = "bcaba" and k = 2, this would return "abc" (sorted order) instead of "bcb" (original order preserved).

Correct Approach: Always iterate through the original string to maintain the relative positions:

for char in s:  # Maintains original order
    if char_count[char] < k:
        result.append(char)

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which of these properties could exist for a graph but not a tree?

Contain cycles

There are multiple ways to traverse

Connected

Has no cycle

I don't know

3662. Filter Characters by Frequency

Problem Description

Intuition

Solution Implementation

Solution Approach

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Example Walkthrough

Time and Space Complexity

Common Pitfalls

Pitfall 1: Filtering Only Unique Characters Instead of All Occurrences

Pitfall 2: Misunderstanding the Comparison (< vs <=)

Pitfall 3: Not Preserving Original Order

Recommended Readings