451. Sort Characters By Frequency

MediumHash TableStringBucket SortCountingSortingHeap (Priority Queue)
Leetcode Link

Problem Description

The problem requires us to tackle a string sorting task based on a non-standard criterion: the frequency of each character in the string. Specifically, the goal is to reorder the string so that the characters that occur most frequently are placed first. If two characters have the same frequency, they can be in any order with respect to each other. The final output must be a string where the sorted order reflects these frequency criteria.

Intuition

The intuitive approach to this problem involves counting how often each character appears in the string, then sorting the characters based on these counts. This is typically a two-step process:

  1. Count the occurrences: We need to go through the string and count the occurrences of each character. This can be done efficiently by using a hash table or a counter data structure where each character is a key, and its count is the value.

  2. Sort based on counts: Once we have the counts, the next step is to sort the characters by these counts in descending order. We want the characters with higher counts to come first.

With these counts, we can construct a new string. We do this by iterating over each unique character, repeating the character by its count (since sorting by frequency means if a character appears ( n ) times, it should also appear ( n ) times consecutively in the final string), and concatenating these repetitions to form the final string.

In the provided solution:

  • The Counter from the collections module is used to count occurrences of each character.
  • The sorted() function sorts the items in the counter by their counts (values), with the sort being in descending order because of the negative sign in the sort key -x[1].
  • The sorted items are then concatenated to create the final string through a string join operation, which combines the characters multiplied by their frequencies.

This method is consistent with the requirements and efficiently achieves the sorting based on frequency.

Learn more about Sorting and Heap (Priority Queue) patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

How many ways can you arrange the three letters A, B and C?

Solution Approach

The solution makes use of a few key concepts in Python to address the problem:

  • Counter Data Structure: The Counter class from the collections module is perfect for counting the frequency of characters because it automatically builds a hash map (dictionary) where characters are keys and their counts are values.

    1cnt = Counter(s)

    Here, s is the input string, and cnt becomes a Counter object holding counts of each character.

  • Sorting by Frequency: The sorted() function is used to sort the characters based on their frequency.

    1sorted(cnt.items(), key=lambda x: -x[1])

    cnt.items() provides a sequence of (character, count) pairs. The key argument to sorted() specifies that sorting should be based on the count, which is the second item in each pair (x[1]). The negative sign ensures that the sorting is in decreasing order of frequency.

  • String Joining and Character Multiplication: Python's expressive syntax allows us to multiply a string by an integer to repeat it, and the join() method of a string allows us to concatenate an iterable of strings:

    1return ''.join(c * v for c, v in sorted(cnt.items(), key=lambda x: -x[1]))

    For each character c and its count v, c * v creates a string where c is repeated v times. The join() method is used to concatenate all these strings together without any separator (''), creating the final sorted string.

These steps are combined into a concise one-liner inside the frequencySort method of the Solution class. This is efficient because it leverages Python's built-in data structures and functions that are implemented in C under the hood, thus being quite fast.

1class Solution:
2    def frequencySort(self, s: str) -> str:
3        cnt = Counter(s)
4        return ''.join(c * v for c, v in sorted(cnt.items(), key=lambda x: -x[1]))

The approach works well and guarantees that the most frequent characters will be placed at the beginning of the resulting string, while less frequent characters will follow, adhering to the problem's constraints.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which data structure is used to implement recursion?

Example Walkthrough

Let's run through a small example to illustrate how the solution approach works. Suppose our input string is s = "tree".

  1. Count the occurrences of each character: We use the Counter class to count the characters.

    1from collections import Counter
    2cnt = Counter("tree")
    3# cnt is now Counter({'t': 1, 'r': 1, 'e': 2})

    In the string "tree", the character 'e' appears twice, while 't' and 'r' each appear once.

  2. Sort characters by frequency: We then use the sorted() function to sort these characters by their frequency in descending order.

    1sorted_characters = sorted(cnt.items(), key=lambda x: -x[1])
    2# sorted_characters is now [('e', 2), ('t', 1), ('r', 1)]

    Here, we use a lambda function as the key to sort by the counts—the negative sign ensures it is in descending order.

  3. Construct the new string based on frequency: Finally, we iterate over the sorted character-count pairs and repeat each character by its frequency, joining them to form the final result.

    1result_string = ''.join(c * v for c, v in sorted_characters)
    2# result_string is "eett" or "eetr" or "tree" or "ttee", etc.

    Since 'e' has the highest frequency, it comes first. 't' and 'r' have the same frequency, so their order with respect to each other does not matter in the final output. The result can be "eett", "eetr", "tree", or "ttee" because the order of characters with the same frequency is not specified.

Putting all this within the class method frequencySort would look like this:

1class Solution:
2    def frequencySort(self, s: str) -> str:
3        cnt = Counter(s)
4        return ''.join(c * v for c, v in sorted(cnt.items(), key=lambda x: -x[1]))

By applying this method to our example:

1solution = Solution()
2print(solution.frequencySort("tree"))

We will get a string that has 'e' characters first because they have the highest frequency, followed by 't' and 'r' in any order, which may result in one of the possible outcomes such as "eett", "eetr", "tree", or "ttee".

Solution Implementation

1from collections import Counter
2
3class Solution:
4    def frequencySort(self, s: str) -> str:
5        # Count the frequency of each character in the input string
6        char_frequency = Counter(s)
7      
8        # Sort the characters based on frequency in descending order
9        sorted_characters = sorted(char_frequency.items(), key=lambda item: -item[1])
10      
11        # Create a string with characters repeated by their frequency
12        frequency_sorted_string = ''.join(character * frequency for character, frequency in sorted_characters)
13      
14        return frequency_sorted_string
15
16# Example usage:
17# sol = Solution()
18# result = sol.frequencySort("tree")
19# print(result)  # Outputs a string with characters sorted by frequency, e.g. "eetr"
20
1import java.util.*;
2
3class Solution {
4    public String frequencySort(String s) {
5        // Initialize a hash map to store frequency of each character
6        Map<Character, Integer> frequencyMap = new HashMap<>(52);
7
8        // Loop through all the characters in the string to fill the frequency map
9        for (int i = 0; i < s.length(); ++i) {
10            // Merge the current character into the map, increasing its count by 1
11            frequencyMap.merge(s.charAt(i), 1, Integer::sum);
12        }
13
14        // Create a list to store the characters (for sorting purposes)
15        List<Character> characters = new ArrayList<>(frequencyMap.keySet());
16      
17        // Sort the character list based on their frequencies in descending order
18        characters.sort((a, b) -> frequencyMap.get(b) - frequencyMap.get(a));
19      
20        // Use StringBuilder to build the result string
21        StringBuilder sortedString = new StringBuilder();
22      
23        // Loop through the sorted list of characters
24        for (char c : characters) {
25            // Append each character to the StringBuilder based on its frequency
26            for (int frequency = frequencyMap.get(c); frequency > 0; --frequency) {
27                sortedString.append(c);
28            }
29        }
30      
31        // Return the sorted string
32        return sortedString.toString();
33    }
34}
35
1#include <string>
2#include <unordered_map>
3#include <vector>
4#include <algorithm>
5
6class Solution {
7public:
8    // Function to sort characters by frequency of appearance in a string
9    string frequencySort(string s) {
10        // Create a hash map to store the frequency of appearance of each character
11        unordered_map<char, int> frequencyMap;
12        // Calculate the frequency of each character in the string
13        for (char ch : s) {
14            ++frequencyMap[ch];
15        }
16
17        // Create a vector to store unique characters
18        vector<char> uniqueChars;
19        // Populate the vector with the keys from the frequencyMap
20        for (auto& keyValue : frequencyMap) {
21            uniqueChars.push_back(keyValue.first);
22        }
23
24        // Sort the unique characters based on their frequency
25        sort(uniqueChars.begin(), uniqueChars.end(), [&](char a, char b) {
26            return frequencyMap[a] > frequencyMap[b];
27        });
28
29        // Create a result string to store the sorted characters by frequency
30        string result;
31        // Go through each character and append it to the result string, multiplied by its frequency
32        for (char ch : uniqueChars) {
33            result += string(frequencyMap[ch], ch);
34        }
35
36        // Return the result string
37        return result;
38    }
39};
40
1// Function to sort the characters in a string by frequency of appearance in descending order.
2function frequencySort(s: string): string {
3    // Create a map to hold character frequencies.
4    const charFrequencyMap: Map<string, number> = new Map();
5
6    // Iterate over each character in the input string.
7    for (const char of s) {
8        // Update the frequency count for each character.
9        charFrequencyMap.set(char, (charFrequencyMap.get(char) || 0) + 1);
10    }
11
12    // Convert map keys to an array, sort the array by frequency in descending order.
13    const sortedCharacters = Array.from(charFrequencyMap.keys()).sort(
14        (charA, charB) => charFrequencyMap.get(charB)! - charFrequencyMap.get(charA)!
15    );
16
17    // Initialize an array to hold the sorted characters by frequency.
18    const sortedArray: string[] = [];
19
20    // Build a string for each character, repeated by its frequency.
21    for (const char of sortedCharacters) {
22        sortedArray.push(char.repeat(charFrequencyMap.get(char)!));
23    }
24
25    // Join the array of strings into a single string and return it.
26    return sortedArray.join('');
27}
28
Not Sure What to Study? Take the 2-min Quiz:

What is the worst case running time for finding an element in a binary search tree(not necessarily balanced) of size n?

Time and Space Complexity

Time Complexity:

The time complexity of the provided code can be analyzed as follows:

  1. Counting the frequency of each character - The Counter from the collections module iterates through the string s once to count the frequency of each character. This operation has a time complexity of O(n), where n is the length of the string s.

  2. Sorting the counted characters - The sorted function is used to sort the items in the counter based on their frequency (the value in the key-value pair). Sorting in python is typically implemented with the Timsort algorithm, which has a time complexity of O(m log m), where m is the number of unique characters in the string s.

Overall, the dominating factor is the sort operation, so the total time complexity is O(m log m + n). However, since m can be at most n in cases where all characters are unique, the time complexity is often described as O(n log n) for practical worst-case scenarios.

Space Complexity:

The space complexity of the code is analyzed as follows:

  1. Counter dictionary - The Counter constructs a dictionary with m entries, where m is the number of unique characters in the string s. This space usage is O(m).

  2. Sorted list - The sorted function generates a list of tuples which is essentially the items of the counter sorted based on their frequency. This also takes O(m) space.

  3. Output string - The output string is formed by joining individual characters multiplied by their frequency. The length of the resulting string is equal to n, the length of the input string. Hence, the space required for the output string is O(n).

Since m can be at most n, the overall space complexity is O(n) considering the storage for the output string and the data structures used for counting and sorting.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Consider the classic dynamic programming of longest increasing subsequence:

Find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.

For example, the length of LIS for [50, 3, 10, 7, 40, 80] is 4 and LIS is [3, 7, 40, 80].

What is the recurrence relation?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫