Facebook Pixel

49. Group Anagrams

MediumArrayHash TableStringSorting
Leetcode Link

Problem Description

You are given an array of strings strs. Your task is to group all the anagrams together and return them as a list of lists. Anagrams are words that contain the same characters but in different orders (for example, "eat" and "tea" are anagrams).

The problem asks you to:

  1. Take an array of strings as input
  2. Identify which strings are anagrams of each other
  3. Group these anagrams together into separate lists
  4. Return all groups as a list of lists

For example, if the input is ["eat", "tea", "tan", "ate", "nat", "bat"]:

  • "eat", "tea", and "ate" are anagrams of each other (they all contain the letters 'a', 'e', 't')
  • "tan" and "nat" are anagrams of each other (they both contain 'a', 'n', 't')
  • "bat" has no anagrams in the list

So the output would be: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]

The order of the groups in the output doesn't matter, and the order of strings within each group also doesn't matter. The key requirement is that all anagrams must be grouped together.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is that anagrams share a common property: when their characters are sorted, they produce the exact same string. For instance, both "eat" and "tea" become "aet" when their characters are sorted alphabetically.

This observation leads us to think about using this sorted string as a unique identifier or "signature" for each group of anagrams. If two strings produce the same sorted string, they must be anagrams of each other.

We need a way to collect all strings that share the same signature. A hash table (dictionary) is perfect for this because:

  • We can use the sorted string as the key
  • We can store all original strings that map to this key as a list in the value

The process becomes straightforward:

  1. For each string, compute its sorted version (the signature)
  2. Use this signature to determine which group the string belongs to
  3. Add the string to its corresponding group in the hash table

By the end, each key in our hash table represents a unique anagram pattern, and its value contains all strings from the input that match this pattern. The final answer is simply all the values from our hash table.

This approach elegantly solves the grouping problem in a single pass through the input array, making it both intuitive and efficient.

Learn more about Sorting patterns.

Solution Approach

Let's walk through the implementation step by step using a hash table approach:

1. Initialize a Hash Table We use a defaultdict(list) in Python, which automatically creates an empty list for any new key we encounter. This prevents key errors when adding strings to groups.

2. Process Each String For each string s in the input array strs:

  • Sort the characters of the string: sorted(s) returns a list of characters in alphabetical order
  • Join them back into a string: ''.join(sorted(s)) creates our key
  • For example: "eat" β†’ ['a', 'e', 't'] β†’ "aet"

3. Group Strings by Their Sorted Key Using the sorted string as the key k, append the original string to the corresponding list in our hash table: d[k].append(s)

4. Build the Result After processing all strings, our hash table contains:

  • Keys: sorted strings (like "aet", "ant", "abt")
  • Values: lists of original strings that are anagrams

Return all the value lists: list(d.values())

Example Walkthrough: Given strs = ["eat", "tea", "tan", "ate", "nat", "bat"]:

StepCurrent StringSorted KeyHash Table State
1"eat""aet"{"aet": ["eat"]}
2"tea""aet"{"aet": ["eat", "tea"]}
3"tan""ant"{"aet": ["eat", "tea"], "ant": ["tan"]}
4"ate""aet"{"aet": ["eat", "tea", "ate"], "ant": ["tan"]}
5"nat""ant"{"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"]}
6"bat""abt"{"aet": ["eat", "tea", "ate"], "ant": ["tan", "nat"], "abt": ["bat"]}

Final output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]

The time complexity is O(n * k log k) where n is the number of strings and k is the maximum length of a string (due to sorting each string). The space complexity is O(n * k) for storing all strings in the hash table.

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example with strs = ["abc", "bca", "tan", "nat"] to illustrate the solution approach.

Step-by-step process:

  1. Initialize an empty hash table to store our groups.

  2. Process "abc":

    • Sort the characters: 'a', 'b', 'c' β†’ "abc"
    • Use "abc" as the key
    • Add "abc" to the hash table: {"abc": ["abc"]}
  3. Process "bca":

    • Sort the characters: 'b', 'c', 'a' β†’ "abc"
    • The key "abc" already exists
    • Add "bca" to the existing group: {"abc": ["abc", "bca"]}
  4. Process "tan":

    • Sort the characters: 't', 'a', 'n' β†’ "ant"
    • Use "ant" as a new key
    • Add "tan" to the hash table: {"abc": ["abc", "bca"], "ant": ["tan"]}
  5. Process "nat":

    • Sort the characters: 'n', 'a', 't' β†’ "ant"
    • The key "ant" already exists
    • Add "nat" to the existing group: {"abc": ["abc", "bca"], "ant": ["tan", "nat"]}
  6. Extract the final result:

    • Take all values from the hash table
    • Result: [["abc", "bca"], ["tan", "nat"]]

The key insight here is that "abc" and "bca" both produce the same sorted string "abc", so they're recognized as anagrams and grouped together. Similarly, "tan" and "nat" both produce "ant" when sorted, placing them in the same group. This demonstrates how sorting characters creates a unique signature for each anagram group.

Solution Implementation

1from collections import defaultdict
2from typing import List
3
4class Solution:
5    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
6        # Dictionary to store anagram groups
7        # Key: sorted string (anagram signature)
8        # Value: list of original strings that are anagrams
9        anagram_groups = defaultdict(list)
10      
11        # Iterate through each string in the input list
12        for string in strs:
13            # Create a key by sorting the characters of the string
14            # All anagrams will have the same sorted character sequence
15            sorted_key = ''.join(sorted(string))
16          
17            # Add the original string to its anagram group
18            anagram_groups[sorted_key].append(string)
19      
20        # Convert the dictionary values to a list and return
21        # Each value is a list of strings that are anagrams of each other
22        return list(anagram_groups.values())
23
1class Solution {
2    public List<List<String>> groupAnagrams(String[] strs) {
3        // HashMap to store sorted string as key and list of anagrams as value
4        Map<String, List<String>> anagramGroups = new HashMap<>();
5      
6        // Iterate through each string in the input array
7        for (String str : strs) {
8            // Convert string to character array for sorting
9            char[] charArray = str.toCharArray();
10          
11            // Sort the character array to create a unique key for anagrams
12            Arrays.sort(charArray);
13          
14            // Convert sorted character array back to string to use as key
15            String sortedKey = String.valueOf(charArray);
16          
17            // Add the original string to the corresponding anagram group
18            // If key doesn't exist, create a new ArrayList for this group
19            anagramGroups.computeIfAbsent(sortedKey, key -> new ArrayList<>()).add(str);
20        }
21      
22        // Return all anagram groups as a list of lists
23        return new ArrayList<>(anagramGroups.values());
24    }
25}
26
1class Solution {
2public:
3    vector<vector<string>> groupAnagrams(vector<string>& strs) {
4        // Hash map to store sorted string as key and list of anagrams as value
5        unordered_map<string, vector<string>> anagramGroups;
6      
7        // Iterate through each string in the input array
8        for (const auto& str : strs) {
9            // Create a copy of the current string to use as key
10            string sortedKey = str;
11          
12            // Sort the characters to create a unique key for all anagrams
13            sort(sortedKey.begin(), sortedKey.end());
14          
15            // Add the original string to the corresponding anagram group
16            anagramGroups[sortedKey].emplace_back(str);
17        }
18      
19        // Result vector to store all groups of anagrams
20        vector<vector<string>> result;
21      
22        // Extract all anagram groups from the hash map
23        for (auto& [key, group] : anagramGroups) {
24            result.emplace_back(group);
25        }
26      
27        return result;
28    }
29};
30
1/**
2 * Groups anagrams together from an array of strings
3 * @param strs - Array of strings to group by anagrams
4 * @returns Array of arrays, where each inner array contains anagrams
5 */
6function groupAnagrams(strs: string[]): string[][] {
7    // Map to store sorted string as key and array of anagrams as value
8    const anagramMap: Map<string, string[]> = new Map<string, string[]>();
9  
10    // Iterate through each string in the input array
11    for (const str of strs) {
12        // Sort characters of the string to create a unique key for anagrams
13        const sortedKey: string = str.split('').sort().join('');
14      
15        // Initialize array for this key if it doesn't exist
16        if (!anagramMap.has(sortedKey)) {
17            anagramMap.set(sortedKey, []);
18        }
19      
20        // Add the original string to its anagram group
21        anagramMap.get(sortedKey)!.push(str);
22    }
23  
24    // Convert map values to array and return
25    return Array.from(anagramMap.values());
26}
27

Time and Space Complexity

Time Complexity: O(n Γ— k Γ— log k)

  • n is the number of strings in the input array
  • k is the maximum length of a string in the array
  • For each of the n strings, we perform a sorting operation using sorted(s)
  • Sorting a string of length k takes O(k Γ— log k) time
  • Therefore, the total time complexity is O(n Γ— k Γ— log k)

Space Complexity: O(n Γ— k)

  • The dictionary d stores all n strings, where each string has up to k characters, requiring O(n Γ— k) space
  • For each string, we create a sorted key string which takes O(k) space temporarily
  • The output list contains all the original strings, which is O(n Γ— k) space
  • Overall space complexity is O(n Γ— k)

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Using Mutable Objects as Dictionary Keys

A common mistake is attempting to use the sorted list directly as a dictionary key:

# WRONG - This will raise a TypeError
for string in strs:
    sorted_key = sorted(string)  # Returns a list ['a', 'e', 't']
    anagram_groups[sorted_key].append(string)  # Lists can't be dictionary keys!

Why it fails: Python dictionaries require immutable objects as keys. Lists are mutable and therefore cannot be used as dictionary keys.

Solution: Convert the sorted list to an immutable type like a string or tuple:

# Correct approach 1: Join into a string
sorted_key = ''.join(sorted(string))

# Correct approach 2: Convert to tuple
sorted_key = tuple(sorted(string))

2. Incorrect Character Frequency Counting

Some might try to use character counting without proper sorting:

# WRONG - This doesn't properly identify anagrams
for string in strs:
    key = ''.join(set(string))  # Using set loses duplicate characters!
    anagram_groups[key].append(string)

Why it fails: Using set() removes duplicate characters. For example, "aab" and "ab" would incorrectly be grouped together since set("aab") and set("ab") both produce {'a', 'b'}.

Solution: Either sort the string properly or use a frequency count approach:

# Alternative correct approach using character frequency
from collections import Counter
for string in strs:
    # Create a tuple of (char, count) pairs sorted by character
    sorted_key = tuple(sorted(Counter(string).items()))
    anagram_groups[sorted_key].append(string)

3. Not Handling Empty Strings or Edge Cases

The code might fail to handle edge cases properly:

# Input: ["", "", "a"]
# Without proper handling, empty strings might cause issues

Solution: The provided solution actually handles empty strings correctly since sorting an empty string returns an empty list, which joins to an empty string key. However, be aware that all empty strings will be grouped together, which is the correct behavior.

4. Performance Issues with Large Strings

For very large strings, sorting can become expensive. While the O(k log k) sorting approach works well for typical cases, an alternative counting approach can achieve O(k) time per string:

def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
    anagram_groups = defaultdict(list)
  
    for string in strs:
        # Create a frequency count key (26 letters)
        count = [0] * 26
        for char in string:
            count[ord(char) - ord('a')] += 1
      
        # Convert count array to tuple for use as key
        key = tuple(count)
        anagram_groups[key].append(string)
  
    return list(anagram_groups.values())

This approach is more efficient for very long strings but uses more memory for the 26-element count array per unique anagram group.

Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

In a binary min heap, the maximum element can be found in:


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More