Facebook Pixel

916. Word Subsets

MediumArrayHash TableString
Leetcode Link

Problem Description

You are given two arrays of strings: words1 and words2.

A string b is considered a subset of string a if every letter in b appears in a with at least the same frequency. In other words, a must contain all letters from b, and each letter must appear at least as many times as it does in b.

For example:

  • "wrr" is a subset of "warrior" because "warrior" contains at least one 'w' and at least two 'r's
  • "wrr" is NOT a subset of "world" because "world" only has one 'r' while "wrr" needs two

A string a from words1 is called universal if every single string b in words2 is a subset of a. This means a must satisfy the subset requirement for all strings in words2 simultaneously.

Your task is to find all universal strings from words1 and return them in an array. The order of strings in the result doesn't matter.

The solution works by first finding the maximum frequency requirement for each character across all strings in words2. This creates a "combined requirement" stored in cnt. Then, for each string in words1, it checks if that string meets all the character frequency requirements. If a string from words1 contains enough of each required character, it's added to the answer list.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The key insight is recognizing that we need to find strings in words1 that can satisfy all strings in words2 simultaneously.

Think about what it means for a string a to be universal - it must be a superset of every string in words2. If we have words2 = ["ec", "oc", "ceo"], then a must contain:

  • At least 1 'e' and 1 'c' (to satisfy "ec")
  • At least 1 'o' and 1 'c' (to satisfy "oc")
  • At least 1 'c', 1 'e', and 1 'o' (to satisfy "ceo")

Notice that we need to satisfy all these requirements at once. So for character 'c', we need at least 1 occurrence (the maximum required across all strings in words2). For 'e', we need at least 1, and for 'o', we need at least 1.

This leads us to the realization: instead of checking each string in words1 against every string in words2 (which would be inefficient), we can merge all requirements from words2 into a single "combined requirement".

For each character, we take the maximum frequency required across all strings in words2. Why maximum? Because if a character appears 2 times in one word and 3 times in another word from words2, we need at least 3 occurrences to satisfy both requirements.

Once we have this combined requirement (stored in cnt), checking becomes simple: for each string in words1, we just need to verify it meets or exceeds the frequency requirement for each character in our combined requirement. This transforms a complex multiple-comparison problem into a single comparison problem.

Solution Approach

The solution uses a counting approach with hash maps (Python's Counter) to efficiently track character frequencies.

Step 1: Build the Combined Requirement

First, we create an empty counter cnt to store the maximum frequency requirement for each character across all strings in words2.

cnt = Counter()
for b in words2:
    t = Counter(b)
    for c, v in t.items():
        cnt[c] = max(cnt[c], v)

For each string b in words2:

  • Count the frequency of each character using Counter(b) and store it in t
  • For each character c with frequency v in the current string, update cnt[c] to be the maximum of its current value and v

For example, if words2 = ["ec", "oc", "ceo"]:

  • After processing "ec": cnt = {'e': 1, 'c': 1}
  • After processing "oc": cnt = {'e': 1, 'c': 1, 'o': 1}
  • After processing "ceo": cnt = {'e': 1, 'c': 1, 'o': 1} (no changes as frequencies don't exceed existing maximums)

Step 2: Check Each String in words1

Now we iterate through each string a in words1 and check if it satisfies the combined requirement:

ans = []
for a in words1:
    t = Counter(a)
    if all(v <= t[c] for c, v in cnt.items()):
        ans.append(a)

For each string a:

  • Count its character frequencies using Counter(a) and store in t
  • Check if for every character c with required frequency v in cnt, the string a has at least v occurrences of that character (v <= t[c])
  • If all requirements are satisfied, add a to the answer list

The all() function returns True only if every character requirement is met. If any character appears fewer times in a than required by cnt, the string is not universal.

Time Complexity: O(M + N) where M is the total number of characters across all strings in words2 and N is the total number of characters across all strings in words1.

Space Complexity: O(1) for the character counters since there are at most 26 lowercase English letters.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a concrete example with words1 = ["amazon", "apple", "facebook", "google", "leetcode"] and words2 = ["e", "o"].

Step 1: Build the Combined Requirement

We need to find the maximum frequency requirement for each character across all strings in words2:

  • Process "e":

    • Character frequencies: {'e': 1}
    • Update cnt: cnt = {'e': 1}
  • Process "o":

    • Character frequencies: {'o': 1}
    • Update cnt: cnt = {'e': 1, 'o': 1}

So our combined requirement is: any universal string must have at least 1 'e' and at least 1 'o'.

Step 2: Check Each String in words1

Now we check each string in words1 against this combined requirement:

  1. "amazon":

    • Character frequencies: {'a': 2, 'm': 1, 'z': 1, 'o': 1, 'n': 1}
    • Check: needs 'e' ≥ 1 (has 0) ❌
    • Not universal
  2. "apple":

    • Character frequencies: {'a': 1, 'p': 2, 'l': 1, 'e': 1}
    • Check: needs 'e' ≥ 1 (has 1) ✓, needs 'o' ≥ 1 (has 0) ❌
    • Not universal
  3. "facebook":

    • Character frequencies: {'f': 1, 'a': 1, 'c': 1, 'e': 1, 'b': 1, 'o': 2, 'k': 1}
    • Check: needs 'e' ≥ 1 (has 1) ✓, needs 'o' ≥ 1 (has 2) ✓
    • Universal! Add to answer
  4. "google":

    • Character frequencies: {'g': 2, 'o': 2, 'l': 1, 'e': 1}
    • Check: needs 'e' ≥ 1 (has 1) ✓, needs 'o' ≥ 1 (has 2) ✓
    • Universal! Add to answer
  5. "leetcode":

    • Character frequencies: {'l': 1, 'e': 3, 't': 1, 'c': 1, 'o': 1, 'd': 1}
    • Check: needs 'e' ≥ 1 (has 3) ✓, needs 'o' ≥ 1 (has 1) ✓
    • Universal! Add to answer

Final Answer: ["facebook", "google", "leetcode"]

These three strings are universal because they each contain at least one 'e' and at least one 'o', satisfying all strings in words2.

Solution Implementation

1class Solution:
2    def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
3        # Build the maximum frequency requirement for each character across all words in words2
4        # This represents the "universal" subset requirement
5        max_freq_requirement = Counter()
6      
7        # For each word in words2, find the maximum frequency needed for each character
8        for word in words2:
9            char_freq = Counter(word)
10            for char, freq in char_freq.items():
11                # Update to keep the maximum frequency requirement for each character
12                max_freq_requirement[char] = max(max_freq_requirement[char], freq)
13      
14        # List to store words from words1 that are universal
15        result = []
16      
17        # Check each word in words1 to see if it satisfies all requirements
18        for word in words1:
19            char_freq = Counter(word)
20            # Check if this word contains enough of each required character
21            if all(required_freq <= char_freq[char] for char, required_freq in max_freq_requirement.items()):
22                result.append(word)
23      
24        return result
25
1class Solution {
2    public List<String> wordSubsets(String[] words1, String[] words2) {
3        // Build the maximum frequency requirement for each character across all words in words2
4        int[] maxFrequency = new int[26];
5      
6        // Process each word in words2 to determine maximum frequency requirements
7        for (String word : words2) {
8            // Count character frequencies in the current word
9            int[] currentFrequency = new int[26];
10            for (int i = 0; i < word.length(); i++) {
11                currentFrequency[word.charAt(i) - 'a']++;
12            }
13          
14            // Update the maximum frequency requirement for each character
15            for (int i = 0; i < 26; i++) {
16                maxFrequency[i] = Math.max(maxFrequency[i], currentFrequency[i]);
17            }
18        }
19      
20        // List to store words from words1 that are universal
21        List<String> result = new ArrayList<>();
22      
23        // Check each word in words1 to see if it meets all requirements
24        for (String word : words1) {
25            // Count character frequencies in the current word
26            int[] currentFrequency = new int[26];
27            for (int i = 0; i < word.length(); i++) {
28                currentFrequency[word.charAt(i) - 'a']++;
29            }
30          
31            // Check if current word satisfies all frequency requirements
32            boolean isUniversal = true;
33            for (int i = 0; i < 26; i++) {
34                if (maxFrequency[i] > currentFrequency[i]) {
35                    isUniversal = false;
36                    break;
37                }
38            }
39          
40            // Add to result if the word is universal
41            if (isUniversal) {
42                result.add(word);
43            }
44        }
45      
46        return result;
47    }
48}
49
1class Solution {
2public:
3    vector<string> wordSubsets(vector<string>& words1, vector<string>& words2) {
4        // Array to store the maximum frequency of each character across all words in words2
5        // This represents the "universal" requirement for a word to be universal
6        int maxFreq[26] = {0};
7      
8        // Temporary array to count character frequencies in current word
9        int currentFreq[26];
10      
11        // Build the maximum frequency requirements from words2
12        // For each character, we need the maximum count that appears in any single word
13        for (const auto& word : words2) {
14            memset(currentFreq, 0, sizeof(currentFreq));
15          
16            // Count character frequencies in current word from words2
17            for (const char& ch : word) {
18                currentFreq[ch - 'a']++;
19            }
20          
21            // Update maximum frequency requirements
22            for (int i = 0; i < 26; ++i) {
23                maxFreq[i] = max(maxFreq[i], currentFreq[i]);
24            }
25        }
26      
27        // Result vector to store universal words from words1
28        vector<string> result;
29      
30        // Check each word in words1 to see if it meets all requirements
31        for (const auto& word : words1) {
32            memset(currentFreq, 0, sizeof(currentFreq));
33          
34            // Count character frequencies in current word from words1
35            for (const char& ch : word) {
36                currentFreq[ch - 'a']++;
37            }
38          
39            // Check if current word satisfies all maximum frequency requirements
40            bool isUniversal = true;
41            for (int i = 0; i < 26; ++i) {
42                if (maxFreq[i] > currentFreq[i]) {
43                    isUniversal = false;
44                    break;
45                }
46            }
47          
48            // Add to result if word is universal
49            if (isUniversal) {
50                result.emplace_back(word);
51            }
52        }
53      
54        return result;
55    }
56};
57
1/**
2 * Finds all universal words from words1 that contain all characters from all words in words2
3 * with sufficient frequency
4 * @param words1 - Array of words to check for universality
5 * @param words2 - Array of words defining the required character frequencies
6 * @returns Array of universal words from words1
7 */
8function wordSubsets(words1: string[], words2: string[]): string[] {
9    // Build the maximum frequency requirement for each character across all words in words2
10    const maxCharFrequency: number[] = Array(26).fill(0);
11  
12    for (const word of words2) {
13        // Count character frequencies in current word
14        const currentWordFrequency: number[] = Array(26).fill(0);
15        for (const char of word) {
16            const charIndex = char.charCodeAt(0) - 97; // Convert 'a'-'z' to 0-25
17            currentWordFrequency[charIndex]++;
18        }
19      
20        // Update maximum frequency requirements
21        for (let i = 0; i < 26; i++) {
22            maxCharFrequency[i] = Math.max(maxCharFrequency[i], currentWordFrequency[i]);
23        }
24    }
25
26    // Check each word in words1 against the frequency requirements
27    const universalWords: string[] = [];
28  
29    for (const word of words1) {
30        // Count character frequencies in current word
31        const currentWordFrequency: number[] = Array(26).fill(0);
32        for (const char of word) {
33            const charIndex = char.charCodeAt(0) - 97; // Convert 'a'-'z' to 0-25
34            currentWordFrequency[charIndex]++;
35        }
36
37        // Check if current word satisfies all frequency requirements
38        let isUniversal = true;
39        for (let i = 0; i < 26; i++) {
40            if (maxCharFrequency[i] > currentWordFrequency[i]) {
41                isUniversal = false;
42                break;
43            }
44        }
45
46        // Add to result if word is universal
47        if (isUniversal) {
48            universalWords.push(word);
49        }
50    }
51
52    return universalWords;
53}
54

Time and Space Complexity

Time Complexity: O(L), where L is the sum of the lengths of all words in words1 and words2.

  • Building the maximum frequency counter for words2: We iterate through each word in words2 and count the frequency of each character. For each word b in words2, creating Counter(b) takes O(len(b)) time. Updating the maximum frequencies takes O(26) at most (for lowercase English letters). The total time for processing words2 is O(sum of lengths of words in words2).

  • Checking each word in words1: For each word a in words1, creating Counter(a) takes O(len(a)) time. The all() check iterates through at most 26 unique characters in cnt, each lookup in t is O(1). The total time for processing words1 is O(sum of lengths of words in words1).

  • Overall: O(sum of lengths in words1 + sum of lengths in words2) = O(L)

Space Complexity: O(1) or O(26) for the character space, which is constant.

  • The cnt counter stores at most 26 lowercase English letters with their maximum frequencies: O(26) = O(1)
  • For each word being processed, the temporary counter t also stores at most 26 characters: O(26) = O(1)
  • The output list ans stores references to strings from words1, not creating new strings: O(|words1|) in the worst case when all words are universal

If we consider the output space, the space complexity is O(|words1|). If we only consider auxiliary space excluding the output, it's O(1).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

Pitfall 1: Checking Against Individual Words Instead of Combined Requirements

The Mistake: A common error is to check if each word in words1 is a superset of each word in words2 individually, like this:

# INCORRECT APPROACH
result = []
for a in words1:
    is_universal = True
    for b in words2:
        if not is_subset(b, a):  # Check each b individually
            is_universal = False
            break
    if is_universal:
        result.append(a)

Why It's Wrong: While this seems logical, it's inefficient and leads to redundant checks. If words2 = ["a", "aa", "aaa"], you'd check the same character 'a' multiple times for each word in words1.

The Fix: Pre-compute the maximum frequency requirement for each character across ALL words in words2 first, then check each word in words1 against this combined requirement only once.

Pitfall 2: Using Set Operations Instead of Frequency Counting

The Mistake:

# INCORRECT - only checks presence, not frequency
max_chars = set()
for word in words2:
    max_chars.update(word)

result = []
for word in words1:
    if max_chars.issubset(set(word)):
        result.append(word)

Why It's Wrong: Sets only track whether a character exists, not how many times it appears. The word "world" would incorrectly be considered universal for words2 = ["wrr"] because both contain 'w' and 'r', ignoring that "wrr" needs TWO 'r's.

The Fix: Use Counter or frequency maps to track the exact count of each character, not just its presence.

Pitfall 3: Incorrectly Accumulating Character Frequencies

The Mistake:

# INCORRECT - sums frequencies instead of taking maximum
cnt = Counter()
for b in words2:
    cnt += Counter(b)  # Wrong: adds frequencies together

Why It's Wrong: If words2 = ["aa", "aaa"], this would require 5 'a's (2+3) in each universal string. But the actual requirement is only 3 'a's (the maximum from any single word).

The Fix: Use max() to keep only the highest frequency requirement for each character:

for b in words2:
    t = Counter(b)
    for c, v in t.items():
        cnt[c] = max(cnt[c], v)

Pitfall 4: Missing Edge Case - Empty Character Count

The Mistake:

# Potential KeyError if character doesn't exist in word from words1
if all(v <= t[c] for c, v in cnt.items()):  # t[c] might not exist

Why It's Wrong: If a required character doesn't exist in the word from words1, accessing t[c] could cause issues in some implementations.

The Fix: Python's Counter handles this gracefully by returning 0 for missing keys, but in other languages or data structures, you might need:

if all(v <= t.get(c, 0) for c, v in cnt.items()):
Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

What's the output of running the following function using the following tree as input?

1def serialize(root):
2    res = []
3    def dfs(root):
4        if not root:
5            res.append('x')
6            return
7        res.append(root.val)
8        dfs(root.left)
9        dfs(root.right)
10    dfs(root)
11    return ' '.join(res)
12
1import java.util.StringJoiner;
2
3public static String serialize(Node root) {
4    StringJoiner res = new StringJoiner(" ");
5    serializeDFS(root, res);
6    return res.toString();
7}
8
9private static void serializeDFS(Node root, StringJoiner result) {
10    if (root == null) {
11        result.add("x");
12        return;
13    }
14    result.add(Integer.toString(root.val));
15    serializeDFS(root.left, result);
16    serializeDFS(root.right, result);
17}
18
1function serialize(root) {
2    let res = [];
3    serialize_dfs(root, res);
4    return res.join(" ");
5}
6
7function serialize_dfs(root, res) {
8    if (!root) {
9        res.push("x");
10        return;
11    }
12    res.push(root.val);
13    serialize_dfs(root.left, res);
14    serialize_dfs(root.right, res);
15}
16

Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More