916. Word Subsets
Problem Description
You are given two arrays of strings: words1
and words2
.
A string b
is considered a subset of string a
if every letter in b
appears in a
with at least the same frequency. In other words, a
must contain all letters from b
, and each letter must appear at least as many times as it does in b
.
For example:
"wrr"
is a subset of"warrior"
because"warrior"
contains at least one'w'
and at least two'r'
s"wrr"
is NOT a subset of"world"
because"world"
only has one'r'
while"wrr"
needs two
A string a
from words1
is called universal if every single string b
in words2
is a subset of a
. This means a
must satisfy the subset requirement for all strings in words2
simultaneously.
Your task is to find all universal strings from words1
and return them in an array. The order of strings in the result doesn't matter.
The solution works by first finding the maximum frequency requirement for each character across all strings in words2
. This creates a "combined requirement" stored in cnt
. Then, for each string in words1
, it checks if that string meets all the character frequency requirements. If a string from words1
contains enough of each required character, it's added to the answer list.
Intuition
The key insight is recognizing that we need to find strings in words1
that can satisfy all strings in words2
simultaneously.
Think about what it means for a string a
to be universal - it must be a superset of every string in words2
. If we have words2 = ["ec", "oc", "ceo"]
, then a
must contain:
- At least 1
'e'
and 1'c'
(to satisfy"ec"
) - At least 1
'o'
and 1'c'
(to satisfy"oc"
) - At least 1
'c'
, 1'e'
, and 1'o'
(to satisfy"ceo"
)
Notice that we need to satisfy all these requirements at once. So for character 'c'
, we need at least 1 occurrence (the maximum required across all strings in words2
). For 'e'
, we need at least 1, and for 'o'
, we need at least 1.
This leads us to the realization: instead of checking each string in words1
against every string in words2
(which would be inefficient), we can merge all requirements from words2
into a single "combined requirement".
For each character, we take the maximum frequency required across all strings in words2
. Why maximum? Because if a character appears 2 times in one word and 3 times in another word from words2
, we need at least 3 occurrences to satisfy both requirements.
Once we have this combined requirement (stored in cnt
), checking becomes simple: for each string in words1
, we just need to verify it meets or exceeds the frequency requirement for each character in our combined requirement. This transforms a complex multiple-comparison problem into a single comparison problem.
Solution Approach
The solution uses a counting approach with hash maps (Python's Counter
) to efficiently track character frequencies.
Step 1: Build the Combined Requirement
First, we create an empty counter cnt
to store the maximum frequency requirement for each character across all strings in words2
.
cnt = Counter()
for b in words2:
t = Counter(b)
for c, v in t.items():
cnt[c] = max(cnt[c], v)
For each string b
in words2
:
- Count the frequency of each character using
Counter(b)
and store it int
- For each character
c
with frequencyv
in the current string, updatecnt[c]
to be the maximum of its current value andv
For example, if words2 = ["ec", "oc", "ceo"]
:
- After processing
"ec"
:cnt = {'e': 1, 'c': 1}
- After processing
"oc"
:cnt = {'e': 1, 'c': 1, 'o': 1}
- After processing
"ceo"
:cnt = {'e': 1, 'c': 1, 'o': 1}
(no changes as frequencies don't exceed existing maximums)
Step 2: Check Each String in words1
Now we iterate through each string a
in words1
and check if it satisfies the combined requirement:
ans = []
for a in words1:
t = Counter(a)
if all(v <= t[c] for c, v in cnt.items()):
ans.append(a)
For each string a
:
- Count its character frequencies using
Counter(a)
and store int
- Check if for every character
c
with required frequencyv
incnt
, the stringa
has at leastv
occurrences of that character (v <= t[c]
) - If all requirements are satisfied, add
a
to the answer list
The all()
function returns True
only if every character requirement is met. If any character appears fewer times in a
than required by cnt
, the string is not universal.
Time Complexity: O(M + N)
where M
is the total number of characters across all strings in words2
and N
is the total number of characters across all strings in words1
.
Space Complexity: O(1)
for the character counters since there are at most 26 lowercase English letters.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a concrete example with words1 = ["amazon", "apple", "facebook", "google", "leetcode"]
and words2 = ["e", "o"]
.
Step 1: Build the Combined Requirement
We need to find the maximum frequency requirement for each character across all strings in words2
:
-
Process
"e"
:- Character frequencies:
{'e': 1}
- Update
cnt
:cnt = {'e': 1}
- Character frequencies:
-
Process
"o"
:- Character frequencies:
{'o': 1}
- Update
cnt
:cnt = {'e': 1, 'o': 1}
- Character frequencies:
So our combined requirement is: any universal string must have at least 1 'e'
and at least 1 'o'
.
Step 2: Check Each String in words1
Now we check each string in words1
against this combined requirement:
-
"amazon":
- Character frequencies:
{'a': 2, 'm': 1, 'z': 1, 'o': 1, 'n': 1}
- Check: needs
'e'
≥ 1 (has 0) ❌ - Not universal
- Character frequencies:
-
"apple":
- Character frequencies:
{'a': 1, 'p': 2, 'l': 1, 'e': 1}
- Check: needs
'e'
≥ 1 (has 1) ✓, needs'o'
≥ 1 (has 0) ❌ - Not universal
- Character frequencies:
-
"facebook":
- Character frequencies:
{'f': 1, 'a': 1, 'c': 1, 'e': 1, 'b': 1, 'o': 2, 'k': 1}
- Check: needs
'e'
≥ 1 (has 1) ✓, needs'o'
≥ 1 (has 2) ✓ - Universal! Add to answer
- Character frequencies:
-
"google":
- Character frequencies:
{'g': 2, 'o': 2, 'l': 1, 'e': 1}
- Check: needs
'e'
≥ 1 (has 1) ✓, needs'o'
≥ 1 (has 2) ✓ - Universal! Add to answer
- Character frequencies:
-
"leetcode":
- Character frequencies:
{'l': 1, 'e': 3, 't': 1, 'c': 1, 'o': 1, 'd': 1}
- Check: needs
'e'
≥ 1 (has 3) ✓, needs'o'
≥ 1 (has 1) ✓ - Universal! Add to answer
- Character frequencies:
Final Answer: ["facebook", "google", "leetcode"]
These three strings are universal because they each contain at least one 'e'
and at least one 'o'
, satisfying all strings in words2
.
Solution Implementation
1class Solution:
2 def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
3 # Build the maximum frequency requirement for each character across all words in words2
4 # This represents the "universal" subset requirement
5 max_freq_requirement = Counter()
6
7 # For each word in words2, find the maximum frequency needed for each character
8 for word in words2:
9 char_freq = Counter(word)
10 for char, freq in char_freq.items():
11 # Update to keep the maximum frequency requirement for each character
12 max_freq_requirement[char] = max(max_freq_requirement[char], freq)
13
14 # List to store words from words1 that are universal
15 result = []
16
17 # Check each word in words1 to see if it satisfies all requirements
18 for word in words1:
19 char_freq = Counter(word)
20 # Check if this word contains enough of each required character
21 if all(required_freq <= char_freq[char] for char, required_freq in max_freq_requirement.items()):
22 result.append(word)
23
24 return result
25
1class Solution {
2 public List<String> wordSubsets(String[] words1, String[] words2) {
3 // Build the maximum frequency requirement for each character across all words in words2
4 int[] maxFrequency = new int[26];
5
6 // Process each word in words2 to determine maximum frequency requirements
7 for (String word : words2) {
8 // Count character frequencies in the current word
9 int[] currentFrequency = new int[26];
10 for (int i = 0; i < word.length(); i++) {
11 currentFrequency[word.charAt(i) - 'a']++;
12 }
13
14 // Update the maximum frequency requirement for each character
15 for (int i = 0; i < 26; i++) {
16 maxFrequency[i] = Math.max(maxFrequency[i], currentFrequency[i]);
17 }
18 }
19
20 // List to store words from words1 that are universal
21 List<String> result = new ArrayList<>();
22
23 // Check each word in words1 to see if it meets all requirements
24 for (String word : words1) {
25 // Count character frequencies in the current word
26 int[] currentFrequency = new int[26];
27 for (int i = 0; i < word.length(); i++) {
28 currentFrequency[word.charAt(i) - 'a']++;
29 }
30
31 // Check if current word satisfies all frequency requirements
32 boolean isUniversal = true;
33 for (int i = 0; i < 26; i++) {
34 if (maxFrequency[i] > currentFrequency[i]) {
35 isUniversal = false;
36 break;
37 }
38 }
39
40 // Add to result if the word is universal
41 if (isUniversal) {
42 result.add(word);
43 }
44 }
45
46 return result;
47 }
48}
49
1class Solution {
2public:
3 vector<string> wordSubsets(vector<string>& words1, vector<string>& words2) {
4 // Array to store the maximum frequency of each character across all words in words2
5 // This represents the "universal" requirement for a word to be universal
6 int maxFreq[26] = {0};
7
8 // Temporary array to count character frequencies in current word
9 int currentFreq[26];
10
11 // Build the maximum frequency requirements from words2
12 // For each character, we need the maximum count that appears in any single word
13 for (const auto& word : words2) {
14 memset(currentFreq, 0, sizeof(currentFreq));
15
16 // Count character frequencies in current word from words2
17 for (const char& ch : word) {
18 currentFreq[ch - 'a']++;
19 }
20
21 // Update maximum frequency requirements
22 for (int i = 0; i < 26; ++i) {
23 maxFreq[i] = max(maxFreq[i], currentFreq[i]);
24 }
25 }
26
27 // Result vector to store universal words from words1
28 vector<string> result;
29
30 // Check each word in words1 to see if it meets all requirements
31 for (const auto& word : words1) {
32 memset(currentFreq, 0, sizeof(currentFreq));
33
34 // Count character frequencies in current word from words1
35 for (const char& ch : word) {
36 currentFreq[ch - 'a']++;
37 }
38
39 // Check if current word satisfies all maximum frequency requirements
40 bool isUniversal = true;
41 for (int i = 0; i < 26; ++i) {
42 if (maxFreq[i] > currentFreq[i]) {
43 isUniversal = false;
44 break;
45 }
46 }
47
48 // Add to result if word is universal
49 if (isUniversal) {
50 result.emplace_back(word);
51 }
52 }
53
54 return result;
55 }
56};
57
1/**
2 * Finds all universal words from words1 that contain all characters from all words in words2
3 * with sufficient frequency
4 * @param words1 - Array of words to check for universality
5 * @param words2 - Array of words defining the required character frequencies
6 * @returns Array of universal words from words1
7 */
8function wordSubsets(words1: string[], words2: string[]): string[] {
9 // Build the maximum frequency requirement for each character across all words in words2
10 const maxCharFrequency: number[] = Array(26).fill(0);
11
12 for (const word of words2) {
13 // Count character frequencies in current word
14 const currentWordFrequency: number[] = Array(26).fill(0);
15 for (const char of word) {
16 const charIndex = char.charCodeAt(0) - 97; // Convert 'a'-'z' to 0-25
17 currentWordFrequency[charIndex]++;
18 }
19
20 // Update maximum frequency requirements
21 for (let i = 0; i < 26; i++) {
22 maxCharFrequency[i] = Math.max(maxCharFrequency[i], currentWordFrequency[i]);
23 }
24 }
25
26 // Check each word in words1 against the frequency requirements
27 const universalWords: string[] = [];
28
29 for (const word of words1) {
30 // Count character frequencies in current word
31 const currentWordFrequency: number[] = Array(26).fill(0);
32 for (const char of word) {
33 const charIndex = char.charCodeAt(0) - 97; // Convert 'a'-'z' to 0-25
34 currentWordFrequency[charIndex]++;
35 }
36
37 // Check if current word satisfies all frequency requirements
38 let isUniversal = true;
39 for (let i = 0; i < 26; i++) {
40 if (maxCharFrequency[i] > currentWordFrequency[i]) {
41 isUniversal = false;
42 break;
43 }
44 }
45
46 // Add to result if word is universal
47 if (isUniversal) {
48 universalWords.push(word);
49 }
50 }
51
52 return universalWords;
53}
54
Time and Space Complexity
Time Complexity: O(L)
, where L
is the sum of the lengths of all words in words1
and words2
.
-
Building the maximum frequency counter for
words2
: We iterate through each word inwords2
and count the frequency of each character. For each wordb
inwords2
, creatingCounter(b)
takesO(len(b))
time. Updating the maximum frequencies takesO(26)
at most (for lowercase English letters). The total time for processingwords2
isO(sum of lengths of words in words2)
. -
Checking each word in
words1
: For each worda
inwords1
, creatingCounter(a)
takesO(len(a))
time. Theall()
check iterates through at most 26 unique characters incnt
, each lookup int
isO(1)
. The total time for processingwords1
isO(sum of lengths of words in words1)
. -
Overall:
O(sum of lengths in words1 + sum of lengths in words2) = O(L)
Space Complexity: O(1)
or O(26)
for the character space, which is constant.
- The
cnt
counter stores at most 26 lowercase English letters with their maximum frequencies:O(26) = O(1)
- For each word being processed, the temporary counter
t
also stores at most 26 characters:O(26) = O(1)
- The output list
ans
stores references to strings fromwords1
, not creating new strings:O(|words1|)
in the worst case when all words are universal
If we consider the output space, the space complexity is O(|words1|)
. If we only consider auxiliary space excluding the output, it's O(1)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
Pitfall 1: Checking Against Individual Words Instead of Combined Requirements
The Mistake:
A common error is to check if each word in words1
is a superset of each word in words2
individually, like this:
# INCORRECT APPROACH result = [] for a in words1: is_universal = True for b in words2: if not is_subset(b, a): # Check each b individually is_universal = False break if is_universal: result.append(a)
Why It's Wrong:
While this seems logical, it's inefficient and leads to redundant checks. If words2 = ["a", "aa", "aaa"]
, you'd check the same character 'a' multiple times for each word in words1
.
The Fix:
Pre-compute the maximum frequency requirement for each character across ALL words in words2
first, then check each word in words1
against this combined requirement only once.
Pitfall 2: Using Set Operations Instead of Frequency Counting
The Mistake:
# INCORRECT - only checks presence, not frequency
max_chars = set()
for word in words2:
max_chars.update(word)
result = []
for word in words1:
if max_chars.issubset(set(word)):
result.append(word)
Why It's Wrong:
Sets only track whether a character exists, not how many times it appears. The word "world" would incorrectly be considered universal for words2 = ["wrr"]
because both contain 'w' and 'r', ignoring that "wrr" needs TWO 'r's.
The Fix:
Use Counter
or frequency maps to track the exact count of each character, not just its presence.
Pitfall 3: Incorrectly Accumulating Character Frequencies
The Mistake:
# INCORRECT - sums frequencies instead of taking maximum cnt = Counter() for b in words2: cnt += Counter(b) # Wrong: adds frequencies together
Why It's Wrong:
If words2 = ["aa", "aaa"]
, this would require 5 'a's (2+3) in each universal string. But the actual requirement is only 3 'a's (the maximum from any single word).
The Fix:
Use max()
to keep only the highest frequency requirement for each character:
for b in words2:
t = Counter(b)
for c, v in t.items():
cnt[c] = max(cnt[c], v)
Pitfall 4: Missing Edge Case - Empty Character Count
The Mistake:
# Potential KeyError if character doesn't exist in word from words1
if all(v <= t[c] for c, v in cnt.items()): # t[c] might not exist
Why It's Wrong:
If a required character doesn't exist in the word from words1
, accessing t[c]
could cause issues in some implementations.
The Fix:
Python's Counter
handles this gracefully by returning 0 for missing keys, but in other languages or data structures, you might need:
if all(v <= t.get(c, 0) for c, v in cnt.items()):
What's the output of running the following function using the following tree as input?
1def serialize(root):
2 res = []
3 def dfs(root):
4 if not root:
5 res.append('x')
6 return
7 res.append(root.val)
8 dfs(root.left)
9 dfs(root.right)
10 dfs(root)
11 return ' '.join(res)
12
1import java.util.StringJoiner;
2
3public static String serialize(Node root) {
4 StringJoiner res = new StringJoiner(" ");
5 serializeDFS(root, res);
6 return res.toString();
7}
8
9private static void serializeDFS(Node root, StringJoiner result) {
10 if (root == null) {
11 result.add("x");
12 return;
13 }
14 result.add(Integer.toString(root.val));
15 serializeDFS(root.left, result);
16 serializeDFS(root.right, result);
17}
18
1function serialize(root) {
2 let res = [];
3 serialize_dfs(root, res);
4 return res.join(" ");
5}
6
7function serialize_dfs(root, res) {
8 if (!root) {
9 res.push("x");
10 return;
11 }
12 res.push(root.val);
13 serialize_dfs(root.left, res);
14 serialize_dfs(root.right, res);
15}
16
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!