916. Word Subsets
Problem Description
The problem provides two arrays of strings, words1
and words2
. The goal is to determine which strings from words1
are considered "universal." A string is universal if every string in words2
is a subset of it. A string b
is a subset of another string a
if a
contains all the letters of b
in the same number or more. That means for every character and its count in b
, this character must appear at least the same number of times in a
. For example, 'wrr'
is a subset of 'warrior'
but not of 'world'
. The task is to return an array of universal strings from words1
, and the order of results does not matter.
Intuition
To find the universal strings efficiently, rather than comparing each string in words1
with all the strings in words2
, we can create a letter frequency count that represents the maximum frequency of each character across all strings in words2
. This way, we only need to compare each string in words1
with this combined frequency count.
Firstly, we create a Counter
(which is a dictionary subclass for counting hashable objects) to hold the maximum frequency of characters required from words2
. For each string in words2
, we count its characters and update our maximum frequency Counter
. This counter will tell us the least number of times each character should appear in any string from words1
for it to be considered universal.
After this, we iterate through each string in words1
and create a frequency count for it. Then, we check if this count satisfies the requirements of our precomputed maximum frequency counter. The all()
function helps in determining whether all conditions are true for each character's count.
If a string a
from words1
has at least as many occurrences of each character as computed in the maximum frequency Counter
, then it is universal. We append such strings to our answer list ans
.
This solution approach minimizes the number of total comparisons needed to identify the universal strings, and therefore is optimized for the task.
Solution Approach
To solve this problem, the code implementation makes use of Python's Counter
class from the collections
module. The Counter
class is a specialized dictionary used for counting hashable objects, in our case, characters in strings. It's particularly useful because it allows us to easily compare the frequencies of letters in different strings.
Hereās a step-by-step walkthrough of the implementation:
-
Initialize a new
Counter
object calledcnt
, which will be used to store the maximum frequency requirements for each letter, derived fromwords2
.1cnt = Counter()
-
Loop over each string
b
in the arraywords2
. For each string:a. Create a temporary counter
t
fromb
.b. Update the
cnt
counter with the maximum frequency of characters fromt
. This meanscnt[c] = max(cnt[c], v)
for each characterc
and its countv
int
. This ensures thatcnt
holds the maximum number of times any character needs to appear for a string fromwords1
to be universal.1for b in words2: 2 t = Counter(b) 3 for c, v in t.items(): 4 cnt[c] = max(cnt[c], v)
-
Initialize an empty list
ans
, which will hold all the universal strings fromwords1
.1ans = []
-
Loop over each string
a
inwords1
. For each string:a. Create a temporary counter
t
froma
.b. Use the
all
function to check ifa
contains at least as many of each character as required by thecnt
counter. The comparisonv <= t[c] for c, v in cnt.items()
ensures thata
meets the criteria for all charactersc
with their respective countsv
incnt
.c. If
a
meets the criteria, append it to theans
list.1for a in words1: 2 t = Counter(a) 3 if all(v <= t[c] for c, v in cnt.items()): 4 ans.append(a)
-
Finally, return the list
ans
, which now contains all the universal strings fromwords1
.1 return ans
The primary algorithm patterns used in this solution include frequency counting and iterating with condition checking. By using the Counter
class with maximum frequency logic, the code efficiently reduces what could be an O(n * m) problem (checking each of n strings in words1
against each of m strings in words2
) to a more manageable O(n + m) problem by minimizing repetitive checks.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Consider the following example to illustrate the solution approach:
Let words1
be ["ecology", "universal", "computer"]
and words2
be ["cool", "envy", "yon", "yes"]
.
-
The maximum frequency counters (
cnt
) are determined fromwords2
. For "cool", it will be{'c': 1, 'o': 2, 'l': 1}
, for "envy", it will be{'e': 1, 'n': 1, 'v': 1, 'y': 1}
and so forth. Ultimately, after comparing allwords2
strings,cnt
would be:1cnt = {'c': 1, 'o': 2, 'l': 1, 'e': 1, 'n': 1, 'v': 1, 'y': 2, 's': 1}
This
cnt
means for a string inwords1
to be universal; it must have at least one 'c', two 'o's, one 'l', one 'e', one 'n', one 'v', two 'y's, and one 's'. -
Now we check each word in
words1
againstcnt
.-
"ecology": It's counter is
{'e': 1, 'c': 1, 'o': 1, 'l': 1, 'g': 1, 'y': 1}
. While this word has all the required characters, it fails to meet the count for 'o' and 'y'. Thus, "ecology" is not universal. -
"universal": Counter is
{'u': 1, 'n': 1, 'i': 1, 'v': 1, 'e': 1, 'r': 1, 's': 1, 'a': 1, 'l': 1}
. Despite having all characters, it does not have enough 'o's and 'y's. So, "universal" is not universal. -
"computer": Counter is
{'c': 1, 'o': 1, 'm': 1, 'p': 1, 'u': 1, 't': 1, 'e': 1, 'r': 1}
. Even though "computer" meets or exceeds the count for 'c', 'e', 't', 'r', it doesnāt contain any 'l', 'v', 'y', or sufficient 'o's. Therefore, it's not universal either.
-
-
Applying this logic, none of the words in
words1
are universal, as none meet the frequency criteria established bywords2
. The final answerans
list would be empty.
The result is an efficient check that precisely tells us that there are no strings in words1
that are universal with respect to words2
. Thus, the function would return an empty list []
.
Solution Implementation
1from collections import Counter
2
3class Solution:
4 def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
5 # Create a counter to store the maximum frequency of each character
6 # across all words in words2
7 max_freq_counter = Counter()
8 for word in words2:
9 word_freq_counter = Counter(word)
10 for char, freq in word_freq_counter.items():
11 # Update the counter for each character to the maximum frequency
12 max_freq_counter[char] = max(max_freq_counter[char], freq)
13
14 # Initialize a list to keep all words from words1 that meet the criteria
15 universal_words = []
16 # Iterate through each word in words1 to check if it is a universal word
17 for word in words1:
18 word_freq_counter = Counter(word)
19 # Check if word has at least as many of each character as needed
20 is_universal = all(freq <= word_freq_counter[char] for char, freq in max_freq_counter.items())
21 # If the word meets the criteria, add it to the universal words list
22 if is_universal:
23 universal_words.append(word)
24
25 # Return the list of universal words
26 return universal_words
27
1class Solution {
2 public List<String> wordSubsets(String[] universalSet, String[] subsetWords) {
3 // This array will keep the max frequency of each letter required by subsetWords
4 int[] maxSubsetFreq = new int[26];
5
6 // Calculate the max frequency of each character across all words in subsetWords
7 for (String subsetWord : subsetWords) {
8 // Temporary array to store frequency of each character in the current word
9 int[] tempFreq = new int[26];
10 for (char ch : subsetWord.toCharArray()) {
11 // Increment character frequency
12 tempFreq[ch - 'a']++;
13
14 // Update the maxSubsetFreq array with the maximum frequency needed for this character
15 maxSubsetFreq[ch - 'a'] = Math.max(maxSubsetFreq[ch - 'a'], tempFreq[ch - 'a']);
16 }
17 }
18
19 // This will store our final result
20 List<String> result = new ArrayList<>();
21
22 // Loop through each word in universalSet
23 for (String word : universalSet) {
24 // Temporary array to store frequency of each character in the current word
25 int[] wordFreq = new int[26];
26 for (char ch : word.toCharArray()) {
27 // Increment character frequency
28 wordFreq[ch - 'a']++;
29 }
30
31 // Check if the current word contains all the required characters in proper frequency
32 boolean isUniversal = true;
33 for (int i = 0; i < 26; ++i) {
34 if (maxSubsetFreq[i] > wordFreq[i]) {
35 // If any character is found in less frequency than required,
36 // mark word as non-universal, and break the loop
37 isUniversal = false;
38 break;
39 }
40 }
41
42 // If the word is universal, add it to the result list
43 if (isUniversal) {
44 result.add(word);
45 }
46 }
47
48 // Return the list of all universal words
49 return result;
50 }
51}
52
1#include <vector>
2#include <algorithm>
3#include <string>
4#include <cstring>
5
6class Solution {
7public:
8 // Determines the words from words1 that are universal for words2.
9 vector<string> wordSubsets(vector<string>& words1, vector<string>& words2) {
10 int maxCharFrequencies[26] = {0}; // Array to store the maximum frequency of each character required across all words in words2
11 int currentWordFrequencies[26]; // Array to store the frequency of characters in the current word
12
13 // Calculate the maximum frequency of each character across all words in words2
14 for (const auto& wordB : words2) {
15 memset(currentWordFrequencies, 0, sizeof(currentWordFrequencies));
16 for (const char &ch : wordB) {
17 currentWordFrequencies[ch - 'a']++;
18 }
19 // Update maxCharFrequencies with the maximum frequency for each character
20 for (int i = 0; i < 26; ++i) {
21 maxCharFrequencies[i] = std::max(maxCharFrequencies[i], currentWordFrequencies[i]);
22 }
23 }
24
25 vector<string> universalWords; // Vector to store the universal words
26
27 // Iterate over each word in words1 to determine if it is universal.
28 for (const auto& wordA : words1) {
29 memset(currentWordFrequencies, 0, sizeof(currentWordFrequencies));
30 for (const char &ch : wordA) {
31 currentWordFrequencies[ch - 'a']++;
32 }
33 // Check if wordA has at least the maximum frequency of each character required by words in words2
34 bool isUniversal = true;
35 for (int i = 0; i < 26; ++i) {
36 if (maxCharFrequencies[i] > currentWordFrequencies[i]) {
37 isUniversal = false;
38 break;
39 }
40 }
41 // If the word is universal, add it to the resulting vector.
42 if (isUniversal) {
43 universalWords.emplace_back(wordA);
44 }
45 }
46
47 return universalWords; // Return the vector containing all universal words
48 }
49};
50
1// Type definition for a frequency array representing characters 'a' to 'z'
2type CharFrequencyArray = number[];
3
4// Calculates character frequencies in a given word
5function calculateCharFrequencies(word: string): CharFrequencyArray {
6 const frequencies: CharFrequencyArray = new Array(26).fill(0);
7 for (const char of word) {
8 frequencies[char.charCodeAt(0) - 'a'.charCodeAt(0)]++;
9 }
10 return frequencies;
11}
12
13// Determine the words from words1 that are universal for words2
14function wordSubsets(words1: string[], words2: string[]): string[] {
15 let maxCharFrequencies: CharFrequencyArray = new Array(26).fill(0); // Store max frequency of each char across all words in words2
16
17 // Calculate max frequency of each character required by words in words2
18 for (const word of words2) {
19 const currentWordFrequencies = calculateCharFrequencies(word);
20 // Update max frequencies
21 for (let i = 0; i < 26; i++) {
22 maxCharFrequencies[i] = Math.max(maxCharFrequencies[i], currentWordFrequencies[i]);
23 }
24 }
25
26 const universalWords: string[] = []; // Storage for universal words
27
28 // Check each word in words1 for universality
29 for (const word of words1) {
30 const currentWordFrequencies = calculateCharFrequencies(word);
31 let isUniversal = true; // Assume word is universal until proven otherwise
32
33 // If word has fewer of any character than required, not universal
34 for (let i = 0; i < 26; i++) {
35 if (maxCharFrequencies[i] > currentWordFrequencies[i]) {
36 isUniversal = false;
37 break;
38 }
39 }
40
41 // Add universal words to the result list
42 if (isUniversal) {
43 universalWords.push(word);
44 }
45 }
46
47 return universalWords; // Return the list of all universal words
48}
49
Time and Space Complexity
Time Complexity
The given code has two main sections contributing to the time complexity:
-
Building the Universal Counter:
- Iterating over each word in
words2
and updating thecnt
counter for each character. - If
M
is the average length of words inwords2
andN2
is the number of words inwords2
, this part isO(M * N2)
in the worst case.
- Iterating over each word in
-
Checking if Words in
words1
are Universal:- For each word in
words1
, a counter is created and checked against thecnt
counter. - If
L
is the average length of words inwords1
andN1
is the number of words inwords1
, and ifK
is the number of unique characters incnt
(bounded by the alphabet size), then checking each word has complexityO(L + K)
, and doing this for all words inwords1
givesO(N1 * (L + K))
.
- For each word in
Combining the two we get a total time complexity of O(M * N2) + O(N1 * (L + K))
.
- Note that
K
is bounded by the alphabet size which can be considered a constant, hence it sometimes can be omitted from the Big-O notation, simplifying toO(M * N2 + N1 * L)
.
Space Complexity
The space complexity of the code comprises the following factors:
-
The Universal Counter
cnt
:- The maximum space required is the size of the alphabet, let's denote it as
a
, which is constant. Hence, this isO(a)
.
- The maximum space required is the size of the alphabet, let's denote it as
-
Temporary Counter for Each Word in
words1
:- At most, the size of the alphabet
a
for each word. As this is temporary and not used simultaneously, it's stillO(a)
.
- At most, the size of the alphabet
-
The
ans
List:- At most, it can be as big as
N1
in case allwords1
are universal. So the space for it would beO(N1)
.
- At most, it can be as big as
Considering all, the total space complexity is O(a) + O(N1)
which simplifies to O(N1)
because a
is constant and generally considered negligible compared to N1
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of these pictures shows the visit order of a depth-first search?
Recommended Readings
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time