1178. Number of Valid Words for Each Puzzle
Problem Description
You are given a list of words
and a list of puzzles
. For each puzzle, you need to count how many words from the word list are "valid" with respect to that puzzle.
A word is considered valid for a puzzle if it satisfies both of these conditions:
- The word must contain the first letter of the puzzle
- Every letter in the word must exist somewhere in the puzzle
For example, if the puzzle is "abcdefg"
:
"faced"
is valid (contains 'a' which is the first letter, and all letters f,a,c,e,d are in the puzzle)"cabbage"
is valid (contains 'a' and all its letters are in the puzzle)"baggage"
is valid (contains 'a' and all its letters are in the puzzle)"beefed"
is invalid (doesn't contain 'a', the first letter of the puzzle)"based"
is invalid (contains 's' which is not in the puzzle)
Your task is to return an array answer
where answer[i]
represents the number of valid words for puzzles[i]
.
The solution uses bit manipulation to efficiently handle this problem. Each word and puzzle is converted to a bitmask where each bit represents whether a specific letter (a-z) is present. Words are preprocessed and counted by their bitmask representation using a hash table. For each puzzle, the solution enumerates all possible subsets of its letters that include the first letter, and counts how many words match each subset by looking up the hash table. This approach is efficient because puzzles have a fixed length of 7, making subset enumeration feasible.
Intuition
The key insight is recognizing that we only care about which unique letters are present in each word, not their frequency or order. This immediately suggests using a set-like representation for each word.
Since we're dealing with lowercase English letters (only 26 possibilities), we can use bit manipulation where each bit position represents whether a specific letter exists. For example, if bit 0 is set, it means 'a' is present; if bit 1 is set, 'b' is present, and so on. This gives us a compact integer representation for any combination of letters.
The naive approach would be to check every word against every puzzle, but this would be inefficient. Instead, we can preprocess all words into their bitmask representations and count how many words share the same bitmask using a hash table. This is beneficial because multiple words might have the same set of unique letters (like "abc", "cab", "bca" all have the same bitmask).
For each puzzle, we need to find all words whose letters are a subset of the puzzle's letters AND contain the puzzle's first letter. Here's where the crucial observation comes in: since puzzles have exactly 7 letters, there are at most 2^7 = 128
possible subsets for each puzzle. This is a manageable number to enumerate.
We can iterate through all subsets of a puzzle's bitmask that include the first letter. For each such subset, we check our hash table to see how many words have exactly that bitmask. The sum of all these counts gives us the answer for that puzzle.
The subset enumeration technique j = (j - 1) & mask
is a clever bit manipulation trick that efficiently generates all subsets of a given bitmask in descending order, allowing us to check each valid combination exactly once.
Learn more about Trie patterns.
Solution Approach
The solution uses state compression with bit manipulation, a hash table for counting, and subset enumeration. Let's walk through the implementation step by step:
Step 1: Preprocessing Words with State Compression
First, we convert each word into a bitmask and count occurrences:
cnt = Counter()
for w in words:
mask = 0
for c in w:
mask |= 1 << (ord(c) - ord("a"))
cnt[mask] += 1
For each word, we create a bitmask where bit i
is set if letter i
appears in the word. The expression 1 << (ord(c) - ord("a"))
creates a bitmask with only the bit corresponding to character c
set. We use the OR operation |=
to combine all character bits. The Counter stores how many words map to each unique bitmask.
Step 2: Processing Each Puzzle
For each puzzle, we need to find all valid words:
ans = []
for p in puzzles:
mask = 0
for c in p:
mask |= 1 << (ord(c) - ord("a"))
Similar to words, we first create a bitmask for the puzzle representing all its letters.
Step 3: Subset Enumeration with First Letter Constraint
x, i, j = 0, ord(p[0]) - ord("a"), mask
while j:
if j >> i & 1:
x += cnt[j]
j = (j - 1) & mask
ans.append(x)
Here's the clever part:
x
accumulates the count of valid words for this puzzlei
stores the bit position of the puzzle's first letterj
iterates through all subsets of the puzzle's bitmask
The subset enumeration uses the technique j = (j - 1) & mask
. This generates all subsets of mask
in descending order. Here's how it works:
- Subtracting 1 from
j
flips all trailing 1s to 0s and the rightmost 0 to 1 - ANDing with
mask
ensures we only keep bits that were originally in the puzzle - This efficiently generates each subset exactly once
The condition if j >> i & 1
checks if the current subset j
contains the puzzle's first letter (bit at position i
). Only if this condition is true do we add the count of words with this exact bitmask to our answer.
Time Complexity: O(N Γ W + M Γ 2^7) where N is the number of words, W is the average word length, and M is the number of puzzles. The 2^7 factor comes from enumerating at most 128 subsets per puzzle.
Space Complexity: O(N) for storing the hash table of word bitmasks.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example with words = ["ace", "aec", "face"]
and puzzles = ["cafe"]
.
Step 1: Preprocessing Words
Convert each word to a bitmask:
-
"ace"
:- 'a' β bit 0:
mask = 0000...001
- 'c' β bit 2:
mask = 0000...101
- 'e' β bit 4:
mask = 0001...0101
(binary: 10101, decimal: 21)
- 'a' β bit 0:
-
"aec"
: Same letters as "ace" β bitmask = 21 -
"face"
:- 'f' β bit 5: starts with
0010...0000
- 'a' β bit 0: becomes
0010...0001
- 'c' β bit 2: becomes
0010...0101
- 'e' β bit 4: becomes
0011...0101
(binary: 110101, decimal: 53)
- 'f' β bit 5: starts with
Our hash table cnt
:
cnt[21] = 2
(both "ace" and "aec" map to this)cnt[53] = 1
("face" maps to this)
Step 2: Processing Puzzle "cafe"
Convert puzzle to bitmask:
- 'c' β bit 2, 'a' β bit 0, 'f' β bit 5, 'e' β bit 4
- Puzzle bitmask =
0011...0101
(binary: 110101, decimal: 53)
Step 3: Enumerate Subsets
First letter is 'c' (bit position 2). We enumerate all subsets of 53 and check if they contain bit 2:
Starting with j = 53
(binary: 110101):
j = 53
(110101): Contains bit 2? YES β Checkcnt[53] = 1
β Add 1 to countj = 52
(110100): Contains bit 2? YES β Checkcnt[52] = 0
β Add 0j = 37
(100101): Contains bit 2? YES β Checkcnt[37] = 0
β Add 0j = 36
(100100): Contains bit 2? YES β Checkcnt[36] = 0
β Add 0j = 21
(010101): Contains bit 2? YES β Checkcnt[21] = 2
β Add 2 to countj = 20
(010100): Contains bit 2? YES β Checkcnt[20] = 0
β Add 0j = 5
(000101): Contains bit 2? YES β Checkcnt[5] = 0
β Add 0j = 4
(000100): Contains bit 2? YES β Checkcnt[4] = 0
β Add 0- Continue until
j = 0
...
Total valid words for puzzle "cafe" = 1 + 2 = 3
All three words are valid because:
- "ace" and "aec" contain 'c' (first letter) and all their letters {a,c,e} are in {c,a,f,e}
- "face" contains 'c' (first letter) and all its letters {f,a,c,e} are in {c,a,f,e}
Solution Implementation
1class Solution:
2 def findNumOfValidWords(self, words: List[str], puzzles: List[str]) -> List[int]:
3 # Count frequency of each word's letter bitmask
4 # Each bit position represents a letter (a=0, b=1, ..., z=25)
5 word_bitmask_count = Counter()
6
7 for word in words:
8 bitmask = 0
9 # Convert word to bitmask representation
10 for char in word:
11 bit_position = ord(char) - ord('a')
12 bitmask |= 1 << bit_position
13 word_bitmask_count[bitmask] += 1
14
15 result = []
16
17 for puzzle in puzzles:
18 # Create bitmask for puzzle letters
19 puzzle_bitmask = 0
20 for char in puzzle:
21 bit_position = ord(char) - ord('a')
22 puzzle_bitmask |= 1 << bit_position
23
24 # Count valid words for this puzzle
25 valid_word_count = 0
26 first_letter_bit = ord(puzzle[0]) - ord('a')
27
28 # Iterate through all subsets of puzzle_bitmask that contain the first letter
29 # Using technique: iterate through subsets by decrementing and ANDing with original mask
30 subset = puzzle_bitmask
31 while subset:
32 # Check if subset contains the first letter of puzzle
33 if (subset >> first_letter_bit) & 1:
34 valid_word_count += word_bitmask_count[subset]
35 # Move to next subset of puzzle_bitmask
36 subset = (subset - 1) & puzzle_bitmask
37
38 result.append(valid_word_count)
39
40 return result
41
1class Solution {
2 public List<Integer> findNumOfValidWords(String[] words, String[] puzzles) {
3 // Map to store bitmask representation of words and their frequency
4 Map<Integer, Integer> wordMaskCount = new HashMap<>(words.length);
5
6 // Convert each word to a bitmask and count occurrences
7 for (String word : words) {
8 int bitmask = 0;
9 // Set bit for each unique character in the word
10 for (int i = 0; i < word.length(); i++) {
11 int charIndex = word.charAt(i) - 'a';
12 bitmask |= (1 << charIndex);
13 }
14 // Increment count for this bitmask pattern
15 wordMaskCount.merge(bitmask, 1, Integer::sum);
16 }
17
18 // List to store result for each puzzle
19 List<Integer> result = new ArrayList<>();
20
21 // Process each puzzle
22 for (String puzzle : puzzles) {
23 int puzzleMask = 0;
24 // Create bitmask for all characters in the puzzle
25 for (int i = 0; i < puzzle.length(); i++) {
26 int charIndex = puzzle.charAt(i) - 'a';
27 puzzleMask |= (1 << charIndex);
28 }
29
30 int validWordCount = 0;
31 int firstCharIndex = puzzle.charAt(0) - 'a';
32
33 // Iterate through all subsets of the puzzle's bitmask
34 for (int subset = puzzleMask; subset > 0; subset = (subset - 1) & puzzleMask) {
35 // Check if the subset contains the first character of the puzzle
36 if ((subset >> firstCharIndex & 1) == 1) {
37 // Add count of words matching this subset
38 validWordCount += wordMaskCount.getOrDefault(subset, 0);
39 }
40 }
41
42 result.add(validWordCount);
43 }
44
45 return result;
46 }
47}
48
1class Solution {
2public:
3 vector<int> findNumOfValidWords(vector<string>& words, vector<string>& puzzles) {
4 // Map to store frequency of each word's bitmask representation
5 // Key: bitmask of unique letters in word, Value: count of words with that bitmask
6 unordered_map<int, int> wordMaskFrequency;
7
8 // Convert each word to a bitmask and count frequency
9 for (const auto& word : words) {
10 int bitmask = 0;
11 // Set bit for each unique letter in the word
12 for (const char& letter : word) {
13 bitmask |= 1 << (letter - 'a');
14 }
15 wordMaskFrequency[bitmask]++;
16 }
17
18 vector<int> result;
19
20 // Process each puzzle
21 for (const auto& puzzle : puzzles) {
22 // Create bitmask for all letters in puzzle
23 int puzzleMask = 0;
24 for (const char& letter : puzzle) {
25 puzzleMask |= 1 << (letter - 'a');
26 }
27
28 int validWordCount = 0;
29 int firstLetterBit = puzzle[0] - 'a'; // First letter of puzzle (required)
30
31 // Iterate through all subsets of puzzle's bitmask
32 // This clever iteration goes through all non-empty subsets
33 for (int subset = puzzleMask; subset > 0; subset = (subset - 1) & puzzleMask) {
34 // Check if the subset contains the first letter of puzzle
35 if ((subset >> firstLetterBit) & 1) {
36 // Add count of words matching this subset
37 validWordCount += wordMaskFrequency[subset];
38 }
39 }
40
41 result.push_back(validWordCount);
42 }
43
44 return result;
45 }
46};
47
1/**
2 * Finds the number of valid words for each puzzle.
3 * A word is valid for a puzzle if:
4 * 1. The word contains the first letter of the puzzle
5 * 2. Every letter in the word exists in the puzzle
6 *
7 * @param words - Array of words to check
8 * @param puzzles - Array of puzzles to validate against
9 * @returns Array where each element represents the count of valid words for the corresponding puzzle
10 */
11function findNumOfValidWords(words: string[], puzzles: string[]): number[] {
12 // Map to store bitmask representations of words and their frequencies
13 const wordBitmaskCount: Map<number, number> = new Map();
14
15 // Convert each word to a bitmask representation
16 for (const word of words) {
17 let bitmask = 0;
18
19 // Set bit for each unique character in the word
20 // Character 'a' maps to bit 0, 'b' to bit 1, etc.
21 for (const char of word) {
22 const bitPosition = char.charCodeAt(0) - 97; // 'a' has ASCII value 97
23 bitmask |= 1 << bitPosition;
24 }
25
26 // Increment count for this bitmask pattern
27 wordBitmaskCount.set(bitmask, (wordBitmaskCount.get(bitmask) || 0) + 1);
28 }
29
30 const result: number[] = [];
31
32 // Process each puzzle
33 for (const puzzle of puzzles) {
34 let puzzleBitmask = 0;
35
36 // Create bitmask for the puzzle
37 for (const char of puzzle) {
38 const bitPosition = char.charCodeAt(0) - 97;
39 puzzleBitmask |= 1 << bitPosition;
40 }
41
42 let validWordCount = 0;
43 const firstCharBitPosition = puzzle.charCodeAt(0) - 97; // Position of puzzle's first character
44
45 // Iterate through all subsets of the puzzle bitmask
46 // This technique generates all possible combinations of letters from the puzzle
47 for (let subset = puzzleBitmask; subset > 0; subset = (subset - 1) & puzzleBitmask) {
48 // Check if the subset contains the first letter of the puzzle
49 if ((subset >> firstCharBitPosition) & 1) {
50 // Add count of words matching this subset pattern
51 validWordCount += wordBitmaskCount.get(subset) || 0;
52 }
53 }
54
55 result.push(validWordCount);
56 }
57
58 return result;
59}
60
Time and Space Complexity
Time Complexity: O(m Γ |w| + n Γ 2^|p|)
The time complexity consists of two main parts:
-
Processing words:
O(m Γ |w|)
- We iterate through
m
words - For each word, we iterate through at most
|w|
characters to create the bitmask - Each character operation (bitwise OR) takes
O(1)
time
- We iterate through
-
Processing puzzles:
O(n Γ 2^|p|)
- We iterate through
n
puzzles - For each puzzle, we first create a bitmask in
O(|p|)
time - Then we enumerate all submasks using the technique
j = (j - 1) & mask
- This submask enumeration iterates through all submasks of
mask
, which can be at most2^|p|
submasks (where|p|
is the number of set bits in the mask, bounded by the puzzle length) - The constraint check
j >> i & 1
ensures we only count valid submasks containing the first letter
- We iterate through
Space Complexity: O(m)
The space complexity is determined by:
- The
Counter
dictionarycnt
which stores at mostm
unique bitmasks (one for each word in the worst case) - The
ans
list which storesn
results, but this is considered output space and typically not counted in auxiliary space - Other variables use
O(1)
space
Where:
m
= length of thewords
arrayn
= length of thepuzzles
array|w|
= maximum length of words in thewords
array|p|
= length of puzzles (typically 7 in this problem)
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Missing the Zero Subset
The subset enumeration loop while subset:
terminates when subset
becomes 0, but 0 is actually a valid subset (the empty subset). This means we miss counting words that consist only of the puzzle's first letter.
Example: If puzzle is "abcdefg" and we have a word "aaa", its bitmask would be just the bit for 'a'. When enumerating subsets, we never check subset = 0b0000001 (only 'a' set) if we stop at 0.
Solution: Add a special check after the loop for the subset containing only the first letter:
# After the while loop subset = 1 << first_letter_bit # Subset with only first letter valid_word_count += word_bitmask_count[subset]
2. Not Filtering Words with Too Many Unique Letters
Words can have more than 7 unique letters, but puzzles always have exactly 7 letters. A word with 8+ unique letters can never be valid for any puzzle, yet we still store them in our hash table, wasting memory.
Example: The word "acknowledgment" has 12 unique letters and will never match any 7-letter puzzle.
Solution: Skip words with more than 7 unique letters during preprocessing:
for word in words:
bitmask = 0
for char in word:
bitmask |= 1 << (ord(char) - ord('a'))
# Count number of set bits (unique letters)
if bin(bitmask).count('1') <= 7:
word_bitmask_count[bitmask] += 1
3. Incorrect Subset Enumeration Starting Point
Some implementations might accidentally start subset enumeration from puzzle_bitmask - 1
instead of puzzle_bitmask
, missing the case where all puzzle letters are used.
Example: If puzzle is "abc" with bitmask 0b111, starting from 0b110 would miss checking if any words have exactly the letters "abc".
Solution: Always initialize subset = puzzle_bitmask
before the enumeration loop to ensure we check the full set of puzzle letters first.
4. Confusing Bit Position with Bit Value
A common mistake is using 1 << first_letter_bit
when we should use just first_letter_bit
as the position, or vice versa.
Incorrect:
if subset & (1 << first_letter_bit): # Wrong - double shifting
Correct:
if (subset >> first_letter_bit) & 1: # Right - check bit at position
Which of the two traversal algorithms (BFS and DFS) can be used to find whether two nodes are connected?
Recommended Readings
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Donβt Miss This!