2135. Count Words Obtained After Adding a Letter
Problem Description
You are provided with two lists of strings: startWords
and targetWords
. Each string is composed of lowercase English letters. Your task is to determine how many strings in targetWords
can be formed from any string in startWords
by performing a specific conversion operation.
The conversion operation consists of two steps:
- Append a single lowercase letter that is not already in the string to its end. For example, if you have the string "abc", you can add "d", "e", "y", but not "a" to it, creating strings like "abcd".
- Rearrange the letters of the newly formed string in any order. For instance, the string "abcd" could be rearranged into "acbd", "bacd", "cbda", etc.
You need to count the number of strings in targetWords
that can be achieved by applying this operation on any of the strings in startWords
.
It's important to note that startWords
aren't actually altered during this process; the operation is only used to verify the possibility of transforming a startWord
into a targetWord
.
Intuition
The solution is based on a clever use of bitwise operations to track the presence of each letter in a given string. We use an integer (bitmask) to represent each string, where the ith bit (from right to left) is set to 1 if the letter ‘a’ + i is in the string. For instance, the string "abc" would result in the bitmask 0b111
(in binary), representing the presence of 'a', 'b', and 'c'.
By using this approach, we can easily check whether we can form a targetWord
from any startWord
by appending a letter and rearranging it. For each targetWord
, we can generate its bitmask and then, for each letter in that targetWord
, we toggle the corresponding bit (using XOR operation) to simulate the removal of the letter. We then check if the resulting bitmask is present in the set of bitmasks created from the startWords
. If it's found, we know that we can form the targetWord
from that startWord
.
Let’s walk through the steps of the solution:
- Create an integer set
s
. For each word instartWords
, calculate a bitmask representing the letters in the word and add it tos
. - Initialize a counter
ans
to zero, which will keep track of the number oftargetWords
that can be formed. - For every word in
targetWords
, compute the bitmask in a similar way. - For each letter in the current
targetWord
, create a temporary bitmask by toggling (using XOR operation) the bit corresponding to the current letter. If the resulting bitmask is found ins
, incrementans
by one, and proceed to check the nexttargetWord
. - Return the final count
ans
.
By representing strings as bitmasks, we switch from an O(N*26) character comparison problem to an O(N) integer comparison problem, which is much more efficient.
Learn more about Sorting patterns.
Solution Approach
The solution implements a bit manipulation strategy using Python's bitwise operations to encode each word as a bitmask, where each bit represents the presence of a corresponding letter from the alphabet.
Let's detail the steps followed in the given Python code:
-
Bitmask Creation for
startWords
:- Initialize an empty set
s
which will hold the unique bitmasks of all the words instartWords
. - Iterate over each word in
startWords
, and for each word:- Initialize a variable
mask
with a value of 0. Thismask
will be the bitmask representation of the word. - For each character
c
in the word, calculate the differenceord(c) - ord('a')
to find the position of the bit that corresponds to the character in the alphabet (0 for 'a', 1 for 'b', and so on). - Use the bitwise OR
mask |= 1 << (ord(c) - ord('a'))
operation to set the bit at the calculated position. This ensures that each bit in themask
represents whether a particular character is in the word.
- Initialize a variable
- Add the resulting
mask
to the sets
.
- Initialize an empty set
-
Verification for
targetWords
:- Initialize the
ans
counter to 0, which will count the validtargetWords
. - Repeat the similar process for each word in
targetWords
to create the bitmask. - For each character
c
in the currenttargetWord
, generate a temporary bitmaskt
by toggling the bit corresponding toc
in the word's bitmask. Toggling is done using the XOR operationmask ^ (1 << (ord(c) - ord('a')))
, which effectively simulates removing the characterc
from the word. - If the temporary bitmask
t
is found in the sets
, increment theans
because this indicates that the originaltargetWord
can be formed by addingc
to somestartWord
and rearranging the letters. - If a match is found in the set
s
for the currenttargetWord
, break the inner loop to prevent counting the sametargetWord
multiple times, and continue with the nexttargetWord
.
- Initialize the
-
Return the Result:
- After iterating over all the words in
targetWords
, return the final countans
.
- After iterating over all the words in
The algorithm leverages bitwise operations for efficient comparison and the set data structure for constant-time lookup to check the possibility of formation of targetWords
. The key patterns used here are bitwise encoding and set membership checks, which make the solution compact and efficient.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
To elaborate on the solution approach, let's walk through a small example:
Let's say we have:
startWords = ["go","bat"]
targetWords = ["bago","atb","tabg"]
Now, we will apply the solution approach on this example.
Bitmask Creation for startWords
:
- Initialize an empty set
s
. This will store the bitmasks ofstartWords
. - For "
go
":- Start with bitmask
mask = 0
. - We add 'g' (bitmask
1 << (ord('g') - ord('a')) = 1 << 6
). - We add 'o' (bitmask
1 << (ord('o') - ord('a')) = 1 << 14
). - Final bitmask for "go" is
0 | 1 << 6 | 1 << 14
, which is0b1000001000000
. - Add this bitmask to set
s
.
- Start with bitmask
- For "
bat
":- Start with bitmask
mask = 0
. - We add 'b' (bitmask
1 << (ord('b') - ord('a')) = 1 << 1
). - We add 'a' (bitmask
1 << (ord('a') - ord('a')) = 1 << 0
). - We add 't' (bitmask
1 << (ord('t') - ord('a')) = 1 << 19
). - Final bitmask for "bat" is
0 | 1 << 1 | 1 << 0 | 1 << 19
, which is0b1000000000011
. - Add this bitmask to set
s
.
- Start with bitmask
After processing startWords
, our set s
has bitmasks {0b1000001000000, 0b1000000000011}
.
Verification for targetWords
:
- Initialize
ans
to 0. - Process
targetWords
similarly and compare with bitmasks ins
. - For "
bago
":- Bitmask for "bago" is
0b1010001000010
. - Check each letter by toggling it in the bitmask:
- Toggle 'b': We get mask
0b1010001000010 ^ 1 << 1 = 0b1010001000000
which is not ins
. - Toggle 'a': We get mask
0b1010001000010 ^ 1 << 0 = 0b1010001000011
which is ins
. - No need to check further, increment
ans
to1
.
- Toggle 'b': We get mask
- Bitmask for "bago" is
- For "
atb
":- Bitmask for "atb" directly matches the bitmask of "bat" in
s
. - Since one letter has to be added, this shouldn't have happened; "atb" is the same length as "bat", thus, we skip and don't increment
ans
.
- Bitmask for "atb" directly matches the bitmask of "bat" in
- For "
tabg
":- Bitmask for "tabg" is
0b1000001000011
. - Check each letter by toggling it in the bitmask:
- Toggle 't': We get mask
0b1000001000011 ^ 1 << 19 = 0b0000001000011
, which matches "ab" (not ins
, and too short anyway). - Toggle 'a': We get mask
0b1000001000011 ^ 1 << 0 = 0b1000001000010
, which is ins
. - No need to check further, increment
ans
to2
.
- Toggle 't': We get mask
- Bitmask for "tabg" is
Return the Result:
After examining all targetWords
, the final ans
is 2
, since the words "bago
" and "tabg
" in targetWords
can be formed from the startWords
"go
" and "bat
", respectively.
The key insights from this example that lead to the count of valid targetWords
are the efficient bitmask representation of the words and constant-time set membership checks to verify the transformation possibilities.
Solution Implementation
1from typing import List
2
3class Solution:
4 def wordCount(self, start_words: List[str], target_words: List[str]) -> int:
5 bit_masks = set() # Set to store the unique bit masks for start words.
6
7 # Preprocessing start words by converting them to bit masks
8 for word in start_words:
9 mask = 0
10 for char in word:
11 # Set the bit corresponding to the character in the mask
12 mask |= 1 << (ord(char) - ord('a'))
13 bit_masks.add(mask) # Add the mask to the set
14
15 valid_target_count = 0 # Counter for valid target words
16
17 # Check if each target word can be formed by adding a single letter to a word in start_words
18 for word in target_words:
19 mask = 0
20 for char in word:
21 # Set the bit corresponding to the character in the mask
22 mask |= 1 << (ord(char) - ord('a'))
23 # Now try removing one character at a time and check if it matches a start word
24 for char in word:
25 # Toggle off the bit corresponding to the character to remove it
26 temp_mask = mask ^ (1 << (ord(char) - ord('a')))
27 # If the resulting mask is in the set, increment the valid target count.
28 if temp_mask in bit_masks:
29 valid_target_count += 1
30 break # Break, as we only need to find one such start word
31
32 return valid_target_count # Return the total count of valid target words
33
1class Solution {
2 public int wordCount(String[] startWords, String[] targetWords) {
3 // Initialize a set to store the bit masks of startWords
4 Set<Integer> startWordMasks = new HashSet<>();
5
6 // Convert each start word into a bit mask and add to the set
7 for (String word : startWords) {
8 int bitmask = 0;
9 // For each character in the word, set the corresponding bit in the bitmask
10 for (char ch : word.toCharArray()) {
11 bitmask |= (1 << (ch - 'a'));
12 }
13 // Add the bitmask to the set
14 startWordMasks.add(bitmask);
15 }
16
17 // Counter for the number of valid target words
18 int validTargetCount = 0;
19
20 // Check each target word against the bit masks of startWords
21 for (String word : targetWords) {
22 int targetBitmask = 0;
23 // Calculate the bit mask for the target word
24 for (char ch : word.toCharArray()) {
25 targetBitmask |= (1 << (ch - 'a'));
26 }
27 // Try to find a start word that matches the target word by removing one letter
28 for (char ch : word.toCharArray()) {
29 // Create a new mask by flipping the bit corresponding to the current character
30 int modifiedMask = targetBitmask ^ (1 << (ch - 'a'));
31 // If the modified mask exists in the start words, we found a valid target
32 if (startWordMasks.contains(modifiedMask)) {
33 validTargetCount++;
34 break; // We found a matching start word; move to the next target word
35 }
36 }
37 }
38 // Return the count of valid target words
39 return validTargetCount;
40 }
41}
42
1#include <vector>
2#include <string>
3#include <unordered_set>
4
5class Solution {
6public:
7 int wordCount(vector<string>& startWords, vector<string>& targetWords) {
8 // A set to store the unique representation of each startWord.
9 unordered_set<int> wordMasks;
10
11 // Convert each startWord into a bitmask representing the letters it contains.
12 for (const auto& word : startWords) {
13 int mask = 0;
14 for (char c : word)
15 mask |= (1 << (c - 'a')); // Set the bit corresponding to the letter.
16 wordMasks.insert(mask); // Add the bitmask to the set.
17 }
18
19 int count = 0; // Counter for eligible target words.
20
21 // Iterate over each target word.
22 for (const auto& word : targetWords) {
23 int mask = 0;
24 for (char c : word)
25 mask |= (1 << (c - 'a')); // Create a bitmask for the target word.
26
27 // Try removing one letter from the target word to match a start word.
28 for (char c : word) {
29 int modifiedMask = mask ^ (1 << (c - 'a')); // Remove one letter from the target word.
30 if (wordMasks.count(modifiedMask)) {
31 // If the modified target word's bitmask matches any start word's bitmask,
32 // increment the count and stop checking further letters.
33 ++count;
34 break;
35 }
36 }
37 }
38
39 return count;
40 }
41};
42
1// Define the function that counts the eligible target words.
2function wordCount(startWords: string[], targetWords: string[]): number {
3 // A set to store the unique representation of each start word.
4 const wordMasks = new Set<number>();
5
6 // Convert each start word into a bitmask representing the letters it contains.
7 startWords.forEach(startWord => {
8 let mask = 0;
9 for (const char of startWord) {
10 mask |= (1 << (char.charCodeAt(0) - 'a'.charCodeAt(0))); // Set the bit corresponding to the letter.
11 }
12 wordMasks.add(mask); // Add the bitmask to the set.
13 });
14
15 // Counter for eligible target words.
16 let count = 0;
17
18 // Iterate over each target word.
19 targetWords.forEach(targetWord => {
20 let mask = 0;
21 for (const char of targetWord) {
22 mask |= (1 << (char.charCodeAt(0) - 'a'.charCodeAt(0))); // Create a bitmask for the target word.
23 }
24
25 // Try removing one letter from the target word to match a start word.
26 for (const char of targetWord) {
27 const modifiedMask = mask ^ (1 << (char.charCodeAt(0) - 'a'.charCodeAt(0))); // Remove one letter from the target word.
28 if (wordMasks.has(modifiedMask)) {
29 // If the modified target word's bitmask matches any start word's bitmask,
30 // increment the count and stop checking further letters.
31 count++;
32 break;
33 }
34 }
35 });
36
37 // Return the count of eligible target words.
38 return count;
39}
40
Time and Space Complexity
The time complexity of the given code is O(L * (N + M))
, where L
is the average length of the words across both startWords
and targetWords
, N
is the number of startWords
, and M
is the number of targetWords
. This complexity arises because the code iterates over each character of every word in startWords
to create bit masks (L * N
operations), and then for each word in targetWords
, it does the same to get a bitmask and tries to find if there is any word in startWords
that can be turned into this target word by adding one letter (L * M
operations).
The space complexity is O(N)
because we store bit masks of all startWords
in a set s
. If k
is the size of the alphabet and all words are of length L
, the actual bit masks would take up L
bits each, but since a set of integers is used here and the letters are mapped to bit indices, it only matters how many different bit masks there are, which corresponds to the number of startWords
(or N
).
Learn more about how to find time and space complexity quickly using problem constraints.
Which type of traversal does breadth first search do?
Recommended Readings
Sorting Summary Comparisons We presented quite a few sorting algorithms and it is essential to know the advantages and disadvantages of each one The basic algorithms are easy to visualize and easy to learn for beginner programmers because of their simplicity As such they will suffice if you don't know any advanced
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!