890. Find and Replace Pattern
Problem Description
You are given a list of strings words
and a string pattern
. Your task is to find all words from the list that match the given pattern.
A word matches the pattern if you can create a one-to-one character mapping (bijection) between the pattern and the word. This means:
- Each character in the pattern maps to exactly one character in the word
- Each character in the word maps to exactly one character in the pattern
- The mapping must be consistent throughout the entire string
For example:
- If
pattern = "abb"
andword = "egg"
, they match because we can mapa → e
andb → g
- If
pattern = "abb"
andword = "abc"
, they don't match becauseb
would need to map to bothb
andc
The solution uses two arrays m1
and m2
to track when each character was first encountered. For matching strings, characters at the same positions should have been first encountered at the same index. The algorithm iterates through both strings simultaneously:
m1[ord(a)]
stores the position where charactera
from the word was first seenm2[ord(b)]
stores the position where characterb
from the pattern was first seen- If these values differ at any position, the strings don't match
The function returns all words that successfully match the pattern, in any order.
Intuition
The key insight is that two strings follow the same pattern if their characters appear in the same relative positions. Rather than building an explicit character-to-character mapping, we can check if the pattern of first occurrences is identical.
Think about it this way: when we scan through "abb" and "egg" character by character:
- At position 0: 'a' appears for the first time, 'e' appears for the first time
- At position 1: 'b' appears for the first time, 'g' appears for the first time
- At position 2: 'b' appears again (was first seen at position 1), 'g' appears again (was first seen at position 1)
The pattern of first appearances matches perfectly! This is the core idea - instead of tracking "a maps to e, b maps to g", we track "when did each character first appear?"
We can implement this by maintaining two arrays that record the first occurrence position of each character. As we traverse both strings simultaneously:
- If a character in the word was first seen at position
i
, and the corresponding character in the pattern was first seen at positionj
, theni
must equalj
for a valid match - If they differ, the strings don't follow the same pattern
This approach elegantly handles the bijection requirement automatically. If two different characters in the pattern try to map to the same character in the word (or vice versa), their first occurrence positions will be different, causing the match to fail.
The beauty of this solution is that it transforms a seemingly complex mapping problem into a simple comparison of occurrence patterns, making the code both efficient and easy to understand.
Solution Approach
The implementation uses a helper function match(s, t)
to check if a word s
matches the pattern t
.
Data Structures:
- Two arrays
m1
andm2
of size 128 (to cover all ASCII characters) m1
tracks first occurrence positions for characters in the wordm2
tracks first occurrence positions for characters in the pattern
Algorithm Steps:
-
Initialize tracking arrays: Create
m1
andm2
with all zeros. A zero value means the character hasn't been seen yet. -
Simultaneous traversal: Use
enumerate(zip(s, t), 1)
to iterate through both strings together, starting the index from 1 (to distinguish from the initial 0 values). -
Character position checking: For each pair of characters
(a, b)
at positioni
:- Convert characters to ASCII values using
ord(a)
andord(b)
- Check if
m1[ord(a)] != m2[ord(b)]
- If they differ, it means:
- Either one character was seen before and the other wasn't
- Or both were seen before but at different positions
- In either case, return
False
as the pattern doesn't match
- Convert characters to ASCII values using
-
Update first occurrence: If the positions match (both are 0 or both have the same value), update both to the current position
i
:m1[ord(a)] = m2[ord(b)] = i
-
Filter the words: Use a list comprehension to filter all words that match the pattern:
return [word for word in words if match(word, pattern)]
Why this works:
- When both arrays have 0 at a position, both characters are appearing for the first time
- When both arrays have the same non-zero value, both characters appeared before at the same position
- Any mismatch in these values indicates the bijection property is violated
Time Complexity: O(n * m)
where n
is the number of words and m
is the length of each word/pattern
Space Complexity: O(1)
for the tracking arrays (fixed size 128)
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through finding words that match the pattern "abb"
from the list ["egg", "abc", "foo"]
.
Checking "egg" against pattern "abb":
Initialize m1
and m2
arrays (all zeros). We'll track only relevant positions for clarity:
Position | Pattern char | Word char | m2[ord(pattern)] | m1[ord(word)] | Action |
---|---|---|---|---|---|
i=1 | 'a' | 'e' | m2[97]=0 | m1[101]=0 | Both are 0 (first occurrence), set both to 1 |
i=2 | 'b' | 'g' | m2[98]=0 | m1[103]=0 | Both are 0 (first occurrence), set both to 2 |
i=3 | 'b' | 'g' | m2[98]=2 | m1[103]=2 | Both are 2 (seen at position 2), match continues |
Result: "egg" matches! The pattern holds because 'a'→'e' and 'b'→'g' consistently.
Checking "abc" against pattern "abb":
Reset m1
and m2
arrays:
Position | Pattern char | Word char | m2[ord(pattern)] | m1[ord(word)] | Action |
---|---|---|---|---|---|
i=1 | 'a' | 'a' | m2[97]=0 | m1[97]=0 | Both are 0, set both to 1 |
i=2 | 'b' | 'b' | m2[98]=0 | m1[98]=0 | Both are 0, set both to 2 |
i=3 | 'b' | 'c' | m2[98]=2 | m1[99]=0 | Mismatch! (2 ≠ 0) |
Result: "abc" doesn't match! The second 'b' in pattern should map to the same character as the first 'b', but here it maps to 'c' instead of 'b'.
Checking "foo" against pattern "abb":
Reset m1
and m2
arrays:
Position | Pattern char | Word char | m2[ord(pattern)] | m1[ord(word)] | Action |
---|---|---|---|---|---|
i=1 | 'a' | 'f' | m2[97]=0 | m1[102]=0 | Both are 0, set both to 1 |
i=2 | 'b' | 'o' | m2[98]=0 | m1[111]=0 | Both are 0, set both to 2 |
i=3 | 'b' | 'o' | m2[98]=2 | m1[111]=2 | Both are 2, match continues |
Result: "foo" matches! The pattern holds because 'a'→'f' and 'b'→'o' consistently.
Final output: ["egg", "foo"]
The algorithm elegantly detects pattern mismatches by comparing when characters were first encountered. If the "first seen" positions don't align, the bijection is broken.
Solution Implementation
1class Solution:
2 def findAndReplacePattern(self, words: List[str], pattern: str) -> List[str]:
3 def matches_pattern(word: str, pattern: str) -> bool:
4 """
5 Check if a word matches the given pattern by verifying bijective mapping.
6
7 Two arrays track the last seen position of each character.
8 If characters should map to each other, they must have the same last seen position.
9 """
10 # Initialize mapping arrays for ASCII characters (size 128)
11 # These arrays store the last index where each character was seen
12 word_last_seen = [0] * 128
13 pattern_last_seen = [0] * 128
14
15 # Iterate through word and pattern simultaneously with 1-based indexing
16 for index, (word_char, pattern_char) in enumerate(zip(word, pattern), start=1):
17 # Get ASCII values of current characters
18 word_char_ascii = ord(word_char)
19 pattern_char_ascii = ord(pattern_char)
20
21 # If the last seen positions don't match, the mapping is inconsistent
22 if word_last_seen[word_char_ascii] != pattern_last_seen[pattern_char_ascii]:
23 return False
24
25 # Update last seen position for both characters
26 word_last_seen[word_char_ascii] = index
27 pattern_last_seen[pattern_char_ascii] = index
28
29 return True
30
31 # Filter and return words that match the pattern
32 return [word for word in words if matches_pattern(word, pattern)]
33
1class Solution {
2 /**
3 * Finds all words in the array that match the given pattern.
4 * A word matches the pattern if there exists a bijection between
5 * characters in the word and characters in the pattern.
6 *
7 * @param words Array of words to check against the pattern
8 * @param pattern The pattern string to match
9 * @return List of words that match the pattern
10 */
11 public List<String> findAndReplacePattern(String[] words, String pattern) {
12 List<String> matchingWords = new ArrayList<>();
13
14 // Check each word against the pattern
15 for (String word : words) {
16 if (isPatternMatch(word, pattern)) {
17 matchingWords.add(word);
18 }
19 }
20
21 return matchingWords;
22 }
23
24 /**
25 * Checks if a word matches a pattern by verifying character mappings.
26 * Uses the first occurrence index technique to ensure bijective mapping.
27 *
28 * @param word The word to check
29 * @param pattern The pattern to match against
30 * @return true if the word matches the pattern, false otherwise
31 */
32 private boolean isPatternMatch(String word, String pattern) {
33 // Arrays to store the first occurrence index of each character
34 // Size 128 covers all ASCII characters
35 int[] wordCharMapping = new int[128];
36 int[] patternCharMapping = new int[128];
37
38 // Iterate through each character position
39 for (int i = 0; i < word.length(); i++) {
40 char wordChar = word.charAt(i);
41 char patternChar = pattern.charAt(i);
42
43 // Check if the first occurrence indices match
44 // If they don't match, the bijection is violated
45 if (wordCharMapping[wordChar] != patternCharMapping[patternChar]) {
46 return false;
47 }
48
49 // Store the position (i + 1) to distinguish from default 0
50 // This ensures unvisited characters have value 0
51 wordCharMapping[wordChar] = i + 1;
52 patternCharMapping[patternChar] = i + 1;
53 }
54
55 return true;
56 }
57}
58
1class Solution {
2public:
3 vector<string> findAndReplacePattern(vector<string>& words, string pattern) {
4 vector<string> result;
5
6 // Lambda function to check if two strings follow the same pattern
7 auto isPatternMatch = [](const string& word, const string& pattern) -> bool {
8 // Arrays to store the last seen position of each character
9 // Using ASCII table size (128) for character mapping
10 int wordCharMapping[128] = {0};
11 int patternCharMapping[128] = {0};
12
13 // Check if both strings follow the same pattern
14 for (int i = 0; i < word.size(); ++i) {
15 char wordChar = word[i];
16 char patternChar = pattern[i];
17
18 // If the last seen positions don't match, patterns are different
19 if (wordCharMapping[wordChar] != patternCharMapping[patternChar]) {
20 return false;
21 }
22
23 // Update last seen position (using i+1 to distinguish from uninitialized 0)
24 wordCharMapping[wordChar] = i + 1;
25 patternCharMapping[patternChar] = i + 1;
26 }
27
28 return true;
29 };
30
31 // Check each word against the pattern
32 for (const auto& word : words) {
33 if (isPatternMatch(word, pattern)) {
34 result.push_back(word);
35 }
36 }
37
38 return result;
39 }
40};
41
1/**
2 * Finds all words that match the given pattern where each letter in the pattern
3 * maps to exactly one unique letter in the word (bijection mapping)
4 * @param words - Array of words to check against the pattern
5 * @param pattern - Pattern string to match against
6 * @returns Array of words that match the pattern
7 */
8function findAndReplacePattern(words: string[], pattern: string): string[] {
9 return words.filter((word: string) => {
10 // Map to store the last seen index of each character in the word
11 const wordCharIndexMap: Map<string, number> = new Map<string, number>();
12
13 // Map to store the last seen index of each character in the pattern
14 const patternCharIndexMap: Map<string, number> = new Map<string, number>();
15
16 // Iterate through each character position
17 for (let i: number = 0; i < word.length; i++) {
18 // Get the last seen index for current characters in both word and pattern
19 // If characters haven't been seen before, undefined will be returned
20 const wordCharLastIndex: number | undefined = wordCharIndexMap.get(word[i]);
21 const patternCharLastIndex: number | undefined = patternCharIndexMap.get(pattern[i]);
22
23 // If the mapping is inconsistent (different last seen indices),
24 // the word doesn't match the pattern
25 if (wordCharLastIndex !== patternCharLastIndex) {
26 return false;
27 }
28
29 // Update the last seen index for both characters
30 wordCharIndexMap.set(word[i], i);
31 patternCharIndexMap.set(pattern[i], i);
32 }
33
34 // Word matches the pattern
35 return true;
36 });
37}
38
Time and Space Complexity
Time Complexity: O(n * m)
where n
is the number of words in the input list and m
is the length of each word/pattern.
- The outer loop iterates through all
n
words in the list - For each word, the
match
function is called which:- Iterates through
m
characters (usingzip
andenumerate
) - Each iteration performs constant time operations: array lookups
O(1)
and assignmentsO(1)
- Iterates through
- Therefore, the total time complexity is
O(n * m)
Space Complexity: O(1)
for the auxiliary space (excluding the output list).
- The
match
function creates two fixed-size arraysm1
andm2
, each of size 128 (for ASCII characters) - These arrays have constant size regardless of input size, contributing
O(1)
space - The list comprehension creates an output list that could contain up to
n
words in the worst case, which would beO(n * m)
for the output - However, considering only auxiliary space used by the algorithm itself (not including the output), the space complexity is
O(1)
If we include the output space, the total space complexity would be O(n * m)
in the worst case where all words match the pattern.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Handling of Different Length Strings
One common pitfall is assuming that the word and pattern will always have the same length. If they don't, the zip()
function will silently truncate to the shorter length, potentially causing false positives.
Problem Example:
- Pattern:
"ab"
- Word:
"abc"
- The zip would only check
"ab"
vs"ab"
and miss the extra"c"
Solution: Add a length check before processing:
def matches_pattern(word: str, pattern: str) -> bool:
if len(word) != len(pattern):
return False
# ... rest of the code
2. Using Dictionary Instead of Array for Character Mapping
While using a dictionary seems more intuitive, it can lead to subtle bugs if not handled carefully:
Problematic approach:
def matches_pattern(word: str, pattern: str) -> bool:
word_to_pattern = {}
pattern_to_word = {}
for w, p in zip(word, pattern):
if w in word_to_pattern:
if word_to_pattern[w] != p:
return False
else:
word_to_pattern[w] = p
# Bug: Forgetting to check reverse mapping
return True
This only checks one direction of the bijection and would incorrectly accept cases where multiple word characters map to the same pattern character.
Correct dictionary approach:
def matches_pattern(word: str, pattern: str) -> bool:
if len(word) != len(pattern):
return False
word_to_pattern = {}
pattern_to_word = {}
for w, p in zip(word, pattern):
if w in word_to_pattern:
if word_to_pattern[w] != p:
return False
else:
word_to_pattern[w] = p
if p in pattern_to_word:
if pattern_to_word[p] != w:
return False
else:
pattern_to_word[p] = w
return True
3. Starting Index from 0 Instead of 1
If you start the enumerate index from 0, the initial array values (also 0) would be indistinguishable from "character seen at position 0", causing incorrect pattern matching:
Problematic code:
for index, (word_char, pattern_char) in enumerate(zip(word, pattern)): # Starts from 0
word_char_ascii = ord(word_char)
pattern_char_ascii = ord(pattern_char)
# Bug: Can't distinguish between "not seen" and "seen at position 0"
if word_last_seen[word_char_ascii] != pattern_last_seen[pattern_char_ascii]:
return False
Example where this fails:
- Pattern:
"aa"
- Word:
"bb"
- At index 0: Both arrays are 0 (initial value), so we set them to 0
- At index 1: Both arrays are still 0 (we set them to 0 in previous step), condition passes incorrectly
Solution:
Always use enumerate(zip(word, pattern), start=1)
to avoid this ambiguity.
4. Array Size Assumptions
Using an array of size 128 assumes only ASCII characters. If the input contains Unicode characters beyond ASCII range, this will cause an IndexError:
Solution for Unicode support:
def matches_pattern(word: str, pattern: str) -> bool:
if len(word) != len(pattern):
return False
word_mapping = {}
pattern_mapping = {}
for index, (w, p) in enumerate(zip(word, pattern), start=1):
word_last = word_mapping.get(w, 0)
pattern_last = pattern_mapping.get(p, 0)
if word_last != pattern_last:
return False
word_mapping[w] = index
pattern_mapping[p] = index
return True
Which data structure is used to implement recursion?
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!