44. Wildcard Matching
Problem Description
You need to implement a wildcard pattern matching algorithm that checks if a given string s
matches a pattern p
. The pattern can contain two special wildcard characters:
'?'
- Matches exactly one character (any single character)'*'
- Matches any sequence of characters (can match zero, one, or multiple characters)
The matching must cover the entire input string, not just a partial match.
For example:
- Pattern
"a*b"
would match strings like"ab"
,"aab"
,"aaab"
,"axyzb"
etc. - Pattern
"a?b"
would match strings like"aab"
,"acb"
,"adb"
but not"ab"
or"aaab"
- Pattern
"*"
would match any string including empty string - Pattern
"a*b*c"
would match"abc"
,"axxxbxxxc"
,"abbc"
, etc.
The solution uses a recursive approach with memoization. The function dfs(i, j)
checks if the substring of s
starting at index i
matches the substring of pattern p
starting at index j
.
The key logic handles four cases:
- When we've processed all characters in
s
(base case) - When we've processed all characters in
p
(base case) - When the current pattern character is
'*'
- we try three possibilities: skip a character ins
, skip both current characters, or skip the'*'
in pattern - When the current pattern character is
'?'
or a regular character - we check for a match and continue with the next characters in both strings
The @cache
decorator is used to store previously computed results and avoid redundant calculations, improving the time complexity.
Intuition
When we see a pattern matching problem with wildcards, the natural approach is to process both strings character by character and make decisions based on what we encounter. Think of it like walking through both strings simultaneously with two pointers.
The key insight is that at each position, we need to make a decision based on the current characters we're looking at. If we see a regular character or '?'
, the decision is straightforward - they either match or they don't. But when we encounter '*'
, things get interesting because it can match any number of characters.
For the '*'
wildcard, we have multiple choices:
- We can match it with nothing (skip the
'*'
and move to the next pattern character) - We can match it with one character from
s
(move forward ins
but stay at'*'
in case we need to match more) - We can match it with one character and move past the
'*'
(move forward in both strings)
This branching nature suggests a recursive solution where we explore all possible matching paths. However, we might encounter the same subproblem multiple times. For instance, pattern "a*b*c"
matching string "aaabbbccc"
could reach the state (i=3, j=3)
through different paths - this is where memoization becomes crucial.
The recursive structure emerges naturally:
- Base cases: What happens when we reach the end of either string?
- Recursive cases: How do we handle each type of pattern character?
By breaking down the problem into "does substring s[i:]
match pattern p[j:]
?", we transform it into smaller subproblems. The answer to our original question is simply whether s[0:]
matches p[0:]
, which is dfs(0, 0)
.
The beauty of this approach is that it mirrors how we would manually check if a string matches a pattern - we'd go character by character, and when we see a '*'
, we'd consider all possibilities of what it could match.
Learn more about Greedy, Recursion and Dynamic Programming patterns.
Solution Approach
The solution implements a recursive approach with memoization using a depth-first search function dfs(i, j)
. This function determines whether the substring s[i:]
matches the pattern p[j:]
.
Let's walk through the implementation step by step:
1. Base Cases:
When i >= len(s)
(we've processed all characters in s
):
- Return
True
ifj >= len(p)
(both strings are exhausted) - Or if
p[j] == '*'
anddfs(i, j + 1)
isTrue
(remaining pattern can be all'*'
which matches empty string)
When j >= len(p)
(pattern is exhausted but string is not):
- Return
False
since we have unmatched characters ins
2. Handling the '*'
Wildcard:
When p[j] == '*'
, we have three possible transitions:
dfs(i + 1, j)
: Match'*'
with one character froms
and keep'*'
for potential future matchesdfs(i + 1, j + 1)
: Match'*'
with one character froms
and move past'*'
dfs(i, j + 1)
: Match'*'
with zero characters (skip the'*'
)
The function returns True
if any of these paths lead to a successful match.
3. Handling Regular Characters and '?'
:
For non-'*'
characters:
- Check if
p[j] == '?'
(matches any single character) ors[i] == p[j]
(exact character match) - If there's a match, recursively check
dfs(i + 1, j + 1)
for the remaining substrings - Return
True
only if both conditions are satisfied
4. Memoization with @cache
:
The @cache
decorator automatically stores the results of dfs(i, j)
calls. This prevents redundant computations when the same (i, j)
pair is encountered through different recursive paths, reducing time complexity from exponential to polynomial.
Time Complexity: O(m * n)
where m
is the length of string s
and n
is the length of pattern p
, since each (i, j)
pair is computed at most once.
Space Complexity: O(m * n)
for the memoization cache plus O(m + n)
for the recursion stack.
The algorithm systematically explores all valid matching possibilities while efficiently pruning redundant paths through memoization, making it an optimal solution for wildcard pattern matching.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's trace through matching string s = "abc"
with pattern p = "a*c"
.
We start with dfs(0, 0)
:
i = 0, j = 0
: Looking ats[0] = 'a'
andp[0] = 'a'
- Neither base case applies (both strings have remaining characters)
p[0]
is not'*'
, so we check if characters matchs[0] == p[0]
('a' == 'a'), so we recursively calldfs(1, 1)
Next, dfs(1, 1)
:
i = 1, j = 1
: Looking ats[1] = 'b'
andp[1] = '*'
p[1] == '*'
, so we have three options to explore:- Option 1:
dfs(2, 1)
- match '' with 'b', keep '' for more matches - Option 2:
dfs(2, 2)
- match '' with 'b', move past '' - Option 3:
dfs(1, 2)
- skip '*' without matching anything
- Option 1:
Let's explore Option 1: dfs(2, 1)
:
i = 2, j = 1
: Looking ats[2] = 'c'
andp[1] = '*'
p[1] == '*'
, so again three options:dfs(3, 1)
- match '' with 'c', keep ''dfs(3, 2)
- match '' with 'c', move past ''dfs(2, 2)
- skip '*'
Following dfs(3, 2)
:
i = 3 >= len(s)
, we've exhausted strings
j = 2
: Looking atp[2] = 'c'
- Since
p[2]
is not '*' and we have no characters left ins
, returnFalse
Let's backtrack and try dfs(2, 2)
from the previous level:
i = 2, j = 2
: Looking ats[2] = 'c'
andp[2] = 'c'
- Characters match ('c' == 'c'), so call
dfs(3, 3)
Finally, dfs(3, 3)
:
i = 3 >= len(s)
andj = 3 >= len(p)
- Both strings are exhausted, return
True
The True
value propagates back up through the recursive calls, ultimately returning True
for dfs(0, 0)
.
The memoization cache would store results like:
(3, 3) → True
(2, 2) → True
(3, 2) → False
(2, 1) → True
(1, 1) → True
(0, 0) → True
This prevents recalculating these subproblems if encountered again through different paths.
Solution Implementation
1class Solution:
2 def isMatch(self, s: str, p: str) -> bool:
3 """
4 Wildcard pattern matching with '?' and '*'.
5 '?' matches any single character.
6 '*' matches any sequence of characters (including empty sequence).
7
8 Args:
9 s: The input string to match.
10 p: The pattern string containing wildcards.
11
12 Returns:
13 True if the entire string matches the pattern, False otherwise.
14 """
15 from functools import cache
16
17 @cache
18 def dfs(string_idx: int, pattern_idx: int) -> bool:
19 """
20 Recursively check if substring s[string_idx:] matches pattern p[pattern_idx:].
21
22 Args:
23 string_idx: Current index in the string s.
24 pattern_idx: Current index in the pattern p.
25
26 Returns:
27 True if the remaining substring matches the remaining pattern.
28 """
29 # Base case: reached end of string
30 if string_idx >= len(s):
31 # Check if pattern is also exhausted or only has '*' remaining
32 return pattern_idx >= len(p) or (p[pattern_idx] == "*" and dfs(string_idx, pattern_idx + 1))
33
34 # Base case: pattern exhausted but string still has characters
35 if pattern_idx >= len(p):
36 return False
37
38 # Case 1: current pattern character is '*'
39 if p[pattern_idx] == "*":
40 # Three options:
41 # 1. Match one character from string and stay at '*' (match more)
42 # 2. Match one character from string and move past '*' (match exactly one)
43 # 3. Match zero characters and move past '*' (match empty)
44 return (dfs(string_idx + 1, pattern_idx) or
45 dfs(string_idx + 1, pattern_idx + 1) or
46 dfs(string_idx, pattern_idx + 1))
47
48 # Case 2: current pattern is '?' or exact character match
49 # Both string and pattern must advance together
50 return ((p[pattern_idx] == "?" or s[string_idx] == p[pattern_idx]) and
51 dfs(string_idx + 1, pattern_idx + 1))
52
53 # Start matching from the beginning of both strings
54 return dfs(0, 0)
55
1class Solution {
2 // Memoization table to store results of subproblems
3 // dp[i][j] represents if s[i:] matches p[j:]
4 private Boolean[][] dp;
5
6 // Character arrays for the input string and pattern
7 private char[] sChars;
8 private char[] pChars;
9
10 // Lengths of the input string and pattern
11 private int sLength;
12 private int pLength;
13
14 /**
15 * Determines if the string matches the given wildcard pattern.
16 * '?' matches any single character
17 * '*' matches any sequence of characters (including empty sequence)
18 *
19 * @param s the input string to match
20 * @param p the wildcard pattern
21 * @return true if the string matches the pattern, false otherwise
22 */
23 public boolean isMatch(String s, String p) {
24 // Convert strings to character arrays for easier access
25 this.sChars = s.toCharArray();
26 this.pChars = p.toCharArray();
27
28 // Store lengths
29 this.sLength = s.length();
30 this.pLength = p.length();
31
32 // Initialize memoization table
33 this.dp = new Boolean[sLength][pLength];
34
35 // Start DFS from the beginning of both strings
36 return dfs(0, 0);
37 }
38
39 /**
40 * Recursive DFS function with memoization to check pattern matching.
41 *
42 * @param sIndex current index in the string
43 * @param pIndex current index in the pattern
44 * @return true if s[sIndex:] matches p[pIndex:], false otherwise
45 */
46 private boolean dfs(int sIndex, int pIndex) {
47 // Base case: reached end of string
48 if (sIndex >= sLength) {
49 // Check if we've also reached end of pattern
50 // or if remaining pattern is '*' which can match empty string
51 return pIndex >= pLength || (pChars[pIndex] == '*' && dfs(sIndex, pIndex + 1));
52 }
53
54 // Base case: reached end of pattern but not end of string
55 if (pIndex >= pLength) {
56 return false;
57 }
58
59 // Check memoization table to avoid redundant computation
60 if (dp[sIndex][pIndex] != null) {
61 return dp[sIndex][pIndex];
62 }
63
64 // Handle wildcard '*' - can match zero or more characters
65 if (pChars[pIndex] == '*') {
66 // Three choices:
67 // 1. Match one or more characters: move string pointer (sIndex + 1, pIndex)
68 // 2. Match exactly one character: move both pointers (sIndex + 1, pIndex + 1)
69 // 3. Match zero characters: skip the '*' (sIndex, pIndex + 1)
70 dp[sIndex][pIndex] = dfs(sIndex + 1, pIndex) ||
71 dfs(sIndex + 1, pIndex + 1) ||
72 dfs(sIndex, pIndex + 1);
73 } else {
74 // Handle regular character or '?'
75 // Character must match (or pattern has '?'), then check remaining strings
76 dp[sIndex][pIndex] = (pChars[pIndex] == '?' || sChars[sIndex] == pChars[pIndex]) &&
77 dfs(sIndex + 1, pIndex + 1);
78 }
79
80 // Return and store the result
81 return dp[sIndex][pIndex];
82 }
83}
84
1class Solution {
2public:
3 bool isMatch(string s, string p) {
4 int sLen = s.size(), pLen = p.size();
5
6 // Create memoization table: dp[i][j] represents if s[i..] matches p[j..]
7 // -1: not computed, 0: false, 1: true
8 int dp[sLen + 1][pLen + 1];
9 memset(dp, -1, sizeof(dp));
10
11 // Define recursive function with memoization
12 function<bool(int, int)> dfs = [&](int sIdx, int pIdx) -> bool {
13 // Base case 1: Reached end of string s
14 if (sIdx >= sLen) {
15 // Either pattern is also finished, or current pattern char is '*'
16 // which can match empty sequence
17 return pIdx >= pLen || (p[pIdx] == '*' && dfs(sIdx, pIdx + 1));
18 }
19
20 // Base case 2: Pattern exhausted but string still has characters
21 if (pIdx >= pLen) {
22 return false;
23 }
24
25 // Check memoization table
26 if (dp[sIdx][pIdx] != -1) {
27 return dp[sIdx][pIdx] == 1;
28 }
29
30 // Process current pattern character
31 if (p[pIdx] == '*') {
32 // '*' can match:
33 // 1. One or more characters: dfs(sIdx + 1, pIdx)
34 // 2. Zero characters: dfs(sIdx, pIdx + 1)
35 dp[sIdx][pIdx] = (dfs(sIdx + 1, pIdx) || dfs(sIdx, pIdx + 1)) ? 1 : 0;
36 } else {
37 // '?' matches any single character, or characters must match exactly
38 // Then move both pointers forward
39 dp[sIdx][pIdx] = ((p[pIdx] == '?' || s[sIdx] == p[pIdx]) &&
40 dfs(sIdx + 1, pIdx + 1)) ? 1 : 0;
41 }
42
43 return dp[sIdx][pIdx] == 1;
44 };
45
46 // Start matching from beginning of both strings
47 return dfs(0, 0);
48 }
49};
50
1/**
2 * Checks if a string matches a pattern with wildcard support
3 * @param s - The input string to match
4 * @param p - The pattern string where '?' matches any single character and '*' matches any sequence
5 * @returns true if the string matches the pattern, false otherwise
6 */
7function isMatch(s: string, p: string): boolean {
8 const stringLength = s.length;
9 const patternLength = p.length;
10
11 // Initialize memoization table with -1 (unvisited state)
12 // dp[i][j] represents whether s[i:] matches p[j:]
13 // -1: unvisited, 0: false, 1: true
14 const memoTable: number[][] = Array.from(
15 { length: stringLength + 1 },
16 () => Array.from({ length: patternLength + 1 }, () => -1)
17 );
18
19 /**
20 * Recursive helper function with memoization to check pattern matching
21 * @param stringIndex - Current index in the string s
22 * @param patternIndex - Current index in the pattern p
23 * @returns true if s[stringIndex:] matches p[patternIndex:]
24 */
25 const matchHelper = (stringIndex: number, patternIndex: number): boolean => {
26 // Base case: reached end of string
27 if (stringIndex >= stringLength) {
28 // Check if we've also reached end of pattern or remaining pattern is '*'
29 return patternIndex >= patternLength ||
30 (p[patternIndex] === '*' && matchHelper(stringIndex, patternIndex + 1));
31 }
32
33 // Base case: reached end of pattern but not end of string
34 if (patternIndex >= patternLength) {
35 return false;
36 }
37
38 // Check memoization table for previously computed result
39 if (memoTable[stringIndex][patternIndex] !== -1) {
40 return memoTable[stringIndex][patternIndex] === 1;
41 }
42
43 // Handle wildcard '*' - can match zero or more characters
44 if (p[patternIndex] === '*') {
45 // Try matching '*' with current character (stay at same pattern position)
46 // OR skip '*' (move to next pattern position)
47 const matchResult = matchHelper(stringIndex + 1, patternIndex) ||
48 matchHelper(stringIndex, patternIndex + 1);
49 memoTable[stringIndex][patternIndex] = matchResult ? 1 : 0;
50 } else {
51 // Handle '?' wildcard or exact character match
52 // '?' matches any single character, or characters must match exactly
53 const matchResult = (p[patternIndex] === '?' || s[stringIndex] === p[patternIndex]) &&
54 matchHelper(stringIndex + 1, patternIndex + 1);
55 memoTable[stringIndex][patternIndex] = matchResult ? 1 : 0;
56 }
57
58 return memoTable[stringIndex][patternIndex] === 1;
59 };
60
61 // Start matching from the beginning of both strings
62 return matchHelper(0, 0);
63}
64
Time and Space Complexity
The time complexity is O(m × n)
, where m
is the length of string s
and n
is the length of pattern p
. This is because the memoized recursive function dfs(i, j)
can have at most (m + 1) × (n + 1)
unique states, where i
ranges from 0
to m
and j
ranges from 0
to n
. Due to the @cache
decorator, each state is computed only once, and each state computation involves constant time operations (excluding recursive calls).
The space complexity is O(m × n)
. This consists of two components:
- The cache storage for memoization requires
O(m × n)
space to store the results of all possible(i, j)
pairs. - The recursion call stack can go up to
O(m + n)
depth in the worst case, but this is dominated by the cache storage requirement.
Therefore, the overall space complexity is O(m × n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Inefficient Handling of Consecutive Asterisks
One common pitfall is not optimizing for patterns with multiple consecutive '*'
characters (e.g., "a***b"
). While the current solution works correctly, consecutive asterisks are functionally equivalent to a single asterisk, and processing them separately creates unnecessary recursive branches.
Problem Example:
# Pattern: "a***b" matching string: "axxxxb" # This creates many redundant recursive paths through the three asterisks
Solution: Preprocess the pattern to collapse consecutive asterisks into a single one:
def isMatch(self, s: str, p: str) -> bool:
# Preprocess pattern to remove consecutive asterisks
cleaned_pattern = []
for char in p:
if char != '*' or (not cleaned_pattern or cleaned_pattern[-1] != '*'):
cleaned_pattern.append(char)
p = ''.join(cleaned_pattern)
# Continue with the original solution...
from functools import cache
@cache
def dfs(string_idx: int, pattern_idx: int) -> bool:
# ... rest of the implementation
2. Stack Overflow with Deep Recursion
For very long strings or patterns, the recursive solution might hit Python's default recursion limit (typically 1000 levels deep), causing a RecursionError
.
Problem Example:
# Very long string with many asterisks s = "a" * 5000 p = "*" * 100 + "a" * 5000 # This could exceed recursion depth
Solution: Either increase the recursion limit or convert to an iterative dynamic programming approach:
import sys
sys.setrecursionlimit(10000) # Increase limit if needed
# Or use iterative DP approach:
def isMatch(self, s: str, p: str) -> bool:
m, n = len(s), len(p)
dp = [[False] * (n + 1) for _ in range(m + 1)]
dp[0][0] = True
# Handle patterns starting with '*'
for j in range(1, n + 1):
if p[j-1] == '*':
dp[0][j] = dp[0][j-1]
for i in range(1, m + 1):
for j in range(1, n + 1):
if p[j-1] == '*':
dp[i][j] = dp[i-1][j] or dp[i][j-1]
elif p[j-1] == '?' or s[i-1] == p[j-1]:
dp[i][j] = dp[i-1][j-1]
return dp[m][n]
3. Memory Issues with Large Input Combinations
The @cache
decorator stores all unique (string_idx, pattern_idx)
combinations. For very large strings and patterns, this could consume significant memory.
Problem Example:
# Large inputs s = "x" * 10000 p = "*x*x*x*" * 1000 # Cache could store up to O(m*n) entries
Solution:
Use lru_cache
with a maximum size limit instead of unlimited cache:
from functools import lru_cache
@lru_cache(maxsize=10000) # Limit cache size
def dfs(string_idx: int, pattern_idx: int) -> bool:
# ... implementation
Or clear the cache periodically if processing multiple test cases:
def isMatch(self, s: str, p: str) -> bool:
from functools import cache
@cache
def dfs(string_idx: int, pattern_idx: int) -> bool:
# ... implementation
result = dfs(0, 0)
dfs.cache_clear() # Clear cache after each match to free memory
return result
What's the output of running the following function using input [30, 20, 10, 100, 33, 12]
?
1def fun(arr: List[int]) -> List[int]:
2 import heapq
3 heapq.heapify(arr)
4 res = []
5 for i in range(3):
6 res.append(heapq.heappop(arr))
7 return res
8
1public static int[] fun(int[] arr) {
2 int[] res = new int[3];
3 PriorityQueue<Integer> heap = new PriorityQueue<>();
4 for (int i = 0; i < arr.length; i++) {
5 heap.add(arr[i]);
6 }
7 for (int i = 0; i < 3; i++) {
8 res[i] = heap.poll();
9 }
10 return res;
11}
12
1class HeapItem {
2 constructor(item, priority = item) {
3 this.item = item;
4 this.priority = priority;
5 }
6}
7
8class MinHeap {
9 constructor() {
10 this.heap = [];
11 }
12
13 push(node) {
14 // insert the new node at the end of the heap array
15 this.heap.push(node);
16 // find the correct position for the new node
17 this.bubble_up();
18 }
19
20 bubble_up() {
21 let index = this.heap.length - 1;
22
23 while (index > 0) {
24 const element = this.heap[index];
25 const parentIndex = Math.floor((index - 1) / 2);
26 const parent = this.heap[parentIndex];
27
28 if (parent.priority <= element.priority) break;
29 // if the parent is bigger than the child then swap the parent and child
30 this.heap[index] = parent;
31 this.heap[parentIndex] = element;
32 index = parentIndex;
33 }
34 }
35
36 pop() {
37 const min = this.heap[0];
38 this.heap[0] = this.heap[this.size() - 1];
39 this.heap.pop();
40 this.bubble_down();
41 return min;
42 }
43
44 bubble_down() {
45 let index = 0;
46 let min = index;
47 const n = this.heap.length;
48
49 while (index < n) {
50 const left = 2 * index + 1;
51 const right = left + 1;
52
53 if (left < n && this.heap[left].priority < this.heap[min].priority) {
54 min = left;
55 }
56 if (right < n && this.heap[right].priority < this.heap[min].priority) {
57 min = right;
58 }
59 if (min === index) break;
60 [this.heap[min], this.heap[index]] = [this.heap[index], this.heap[min]];
61 index = min;
62 }
63 }
64
65 peek() {
66 return this.heap[0];
67 }
68
69 size() {
70 return this.heap.length;
71 }
72}
73
74function fun(arr) {
75 const heap = new MinHeap();
76 for (const x of arr) {
77 heap.push(new HeapItem(x));
78 }
79 const res = [];
80 for (let i = 0; i < 3; i++) {
81 res.push(heap.pop().item);
82 }
83 return res;
84}
85
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Want a Structured Path to Master System Design Too? Don’t Miss This!