Facebook Pixel

10. Regular Expression Matching

Problem Description

You need to implement a regular expression matcher that supports two special characters:

  • '.' - Matches any single character
  • '*' - Matches zero or more of the preceding element

Given a string s and a pattern p, determine if the pattern matches the entire string (not just a substring).

The key rules are:

  1. A dot '.' can match any single character in the string
  2. A star '*' applies to the character immediately before it, allowing that character to appear 0 or more times
    • For example, "a*" can match "" (empty), "a", "aa", "aaa", etc.
    • ".*" can match any sequence of characters including empty string

The solution uses dynamic programming with memoization. The function dfs(i, j) checks if substring s[i:] matches pattern p[j:].

The algorithm handles three cases:

  1. When pattern p is exhausted (j >= n), the match succeeds only if string s is also exhausted
  2. When the next pattern character is '*' (at position j+1), we have two choices:
    • Skip the current pattern character and '*' (match 0 occurrences): dfs(i, j+2)
    • If current characters match, use the pattern again (match 1+ occurrences): dfs(i+1, j)
  3. When there's no '*' following, characters must match directly and we advance both pointers: dfs(i+1, j+1)

The @cache decorator prevents recalculating the same subproblems, improving efficiency from exponential to polynomial time complexity.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

The challenge with regular expression matching is handling the '*' character, which creates variable-length matching possibilities. When we see "a*" in the pattern, it could match 0, 1, 2, or any number of 'a' characters in the string. This creates a branching decision tree.

The key insight is to think recursively: at each position, we make a decision based on what we see in the pattern. If we encounter a character followed by '*', we face a choice - either use this pattern to match the current character in the string (and potentially use it again), or skip this pattern entirely (matching 0 occurrences).

Why does this naturally lead to dynamic programming? Consider matching "aaa" with pattern "a*a*". As we explore different ways to match, we'll repeatedly ask questions like "can the substring starting at position 2 match the pattern starting at position 3?" These subproblems overlap extensively.

The recursive structure emerges from breaking down the problem:

  • Base case: If we've consumed the entire pattern, we succeed only if we've also consumed the entire string
  • For '*' patterns: We can either skip it (match 0 times) or, if the current character matches, consume one character and keep the pattern available for reuse
  • For regular characters or '.': We must match exactly once and move forward in both string and pattern

The beauty of this approach is that it handles all the complex cases naturally. When we have ".*", the recursion automatically explores matching different lengths. When we have consecutive '*' patterns like "a*a*", the branching explores all valid distributions of characters between them.

Memoization becomes essential because without it, we'd recalculate the same (i, j) pairs multiple times through different recursive paths, leading to exponential time complexity.

Learn more about Recursion and Dynamic Programming patterns.

Solution Approach

The solution uses memoization search (top-down dynamic programming) to efficiently explore all possible matching paths.

Implementation Details:

The core function dfs(i, j) determines if s[i:] matches p[j:], where i and j are pointers to the current positions in string s and pattern p respectively.

Base Case:

if j >= n:
    return i == m

When we've exhausted the pattern (j >= n), the match succeeds only if we've also exhausted the string (i == m).

Case 1: Next character is '*'

if j + 1 < n and p[j + 1] == '*':
    return dfs(i, j + 2) or (
        i < m and (s[i] == p[j] or p[j] == '.') and dfs(i + 1, j)
    )

When we detect a '*' at position j + 1, we have two choices:

  • Match 0 occurrences: Skip both the character and '*' by calling dfs(i, j + 2)
  • Match 1+ occurrences: If current characters match (s[i] == p[j] or p[j] == '.'), consume one character from s while keeping the pattern available by calling dfs(i + 1, j)

The function returns True if either path succeeds (using logical OR).

Case 2: Regular character or '.'

return i < m and (s[i] == p[j] or p[j] == '.') and dfs(i + 1, j + 1)

Without a '*' following, we must match exactly once. We check:

  • String hasn't ended (i < m)
  • Characters match (s[i] == p[j] or p[j] == '.')
  • The rest of the string matches the rest of the pattern (dfs(i + 1, j + 1))

Memoization: The @cache decorator automatically memoizes the function results. Each unique (i, j) pair is computed only once, reducing time complexity from exponential O(2^(m+n)) to polynomial O(m * n).

Space Complexity: The solution uses O(m * n) space for the memoization cache, where m and n are the lengths of string s and pattern p respectively.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's trace through matching string s = "aab" with pattern p = "c*a*b".

We start with dfs(0, 0) - matching from the beginning of both strings.

Step 1: dfs(0, 0) - Looking at s[0]='a' and p[0]='c'

  • Next pattern character p[1]='*', so we have a star case
  • Two options:
    • Skip "c*": Call dfs(0, 2)
    • Match 'c': Would need s[0]='a' to equal p[0]='c' - this fails
  • So we only explore dfs(0, 2)

Step 2: dfs(0, 2) - Looking at s[0]='a' and p[2]='a'

  • Next pattern character p[3]='*', so we have another star case
  • Two options:
    • Skip "a*": Call dfs(0, 4)
    • Match 'a': Since s[0]='a' equals p[2]='a', call dfs(1, 2)
  • We explore both paths

Path A - Skip "a":* dfs(0, 4) - Looking at s[0]='a' and p[4]='b'

  • No star follows 'b'
  • Need s[0]='a' to equal p[4]='b' - this fails
  • Return False

Path B - Use "a":* dfs(1, 2) - Looking at s[1]='a' and p[2]='a'

  • Next pattern character p[3]='*', star case again
  • Two options:
    • Skip "a*": Call dfs(1, 4)
    • Match another 'a': Since s[1]='a' equals p[2]='a', call dfs(2, 2)

Path B1: dfs(1, 4) - Looking at s[1]='a' and p[4]='b'

  • No star follows
  • Need s[1]='a' to equal p[4]='b' - fails
  • Return False

Path B2: dfs(2, 2) - Looking at s[2]='b' and p[2]='a'

  • Next pattern character p[3]='*', star case
  • Two options:
    • Skip "a*": Call dfs(2, 4)
    • Match: Need s[2]='b' to equal p[2]='a' - fails

Final Path: dfs(2, 4) - Looking at s[2]='b' and p[4]='b'

  • No star follows
  • Check: s[2]='b' equals p[4]='b'
  • Call dfs(3, 5)

Base Case: dfs(3, 5)

  • j=5 >= n=5 (pattern exhausted)
  • Check if string exhausted: i=3 == m=3
  • Return True

The successful path: Skip "c*" → Match two 'a's with "a*" → Match 'b' with 'b'.

Each (i, j) pair is computed once and cached, preventing redundant calculations when multiple paths reach the same state.

Solution Implementation

1class Solution:
2    def isMatch(self, s: str, p: str) -> bool:
3        """
4        Regular Expression Matching with '.' and '*'
5        '.' matches any single character
6        '*' matches zero or more of the preceding element
7      
8        Args:
9            s: The input string to match
10            p: The pattern string with regular expression
11          
12        Returns:
13            True if s matches p, False otherwise
14        """
15        from functools import cache
16      
17        # Get lengths of input string and pattern
18        s_length = len(s)
19        p_length = len(p)
20      
21        @cache
22        def match_helper(s_index: int, p_index: int) -> bool:
23            """
24            Recursively check if substring s[s_index:] matches pattern p[p_index:]
25          
26            Args:
27                s_index: Current index in string s
28                p_index: Current index in pattern p
29              
30            Returns:
31                True if the remaining portions match, False otherwise
32            """
33            # Base case: reached end of pattern
34            if p_index >= p_length:
35                # Pattern exhausted, check if string is also exhausted
36                return s_index == s_length
37          
38            # Check if next character in pattern is '*' (Kleene star)
39            if p_index + 1 < p_length and p[p_index + 1] == '*':
40                # Two options with '*':
41                # 1. Skip the pattern char and '*' (match 0 occurrences)
42                skip_pattern = match_helper(s_index, p_index + 2)
43              
44                # 2. Match current char and stay at same pattern position (match 1+ occurrences)
45                # Only if current position is valid and characters match
46                match_current = (s_index < s_length and 
47                                (s[s_index] == p[p_index] or p[p_index] == '.') and 
48                                match_helper(s_index + 1, p_index))
49              
50                return skip_pattern or match_current
51          
52            # Regular character match (no '*' following)
53            # Check if current characters match and continue with next positions
54            return (s_index < s_length and 
55                   (s[s_index] == p[p_index] or p[p_index] == '.') and 
56                   match_helper(s_index + 1, p_index + 1))
57      
58        # Start matching from the beginning of both strings
59        return match_helper(0, 0)
60
1class Solution {
2    // Memoization cache to store results of subproblems
3    private Boolean[][] memo;
4    // Input strings
5    private String text;
6    private String pattern;
7    // Lengths of input strings
8    private int textLength;
9    private int patternLength;
10
11    /**
12     * Determines if the entire string s matches the regular expression pattern p.
13     * '.' matches any single character.
14     * '*' matches zero or more of the preceding element.
15     * 
16     * @param s the input string to match
17     * @param p the pattern with regular expression
18     * @return true if s matches p, false otherwise
19     */
20    public boolean isMatch(String s, String p) {
21        // Initialize lengths
22        textLength = s.length();
23        patternLength = p.length();
24      
25        // Initialize memoization cache with dimensions (textLength + 1) x (patternLength + 1)
26        // to handle cases where indices reach the end of strings
27        memo = new Boolean[textLength + 1][patternLength + 1];
28      
29        // Store input strings for use in recursive function
30        this.text = s;
31        this.pattern = p;
32      
33        // Start matching from the beginning of both strings
34        return matchHelper(0, 0);
35    }
36
37    /**
38     * Recursive helper function to check if substring starting at text[textIndex]
39     * matches pattern starting at pattern[patternIndex].
40     * 
41     * @param textIndex current index in the text string
42     * @param patternIndex current index in the pattern string
43     * @return true if remaining substrings match, false otherwise
44     */
45    private boolean matchHelper(int textIndex, int patternIndex) {
46        // Base case: if we've consumed the entire pattern
47        if (patternIndex >= patternLength) {
48            // Match is successful only if we've also consumed the entire text
49            return textIndex == textLength;
50        }
51      
52        // Check if this subproblem has already been solved
53        if (memo[textIndex][patternIndex] != null) {
54            return memo[textIndex][patternIndex];
55        }
56      
57        boolean isMatch = false;
58      
59        // Check if the next character in pattern is '*' (Kleene star)
60        if (patternIndex + 1 < patternLength && pattern.charAt(patternIndex + 1) == '*') {
61            // Two options with '*':
62            // Option 1: Match zero occurrences - skip current char and '*' in pattern
63            isMatch = matchHelper(textIndex, patternIndex + 2);
64          
65            // Option 2: Match one or more occurrences
66            // First check if current characters match, then try to match more
67            if (!isMatch && textIndex < textLength) {
68                char textChar = text.charAt(textIndex);
69                char patternChar = pattern.charAt(patternIndex);
70              
71                if (textChar == patternChar || patternChar == '.') {
72                    // Current chars match, consume text char and keep pattern position
73                    isMatch = matchHelper(textIndex + 1, patternIndex);
74                }
75            }
76        } else {
77            // No '*' following current pattern character
78            // Must match current characters exactly
79            if (textIndex < textLength) {
80                char textChar = text.charAt(textIndex);
81                char patternChar = pattern.charAt(patternIndex);
82              
83                if (textChar == patternChar || patternChar == '.') {
84                    // Characters match, move both pointers forward
85                    isMatch = matchHelper(textIndex + 1, patternIndex + 1);
86                }
87            }
88        }
89      
90        // Cache the result before returning
91        memo[textIndex][patternIndex] = isMatch;
92        return isMatch;
93    }
94}
95
1class Solution {
2public:
3    bool isMatch(string s, string p) {
4        int stringLength = s.size();
5        int patternLength = p.size();
6      
7        // Memoization table: -1 = not computed, 0 = false, 1 = true
8        // memo[i][j] represents if s[i:] matches p[j:]
9        vector<vector<int>> memo(stringLength + 1, vector<int>(patternLength + 1, -1));
10      
11        // Recursive function to check if s[stringIdx:] matches p[patternIdx:]
12        function<bool(int, int)> matchHelper = [&](int stringIdx, int patternIdx) -> bool {
13            // Base case: if pattern is exhausted, string must also be exhausted
14            if (patternIdx >= patternLength) {
15                return stringIdx == stringLength;
16            }
17          
18            // Check if result is already memoized
19            if (memo[stringIdx][patternIdx] != -1) {
20                return memo[stringIdx][patternIdx] == 1;
21            }
22          
23            // Initialize result
24            bool isMatchFound = false;
25          
26            // Check if next character in pattern is '*' (zero or more of preceding element)
27            if (patternIdx + 1 < patternLength && p[patternIdx + 1] == '*') {
28                // Two options with '*':
29                // 1. Skip the current pattern char and '*' (match zero occurrences)
30                // 2. If current chars match, use '*' to match one or more occurrences
31                isMatchFound = matchHelper(stringIdx, patternIdx + 2) ||  // Match zero occurrences
32                              (stringIdx < stringLength && 
33                               (s[stringIdx] == p[patternIdx] || p[patternIdx] == '.') && 
34                               matchHelper(stringIdx + 1, patternIdx));  // Match one or more occurrences
35            }
36            // No '*' follows current pattern character
37            else {
38                // Check if current characters match and continue matching rest
39                isMatchFound = stringIdx < stringLength && 
40                              (s[stringIdx] == p[patternIdx] || p[patternIdx] == '.') && 
41                              matchHelper(stringIdx + 1, patternIdx + 1);
42            }
43          
44            // Store result in memoization table
45            memo[stringIdx][patternIdx] = isMatchFound ? 1 : 0;
46          
47            return isMatchFound;
48        };
49      
50        // Start matching from the beginning of both strings
51        return matchHelper(0, 0);
52    }
53};
54
1/**
2 * Checks if string s matches pattern p with regular expression support
3 * @param s - The input string to match
4 * @param p - The pattern with '.' (any single character) and '*' (zero or more of preceding element)
5 * @returns True if the entire string matches the pattern, false otherwise
6 */
7function isMatch(s: string, p: string): boolean {
8    const stringLength: number = s.length;
9    const patternLength: number = p.length;
10  
11    // Memoization table: -1 = not computed, 0 = false, 1 = true
12    const memo: number[][] = Array.from(
13        { length: stringLength + 1 }, 
14        () => Array(patternLength + 1).fill(0)
15    );
16  
17    /**
18     * Recursive helper function with memoization to check pattern matching
19     * @param stringIndex - Current index in string s
20     * @param patternIndex - Current index in pattern p
21     * @returns Whether substring s[stringIndex:] matches pattern p[patternIndex:]
22     */
23    const matchHelper = (stringIndex: number, patternIndex: number): boolean => {
24        // Base case: reached end of pattern
25        if (patternIndex >= patternLength) {
26            return stringIndex === stringLength;
27        }
28      
29        // Check memoization table for already computed result
30        if (memo[stringIndex][patternIndex]) {
31            return memo[stringIndex][patternIndex] === 1;
32        }
33      
34        let matchResult: number = -1;
35      
36        // Handle star pattern (zero or more occurrences)
37        if (patternIndex + 1 < patternLength && p[patternIndex + 1] === '*') {
38            // Try matching zero occurrences (skip pattern char and star)
39            const skipPattern: boolean = matchHelper(stringIndex, patternIndex + 2);
40          
41            // Try matching one or more occurrences (if current char matches)
42            const matchCurrent: boolean = stringIndex < stringLength && 
43                (s[stringIndex] === p[patternIndex] || p[patternIndex] === '.') && 
44                matchHelper(stringIndex + 1, patternIndex);
45          
46            if (skipPattern || matchCurrent) {
47                matchResult = 1;
48            }
49        } 
50        // Handle regular character or dot pattern (single character match)
51        else if (stringIndex < stringLength && 
52                 (s[stringIndex] === p[patternIndex] || p[patternIndex] === '.') && 
53                 matchHelper(stringIndex + 1, patternIndex + 1)) {
54            matchResult = 1;
55        }
56      
57        // Store result in memoization table
58        memo[stringIndex][patternIndex] = matchResult;
59        return matchResult === 1;
60    };
61  
62    return matchHelper(0, 0);
63}
64

Time and Space Complexity

The time complexity is O(m × n), where m is the length of string s and n is the length of pattern p. This is because the recursive function dfs(i, j) explores different combinations of indices i (ranging from 0 to m) and j (ranging from 0 to n). Due to the @cache decorator, each unique state (i, j) is computed at most once, resulting in at most (m + 1) × (n + 1) states being evaluated.

The space complexity is O(m × n). This consists of two main components:

  1. The memoization cache from the @cache decorator stores up to (m + 1) × (n + 1) states
  2. The recursion call stack depth in the worst case can be O(m + n)

Since the cache dominates the space usage with O(m × n), the overall space complexity is O(m × n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Handling of Consecutive Stars

A common mistake is not properly handling patterns with consecutive star operators like "a*b*c*". The algorithm might incorrectly process the star's scope.

Issue Example: When matching string "aaa" with pattern "a*a", developers might incorrectly think the a* should consume all 'a's, leaving nothing for the final 'a' in the pattern.

Solution: The recursive approach naturally handles this by exploring both possibilities at each step - either using the star (consuming a character) or skipping it. The memoization ensures we don't recalculate the same state.

2. Off-by-One Errors When Checking for Star

The most frequent bug occurs when checking if the next character is a star:

Incorrect:

if p[j + 1] == '*':  # Might cause IndexError

Correct:

if j + 1 < p_length and p[p_index + 1] == '*':  # Check bounds first

Always verify that j + 1 is within bounds before accessing p[j + 1].

3. Forgetting to Handle Empty Pattern with Star

Patterns like "a*" can match an empty string, but developers often forget this case.

Issue: Not allowing the pattern to skip both the character and star together when the star means "zero occurrences".

Solution: Always include the option match_helper(s_index, p_index + 2) when encountering a star, which skips both the preceding character and the star itself.

4. Incorrect Base Case Logic

A subtle but critical error is the base case implementation:

Incorrect:

if p_index >= p_length:
    return s_index >= s_length  # Wrong! This would accept partial matches

Correct:

if p_index >= p_length:
    return s_index == s_length  # Must be exactly at the end

The string must be completely consumed, not just partially matched.

5. Not Using Short-Circuit Evaluation Properly

When checking conditions, the order matters for both correctness and efficiency:

Problematic:

return (s[s_index] == p[p_index] or p[p_index] == '.') and s_index < s_length
# Accesses s[s_index] before checking if s_index is valid

Correct:

return s_index < s_length and (s[s_index] == p[p_index] or p[p_index] == '.')
# Checks bounds first, preventing IndexError

6. Misunderstanding Star's Greedy Nature

Developers sometimes implement the star operator as greedy-only (consuming as many characters as possible) or lazy-only (consuming as few as possible).

Solution: The algorithm must explore both paths - this is why we use:

return skip_pattern or match_current

This tries both matching zero occurrences AND matching one-or-more occurrences, letting the recursion find the correct path.

Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Which of the following uses divide and conquer strategy?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More