Leetcode 10. Regular Expression Matching

Problem Explanation

The task is to implement regular expression pattern matching. The regular expression (pattern) can include two special characters along with the lowercase alphabets 'a' through 'z'. The two characters are '.' and '*'. "." represents any single character, while "*" denotes zero or multiple occurrences of the previous character.

The input string to be matched against the pattern can only include lowercase alphabets.

The function should return true if the entire string matches the pattern or false otherwise.

Examples:

If the input string, s = "ab", and the pattern, p = ".*", then the output will be true. This is simply because "." means "zero or more () of any character (.)".

If input string s = "aab" and pattern p = "c*a*b", then the output will be true, c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches "aab".


Solution Approach

We use a dynamic programming approach to solve this problem. The basic idea is to use a 2D boolean array, dp, where dp[i][j] is true if the substring s[0..i) matches with the pattern p[0..j).

We initialize dp[0][0] as true and iterate over the pattern to mark dp[0][j+1] as true only when a star ('*') character is found, and when dp[0][j-1] is true.

After the initialization, we iterate over the string and the pattern.

  • If the pattern at current index j is a star ('*'), then there can be two possibilities - a. the character before the star matches zero times, in that case, dp[i+1][j+1] is true if dp[i+1][j-1] is true. b. The character before the star matches at least one time, in that case, dp[i+1][j+1] is true if dp[i][j+1] is true and character s[i] matches with the character before the star in the pattern.
  • If the pattern at current index j is not a star ('*'), then dp[i+1][j+1] is true if dp[i][j] is true and character s[i] matches with p[j].

Finally, dp[m][n] will give us the answer, where m and n is the length of the string and the pattern respectively.


Solution in Python

1
2python
3class Solution:
4    def isMatch(self, s: str, p: str) -> bool:
5        
6        m, n = len(s), len(p)
7        
8        # Initialize a dp table with m+1 rows and n+1 columns. 
9        # dp[i][j] is True if s[0..i) matches p[0..j)
10        dp = [[False]*(n+1) for _ in range(m+1)]
11        
12        dp[0][0] = True
13
14        for j in range(n):
15            if p[j] == '*':
16                dp[0][j+1] = dp[0][j-1]
17
18        for i in range(m):
19            for j in range(n):
20                if p[j] == '*':
21                    dp[i+1][j+1] = dp[i+1][j-1] or (dp[i][j+1] and (s[i] == p[j-1] or p[j-1] == '.'))
22                elif p[j] == '.' or s[i] == p[j]:
23                    dp[i+1][j+1] = dp[i][j]
24        
25        return dp[-1][-1]

Solution in Java

1
2java
3class Solution {
4    public boolean isMatch(String s, String p) {
5        boolean[][] dp = new boolean[s.length()+1][p.length()+1];
6        dp[0][0] = true;
7        
8        for(int i=1; i<=p.length(); i++)
9        {
10            if(p.charAt(i-1) == '*')
11            {
12                dp[0][i] = dp[0][i-2];
13            }
14        }
15        
16        for(int i=0;i<s.length();i++)
17        {
18            for(int j=0;j<p.length();j++)
19            {
20                if(p.charAt(j) == '.' || p.charAt(j)==s.charAt(i))
21                {
22                    dp[i+1][j+1] = dp[i][j];
23                }
24                else if(p.charAt(j) == '*')
25                {
26                    dp[i+1][j+1] = dp[i+1][j-1];
27                    if(p.charAt(j-1) == '.' || p.charAt(j-1) == s.charAt(i))
28                    {
29                        dp[i+1][j+1] = dp[i+1][j+1] | dp[i][j+1];
30                    }
31                }
32            }
33        }
34        
35        return dp[s.length()][p.length()];
36    }
37}

Solution in JavaScript

1
2javascript
3var isMatch = function(s, p) {
4    let dp = new Array(s.length + 1).fill(0).map(() => new Array(p.length + 1).fill(false));
5    dp[0][0] = true;
6    for(let i = 0; i <= s.length; i++){
7        for(let j = 1; j <= p.length; j++){
8            if(p[j - 1] === '*'){
9                dp[i][j] = dp[i][j - 2] || (i > 0 && (s[i - 1] === p[j - 2] || p[j - 2] === '.') && dp[i - 1][j]);
10            } 
11            else {
12                dp[i][j] = i > 0 && dp[i - 1][j - 1] && (p[j - 1] === '.' || p[j - 1] === s[i - 1]);
13            }
14        }
15    }
16    return dp[s.length][p.length];
17};

Solution in C++

1
2cpp
3class Solution {
4public:
5    bool isMatch(string s, string p) {
6        int m = s.size(), n = p.size();
7        vector<vector<bool>> dp(m + 1, vector<bool> (n + 1, false));
8        dp[0][0] = true;
9        for (int i = 0; i <= m; i++)
10            for (int j = 1; j <= n; j++)
11                if (p[j - 1] == '*')
12                    dp[i][j] = dp[i][j - 2] || (i > 0 && (s[i - 1] == p[j - 2] || p[j - 2] == '.') && dp[i - 1][j]);
13                else
14                    dp[i][j] = i > 0 && dp[i - 1][j - 1] && (s[i - 1] == p[j - 1] || p[j - 1] == '.');
15        return dp[m][n];
16    }
17};

Solution in C#

1
2csharp
3public class Solution {
4    public bool IsMatch(string s, string p) {
5        bool[,] dp = new bool[s.Length + 1, p.Length + 1];
6        dp[s.Length, p.Length] = true;
7
8        for (int i = s.Length; i >= 0; i--){
9            for (int j = p.Length - 1; j >= 0; j--){
10                bool first_match = (i < s.Length && 
11                                       (p[j] == s[i] ||
12                                        p[j] == '.'));
13                if (j + 1 < p.Length && p[j + 1] == '*'){
14                    dp[i,j] = dp[i,j+2] || first_match && dp[i+1,j];
15                } else {
16                    dp[i,j] = first_match && dp[i+1,j+1];
17                }
18            }
19        }
20        return dp[0,0];
21    }
22}

Solution in Go

1
2go
3func isMatch(s string, p string) bool {
4    dp := make([][]bool, len(s) + 1)
5    for i, _ := range dp {
6        dp[i] = make([]bool, len(p) + 1)
7    }
8    dp[0][0] = true
9    for j := 1; j <= len(p); j++ {
10        if p[j - 1] == '*' {
11            dp[0][j] = dp[0][j - 2]
12        }
13    }
14    for i := 0; i < len(s); i++ {
15        for j := 0; j < len(p); j++ {
16            if p[j] == s[i] || p[j] == '.' {
17                dp[i + 1][j + 1] = dp[i][j]
18            } else if p[j] == '*' {
19                dp[i + 1][j + 1] = dp[i + 1][j - 1] || ((s[i] == p[j - 1] || p[j - 1] == '.') && dp[i][j + 1])
20            }
21        }
22    }
23    return dp[len(s)][len(p)]
24}

The idea behind regrex matching is the use of recursion or dynamic programming. A star '*' in the pattern can either represent zero occurrences of the character before it or any number of occurrences. Therefore, when the pattern contains a star then we have two possibilities - if the character before the star matches zero times, then we check if the remaining pattern can match the complete string, or if the character before the star matches at least once, then we check if the remaining pattern can match the remaining string.

A dot '.' in the pattern matches exactly one character. Hence, if we encounter a dot in the pattern, then we check if the remaining pattern matches the remaining string.

In Go implementation, a 2D dp array is created as in Python, Java and JavaScript solutions. After initializing the dp array, for each character in the pattern and string, the dp array is updated according to the rules explained. In the end, the last element of the dp array is returned which indicates if the complete string matches the pattern.


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫