Implement regular expression matching with support for ‘.’ and ‘*’

这题很多人都被问过,很多人都背了答案,但是很重要的事要知道算法复杂度。很多人洋洋洒洒写出来后被拒还不知道为什么。

Regular Expression Matching

Implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true

Analysis:

Note that this problem is NOT the same as Wildcard Matching.
The major difference is the meaning of  “*” in the string.
Here, “*” means 0 or more preceding element. For example, the last test case in the problem description:
isMatch(“aab”,””c*a*b”)
The pattern :  “c*a*b” means,  there can be 0 or more ‘c’, 0 or more ‘a’ and one b.
e.g., “ccaab”, “aab”, “ccb”, “b” are all valid strings for this pattern, thus the correct answer for this test case is also true (0 ‘c’, 2 ‘a’ and 1 ‘b’).

How to solve this question?  DFS!
Since we don’t know how many “* element ” is needed to get a match, one way is to search every possibilities. However in this problem, it is not complex because there are many restrictions, which seems even easier than the wildcard matching problem.

First of all, let’s define “{x}” means 0 or more ‘x’ can be here in the string.
Using the above example, the problem can be rewritten as:
                             isMatch(“aab” “{c}{a}b”)

Then we can do the matching from the start of the two strings. The condition of a match between two elements s[i] and p[j] are:
(1) s[i]==p[j]
(2) s[i]!=p[j] but p[j]==’.’
Otherwise, 2 strings cannot have a match.

In the first case (s[i]==p[j]),
If p[j] is not a { }, (not followed by a *), we can just go to the next element (i++, j++).
If p[j] is a { } (followed by a *), zero or more p[j]s might be here.

How many time we need here to have multiple p[j]s?
Let’s consider this problem in another way.
If it is zero p[j], then just search the next j (j++,i).
Otherwise, no matter how many we need, for the next element s[i+1], s[i+1] only knows it is now matching with {p[j]}. So, we can just search the current j with next i (j, i++).

For the end of the pattern, if all the remaining elements are { }, then it can all be empty so it is a match. Otherwise it is a false match.

To reduce the number of recursion, we know that, if p[j]==p[j+1] and they both are “{ }”, then they can be considered as only one element with “{ }”. Without this condition, your code might not pass the last test case in the OJ.

Code:

 

class Solution {
public:
 
    bool emp(int pm, vector &vali){ //check if the remaining pattern can be empty or not
        if (pm==vali.size()){return true;}
        else{return vali[pm];}
    }
 
    bool match(string s,int sm, string p, int pm, vector &stars, vector &vali){
        if (sm<s.size() && pm==p.size()){ // if string is not ended but pattern is ended, it is a false return false; } if (sm>=s.size()){  //if the string is ended
            if (emp(pm,vali)){return true;} // if remaining pattern can be empty, it is a match
            else{return false;}
        }else{
            if (s[sm]!=p[pm] && p[pm]!='.'){ //if current elements not equal
                if (stars[pm]){ //if there can be zero p[pm].
                    return match(s,sm,p,pm+1,stars,vali); //search the next element
                }else{
                    return false;
                }
            }
            if (s[sm]==p[pm]||p[pm]=='.'){ //if current elements equal
                if (stars[pm]==1){ //if there can be 0 or more p[pm]
                    if(p[pm]==p[pm+1] && stars[pm+1]==1){ //if the next element is the same and also followed by *
                        return match(s,sm+1,p,pm+1,stars,vali);
                    }else{
                        ////////////////////search zero or more p[pm]///////////////////////
                        return match(s,sm,p,pm+1,stars,vali)||match(s,sm+1,p,pm,stars,vali); 
                        ////////////////////////////////////////////////////////////////////
                    }
                }else{
                    return match(s,sm+1,p,pm+1,stars,vali);
                }
            }
        }
    }
 
    bool isMatch(const char *s, const char *p) {
        // Start typing your C/C++ solution below
        // DO NOT write int main() function
        string str_s(s);
        string str_p(p);
        vector stars;
        int i=0;
        while (i<=str_p.size()){  //rewrite the pattern string, store the elements followed with "*"
            if (str_p[i+1]=='*'){
                str_p.erase(i+1,1);
                stars.push_back(1);
            }else{
                stars.push_back(0);
            }
            i++;
        }
         
        vector vali(str_p.size(),0); //if the remaining pattern are all with *, then vali[i]=1, otherwise 0
        i=str_p.size()-1;
        while(i>=0){
            if (stars[i]==1){vali[i--]=1;}
            else{break;}
        }
         
        return match(str_s,0,str_p,0,stars,vali);
    }
 
};

Related Post

Be the first to comment

Leave a Reply

Your email address will not be published.


*