Problem source
https://leetcode.com/problems/regular-expression-matching/
Task: implement pattern matching that supports "." and "*", and decide whether the pattern matches the entire input string. For example:
isMatch("aa", "a") -> false
isMatch("aab", "c*a*b") -> true
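For cross-checking an implementation on small inputs, the standard library can serve as an oracle. This is a minimal sketch, not the automaton described in this post: it assumes the pattern contains only lowercase letters, "." and "*" (as the LeetCode problem guarantees), in which case std::regex_match with the default ECMAScript grammar should have the same whole-string semantics. The names isMatchReference and checkExamples are made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical reference checker, not the author's solution.
// Assumes p contains only lowercase letters, '.' and '*', and that
// every '*' follows a letter or '.'.
bool isMatchReference(const std::string& s, const std::string& p) {
    // regex_match succeeds only if the whole of s matches the pattern,
    // which is the "entire string" requirement of the task.
    return std::regex_match(s, std::regex(p));
}

// The two examples from the problem statement.
void checkExamples() {
    assert(!isMatchReference("aa", "a"));
    assert(isMatchReference("aab", "c*a*b"));
}
```

Comparing a hand-written matcher against such a checker on short random strings is a cheap way to catch backtracking mistakes.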
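The approach is fairly traditional: build an automaton.
The automaton has the following shape (for the expression c*aa*bb*):
When the automaton is in state s1 and the character read is a, it can move to state s2.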
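As a rough illustration of how such a chain of states can be derived from a pattern, the sketch below turns c*aa*bb* into five states: c*, a, a*, b, b*. The struct ChainState and the function buildChain are hypothetical names chosen for this example; the full solution later in the post uses a linked list of nodes instead of a vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One state of the chain; field names are illustrative only.
struct ChainState {
    char accept;  // character this state consumes ('.' means any character)
    bool loops;   // true if the state was followed by '*' (zero or more times)
};

// Turn a pattern such as "c*aa*bb*" into the chain c*, a, a*, b, b*.
// Assumes the pattern is well formed: every '*' follows a letter or '.'.
std::vector<ChainState> buildChain(const std::string& p) {
    std::vector<ChainState> chain;
    for (char ch : p) {
        if (ch == '*') {
            assert(!chain.empty());     // '*' needs a preceding state
            chain.back().loops = true;  // mark the previous state as repeatable
        } else {
            chain.push_back({ch, false});
        }
    }
    return chain;
}
```

A state with loops set may be skipped entirely or may consume the same character repeatedly; every other state consumes exactly one character.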
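What about an expression such as "a*ab"?
It has to match "ab", "aab", "aaab", ...
One way to handle this is a stack, stackSavePoint (savePoint and saveIndex in the code), that records nodes of the a* kind. Such a node can either consume the current character or be skipped in favor of the next node. On reaching one, save it together with the index of the current character, then hand the current character straight to the next state. If a later state cannot handle its input, pop a node from stackSavePoint, switch the current state back to that node, and move the input pointer to (index + 1). In effect this rolls back N-1 steps. It is not very efficient, but each retry still advances by one character.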
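The save-point idea can be condensed into a short standalone sketch. It is a simplified restatement under assumptions, not the full solution: it stores states in a vector rather than linked nodes, keeps (state, index) pairs in a single stack, and checks the end condition inside the loop. The names Step and matchWithSavePoints are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>
#include <string>
#include <utility>
#include <vector>

struct Step { char accept; bool loops; };  // illustrative names

// Assumes a well-formed pattern: every '*' follows a letter or '.'.
bool matchWithSavePoints(const std::string& s, const std::string& p) {
    std::vector<Step> steps;
    for (char ch : p) {
        if (ch == '*') { assert(!steps.empty()); steps.back().loops = true; }
        else steps.push_back({ch, false});
    }
    std::stack<std::pair<std::size_t, std::size_t>> savePoints;  // (state, index)
    std::size_t state = 0, index = 0;
    while (true) {
        bool hasState = state < steps.size();
        bool canEat = hasState && index < s.size() &&
                      (steps[state].accept == '.' || steps[state].accept == s[index]);
        if (hasState && steps[state].loops) {
            // x* state: remember that it could consume s[index],
            // then first try skipping it (zero repetitions).
            if (canEat) savePoints.push({state, index});
            ++state;
            continue;
        }
        if (canEat) {  // ordinary state consumes exactly one character
            ++state;
            ++index;
            continue;
        }
        if (!hasState && index == s.size()) return true;  // both used up
        // Stuck: go back to the latest x* state and let it consume one character.
        if (savePoints.empty()) return false;
        state = savePoints.top().first;
        index = savePoints.top().second + 1;
        savePoints.pop();
    }
}
```

For "aab" against "a*ab", the first attempt skips a*, fails at b, then resumes at a* with the index moved forward by one, which is the single step of progress described above.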
The full code is as follows:
#include <cstdlib>
#include <stack>
#include <string>
using std::stack;
using std::string;

class Solution {
public:
    // One state of the automaton, e.g. for a pattern such as c*cab.
    struct Node {
        bool isEverything;   // node came from '.', accepts any character
        bool isSelfContain;  // node is followed by '*', may repeat zero or more times
        char accept;         // the character this node accepts
        struct Node* son;    // next state
    };

    // Append ch after parent and return the node that is current afterwards.
    struct Node* addNode(struct Node* parent, char ch) {
        if (ch == '*') {
            // '*' does not create a node, it marks the previous one as repeatable.
            parent->isSelfContain = true;
            return parent;
        }
        struct Node* newSon = (Node*)malloc(sizeof(struct Node));
        newSon->accept = ch;
        newSon->isEverything = '.' == ch;
        newSon->isSelfContain = false;
        newSon->son = NULL;
        parent->son = newSon;
        return newSon;
    }

    // Merge neighbours such as a*a*a*a*b => a*b or .*a*b*c*d => .*d
    struct Node* mergeSameNeighbour(struct Node* parent) {
        struct Node* header = parent;
        while (parent && parent->son) {
            if (parent->isSelfContain && parent->son->isSelfContain) {
                bool isEverything = parent->isEverything || parent->son->isEverything;
                if (isEverything || parent->accept == parent->son->accept) {
                    parent->isEverything = isEverything;
                    struct Node* tmp = parent->son;
                    parent->son = parent->son->son;
                    free(tmp);
                    // Stay on the same node so that runs of three or more collapse too.
                    continue;
                }
            }
            parent = parent->son;
        }
        return header;
    }

    // Release every node of the list, including the header.
    void freeNodes(struct Node* node) {
        while (node) {
            struct Node* next = node->son;
            free(node);
            node = next;
        }
    }

    bool isMatch(string s, string p) {
        if (p.empty()) {
            return s.empty();
        }
        // Dummy header so the first real node is attached like any other.
        Node* emptyHeader = (Node*)malloc(sizeof(struct Node));
        emptyHeader->isEverything = false;
        emptyHeader->isSelfContain = false;
        emptyHeader->accept = '\0';
        emptyHeader->son = NULL;
        Node* current = emptyHeader;
        int index = 0;
        int len = (int)p.length();
        while (index < len) {
            current = addNode(current, p.at(index++));
        }
        mergeSameNeighbour(emptyHeader->son);

        index = 0;
        len = (int)s.length();
        bool matched = true;
        stack<Node*> savePoint;
        stack<int> saveIndex;
        current = emptyHeader->son;
        while (index < len) {
            char ch = s[index];
            // printf("char[%d]: %c, current: %c\n", index, ch, current->accept);
            if (current && current->isSelfContain) {
                // x* node: if it could consume ch, save it so it can be called
                // back when a later node cannot cope, then try skipping it first.
                if (current->isEverything || current->accept == ch) {
                    savePoint.push(current);
                    saveIndex.push(index);
                }
                current = current->son;
                continue;
            }
            if (current && (current->isEverything || current->accept == ch)) {
                index++;
                current = current->son;
                continue;
            }
            // Stuck: either a mismatch, or the pattern ran out while input remains.
            if (savePoint.empty()) {
                matched = false;
                break;
            }
            // Go back to the latest x* node and let it consume one character.
            current = savePoint.top();
            savePoint.pop();
            index = saveIndex.top() + 1;
            saveIndex.pop();
        }
        // The input has run out, so the rest of the pattern must be empty or
        // consist only of x* nodes, each of which can match zero characters.
        while (matched && current) {
            if (!current->isSelfContain) {
                matched = false;
            }
            current = current->son;
        }
        freeNodes(emptyHeader);
        return matched;
    }
};
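One open question: to support character classes such as [a-zA-Z.*-+], Node.accept could probably be changed into a key map (a set of accepted characters). That should not be too difficult.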
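One possible shape for such a key map is a 256-entry bit set per node. The sketch below is only an assumption about how the extension might look: AcceptSet, makeAcceptSet, and accepts are hypothetical names, and the parser handles just literal characters and x-y ranges inside the brackets, with no negation or escapes.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical replacement for the single `char accept` field of Node.
using AcceptSet = std::bitset<256>;

// Build the accepted set from the inside of a bracket expression,
// e.g. "a-zA-Z." for [a-zA-Z.]. Supports literals and ranges "x-y" only.
AcceptSet makeAcceptSet(const std::string& body) {
    AcceptSet set;
    for (std::size_t i = 0; i < body.size(); ++i) {
        unsigned char lo = static_cast<unsigned char>(body[i]);
        if (i + 2 < body.size() && body[i + 1] == '-') {
            // A range such as a-z: add every character from lo to hi.
            unsigned char hi = static_cast<unsigned char>(body[i + 2]);
            assert(lo <= hi);
            for (unsigned int c = lo; c <= hi; ++c) set.set(c);
            i += 2;  // skip the '-' and the upper bound
        } else {
            set.set(lo);  // a single literal character
        }
    }
    return set;
}

// Membership test that a node would use instead of `accept == ch`.
bool accepts(const AcceptSet& set, char ch) {
    return set.test(static_cast<unsigned char>(ch));
}
```

With this layout, "." would simply be a set with every bit on, so the isEverything flag could be folded into the same check.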