An Improved KMP String Pattern Matching Algorithm

July 16, 2023 · 380 views · Source: KMP algorithm

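The KMP algorithm relies on a next array. The improved version makes a second reference to the pattern on top of that array: if the pattern character at the fallback position next[j] is the same as the character at position j that just failed to match, falling back there would fail in exactly the same way, so position j reuses the value already computed for the fallback position instead.
The C-like code in the Tsinghua University Press textbook 《数据结构(c语言版)》 (Data Structures, C Language Edition) keeps the string length in slot 0 of the array, so the characters themselves start at index 1. That differs from the C/C++ standard library, so I rewrote it as plain standard C++ using the std library's string class, which raises the question of array indices. If string iterators are used to locate characters, index 0 corresponds to begin() and -1 corresponds to begin()-1 (that is, one position below the lower bound, a sentinel that must never be dereferenced).
When nextVal is -1, the main string should slide right by one character, and the pattern is reset to its initial state, index = 0.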
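As a rough illustration of the "second reference" described above, the sketch below derives nextVal from a next array that has already been computed. The helper name refineNext is hypothetical, and it assumes 0-based indexing with -1 as the sentinel, as in the text.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: derives nextVal from an existing next array.
// Assumes 0-based indexing, next[0] == -1 and next.size() == p.size().
std::vector<int> refineNext(const std::string& p, const std::vector<int>& next)
{
    std::vector<int> nextVal(next.size());
    for (std::size_t j = 0; j < next.size(); ++j) {
        const int k = next[j];
        if (k >= 0 && p[j] == p[k]) {
            // Falling back to k would compare the same character again and
            // fail again, so reuse the value already computed for position k.
            nextVal[j] = nextVal[k];
        } else {
            // Covers the sentinel k == -1 and the case p[j] != p[k].
            nextVal[j] = k;
        }
    }
    return nextVal;
}
```

For the pattern "aaaab" with next = {-1, 0, 1, 2, 3}, this yields nextVal = {-1, -1, -1, -1, 3}: a mismatch on any of the first four characters moves the main string forward immediately instead of retrying the same 'a'.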

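How the original KMP algorithm works, for a pattern P = p_0 p_1 ... indexed from 0:

$$
next[j] =
\begin{cases}
-1, & j = 0 \text{ (the pattern returns to its starting index)} \\
\max\{\, k \mid 0 < k < j,\ p_0 p_1 \cdots p_{k-1} = p_{j-k} p_{j-k+1} \cdots p_{j-1} \,\}, & \text{if this set is non-empty} \\
0, & \text{otherwise (no usable pattern reference information)}
\end{cases}
$$

The improved KMP keeps only the comparison information in the Max set that is actually useful: if P(next[j]) = P(j), falling back to next[j] would fail again, so nextVal[j] = nextVal[next[j]]; otherwise nextVal[j] = next[j]. The candidates themselves come from Formula 1: with k = next[j], if P(k) = P(j), then next[j+1] = next[j] + 1. If they are not equal, the pattern should slide right (that is, its own index decreases, k = next[k]) until Formula 1 is satisfied; if no suitable k can be found at all, then …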
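The recurrence in Formula 1 can be sketched as follows. This is a minimal illustration of the plain (unimproved) next array only; the function name buildNext is hypothetical, and 0-based indexing with next[0] = -1 is assumed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: plain next array, 0-based, with next[0] == -1.
std::vector<int> buildNext(const std::string& p)
{
    std::vector<int> next(p.size());
    if (p.empty()) return next;

    next[0] = -1;
    int k = -1;            // current candidate, i.e. next[j]
    std::size_t j = 0;
    while (j + 1 < p.size()) {
        if (k == -1 || p[j] == p[k]) {
            // Formula 1: the matched prefix grows by one character.
            ++j;
            ++k;
            next[j] = k;
        } else {
            // Mismatch: the pattern slides right, so its own index shrinks.
            k = next[k];
        }
    }
    return next;
}
```

For the pattern "abaabcac" used later in this post, the sketch gives {-1, 0, 0, 1, 1, 2, 0, 1}.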

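Note that Sina's blog editor has a faulty filter that cannot handle some program characters and strips them from posted code; the complete, error-free code is also available on my other blogs.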

// KMP.cpp : Defines the entry point for the console application.

#include "stdafx.h"   // MSVC precompiled header; drop this line on other compilers
#include <iostream>
#include <string>
using namespace std;

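void getNextVal(const string &t, int *nextVal)
{
    int i = 0;
    int j = -1;
    nextVal[0] = -1;
    int size = static_cast<int>(t.length()); // key point 1: convert the unsigned length to a signed int
    // The loop condition was lost in the posted text; `i < size` is assumed here.
    // With it, nextVal needs size+1 slots, and the last pass reads the terminating '\0' via c_str().
    while (i < size)
    {
        if (j == -1 || t.c_str()[i] == t.c_str()[j])
        {
            ++i; ++j;
            if (t.c_str()[i] != t.c_str()[j]) { nextVal[i] = j; }
            else { nextVal[i] = nextVal[j]; }   // same character would fail again: reuse the earlier fallback
        }
        else { j = nextVal[j]; }
    }
}//getNextVal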

bool Index_KMP(const string &s, const string &t, int &targetIndex, int beginPos = 0)
{
    int *nextVal = new int[t.length() + 1];   // one slot per pattern character, plus one spare
    getNextVal(t, nextVal);

    // Signed/unsigned conversion needs care: j can be -1, so compare against int lengths
    int s_size = static_cast<int>(s.length());
    int t_size = static_cast<int>(t.length());

    int i = beginPos;   // position in the main string
    int j = 0;          // position in the pattern; starting at -1 would skip s[beginPos]
    while (i < s_size && j < t_size)
    {
        if (j == -1 || s[i] == t[j]) { ++i; ++j; }
        else { j = nextVal[j]; }
    }
    delete[] nextVal;

    if (j >= t_size)
    {
        targetIndex = i - t_size;
        cout << "\n" << i << "\t" << j << "\t" << targetIndex;
        return true;
    }
    return false;
}//Index_KMP

int main()
{
    string s = string("acabaabaabcacaabc");
    string t("abaabcac");
    int index = 0;
    Index_KMP(s, t, index);
    cin >> index;   // waits for input so the console window stays open
    return 0;
}
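For comparison, here is a compact, portable sketch of the same search that avoids manual new/delete by using std::vector. The name kmpFind and its std::string::npos return convention are assumptions made for illustration, not part of the original listing; the table is sized to the pattern length, so the loop stops one step earlier than in getNextVal above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical name. Returns the index of the first match at or after
// `from`, or std::string::npos when there is none.
std::size_t kmpFind(const std::string& s, const std::string& t, std::size_t from = 0)
{
    if (from > s.size()) return std::string::npos;
    if (t.empty()) return from;

    const int n = static_cast<int>(s.size());
    const int m = static_cast<int>(t.size());

    // Build the nextVal table in a single pass over the pattern.
    std::vector<int> nextVal(m);
    nextVal[0] = -1;
    for (int i = 0, j = -1; i < m - 1; ) {
        if (j == -1 || t[i] == t[j]) {
            ++i;
            ++j;
            nextVal[i] = (t[i] != t[j]) ? j : nextVal[j];
        } else {
            j = nextVal[j];
        }
    }

    // Scan the main string; i never moves backwards.
    int i = static_cast<int>(from);
    int j = 0;
    while (i < n && j < m) {
        if (j == -1 || s[i] == t[j]) {
            ++i;
            ++j;
        } else {
            j = nextVal[j];
        }
    }
    return j >= m ? static_cast<std::size_t>(i - m) : std::string::npos;
}
```

With the strings from the listing above, kmpFind("acabaabaabcacaabc", "abaabcac") returns 5, which agrees with std::string::find on the same input.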

    Original author: KMP算法
    Original URL: https://blog.csdn.net/jingzhewangzi/article/details/38932087