Implementing the Sunday String Search Algorithm

July 10, 2023

Introduction

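Among string search algorithms, KMP is probably the best known. Few people realize, however, that it is rarely used in everyday software.

Microsoft's Notepad and Word, for example, reportedly do not use it. Why not? Because of its search efficiency in practice.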

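This article does not explain how KMP works, nor does it go into KMP's many variants. If you are not familiar with the algorithm,

I recommend this blog post: http://blog.csdn.net/v_july_v/article/details/7041827. It is the most accessible explanation I have come across.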

Main Text

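Real-world software tends to use the BM (Boyer-Moore) algorithm and the Sunday algorithm discussed here. The two are closely related in principle.

The main difference is this: BM compares the pattern against the text from back to front, and derives its shift distance from two rules, the "good suffix" and the "bad character".

Sunday compares from front to back. When a mismatch occurs, it looks at the text character one position past the end of the text segment currently aligned with the pattern, and decides the shift from that character alone.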

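For example, search for exam in where is an sample exam. The pattern starts out aligned with the beginning of the text:

where is an sample exam
exam

Because w != e, take the text character just past the end of the window, that is, the second e in

where is an sample exam, and check whether it occurs in the pattern.

If it does, shift the pattern so that its occurrence of that character (the rightmost one, if there are several) lines up with it:

where is an sample exam
    exam

If it does not, shift the pattern past that character entirely, that is, by the pattern length plus one. In the window above, e matches but the following space does not match x, so look at the character just past the window:

where is an sample exam
    exam

That character (the space after is) does not occur in exam, so move exam to the position immediately after it and resume comparing there:

where is an sample exam
         exam

The same rule is applied repeatedly until the pattern is found or the text is exhausted.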
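The shift rule above can be sketched in C as follows. This is only a minimal illustration, not code from the original post: the helper names lastIndexOf and sundayShift are hypothetical, and the sketch assumes ordinary NUL-terminated char strings.

```c
#include <string.h>

/* Hypothetical helper: index of the rightmost occurrence of c in p,
   or -1 if c does not occur in p. */
int lastIndexOf(const char *p, char c) {
    int k, idx = -1;
    for (k = 0; p[k] != '\0'; k++) {
        if (p[k] == c) idx = k;   /* keep overwriting: the last hit wins */
    }
    return idx;
}

/* Hypothetical helper: how far the window moves after a mismatch, given
   the text character `next` that sits just past the current window.
   - next occurs in p at rightmost index k: shift by pLen - k
   - next does not occur in p:              shift by pLen + 1 */
int sundayShift(const char *p, char next) {
    int pLen = (int)strlen(p);
    return pLen - lastIndexOf(p, next);   /* k == -1 yields pLen + 1 */
}
```

For the pattern exam, sundayShift("exam", 'e') is 4 (line up the two e characters) and sundayShift("exam", ' ') is 5 (skip past a character that is not in the pattern).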

With the rule in hand, here is the sample source code:

#include <stdio.h>
#include <string.h>

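/* Return the index of the last (rightmost) occurrence of `in` in `string`,
   or -1 if it does not occur. Sunday needs the rightmost occurrence;
   using the first one can make the shift too large and skip a match. */
int isCharInString(const char *string, char in){
    int index = -1;
    int k;
    for (k = 0; string[k] != '\0'; k++) {
        if (string[k] == in) index = k;
    }
    return index;
}

/* Return the index of the first occurrence of p in s, or -1 if absent. */
int SundaySearch(const char *s, const char *p){
    int sLen = (int)strlen(s);
    int pLen = (int)strlen(p);
    int i = 0;              /* start of the current window in s */
    int j, charIndex;
    while (i + pLen <= sLen) {
        /* compare the window with the pattern, front to back */
        j = 0;
        while (j < pLen && s[i + j] == p[j]) j++;
        if (j == pLen) return i;    /* full match */
        /* Look up the text character just past the window. At the end of
           the text this is the terminating '\0', which is never found. */
        charIndex = isCharInString(p, s[i + pLen]);
        /* Found: align it with its rightmost occurrence in p.
           Not found (charIndex == -1): shift by pLen + 1 to skip past it. */
        i += pLen - charIndex;
    }
    return -1;
}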

int main(void){
    char string[] = {"substring searching algorithm"};
    char pattern[] = {"search"};
    printf("Find index of string is: %d\n",SundaySearch(string,pattern));
    return 0;
}

Test output: Find index of string is: 10 (indices start at 0)

Summary

Interested readers are welcome to optimize this implementation. Suggestions and corrections are appreciated.
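One common optimization, sketched here as an assumption rather than as the original author's code, is to precompute the shift for every possible byte value once. The lookup after each mismatch then becomes a single array access instead of a scan of the pattern. The function name sundaySearchTable is hypothetical.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical table-driven variant: returns the index of the first
   occurrence of p in s, or -1 if p does not occur. */
int sundaySearchTable(const char *s, const char *p) {
    size_t sLen = strlen(s);
    size_t pLen = strlen(p);
    size_t shift[UCHAR_MAX + 1];
    size_t i, k;

    /* Default: the byte is not in p, so skip past it (pLen + 1). */
    for (k = 0; k <= UCHAR_MAX; k++) shift[k] = pLen + 1;
    /* Later (more rightward) occurrences overwrite earlier ones. */
    for (k = 0; k < pLen; k++) shift[(unsigned char)p[k]] = pLen - k;

    for (i = 0; i + pLen <= sLen; ) {
        if (memcmp(s + i, p, pLen) == 0) return (int)i;
        /* s[i + pLen] is at worst the terminating '\0', so the read stays
           in bounds; its shift is pLen + 1, which ends the loop. */
        i += shift[(unsigned char)s[i + pLen]];
    }
    return -1;
}
```

Building the table costs one pass over the pattern plus one pass over the 256 byte values, so it mainly pays off when the text is long compared with the pattern.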

    Original author: 查找算法
    Original URL: https://blog.csdn.net/yjbqzsf/article/details/48498721