Notes on the KMP String-Matching Algorithm

October 17, 2023. Source: 唾手可得的树

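KMP is not an algorithm for finding the longest common substring. It is a more efficient way to decide whether string A contains string B and, if it does, at which index.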
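For contrast, here is a minimal sketch of the naive search that KMP improves on. The class name NaiveSearch is hypothetical and is not part of the original article. The point to notice is that after a mismatch the naive scan throws away everything it has already matched and restarts one position later, whereas KMP reuses that information through the next array.

```java
// Hypothetical baseline for comparison; not from the original article.
final class NaiveSearch {

    /** Returns the index of the first occurrence of pattern in text, or -1 if there is none. */
    static int find(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        for (int start = 0; start + m <= n; start++) {
            int matched = 0;
            while (matched < m && text.charAt(start + matched) == pattern.charAt(matched)) {
                matched++;
            }
            if (matched == m) {
                return start; // the whole pattern matched at this offset
            }
            // Mismatch: the characters already matched are forgotten
            // and the scan retries at start + 1.
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(find("BBC ABCDAB ABCDABCDABDE", "ABCDABD"));
    }
}
```

In the worst case this does on the order of n * m character comparisons, because the text index is effectively moved backwards after every partial match.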

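The core of KMP is computing the next array: for each position in the pattern, it stores the length of the longest proper prefix that is also a suffix of the characters before that position.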

For example, the next array of ABCDABD is -1, 0, 0, 0, 0, 1, 2.
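The values in this example can be checked directly against the definition. The sketch below is deliberately brute-force and is not the method the article uses. The class name NextByDefinition is hypothetical, and the sketch assumes the same convention as the example: next[0] is -1, and next[j] describes the characters before position j.

```java
import java.util.Arrays;

// Hypothetical helper: computes the next array straight from its definition.
// It is quadratic or worse, so it is only meant for checking small examples.
final class NextByDefinition {

    static int[] next(String pattern) {
        int[] next = new int[pattern.length()];
        for (int j = 0; j < pattern.length(); j++) {
            if (j == 0) {
                next[0] = -1; // convention: nothing before the first character
                continue;
            }
            String head = pattern.substring(0, j); // the characters before position j
            int best = 0;
            // Try every proper prefix length; keep the longest one that equals the suffix.
            for (int len = 1; len < j; len++) {
                String suffix = head.substring(j - len);
                if (head.startsWith(suffix)) {
                    best = len;
                }
            }
            next[j] = best;
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(next("ABCDABD"))); // [-1, 0, 0, 0, 0, 1, 2]
    }
}
```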

The search loop and the loop that builds the next array are very similar: each is a while loop wrapping an if/else.

Matching process when building the next array:

/**
* Matching process: the pattern is compared against itself
* ABADABAEABC
*  ABADABAEABC
*
* ABADABAEABC
*   ABADABAEABC
*
* ABADABAEABC
*    ABADABAEABC
*
* ABADABAEABC
*     ABADABAEABC
*
* ABADABAEABC
*       ABADABAEABC
*
* ABADABAEABC
*        ABADABAEABC
*
* ABADABAEABC
*         ABADABAEABC
*/

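/**
 * KMP matching
 *
 * @param sStr the text to search in (parent string)
 * @param dStr the pattern to search for (substring)
 * @return index of the first occurrence of dStr in sStr, or -1 if there is none
 */
public static int find(String sStr, String dStr) {
    int sLength = sStr.length();
    int dLength = dStr.length();
    // An empty pattern matches at index 0 and has no next array to build.
    if (dLength == 0) {
        return 0;
    }
    int sIndex = 0, dIndex = 0;
    int[] next = getNextArray2(dStr);

    while (sIndex < sLength && dIndex < dLength) {
        // dIndex == -1 means no prefix of the pattern can be reused;
        // otherwise the current characters match
        if (dIndex == -1 || sStr.charAt(sIndex) == dStr.charAt(dIndex)) {
            // advance both the text and the pattern by one character
            sIndex++;
            dIndex++;
        } else {
            // mismatch: sIndex stays where it is, dIndex falls back to next[dIndex]
            System.out.println("sStr ele is " + sStr.charAt(sIndex) + ",dStr ele is " + dStr.charAt(dIndex));
            int temp = dIndex;
            dIndex = next[dIndex];
            System.out.println("current dIndex is " + temp + ",next dIndex is " + dIndex);
        }
    }
    // the whole pattern was matched
    if (dIndex == dLength) {
        return sIndex - dLength;
    }
    return -1;
}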

Matching process when searching for a substring:

/**
* Matching process
* BBC ABCDAB ABCDABCDABDE
* ABCDABD
*
* BBC ABCDAB ABCDABCDABDE
*  ABCDABD
*
* BBC ABCDAB ABCDABCDABDE
*   ABCDABD
*
* BBC ABCDAB ABCDABCDABDE
*    ABCDABD
*
* BBC ABCDAB ABCDABCDABDE
*     ABCDABD
*
* BBC ABCDAB ABCDABCDABDE
*         ABCDABD
*
* BBC ABCDAB ABCDABCDABDE
*           ABCDABD
*
* BBC ABCDAB ABCDABCDABDE
*            ABCDABD
*
* BBC ABCDAB ABCDABCDABDE
*                ABCDABD
*/
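The shifts in a trace like this can be reproduced programmatically. The following is a minimal sketch, assuming the same -1-based next convention as the rest of the article. The class KmpAlignmentTrace and its methods buildNext and alignments are hypothetical names introduced here for illustration. It records where the pattern's first character sits under the text each time a mismatch causes a fallback.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracer: lists the text offsets at which the pattern is aligned during a KMP search.
final class KmpAlignmentTrace {

    static int[] buildNext(String pattern) {
        int[] next = new int[pattern.length()];
        if (pattern.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int k = -1, j = 0;
        while (j < pattern.length() - 1) {
            if (k == -1 || pattern.charAt(k) == pattern.charAt(j)) {
                k++;
                j++;
                next[j] = k;
            } else {
                k = next[k];
            }
        }
        return next;
    }

    static List<Integer> alignments(String text, String pattern) {
        List<Integer> offsets = new ArrayList<>();
        int[] next = buildNext(pattern);
        int s = 0, d = 0;
        offsets.add(0); // the pattern starts aligned with the beginning of the text
        while (s < text.length() && d < pattern.length()) {
            if (d == -1 || text.charAt(s) == pattern.charAt(d)) {
                s++;
                d++;
            } else {
                d = next[d];
                // After the fallback the pattern's first character sits under text index s - d.
                // When d is -1 this is s + 1, which is where the next iteration resumes.
                offsets.add(s - d);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        String text = "BBC ABCDAB ABCDABCDABDE";
        String pattern = "ABCDABD";
        for (int offset : alignments(text, pattern)) {
            StringBuilder pad = new StringBuilder();
            for (int i = 0; i < offset; i++) {
                pad.append(' ');
            }
            System.out.println(text);
            System.out.println(pad + pattern);
            System.out.println();
        }
    }
}
```

For this text and pattern the recorded offsets are 0, 1, 2, 3, 4, 8, 10, 11, 15. The jumps from 4 to 8 and from 11 to 15 are the places where the next array lets the search skip alignments that cannot match.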

Reference: https://blog.csdn.net/Thousa_Ho/article/details/72842029

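/**
 * Builds the next array
 *
 * @param destStr the pattern string
 * @return the next array
 */
public static int[] getNextArray2(String destStr) {
    // An empty pattern has no positions, so its next array is empty.
    if (destStr.isEmpty()) {
        return new int[0];
    }
    int[] nextArr = new int[destStr.length()];
    nextArr[0] = -1;
    int k = -1, j = 0;
    while (j < destStr.length() - 1) {
        // k == -1 means no prefix can be reused; otherwise the characters at k and j match
        if (k == -1 || (destStr.charAt(k) == destStr.charAt(j))) {
            ++k;
            ++j;
            // length of the longest equal prefix and suffix in the characters before position j
            nextArr[j] = k;
            System.out.println("nextArr[" + j + "] is " + k);
        } else {
            // mismatch: fall back to the next shorter candidate prefix
            int temp = k;
            k = nextArr[k];
            System.out.println("before k is " + temp + ",now is " + k);
        }
    }
    return nextArr;
}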
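To try the two routines together, the sketch below restates them without the debug output and compares the result with String.indexOf on a few inputs. The class KmpCheck and the method names buildNext and find are hypothetical. The early return for an empty pattern in buildNext is an assumption added here so that the empty case behaves like indexOf.

```java
// Hypothetical self-contained check; the logic mirrors the article's loops without the println calls.
final class KmpCheck {

    static int[] buildNext(String pattern) {
        int[] next = new int[pattern.length()];
        if (pattern.isEmpty()) {
            return next; // assumption: treat the empty pattern as having an empty next array
        }
        next[0] = -1;
        int k = -1, j = 0;
        while (j < pattern.length() - 1) {
            if (k == -1 || pattern.charAt(k) == pattern.charAt(j)) {
                k++;
                j++;
                next[j] = k;
            } else {
                k = next[k];
            }
        }
        return next;
    }

    static int find(String text, String pattern) {
        int[] next = buildNext(pattern);
        int s = 0, d = 0;
        while (s < text.length() && d < pattern.length()) {
            if (d == -1 || text.charAt(s) == pattern.charAt(d)) {
                s++;
                d++;
            } else {
                d = next[d]; // the text index never moves backwards
            }
        }
        return d == pattern.length() ? s - d : -1;
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"BBC ABCDAB ABCDABCDABDE", "ABCDABD"},
            {"ABADABAEABC", "ABAE"},
            {"AAAAAB", "AAB"},
            {"ABC", "ABCD"},
            {"ABC", ""},
        };
        for (String[] c : cases) {
            int kmp = find(c[0], c[1]);
            int expected = c[0].indexOf(c[1]);
            System.out.println("\"" + c[1] + "\" in \"" + c[0] + "\": kmp=" + kmp + ", indexOf=" + expected);
        }
    }
}
```

Because the text index only moves forward and each fallback shortens the matched prefix, the search does a number of comparisons proportional to the combined length of the text and the pattern.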

    Original author: 唾手可得的树
    Original article: https://www.cnblogs.com/usual2013blog/p/9096021.html