使用KMP算法求子串出现次数_Jiffies

http://blog.sina.com.cn/u/2216603352

首页博文目录关于我

个人资料

微博

加好友发纸条

写留言加关注

博客等级：
博客积分：

博客访问：
关注人气：
获赠金笔：0支
赠出金笔：0支
荣誉徽章：

正文字体大小：大中小

使用KMP算法求子串出现次数

(2012-07-13 10:35:16)

标签：

kmp

杂谈

1. KMP算法：对应长度为n的目标串和长度为m的模式串，kmp算法的复杂度是o(m+n).其中o(m)的时间用于需找模式串的失效函数，o(n)的时间用于匹配。算法思想说起来比较麻烦，但是并不复杂，参考数据结构的书吧。

2. 下面给出kmp的代码search()和子串出现次数代码count().其中count()的复杂度是o(n)，整体复杂度也是o(m+n).

#include <iostream>
#include <cstdio>
#include <cstring>
using namespace std;

void fail(const char* pat, int* f){
    const int len=strlen(pat);
    f[0]=-1;
    for (int j=1; j < len; j++){
        if (pat[j] == pat[f[j-1]+1])
            f[j] = f[j-1]+1;
        else {
            int i=f[j-1];
            while ( i >=0 && pat[j] != pat[i+1] )
                i = f[i];
            if (pat[j] ==pat[i+1])
                f[j] = i+1;
            else
                f[j] = -1;
        }
    }
}

int search(const char* tg, const char *pt, const int* f){
    const int ptLen=strlen(pt);
    const int tgLen = strlen(tg);
    int tgPos=0,ptPos=0;
    for (; tgPos < tgLen && ptPos < ptLen;) {
        if (tg[tgPos] == pt[ptPos]){
            ptPos++;
            tgPos++;
        }
        else {
            if (ptPos == 0)
                tgPos ++;
            else
                ptPos = f[ptPos-1] + 1;
        }
    }
    return ptPos < ptLen ? -1: tgPos - ptLen;
}

int count(const char* tg, const char *pt, const int* f){
    const int ptLen=strlen(pt);
    const int tgLen = strlen(tg);
    int tgPos=0,ptPos=0,count=0;
    for (; tgPos < tgLen;) {
        if (tg[tgPos] == pt[ptPos]){
            ptPos++;
            tgPos++;
        }
        else {
            if (ptPos == 0)
                tgPos ++;
            else
                ptPos = f[ptPos-1] + 1;
        }
        if (ptPos >= ptLen ){
            count ++;
            ptPos=f[ptLen-1] +1;
        }
    }
    return count;
}

int main()
{
    const char *tag="abcbcbcbc";
    const char *pat="bcbc";
    int len=strlen(pat);
    int *f =new int[len];
    fail(pat,f);
    cout <<"失效函数位置:\n";
    for(int i=0;i<len;++i)
        cout << f[i] << '\n';
    cout << "子串第一次出现的位置: \n";
    cout << search(tag,pat,f) << '\n';
    cout <<"子串出现的次数 : \n";
    cout << count(tag,pat,f) <<'\n';
    return 0;
}

阅读┊ 收藏 ┊ 喜欢 ▼ ┊打印┊举报/Report

前一篇：linux驱动头文件说明

后一篇：链表排序

新浪BLOG意见反馈留言板　欢迎批评指正