Distinct Subsequences

October 22, 2023 | Source: OldPanda

Problem

Given a string S and a string T, count the number of distinct subsequences of S that are equal to T.

A subsequence of a string is a new string formed from the original string by deleting some (possibly none) of the characters without disturbing the relative positions of the remaining characters (i.e., "ACE" is a subsequence of "ABCDE" while "AEC" is not).

Here is an example:
S = "rabbbit", T = "rabbit"

Return 3.

Solution

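At last, a problem that calls for dynamic programming; string-comparison problems like this one almost always seem to come down to DP.
The question really asks how many times the given string T occurs as a subsequence of the longer string S. Start the comparison from the end and look at the last characters. If the last character of T differs from the last character of S, the problem reduces to counting the occurrences of T in S[:-1]. If they are the same, the answer is the number of occurrences of T in S[:-1] plus the number of occurrences of T[:-1] in S[:-1]. Working back from the last character in this way, we obtain the following matrix, with the characters of S across the top and the characters of T down the side: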
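The last-character case analysis above can be sketched directly as a memoized recursion. This is only an illustration of the reasoning, not the submitted solution; the function name `count_from_end` and the use of `functools.lru_cache` are assumptions made for this sketch.

```python
from functools import lru_cache


def count_from_end(S, T):
    """Count how many subsequences of S equal T by comparing last characters."""

    @lru_cache(maxsize=None)
    def count(s_len, t_len):
        # An empty T is matched exactly once (delete every character of S).
        if t_len == 0:
            return 1
        # A non-empty T cannot occur in a shorter S.
        if s_len < t_len:
            return 0
        # Case 1: ignore the last character of S.
        total = count(s_len - 1, t_len)
        # Case 2: the last characters agree, so they may also be paired up.
        if S[s_len - 1] == T[t_len - 1]:
            total += count(s_len - 1, t_len - 1)
        return total

    return count(len(S), len(T))


print(count_from_end("rabbbit", "rabbit"))  # 3
```

Because the recursion depth grows with the length of S, a sketch like this suits short inputs; the table-based form avoids that limit.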

    r a b b b i t
  1 1 1 1 1 1 1 1
r 0 1 1 1 1 1 1 1
a 0 0 1 1 1 1 1 1
b 0 0 0 1 2 3 3 3
b 0 0 0 0 1 3 3 3
i 0 0 0 0 0 0 3 3
t 0 0 0 0 0 0 0 3

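As the matrix shows, with i indexing T (rows) and j indexing S (columns), each entry satisfies match[i][j] = match[i][j-1] + (match[i-1][j-1] if S[j-1] == T[i-1] else 0). The first row is all 1s, because the empty T occurs exactly once in any prefix of S, and the rest of the first column is 0. The value in the bottom-right corner is the answer.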

The accepted (AC) code is as follows:

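class Solution:
    # @return an integer
    def numDistinct(self, S, T):
        length_s = len(S)
        length_t = len(T)
        # An empty S contains only the empty T.
        if length_s == 0:
            return 0 if length_t != 0 else 1
        # An empty T occurs exactly once in any S.
        if length_t == 0:
            return 1
        # match[t][s] = number of ways T[:t] occurs as a subsequence of S[:s]
        match = [[0 for dummy_i in range(length_s + 1)] for dummy_j in range(length_t + 1)]
        # Base case: the empty prefix of T occurs once in every prefix of S.
        for col in range(length_s + 1):
            match[0][col] = 1
        for s_idx in range(1, length_s + 1):
            for t_idx in range(1, length_t + 1):
                # Occurrences that do not use S[s_idx - 1].
                match[t_idx][s_idx] = match[t_idx][s_idx - 1]
                # Occurrences that pair S[s_idx - 1] with T[t_idx - 1].
                if S[s_idx - 1] == T[t_idx - 1]:
                    match[t_idx][s_idx] += match[t_idx - 1][s_idx - 1]
        return match[length_t][length_s]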
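Since each column of the table depends only on the column to its left, the same recurrence can also be run with a single array. The following is a hedged sketch of that space-saving variant, not part of the original submission; the name `num_distinct_one_row` is made up for this example.

```python
def num_distinct_one_row(S, T):
    """Same recurrence as the 2-D table, keeping only one column at a time."""
    # match[t] = number of ways T[:t] occurs in the prefix of S processed so far.
    match = [0] * (len(T) + 1)
    match[0] = 1  # the empty T always occurs exactly once
    for ch in S:
        # Walk t from high to low so match[t - 1] still holds the value
        # from before this character of S was processed.
        for t in range(len(T), 0, -1):
            if ch == T[t - 1]:
                match[t] += match[t - 1]
    return match[len(T)]


print(num_distinct_one_row("rabbbit", "rabbit"))  # 3
```

This uses memory proportional to the length of T rather than to the product of the two lengths, at the cost of no longer being able to inspect the full matrix.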

Original author: OldPanda
Original article: https://segmentfault.com/a/1190000000615426