Problem
Given a string S and a string T, count the number of distinct subsequences of S that equal T.
A subsequence of a string is a new string formed from the original string by deleting some (possibly none) of its characters without disturbing the relative positions of the remaining characters (i.e., "ACE" is a subsequence of "ABCDE", while "AEC" is not).
Here is an example:
S = "rabbbit", T = "rabbit"
Return 3.
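To make the counting concrete, the example can be checked by brute force. The following is only a sketch for small inputs, not the intended solution: it enumerates every choice of len(T) positions in S and counts those that spell T. The function name count_by_enumeration is made up for this illustration.

```python
from itertools import combinations


def count_by_enumeration(s, t):
    """Count the ways to pick positions of s, in increasing order, that spell t.

    Exponential in general; only meant for checking small examples.
    """
    return sum(
        1
        for idxs in combinations(range(len(s)), len(t))
        if all(s[i] == c for i, c in zip(idxs, t))
    )


print(count_by_enumeration("rabbbit", "rabbit"))  # 3
```

The three hits correspond to dropping each of the three 'b' characters of S in turn.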
Solution
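Finally, a problem that calls for dynamic programming. It seems that almost every string-matching problem of this kind comes down to a DP approach.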
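The problem really asks how many times the given string T occurs as a subsequence of the longer string S, so we start comparing from the end. Take the last characters: if the last character of T differs from the last character of S, the problem reduces to counting the occurrences of T in S[:-1]; if they are the same, the answer is the number of occurrences of T in S[:-1] plus the number of occurrences of T[:-1] in S[:-1]. Unrolling this recurrence from back to front gives the following matrix (rows are indexed by the characters of T, columns by the characters of S):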
      ""  r  a  b  b  b  i  t
  ""   1  1  1  1  1  1  1  1
  r    0  1  1  1  1  1  1  1
  a    0  0  1  1  1  1  1  1
  b    0  0  0  1  2  3  3  3
  b    0  0  0  0  1  3  3  3
  i    0  0  0  0  0  0  3  3
  t    0  0  0  0  0  0  0  3
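As the matrix shows, each entry satisfies match[i][j] = match[i][j-1] + (match[i-1][j-1] if S[j-1] == T[i-1] else 0), with match[0][j] = 1 and match[i][0] = 0 for i > 0, so the value in the bottom-right corner is the answer.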
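The last-character reasoning can also be written directly as a memoized recursion. This is a minimal sketch rather than the submitted solution; count_top_down and the inner helper count are names chosen here for illustration, and it assumes Python 3 (functools.lru_cache).

```python
from functools import lru_cache


def count_top_down(s, t):
    """Number of ways to form t as a subsequence of s, computed top-down."""

    @lru_cache(maxsize=None)
    def count(j, i):
        # count(j, i): ways to form t[:i] from s[:j], i.e. match[i][j]
        if i == 0:
            return 1  # the empty string is formed in exactly one way
        if j < i:
            return 0  # not enough characters left in s
        ways = count(j - 1, i)  # skip s[j - 1]
        if s[j - 1] == t[i - 1]:
            ways += count(j - 1, i - 1)  # pair s[j - 1] with t[i - 1]
        return ways

    return count(len(s), len(t))


print(count_top_down("rabbbit", "rabbit"))  # 3
```

The recursion depth grows with len(s), so for long inputs the bottom-up table is the safer choice.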
The accepted (AC) code is as follows:
class Solution:
    # @return an integer
    def numDistinct(self, S, T):
        length_s = len(S)
        length_t = len(T)
        if length_s == 0:
            return 0 if length_t != 0 else 1
        if length_t == 0:
            return 1
        # match[t_idx][s_idx]: number of ways to form T[:t_idx] from S[:s_idx]
        match = [[0 for dummy_i in range(length_s + 1)] for dummy_j in range(length_t + 1)]
        # the empty T is formed exactly once from every prefix of S
        for col in range(length_s + 1):
            match[0][col] = 1
        for s_idx in range(1, length_s + 1):
            for t_idx in range(1, length_t + 1):
                # ways that do not use S[s_idx - 1]
                match[t_idx][s_idx] = match[t_idx][s_idx - 1]
                if S[s_idx - 1] == T[t_idx - 1]:
                    # ways that pair S[s_idx - 1] with T[t_idx - 1]
                    match[t_idx][s_idx] += match[t_idx - 1][s_idx - 1]
        return match[length_t][length_s]
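Since each column of the matrix depends only on the column to its left, the table can be collapsed to a single array. The version below is a possible variant, not the code that was submitted; num_distinct_1d is a name made up for this sketch. Iterating over T from right to left ensures that ways[i - 1] still holds the value from the previous column when it is read.

```python
def num_distinct_1d(s, t):
    """Same recurrence as the 2D table, using O(len(t)) extra space."""
    # ways[i]: number of ways to form t[:i] from the part of s seen so far
    ways = [0] * (len(t) + 1)
    ways[0] = 1  # the empty string is always formed exactly once
    for ch in s:
        # go right to left so ways[i - 1] is still the previous column's value
        for i in range(len(t), 0, -1):
            if t[i - 1] == ch:
                ways[i] += ways[i - 1]
    return ways[len(t)]


print(num_distinct_1d("rabbbit", "rabbit"))  # 3
```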