LightOJ - 1224 DNA Prefix: trie (prefix tree), a nasty problem

DNA Prefix

Given a set of n DNA samples, where each sample is a string containing characters from {A, C, G, T}, we are trying to find a subset of samples in the set, where the length of the longest common prefix multiplied by the number of samples in that subset is maximum.

To be specific, let the samples be:

ACGT
ACGTGCGT
ACCGTGC
ACGCCGT

If we take the subset {ACGT} then the result is 4 (4 * 1), if we take {ACGT, ACGTGCGT, ACGCCGT} then the result is 3 * 3 = 9 (since ACG is the common prefix), if we take {ACGT, ACGTGCGT, ACCGTGC, ACGCCGT} then the result is 2 * 4 = 8.

Now your task is to report the maximum result we can get from the samples.
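
The definition above can be checked by brute force on small inputs. The sketch below is not part of the original solution: the function name bestPrefixScoreBruteForce is made up for illustration. It counts how many samples share each prefix and takes the largest length times count. This is enough because the best subset for a fixed prefix is simply every sample that starts with it.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not from the original post): counts every prefix of
// every sample, then returns the maximum of prefix length * occurrence count.
long long bestPrefixScoreBruteForce(const std::vector<std::string>& samples) {
    std::unordered_map<std::string, long long> occurrences;
    for (const std::string& s : samples) {
        for (std::size_t len = 1; len <= s.size(); ++len) {
            ++occurrences[s.substr(0, len)];  // one more sample starts with this prefix
        }
    }

    long long best = 0;
    for (const auto& entry : occurrences) {
        long long score = static_cast<long long>(entry.first.size()) * entry.second;
        best = std::max(best, score);
    }
    return best;
}
```

For the four samples above this returns 9, from the prefix ACG shared by three samples. It is only meant as a reference for small inputs; copying every prefix into a hash map is considerably slower than a trie.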

Input
Input starts with an integer T (≤ 10), denoting the number of test cases.

Each case starts with a line containing an integer n (1 ≤ n ≤ 50000) denoting the number of DNA samples. Each of the next n lines contains a non-empty string whose length is not greater than 50. The strings contain only characters from {A, C, G, T}.

Output
For each case, print the case number and the maximum result that can be obtained.

Sample Input
3
4
ACGT
ACGTGCGT
ACCGTGC
ACGCCGT
3
CGCGCGCGCGCGCCCCGCCCGCGC
CGCGCGCGCGCGCCCCGCCCGCAC
CGCGCGCGCGCGCCCCGCCCGCTC
2
CGCGCCGCGCGCGCGCGCGC
GGCGCCGCGCGCGCGCGCTC

Sample Output
Case 1: 9
Case 2: 66
Case 3: 20
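
The problem asks for the maximum, over all subsets of the strings, of the longest common prefix length multiplied by the number of strings in the subset. It is not hard: cnt[u] stores how many strings share the prefix that ends at node u, so it is enough to update the maximum while inserting each string.

Even so, I got RE and WA more than ten times! The judge's compiler choked somewhere on const and macro definitions, so I switched to C++. Then came the real trap: the statement says there are 50000 strings, but the trie arrays only became large enough at 2500000 entries. The arrays are indexed by trie node, not by string, and in the worst case every character creates a new node: 50000 strings × 50 characters = 2500000.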
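
To see how many trie nodes the limits can actually produce, here is a small sketch of the arithmetic. It is an illustration only, and the function name maxTrieNodes is hypothetical. At depth k a trie over a 4-letter alphabet holds at most 4^k nodes, and also at most n nodes, because each string contributes one prefix per depth.

```cpp
#include <algorithm>

// Hypothetical helper: upper bound on the number of trie nodes (root included)
// for n strings of length <= maxLen over an alphabet of the given size.
long long maxTrieNodes(long long n, int maxLen, int alphabet) {
    long long total = 1;     // the root
    long long levelCap = 1;  // alphabet^depth, clamped at n to avoid overflow
    for (int depth = 1; depth <= maxLen; ++depth) {
        levelCap = std::min(levelCap * alphabet, n);
        total += levelCap;   // at most levelCap distinct prefixes of this length
    }
    return total;
}
```

For the limits of this problem, maxTrieNodes(50000, 50, 4) evaluates to 2171845, so an array of 2500000 nodes leaves some headroom, while anything sized around 50000 overflows quickly.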

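#include <stdio.h>
#include <string.h>

// 2500000 = 50000 strings * 50 characters, a safe upper bound on the node count.
int trie[2500000][4]; // trie[u][c]: child of node u along letter c, 0 if absent
int cnt[2500000];     // cnt[u]: number of strings sharing the prefix ending at u
int sz;               // number of nodes allocated so far (node 0 is the root)
int cnts = 0;         // case counter for the output
long long ans = 0;
int t, n;
char s[64];           // each string has at most 50 characters

void init()
{
	// Nodes are cleared when they are allocated, so only the root needs a reset.
	sz = 1;
	memset(trie[0], 0, sizeof(trie[0]));
	cnt[0] = 0;
	ans = 0;
}

int getid(char ch)
{
	if (ch == 'A') return 0;
	if (ch == 'C') return 1;
	if (ch == 'G') return 2;
	return 3; // 'T': the input is guaranteed to contain only A, C, G, T
}

void build(const char s[])
{
	int u = 0;
	int len = (int)strlen(s);
	for (int i = 0; i < len; i++)
	{
		int c = getid(s[i]);
		if (!trie[u][c])
		{
			// Allocate a fresh node and clear its children and counter.
			memset(trie[sz], 0, sizeof(trie[sz]));
			cnt[sz] = 0;
			trie[u][c] = sz++;
		}
		u = trie[u][c];
		cnt[u]++; // cnt[u] stores how many strings share the prefix ending here

		// The prefix ending at u has length i + 1 and is shared by cnt[u] strings.
		long long cur = (long long)cnt[u] * (i + 1);
		if (cur > ans)
			ans = cur;
	}
}

int main()
{
	scanf("%d", &t);
	while (t--)
	{
		init();
		scanf("%d", &n);
		for (int i = 0; i < n; i++)
		{
			scanf("%63s", s);
			build(s);
		}
		printf("Case %d: %lld\n", ++cnts, ans);
	}
	return 0;
}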
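
As an alternative to guessing a fixed array size, the trie can grow on demand. The following is a minimal sketch rather than the code submitted for this problem: the struct PrefixTrie and its members are names invented here, and whether it fits LightOJ's time and memory limits has not been checked.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

// Hypothetical growable trie: nodes live in vectors, so no size guess is needed.
struct PrefixTrie {
    std::vector<std::array<int, 4>> child;  // child[u][c] == 0 means "no child"
    std::vector<int> count;                 // count[u]: strings passing through u
    long long best = 0;                     // best length * count seen so far

    PrefixTrie() { newNode(); }             // node 0 is the root

    int newNode() {
        child.push_back(std::array<int, 4>{});  // all four children zeroed
        count.push_back(0);
        return static_cast<int>(child.size()) - 1;
    }

    static int index(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default:  return -1;  // not a DNA letter
        }
    }

    void insert(const std::string& s) {
        int u = 0;
        for (std::size_t i = 0; i < s.size(); ++i) {
            int c = index(s[i]);
            if (c < 0) return;  // stop at the first unexpected character
            if (child[u][c] == 0) {
                int v = newNode();  // may reallocate, so finish it before indexing
                child[u][c] = v;
            }
            u = child[u][c];
            ++count[u];
            best = std::max(best, static_cast<long long>(count[u]) *
                                      static_cast<long long>(i + 1));
        }
    }
};
```

A fresh PrefixTrie per test case replaces init(): insert every sample and read best afterwards. The temporary v in insert matters, because newNode can reallocate the vector and invalidate a reference to child[u][c] taken before the call.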
    Original author: Trie树
    Original URL: https://blog.csdn.net/u011469138/article/details/82935288