c# – 子集和算法效率

2023年5月16日 155次阅读

我们每天都有许多付款(交易)进入我们的业务.每笔交易都有一个ID和金额.我们要求将大量这些交易与特定金额进行匹配.例：

Transaction    Amount
1              100
2              200
3              300
4              400
5              500

如果我们想要找到总计600的交易,你将拥有多个集合(1,2,3),(2,4),(1,5).

我发现了一个我已经适应的算法,其工作方式如下所述. 30次交易需要15ms.但是,交易数量平均约为740,最高接近6000.这是一种更有效的方式来执行此搜索吗？

sum_up(TransactionList,remittanceValue,ref MatchedLists);

private static void sum_up(List<Transaction> transactions, decimal target, ref List<List<Transaction>> matchedLists)
{
    sum_up_recursive(transactions, target, new List<Transaction>(), ref matchedLists);
}

private static void sum_up_recursive(List<Transaction> transactions, decimal target, List<Transaction> partial, ref List<List<Transaction>> matchedLists)
{
    decimal s = 0;
    foreach (Transaction x in partial) s += x.Amount;

    if (s == target)
    {
        matchedLists.Add(partial);
    }

    if (s > target)
        return;

    for (int i = 0; i < transactions.Count; i++)
    {
        List<Transaction> remaining = new List<Transaction>();
        Transaction n = new Transaction(0, transactions[i].ID, transactions[i].Amount);
        for (int j = i + 1; j < transactions.Count; j++) remaining.Add(transactions[j]);

        List<Transaction> partial_rec = new List<Transaction>(partial);
        partial_rec.Add(new Transaction(n.MatchNumber, n.ID, n.Amount));
        sum_up_recursive(remaining, target, partial_rec, ref matchedLists);
    }
}

事务定义为：

class Transaction
{
    public int ID;
    public decimal Amount;
    public int MatchNumber;

    public Transaction(int matchNumber, int id, decimal amount)
    {
        ID = id;
        Amount = amount;
        MatchNumber = matchNumber;
    }
}

最佳答案如前所述,您的问题可以通过O(n * G)中的伪多项式算法来解决,其中n个项目和G – 您的目标总和.

第一部分问题：是否有可能实现目标总和G.以下伪/ python代码解决了它(在我的机器上没有C#)：

def subsum(values, target):
    reached=[False]*(target+1) # initialize as no sums reached at all
    reached[0]=True # with 0 elements we can only achieve the sum=0
    for val in values:
        for s in reversed(xrange(target+1)): #for target, target-1,...,0
            if reached[s] and s+val<=target: # if subsum=s can be reached, that we can add the current value to this sum and build an new sum 
                reached[s+val]=True
    return reached[target]

这个想法是什么？让我们考虑值[1,2,3,6]和目标总和7：

>我们从空集开始 – 可能的总和显然为0.
>现在我们来看第一个元素1,并且必须选择是否采取.这留下了可能的总和{0,1}.
>现在看下一个元素2：导致可能的集合{0,1}(不采取){2,3}(采取).
>到目前为止,与您的方法没有多大区别,但现在对于元素3,我们有可能设置a.不考虑{0,1,2,3}和b.取{3,4,5,6}导致{0,1,2,3,4,5,6}成为可能的总和.你的方法的不同之处在于有两种方法可以达到3,你的递归将从那里开始两次(这是不需要的).一遍又一遍地计算基本相同的人员是你的方法的问题以及为什么提出的算法更好.

>作为最后一步,我们考虑6并获得{0,1,2,3,4,5,6,7}作为可能的总和.

但是你还需要导致目标总和的子集,为此我们只记得采用哪个元素来实现当前的子和.此版本返回一个子集,该子集导致目标总和,否则为None：

def subsum(values, target):
    reached=[False]*(target+1)
    val_ids=[-1]*(target+1)
    reached[0]=True # with 0 elements we can only achieve the sum=0

    for (val_id,val) in enumerate(values):
        for s in reversed(xrange(target+1)): #for target, target-1,...,0
            if reached[s] and s+val<=target:
                reached[s+val]=True
                val_ids[s+val]=val_id          

    #reconstruct the subset for target:
    if not reached[target]:
        return None # means not possible
    else:
        result=[]
        current=target
        while current!=0:# search backwards jumping from predecessor to predecessor
           val_id=val_ids[current]
           result.append(val_id)
           current-=values[val_id]
        return result

作为另一种方法,您可以使用memoization加速当前解决方案,记住状态(子项,number_of_elements_not考虑)是否可以实现目标总和.但我想说标准的动态编程在这里是一个不易出错的可能性.