Python: subsets of data with the most common entries

August 6, 2019

I am struggling with the following problem.

Imagine I have a lot of data like this:

one = {'A':'m','B':'n','C':'o'}
two = {'A':'m','B':'n','C':'p'}
three = {'A':'x','B':'n','C':'p'}

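and so on; it does not necessarily have to be stored in dicts.
How can I get the subsets of the data that share the most common entries?

For the example above, I would like to get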

one, two          with same A and B = m,n
two, three        with same B and C = n,p
one, two, three   with same B       = n
one, two          with same A       = m

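Best answer

One approach, though not an efficient one for long dictionaries, is to use
itertools.combinations to generate the combinations of the dictionaries, then loop over each combination and take the intersection of the dictionaries' item sets: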

one = {'one':{'A':'m','B':'n','C':'o'}}
two ={'two':{'A':'m','B':'n','C':'p'}}
three = {'three':{'A':'x','B':'n','C':'p'}}

dict_list=[one,two,three]
v_item=[i.items() for i in dict_list]

from itertools import combinations
names=[]
items=[]
l=[combinations(v_item,i) for i in range(2,4)]
flat=[[[t[0] for t in k] for k in j] for j in l]  
"""This line flattens the combinations. Each i.items() is a one-element list, so every element of a combination is wrapped in a list:
>>> l
[[([('one', {'A': 'm', 'C': 'o', 'B': 'n'})], [('two', {'A': 'm', 'C': 'p', 'B': 'n'})]), 
([('one', {'A': 'm', 'C': 'o', 'B': 'n'})], [('three', {'A': 'x', 'C': 'p', 'B': 'n'})]), 
([('two', {'A': 'm', 'C': 'p', 'B': 'n'})], [('three', {'A': 'x', 'C': 'p', 'B': 'n'})])], 
[([('one', {'A': 'm', 'C': 'o', 'B': 'n'})], [('two', {'A': 'm', 'C': 'p', 'B': 'n'})], [('three', {'A': 'x', 'C': 'p', 'B': 'n'})])]]"""


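# Python 2 only: print statement, dict.viewitems() and the built-in reduce
for comb in flat:
    for pair in comb:
        # split each combination into its names and its dicts
        names, items = zip(*pair)
        items = [i.viewitems() for i in items]
        # intersect the item views of all dicts in the combination
        print names, reduce(lambda x, y: x & y, items)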

Result:

('one', 'two') set([('B', 'n'), ('A', 'm')])
('one', 'three') set([('B', 'n')])
('two', 'three') set([('B', 'n'), ('C', 'p')])
('one', 'two', 'three') set([('B', 'n')])
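
The code above runs only on Python 2. As a rough sketch of the same idea for Python 3, the version below assumes the three records are kept in a single dict keyed by name; the names `records` and `shared_entries` are made up for this illustration and are not part of the original answer. It still tries every combination, so it is no more efficient than the original.

```python
from functools import reduce
from itertools import combinations

# Assumption: the named records live in one dict instead of three separate ones.
records = {
    'one': {'A': 'm', 'B': 'n', 'C': 'o'},
    'two': {'A': 'm', 'B': 'n', 'C': 'p'},
    'three': {'A': 'x', 'B': 'n', 'C': 'p'},
}

def shared_entries(records):
    """Map each combination of record names to the (key, value) pairs they all share."""
    result = {}
    for size in range(2, len(records) + 1):
        # combinations() over a dict yields tuples of its keys (the record names)
        for names in combinations(records, size):
            views = [records[name].items() for name in names]
            # items views support set intersection with & (values must be hashable)
            common = reduce(lambda x, y: x & y, views)
            if common:
                result[names] = set(common)
    return result

for names, common in shared_entries(records).items():
    print(names, sorted(common))
```

Combinations with nothing in common are skipped here, which is a design choice of this sketch rather than something the original code does.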

Regarding the following lines:

        items = [i.viewitems() for i in items]
        print names, reduce(lambda x, y: x & y, items)

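You need to create a view object of each dictionary's items, which behaves like a set; you can then compute the intersection of the items with the & operator.
The reduce function applies that operator across all of the views in a combination.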
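
To see the two pieces in isolation, here is a small sketch for Python 3, where `dict.items()` already returns a view (the counterpart of Python 2's `viewitems()`) and `reduce` has to be imported from `functools`. The variable names `pair_common` and `all_common` are chosen only for this example.

```python
from functools import reduce

one = {'A': 'm', 'B': 'n', 'C': 'o'}
two = {'A': 'm', 'B': 'n', 'C': 'p'}
three = {'A': 'x', 'B': 'n', 'C': 'p'}

# An items view acts like a set of (key, value) pairs when the values are hashable,
# so & returns the pairs present in both dicts.
pair_common = one.items() & two.items()

# reduce applies & from left to right: (one & two) & three.
all_common = reduce(lambda x, y: x & y, [d.items() for d in (one, two, three)])

print(sorted(pair_common))
print(sorted(all_common))
```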