Processing Excel in Python with pandas: finding duplicate data

 Read the Excel file and find the duplicated data

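import pandas as pd

# Load the worksheet (reading .xls files requires the xlrd package)
df = pd.read_excel(r'project.xls', sheet_name='Sheet1')
data = {}
# Collect every title that occurs more than once
dupList = [k for k, v in df['title'].value_counts().to_dict().items() if v > 1]
print(type(dupList), len(dupList), dupList)
# Map each duplicated title to the list of ids that share it
for i in dupList:
    d = df[df['title'] == i]['id'].tolist()
    data[i] = d
print(data)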

Result:

{'title1': ['2110251552596668', '2110251913137755', '2110251930146802'], …}
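
The same mapping can also be built without an explicit loop. The following is a minimal sketch rather than the original author's method: it uses `DataFrame.duplicated` with `keep=False` to select every row whose title is repeated, then groups the ids by title. The sample DataFrame and the helper name `find_duplicate_ids` are made up for illustration, so the example runs without `project.xls`.

```python
import pandas as pd

# Hypothetical sample data standing in for the contents of project.xls
df = pd.DataFrame({
    'id': ['1001', '1002', '1003', '1004', '1005'],
    'title': ['title1', 'title2', 'title1', 'title3', 'title1'],
})

def find_duplicate_ids(frame, key='title', value='id'):
    """Return {key: [values, ...]} for every key that occurs more than once."""
    # keep=False marks every row of a duplicated group, not only the repeats
    dup_rows = frame[frame.duplicated(subset=key, keep=False)]
    # sort=False keeps the groups in order of first appearance
    return dup_rows.groupby(key, sort=False)[value].agg(list).to_dict()

data = find_duplicate_ids(df)
print(data)  # {'title1': ['1001', '1003', '1005']}
```

This version filters the DataFrame once instead of once per duplicated title, which may matter for sheets with many repeated titles.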

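    Original author: bismillahhh
    Original URL: https://blog.csdn.net/qq_41617060/article/details/121138683
    This article is reposted from the web solely to share knowledge. If it infringes any rights, please contact the blogger to have it removed.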