Given:
import pandas as pd

df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                   "which": ["A", "A", "A", "B", "B", "B"],
                   "score": [88, 80, 90, 92, 95, 99]})
df.set_index(['panum', 'which'], inplace=True)
df
             score
panum which
PA1   A         88
      A         80
      A         90
PA2   B         92
      B         95
      B         99
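Is it possible to write something that creates a new index entry in 'which' holding the maximum score for that level, so that it creates two new rows, (PA1, max) and (PA2, max)?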
Update
I have corrected the index. The example above is not what I meant.
panum  factor  score
PA1    init       90
       resub      94
       final      93
PA2    init       60
       resub      90
       final      88
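My question in this better scenario is: "I want to create a new 'panum' called mean, which will have three rows: (mean, init), (mean, resub), (mean, final)."
The pseudocode would be something like df['mean'] = (df['pa1'] + df['pa2']) / 2
I know this is a different question!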
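Best answer: You can create a new DataFrame of the maximum values, add a second index level with the value 'max', append it to the original, and finally call sort_index: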
# Note: max(level=...) was removed in pandas 2.0; this line needs an older pandas.
m = df.max(level=0).assign(max='max').set_index('max', append=True)
print (m)
           score
panum max
PA1   max     90
PA2   max     99
# Note: DataFrame.append was removed in pandas 2.0; this line needs an older pandas.
df = df.append(m).sort_index()
print (df)
             score
panum which
PA1   A         88
      A         80
      A         90
      max       90
PA2   B         92
      B         95
      B         99
      max       99
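On pandas 2.0 and later, where max(level=...) and DataFrame.append no longer exist, the same result can probably be reached with groupby(level=...) and pd.concat. The following is only a sketch under that assumption. The variable name out is mine, and naming the new level 'which' instead of 'max' is a choice made here so that both frames share the same index names; it is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                   "which": ["A", "A", "A", "B", "B", "B"],
                   "score": [88, 80, 90, 92, 95, 99]}).set_index(["panum", "which"])

# Maximum score per panum; groupby(level=...) stands in for max(level=0)
m = df.groupby(level="panum").max()

# Label the new rows 'max' in a level named 'which' (assumption: matching
# level names keeps the combined index named panum/which)
m = m.assign(which="max").set_index("which", append=True)

# pd.concat stands in for DataFrame.append
out = pd.concat([df, m]).sort_index()
print(out)
```

The rows (PA1, max) and (PA2, max) can then be picked out with out.xs("max", level="which").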
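Edited answer: for the mean, the solution aggregates over the second level, and swaplevel reorders the index levels so the result aligns correctly with the final DataFrame: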
df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                   "factor": ["init", "resub", "final"] * 2,
                   "score": [90, 94, 93, 60, 90, 88]})
df.set_index(['panum', 'factor'], inplace=True)
print (df)
              score
panum factor
PA1   init       90
      resub      94
      final      93
PA2   init       60
      resub      90
      final      88
# Note: mean(level=...) was removed in pandas 2.0; this line needs an older pandas.
m = (df.mean(level=1)
       .assign(factor='mean')
       .set_index('factor', append=True)
       .swaplevel(0,1))
print (m)
               score
factor factor
mean   init     75.0
       resub    92.0
       final    90.5
# Note: DataFrame.append was removed in pandas 2.0; this line needs an older pandas.
df = df.append(m)
print (df)
              score
panum factor
PA1   init     90.0
      resub    94.0
      final    93.0
PA2   init     60.0
      resub    90.0
      final    88.0
mean  init     75.0
      resub    92.0
      final    90.5
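For pandas 2.0 and later, a rough equivalent of the mean solution might look like the sketch below. It assumes groupby(level=...) with sort=False in place of mean(level=1) and pd.concat in place of append. The variable name out is mine, and the helper column is called 'panum' here (rather than 'factor' as in the answer) so that the new rows end up with the same index names as the original frame.

```python
import pandas as pd

df = pd.DataFrame({"panum": ["PA1", "PA1", "PA1", "PA2", "PA2", "PA2"],
                   "factor": ["init", "resub", "final"] * 2,
                   "score": [90, 94, 93, 60, 90, 88]}).set_index(["panum", "factor"])

# Mean per factor; sort=False keeps the init/resub/final order of appearance
m = df.groupby(level="factor", sort=False).mean()

# Add a 'panum' level holding the label 'mean', then move it to the front
m = (m.assign(panum="mean")
       .set_index("panum", append=True)
       .swaplevel(0, 1))

# pd.concat stands in for DataFrame.append; scores become floats
out = pd.concat([df, m])
print(out)
```

Because the means are floats, the whole score column is upcast to float64 in the combined frame, as in the output shown in the answer.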