python – Pandas fillna() based on a specific column attribute

August 3, 2019, 391 views

Suppose I have this table:

Type   Killed   Survived
Dog    5        2
Dog    3        4
Cat    1        7
Dog    NaN      3
cow    NaN      2

One of the Killed values is missing where [Type] = Dog.

I want to impute the mean into [Killed] where [Type] = Dog.

My code is as follows:

> Find the mean

df[df['Type'] == 'Dog'].mean().round()

This gives me the mean (about 2.25).

> Impute the mean (this is where the problem starts)

df.loc[(df['Type'] == 'Dog') & (df['Killed'])].fillna(2.25, inplace=True)

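The code runs, but the value is not imputed; the NaN value is still there.

My question is: how do I impute the mean into [Killed] for the rows where [Type] = Dog?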

Best answer: this works for me:

# .ix was removed in pandas 1.0; .loc does the same label-based selection here
df.loc[df['Type'] == 'Dog', 'Killed'] = df.loc[df['Type'] == 'Dog', 'Killed'].fillna(2.25)
print (df)
  Type  Killed  Survived
0  Dog    5.00         2
1  Dog    3.00         4
2  Cat    1.00         7
3  Dog    2.25         3
4  cow     NaN         2
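
As a minimal sketch of why the attempt in the question leaves the NaN in place: selecting rows with a boolean mask returns a new object, so calling fillna on that selection never touches df. Assigning the filled values back through .loc does. The names is_dog, temp and still_missing are only illustrative, and the fill value 2.25 is taken from the question as-is.

```python
import numpy as np
import pandas as pd

# Rebuild the table from the question.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Cat", "Dog", "cow"],
    "Killed": [5, 3, 1, np.nan, np.nan],
    "Survived": [2, 4, 7, 3, 2],
})

is_dog = df["Type"] == "Dog"

# Filling a filtered selection only changes a temporary copy.
temp = df[is_dog].fillna(2.25)
still_missing = df.loc[is_dog, "Killed"].isna().sum()  # df itself is unchanged

# Assigning the filled column back through .loc updates df.
df.loc[is_dog, "Killed"] = df.loc[is_dog, "Killed"].fillna(2.25)
print(df)
```

The cow row keeps its NaN because the mask only selects the Dog rows.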

If you need to fillna with a Series, because the mean covers two columns, Killed and Survived:

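# numeric_only=True skips the string column Type, which recent pandas versions refuse to average
m = df[df['Type'] == 'Dog'].mean(numeric_only=True).round()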
print (m)
Killed      4.0
Survived    3.0
dtype: float64

df.loc[df['Type'] == 'Dog'] = df.loc[df['Type'] == 'Dog'].fillna(m)
print (df)
  Type  Killed  Survived
0  Dog     5.0         2
1  Dog     3.0         4
2  Cat     1.0         7
3  Dog     4.0         3
4  cow     NaN         2
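
To illustrate what passing a Series to fillna does: each entry of the Series is matched to the column with the same name, so every column is filled with its own value. The frame below is a small made-up example (the name demo and its values are assumptions, not data from the question) with one gap per column.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value in each column.
demo = pd.DataFrame({
    "Killed": [5.0, 3.0, np.nan],
    "Survived": [2.0, np.nan, 4.0],
})

# Column-wise means: Killed -> 4.0, Survived -> 3.0
m = demo.mean()

# fillna matches the index of m against the column names of demo.
filled = demo.fillna(m)
print(filled)
```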

If you need fillna only in the Killed column:

# omit round() if you don't need rounding
m = round(df.loc[df['Type'] == 'Dog', 'Killed'].mean())
print (m)
4

df.loc[df['Type'] == 'Dog', 'Killed'] = df.loc[df['Type'] == 'Dog', 'Killed'].fillna(m)
print (df)
  Type  Killed  Survived
0  Dog     5.0         2
1  Dog     3.0         4
2  Cat     1.0         7
3  Dog     4.0         3
4  cow     NaN         2

You can also store the filtered column and reuse it:

filtered = df.loc[df['Type'] == 'Dog', 'Killed']
print (filtered)
0    5.0
1    3.0
3    NaN
Name: Killed, dtype: float64

df.loc[df['Type'] == 'Dog', 'Killed'] = filtered.fillna(filtered.mean())
print (df)
  Type  Killed  Survived
0  Dog     5.0         2
1  Dog     3.0         4
2  Cat     1.0         7
3  Dog     4.0         3
4  cow     NaN         2
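
If the same pattern is needed in several places, it could be wrapped in a small helper. This is only a sketch: fill_mean_where is a hypothetical name, not a pandas function, and it assumes you want a new DataFrame back rather than an in-place change.

```python
import numpy as np
import pandas as pd


def fill_mean_where(df, mask, column):
    """Return a copy of df where NaN in `column`, within the rows selected
    by `mask`, is replaced by the mean of `column` over those same rows."""
    out = df.copy()
    selected = out.loc[mask, column]
    out.loc[mask, column] = selected.fillna(selected.mean())
    return out


df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Cat", "Dog", "cow"],
    "Killed": [5, 3, 1, np.nan, np.nan],
    "Survived": [2, 4, 7, 3, 2],
})

result = fill_mean_where(df, df["Type"] == "Dog", "Killed")
print(result)
```

Working on a copy keeps the original df untouched, which makes it easy to compare the data before and after the fill.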