I have a data frame:
import pandas as pd

df = pd.DataFrame({
    "species": ["cat", "dog", "dog", "cat", "cat"],
    "weight": [5, 4, 3, 7, None],
    "length": [12, None, 13, 14, 15],
})
  species  weight  length
0     cat     5.0    12.0
1     dog     4.0     NaN
2     dog     3.0    13.0
3     cat     7.0    14.0
4     cat     NaN    15.0
I want to fill in the missing values with the mean for that species, i.e.:
df.loc[1,"length"] = 13 # the average dog length
df.loc[4,"weight"] = 6 # (5+7)/2 the average cat weight
How can I do that?
(Presumably I need to pass value=DataFrame to df.fillna, but I don't see an easy way to construct that frame.)
df.fillna(df.groupby('species').transform('mean'))
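returns the original frame with each NaN replaced by the mean of its species group (13.0 for the missing dog length, 6.0 for the missing cat weight).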
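
For anyone who wants to check this end to end, here is a minimal, self-contained sketch of the same groupby/transform/fillna idea. The names numeric_cols, group_means and filled are only illustrative, and selecting the numeric columns explicitly is an assumption added here so the sketch does not depend on how a given pandas version treats non-numeric columns.

```python
import pandas as pd

df = pd.DataFrame({
    "species": ["cat", "dog", "dog", "cat", "cat"],
    "weight": [5, 4, 3, 7, None],
    "length": [12, None, 13, 14, 15],
})

# Columns to fill; chosen explicitly here (an assumption for this sketch).
numeric_cols = ["weight", "length"]

# Per-species means, broadcast back onto the original row index,
# so group_means has one row per row of df.
group_means = df.groupby("species")[numeric_cols].transform("mean")

# fillna aligns on index and column labels, so only the NaN cells
# in "weight" and "length" are replaced; "species" is left untouched.
filled = df.fillna(group_means)
print(filled)
```

Because transform returns a frame with the same index as df, it can be passed to fillna directly, which is the frame the question was looking for.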