Given two dataframes with the same format:
df1
Counterparty Product Deal Date Value
foo bar Buy 01/01/24 10.00
foo bar Buy 01/01/24 10.00
foo bar Sell 01/01/24 10.00
foo bar Sell 01/01/24 10.00
fizz bar Buy 01/01/24 10.00
fizz bar Buy 01/01/24 10.00
fizz buzz Sell 01/01/24 10.00
fizz buzz Sell 01/01/24 10.00
df2
Counterparty Product Deal Date Value
foo bar Buy 01/01/24 11.00
foo bar Buy 01/01/24 09.00
foo bar Sell 01/01/24 09.00
foo bar Sell 01/01/24 10.00
fizz bar Buy 01/01/24 12.00
fizz bar Buy 01/01/24 08.00
fizz buzz Sell 01/01/24 09.00
fizz buzz Sell 01/01/24 10.00
So far I have done this:
out = pd.pivot_table(df1, values = 'Value', index='Counterparty', columns = 'Product', aggfunc='sum').reset_index().rename_axis(None, axis=1)
out = out.fillna(0)
Counterparty bar buzz
0 fizz 20.0 20.0
1 foo 40.0 0.0
But how can I pivot these to create a view like this:
Counterparty Bar Buzz Total col1 col2
foo 40 0 40 39 1
fizz 20 20 40 39 1
where col1 comes from df2, and col2 is the difference between Total and col1.
Sample:
import pandas as pd

df1 = pd.DataFrame({
    "Counterparty": ["foo", "foo", "foo", "foo", "fizz", "fizz", "fizz", "fizz"],
    "Product": ["bar", "bar", "bar", "bar", "bar", "bar", "buzz", "buzz"],
    "Deal": ["Buy", "Buy", "Sell", "Sell", "Buy", "Buy", "Sell", "Sell"],
    "Date": ["01/01/24", "01/01/24", "01/01/24", "01/01/24", "01/01/24", "01/01/24", "01/01/24", "01/01/24"],
    "Value": [10, 10, 10, 10, 10, 10, 10, 10]
})
df2 = pd.DataFrame({
    "Counterparty": ["foo", "foo", "foo", "foo", "fizz", "fizz", "fizz", "fizz"],
    "Product": ["bar", "bar", "bar", "bar", "bar", "bar", "buzz", "buzz"],
    "Deal": ["Buy", "Buy", "Sell", "Sell", "Buy", "Buy", "Sell", "Sell"],
    "Date": ["01/01/24", "01/01/24", "01/01/24", "01/01/24", "01/01/24", "01/01/24", "01/01/24", "01/01/24"],
    "Value": [11, 9, 9, 10, 12, 8, 9, 10]
})
out = pd.pivot_table(df1, values = 'Value', index='Counterparty', columns = 'Product', aggfunc='sum').reset_index().rename_axis(None, axis=1)
out = out.fillna(0)
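The Total column can be generated by summing all existing columns except the first. This must be done before the other columns are added, otherwise they would be included in the sum.
The col1 column is built with a groupby on df2 that sums Value per Counterparty, followed by a merge on Counterparty and a rename of the column.
col2 is simple: it is Total minus col1.
In short, you can merge the grouped sums of df2 into the pivot table of df1, then use assign to add the missing columns.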
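The steps described above could be sketched roughly as follows. This is one possible way to chain them, not necessarily the exact code the answer had in mind. The intermediate names `pivot` and `col1` are made up for illustration, and the sample frames are trimmed to the columns the calculation uses.

```python
import pandas as pd

# Trimmed copies of the sample data (Deal and Date are not needed here).
df1 = pd.DataFrame({
    "Counterparty": ["foo"] * 4 + ["fizz"] * 4,
    "Product": ["bar"] * 6 + ["buzz"] * 2,
    "Value": [10] * 8,
})
df2 = pd.DataFrame({
    "Counterparty": ["foo"] * 4 + ["fizz"] * 4,
    "Product": ["bar"] * 6 + ["buzz"] * 2,
    "Value": [11, 9, 9, 10, 12, 8, 9, 10],
})

# Pivot df1 exactly as in the question, filling missing products with 0.
pivot = (
    pd.pivot_table(df1, values="Value", index="Counterparty",
                   columns="Product", aggfunc="sum")
    .reset_index()
    .rename_axis(None, axis=1)
    .fillna(0)
)

# Grouped sum of df2 per Counterparty, renamed to col1 (hypothetical helper frame).
col1 = (
    df2.groupby("Counterparty", as_index=False)["Value"]
    .sum()
    .rename(columns={"Value": "col1"})
)

out = (
    pivot
    # Total first: sum every column except the first (Counterparty).
    .assign(Total=lambda d: d.iloc[:, 1:].sum(axis=1))
    # Bring in the df2 totals.
    .merge(col1, on="Counterparty", how="left")
    # col2 is the difference between Total and col1.
    .assign(col2=lambda d: d["Total"] - d["col1"])
)
print(out)
```

Note that the pivot sorts Counterparty alphabetically, so fizz appears before foo; if the order in the desired output matters, the rows would need to be reordered afterwards.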