我有 5 分钟间隔的 OHLC 数据。我想将其转换为每小时的数据,但我希望这些值是在整点一刻之间,也就是在 9:15-10:15、10:15-11:15、11:15-12:15 等等之间。以下代码可以在整点测量之间执行此操作。我该如何修改代码以获得我想要的行为?
import pandas as pd
df = pd.read_csv('data.csv')
df['Datetime'] = pd.to_datetime(df['Datetime'])
df.set_index('Datetime', inplace=True)
hourly_data = df.resample('H').agg({'Open': 'first', 'High': 'max', 'Low': 'min', 'Close': 'last', 'Volume': 'sum'})
示例数据.csv:
Datetime,Open,High,Low,Close,Volume
2015-02-03 09:15:00+05:30,200.4,202,199.1,200.6,286228
2015-02-03 09:20:00+05:30,200.6,201.5,200.45,201.15,110749
2015-02-03 09:25:00+05:30,201.2,201.7,201.15,201.65,108342
2015-02-03 09:30:00+05:30,201.75,202.5,201.45,202.2,162976
2015-02-03 09:35:00+05:30,202.2,202.2,201.4,202,118370
2015-02-03 09:40:00+05:30,201.9,202,201.5,201.7,49873
2015-02-03 09:45:00+05:30,201.7,201.85,201.6,201.65,27798
2015-02-03 09:50:00+05:30,201.65,201.7,201.5,201.5,41560
2015-02-03 09:55:00+05:30,201.5,201.75,201.5,201.65,105351
2015-02-03 10:00:00+05:30,201.65,201.9,201.6,201.8,79346
2015-02-03 10:05:00+05:30,201.65,201.9,201.65,201.75,23715
2015-02-03 10:10:00+05:30,201.65,202.85,201.65,202.8,112551
2015-02-03 10:15:00+05:30,202.85,203.2,202.65,202.95,165270
2015-02-03 10:20:00+05:30,202.95,203.25,202.8,203,105614
2015-02-03 10:25:00+05:30,203,203.4,203,203.2,51944
2015-02-03 10:30:00+05:30,203.2,203.8,203,203.75,78474
2015-02-03 10:35:00+05:30,203.75,204.35,203.75,203.75,125514
2015-02-03 10:40:00+05:30,203.75,205,203.75,204.9,249231
2015-02-03 10:45:00+05:30,204.9,206.2,204.6,205.6,388224
2015-02-03 10:50:00+05:30,205.85,205.85,204.7,205,118267
2015-02-03 10:55:00+05:30,205,205.25,204.4,204.9,72310
2015-02-03 11:00:00+05:30,204.9,205.35,204.05,205,94266
2015-02-03 11:05:00+05:30,205,205.4,204.8,205,53333
resample
加上:'h'
offset
15m
基本上有两种选择。
您将
origin
的默认时间(时间序列中第一天的午夜)更改为 15 分钟后。如果您的时间序列始终在 的 15 分钟后开始,则这非常简单:df.resample('H', origin='start')...
您使用参数
offset
将 15 分钟添加到原点:df.resample('H', offset='15min')...