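My question is whether there is a way in Polars to make the values at the start of a rolling window null until the entire window can be filled. For example: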
import polars as pl

dates = [
    "2020-01-01",
    "2020-01-02",
    "2020-01-03",
    "2020-01-04",
    "2020-01-05",
    "2020-01-06",
    "2020-01-01",
    "2020-01-02",
    "2020-01-03",
    "2020-01-04",
    "2020-01-05",
    "2020-01-06",
]
df = pl.DataFrame(
    {
        "dt": dates,
        "a": [3, 4, 2, 8, 10, 1, 1, 7, 5, 9, 2, 1],
        "b": ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No"],
    }
).with_columns(pl.col("dt").str.strptime(pl.Date))
# Sort by date so the index column is ordered within each group.
# (No set_sorted() before this point: the raw dates are not yet in order.)
df = df.sort(by="dt")
df.rolling(index_column="dt", period="2d", group_by="b").agg(
    pl.col("a").mean().alias("ma_2d")
)
Result:
b dt ma_2d
str date f64
"Yes" 2020-01-01 3.0
"Yes" 2020-01-02 3.5
"Yes" 2020-01-03 3.0
"Yes" 2020-01-04 5.0
"Yes" 2020-01-05 9.0
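In this case, I expected the first day to be null, because there are not yet two days to fill the window. But Polars seems to simply truncate the window to fit the start date.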
You could check the length of each window and keep the mean only when the window is full.
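Alternatively, there is a dedicated .rolling_mean_by() method, which supports min_periods. There is a feature request (#12798) to implement a min_periods / min_samples parameter to this effect, which was also discussed in issue #12049.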