在准备另一个问题的答案时,我编写了一个查询,其中包含具有相同子句的多个窗口函数OVER(...)
。结果符合预期。
select ...
sum(sum(s.price)) over (partition by p.productid
order by c.date
rows between 6 preceding and current row)
/ nullif(
sum(count(s.price)) over(partition by p.productid
order by c.date
rows between 6 preceding and current row),
0)
as avg7DayPrice
...
当我尝试OVER(...)
用共享WINDOW
子句引用OVER Last7Days
(后来定义为 )替换重复子句时WINDOW Last7Days AS (...)
,结果却出乎意料地不同。(请参阅WINDOW
子句- 2022 年新增。)
select ...
sum(sum(s.price)) over last7days
/ nullif(sum(count(s.price)) over last7days, 0)
as avg7DayPrice
...
window last7days as (partition by p.productid
order by c.date
rows between 6 preceding and current row)
完整原始查询:
-- Inline window over() clauses
-- Results are as expected
with calendar as (
select min(date) as date, max(date) as endDate
from sales
union all
select dateadd(day, 1, date), endDate
from calendar
where date < enddate
),
products as (
select distinct productid
from sales
)
select
p.productid,
c.date as salesDate,
avg(price) as avg1DayPrice,
sum(sum(s.price)) over (partition by p.productid
order by c.date
rows between 6 preceding and current row)
/ nullif(
sum(count(s.price)) over(partition by p.productid
order by c.date
rows between 6 preceding and current row),
0)
as avg7DayPrice
from calendar c
cross join products p
left join sales s
on s.date = c.date
and s.productid = p.productid
group by p.productid, c.date
完整修改的查询:
-- Reference to common defined window clause
-- Expect same results, but that is not what I get.
with calendar as (
select min(date) as date, max(date) as endDate
from sales
union all
select dateadd(day, 1, date), endDate
from calendar
where date < enddate
),
products as (
select distinct productid
from sales
)
select
p.productid,
c.date as salesDate,
avg(price) as avg1DayPrice,
sum(sum(s.price)) over last7days
/ nullif(
sum(count(s.price)) over last7days,
0)
as avg7DayPrice
from calendar c
cross join products p
left join sales s
on s.date = c.date
and s.productid = p.productid
group by p.productid, c.date
window last7days as (partition by p.productid
order by c.date
rows between 6 preceding and current row)
销售表:
产品编号 | 日期 | 价格 |
---|---|---|
1 | 2025-02-01 | 10.00 |
1 | 2025-02-02 | 20.00 |
1 | 2025-02-02 | 30.00 |
1 | 2025-02-03 | 40.00 |
原始结果:
产品编号 | 销售日期 | 平均单日价格 | 平均7天价格 |
---|---|---|---|
1 | 2025-02-01 | 10.000000 | 10.000000 |
1 | 2025-02-02 | 25.000000 | 20.000000 |
1 | 2025-02-03 | 40.000000 | 25.000000 |
修改结果:
产品编号 | 销售日期 | 平均单日价格 | 平均7天价格 |
---|---|---|---|
1 | 2025-02-01 | 10.000000 | 2.500000 |
1 | 2025-02-02 | 25.000000 | 15.000000 |
1 | 2025-02-03 | 40.000000 | 25.000000 |
我不明白为什么结果不同。看起来sum(count(s.price)) over(...)
计算部分是1, 3, 4
针对第一个查询的,但4, 4, 4
第二个查询却不同。执行计划也有很大不同。有人能解释一下吗?
这肯定是一个错误(报告在这里),一个更简单的再现是
返回
问题在于,它
NULLIF
扩展为一个表达式,其中第一个参数出现两次,并且窗口框架仅正确保留了第一次引用。即,它不是按照上面的方法扩展,而是
correct_expansion
扩展为faulty_expansion
变体。运行以下命令...
在部分输出中显示了这一点(第二个之后没有窗口框架
stopSum
)