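There are several similar questions (e.g. https://dba.stackexchange.com/questions/72419/filling-in-missing-dates-in-record-set-from-generate-series), but the solutions don't seem to work for my case... Essentially, I'm trying to generate rows with a count of zero for dates that have no entries in the series, but I suspect the problem is that I have to extract the date value from a timestamp? I've used SQL for years but I'm pretty new to postgres - impressed so far. I've tried left and right joins here, with no luck...
Here's a small test case (is including the SQL statements still encouraged?):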
-- temp test table - works as expected
WITH incomplete_data(payment_date, payment_id) AS (
VALUES
('2024-09-06 11:26:57.509429+01'::timestamp with time zone, 'uuid01')
,('2024-09-06 12:26:57.509429+01', 'uuid02')
,('2024-09-07 07:26:57.509429+01', 'uuid03')
,('2024-09-08 10:26:57.509429+01', 'uuid05')
,('2024-09-08 12:26:57.509429+01', 'uuid08')
,('2024-09-08 14:26:57.509429+01', 'uuid11')
,('2024-09-10 09:26:57.509429+01', 'uuid23')
)
select * from incomplete_data;
-- generated dates - work as expected
select * FROM (
SELECT generate_series(timestamp '2024-01-01'
, timestamp '2024-01-01' + interval '1 year - 1 day'
, interval '1 day')::date
) d(day)
;
-- join - failing to do what I was hoping..
WITH incomplete_data(payment_date, payment_id) AS (
VALUES
('2024-09-06 11:26:57.509429+01'::timestamp with time zone, 'uuid01')
,('2024-09-06 12:26:57.509429+01', 'uuid02')
,('2024-09-07 07:26:57.509429+01', 'uuid03')
,('2024-09-08 10:26:57.509429+01', 'uuid05')
,('2024-09-08 12:26:57.509429+01', 'uuid08')
,('2024-09-08 14:26:57.509429+01', 'uuid11')
,('2024-09-10 09:26:57.509429+01', 'uuid23')
)
select count(payment_id), date_trunc('day',payment_date)::date as time
FROM (
SELECT generate_series(timestamp '2024-01-01'
, timestamp '2024-01-01' + interval '1 year - 1 day'
, interval '1 day')::date
) d(day)
right JOIN incomplete_data p ON date_trunc('day',payment_date) = d.day
where payment_date BETWEEN '2024-09-01T12:55:36.824Z' AND '2024-09-30T13:55:36.824Z'
GROUP BY date_trunc('day',payment_date)
ORDER BY date_trunc('day',payment_date);
 count |    time
-------+------------
     2 | 2024-09-06
     1 | 2024-09-07
     3 | 2024-09-08
     1 | 2024-09-10
(4 rows)
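I'd like to get one row for each day of the month, with a count of zero for the days that have no payments. The background is that this is to populate a Grafana query.
Can anyone point out what I'm doing wrong? Or am I failing to understand a bigger issue here? My version is: PostgreSQL 15.9 (Debian 15.9-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit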
Update
jjanes's answer below helped me clarify the order of the join and the filter - this is the select I needed:
WITH incomplete_data(payment_date, payment_id) AS (
VALUES
('2024-09-06 11:26:57.509429+01'::timestamp with time zone, 'uuid01')
,('2024-09-06 12:26:57.509429+01', 'uuid02')
,('2024-09-07 07:26:57.509429+01', 'uuid03')
,('2024-09-08 10:26:57.509429+01', 'uuid05')
,('2024-09-08 12:26:57.509429+01', 'uuid08')
,('2024-09-08 14:26:57.509429+01', 'uuid11')
,('2024-09-10 09:26:57.509429+01', 'uuid23')
)
select count(payment_id), d.day as time
FROM (
SELECT generate_series(timestamp '2024-01-01'
, timestamp '2024-01-01' + interval '1 year - 1 day'
, interval '1 day')::date
) d(day)
left JOIN incomplete_data p ON date_trunc('day',payment_date) = d.day
where d.day BETWEEN '2024-09-01T12:55:36.824Z' AND '2024-09-30T13:55:36.824Z'
GROUP BY d.day
ORDER BY d.day
;
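A possible variant of the same idea, as a minimal sketch rather than a definitive version: generate the series for just the reporting window, so no WHERE clause is needed at all. The September 2024 bounds are hard-coded here as an assumption, the sample rows are trimmed, and the `::date` cast on a `timestamp with time zone` depends on the session `TimeZone` setting, so which day a payment lands on may differ between sessions.

```sql
-- Sketch only: the window bounds below are assumed, not taken from Grafana.
WITH incomplete_data(payment_date, payment_id) AS (
    VALUES
     ('2024-09-06 11:26:57.509429+01'::timestamp with time zone, 'uuid01')
    ,('2024-09-07 07:26:57.509429+01', 'uuid03')
    ,('2024-09-10 09:26:57.509429+01', 'uuid23')
)
SELECT count(p.payment_id)   -- counts only non-NULL ids, so empty days give 0
     , d.day::date AS time
FROM generate_series(timestamp '2024-09-01'
                   , timestamp '2024-09-30'
                   , interval '1 day') AS d(day)   -- one row per day in the window
LEFT JOIN incomplete_data p
       ON p.payment_date::date = d.day::date       -- cast uses the session TimeZone
GROUP BY d.day
ORDER BY d.day
;
```

This should return 30 rows, one per day of the window. Any extra condition on the payments (for example a status column, if one existed) would belong in the ON clause rather than in WHERE, because a WHERE test on a column of `p` discards the NULL-extended rows that the left join produces for the empty days.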