Asked: 2024-05-20 20:42:53 +0800 CST

Performance issue with time interval checks

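I am running the following Postgres query, which joins each consumer's usage with the service cost for that time interval, based on when the consumer used a service from the service tables. The query below is a truncated version; sometimes I have to perform 12 or more joins. The problem is that a single query can take 2.5 minutes to run. How can I reduce this time? Is my approach correct?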

select 
  c.consumption, 
  c.interval_start, 
  c.interval_end, 
  s1.value_exc_vat as service1_price, 
  s2.value_exc_vat as service2_price 
from 
  consumer as c 
  left join service1 as s1 on c.interval_start >= timestamp '2023-03-18T00:00:00Z' 
  and c.interval_start < timestamp '2024-02-15T00:00:00Z' 
  and (
    s1.payment_method is null 
    or s1.payment_method = 'DIRECT_DEBIT'
  ) 
  and c.interval_start >= s1.valid_from 
  and (
    c.interval_start < s1.valid_to 
    or s1.valid_to is null
  ) 
  left join service2 as s2 on c.interval_start >= timestamp '2024-02-15T00:00:00Z' 
  and c.interval_start < timestamp '2025-02-15T00:00:00Z' 
  and (
    s2.payment_method is null 
    or s2.payment_method = 'DIRECT_DEBIT'
  ) 
  and c.interval_start >= s2.valid_from 
  and (
    c.interval_start < s2.valid_to 
    or s2.valid_to is null
  ) 
order by 
  c.interval_start desc

I have isolated the problem to this part of the query, which appears in each join:

and c.interval_start >= s1.valid_from 
  and (
    c.interval_start < s1.valid_to 
    or s1.valid_to is null
  )

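It appears that finding the correct interval to join the tables on takes a lot of time. For both the consumer table and the service tables, the time periods can span anywhere from 30 minutes to several months or years, so I can't use a simple equality comparison such as c.interval_start = s1.valid_from and c.interval_end = s1.valid_to.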

Here is the EXPLAIN (ANALYZE, BUFFERS) output:

"QUERY PLAN"
"Sort  (cost=5523752586.99..5549030615.96 rows=10111211585 width=28) (actual time=140588.278..140589.028 rows=20593 loops=1)"
"  Sort Key: c.interval_start DESC"
"  Sort Method: quicksort  Memory: 2216kB"
"  Buffers: shared hit=396"
"  ->  Nested Loop Left Join  (cost=0.00..2633916836.23 rows=10111211585 width=28) (actual time=12.169..140546.307 rows=20593 loops=1)"
"        Join Filter: ((c.interval_start >= '2023-03-18 00:00:00'::timestamp without time zone) AND (c.interval_start < '2024-02-15 00:00:00'::timestamp without time zone) AND (c.interval_start >= s1.valid_from) AND ((c.interval_start < s1.valid_to) OR (s1.valid_to IS NULL)))"
"        Rows Removed by Join Filter: 559372220"
"        Buffers: shared hit=396"
"        ->  Nested Loop Left Join  (cost=0.00..3979716.14 rows=4302977 width=24) (actual time=0.058..27617.147 rows=20593 loops=1)"
"              Join Filter: ((c.interval_start >= '2024-02-15 00:00:00'::timestamp without time zone) AND (c.interval_start < '2025-02-15 00:00:00'::timestamp without time zone) AND (c.interval_start >= s2.valid_from) AND ((c.interval_start < s2.valid_to) OR (s2.valid_to IS NULL)))"
"              Rows Removed by Join Filter: 176848172"
"              Buffers: shared hit=196"
"              ->  Seq Scan on consumer c  (cost=0.00..337.93 rows=20593 width=20) (actual time=0.007..21.813 rows=20593 loops=1)"
"                    Buffers: shared hit=132"
"              ->  Materialize  (cost=0.00..214.29 rows=8588 width=20) (actual time=0.000..0.272 rows=8588 loops=20593)"
"                    Buffers: shared hit=64"
"                    ->  Seq Scan on service2 s2  (cost=0.00..171.35 rows=8588 width=20) (actual time=0.006..0.891 rows=8588 loops=1)"
"                          Filter: ((payment_method IS NULL) OR (payment_method = 'DIRECT_DEBIT'::bpchar))"
"                          Buffers: shared hit=64"
"        ->  Materialize  (cost=0.00..675.37 rows=27164 width=20) (actual time=0.000..0.896 rows=27164 loops=20593)"
"              Buffers: shared hit=200"
"              ->  Seq Scan on service1 s1  (cost=0.00..539.55 rows=27164 width=20) (actual time=0.002..2.830 rows=27164 loops=1)"
"                    Filter: ((payment_method IS NULL) OR (payment_method = 'DIRECT_DEBIT'::bpchar))"
"                    Buffers: shared hit=200"
"Planning Time: 0.107 ms"
"Execution Time: 140590.312 ms"

1 Answer

jjanes · Answer 1 · 2024-05-21T01:47:09+08:00

I think you can replace

  and c.interval_start >= s1.valid_from 
  and (
    c.interval_start < s1.valid_to 
    or s1.valid_to is null
  )

with

and c.interval_start <@ tstzrange(s1.valid_from,s1.valid_to)

which can then use the index

create index on service1 using gist (tstzrange(valid_from,valid_to))

By using a functional (expression) index, you don't need to restructure the table.
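
For illustration, here is how the suggestion might look when applied to the first join of the query in the question. This is a sketch under assumptions, not a tested drop-in: the index names are made up, and it assumes valid_from, valid_to and c.interval_start are all timestamptz. The casts in the posted plan suggest the columns may actually be timestamp without time zone, in which case tsrange would be the matching constructor in both the index and the join condition.

```sql
-- Sketch only. Assumes the timestamp columns are timestamptz;
-- for plain "timestamp" columns, swap tstzrange for tsrange everywhere.

-- Expression indexes on the validity range of each service table.
-- The index names are hypothetical.
create index service1_validity_gist
    on service1 using gist (tstzrange(valid_from, valid_to));
create index service2_validity_gist
    on service2 using gist (tstzrange(valid_from, valid_to));

-- The range constructor defaults to '[)' bounds: lower bound inclusive,
-- upper bound exclusive. A NULL valid_to gives a range with no upper bound,
-- which covers the "or s1.valid_to is null" branch of the original condition.
-- The expression in the join must be written the same way as in the index
-- so that the planner can match the two.
select
  c.consumption,
  c.interval_start,
  c.interval_end,
  s1.value_exc_vat as service1_price
from
  consumer as c
  left join service1 as s1 on c.interval_start >= timestamp '2023-03-18T00:00:00Z'
  and c.interval_start < timestamp '2024-02-15T00:00:00Z'
  and (
    s1.payment_method is null
    or s1.payment_method = 'DIRECT_DEBIT'
  )
  and c.interval_start <@ tstzrange(s1.valid_from, s1.valid_to)
order by
  c.interval_start desc;
```

Two behaviours differ from the original condition and are worth checking against the data: a row with a NULL valid_from becomes a range with no lower bound and will now match, and the range constructor raises an error for any row where valid_from is later than valid_to. Re-running EXPLAIN (ANALYZE, BUFFERS) afterwards would show whether the planner actually picks the new index.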
