
Question

Asked: 2023-01-31 05:41:12 +0800 CST

Why does my query take so long when I filter on an indexed boolean column?


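I have a query that filters on a boolean column that has an index. However, the query takes a long time to complete. When I leave this filter out, the query returns very quickly.

Here are the explain plans. The first query includes processed is true and takes a long time to complete. The second omits it and returns immediately.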

explain select count(*) from listen_events where (started_at >='2021-12-26' and started_at <'2021-12-27') and processed is true;
                                                                                 QUERY PLAN                                                                                 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=212405.62..212405.63 rows=1 width=8)
   ->  Bitmap Heap Scan on listen_events  (cost=187657.78..212390.09 rows=6213 width=0)
         Recheck Cond: ((started_at >= '2021-12-26 00:00:00'::timestamp without time zone) AND (started_at < '2021-12-27 00:00:00'::timestamp without time zone))
         Filter: (processed IS TRUE)
         ->  BitmapAnd  (cost=187657.78..187657.78 rows=6213 width=0)
               ->  Bitmap Index Scan on index_listen_events_on_started_at  (cost=0.00..17323.56 rows=813898 width=0)
                     Index Cond: ((started_at >= '2021-12-26 00:00:00'::timestamp without time zone) AND (started_at < '2021-12-27 00:00:00'::timestamp without time zone))
               ->  Bitmap Index Scan on listen_events_processed_idx  (cost=0.00..170330.87 rows=9125639 width=0)
                     Index Cond: (processed = true)
(9 rows)

explain select count(*) from listen_events where (started_at >='2021-12-26' and started_at <'2021-12-27');
                                                                                 QUERY PLAN                                                                                 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=24549.13..24549.14 rows=1 width=8)
   ->  Gather  (cost=24548.92..24549.13 rows=2 width=8)
         Workers Planned: 2
         ->  Partial Aggregate  (cost=23548.92..23548.93 rows=1 width=8)
               ->  Parallel Index Only Scan using index_listen_events_on_started_at on listen_events  (cost=0.58..22701.11 rows=339124 width=0)
                     Index Cond: ((started_at >= '2021-12-26 00:00:00'::timestamp without time zone) AND (started_at < '2021-12-27 00:00:00'::timestamp without time zone))
(6 rows)

Here is the table definition:

Table "public.listen_events"
     Column     |            Type             | Collation | Nullable |                  Default                  | Storage  | Stats target | Description 
----------------+-----------------------------+-----------+----------+-------------------------------------------+----------+--------------+-------------
 id             | integer                     |           | not null | nextval('listen_events_id_seq'::regclass) | plain    |              | 
 event_type     | text                        |           |          |                                           | extended |              | 
 stream_type    | text                        |           |          |                                           | extended |              | 
 event_id       | text                        |           |          |                                           | extended |              | 
 broadcast_uid  | text                        |           |          |                                           | extended |              | 
 user_agent     | text                        |           |          |                                           | extended |              | 
 city           | text                        |           |          |                                           | extended |              | 
 country        | text                        |           |          |                                           | extended |              | 
 referrer       | text                        |           |          |                                           | extended |              | 
 country_code   | character varying(2)        |           |          |                                           | extended |              | 
 continent_code | character varying(2)        |           |          |                                           | extended |              | 
 user_id        | integer                     |           |          |                                           | plain    |              | 
 started_at     | timestamp without time zone |           |          |                                           | plain    |              | 
 created_at     | timestamp without time zone |           |          |                                           | plain    |              | 
 updated_at     | timestamp without time zone |           |          |                                           | plain    |              | 
 ip_address     | cidr                        |           |          |                                           | main     |              | 
 location       | point                       |           |          |                                           | plain    |              | 
 ended_at       | timestamp without time zone |           |          |                                           | plain    |              | 
 server_id      | text                        |           |          |                                           | extended |              | 
 channel_id     | integer                     |           |          |                                           | plain    |              | 
 id_bigint      | bigint                      |           |          |                                           | plain    |              | 
 processed      | boolean                     |           | not null | false                                     | plain    |              | 
Indexes:
    "listen_events_pkey" PRIMARY KEY, btree (id)
    "index_listen_events_event_id" btree (event_id)
    "index_listen_events_on_broadcast_uid" btree (broadcast_uid)
    "index_listen_events_on_started_at" btree (started_at)
    "index_listen_events_on_user_id" btree (user_id)
    "listen_events_processed_idx" btree (processed)
Options: autovacuum_enabled=true, autovacuum_vacuum_scale_factor=0, autovacuum_vacuum_threshold=30000, autovacuum_vacuum_cost_delay=0, autovacuum_analyze_scale_factor=0, autovacuum_analyze_threshold=30000, toast.autovacuum_enabled=true

The table currently has 1.9 billion rows, most of them with processed = false.

Any clues as to why this happens?

1 Answer

Laurenz Albe · Answer 1 · 2023-01-31T05:56:30+08:00

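You didn't show EXPLAIN (ANALYZE, BUFFERS) output, so I can only guess. Anyway, there are two main differences:

Since there is no single index that serves the query, PostgreSQL combines two indexes. That is somewhat more work than scanning a single index.
The main difference is that the fast query can use an index-only scan, while the slow query cannot.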
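
To move from guessing to measuring, the slow query from the question could be rerun as sketched below. Note that ANALYZE actually executes the statement, so on a table of this size it will take as long as the slow query itself.

```sql
-- Runs the query for real and reports actual row counts, timings,
-- and buffer (page) hits and reads for each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM listen_events
WHERE started_at >= '2021-12-26'
  AND started_at <  '2021-12-27'
  AND processed IS TRUE;
```

In the output, comparing the estimated and actual row counts and the buffer numbers of the two Bitmap Index Scan nodes should show where the time goes.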

I would create a two-column index like this:

CREATE INDEX ON listen_events (processed, started_at);
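
On a table with 1.9 billion rows, a plain CREATE INDEX blocks writes while it builds. A possible variant, assuming that matters here, is sketched below; the index name is made up for illustration.

```sql
-- Hypothetical index name. CONCURRENTLY builds the index without
-- blocking INSERT/UPDATE/DELETE, but it takes longer and cannot be
-- run inside a transaction block.
CREATE INDEX CONCURRENTLY listen_events_processed_started_at_idx
    ON listen_events (processed, started_at);

-- Index-only scans depend on the visibility map, which VACUUM keeps
-- up to date; ANALYZE refreshes the planner statistics as well.
VACUUM (ANALYZE) listen_events;
```

With both filter columns in one index, the planner no longer has to combine two bitmaps, and an index-only scan becomes possible for the count(*) query.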

If you only ever query rows with processed IS TRUE, you can also create a smaller and faster index:

CREATE INDEX ON listen_events (started_at) WHERE processed IS TRUE;
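
As a rough sketch of how the partial index might be built and checked on a live system (the name of the new index is an assumption, not something from the question):

```sql
-- Hypothetical index name; only rows with processed IS TRUE are indexed.
CREATE INDEX CONCURRENTLY listen_events_started_at_processed_true_idx
    ON listen_events (started_at)
    WHERE processed IS TRUE;

-- Compare the size of the partial index with the existing full index
-- on started_at.
SELECT pg_size_pretty(pg_relation_size('listen_events_started_at_processed_true_idx')) AS partial_index,
       pg_size_pretty(pg_relation_size('index_listen_events_on_started_at'))           AS full_index;
```

The planner only considers a partial index when the query's WHERE clause implies the index predicate, so it is safest to keep writing the condition as processed IS TRUE, exactly as in the index definition.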
