
Question

laurent

Asked: 2024-04-12 18:43:42 +0800 CST

What index can be created to optimize this query?


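I have the SQL query below, which runs very slowly. The slowness comes from the ORDER BY clause: Postgres scans the changes table in "counter" order, and that table can hold millions of rows. Removing the ORDER BY clause makes the query fast.

I optimized another query by creating an index on two fields. For this query, however, I am not sure which index is the right one. I tried an index on (item_id, counter), but it did not help at all, and I do not know what else to try. Any suggestions?

Slow SQL query: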

SELECT "id", "item_id", "item_name", "type", "updated_time", "counter"
FROM "changes"
WHERE counter > -1
AND type = 2
AND item_id IN (SELECT item_id FROM user_items WHERE user_id = 'xxxx')
ORDER BY "counter" ASC
LIMIT 200;

EXPLAIN (ANALYZE, BUFFERS, SETTINGS) result:

------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1001.15..27628.99 rows=200 width=99) (actual time=98730.912..116273.818 rows=200 loops=1)
   Buffers: shared hit=78113369 read=3224064 dirtied=3
   I/O Timings: read=137436.119
   ->  Gather Merge  (cost=1001.15..10431526.45 rows=78343 width=99) (actual time=98730.911..116273.783 rows=200 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=78113369 read=3224064 dirtied=3
         I/O Timings: read=137436.119
         ->  Nested Loop  (cost=1.13..10421483.70 rows=32643 width=99) (actual time=98493.185..112919.559 rows=75 loops=3)
               Buffers: shared hit=78113369 read=3224064 dirtied=3
               I/O Timings: read=137436.119
               ->  Parallel Index Scan using changes_pkey on changes  (cost=0.56..5949383.56 rows=6197986 width=99) (actual time=1.076..42523.117 rows=4075591 loops=3)
                     Index Cond: (counter > '-1'::integer)
                     Filter: (type = 2)
                     Rows Removed by Filter: 10370914
                     Buffers: shared hit=18993521 read=2672415
                     I/O Timings: read=85551.814
               ->  Index Scan using user_items_item_id_index on user_items  (cost=0.56..0.72 rows=1 width=23) (actual time=0.017..0.017 rows=0 loops=12226772)
                     Index Cond: ((item_id)::text = (changes.item_id)::text)
                     Filter: ((user_id)::text = 'xxxx'::text)
                     Rows Removed by Filter: 1
                     Buffers: shared hit=59119848 read=551649 dirtied=3
                     I/O Timings: read=51884.305
 Settings: effective_cache_size = '16179496kB', jit = 'off', work_mem = '100000kB'
 Planning Time: 1.465 ms
 Execution Time: 116273.929 ms
(26 rows)

Indexes:

"changes_pkey" PRIMARY KEY, btree (counter)
"changes_id_index" btree (id)
"changes_id_unique" UNIQUE CONSTRAINT, btree (id)
"changes_item_id_index" btree (item_id)
"changes_user_id_counter_index" btree (user_id, counter)
"changes_user_id_index" btree (user_id)

3 Answers


nbk · Answer 1 · 2024-04-12T22:09:10+08:00

You should rewrite your query as

SELECT "id", c."item_id", "item_name", "type", "updated_time", "counter"
FROM "changes" c JOIN (SELECT item_id FROM user_items WHERE user_id = 'xxxx') ui
ON c.item_id = ui.item_id
WHERE counter > -1
AND type = 2
ORDER BY "counter" ASC
LIMIT 200;

with the indexes

  changes (type, item_id, counter) INCLUDE (id, item_name, updated_time)
  user_items (user_id)

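This should improve the query speed.

A join is usually faster than IN.

A composite index on the three columns used in the ON and WHERE clauses should improve the speed on its own.

The same goes for user_items: user_id should also be indexed if it is not already.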
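
As a sketch, the two indexes listed in this answer could be created like this. The index names are hypothetical, `INCLUDE` requires PostgreSQL 11 or later, and `CONCURRENTLY` is an optional choice made here to avoid blocking writes on a busy table (it cannot run inside a transaction block).

```sql
-- Hypothetical index names; adjust to your own naming convention.
-- Covering index: equality column first, then the join column, then the sort column.
CREATE INDEX CONCURRENTLY IF NOT EXISTS changes_type_item_id_counter_idx
    ON changes ("type", item_id, counter)
    INCLUDE (id, item_name, updated_time);

-- Lets the subquery find one user's items directly by user_id.
CREATE INDEX CONCURRENTLY IF NOT EXISTS user_items_user_id_idx
    ON user_items (user_id);
```

Re-running EXPLAIN (ANALYZE, BUFFERS) on the query afterwards shows whether the planner actually uses them.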

Laurenz Albe · Answer 2 · 2024-04-13T03:02:46+08:00

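The cause of the problem is this: the optimizer thinks that enough rows in `changes` are related to rows in `user_items` with the right `user_id`, so it expects to find 100 results quickly by scanning `changes` in `counter` order and discarding rows that do not satisfy the conditions until it has found 100 results and is done. However, it has to scan 10371014 rows before it has enough results, which takes a long time. The reason is most likely that all the matching `changes` rows have a fairly high `counter` value.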
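
One way to check this hypothesis is to compare where the matching rows sit in counter order against the full range of the table. This is a rough sketch, not a definitive diagnostic. It reuses only names from the question, and the first query may itself take a while on a large table.

```sql
-- Where do this user's matching rows start in counter order?
SELECT min(c.counter) AS first_matching_counter,
       count(*)       AS matching_rows
FROM changes AS c
WHERE c."type" = 2
  AND c.counter > -1
  AND c.item_id IN (SELECT item_id FROM user_items WHERE user_id = 'xxxx');

-- Full counter range of the table, for comparison.
SELECT min(counter) AS lowest_counter,
       max(counter) AS highest_counter
FROM changes;
```

If first_matching_counter is close to highest_counter, a scan in counter order has to pass over most of the table before it reaches its first match.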

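There is not much you can do about this:

You can make the inner index scan as fast as possible, as the other answers suggest.
You can change the ORDER BY so that PostgreSQL cannot use its preferred strategy:
```
ORDER BY counter + 0
```
Perhaps the resulting execution plan will be faster.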
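
Applied to the query from the question, that change would look roughly like this. Because `counter + 0` is an expression and not the indexed column itself, the primary key index on counter can no longer supply the sort order, so PostgreSQL has to collect the matching rows first and sort them afterwards. Whether that is faster depends on how many rows match, so treat it as something to measure, not a guaranteed fix.

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT "id", "item_id", "item_name", "type", "updated_time", "counter"
FROM "changes"
WHERE counter > -1
  AND type = 2
  AND item_id IN (SELECT item_id FROM user_items WHERE user_id = 'xxxx')
ORDER BY "counter" + 0 ASC  -- expression, so the index on counter cannot provide the order
LIMIT 200;
```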

Charlieface · Answer 3 · 2024-04-12T20:21:11+08:00

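It looks like the following indexes would work for you.

The idea is to put the equality predicates first, then the join/sort/inequality predicates, and then add the remaining columns as INCLUDE.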

changes (type, counter) INCLUDE (id, item_id, item_name, updated_time)
user_items (user_id, item_id)

Another option, depending on the cardinality of the join (how many rows match):

changes (type, item_id, counter) INCLUDE (id, item_name, updated_time)
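
Written out as DDL, these suggestions might look like the sketch below. The index names are made up for illustration, and the second option is left commented out because it is an alternative to the first changes index, not an addition to it.

```sql
-- Hypothetical index names.
-- Option 1: equality column ("type") first, then the sort/range column (counter).
CREATE INDEX changes_type_counter_idx
    ON changes ("type", counter)
    INCLUDE (id, item_id, item_name, updated_time);

-- Supports the filter on user_id and the join on item_id with one index.
CREATE INDEX user_items_user_id_item_id_idx
    ON user_items (user_id, item_id);

-- Option 2 (instead of changes_type_counter_idx):
-- CREATE INDEX changes_type_item_id_counter_idx
--     ON changes ("type", item_id, counter)
--     INCLUDE (id, item_name, updated_time);
```

Comparing EXPLAIN (ANALYZE, BUFFERS) output for each option on realistic data is the safest way to choose between them.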
