Asked by Roee (dba)

Asked: 2023-07-24 20:31:51 +0800 CST

Postgres 15 memoize slows down many queries

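I have a query that uses joins and GROUP BY. The tables it works on hold millions of records each (10 to 50 million), and one of them is partitioned (not sure whether that is relevant, but I am giving as much information as possible). For some reason this query (and many others I have looked at, not necessarily ones with a GROUP BY) runs rather slowly, but with enable_memoize=false its run time drops by almost half (about 11.0 s versus 6.1 s in the plans below).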

Why does this happen? All over the web, memoize is praised as a great new feature that improves many queries.

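Are there any PostgreSQL settings I can change so that memoize either runs fast or is not chosen by the planner when it is the inferior plan (other than disabling memoize outright)?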

The query itself:

EXPLAIN ANALYZE 
SELECT table3.id, table3.type, count(*) FROM table1 
 JOIN table2 ON table1.id=table2.table1_id AND table1.tenant_id=table2.tenant_id
 JOIN table3 ON table2.table3_id=table3.id AND table2.tenant_id=table3.tenant_id
WHERE table1.tenant_id=123 
GROUP BY table3.id, table3.type
ORDER BY count(*) DESC LIMIT 10;

The resulting plan:

 Limit  (cost=79555.10..79555.12 rows=10 width=45) (actual time=11017.288..11017.294 rows=10 loops=1)
   ->  Sort  (cost=79555.10..80128.80 rows=229481 width=45) (actual time=11017.286..11017.291 rows=10 loops=1)
         Sort Key: (count(*)) DESC
         Sort Method: top-N heapsort  Memory: 26kB
         ->  HashAggregate  (cost=68715.65..74596.10 rows=229481 width=45) (actual time=9814.014..10892.716 rows=814630 loops=1)
               Group Key: table3.id
               Planned Partitions: 4  Batches: 33  Memory Usage: 9265kB  Disk Usage: 105720kB
               ->  Merge Join  (cost=5.67..59034.42 rows=229481 width=37) (actual time=0.101..8846.868 rows=1912806 loops=1)
                     Merge Cond: (table2.table1_id = table1.id)
                     ->  Nested Loop  (cost=0.87..184852.18 rows=919397 width=53) (actual time=0.062..7766.353 rows=1912806 loops=1)
                           ->  Index Scan using idx_table2_tenant_id_table1id on table2  (cost=0.43..124987.43 rows=1872626 width=24) (actual time=0.034..1710.167 rows=1912806 loops=1)
                                 Index Cond: (tenant_id = 123)
                           ->  Memoize  (cost=0.44..0.62 rows=1 width=45) (actual time=0.003..0.003 rows=1 loops=1912806)
                                 Cache Key: table2.table3_id
                                 Cache Mode: logical
                                 Hits: 1040220  Misses: 872586  Evictions: 816079  Overflows: 0  Memory Usage: 8389kB
                                 ->  Index Scan using table3_pkey on table3  (cost=0.43..0.61 rows=1 width=45) (actual time=0.004..0.004 rows=1 loops=872586)
                                       Index Cond: (id = table2.table3_id)
                                       Filter: (tenant_id = 123)
                     ->  Index Only Scan using table1_partition_123_pkey on table1_partition_123 table1  (cost=0.43..39581.09 rows=1683017 width=16) (actual time=0.035..455.399 rows=1912806 loops=1)
                           Index Cond: (tenant_id = 123)
                           Heap Fetches: 9346
 Planning Time: 7.250 ms
 Execution Time: 11038.258 ms
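
To make the Memoize counters in the plan above easier to read, here is a small sketch that turns them into ratios. It uses only the numbers reported by EXPLAIN ANALYZE; the column aliases are made up for illustration.

```sql
-- Hits, Misses and Evictions are copied from the Memoize node above.
SELECT
    round(100.0 * 1040220 / (1040220 + 872586), 1) AS hit_pct,          -- 54.4
    round(100.0 * 816079 / 872586, 1)              AS evicted_miss_pct; -- 93.5
```

So a little over half of the 1,912,806 lookups were served from the cache, and most misses were accompanied by the eviction of an existing entry.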

The resulting plan after setting enable_memoize=false:

  Limit  (cost=102850.70..102850.72 rows=10 width=45) (actual time=6040.773..6040.960 rows=10 loops=1)
   ->  Sort  (cost=102850.70..103424.40 rows=229481 width=45) (actual time=6040.772..6040.957 rows=10 loops=1)
         Sort Key: (count(*)) DESC
         Sort Method: top-N heapsort  Memory: 26kB
         ->  HashAggregate  (cost=92011.24..97891.69 rows=229481 width=45) (actual time=4841.865..5916.868 rows=814630 loops=1)
               Group Key: table3.id
               Planned Partitions: 4  Batches: 33  Memory Usage: 9265kB  Disk Usage: 105720kB
               ->  Merge Join  (cost=1005.72..82330.01 rows=229481 width=37) (actual time=10.344..3868.398 rows=1912806 loops=1)
                     Merge Cond: (table2.table1_id = table1.id)
                     ->  Gather Merge  (cost=1000.92..508058.81 rows=919397 width=53) (actual time=10.288..2796.661 rows=1912806 loops=1)
                           Workers Planned: 4
                           Workers Launched: 4
                           ->  Nested Loop  (cost=0.86..397549.71 rows=229849 width=53) (actual time=0.071..2360.817 rows=382561 loops=5)
                                 ->  Parallel Index Scan using idx_table2_tenant_id_table1id on table2  (cost=0.43..110942.73 rows=468156 width=24) (actual time=0.040..403.754 rows=382561 loops=5)
                                       Index Cond: (tenant_id = 123)
                                 ->  Index Scan using table3_pkey on table3  (cost=0.43..0.61 rows=1 width=45) (actual time=0.004..0.004 rows=1 loops=1912806)
                                       Index Cond: (id = table2.table3_id)
                                       Filter: (tenant_id = 123)
                     ->  Index Only Scan using table1_partition_123_pkey on table1_partition_123 table1  (cost=0.43..39581.09 rows=1683017 width=16) (actual time=0.052..460.241 rows=1912806 loops=1)
                           Index Cond: (tenant_id = 123)
                           Heap Fetches: 9346
 Planning Time: 0.907 ms
 Execution Time: 6061.154 ms

Possibly relevant PostgreSQL settings (?)

max_parallel_workers_per_gather=4;
max_parallel_workers=32;
max_parallel_maintenance_workers=4;
random_page_cost=1.1;
work_mem='4194kB';
