I'm running a query like
explain (analyze, buffers) select col1, col2, count(col3) as c from table1 group by 2, 1 order by 2, 1
With work_mem set to 4 MB, the plan looks like this:
"GroupAggregate (cost=0.70..211944.54 rows=573788 width=26) (actual time=5.146..2601.133 rows=1867574 loops=1)"
" Group Key: col2, col1"
" Buffers: shared hit=1844356 read=9682"
" -> Incremental Sort (cost=0.70..191999.45 rows=1894295 width=21) (actual time=5.131..1848.190 rows=1894295 loops=1)"
" Sort Key: col2, col1"
" Presorted Key: col2"
" Full-sort Groups: 58831 Sort Method: quicksort Average Memory: 27kB Peak Memory: 27kB"
" Buffers: shared hit=1844356 read=9682"
" -> Index Scan using table1_pkey on table1 (cost=0.43..121686.41 rows=1894295 width=21) (actual time=5.071..923.512 rows=1894295 loops=1)"
" Buffers: shared hit=1844356 read=9682"
"Planning:"
" Buffers: shared hit=2"
"Planning Time: 0.127 ms"
"JIT:"
" Functions: 7"
" Options: Inlining false, Optimization false, Expressions true, Deforming true"
" Timing: Generation 0.614 ms, Inlining 0.000 ms, Optimization 0.346 ms, Emission 4.648 ms, Total 5.609 ms"
"Execution Time: 2725.164 ms"
When I increase work_mem to 1 GB, the plan suddenly becomes very different:
"Sort (cost=107700.32..109134.79 rows=573788 width=26) (actual time=6461.310..6821.930 rows=1867574 loops=1)"
" Sort Key: col2, col1"
" Sort Method: quicksort Memory: 195057kB"
" Buffers: shared hit=13813 read=116"
" -> HashAggregate (cost=47079.16..52817.04 rows=573788 width=26) (actual time=1194.218..1777.794 rows=1867574 loops=1)"
" Group Key: col2, col1"
" Batches: 1 Memory Usage: 303121kB"
" Buffers: shared hit=13813 read=116"
" -> Seq Scan on table1 (cost=0.00..32871.95 rows=1894295 width=21) (actual time=0.016..214.794 rows=1894295 loops=1)"
" Buffers: shared hit=13813 read=116"
"Planning:"
" Buffers: shared read=2"
"Planning Time: 0.122 ms"
"JIT:"
" Functions: 7"
" Options: Inlining false, Optimization false, Expressions true, Deforming true"
" Timing: Generation 0.477 ms, Inlining 0.000 ms, Optimization 0.216 ms, Emission 4.722 ms, Total 5.416 ms"
"Execution Time: 6967.294 ms"
The confusing observations:
- With more memory available, it switches to a sequential scan instead of an index scan
- It abandons the efficient Incremental Sort (followed by a GroupAggregate) in favor of a HashAggregate plus a quicksort
- The new plan with 1 GB has a lower estimated cost, yet runs much longer.
What is going on here?
You could try improving the estimates with extended statistics, though I'm not sure that alone is enough to make the fast plan win.
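A minimal sketch of what that could look like, assuming col1 and col2 on table1 are the correlated columns (the statistics name here is made up):

```sql
-- Hypothetical example: tell the planner about the combined
-- number of distinct (col2, col1) groups, which it currently
-- underestimates (573788 estimated vs 1867574 actual).
CREATE STATISTICS table1_col2_col1_stats (ndistinct)
    ON col2, col1 FROM table1;

-- Extended statistics only take effect after the table is analyzed.
ANALYZE table1;
```

A better group-count estimate should raise the estimated cost of the Sort above the HashAggregate, which may be enough to shift the plan choice.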
It switched from GroupAggregate to HashAggregate; the seq scan is just a side effect of that change. The HashAggregate was being penalized for an expected spill to disk, and raising work_mem removes that penalty.
The HashAggregate itself is also efficient. What's slow is the Sort that follows it, which receives about 3x the expected number of rows (573788 estimated vs 1867574 actual).
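One way to verify that diagnosis is to keep the high work_mem but force the planner back onto the GroupAggregate path with the enable_hashagg setting; this is a diagnostic sketch, not a recommended permanent setting:

```sql
-- Disable hash aggregation for this session only, so the planner
-- falls back to the sorted GroupAggregate plan even at work_mem = 1GB.
SET enable_hashagg = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT col1, col2, count(col3) AS c
FROM table1
GROUP BY 2, 1
ORDER BY 2, 1;

-- Restore the default afterwards.
RESET enable_hashagg;
```

If the GroupAggregate plan is fast again at 1 GB, that confirms the regression comes from the plan switch (and the misestimated Sort input), not from the memory setting itself.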
Estimation is hard. There is (in general) not much we can do about how hard estimation is. What do you want out of this? To solve a concrete problem with this exact query? To learn some general principles? To vent?