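I have a table in Postgres 13 that is declaratively range-partitioned by ID.
I'm selecting a small number of rows in descending ID order, and I'd like the query to reliably use an ordered partition scan, in which partitions are searched in reverse order until the required number of rows has been found. I know that in almost every case the results will be found in the most recent partition, so the older partitions can be skipped.
I sometimes get an ordered partition scan, but sometimes the planner decides to query every partition, which performs worse than checking only the most recent partition. I'd like to understand why, and whether there is anything I can do to influence the planner.
Test setup: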
CREATE SEQUENCE public.measurements_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;
ALTER SEQUENCE measurements_id_seq RESTART WITH 1;
CREATE TABLE measurements (
    id integer DEFAULT nextval('public.measurements_id_seq'::regclass) PRIMARY KEY,
    uuid uuid NOT NULL,
    num integer NOT NULL,
    created_at timestamp without time zone NOT NULL
)
PARTITION BY RANGE (id);
CREATE INDEX ON measurements (num);
CREATE INDEX ON measurements (uuid);
CREATE TABLE measurements_p0 PARTITION OF measurements FOR VALUES FROM (0) TO (1000000);
CREATE TABLE measurements_p1 PARTITION OF measurements FOR VALUES FROM (1000000) TO (2000000);
CREATE TABLE measurements_p2 PARTITION OF measurements FOR VALUES FROM (2000000) TO (3000000);
CREATE TABLE measurements_p3 PARTITION OF measurements FOR VALUES FROM (3000000) TO (4000000);
CREATE TABLE measurements_p4 PARTITION OF measurements FOR VALUES FROM (4000000) TO (5000000);
CREATE TABLE measurements_p5 PARTITION OF measurements FOR VALUES FROM (5000000) TO (6000000);
Then I insert bulk sample data containing roughly 100 distinct random numbers (each about 1% of the table) and 10 random UUIDs (each about 10% of the table):
with uuids AS (
    select gen_random_uuid() as uuid from generate_series(1, 10) s(i)
)
insert into measurements (
    num, uuid, created_at
)
select
    random() * 100,
    (select array_agg(uuid) from uuids)[floor(random() * 10 + 1)],
    clock_timestamp()
from generate_series(1, 4999999) s(i);
Finally, I add some extra sample data whose UUIDs each account for far less than 10% of the rows, and then analyze:
with uuids AS (
    select gen_random_uuid() as uuid from generate_series(1, 10) s(i)
)
insert into measurements (
    num, uuid, created_at
)
select
    random() * 100,
    (select array_agg(uuid) from uuids)[floor(random() * 10 + 1)],
    clock_timestamp()
from generate_series(1, 10000) s(i);
analyze measurements;
✅ If I select 1% of the rows, I get an ordered partition scan, ordered by id desc:
# explain (analyze, buffers) select * from measurements where num=5 order by id desc limit 4;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=2.41..16.66 rows=4 width=32) (actual time=0.266..0.540 rows=4 loops=1)
Buffers: shared hit=11
-> Append (cost=2.41..179806.71 rows=50464 width=32) (actual time=0.260..0.531 rows=4 loops=1)
Buffers: shared hit=11
-> Index Scan Backward using measurements_p5_pkey on measurements_p5 measurements_6 (cost=0.29..372.29 rows=97 width=32) (actual time=0.257..0.525 rows=4 loops=1)
Filter: (num = 5)
Rows Removed by Filter: 533
Buffers: shared hit=11
-> Index Scan Backward using measurements_p4_pkey on measurements_p4 measurements_5 (cost=0.42..35836.43 rows=8400 width=32) (never executed)
Filter: (num = 5)
-> Index Scan Backward using measurements_p3_pkey on measurements_p3 measurements_4 (cost=0.42..35836.43 rows=10900 width=32) (never executed)
Filter: (num = 5)
-> Index Scan Backward using measurements_p2_pkey on measurements_p2 measurements_3 (cost=0.42..35836.43 rows=9867 width=32) (never executed)
Filter: (num = 5)
-> Index Scan Backward using measurements_p1_pkey on measurements_p1 measurements_2 (cost=0.42..35836.43 rows=10200 width=32) (never executed)
Filter: (num = 5)
-> Index Scan Backward using measurements_p0_pkey on measurements_p0 measurements_1 (cost=0.42..35836.41 rows=11000 width=32) (never executed)
Filter: (num = 5)
Planning Time: 1.227 ms
Execution Time: 0.827 ms
✅ If I select 10% of the rows, I get an ordered partition scan, ordered by id desc:
# explain (analyze, buffers) select * from measurements where uuid='0a246187-edf6-44f3-8517-2a899667db0f' order by id desc limit 4;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=2.41..3.88 rows=4 width=32) (actual time=5.378..5.395 rows=4 loops=1)
Buffers: shared hit=135
-> Append (cost=2.41..182034.40 rows=496001 width=32) (actual time=5.374..5.389 rows=4 loops=1)
Buffers: shared hit=135
-> Index Scan Backward using measurements_p5_pkey on measurements_p5 measurements_6 (cost=0.29..372.29 rows=1 width=32) (actual time=5.293..5.294 rows=0 loops=1)
Filter: (uuid = '0a246187-edf6-44f3-8517-2a899667db0f'::uuid)
Rows Removed by Filter: 10000
Buffers: shared hit=131
-> Index Scan Backward using measurements_p4_pkey on measurements_p4 measurements_5 (cost=0.42..35836.43 rows=99400 width=32) (actual time=0.076..0.088 rows=4 loops=1)
Filter: (uuid = '0a246187-edf6-44f3-8517-2a899667db0f'::uuid)
Rows Removed by Filter: 42
Buffers: shared hit=4
-> Index Scan Backward using measurements_p3_pkey on measurements_p3 measurements_4 (cost=0.42..35836.43 rows=100733 width=32) (never executed)
Filter: (uuid = '0a246187-edf6-44f3-8517-2a899667db0f'::uuid)
-> Index Scan Backward using measurements_p2_pkey on measurements_p2 measurements_3 (cost=0.42..35836.43 rows=97567 width=32) (never executed)
Filter: (uuid = '0a246187-edf6-44f3-8517-2a899667db0f'::uuid)
-> Index Scan Backward using measurements_p1_pkey on measurements_p1 measurements_2 (cost=0.42..35836.43 rows=97833 width=32) (never executed)
Filter: (uuid = '0a246187-edf6-44f3-8517-2a899667db0f'::uuid)
-> Index Scan Backward using measurements_p0_pkey on measurements_p0 measurements_1 (cost=0.42..35836.41 rows=100467 width=32) (never executed)
Filter: (uuid = '0a246187-edf6-44f3-8517-2a899667db0f'::uuid)
Planning Time: 0.728 ms
Execution Time: 5.630 ms
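❌ If I select a UUID that has only a small number of rows, all of them in the final partition (ordering by id desc), I do not get an ordered partition scan: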
# explain (analyze, buffers) select * from measurements where uuid='1d58534d-c795-4f9b-b1d9-ab6316a8fb9a' order by id desc limit 4;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=164.31..164.32 rows=4 width=32) (actual time=1.952..1.960 rows=4 loops=1)
Buffers: shared hit=91
-> Sort (cost=164.31..166.80 rows=996 width=32) (actual time=1.949..1.954 rows=4 loops=1)
Sort Key: measurements.id DESC
Sort Method: top-N heapsort Memory: 25kB
Buffers: shared hit=91
-> Append (cost=0.42..149.37 rows=996 width=32) (actual time=0.456..1.470 rows=991 loops=1)
Buffers: shared hit=91
-> Index Scan using measurements_p0_uuid_idx on measurements_p0 measurements_1 (cost=0.42..8.39 rows=1 width=32) (actual time=0.057..0.058 rows=0 loops=1)
Index Cond: (uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'::uuid)
Buffers: shared hit=3
-> Index Scan using measurements_p1_uuid_idx on measurements_p1 measurements_2 (cost=0.42..8.41 rows=1 width=32) (actual time=0.073..0.073 rows=0 loops=1)
Index Cond: (uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'::uuid)
Buffers: shared hit=3
-> Index Scan using measurements_p2_uuid_idx on measurements_p2 measurements_3 (cost=0.42..8.40 rows=1 width=32) (actual time=0.038..0.038 rows=0 loops=1)
Index Cond: (uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'::uuid)
Buffers: shared hit=3
-> Index Scan using measurements_p3_uuid_idx on measurements_p3 measurements_4 (cost=0.42..8.40 rows=1 width=32) (actual time=0.047..0.047 rows=0 loops=1)
Index Cond: (uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'::uuid)
Buffers: shared hit=3
-> Index Scan using measurements_p4_uuid_idx on measurements_p4 measurements_5 (cost=0.42..8.44 rows=1 width=32) (actual time=0.041..0.042 rows=0 loops=1)
Index Cond: (uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'::uuid)
Buffers: shared hit=3
-> Bitmap Heap Scan on measurements_p5 measurements_6 (cost=15.97..102.35 rows=991 width=32) (actual time=0.195..0.961 rows=991 loops=1)
Recheck Cond: (uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'::uuid)
Heap Blocks: exact=74
Buffers: shared hit=76
-> Bitmap Index Scan on measurements_p5_uuid_idx (cost=0.00..15.72 rows=991 width=0) (actual time=0.146..0.146 rows=991 loops=1)
Index Cond: (uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'::uuid)
Buffers: shared hit=2
Planning Time: 1.024 ms
Execution Time: 2.280 ms
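If I change the last query to target only the most recent partition, we can see that it hits only a few buffers and would be a good candidate for an ordered partition scan: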
# explain (analyze, buffers) select * from measurements_p5 where uuid='1d58534d-c795-4f9b-b1d9-ab6316a8fb9a' order by id desc limit 4;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..1.79 rows=4 width=32) (actual time=0.127..0.143 rows=4 loops=1)
Buffers: shared hit=3
-> Index Scan Backward using measurements_p5_pkey on measurements_p5 (cost=0.29..372.29 rows=991 width=32) (actual time=0.123..0.137 rows=4 loops=1)
Filter: (uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'::uuid)
Rows Removed by Filter: 37
Buffers: shared hit=3
Planning Time: 0.245 ms
Execution Time: 0.234 ms
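One difference I can see is that all of the query plans with an ordered partition scan use the primary key index, which is also the partition key. Does an ordered partition scan only happen when the planner thinks it can do a backward scan on the partition key index?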
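Yes, it will only do an ordered partition scan when a suitable index is available, one with a usable subset that matches the partitioning/ordering key. In this case, an index on (uuid, id) would work, because after testing uuid for equality it can deliver the rows in id order. You could imagine cases where injecting a sort between the Append and the individual partition scans would be worthwhile, so that the query could benefit from an ordered partition scan even without such an index. But this is not implemented.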
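As a minimal sketch of the (uuid, id) suggestion, assuming the test setup from the question: the statements below create the composite index on the partitioned parent (which Postgres propagates to each partition), refresh statistics, and re-run the problem query. The index is left unnamed so Postgres picks a name. Whether the planner then chooses an ordered Append over backward index scans depends on its cost estimates, so treat the EXPLAIN output as something to check rather than a guaranteed result.

```sql
-- Composite index: equality on uuid, then rows come back ordered by id.
-- Created on the parent, so each partition gets a matching index.
CREATE INDEX ON measurements (uuid, id);

-- Refresh planner statistics after the schema change.
ANALYZE measurements;

-- Re-run the query that previously sorted the output of every partition.
-- The UUID literal is the one from the question's test data.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM measurements
WHERE uuid = '1d58534d-c795-4f9b-b1d9-ab6316a8fb9a'
ORDER BY id DESC
LIMIT 4;
```

If the plan still shows a Sort above the Append, comparing estimated costs against the single-partition plan from the question is a reasonable next step.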