我正在使用 PostgreSQL 11。我有一个包含 ~500к 条记录的表,其大小约为 2GB。如果我正在运行此查询:
SELECT DISTINCT point.id
FROM "buildings_point" as point
LEFT OUTER JOIN "equipment_equipment" as equipment ON (point.equipment_id = equipment.id)
WHERE (
equipment.id = 4
OR point.id = 4
);
我在buildings_point
表中确实有两个索引:
"buildings_point_pkey" PRIMARY KEY, btree (id)
"buildings_point_997b9956" btree (equipment_id)
对于equipment_equipment
表,我确实有索引id
:
"equipment_equipment_pkey" PRIMARY KEY, btree (id)
但是调度程序运行:
Parallel Seq Scan on buildings_point point
大约需要 18 秒来遍历所有 500к 条记录。如果我删除任何equipment.id = 4
orpoint.id = 4
查询将花费大约 5 毫秒。为什么调度程序不使用两个索引?
解释分析输出:
-> Sort (cost=175133.54..175133.59 rows=19 width=4) (actual time=19163.682..19181.748 rows=3 loops=1)
Sort Key: point.id
Sort Method: quicksort Memory: 25kB
-> Gather (cost=2307.62..175133.14 rows=19 width=4) (actual time=3958.040..19181.729 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Hash Left Join (cost=1307.62..174131.24 rows=8 width=4) (actual time=8205.581..19155.693 rows=1 loops=3)
Hash Cond: (point.equipment_id = equipment.id)
Filter: ((equipment.id = 4) OR (point.id = 4))
Rows Removed by Filter: 218243
-> Parallel Seq Scan on buildings_point point (cost=0.00..172111.33 rows=271333 width=8) (actual time=4.661..18838.772 rows=218244 loops=3)
-> Parallel Hash (cost=1116.56..1116.56 rows=15285 width=4) (actual time=52.361..52.364 rows=12229 loops=3)
Buckets: 65536 Batches: 1 Memory Usage: 1984kB
-> Parallel Index Only Scan using buildings_equipment_pkey on equipment_equipment equipment (cost=0.29..1116.56 rows=15285 width=4) (actual time=0.848..34.102 rows=12229 loops=3)
Heap Fetches: 176
Planning Time: 0.275 ms
Execution Time: 19182.162 ms