Please help with this query. The developers have asked for better query performance. I tried using CTEs instead of the lateral joins and EXISTS, and tested covering indexes and additional filtering, with no significant performance gain. Possible suggestions might include:
- additional filtering
- a rewritten approach

I can see heap fetches in the plan, so I will run VACUUM shortly. What else can be done besides VACUUM? Thanks for your help.
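For reference, a minimal sketch of the VACUUM step mentioned above (the table name tab1 is taken from the anonymized plan; real names differ):

```sql
-- VACUUM updates the visibility map, which is what lets index-only scans
-- skip the "Heap Fetches" visible in the plan:
VACUUM (ANALYZE, VERBOSE) tab1;

-- Check how much of the table is marked all-visible afterwards:
SELECT relname, relpages, relallvisible
FROM   pg_class
WHERE  relname = 'tab1';
```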
https://explain.dalibo.com/plan/e6b0c5757962095a
PG version: PostgreSQL 14.7
The query:
SELECT
oh.order_header_id AS SfmId,
oh.status AS Status,
oh.case_type AS CaseType,
oh.partner_id AS CompanyId,
oh.date_created AS DateCreated,
oh.update_date_utc AS DateUpdated,
oh.contact_id AS DoctorId,
scan_detail.due_date AS DueDate,
lab_link.partner_id AS LabId,
COALESCE(milling_site_link.partner_id, -1) AS MillingSiteId,
COALESCE(int_site_link.partner_id, -1) AS InterpretationSiteId,
oh.order_tags AS OrderTags,
oh.patient_guid AS PatientGuid,
oh.rx_id AS RxId,
FALSE AS IsConventional,
COALESCE(prev_wo.work_type, -1) AS PreviousBowId,
-1 AS LastDetailsId,
oh.last_work_order_id AS LastWorkOrderSfmId,
wo.date_created AS LastWorkOrderDateCreated,
wo.date_updated AS LastWorkOrderDateUpdated,
oh.direct_to_lab_status AS IsDirectToLab,
wo.resource_id AS LastResourceId,
wo.resource_type AS LastResourceTypeId,
oh.scan_info AS ScanInfo,
oh.extended_info AS ExtendedInfo,
oh.file_upload_report AS FileUploadReport,
wo.status AS LastWorkOrderStatus,
wo.work_type AS LastBowId,
wo.order_detail_id AS LastDetailsSfmId,
od.due_date AS LastDetailsDueDate,
-1 AS LastWorkOrderId,
wo.status AS LastWorkOrderStatus,
oh.order_code AS OrderCode,
od.date_created AS LastDetailsDateCreated
FROM
tab1 cpl
LEFT JOIN tab2 oh ON oh.order_header_id = cpl.order_header_id
LEFT JOIN LATERAL (
SELECT
due_date
FROM
tab3 scan_detail
WHERE
scan_detail.order_header_id = oh.order_header_id
AND EXISTS (
SELECT
1
FROM
tab4 ctdc2
WHERE
ctdc2.detail_type = scan_detail.item
AND ctdc2.detail_category = 1
)
LIMIT 1
) AS scan_detail ON TRUE
LEFT JOIN LATERAL (
SELECT
partner_id
FROM
tab1 lab_link
WHERE
lab_link.order_header_id = oh.order_header_id
AND lab_link.partner_type = 300
LIMIT 1
) AS lab_link ON TRUE
LEFT JOIN LATERAL (
SELECT
partner_id
FROM
tab1 milling_site_link
WHERE
milling_site_link.order_header_id = oh.order_header_id
AND milling_site_link.partner_type = 500
LIMIT 1
) AS milling_site_link ON TRUE
LEFT JOIN LATERAL (
SELECT
partner_id
FROM
tab1 int_site_link
WHERE
int_site_link.order_header_id = oh.order_header_id
AND int_site_link.partner_type = 1100
LIMIT 1
) AS int_site_link ON TRUE
INNER JOIN tab5 wo ON oh.last_work_order_id = wo.work_order_id
INNER JOIN tab3 od ON oh.order_header_id = od.order_header_id AND wo.order_detail_id = od.order_detail_id
LEFT JOIN LATERAL (
SELECT
*
FROM
tab5 prev_wo
WHERE
prev_wo.work_order_id = wo.created_by_work_order
LIMIT 1
) AS prev_wo ON TRUE
WHERE
cpl.partner_id = 8133
AND cpl.partner_type = ANY (VALUES (200), (500), (1900), (2700))
AND wo.partner_id != 8133
AND (
EXISTS (
SELECT
1
FROM
tab2 oh2
INNER JOIN tab3 od2 ON oh2.order_header_id = od2.order_header_id
INNER JOIN tab5 wo2 ON wo2.order_detail_id = od2.order_detail_id
WHERE
oh2.order_header_id = oh.order_header_id
AND wo2.work_order_id != oh2.last_work_order_id
AND wo2.partner_id = 8133
AND wo2.date_updated > (NOW() AT TIME ZONE 'UTC' + INTERVAL '-90 days')
AND wo2.work_type <> 131
LIMIT 1
)
OR (
101 = ANY (VALUES (102))
AND lab_link.partner_id = 8133
)
)
AND (
wo.work_type > 0
OR (
(
wo.work_type = -1
OR wo.status <> 1
)
AND wo.date_updated > (NOW() AT TIME ZONE 'UTC' + INTERVAL '-7 days')
)
)
LIMIT 1500;
Here is the query plan (ANALYZE, BUFFERS, SETTINGS, text format), also on explain.depesz.com: https://explain.depesz.com/s/j73P#html
Limit (cost=1771.30..248159.49 rows=1 width=1206) (actual time=47.173..548.541 rows=1500 loops=1)
Buffers: shared hit=776970
-> Nested Loop Semi Join (cost=1771.30..248159.49 rows=1 width=1206) (actual time=47.172..548.291 rows=1500 loops=1)
Join Filter: (cpl.partner_type = "*VALUES*".column1)
Rows Removed by Join Filter: 7412
Buffers: shared hit=776970
-> Nested Loop Left Join (cost=1771.30..248159.39 rows=1 width=1197) (actual time=46.221..543.890 rows=2978 loops=1)
Buffers: shared hit=776970
-> Nested Loop Left Join (cost=1770.74..248150.79 rows=1 width=1209) (actual time=46.206..518.898 rows=2978 loops=1)
Buffers: shared hit=762079
-> Nested Loop Left Join (cost=1770.31..248142.30 rows=1 width=1205) (actual time=46.194..506.935 rows=2978 loops=1)
Buffers: shared hit=752105
-> Nested Loop Left Join (cost=1769.88..248133.80 rows=1 width=1201) (actual time=46.179..492.135 rows=2978 loops=1)
Filter: ((SubPlan 1) OR ((hashed SubPlan 3) AND (lab_link.partner_id = 8133)))
Rows Removed by Filter: 2649
Buffers: shared hit=740174
-> Nested Loop Left Join (cost=1769.43..247893.45 rows=1 width=1197) (actual time=13.117..282.515 rows=5627 loops=1)
Buffers: shared hit=596990
-> Gather (cost=1769.00..247875.58 rows=1 width=1189) (actual time=13.069..214.075 rows=5627 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=570677
-> Nested Loop (cost=769.00..246875.48 rows=1 width=1189) (actual time=9.170..289.074 rows=1905 loops=3)
Join Filter: (oh.tab2_id = od.tab2_id)
Buffers: shared hit=570677
-> Nested Loop (cost=768.57..236676.59 rows=20408 width=1189) (actual time=9.146..276.915 rows=1905 loops=3)
Buffers: shared hit=547781
-> Nested Loop (cost=768.00..209231.31 rows=25330 width=1127) (actual time=9.017..152.212 rows=19269 loops=3)
Buffers: shared hit=258744
-> Parallel Bitmap Heap Scan on tab1 cpl (cost=767.57..66251.14 rows=25330 width=20) (actual time=8.989..41.637 rows=19269 loops=3)
Recheck Cond: (partner_id = 8133)
Heap Blocks: exact=5996
Buffers: shared hit=27514
-> Bitmap Index Scan on ix_tab1_partner_id_partner_type (cost=0.00..752.37 rows=60792 width=0) (actual time=7.558..7.558 rows=60795 loops=1)
Index Cond: (partner_id = 8133)
Buffers: shared hit=83
-> Index Scan using "PK_tab2" on tab2 oh (cost=0.43..5.64 rows=1 width=1107) (actual time=0.005..0.005 rows=1 loops=57807)
Index Cond: (tab2_id = cpl.tab2_id)
Buffers: shared hit=231230
-> Index Scan using "PK_tab5" on tab5 wo (cost=0.56..1.08 rows=1 width=78) (actual time=0.006..0.006 rows=0 loops=57807)
Index Cond: (tab5_id = oh.last_tab5_id)
Filter: ((partner_id <> 8133) AND ((work_type > 0) OR (((work_type = '-1'::integer) OR (status <> 1)) AND (date_updated > ((now() AT TIME ZONE 'UTC'::text) + '-7 days'::interval)))))
Rows Removed by Filter: 1
Buffers: shared hit=289037
-> Index Scan using "PK_tab3" on tab3 od (cost=0.43..0.49 rows=1 width=48) (actual time=0.005..0.005 rows=1 loops=5715)
Index Cond: (tab3_id = wo.tab3_id)
Buffers: shared hit=22879
-> Limit (cost=0.43..17.84 rows=1 width=8) (actual time=0.011..0.011 rows=1 loops=5627)
Buffers: shared hit=26313
-> Nested Loop Semi Join (cost=0.43..17.84 rows=1 width=8) (actual time=0.011..0.011 rows=1 loops=5627)
Join Filter: (scan_detail.item = ctdc2.detail_type)
Rows Removed by Join Filter: 19
Buffers: shared hit=26313
-> Index Scan using "IX_tab3_tab2_id" on tab3 scan_detail (cost=0.43..16.48 rows=3 width=12) (actual time=0.005..0.006 rows=2 loops=5627)
Index Cond: (tab2_id = oh.tab2_id)
Buffers: shared hit=25878
-> Materialize (cost=0.00..1.32 rows=1 width=4) (actual time=0.000..0.001 rows=9 loops=12647)
Buffers: shared hit=1
-> Seq Scan on tab4 ctdc2 (cost=0.00..1.31 rows=1 width=4) (actual time=0.006..0.010 rows=14 loops=1)
Filter: (detail_category = 1)
Rows Removed by Filter: 11
Buffers: shared hit=1
-> Limit (cost=0.43..8.47 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=5627)
Buffers: shared hit=22214
-> Index Only Scan using ix_tab2_id_partner_id_partner_type on tab1 lab_link (cost=0.43..8.47 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=5627)
Index Cond: ((tab2_id = oh.tab2_id) AND (partner_type = 300))
Heap Fetches: 5199
Buffers: shared hit=22214
SubPlan 1
-> Nested Loop (cost=1.30..231.85 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=5627)
Join Filter: (wo2.tab5_id <> oh2.last_tab5_id)
Buffers: shared hit=120970
-> Nested Loop (cost=0.87..223.38 rows=1 width=32) (actual time=0.027..0.027 rows=1 loops=5627)
Buffers: shared hit=109058
-> Index Scan using "IX_tab3_tab2_id" on tab3 od2 (cost=0.43..16.48 rows=3 width=32) (actual time=0.003..0.004 rows=3 loops=5627)
Index Cond: (tab2_id = oh.tab2_id)
Buffers: shared hit=28948
-> Index Scan using "IX_tab5_tab3_id" on tab5 wo2 (cost=0.44..68.96 rows=1 width=32) (actual time=0.006..0.006 rows=0 loops=18155)
Index Cond: (tab3_id = od2.tab3_id)
Filter: ((work_type <> 131) AND (partner_id = 8133) AND (date_updated > ((now() AT TIME ZONE 'UTC'::text) + '-90 days'::interval)))
Rows Removed by Filter: 2
Buffers: shared hit=79660
-> Index Scan using "PK_tab2" on tab2 oh2 (cost=0.43..8.45 rows=1 width=32) (actual time=0.005..0.005 rows=1 loops=2978)
Index Cond: (tab2_id = oh.tab2_id)
Buffers: shared hit=11912
SubPlan 3
-> Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
-> Limit (cost=0.43..8.47 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=2978)
Buffers: shared hit=11931
-> Index Only Scan using ix_tab2_id_partner_id_partner_type on tab1 milling_site_link (cost=0.43..8.47 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=2978)
Index Cond: ((tab2_id = oh.tab2_id) AND (partner_type = 500))
Heap Fetches: 2993
Buffers: shared hit=11931
-> Limit (cost=0.43..8.47 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=2978)
Buffers: shared hit=9974
-> Index Only Scan using ix_tab2_id_partner_id_partner_type on tab1 int_site_link (cost=0.43..8.47 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=2978)
Index Cond: ((tab2_id = oh.tab2_id) AND (partner_type = 1100))
Heap Fetches: 1056
Buffers: shared hit=9974
-> Limit (cost=0.56..8.58 rows=1 width=134) (actual time=0.008..0.008 rows=1 loops=2978)
Buffers: shared hit=14891
-> Index Scan using "PK_tab5" on tab5 prev_wo (cost=0.56..8.58 rows=1 width=134) (actual time=0.007..0.007 rows=1 loops=2978)
Index Cond: (tab5_id = wo.created_by_tab5)
Buffers: shared hit=14890
-> Values Scan on "*VALUES*" (cost=0.00..0.05 rows=4 width=4) (actual time=0.000..0.001 rows=3 loops=2978)
Settings: effective_cache_size = '88445488kB', maintenance_io_concurrency = '1'
Planning:
Buffers: shared hit=5660
Planning Time: 5.158 ms
Execution Time: 549.025 ms
Apart from various other possible issues (some of which have been addressed already), most importantly server configuration and index optimization, there is a join problem.
I count 9 join items (main query plus subqueries), which exceeds the default setting of 8 for from_collapse_limit and join_collapse_limit. I am not sure about the exact consequences, but since the order of joins is messed up, based on incorrect use of LEFT JOIN, this cannot be good. The sequence is nonsense: the filters on wo force the LEFT OUTER JOIN to act like an INNER JOIN. And because the "collapse" limits are exceeded, that bad join sequence is frozen in place (at least to some degree). Try this equivalent query (without changing the mentioned limits):
I converted all instances of LEFT JOIN LATERAL ... LIMIT 1 ... ON TRUE to correlated subqueries in the SELECT list. This should put you below the join collapse limit, though after fixing the faulty LEFT JOINs and reordering the joins it should not matter anymore. I expect a (much) better query plan. I commented out the replaced parts.
Also, I commented out the noise predicate 101 = ANY (VALUES (102)). You may still need it for other parameter values! (?)
If tab2.order_header_id is the PK or UNIQUE, we could simplify further. That information is missing.
The final OR looks ugly. Splitting it into two queries combined with UNION might help.
Also, LIMIT 1 without ORDER BY produces arbitrary results, which is suspicious. There may be lurking problems ...

Your faster plan (j73P) uses only two parallel workers and might be able to benefit from more of them. So try increasing max_parallel_workers_per_gather (I assume it is at the default of 2, since it does not show up with a non-default value on the "Settings:" line). Your slightly simpler but slower plan (58Eq) does not use parallel workers at all (which is probably why it is slower), but I do not know why that is. Maybe the planner just thought they would not be useful, or maybe some feature precluded them. It is hard to know without seeing the text of that query.
Obviously, using more parallel workers will only bring a benefit if resources are currently underutilized.
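A session-level experiment with the worker setting might look like this (the value 4 and the database name are placeholders, not recommendations):

```sql
-- Try more workers for this session only, then re-run the query under
-- EXPLAIN (ANALYZE, BUFFERS) and compare:
SET max_parallel_workers_per_gather = 4;

-- Note: max_parallel_workers and max_worker_processes cap the total;
-- raising only the per-gather limit has no effect beyond those caps.
SHOW max_parallel_workers;

-- If it helps, persist it for one database (name is a placeholder):
ALTER DATABASE mydb SET max_parallel_workers_per_gather = 4;
```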
One of the major time consumers (though not the dominant one, as there is no single dominant consumer) is this:
Now, we do not know which component of that filter logic is responsible for discarding the 2/3 of rows that get discarded. But if the discarding could be done inside the index, without visiting the table, that might help. That could mean, for example, a multicolumn index on
(tab3_id, partner_id)
. Similar strategies might also pay off in other parts of the plan, but the OR logic may make that strategy infeasible. Alternatively, maybe this could be done as a hash join rather than a nested loop, by preloading all the rows that could possibly qualify for the match and then hashing them on tab3_id. To support that efficiently, you would want an index on
(partner_id, date_updated, work_type, tab3_id)
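In DDL form, assuming the anonymized names from the plan (real table and column names will differ), the two suggested indexes would be something like:

```sql
-- Lets the nested-loop filter be checked inside the index (first suggestion):
CREATE INDEX CONCURRENTLY ix_tab5_tab3_id_partner_id
    ON tab5 (tab3_id, partner_id);

-- Supports preloading qualifying rows for a hash join (second suggestion):
CREATE INDEX CONCURRENTLY ix_tab5_partner_updated_type
    ON tab5 (partner_id, date_updated, work_type, tab3_id);
```

CONCURRENTLY avoids blocking writes while the index builds, at the cost of a slower build.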
There may be some top-down options to rewrite the query or avoid it altogether, but the query itself is not self-explanatory. So some high-level description of what the query is doing, and why it needs to do it, might help. For example, why does a query of this nature need to finish in less than half a second? If you are compiling a task list for someone to work through, surely 1500 tasks will take a long time to complete, so how often is such a query needed? Or, if the task list is not that long, then maybe the LIMIT does not need to be as high as 1500 to build it.