我有一个问题:
select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from PR_STR_TO_LIST( '37593' ) L
left join DUALS U on U.NUM >= 0 and U.NUM < 2
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=L.VAL
left join OD_DOCS D on D.ID = (case when U.NUM=0 then L.VAL else S.PORT_ID end)
left join OD_DOC_CATS C on C.ID=D.D_CAT
left join OD_DOC_CATS CC on CC.WALK<=C.WALK and C.WALK<(CC.WALK+CC.AMOUNT) and CC.LEV<=C.LEV
left join OP_FIELDS F on F.OP=CC.ID and F.AS_NAME='DESC_ESTIM'
left join OD_OPTIONS VF on VF.OBJECT=D.ID and VF.DESCR=F.ID and VF.B_DATE<='20180715' and VF.E_DATE>'20180715'
where F.ID is not null
order by 3 desc, U.NUM, CC.LEV desc
功能PR_STR_TO_LIST( '37593' )
是IMMUTABLE PARALLEL SAFE
和return setof T_PR_STR_TO_LIST
:
create type T_PR_STR_TO_LIST as (
VAL integer
);
查询很慢,这是执行计划:
Sort (cost=193656.73..193658.89 rows=866 width=18) (actual time=184.363..184.363 rows=1 loops=1)
Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
Sort Method: quicksort Memory: 25kB
-> Nested Loop Left Join (cost=104216.04..193614.47 rows=866 width=18) (actual time=184.351..184.354 rows=1 loops=1)
-> Hash Join (cost=104215.62..186270.79 rows=866 width=19) (actual time=184.306..184.309 rows=1 loops=1)
Hash Cond: (CASE WHEN (u.num = 0) THEN l.val ELSE s.port_id END = d.id)
-> Merge Left Join (cost=84933.39..166862.95 rows=23390 width=12) (actual time=133.823..133.824 rows=2 loops=1)
Merge Cond: (l.val = s.doc)
Join Filter: (u.num = 1)
-> Sort (cost=1109.34..1141.84 rows=13000 width=8) (actual time=0.162..0.163 rows=2 loops=1)
Sort Key: l.val
Sort Method: quicksort Memory: 25kB
-> Nested Loop Left Join (cost=0.25..221.03 rows=13000 width=8) (actual time=0.149..0.150 rows=2 loops=1)
-> Function Scan on pr_str_to_list l (cost=0.25..10.25 rows=1000 width=4) (actual time=0.117..0.117 rows=1 loops=1)
-> Materialize (cost=0.00..48.31 rows=13 width=4) (actual time=0.030..0.031 rows=2 loops=1)
-> Seq Scan on duals u (cost=0.00..48.25 rows=13 width=4) (actual time=0.021..0.021 rows=2 loops=1)
Filter: ((num >= 0) AND (num < 2))
-> Sort (cost=83824.05..85695.24 rows=748476 width=8) (actual time=133.656..133.656 rows=1 loops=1)
Sort Key: s.doc
Sort Method: quicksort Memory: 56931kB
-> Seq Scan on d_b_cont_set_fix s (cost=0.00..10796.76 rows=748476 width=8) (actual time=0.012..40.324 rows=748476 loops=1)
-> Hash (cost=18106.86..18106.86 rows=94030 width=15) (actual time=49.902..49.902 rows=97361 loops=1)
Buckets: 131072 Batches: 1 Memory Usage: 5588kB
-> Nested Loop (cost=1.26..18106.86 rows=94030 width=15) (actual time=0.054..28.764 rows=97361 loops=1)
-> Nested Loop (cost=0.83..128.58 rows=15 width=15) (actual time=0.036..0.134 rows=1 loops=1)
-> Nested Loop (cost=0.55..126.94 rows=1 width=19) (actual time=0.020..0.118 rows=1 loops=1)
-> Index Scan using op_fields_op_as_name_key on op_fields f (cost=0.28..118.61 rows=1 width=11) (actual time=0.013..0.110 rows=1 loops=1)
Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
Filter: (id IS NOT NULL)
-> Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2 (cost=0.27..8.29 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=1)
Index Cond: (id = f.op)
-> Index Scan using od_ix_doc_cats_walk on fs_od_doc_cats d_1 (cost=0.28..1.48 rows=15 width=12) (actual time=0.014..0.015 rows=1 loops=1)
Index Cond: ((d_2.walk <= walk) AND (walk < (d_2.walk + d_2.amount)))
Filter: (d_2.lev <= lev)
-> Index Scan using x2x_od_docs on fs_od_docs d (cost=0.43..952.06 rows=24649 width=8) (actual time=0.017..20.391 rows=97361 loops=1)
Index Cond: (d_cat = d_1.id)
-> Index Scan using uk_od_options on od_options vf (cost=0.42..8.44 rows=1 width=11) (actual time=0.023..0.023 rows=0 loops=1)
Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.592 ms
Execution time: 196.785 ms
问题出在一行case
语句中。它使散列连接:
Hash Join (cost=104215.62..186270.79 rows=866 width=19) (actual time=184.306..184.309 rows=1 loops=1)
Hash Cond: (CASE WHEN (u.num = 0) THEN l.val ELSE s.port_id END = d.id)
但是它应该是索引扫描。Planner 忽略 OD_DOCS.ID PK 列索引。U.NUM、L.VAL、S.PORT_ID 列也是如此int
。
为了检查我做了一些实验。对于我的示例,我可以摆脱 PR_STR_TO_LIST 函数并将其结果替换为整数常量,查询可能会像这样简化:
select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from DUALS U
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=37593
left join OD_DOCS D on D.ID=(case when U.NUM=0 then 37593 else S.PORT_ID end)
left join OD_DOC_CATS C on C.ID=D.D_CAT
left join OD_DOC_CATS CC on CC.WALK<=C.WALK and C.WALK<(CC.WALK+CC.AMOUNT) and CC.LEV<=C.LEV
left join OP_FIELDS F on F.OP=CC.ID and F.AS_NAME='DESC_ESTIM'
left join OD_OPTIONS VF on VF.OBJECT=D.ID and VF.DESCR=F.ID and VF.B_DATE<='20180715' and VF.E_DATE>'20180715'
where U.NUM >= 0 and U.NUM < 2 and F.ID is not null
order by 3 desc, U.NUM, CC.LEV desc
计划:
Sort (cost=678.66..678.67 rows=1 width=18) (actual time=0.310..0.310 rows=1 loops=1)
Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
Sort Method: quicksort Memory: 25kB
-> Nested Loop Left Join (cost=2.11..678.65 rows=1 width=18) (actual time=0.181..0.300 rows=1 loops=1)
-> Nested Loop (cost=1.68..670.17 rows=1 width=19) (actual time=0.084..0.202 rows=1 loops=1)
Join Filter: ((d_2.walk <= d_1.walk) AND (d_2.lev <= d_1.lev) AND (d_1.walk < (d_2.walk + d_2.amount)))
-> Nested Loop (cost=0.55..126.94 rows=1 width=19) (actual time=0.028..0.143 rows=1 loops=1)
-> Index Scan using op_fields_op_as_name_key on op_fields f (cost=0.28..118.61 rows=1 width=11) (actual time=0.014..0.129 rows=1 loops=1)
Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
Filter: (id IS NOT NULL)
-> Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2 (cost=0.27..8.29 rows=1 width=16) (actual time=0.010..0.010 rows=1 loops=1)
Index Cond: (id = f.op)
-> Nested Loop (cost=1.13..542.97 rows=13 width=16) (actual time=0.054..0.057 rows=1 loops=1)
-> Nested Loop (cost=0.86..539.20 rows=13 width=12) (actual time=0.048..0.051 rows=1 loops=1)
-> Nested Loop Left Join (cost=0.42..429.35 rows=13 width=8) (actual time=0.036..0.037 rows=2 loops=1)
Join Filter: (u.num = 1)
-> Seq Scan on duals u (cost=0.00..48.25 rows=13 width=4) (actual time=0.017..0.018 rows=2 loops=1)
Filter: ((num >= 0) AND (num < 2))
-> Materialize (cost=0.42..343.56 rows=195 width=4) (actual time=0.009..0.009 rows=0 loops=2)
-> Index Only Scan using pk_d_b_cont_set_fix on d_b_cont_set_fix s (cost=0.42..342.59 rows=195 width=4) (actual time=0.014..0.014 rows=0 loops=1)
Index Cond: (doc = 37593)
Heap Fetches: 0
-> Index Scan using pk_od_docs on fs_od_docs d (cost=0.43..8.45 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=2)
Index Cond: (id = CASE WHEN (u.num = 0) THEN 37593 ELSE s.port_id END)
-> Index Scan using pk_od_doc_cats on fs_od_doc_cats d_1 (cost=0.27..0.29 rows=1 width=12) (actual time=0.005..0.005 rows=1 loops=1)
Index Cond: (id = d.d_cat)
-> Index Scan using uk_od_options on od_options vf (cost=0.42..8.44 rows=1 width=11) (actual time=0.010..0.010 rows=0 loops=1)
Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.198 ms
Execution time: 0.461 ms
现在查询的速度提高了 400 倍。在同样的情况下,规划器使用索引扫描:
Index Scan using pk_od_docs on fs_od_docs d (cost=0.43..8.45 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=2)
Index Cond: (id = CASE WHEN (u.num = 0) THEN 37593 ELSE s.port_id END)
需要说明的是,OD_DOCS 是fs_od_docs 的视图。
所以问题是在这种情况下我如何帮助 PostgreSQL 规划器?
我正在使用最新的 PostgreSQL 10 安装。
----- 更新 1
这是PR_STR_TO_LIST
代码,通常它只是转换逗号分隔列表。
create or replace function PR_STR_TO_LIST (STR varchar(2000))
returns setof T_PR_STR_TO_LIST
as $body$
declare
P$ integer;
L$ integer;out_rec T_PR_STR_TO_LIST;
STR$ varchar(2000);
begin
STR$ := STR;
P$ := f_substr(',', STR$ );
L$ := f_strlen(STR$ );
while (P$>=0) loop
out_rec.VAL := f_AsInteger(f_left( STR$, P$ ));
L$ := L$-P$-1;
STR$ := f_right( STR$, L$ );
return next out_rec;
P$ := f_substr( ',', STR$ );
end loop;
if (L$>0) then
out_rec.VAL := f_AsInteger(STR$);
return next out_rec;
end if;
return;
end /*PR_STR_TO_LIST*/;
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 100;
create or replace function PR_STR_TO_LIST (
STR integer
) RETURNS setof T_PR_STR_TO_LIST
as $body$
declare
out_rec T_PR_STR_TO_LIST;
begin
out_rec.VAL := STR;
return next out_rec;
end
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 100;
----- 更新 2我根据@a_horse_with_no_name 的回答做了一些修改,现在使用内联 sql 查询:
select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from unnest(string_to_array(nullif(trim('37594'),''), ',')::int[]) as L
left join DUALS U on U.NUM >= 0 and U.NUM < 2
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=L
left join OD_DOCS D on D.ID = (case when U.NUM=0 then L else S.PORT_ID end)
...
D_B_CONT_SET_FIX 表也缺少一个索引,所以我修复了它。然而问题依然存在。执行计划:
Sort (cost=108502.17..108505.51 rows=1336 width=18) (actual time=51.349..51.349 rows=0 loops=1)
Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
Sort Method: quicksort Memory: 25kB
-> Nested Loop Left Join (cost=88804.42..108432.81 rows=1336 width=18) (actual time=51.344..51.344 rows=0 loops=1)
-> Hash Join (cost=88804.00..104611.85 rows=1336 width=19) (actual time=51.343..51.343 rows=0 loops=1)
Hash Cond: (CASE WHEN (u.num = 0) THEN l.l ELSE s.port_id END = d.id)
-> Nested Loop Left Join (cost=0.44..15494.33 rows=36071 width=12) (actual time=0.062..0.075 rows=2 loops=1)
Join Filter: (u.num = 1)
-> Nested Loop Left Join (cost=0.01..31.73 rows=200 width=8) (actual time=0.052..0.056 rows=2 loops=1)
-> Function Scan on unnest l (cost=0.01..10.01 rows=100 width=4) (actual time=0.022..0.023 rows=1 loops=1)
-> Materialize (cost=0.00..1.22 rows=2 width=4) (actual time=0.028..0.031 rows=2 loops=1)
-> Seq Scan on duals u (cost=0.00..1.21 rows=2 width=4) (actual time=0.021..0.023 rows=2 loops=1)
Filter: ((num >= 0) AND (num < 2))
-> Index Only Scan using pk_d_b_cont_set_fix on d_b_cont_set_fix s (cost=0.42..40.31 rows=361 width=8) (actual time=0.007..0.007 rows=0 loops=2)
Index Cond: (doc = l.l)
Heap Fetches: 0
-> Hash (cost=79151.45..79151.45 rows=94167 width=15) (actual time=50.640..50.640 rows=97361 loops=1)
Buckets: 131072 Batches: 1 Memory Usage: 5588kB
-> Nested Loop (cost=1.26..79151.45 rows=94167 width=15) (actual time=0.186..29.367 rows=97361 loops=1)
-> Nested Loop (cost=0.83..169.92 rows=15 width=15) (actual time=0.169..0.347 rows=1 loops=1)
Join Filter: ((d_2.walk <= d_1.walk) AND (d_2.lev <= d_1.lev) AND (d_1.walk < (d_2.walk + d_2.amount)))
Rows Removed by Join Filter: 407
-> Index Scan using pk_od_doc_cats on fs_od_doc_cats d_1 (cost=0.27..63.62 rows=408 width=12) (actual time=0.018..0.082 rows=408 loops=1)
-> Materialize (cost=0.55..60.40 rows=1 width=19) (actual time=0.000..0.000 rows=1 loops=408)
-> Nested Loop (cost=0.55..60.39 rows=1 width=19) (actual time=0.030..0.137 rows=1 loops=1)
-> Index Scan using op_fields_op_as_name_key on op_fields f (cost=0.28..57.80 rows=1 width=11) (actual time=0.015..0.123 rows=1 loops=1)
Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
Filter: (id IS NOT NULL)
-> Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2 (cost=0.27..2.58 rows=1 width=16) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (id = f.op)
-> Index Scan using x2x_od_docs on fs_od_docs d (cost=0.43..2797.04 rows=24684 width=8) (actual time=0.016..20.450 rows=97361 loops=1)
Index Cond: (d_cat = d_1.id)
-> Index Scan using uk_od_options on od_options vf (cost=0.42..2.74 rows=1 width=11) (never executed)
Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.686 ms
Execution time: 53.637 ms
至于现在,这是解决方案:
create or replace function PR_STR_TO_LIST (STR varchar(2000))
returns table (val integer)
as $body$
begin
RETURN QUERY select unnest(string_to_array(nullif(trim(STR),''), ',')::int[]);
end;
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 5 rows 10;
为什么它起作用而内联不起作用 - 一个谜。