Ice2burn提出的问题 -dba

Ice2burn

Asked: 2019-10-08 01:39:35 +0800 CST

PostgreSQL 规划器不使用 PK 索引

我有一个问题：

select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from PR_STR_TO_LIST( '37593' ) L                    
left join DUALS U on U.NUM >= 0 and U.NUM < 2      
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=L.VAL
left join OD_DOCS D on D.ID = (case when U.NUM=0 then L.VAL else S.PORT_ID end)
left join OD_DOC_CATS C  on C.ID=D.D_CAT
left join OD_DOC_CATS CC on CC.WALK<=C.WALK and C.WALK<(CC.WALK+CC.AMOUNT) and CC.LEV<=C.LEV
left join OP_FIELDS F on F.OP=CC.ID and F.AS_NAME='DESC_ESTIM'
left join OD_OPTIONS VF on VF.OBJECT=D.ID and VF.DESCR=F.ID and VF.B_DATE<='20180715' and VF.E_DATE>'20180715'
where F.ID is not null
order by 3 desc, U.NUM, CC.LEV desc

功能PR_STR_TO_LIST( '37593' )是IMMUTABLE PARALLEL SAFE和return setof T_PR_STR_TO_LIST：

create type T_PR_STR_TO_LIST as (
  VAL integer
);

查询很慢，这是执行计划：

Sort  (cost=193656.73..193658.89 rows=866 width=18) (actual time=184.363..184.363 rows=1 loops=1)
  Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
  Sort Method: quicksort  Memory: 25kB
  ->  Nested Loop Left Join  (cost=104216.04..193614.47 rows=866 width=18) (actual time=184.351..184.354 rows=1 loops=1)
        ->  Hash Join  (cost=104215.62..186270.79 rows=866 width=19) (actual time=184.306..184.309 rows=1 loops=1)
              Hash Cond: (CASE WHEN (u.num = 0) THEN l.val ELSE s.port_id END = d.id)
              ->  Merge Left Join  (cost=84933.39..166862.95 rows=23390 width=12) (actual time=133.823..133.824 rows=2 loops=1)
                    Merge Cond: (l.val = s.doc)
                    Join Filter: (u.num = 1)
                    ->  Sort  (cost=1109.34..1141.84 rows=13000 width=8) (actual time=0.162..0.163 rows=2 loops=1)
                          Sort Key: l.val
                          Sort Method: quicksort  Memory: 25kB
                          ->  Nested Loop Left Join  (cost=0.25..221.03 rows=13000 width=8) (actual time=0.149..0.150 rows=2 loops=1)
                                ->  Function Scan on pr_str_to_list l  (cost=0.25..10.25 rows=1000 width=4) (actual time=0.117..0.117 rows=1 loops=1)
                                ->  Materialize  (cost=0.00..48.31 rows=13 width=4) (actual time=0.030..0.031 rows=2 loops=1)
                                      ->  Seq Scan on duals u  (cost=0.00..48.25 rows=13 width=4) (actual time=0.021..0.021 rows=2 loops=1)
                                            Filter: ((num >= 0) AND (num < 2))
                    ->  Sort  (cost=83824.05..85695.24 rows=748476 width=8) (actual time=133.656..133.656 rows=1 loops=1)
                          Sort Key: s.doc
                          Sort Method: quicksort  Memory: 56931kB
                          ->  Seq Scan on d_b_cont_set_fix s  (cost=0.00..10796.76 rows=748476 width=8) (actual time=0.012..40.324 rows=748476 loops=1)
              ->  Hash  (cost=18106.86..18106.86 rows=94030 width=15) (actual time=49.902..49.902 rows=97361 loops=1)
                    Buckets: 131072  Batches: 1  Memory Usage: 5588kB
                    ->  Nested Loop  (cost=1.26..18106.86 rows=94030 width=15) (actual time=0.054..28.764 rows=97361 loops=1)
                          ->  Nested Loop  (cost=0.83..128.58 rows=15 width=15) (actual time=0.036..0.134 rows=1 loops=1)
                                ->  Nested Loop  (cost=0.55..126.94 rows=1 width=19) (actual time=0.020..0.118 rows=1 loops=1)
                                      ->  Index Scan using op_fields_op_as_name_key on op_fields f  (cost=0.28..118.61 rows=1 width=11) (actual time=0.013..0.110 rows=1 loops=1)
                                            Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
                                            Filter: (id IS NOT NULL)
                                      ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2  (cost=0.27..8.29 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=1)
                                            Index Cond: (id = f.op)
                                ->  Index Scan using od_ix_doc_cats_walk on fs_od_doc_cats d_1  (cost=0.28..1.48 rows=15 width=12) (actual time=0.014..0.015 rows=1 loops=1)
                                      Index Cond: ((d_2.walk <= walk) AND (walk < (d_2.walk + d_2.amount)))
                                      Filter: (d_2.lev <= lev)
                          ->  Index Scan using x2x_od_docs on fs_od_docs d  (cost=0.43..952.06 rows=24649 width=8) (actual time=0.017..20.391 rows=97361 loops=1)
                                Index Cond: (d_cat = d_1.id)
        ->  Index Scan using uk_od_options on od_options vf  (cost=0.42..8.44 rows=1 width=11) (actual time=0.023..0.023 rows=0 loops=1)
              Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
              Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.592 ms
Execution time: 196.785 ms

问题出在一行case语句中。它使散列连接：

Hash Join  (cost=104215.62..186270.79 rows=866 width=19) (actual time=184.306..184.309 rows=1 loops=1)
Hash Cond: (CASE WHEN (u.num = 0) THEN l.val ELSE s.port_id END = d.id)

但是它应该是索引扫描。Planner 忽略 OD_DOCS.ID PK 列索引。U.NUM、L.VAL、S.PORT_ID 列也是如此int。

为了检查我做了一些实验。对于我的示例，我可以摆脱 PR_STR_TO_LIST 函数并将其结果替换为整数常量，查询可能会像这样简化：

select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from DUALS U        
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=37593
left join OD_DOCS D on D.ID=(case when U.NUM=0 then 37593 else S.PORT_ID end)
left join OD_DOC_CATS C  on C.ID=D.D_CAT
left join OD_DOC_CATS CC on CC.WALK<=C.WALK and C.WALK<(CC.WALK+CC.AMOUNT) and CC.LEV<=C.LEV
left join OP_FIELDS F on F.OP=CC.ID and F.AS_NAME='DESC_ESTIM'
left join OD_OPTIONS VF on VF.OBJECT=D.ID and VF.DESCR=F.ID and VF.B_DATE<='20180715' and VF.E_DATE>'20180715'
where U.NUM >= 0 and U.NUM < 2 and F.ID is not null
order by 3 desc, U.NUM, CC.LEV desc

计划：

Sort  (cost=678.66..678.67 rows=1 width=18) (actual time=0.310..0.310 rows=1 loops=1)
  Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
  Sort Method: quicksort  Memory: 25kB
  ->  Nested Loop Left Join  (cost=2.11..678.65 rows=1 width=18) (actual time=0.181..0.300 rows=1 loops=1)
        ->  Nested Loop  (cost=1.68..670.17 rows=1 width=19) (actual time=0.084..0.202 rows=1 loops=1)
              Join Filter: ((d_2.walk <= d_1.walk) AND (d_2.lev <= d_1.lev) AND (d_1.walk < (d_2.walk + d_2.amount)))
              ->  Nested Loop  (cost=0.55..126.94 rows=1 width=19) (actual time=0.028..0.143 rows=1 loops=1)
                    ->  Index Scan using op_fields_op_as_name_key on op_fields f  (cost=0.28..118.61 rows=1 width=11) (actual time=0.014..0.129 rows=1 loops=1)
                          Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
                          Filter: (id IS NOT NULL)
                    ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2  (cost=0.27..8.29 rows=1 width=16) (actual time=0.010..0.010 rows=1 loops=1)
                          Index Cond: (id = f.op)
              ->  Nested Loop  (cost=1.13..542.97 rows=13 width=16) (actual time=0.054..0.057 rows=1 loops=1)
                    ->  Nested Loop  (cost=0.86..539.20 rows=13 width=12) (actual time=0.048..0.051 rows=1 loops=1)
                          ->  Nested Loop Left Join  (cost=0.42..429.35 rows=13 width=8) (actual time=0.036..0.037 rows=2 loops=1)
                                Join Filter: (u.num = 1)
                                ->  Seq Scan on duals u  (cost=0.00..48.25 rows=13 width=4) (actual time=0.017..0.018 rows=2 loops=1)
                                      Filter: ((num >= 0) AND (num < 2))
                                ->  Materialize  (cost=0.42..343.56 rows=195 width=4) (actual time=0.009..0.009 rows=0 loops=2)
                                      ->  Index Only Scan using pk_d_b_cont_set_fix on d_b_cont_set_fix s  (cost=0.42..342.59 rows=195 width=4) (actual time=0.014..0.014 rows=0 loops=1)
                                            Index Cond: (doc = 37593)
                                            Heap Fetches: 0
                          ->  Index Scan using pk_od_docs on fs_od_docs d  (cost=0.43..8.45 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=2)
                                Index Cond: (id = CASE WHEN (u.num = 0) THEN 37593 ELSE s.port_id END)
                    ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_1  (cost=0.27..0.29 rows=1 width=12) (actual time=0.005..0.005 rows=1 loops=1)
                          Index Cond: (id = d.d_cat)
        ->  Index Scan using uk_od_options on od_options vf  (cost=0.42..8.44 rows=1 width=11) (actual time=0.010..0.010 rows=0 loops=1)
              Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
              Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.198 ms
Execution time: 0.461 ms

现在查询的速度提高了 400 倍。在同样的情况下，规划器使用索引扫描：

Index Scan using pk_od_docs on fs_od_docs d  (cost=0.43..8.45 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=2)
Index Cond: (id = CASE WHEN (u.num = 0) THEN 37593 ELSE s.port_id END)

需要说明的是，OD_DOCS 是fs_od_docs 的视图。

所以问题是在这种情况下我如何帮助 PostgreSQL 规划器？

我正在使用最新的 PostgreSQL 10 安装。

----- 更新 1 这是PR_STR_TO_LIST代码，通常它只是转换逗号分隔列表。

create or replace function PR_STR_TO_LIST (STR varchar(2000))
returns setof T_PR_STR_TO_LIST
as $body$
declare
    P$  integer;
    L$  integer;out_rec T_PR_STR_TO_LIST;
    STR$ varchar(2000);
begin
    STR$ := STR;
    P$ := f_substr(',', STR$ );
    L$ := f_strlen(STR$ );
    while (P$>=0) loop
        out_rec.VAL := f_AsInteger(f_left( STR$, P$ ));
        L$ := L$-P$-1;
        STR$ := f_right( STR$, L$ );
        return next out_rec;
        P$ := f_substr( ',', STR$ );
    end loop;
    if (L$>0) then
        out_rec.VAL := f_AsInteger(STR$);
        return next out_rec;
    end if;
    return;
end /*PR_STR_TO_LIST*/;
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 100;

create or replace function PR_STR_TO_LIST (
    STR integer
) RETURNS setof T_PR_STR_TO_LIST
as $body$
declare
  out_rec T_PR_STR_TO_LIST;
begin
  out_rec.VAL := STR;
  return next out_rec;
end
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 100;

----- 更新 2我根据@a_horse_with_no_name 的回答做了一些修改，现在使用内联 sql 查询：

select F.DVALUE as f1, VF.VAL as f2, f_strlen( VF.VAL ) as f3
from unnest(string_to_array(nullif(trim('37594'),''), ',')::int[]) as L         
left join DUALS U on U.NUM >= 0 and U.NUM < 2      
left join D_B_CONT_SET_FIX S on U.NUM=1 and S.DOC=L
left join OD_DOCS D on D.ID = (case when U.NUM=0 then L else S.PORT_ID end)
...

D_B_CONT_SET_FIX 表也缺少一个索引，所以我修复了它。然而问题依然存在。执行计划：

Sort  (cost=108502.17..108505.51 rows=1336 width=18) (actual time=51.349..51.349 rows=0 loops=1)
  Sort Key: (f_strlen(vf.val)) DESC, u.num, d_2.lev DESC
  Sort Method: quicksort  Memory: 25kB
  ->  Nested Loop Left Join  (cost=88804.42..108432.81 rows=1336 width=18) (actual time=51.344..51.344 rows=0 loops=1)
        ->  Hash Join  (cost=88804.00..104611.85 rows=1336 width=19) (actual time=51.343..51.343 rows=0 loops=1)
              Hash Cond: (CASE WHEN (u.num = 0) THEN l.l ELSE s.port_id END = d.id)
              ->  Nested Loop Left Join  (cost=0.44..15494.33 rows=36071 width=12) (actual time=0.062..0.075 rows=2 loops=1)
                    Join Filter: (u.num = 1)
                    ->  Nested Loop Left Join  (cost=0.01..31.73 rows=200 width=8) (actual time=0.052..0.056 rows=2 loops=1)
                          ->  Function Scan on unnest l  (cost=0.01..10.01 rows=100 width=4) (actual time=0.022..0.023 rows=1 loops=1)
                          ->  Materialize  (cost=0.00..1.22 rows=2 width=4) (actual time=0.028..0.031 rows=2 loops=1)
                                ->  Seq Scan on duals u  (cost=0.00..1.21 rows=2 width=4) (actual time=0.021..0.023 rows=2 loops=1)
                                      Filter: ((num >= 0) AND (num < 2))
                    ->  Index Only Scan using pk_d_b_cont_set_fix on d_b_cont_set_fix s  (cost=0.42..40.31 rows=361 width=8) (actual time=0.007..0.007 rows=0 loops=2)
                          Index Cond: (doc = l.l)
                          Heap Fetches: 0
              ->  Hash  (cost=79151.45..79151.45 rows=94167 width=15) (actual time=50.640..50.640 rows=97361 loops=1)
                    Buckets: 131072  Batches: 1  Memory Usage: 5588kB
                    ->  Nested Loop  (cost=1.26..79151.45 rows=94167 width=15) (actual time=0.186..29.367 rows=97361 loops=1)
                          ->  Nested Loop  (cost=0.83..169.92 rows=15 width=15) (actual time=0.169..0.347 rows=1 loops=1)
                                Join Filter: ((d_2.walk <= d_1.walk) AND (d_2.lev <= d_1.lev) AND (d_1.walk < (d_2.walk + d_2.amount)))
                                Rows Removed by Join Filter: 407
                                ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_1  (cost=0.27..63.62 rows=408 width=12) (actual time=0.018..0.082 rows=408 loops=1)
                                ->  Materialize  (cost=0.55..60.40 rows=1 width=19) (actual time=0.000..0.000 rows=1 loops=408)
                                      ->  Nested Loop  (cost=0.55..60.39 rows=1 width=19) (actual time=0.030..0.137 rows=1 loops=1)
                                            ->  Index Scan using op_fields_op_as_name_key on op_fields f  (cost=0.28..57.80 rows=1 width=11) (actual time=0.015..0.123 rows=1 loops=1)
                                                  Index Cond: ((as_name)::text = 'DESC_ESTIM'::text)
                                                  Filter: (id IS NOT NULL)
                                            ->  Index Scan using pk_od_doc_cats on fs_od_doc_cats d_2  (cost=0.27..2.58 rows=1 width=16) (actual time=0.011..0.011 rows=1 loops=1)
                                                  Index Cond: (id = f.op)
                          ->  Index Scan using x2x_od_docs on fs_od_docs d  (cost=0.43..2797.04 rows=24684 width=8) (actual time=0.016..20.450 rows=97361 loops=1)
                                Index Cond: (d_cat = d_1.id)
        ->  Index Scan using uk_od_options on od_options vf  (cost=0.42..2.74 rows=1 width=11) (never executed)
              Index Cond: ((descr = f.id) AND (object = d.id) AND (b_date <= '2018-07-15 00:00:00'::timestamp without time zone))
              Filter: (e_date > '2018-07-15 00:00:00'::timestamp without time zone)
Planning time: 2.686 ms
Execution time: 53.637 ms

至于现在，这是解决方案：

create or replace function PR_STR_TO_LIST (STR varchar(2000))
returns table (val integer)
as $body$
begin
    RETURN QUERY select unnest(string_to_array(nullif(trim(STR),''), ',')::int[]);
end;
$body$ LANGUAGE 'plpgsql' PARALLEL SAFE IMMUTABLE CALLED ON NULL INPUT SECURITY INVOKER COST 5 rows 10;

为什么它起作用而内联不起作用 - 一个谜。

Ice2burn

Asked: 2016-08-16 01:48:07 +0800 CST

如何为 PostgreSQL 调试器设置默认模式？

我有 3 个模式：公共、元数据、数据。

在 PgAdmin III 或 SQL Manager for PostgreSQL 中从“数据”模式调试大型函数时，调试器看不到来自同一模式的任何表或函数。

raise notice原因函数很大并且依赖于其他函数，使用或编写模式前缀无效。

如何设置默认架构以正确调试它？我猜公共架构是默认的。

PostgreSQL 规划器不使用 PK 索引

如何为 PostgreSQL 调试器设置默认模式？

连接到 PostgreSQL 服务器：致命：主机没有 pg_hba.conf 条目

如何让sqlplus的输出出现在一行中？

选择具有最大日期或最晚日期的日期

如何列出 PostgreSQL 中的所有模式？

列出指定表的所有列

如何在不修改我自己的 tnsnames.ora 的情况下使用 sqlplus 连接到位于另一台主机上的 Oracle 数据库

你如何mysqldump特定的表？

使用 psql 列出数据库权限

如何从 PostgreSQL 中的选择查询中将值插入表中？

如何使用 psql 列出所有数据库和表？

Ice2burn's questions