Question asked by bstovall - dba

Asked: 2023-08-08 05:19:11 +0800 CST

Increasing LIMIT causes performance degradation

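This looks like a variation of a common problem, where a small change to the LIMIT clause of a query flips the query plan to one that performs extremely badly. In this case, I have two tables in a PostgreSQL 13 database:

assets has ~140 rows;
asset_data has roughly 141 million rows.

My query is as follows: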

SELECT
    a.dp as dp,
    a.at as at,
    a.dt as dt,
    a.it as it,
    a.ai as ai,
    ad.idt as idt,
    ad.data as data
FROM assets a
JOIN asset_data ad ON a.id = ad.asset_id_fk
WHERE 
    a.dp = 'kr' 
    AND a.at = 'fr' 
    AND a.dt = 'oh' 
    AND a.it = '1m' 
    AND a.ai = 'st'
ORDER BY idt desc
LIMIT 8000

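There is an index on idt.

With LIMIT 8000 the plan simply uses the idt index for ordering; with LIMIT 9000 it sorts after the join.

The net result is that the run time goes from about 3 seconds to nearly 12 minutes.

After reading several questions of this kind I tried a VACUUM ANALYZE, which changed the query plan, but not in any meaningful way.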

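Update:

I also tried setting the statistics target for the idt and asset_id_fk columns to 1000, but that did not help, and it is not obvious to me that it should have.
idt by itself is not unique
idt + asset_id_fk is unique and has a corresponding constraint
The assets table returns only a single row
There is an index on the combination of asset_id_fk and idt
There is a separate index on idt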
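
For reference, the statistics change mentioned in the first point above would look roughly like this. This is a sketch reconstructed from the description, not the exact statements that were run; the only values taken from the post are the column names and the target of 1000.

```sql
-- Raise the per-column statistics target to 1000 (the PostgreSQL default is 100).
ALTER TABLE public.asset_data ALTER COLUMN idt SET STATISTICS 1000;
ALTER TABLE public.asset_data ALTER COLUMN asset_id_fk SET STATISTICS 1000;

-- The new target only takes effect once statistics are gathered again.
ANALYZE public.asset_data;
```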

Any suggestions on an appropriate way to address this?

Create table statement:

CREATE TABLE IF NOT EXISTS public.asset_data
(
    id integer NOT NULL GENERATED BY DEFAULT AS IDENTITY ( INCREMENT 1 START 1 MINVALUE 1 MAXVALUE 2147483647 CACHE 1 ),
    dp character varying(40) COLLATE pg_catalog."default",
    at character varying(40) COLLATE pg_catalog."default",
    it character varying(10) COLLATE pg_catalog."default",
    ai character varying(40) COLLATE pg_catalog."default",
    idt timestamp without time zone NOT NULL,
    data jsonb NOT NULL,
    inserted timestamp without time zone DEFAULT now(),
    updated timestamp without time zone DEFAULT now(),
    dt character varying(20) COLLATE pg_catalog."default",
    asset_id_fk integer NOT NULL DEFAULT 1,
    CONSTRAINT asset_data_pkey PRIMARY KEY (id),
    CONSTRAINT asset_data_1_idx UNIQUE (dp, at, it, idt, dt, ai),
    CONSTRAINT idx_asset_data_2_unique UNIQUE (asset_id_fk, idt)
);

CREATE INDEX IF NOT EXISTS asset_data_it_aif_idx
    ON public.asset_data (idt, asset_id_fk);

CREATE INDEX IF NOT EXISTS idx_asset_data_asset_fk
    ON public.asset_data (asset_id_fk);

CREATE INDEX IF NOT EXISTS idx_asset_data_asset_fk_idt_index
    ON public.asset_data (asset_id_fk, idt);

CREATE INDEX IF NOT EXISTS idx_asset_data_idt_idx
    ON public.asset_data (idt);

EXPLAIN ANALYZE with LIMIT 8000:

Limit  (cost=0.57..4170458.87 rows=8000 width=173) (actual time=0.160..3677.025 rows=8000 loops=1)
  Buffers: shared hit=61448 read=14657 dirtied=756
  ->  Nested Loop  (cost=0.57..507975375.06 rows=974426 width=173) (actual time=0.159..3675.290 rows=8000 loops=1)
        Join Filter: (a.id = ad.asset_id_fk)
        Rows Removed by Join Filter: 474053
        Buffers: shared hit=61448 read=14657 dirtied=756
        ->  Index Scan Backward using idx_asset_data_idt_idx on asset_data ad  (cost=0.57..505855992.43 rows=141291824 width=146) (actual time=0.051..3437.070 rows=482053 loops=1)
              Buffers: shared hit=61446 read=14657 dirtied=756
        ->  Materialize  (cost=0.00..5.27 rows=1 width=35) (actual time=0.000..0.000 rows=1 loops=482053)
              Buffers: shared hit=2
              ->  Seq Scan on assets a  (cost=0.00..5.26 rows=1 width=35) (actual time=0.052..0.067 rows=1 loops=1)
                    Filter: (((dp)::text = 'kr'::text) AND ((at)::text = 'fr'::text) AND ((dt)::text = 'oh'::text) AND ((it)::text = '1m'::text) AND ((ai)::text = 'st'::text))
                    Rows Removed by Filter: 144
                    Buffers: shared hit=2
Settings: effective_cache_size = '1507160kB'
Planning:
  Buffers: shared hit=4
Planning Time: 0.445 ms
Execution Time: 3679.005 ms

EXPLAIN ANALYZE with LIMIT 9000:

Limit  (cost=4269756.79..4269779.29 rows=9000 width=173) (actual time=700091.606..700094.588 rows=9000 loops=1)
  Buffers: shared hit=133 read=1538340, temp read=74205 written=112013
  ->  Sort  (cost=4269756.79..4272192.85 rows=974426 width=173) (actual time=700091.604..700093.738 rows=9000 loops=1)
        Sort Key: ad.idt DESC
        Sort Method: external merge  Disk: 304680kB
        Buffers: shared hit=133 read=1538340, temp read=74205 written=112013
        ->  Nested Loop  (cost=0.57..4200885.77 rows=974426 width=173) (actual time=1.190..693283.441 rows=1687735 loops=1)
              Buffers: shared hit=133 read=1538340
              ->  Seq Scan on assets a  (cost=0.00..5.26 rows=1 width=35) (actual time=0.032..0.050 rows=1 loops=1)
                    Filter: (((dp)::text = 'kr'::text) AND ((at)::text = 'fr'::text) AND ((dt)::text = 'oh'::text) AND ((it)::text = '1m'::text) AND ((ai)::text = 'st'::text))
                    Rows Removed by Filter: 144
                    Buffers: shared hit=2
              ->  Index Scan using idx_asset_data_asset_fk on asset_data ad  (cost=0.57..4190011.91 rows=1086860 width=146) (actual time=1.151..691172.001 rows=1687735 loops=1)
                    Index Cond: (asset_id_fk = a.id)
                    Buffers: shared hit=131 read=1538340
Settings: effective_cache_size = '1507160kB'
Planning:
  Buffers: shared hit=4
Planning Time: 0.317 ms
Execution Time: 700245.659 ms
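
The switch between the two plans above can be reproduced from the planner's own estimates. The sketch below assumes the usual costing of a Limit node over an incrementally produced result (startup cost plus a proportional share of the remaining cost), and it assumes the sort-based plan costs about the same at both limits. All numbers are copied from the two plans; the CTE and column names are only illustrative.

```sql
-- Estimated cost of the Limit node over the backward index scan plan:
--   startup_cost + (total_cost - startup_cost) * limit / estimated_rows
-- Inputs come from the Nested Loop line of the LIMIT 8000 plan.
WITH nested_loop AS (
    SELECT 0.57::numeric         AS startup_cost,
           507975375.06::numeric AS total_cost,
           974426::numeric       AS estimated_rows
)
SELECT lim AS limit_rows,
       round(startup_cost
             + (total_cost - startup_cost) * lim / estimated_rows, 2)
           AS index_scan_plan_cost,
       4269779.29 AS sort_plan_cost  -- total cost of the LIMIT 9000 plan
FROM nested_loop
CROSS JOIN (VALUES (8000), (9000)) AS v(lim)
ORDER BY lim;
```

For LIMIT 8000 this gives the 4170458.87 shown in the first plan, which is below the sort plan's estimate. For LIMIT 9000 the same formula gives roughly 4.69 million, which is above it, so the planner prefers the join-then-sort plan even though it is far slower in practice.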

Update 2:

After switching to the following query:

SELECT a.dp, a.at, a.dt, a.it, a.ai, ad.idt, ad.data
FROM  (
   SELECT a.id, a.dp, a.at, a.dt, a.it, a.ai
   FROM   assets a
   WHERE  a.dp = 'kr' 
   AND    a.at = 'fr' 
   AND    a.dt = 'oh' 
   AND    a.it = '1m' 
   AND    a.ai = 'st'
   LIMIT  1  -- make sure the planner understands
   ) a
JOIN   asset_data ad ON ad.asset_id_fk = a.id
ORDER  BY ad.idt DESC
LIMIT  8000;

The only difference is that the query plan now switches between 9000 and 10000:

Limit  (cost=0.57..4206410.37 rows=9000 width=173) (actual time=0.139..432.265 rows=9000 loops=1)
  Buffers: shared hit=82159
  ->  Nested Loop  (cost=0.57..507975395.07 rows=1086860 width=173) (actual time=0.138..431.358 rows=9000 loops=1)
        Join Filter: (a.id = ad.asset_id_fk)
        Rows Removed by Join Filter: 533354
        Buffers: shared hit=82159
        ->  Index Scan Backward using idx_asset_data_idt_idx on asset_data ad  (cost=0.57..505856012.43 rows=141291824 width=146) (actual time=0.049..164.873 rows=542354 loops=1)
              Buffers: shared hit=82158
        ->  Materialize  (cost=0.00..5.28 rows=1 width=35) (actual time=0.000..0.000 rows=1 loops=542354)
              Buffers: shared hit=1
              ->  Subquery Scan on a  (cost=0.00..5.27 rows=1 width=35) (actual time=0.032..0.035 rows=1 loops=1)
                    Buffers: shared hit=1
                    ->  Limit  (cost=0.00..5.26 rows=1 width=35) (actual time=0.032..0.033 rows=1 loops=1)
                          Buffers: shared hit=1
                          ->  Seq Scan on assets a_1  (cost=0.00..5.26 rows=1 width=35) (actual time=0.031..0.032 rows=1 loops=1)
                                Filter: (((dp)::text = 'kr'::text) AND ((at)::text = 'fr'::text) AND ((dt)::text = 'oh'::text) AND ((it)::text = '1m'::text) AND ((ai)::text = 'st'::text))
                                Rows Removed by Filter: 96
                                Buffers: shared hit=1
Settings: effective_cache_size = '1507160kB'
Planning Time: 0.322 ms
Execution Time: 432.904 ms

Limit  (cost=4278529.50..4278554.50 rows=10000 width=173) (actual time=702389.569..702392.909 rows=10000 loops=1)
  Buffers: shared hit=350 read=1538221, temp read=74241 written=112019
  ->  Sort  (cost=4278529.50..4281246.65 rows=1086860 width=173) (actual time=702389.568..702392.118 rows=10000 loops=1)
        Sort Key: ad.idt DESC
        Sort Method: external merge  Disk: 304704kB
        Buffers: shared hit=350 read=1538221, temp read=74241 written=112019
        ->  Nested Loop  (cost=0.57..4200885.78 rows=1086860 width=173) (actual time=1.267..695545.975 rows=1687836 loops=1)
              Buffers: shared hit=350 read=1538221
              ->  Limit  (cost=0.00..5.26 rows=1 width=35) (actual time=0.071..0.074 rows=1 loops=1)
                    Buffers: shared hit=1
                    ->  Seq Scan on assets a  (cost=0.00..5.26 rows=1 width=35) (actual time=0.071..0.071 rows=1 loops=1)
                          Filter: (((dp)::text = 'kr'::text) AND ((at)::text = 'fr'::text) AND ((dt)::text = 'oh'::text) AND ((it)::text = '1m'::text) AND ((ai)::text = 'st'::text))
                          Rows Removed by Filter: 96
                          Buffers: shared hit=1
              ->  Index Scan using idx_asset_data_asset_fk on asset_data ad  (cost=0.57..4190011.91 rows=1086860 width=146) (actual time=1.190..693414.817 rows=1687836 loops=1)
                    Index Cond: (asset_id_fk = a.id)
                    Buffers: shared hit=349 read=1538221
Settings: effective_cache_size = '1507160kB'
Planning Time: 0.301 ms
Execution Time: 702526.728 ms
