AskOverflow.Dev

AskOverflow.Dev Logo AskOverflow.Dev Logo

AskOverflow.Dev Navigation

  • 主页
  • 系统&网络
  • Ubuntu
  • Unix
  • DBA
  • Computer
  • Coding
  • LangChain

Mobile menu

Close
  • 主页
  • 系统&网络
    • 最新
    • 热门
    • 标签
  • Ubuntu
    • 最新
    • 热门
    • 标签
  • Unix
    • 最新
    • 标签
  • DBA
    • 最新
    • 标签
  • Computer
    • 最新
    • 标签
  • Coding
    • 最新
    • 标签
主页 / user-234213

Lucas03's questions

Martin Hope
Lucas03
Asked: 2022-06-10 01:15:52 +0800 CST

优化 PostgreSQL 上 SQL 查询的过滤器

  • 0

我有一个过滤结果成本很高的查询,我想我应该添加一个索引来优化计划,但我到目前为止尝试的索引没有任何影响。我可以通过为过滤列添加复合索引来优化查询吗?这是计划:

Limit  (cost=3069.33..14926.59 rows=4 width=509) (actual time=258424.190..258424.197 rows=4 loops=1)
  InitPlan 1 (returns $0)
    ->  HashAggregate  (cost=82.19..82.99 rows=80 width=8) (actual time=1320.215..1320.535 rows=2045 loops=1)
          Group Key: booking_passengers.bid
          Batches: 1  Memory Usage: 257kB
          ->  Index Scan using idx_booking_passengers_user_id on booking_passengers  (cost=0.44..81.99 rows=80 width=8) (actual time=10.687..1314.519 rows=2045 loops=1)
                Index Cond: ((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text)
  ->  Incremental Sort  (cost=2986.35..18414332.62 rows=6211 width=509) (actual time=258424.188..258424.189 rows=4 loops=1)
"        Sort Key: booking_data.last_segment_arrival_at DESC, booking_data.bid"
        Presorted Key: booking_data.last_segment_arrival_at
        Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
        ->  Index Scan Backward using idx_booking_data_last_segment_arrival_at on booking_data  (cost=0.44..18414054.67 rows=6211 width=509) (actual time=48419.376..258424.093 rows=5 loops=1)
              Index Cond: (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone)
              Filter: ((is_deleted IS FALSE) AND (bid >= 1100000) AND (((confirmation_sent IS TRUE) AND ((final_status)::text <> 'refunded'::text)) OR ((final_status)::text = 'confirmed'::text) OR ((confirmation_sent IS FALSE) AND ((final_status)::text = 'closed'::text))) AND (((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) OR (bid = ANY ($0))))
              Rows Removed by Filter: 2315888
Planning Time: 2.132 ms
Execution Time: 258424.387 ms

这是查询:

explain analyze
    SELECT *
      FROM booking_data
      WHERE booking_data.bid >= 1100000
        AND booking_data.is_deleted IS false
        AND booking_data.last_segment_arrival_at < '2022-06-13 13:36'
        AND (booking_data.user_id = 'NJ8QigsGcQCDOttoGsD3iS'
                 OR booking_data.bid = ANY (CAST(array((
                     SELECT DISTINCT booking_passengers.bid AS anon_2
                     FROM booking_passengers
                     WHERE booking_passengers.user_id = 'NJ8QigsGcQCDOttoGsD3iS')) AS BIGINT[]))
            )
        AND (booking_data.confirmation_sent IS true
                 AND booking_data.final_status != 'refunded'
                 OR booking_data.final_status = 'confirmed'
                 OR booking_data.confirmation_sent IS false
                        AND booking_data.final_status IN ('closed')
            )
      ORDER BY booking_data.last_segment_arrival_at DESC, booking_data.bid ASC
      LIMIT 4 OFFSET 0

booking_data 表上的当前索引:

create index idx_booking_data_final_status on booking_data (final_status);
create index idx_booking_data_user_id on booking_data (user_id);
create index idx_booking_data_last_segment_arrival_at on booking_data (last_segment_arrival_at);
create index idx_booking_data_first_segment_arrival_at on booking_data (first_segment_arrival_at);
create index idx_booking_data_confirmed_at on booking_data (confirmed_at);
create index idx_booking_data_booked_email on booking_data (booked, email);
create index idx_booking_data_first_last_segment_bid_user_id on booking_data (first_segment_arrival_at, last_segment_arrival_at, bid, user_id);

我添加了索引:

CREATE index CONCURRENTLY idx_booking_data_user_id_last_segment_arrival_at on booking_data (user_id, last_segment_arrival_at);

它现在有关于暂存数据库的计划(具有生产数据的较弱实例)。这是计划:

Limit  (cost=13432.55..13432.56 rows=4 width=509) (actual time=11958.229..11958.235 rows=4 loops=1)
  InitPlan 1 (returns $0)
    ->  HashAggregate  (cost=82.19..82.99 rows=80 width=8) (actual time=2741.877..2742.215 rows=2053 loops=1)
          Group Key: booking_passengers.bid
          Batches: 1  Memory Usage: 257kB
          ->  Index Scan using idx_booking_passengers_user_id on booking_passengers  (cost=0.44..81.99 rows=80 width=8) (actual time=18.064..2734.284 rows=2053 loops=1)
                Index Cond: ((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text)
  ->  Sort  (cost=13349.57..13365.09 rows=6210 width=509) (actual time=11958.227..11958.230 rows=4 loops=1)
"        Sort Key: booking_data.last_segment_arrival_at DESC, booking_data.bid"
        Sort Method: top-N heapsort  Memory: 28kB
        ->  Bitmap Heap Scan on booking_data  (cost=195.64..13256.42 rows=6210 width=509) (actual time=3771.506..11952.815 rows=854 loops=1)
              Recheck Cond: ((((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone)) OR ((bid = ANY ($0)) AND (bid >= 1100000)))
              Filter: ((is_deleted IS FALSE) AND (bid >= 1100000) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone) AND (((confirmation_sent IS TRUE) AND ((final_status)::text <> 'refunded'::text)) OR ((final_status)::text = 'confirmed'::text) OR ((confirmation_sent IS FALSE) AND ((final_status)::text = 'closed'::text))))
              Rows Removed by Filter: 10202
              Heap Blocks: exact=10935
              ->  BitmapOr  (cost=195.64..195.64 rows=12634 width=0) (actual time=3718.959..3718.961 rows=0 loops=1)
                    ->  Bitmap Index Scan on idx_booking_data_user_id_last_segment_arrival_at  (cost=0.00..176.81 rows=12625 width=0) (actual time=17.294..17.294 rows=11025 loops=1)
                          Index Cond: (((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone))
                    ->  Bitmap Index Scan on booking_data_pkey  (cost=0.00..15.72 rows=10 width=0) (actual time=3701.663..3701.663 rows=2062 loops=1)
                          Index Cond: ((bid = ANY ($0)) AND (bid >= 1100000))
Planning Time: 2.263 ms
Execution Time: 11958.434 ms

首次运行查询后,执行时间更快:

Limit  (cost=13432.55..13432.56 rows=4 width=509) (actual time=29.641..29.647 rows=4 loops=1)
  InitPlan 1 (returns $0)
    ->  HashAggregate  (cost=82.19..82.99 rows=80 width=8) (actual time=2.507..2.761 rows=2053 loops=1)
          Group Key: booking_passengers.bid
          Batches: 1  Memory Usage: 257kB
          ->  Index Scan using idx_booking_passengers_user_id on booking_passengers  (cost=0.44..81.99 rows=80 width=8) (actual time=0.021..1.664 rows=2053 loops=1)
                Index Cond: ((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text)
  ->  Sort  (cost=13349.57..13365.09 rows=6210 width=509) (actual time=29.640..29.643 rows=4 loops=1)
"        Sort Key: booking_data.last_segment_arrival_at DESC, booking_data.bid"
        Sort Method: top-N heapsort  Memory: 28kB
        ->  Bitmap Heap Scan on booking_data  (cost=195.64..13256.42 rows=6210 width=509) (actual time=11.942..28.832 rows=854 loops=1)
              Recheck Cond: ((((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone)) OR ((bid = ANY ($0)) AND (bid >= 1100000)))
              Filter: ((is_deleted IS FALSE) AND (bid >= 1100000) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone) AND (((confirmation_sent IS TRUE) AND ((final_status)::text <> 'refunded'::text)) OR ((final_status)::text = 'confirmed'::text) OR ((confirmation_sent IS FALSE) AND ((final_status)::text = 'closed'::text))))
              Rows Removed by Filter: 10202
              Heap Blocks: exact=10935
              ->  BitmapOr  (cost=195.64..195.64 rows=12634 width=0) (actual time=10.139..10.140 rows=0 loops=1)
                    ->  Bitmap Index Scan on idx_booking_data_user_id_last_segment_arrival_at  (cost=0.00..176.81 rows=12625 width=0) (actual time=2.024..2.024 rows=11025 loops=1)
                          Index Cond: (((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) AND (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone))
                    ->  Bitmap Index Scan on booking_data_pkey  (cost=0.00..15.72 rows=10 width=0) (actual time=8.113..8.113 rows=2062 loops=1)
                          Index Cond: ((bid = ANY ($0)) AND (bid >= 1100000))
Planning Time: 0.404 ms
Execution Time: 29.765 ms

在生产实例上,所有查询运行都很慢,即使它是更强大的实例(idx_booking_data_user_id_last_segment_arrival_at不使用索引):

Limit  (cost=523.03..2268.86 rows=4 width=509) (actual time=28549.479..28549.482 rows=4 loops=1)
  InitPlan 1 (returns $0)
    ->  HashAggregate  (cost=82.19..82.99 rows=80 width=8) (actual time=155.070..155.307 rows=2053 loops=1)
          Group Key: booking_passengers.bid
          Batches: 1  Memory Usage: 257kB
          ->  Index Scan using idx_booking_passengers_user_id on booking_passengers  (cost=0.44..81.99 rows=80 width=8) (actual time=0.414..153.733 rows=2053 loops=1)
                Index Cond: ((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text)
  ->  Incremental Sort  (cost=440.05..2710839.81 rows=6210 width=509) (actual time=28549.478..28549.479 rows=4 loops=1)
"        Sort Key: booking_data.last_segment_arrival_at DESC, booking_data.bid"
        Presorted Key: booking_data.last_segment_arrival_at
        Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
        ->  Index Scan Backward using idx_booking_data_last_segment_arrival_at on booking_data  (cost=0.44..2710561.90 rows=6210 width=509) (actual time=2034.195..28549.417 rows=5 loops=1)
              Index Cond: (last_segment_arrival_at < '2022-06-13 13:36:00+00'::timestamp with time zone)
              Filter: ((is_deleted IS FALSE) AND (bid >= 1100000) AND (((confirmation_sent IS TRUE) AND ((final_status)::text <> 'refunded'::text)) OR ((final_status)::text = 'confirmed'::text) OR ((confirmation_sent IS FALSE) AND ((final_status)::text = 'closed'::text))) AND (((user_id)::text = 'NJ8QigsGcQCDOttoGsD3iS'::text) OR (bid = ANY ($0))))
              Rows Removed by Filter: 2323153
Planning Time: 1.845 ms
Execution Time: 28549.694 ms

这是关于表格分析的答案吗?

SELECT schemaname, relname, last_analyze FROM pg_stat_all_tables WHERE relname = 'booking_passengers';

所以在两个相关表上运行 ANALYZE :

ANALYZE VERBOSE public.booking_data;
ANALYZE VERBOSE public.booking_passengers;

生产索引仍未使用:(

你的 WHERE 有 5 个 ANDed together 块。在没有 LIMIT 的情况下,每个单独返回多少行?

select count(*) FROM booking_data WHERE bid >= 1100000- 28208008
select count(*) FROM booking_data WHERE is_deleted IS false- 29249188
select count(*) FROM booking_data WHERE last_segment_arrival_at < '2022-06-13 13:36'- 23594003

select count(*)
FROM booking_data
WHERE (booking_data.user_id = 'NJ8QigsGcQCDOttoGsD3iS'
    OR booking_data.bid = ANY (CAST(array((
        SELECT DISTINCT booking_passengers.bid AS anon_2
        FROM booking_passengers
        WHERE booking_passengers.user_id = 'NJ8QigsGcQCDOttoGsD3iS')) AS BIGINT[]))
          )

11079

select count(*)
FROM booking_data
WHERE (booking_data.confirmation_sent IS true
                 AND booking_data.final_status != 'refunded'
                 OR booking_data.final_status = 'confirmed'
                 OR booking_data.confirmation_sent IS false
                        AND booking_data.final_status IN ('closed')
            )

17294003

我按照建议使用更高的 statistics_target 运行 ANALYZE:

show default_statistics_target ;
set default_statistics_target to 1000;
ANALYZE VERBOSE public.booking_data;
ANALYZE VERBOSE public.booking_passengers;

但仍然没有使用 user_id 和 last_segment_arrival_at 的索引:(

postgresql optimization
  • 1 个回答
  • 282 Views
Martin Hope
Lucas03
Asked: 2022-04-27 04:48:41 +0800 CST

PostgreSQL 中的简单查询优化(两表连接和排序)

  • 1

我有一个简单的查询应该返回有关预订的数据,但它行为不端,有时不会在 1 分钟内返回数据。

所以我有主要的预订表:

table booking_data
- bid bigint primary key
- user_id varchar(32)
(other columns not relevant here and there is index on user_id)

和带有助手数据的表,例如如果预订中有多个航班,我想将最后一个航班日期时间存储到助手表中:

table itinerary_timestamps
- bid references booking_data.bid
- last_segment_arrival_at    timestamp with time zone not null
(some other columns and index on last_segment_arrival_at)

这是简化的查询:

explain analyze select *
from booking_data join itinerary_timestamps using (bid)
where booking_data.user_id = 'dUZYLebTiZOBG1R3crKLhh'
  and itinerary_timestamps.last_segment_arrival_at < now()
order by itinerary_timestamps.last_segment_arrival_at desc
limit 11;

查询计划:

Limit  (cost=1.00..1253.62 rows=11 width=515) (actual time=7171.379..10008.565 rows=11 loops=1)
  ->  Nested Loop  (cost=1.00..17263562.69 rows=151602 width=515) (actual time=7171.378..10008.551 rows=11 loops=1)
        ->  Index Scan Backward using idx_itinerary_timestamps_last_segment_arrival_at on itinerary_timestamps  (cost=0.44..781981.48 rows=25695264 width=60) (actual time=0.014..1918.143 rows=1760471 loops=1)
              Index Cond: (last_segment_arrival_at < now())
        ->  Index Scan using booking_data_pkey on booking_data  (cost=0.56..0.64 rows=1 width=463) (actual time=0.004..0.004 rows=0 loops=1760471)
              Index Cond: (bid = itinerary_timestamps.bid)
              Filter: ((user_id)::text = 'dUZYLebTiZOBG1R3crKLhh'::text)
              Rows Removed by Filter: 1
Planning Time: 0.423 ms
Execution Time: 10008.630 ms

该用户的 booking_data 中有165867 个预订。itinerary_timestamps 表中有165862条记录,bid 属于 user_id,总共26424052条记录。

您对如何优化这个简单的查询有任何想法吗?

我正在考虑使用出价来过滤该表格,但它具有相似的结果(有时更好有时相同):

explain analyze select *
from booking_data join itinerary_timestamps using (bid)
where booking_data.user_id = 'aQIDkXXEx3nWY5KvLSuEiK'
  and itinerary_timestamps.last_segment_arrival_at < now()
  and booking_data.bid in (select bid from booking_data where user_id='aQIDkXXEx3nWY5KvLSuEiK')
order by itinerary_timestamps.last_segment_arrival_at desc
limit 11;
postgresql query-performance
  • 1 个回答
  • 64 Views

Sidebar

Stats

  • 问题 205573
  • 回答 270741
  • 最佳答案 135370
  • 用户 68524
  • 热门
  • 回答
  • Marko Smith

    连接到 PostgreSQL 服务器:致命:主机没有 pg_hba.conf 条目

    • 12 个回答
  • Marko Smith

    如何让sqlplus的输出出现在一行中?

    • 3 个回答
  • Marko Smith

    选择具有最大日期或最晚日期的日期

    • 3 个回答
  • Marko Smith

    如何列出 PostgreSQL 中的所有模式?

    • 4 个回答
  • Marko Smith

    列出指定表的所有列

    • 5 个回答
  • Marko Smith

    如何在不修改我自己的 tnsnames.ora 的情况下使用 sqlplus 连接到位于另一台主机上的 Oracle 数据库

    • 4 个回答
  • Marko Smith

    你如何mysqldump特定的表?

    • 4 个回答
  • Marko Smith

    使用 psql 列出数据库权限

    • 10 个回答
  • Marko Smith

    如何从 PostgreSQL 中的选择查询中将值插入表中?

    • 4 个回答
  • Marko Smith

    如何使用 psql 列出所有数据库和表?

    • 7 个回答
  • Martin Hope
    Jin 连接到 PostgreSQL 服务器:致命:主机没有 pg_hba.conf 条目 2014-12-02 02:54:58 +0800 CST
  • Martin Hope
    Stéphane 如何列出 PostgreSQL 中的所有模式? 2013-04-16 11:19:16 +0800 CST
  • Martin Hope
    Mike Walsh 为什么事务日志不断增长或空间不足? 2012-12-05 18:11:22 +0800 CST
  • Martin Hope
    Stephane Rolland 列出指定表的所有列 2012-08-14 04:44:44 +0800 CST
  • Martin Hope
    haxney MySQL 能否合理地对数十亿行执行查询? 2012-07-03 11:36:13 +0800 CST
  • Martin Hope
    qazwsx 如何监控大型 .sql 文件的导入进度? 2012-05-03 08:54:41 +0800 CST
  • Martin Hope
    markdorison 你如何mysqldump特定的表? 2011-12-17 12:39:37 +0800 CST
  • Martin Hope
    Jonas 如何使用 psql 对 SQL 查询进行计时? 2011-06-04 02:22:54 +0800 CST
  • Martin Hope
    Jonas 如何从 PostgreSQL 中的选择查询中将值插入表中? 2011-05-28 00:33:05 +0800 CST
  • Martin Hope
    Jonas 如何使用 psql 列出所有数据库和表? 2011-02-18 00:45:49 +0800 CST

热门标签

sql-server mysql postgresql sql-server-2014 sql-server-2016 oracle sql-server-2008 database-design query-performance sql-server-2017

Explore

  • 主页
  • 问题
    • 最新
    • 热门
  • 标签
  • 帮助

Footer

AskOverflow.Dev

关于我们

  • 关于我们
  • 联系我们

Legal Stuff

  • Privacy Policy

Language

  • Pt
  • Server
  • Unix

© 2023 AskOverflow.DEV All Rights Reserve