
Question

Matthew

Asked: 2025-01-02 21:33:45 +0800 CST

Postgres index not used for today's data


I have a Postgres table with 350 million rows. It has 3 indexes:

historical_offers(recovery_date, uprn)
historical_offers(recovery_date, account_id)
historical_offers(recovery_date, individual_id)

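If I run the query for a date more than 24 hours in the past, it is fast. If I run it for today (and sometimes yesterday), it is far too slow (0.05 ms vs. 300 ms).

My query filters on all 3 fields. For dates more than about 24 hours old it uses all 3 indexes and combines the results quickly and correctly, so I don't think the problem is that the OR condition across 3 fields needs 3 indexes. Also, if I change the query to filter on just 1 field, I get the same problem.

Current theories:

There is a lag in writing to the indexes (but I thought indexes were updated at the same time as the table).
The query planner is getting it wrong and picking the smallest index (I have read some articles saying this is known Postgres behaviour). Maybe I need to add hints to "force" it to use the right index?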

Slow response (today):

EXPLAIN ANALYZE SELECT * FROM historical_offers.historical_offers WHERE (historical_offers.uprn = '1001005' OR historical_offers.account_id = 'SW1006' OR historical_offers.individual_id = '6752da6') AND (historical_offers.recovery_date = '2025-01-02');
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                                       |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Index Scan using historical_offers_date_individual_id_idx on historical_offers  (cost=0.57..8.56 rows=1 width=174) (actual time=346.467..346.467 rows=0 loops=1) |
|   Index Cond: (recovery_date = '2025-01-02'::date)                                                                                                               |
|   Filter: (((uprn)::text = '1001005'::text) OR ((account_id)::text = 'SW1006'::text) OR ((individual_id)::text = '6752da6'::text))     |
|   Rows Removed by Filter: 1470748                                                                                                                                |
| Planning Time: 0.099 ms                                                                                                                                          |
| Execution Time: 346.488 ms                                                                                                                                       |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
EXPLAIN 6
Time: 0.383s

Fast query (2 days ago):

EXPLAIN ANALYZE SELECT * FROM historical_offers.historical_offers WHERE (historical_offers.uprn = '1001005' OR historical_offers.account_id = 'SW1006' OR historical_offers.individual_id = '6752da6') AND (historical_offers.recovery_date = '2025-01-01');
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                                                                                                                                                                                          |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Bitmap Heap Scan on historical_offers  (cost=13.88..78.14 rows=16 width=174) (actual time=0.031..0.032 rows=0 loops=1)                                                                                                                                                                              |
|   Recheck Cond: (((recovery_date = '2025-01-01'::date) AND ((uprn)::text = '1001005'::text)) OR ((recovery_date = '2025-01-01'::date) AND ((account_id)::text = 'SW1006'::text)) OR ((recovery_date = '2025-01-01'::date) AND ((individual_id)::text = '6752da6'::text))) |
|   ->  BitmapOr  (cost=13.88..13.88 rows=16 width=0) (actual time=0.030..0.030 rows=0 loops=1)                                                                                                                                                                                                       |
|         ->  Bitmap Index Scan on historical_offers_date_uprn_idx  (cost=0.00..4.62 rows=5 width=0) (actual time=0.013..0.013 rows=0 loops=1)                                                                                                                                                        |
|               Index Cond: ((recovery_date = '2025-01-01'::date) AND ((uprn)::text = '1001005'::text))                                                                                                                                                                                          |
|         ->  Bitmap Index Scan on historical_offers_date_account_id_idx  (cost=0.00..4.62 rows=5 width=0) (actual time=0.008..0.008 rows=0 loops=1)                                                                                                                                                  |
|               Index Cond: ((recovery_date = '2025-01-01'::date) AND ((account_id)::text = 'SW1006'::text))                                                                                                                                                                                      |
|         ->  Bitmap Index Scan on historical_offers_date_individual_id_idx  (cost=0.00..4.62 rows=5 width=0) (actual time=0.008..0.008 rows=0 loops=1)                                                                                                                                               |
|               Index Cond: ((recovery_date = '2025-01-01'::date) AND ((individual_id)::text = '6752da6'::text))                                                                                                                                                                     |
| Planning Time: 0.113 ms                                                                                                                                                                                                                                                                             |
| Execution Time: 0.054 ms                                                                                                                                                                                                                                                                            |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
EXPLAIN 11
Time: 0.026s

2 Answers


Laurenz Albe · Answer 1 · 2025-01-03T05:36:09+08:00

Best Answer


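PostgreSQL automatically collects table statistics once a certain fraction of the table's data (10% by default) has changed. That means you may not have good statistics for the most recent data, which can lead to a poor query plan.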
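To put a rough number on that: with default settings, autoanalyze is triggered after about autovacuum_analyze_threshold + autovacuum_analyze_scale_factor × (estimated row count) changed rows, i.e. 50 + 0.1 × 350,000,000 ≈ 35 million rows for a table of this size. The query below is a sketch for checking how far the table is from that point. It assumes the schema and table names from the question and reads only the global settings, so it ignores any per-table overrides.

```sql
-- Sketch: compare rows changed since the last ANALYZE with the approximate
-- autoanalyze trigger point. Uses global settings only; per-table storage
-- parameters (reloptions) are NOT taken into account here.
SELECT s.n_mod_since_analyze,
       s.last_analyze,
       s.last_autoanalyze,
       current_setting('autovacuum_analyze_threshold')::bigint
         + current_setting('autovacuum_analyze_scale_factor')::float8 * c.reltuples
         AS approx_changes_before_autoanalyze
FROM pg_stat_user_tables AS s
JOIN pg_class AS c ON c.oid = s.relid
WHERE s.schemaname = 'historical_offers'
  AND s.relname = 'historical_offers';
```

If last_analyze and last_autoanalyze are both older than the newest recovery_date, the planner has no statistics for that date.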

The solution is to tell autovacuum to ANALYZE the table more often:

ALTER TABLE historical_offers SET (autovacuum_analyze_scale_factor = 0.01);

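This collects statistics whenever 1% of the table has changed. If you would rather trigger on an absolute number of modified rows instead of a percentage (perhaps because the rate of change stays constant), you can use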

ALTER TABLE historical_offers SET (
   autovacuum_analyze_scale_factor = 0,
   autovacuum_analyze_threshold = 1000000
);
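To confirm that a setting like the ones above is in place, the per-table storage parameters can be read back from pg_class. This is a minimal sketch that assumes the table lives in the historical_offers schema, as in the question's query.

```sql
-- Sketch: list per-table storage parameters; the result is NULL when
-- none have been set on the table.
SELECT reloptions
FROM pg_class
WHERE oid = 'historical_offers.historical_offers'::regclass;
```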


jjanes · Answer 2 · 2025-01-03T05:59:44+08:00

Index Scan ... on historical_offers  (cost=0.57..8.56 rows=1 width=174) (actual time=346.467..346.467 rows=0 loops=1)

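It expects to find one row but actually finds zero. That expectation, however, is the post-filter one. The plan does not explicitly tell us how many rows it expected the index scan to return before filtering.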

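But from the cost estimate, assuming default cost parameters, we can infer that it expected to find only one row with recovery_date = '2025-01-02'. That would mean one index leaf page access and one table page access, at 4 units each. If that were true, the plan it chose would be a good one, but since there are actually 1470748 rows, it is a bad choice.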
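The arithmetic behind that inference: two random page accesses at the default random_page_cost of 4.0 account for 8.0 of the estimated total cost of 8.56. The reasoning only holds if the cost parameters are at their defaults, which can be checked with a sketch like this:

```sql
-- Sketch: show the planner cost parameters in effect for this session.
-- The default random_page_cost is 4.0.
SELECT name, setting
FROM pg_settings
WHERE name IN ('seq_page_cost', 'random_page_cost', 'cpu_tuple_cost',
               'cpu_index_tuple_cost', 'cpu_operator_cost')
ORDER BY name;
```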

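So the table statistics have not been updated since rows for '2025-01-02' started being added. Moreover, pg_stats.histogram_bounds for that column must be NULL, because the statistics system believes that every existing value is already listed in pg_stats.most_common_vals, which makes it confident in predicting that only one row has an unobserved value. This means the statistics target for the column (or the system-wide statistics target, if the column has none of its own) must be about the same as, or higher than, the number of distinct values in the column.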
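This can be checked directly in pg_stats. The query below is a sketch using the schema, table, and column names from the question. If the reasoning above is right, histogram_bounds should be NULL and most_common_vals should not yet contain '2025-01-02'.

```sql
-- Sketch: inspect the statistics the planner holds for recovery_date.
SELECT n_distinct,
       most_common_vals,
       most_common_freqs,
       histogram_bounds
FROM pg_stats
WHERE schemaname = 'historical_offers'
  AND tablename = 'historical_offers'
  AND attname = 'recovery_date';
```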

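So you can fix this either by analyzing the table or by lowering the statistics target. The former makes the statistics more accurate; the latter makes the system less confident in the statistics it already has.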
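Both options could look roughly like the sketch below. The target value of 100 is purely illustrative and not taken from the answer. A changed target only affects statistics gathered by a later ANALYZE.

```sql
-- Option 1 (sketch): refresh statistics for just the date column.
ANALYZE historical_offers.historical_offers (recovery_date);

-- Option 2 (sketch): check the system-wide target, then set a lower
-- per-column target. The value 100 is an illustrative assumption.
SHOW default_statistics_target;

ALTER TABLE historical_offers.historical_offers
    ALTER COLUMN recovery_date SET STATISTICS 100;
```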
