Asked: 2020-09-18 01:19:13 +0800 CST

Postgres query does not use the index, but is faster with the index

Apologies if this looks like a duplicate question. I am using Postgres 11.6 on AWS RDS. I have 2 tables:

CREATE TABLE public.e
(
    id character varying(32) COLLATE pg_catalog."default" NOT NULL,
    p_id character varying(32) COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT e_pkey PRIMARY KEY (id)
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

CREATE TABLE public.ed
(
    e_id character varying(32) COLLATE pg_catalog."default" NOT NULL,
    <other columns + primary key>
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

I have an index on ed.e_id:

CREATE INDEX ix_ed_e_id
    ON public.ed USING btree
    (e_id COLLATE pg_catalog."default" ASC NULLS LAST)
    TABLESPACE pg_default;

When I run this query:

select *
from ed, e
where e.id = ed.e_id
and e.p_id = '5c7cae8df6d10f1064b2eaf5';

(The problem persists when I use from ed inner join e on e.id = ed.e_id instead.)

The explain analyze plan is:

Gather  (cost=1136.68..141235.01 rows=28320 width=311) (actual time=0.456..871.155 rows=102709 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Hash Join  (cost=136.68..137403.01 rows=11800 width=311) (actual time=0.241..688.095 rows=34236 loops=3)
        Hash Cond: (ed.e_id = e.id)
        ->  Parallel Seq Scan on ed ed  (cost=0.00..133210.10 rows=1544610 width=218) (actual time=0.005..314.524 rows=1235269 loops=3)
        ->  Hash  (cost=135.67..135.67 rows=81 width=93) (actual time=0.125..0.126 rows=81 loops=3)
              Buckets: 1024  Batches: 1  Memory Usage: 19kB
              ->  Bitmap Heap Scan on e e  (cost=4.91..135.67 rows=81 width=93) (actual time=0.045..0.097 rows=81 loops=3)
                    Recheck Cond: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text)
                    Heap Blocks: exact=31
                    ->  Bitmap Index Scan on ix_e_p_id  (cost=0.00..4.89 rows=81 width=0) (actual time=0.035..0.035 rows=81 loops=3)
                          Index Cond: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text)
Planning Time: 0.329 ms
Execution Time: 877.804 ms

It uses a Parallel Seq Scan on ed to match on ed.e_id.

When I SET SESSION enable_seqscan = OFF, the explain plan is:

Nested Loop  (cost=0.72..395895.14 rows=28320 width=311) (actual time=0.037..60.068 rows=102709 loops=1)
  ->  Index Scan using e_pkey on e e  (cost=0.29..917.61 rows=81 width=93) (actual time=0.019..4.995 rows=81 loops=1)
        Filter: ((p_id)::text = '5c7cae8df6d10f1064b2eaf5'::text)
        Rows Removed by Filter: 10522
  ->  Index Scan using ix_ed_e_id on ed ed  (cost=0.43..4757.83 rows=11844 width=218) (actual time=0.013..0.334 rows=1268 loops=81)
        Index Cond: (e_id = e.id)
Planning Time: 0.273 ms
Execution Time: 64.675 ms

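That is an order of magnitude faster (877 ms vs 64 ms)! I tried VACUUM ANALYZE ed, but that did not help. I even tried changing e.id and ed.e_id to the UUID type, but that did not help either.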

How can I convince Postgres to use the ix_ed_e_id index without setting enable_seqscan to off?

Laurenz Albe · Answer 1 · 2020-09-18T19:44:01+08:00

It seems that PostgreSQL overestimates the cost of index scans, which leads it to prefer a hash join over a nested loop join.

Two parameters tell PostgreSQL about your hardware and influence its estimate of the cost of an index scan:

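random_page_cost: the larger it is compared to seq_page_cost, the more expensive PostgreSQL estimates the random I/O of an index scan to be relative to sequential I/O. So you can lower this parameter to encourage index scans.
effective_cache_size: this tells the optimizer how much memory is available for caching data. If the value is high, it will assume that indexes are cached and will price index scans lower.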

Perhaps adjusting these parameters will change PostgreSQL's mind, although the cost estimates are far apart.
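One way to try the two parameters above without touching the server configuration is to change them for the current session only and re-run the plan. The following is a minimal sketch, not a recommendation: the values 1.1 and '8GB' are illustrative assumptions and should be replaced with numbers that match the actual storage and memory of the instance.

```sql
-- Inspect the current planner settings first.
SHOW random_page_cost;
SHOW seq_page_cost;
SHOW effective_cache_size;

-- Session-level overrides; both values are assumptions chosen for illustration.
SET random_page_cost = 1.1;        -- closer to seq_page_cost makes index scans look cheaper
SET effective_cache_size = '8GB';  -- a larger value makes the planner assume more index pages are cached

-- Re-run the query from the question and compare the chosen plan and timing.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM ed
JOIN e ON e.id = ed.e_id
WHERE e.p_id = '5c7cae8df6d10f1064b2eaf5';

-- Return to the server defaults for the rest of the session.
RESET random_page_cost;
RESET effective_cache_size;
```

If the plan switches to the nested loop with ix_ed_e_id, the settings can then be made persistent. On RDS that is normally done through the DB parameter group rather than by editing postgresql.conf.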
