I'm trying to optimize a query that never completes on Postgres 12.7. It runs for hours, even days, with the CPU pegged at 100%, and never returns:
SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
FROM "changes"
WHERE (type = 1 OR type = 3) AND user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'
OR type = 2 AND item_id IN (SELECT item_id FROM user_items WHERE user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW')
ORDER BY "counter" ASC LIMIT 100;
Somewhat at random, I tried rewriting it with UNION, which I believe is equivalent. Basically, the query has two parts: one for type = 1 or 3, and one for type = 2.
(
SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
FROM "changes"
WHERE (type = 1 OR type = 3) AND user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'
) UNION (
SELECT "id", "counter", "item_id", "item_name", "type", "updated_time"
FROM "changes"
WHERE type = 2 AND item_id IN (SELECT item_id FROM user_items WHERE user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW')
) ORDER BY "counter" ASC LIMIT 100;
This query returns within 10 seconds, while the other one still hasn't returned after days. Any idea what causes this huge difference?

Query plans

For the original query:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1001.01..1697110.80 rows=100 width=119)
   ->  Gather Merge  (cost=1001.01..8625312957.40 rows=508535 width=119)
         Workers Planned: 2
         ->  Parallel Index Scan using changes_pkey on changes  (cost=0.98..8625253259.82 rows=211890 width=119)
               Filter: ((((type = 1) OR (type = 3)) AND ((user_id)::text = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'::text)) OR ((type = 2) AND (SubPlan 1)))
               SubPlan 1
                 ->  Materialize  (cost=0.55..18641.22 rows=143863 width=33)
                       ->  Index Only Scan using user_items_user_id_item_id_unique on user_items  (cost=0.55..16797.90 rows=143863 width=33)
                             Index Cond: (user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'::text)
For the UNION query:
 Limit  (cost=272866.63..272866.88 rows=100 width=212) (actual time=10564.742..10566.964 rows=100 loops=1)
   ->  Sort  (cost=272866.63..273371.95 rows=202128 width=212) (actual time=10564.739..10566.950 rows=100 loops=1)
         Sort Key: changes.counter
         Sort Method: top-N heapsort  Memory: 69kB
         ->  Unique  (cost=261604.20..265141.44 rows=202128 width=212) (actual time=9530.376..10493.030 rows=147261 loops=1)
               ->  Sort  (cost=261604.20..262109.52 rows=202128 width=212) (actual time=9530.374..10375.845 rows=147261 loops=1)
                     Sort Key: changes.id, changes.counter, changes.item_id, changes.item_name, changes.type, changes.updated_time
                     Sort Method: external merge  Disk: 19960kB
                     ->  Gather  (cost=1000.00..223064.76 rows=202128 width=212) (actual time=2439.116..7356.233 rows=147261 loops=1)
                           Workers Planned: 2
                           Workers Launched: 2
                           ->  Parallel Append  (cost=0.00..201851.96 rows=202128 width=212) (actual time=2421.400..7815.315 rows=49087 loops=3)
                                 ->  Parallel Hash Join  (cost=12010.60..103627.94 rows=47904 width=119) (actual time=907.286..3118.898 rows=24 loops=3)
                                       Hash Cond: ((changes.item_id)::text = (user_items.item_id)::text)
                                       ->  Parallel Seq Scan on changes  (cost=0.00..90658.65 rows=365215 width=119) (actual time=1.466..2919.855 rows=295810 loops=3)
                                             Filter: (type = 2)
                                             Rows Removed by Filter: 428042
                                       ->  Parallel Hash  (cost=11290.21..11290.21 rows=57631 width=33) (actual time=78.190..78.191 rows=48997 loops=3)
                                             Buckets: 262144  Batches: 1  Memory Usage: 12416kB
                                             ->  Parallel Index Only Scan using user_items_user_id_item_id_unique on user_items  (cost=0.55..11290.21 rows=57631 width=33) (actual time=0.056..107.247 rows=146991 loops=1)
                                                   Index Cond: (user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'::text)
                                                   Heap Fetches: 11817
                                 ->  Parallel Seq Scan on changes changes_1  (cost=0.00..95192.10 rows=36316 width=119) (actual time=2410.556..7026.664 rows=73595 loops=2)
                                       Filter: (((user_id)::text = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'::text) AND ((type = 1) OR (type = 3)))
                                       Rows Removed by Filter: 1012184
 Planning Time: 65.846 ms
 Execution Time: 10575.679 ms
(27 rows)
Definitions
Table "public.changes"
Column | Type | Collation | Nullable | Default
---------------+-----------------------+-----------+----------+------------------------------------------
counter | integer | | not null | nextval('changes_counter_seq'::regclass)
id | character varying(32) | | not null |
item_type | integer | | not null |
item_id | character varying(32) | | not null |
item_name | text | | not null | ''::text
type | integer | | not null |
updated_time | bigint | | not null |
created_time | bigint | | not null |
previous_item | text | | not null | ''::text
user_id | character varying(32) | | not null | ''::character varying
Indexes:
"changes_pkey" PRIMARY KEY, btree (counter)
"changes_id_unique" UNIQUE CONSTRAINT, btree (id)
"changes_id_index" btree (id)
"changes_item_id_index" btree (item_id)
"changes_user_id_index" btree (user_id)
Table "public.user_items"
Column | Type | Collation | Nullable | Default
--------------+-----------------------+-----------+----------+----------------------------------------
id | integer | | not null | nextval('user_items_id_seq'::regclass)
user_id | character varying(32) | | not null |
item_id | character varying(32) | | not null |
updated_time | bigint | | not null |
created_time | bigint | | not null |
Indexes:
"user_items_pkey" PRIMARY KEY, btree (id)
"user_items_user_id_item_id_unique" UNIQUE CONSTRAINT, btree (user_id, item_id)
"user_items_item_id_index" btree (item_id)
"user_items_user_id_index" btree (user_id)
Type counts
postgres=> select count(*) from changes where type = 1;
count
---------
1201839
(1 row)
postgres=> select count(*) from changes where type = 2;
count
--------
888269
(1 row)
postgres=> select count(*) from changes where type = 3;
count
-------
83849
(1 row)
Number of item_ids per user_id
postgres=> SELECT min(ct), max(ct), avg(ct), sum(ct) FROM (SELECT count(*) AS ct FROM user_items GROUP BY user_id) x;
min | max | avg | sum
-----+--------+-----------------------+---------
6 | 146991 | 2253.0381526104417671 | 1122013
(1 row)
Splitting up the ugly OR into a UNION query is typically a good idea.

With a partial multicolumn index, the first SELECT of the UNION should be down to milliseconds. And add ORDER BY counter LIMIT 100 to it: since the outer query has the same, we never need more than 100 rows from this part.

You did not provide actual numbers, but judging from the large number of items per user (rows=146991 in the query plan), try the IN-subquery form as the second SELECT, combined with a matching partial index.
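The index definitions did not survive in this copy of the answer; a plausible sketch of the two partial indexes (index names and exact predicates are my assumptions, not from the original post):

```sql
-- First leg: a partial multicolumn index whose predicate mirrors the
-- WHERE clause, so both the filter and ORDER BY counter come straight
-- from the index.
CREATE INDEX changes_type13_user_counter_idx ON changes (user_id, counter)
WHERE type IN (1, 3);

-- Second leg: with ~147k items for this user, it can be cheaper to walk
-- changes in counter order and stop after the first 100 matching rows.
CREATE INDEX changes_type2_counter_idx ON changes (counter)
WHERE type = 2;
```

Each leg then carries its own ORDER BY counter LIMIT 100 inside parentheses, so neither part ever has to produce more than 100 rows.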
For significantly different cardinalities, a different second SELECT might be (much) better. In particular, this can backfire for users with few or no items. Then the full query:
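The full query itself was also lost in this copy; going by the surrounding description, its shape would be roughly the following (a sketch; UNION ALL is safe here because the two legs match disjoint type values and cannot produce duplicates):

```sql
(  -- leg 1: type 1 or 3 for this user
SELECT id, counter, item_id, item_name, type, updated_time
FROM   changes
WHERE  type IN (1, 3)
AND    user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'
ORDER  BY counter
LIMIT  100
)
UNION ALL
(  -- leg 2: type 2 for items this user can see
SELECT id, counter, item_id, item_name, type, updated_time
FROM   changes
WHERE  type = 2
AND    item_id IN (SELECT item_id FROM user_items
                   WHERE  user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW')
ORDER  BY counter
LIMIT  100
)
ORDER  BY counter
LIMIT  100;  -- the outer sort/limit picks the first 100 overall
```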
Yes, ORDER BY counter LIMIT 100, a total of 3 times.

Asides
The query plan shows (never executed) for SubPlan 1, which would seem to mean that no rows with type = 2 were found. That's odd. (See jjanes' additional answer for a likely explanation.)

You operate with big varchar(32) IDs. If you really need globally unique identifiers, consider uuid: smaller and faster. Otherwise, a plain bigint (or even integer) easily covers your 2 million rows. That makes tables and indexes smaller and everything faster. Faster UNION, too.

Failing that, you can at least add COLLATE "C" to the varchar(32) columns to speed up the UNION (and all sorting and related operations). Unless you run the whole database with COLLATE "C" anyway, which seems unlikely.

Your current table design is wasteful. Consider rewriting it like this:
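The suggested table definition is missing from this copy as well; a sketch of the idea, ordering fixed-width columns from widest to narrowest to minimize alignment padding, with COLLATE "C" on the varchar IDs (the exact column order and collations are my assumptions):

```sql
CREATE TABLE changes (
    updated_time  bigint  NOT NULL,                  -- 8-byte columns first
    created_time  bigint  NOT NULL,
    counter       integer NOT NULL DEFAULT nextval('changes_counter_seq'),
    type          integer NOT NULL,
    item_type     integer NOT NULL,
    id            varchar(32) COLLATE "C" NOT NULL,  -- consider uuid or bigint
    item_id       varchar(32) COLLATE "C" NOT NULL,
    user_id       varchar(32) COLLATE "C" NOT NULL DEFAULT '',
    item_name     text NOT NULL DEFAULT '',
    previous_item text NOT NULL DEFAULT '',
    PRIMARY KEY (counter)
);
```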
That should make the table ~ 15 MB smaller (comparing the original table without bloat) and everything a bit faster.
A tolerable OR plan gets lucky, in that it finds 100 matching rows of type 1 or 3 before it hits any of type 2 that must be checked against the other table. The intolerable one apparently does have to do the check against the other table, and it does so in a very slow way, by walking through all the rows in it. It should be using a hashed subplan rather than a plain subplan. The only reason I can think of for it not to use a hashed subplan is that your work_mem setting is very low, so it thinks it cannot fit the hash table in memory and falls back to a truly awful method.
A "hashed subplan" has no way to spill to disk, so if the planner thinks it would use too much memory, it won't plan one. On the UNION side, the hash join can spill to disk, so the planner is willing to use it.
If you raise your work_mem, the OR plan should get faster. It doesn't take much; in my hands 10MB was enough (though that is still quite small for a modern server; I would set it to at least 100MB unless you have a good reason not to).
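A minimal way to test this theory, assuming a session-level setting is enough to change the plan:

```sql
SET work_mem = '100MB';  -- session-local; revert with RESET work_mem

EXPLAIN (ANALYZE)
SELECT id, counter, item_id, item_name, type, updated_time
FROM   changes
WHERE  (type = 1 OR type = 3) AND user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW'
OR     type = 2 AND item_id IN (SELECT item_id FROM user_items
                                WHERE user_id = 'kJ6GYJNPM4wdDY5dUV1b8PqDRJj6RRgW')
ORDER  BY counter LIMIT 100;
-- With enough work_mem, the plan should show "Hashed SubPlan 1"
-- instead of "SubPlan 1" on the IN (...) subquery.
```

To make the setting permanent, ALTER SYSTEM SET work_mem = '100MB' followed by SELECT pg_reload_conf() applies it server-wide.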