Questions asked by Vadim Samokhin - dba

Vadim Samokhin

Asked: 2022-01-24 03:57:59 +0800 CST

PostgreSQL 10: Bitmap Heap Scan with exact heap blocks

2

I have the following query:

select ro.*
from courier c1
    join courier c2 on c2.real_physical_courier_1c_id = c1.real_physical_courier_1c_id
    join restaurant_order ro on ro.courier_id = c2.id
    left join jsonb_array_elements(items) jae on true
    left join jsonb_array_elements(jae->'options') ji on true
    inner join catalogue c on c.id in ((jae->'id')::int, (ji->'id')::int)
    join restaurant r on r.id = ro.restaurant_id
where c1.id = '7b35cdab-b423-472a-bde1-d6699f6cefd3' and ro.status in (70, 73)
group by ro.order_id, r.id ;

Here is the part of the query plan that takes about 95% of the time:

->  Parallel Bitmap Heap Scan on restaurant_order ro  (cost=23.87..2357.58 rows=1244 width=1257) (actual time=11.931..38.163 rows=98 loops=2)
      Recheck Cond: (status = ANY ('{70,73}'::integer[]))
      Heap Blocks: exact=28755
      ->  Bitmap Index Scan on ro__status  (cost=0.00..23.34 rows=2115 width=0) (actual time=9.168..9.168 rows=51540 loops=1)
            Index Cond: (status = ANY ('{70,73}'::integer[]))

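I have several questions.

First, the Bitmap Index Scan part. Postgres walks through 51540 ro__status index entries matching Index Cond: (status = ANY ('{70,73}'::integer[])) and builds a bitmap with 28755 elements. Its keys are the physical locations of the corresponding table rows (which is what exact in the Heap Blocks line means). Is that correct?
Second, this bitmap is passed to the Bitmap Heap Scan stage. Recheck Cond is not actually executed, because the heap blocks are not lossy. Bitmap Heap Scan sorts the bitmap by the physical location of the tuples to enable sequential access. It then reads the table data sequentially in two passes (loops=2) and obtains no more than 196 table rows. Is that right?
The bitmap size reported in the line Heap Blocks: exact=28755 varies a lot over time, by as much as two orders of magnitude. Yesterday, for example, it was around 500. Why is that?
Now, why does the bitmap created in the Bitmap Index Scan stage have so many keys? The ro__status index should be able to tell that there are only about 200 records with status 70 or 73. I can't think of any reason that would prevent postgres from keeping only those entries that actually satisfy the index cond. The overhead seems large: 28755 keys instead of about 200!
Why does the Bitmap Heap Scan take so long? As far as I understand, there are two sequential reads (loops=2), so it should take much less time, shouldn't it? Or is sorting the bitmap by the physical location of the tuples the culprit?
Should I worry about the poor estimates? If so, increasing default_statistics_target should help, right? It is currently at the default of 100.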
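
One way to look into the gap between the 51540 index entries and the 196 rows actually returned is to compare the number of visible rows for these statuses with the table's dead-tuple statistics, and to raise the statistics target for the one column rather than globally. This is a minimal sketch, not a diagnosis: it assumes the table and column names from the query above, and the target value 1000 is an arbitrary illustration.

```sql
-- How many visible rows currently have status 70 or 73?
SELECT status, count(*)
FROM restaurant_order
WHERE status IN (70, 73)
GROUP BY status;

-- Live vs. dead tuples and the last (auto)vacuum / (auto)analyze times.
SELECT n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'restaurant_order';

-- Raise the statistics target for this column only (1000 is an assumed value),
-- then refresh the planner statistics.
ALTER TABLE restaurant_order ALTER COLUMN status SET STATISTICS 1000;
ANALYZE restaurant_order;
```

Running the query with EXPLAIN (ANALYZE, BUFFERS) would additionally show how many pages the heap scan touched.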

Just in case, here is the full plan:

"Group  (cost=51297.15..52767.65 rows=19998 width=1261) (actual time=42.555..42.555 rows=0 loops=1)"
"  Group Key: ro.order_id, r.id"
"  ->  Gather Merge  (cost=51297.15..52708.83 rows=11764 width=1261) (actual time=42.554..45.459 rows=0 loops=1)"
"        Workers Planned: 1"
"        Workers Launched: 1"
"        ->  Group  (cost=50297.14..50385.37 rows=11764 width=1261) (actual time=38.850..38.850 rows=0 loops=2)"
"              Group Key: ro.order_id, r.id"
"              ->  Sort  (cost=50297.14..50326.55 rows=11764 width=1261) (actual time=38.850..38.850 rows=0 loops=2)"
"                    Sort Key: ro.order_id, r.id"
"                    Sort Method: quicksort  Memory: 25kB"
"                    Worker 0:  Sort Method: quicksort  Memory: 25kB"
"                    ->  Nested Loop  (cost=31.84..45709.27 rows=11764 width=1261) (actual time=38.819..38.819 rows=0 loops=2)"
"                          ->  Nested Loop Left Join  (cost=27.21..5194.50 rows=5882 width=1325) (actual time=38.819..38.819 rows=0 loops=2)"
"                                ->  Nested Loop Left Join  (cost=27.20..5076.49 rows=59 width=1293) (actual time=38.818..38.818 rows=0 loops=2)"
"                                      ->  Nested Loop  (cost=27.20..5074.49 rows=1 width=1261) (actual time=38.818..38.818 rows=0 loops=2)"
"                                            ->  Hash Join  (cost=26.93..5073.59 rows=1 width=1257) (actual time=38.817..38.818 rows=0 loops=2)"
"                                                  Hash Cond: (c2.real_physical_courier_1c_id = c1.real_physical_courier_1c_id)"
"                                                  ->  Nested Loop  (cost=24.28..5068.22 rows=1038 width=1267) (actual time=11.960..38.732 rows=98 loops=2)"
"                                                        ->  Parallel Bitmap Heap Scan on restaurant_order ro  (cost=23.87..2357.58 rows=1244 width=1257) (actual time=11.931..38.163 rows=98 loops=2)"
"                                                              Recheck Cond: (status = ANY ('{70,73}'::integer[]))"
"                                                              Heap Blocks: exact=28755"
"                                                              ->  Bitmap Index Scan on ro__status  (cost=0.00..23.34 rows=2115 width=0) (actual time=9.168..9.168 rows=51540 loops=1)"
"                                                                    Index Cond: (status = ANY ('{70,73}'::integer[]))"
"                                                        ->  Index Scan using courier_pkey on courier c2  (cost=0.41..2.18 rows=1 width=26) (actual time=0.005..0.005 rows=1 loops=195)"
"                                                              Index Cond: (id = ro.courier_id)"
"                                                  ->  Hash  (cost=2.63..2.63 rows=1 width=10) (actual time=0.039..0.039 rows=1 loops=2)"
"                                                        Buckets: 1024  Batches: 1  Memory Usage: 9kB"
"                                                        ->  Index Scan using courier_pkey on courier c1  (cost=0.41..2.63 rows=1 width=10) (actual time=0.034..0.034 rows=1 loops=2)"
"                                                              Index Cond: (id = '7b35cdab-b423-472a-bde1-d6699f6cefd3'::uuid)"
"                                            ->  Index Only Scan using restaurant_pkey on restaurant r  (cost=0.27..0.89 rows=1 width=4) (never executed)"
"                                                  Index Cond: (id = ro.restaurant_id)"
"                                                  Heap Fetches: 0"
"                                      ->  Function Scan on jsonb_array_elements jae  (cost=0.00..1.00 rows=100 width=32) (never executed)"
"                                ->  Function Scan on jsonb_array_elements ji  (cost=0.01..1.00 rows=100 width=32) (never executed)"
"                          ->  Bitmap Heap Scan on catalogue c  (cost=4.63..6.87 rows=2 width=4) (never executed)"
"                                Recheck Cond: ((id = ((jae.value -> 'id'::text))::integer) OR (id = ((ji.value -> 'id'::text))::integer))"
"                                ->  BitmapOr  (cost=4.63..4.63 rows=2 width=0) (never executed)"
"                                      ->  Bitmap Index Scan on catalogue_pkey  (cost=0.00..0.97 rows=1 width=0) (never executed)"
"                                            Index Cond: (id = ((jae.value -> 'id'::text))::integer)"
"                                      ->  Bitmap Index Scan on catalogue_pkey  (cost=0.00..0.97 rows=1 width=0) (never executed)"
"                                            Index Cond: (id = ((ji.value -> 'id'::text))::integer)"
"Planning Time: 1.113 ms"
"Execution Time: 45.588 ms"

Vadim Samokhin

Asked: 2019-12-27 02:14:19 +0800 CST

Extremely long postgres query

0

I ran a migration in a single transaction; it looks like this:

alter table big_and_loaded_table
    add column col1 bool;

update big_and_loaded_table set col1 = false;

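It looks innocent, yet it took about two minutes to execute. Moreover, it "locked" big_and_loaded_table: every query in my application that involves it, reads and writes alike, took a very long time to execute, roughly the same couple of minutes. I don't use transactions in my application at all, nor any kind of explicit locks.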

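So I have a few questions:

Why did the migration take so long to execute? Is it because of the transaction, or because of the queries themselves?
Why did it block the application queries that involve big_and_loaded_table?
How should I run such migrations in the future? This one may become moot once I figure out the first two.

The Postgres version is 11.6.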
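
For context: ALTER TABLE ... ADD COLUMN takes an ACCESS EXCLUSIVE lock that is held until the transaction ends, so a full-table UPDATE inside the same transaction keeps readers and writers waiting for its whole duration. Since PostgreSQL 11, adding a column with a constant DEFAULT does not rewrite the table, which can make the separate UPDATE unnecessary. The following is a sketch under that assumption, reusing the names from the migration above; the lock_timeout value is an arbitrary illustration.

```sql
BEGIN;

-- Fail fast instead of queueing behind long-running queries while waiting
-- for the ACCESS EXCLUSIVE lock (5s is an assumed value).
SET LOCAL lock_timeout = '5s';

-- On PostgreSQL 11+, a constant default is recorded in the catalog, so
-- existing rows read as false without the table being rewritten.
ALTER TABLE big_and_loaded_table
    ADD COLUMN col1 bool DEFAULT false;

COMMIT;
```

If the lock cannot be obtained within the timeout, the statement fails and the migration can simply be retried later.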

Vadim Samokhin

Asked: 2018-10-29 05:01:14 +0800 CST

Create a database and make it "current"

4

I have an SQL script that recreates my unit-test database. Currently I run it with \i <path_to_my_script>. It looks like this:

-- create user, give them some privileges

CREATE DATABASE my_database;

CREATE TABLE my_table (
  id UUID primary key,
  data json
);

However, my_table gets created in the postgres database, not in my_database.

How do I specify that my_table should be created in my_database?

I use PostgreSQL 10.5.
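
Since the script is run through psql's \i, one option is psql's own \connect meta-command (short form \c), which switches the session to another database before the remaining statements run. A minimal sketch, assuming the script is always executed by psql; the meta-command is not SQL and will not work from other clients.

```sql
-- create user, grant privileges ...

CREATE DATABASE my_database;

-- psql meta-command: reconnect the current session to the new database.
\connect my_database

-- From here on, statements run inside my_database.
CREATE TABLE my_table (
  id UUID PRIMARY KEY,
  data json
);
```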

Vadim Samokhin

Asked: 2013-05-01 06:51:57 +0800 CST

Read Committed isolation level

4

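A quote from the documentation:

Read Committed is the default isolation level in PostgreSQL. When a transaction uses this isolation level, a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions. In effect, a SELECT query sees a snapshot of the database as of the instant the query begins to run. However, SELECT does see the effects of previous updates executed within its own transaction, even though they are not yet committed. Also note that two successive SELECT commands can see different data, even though they are within a single transaction, if other transactions commit changes during execution of the first SELECT.

So does PostgreSQL see changes committed by other transactions or not?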
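
The quoted passage can be tried out with two psql sessions. The sketch below uses a hypothetical accounts table and assumes the default isolation level has not been changed; each statement takes its own snapshot, so the second SELECT in session A sees what session B committed in between.

```sql
-- Setup (hypothetical table, used only for this illustration).
CREATE TABLE accounts (id int PRIMARY KEY, balance int);
INSERT INTO accounts VALUES (1, 100);

-- Session A
BEGIN;                                        -- READ COMMITTED by default
SELECT balance FROM accounts WHERE id = 1;    -- sees 100

-- Session B (autocommit; commits while A's transaction is still open)
UPDATE accounts SET balance = 200 WHERE id = 1;

-- Session A, same transaction, new statement, new snapshot
SELECT balance FROM accounts WHERE id = 1;    -- sees 200
COMMIT;
```

Had session B's change still been uncommitted when A ran its second SELECT, A would have continued to see 100.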

Vadim Samokhin

Asked: 2013-04-17 08:56:11 +0800 CST

Postgres deadlock without explicit locking

1

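I use PostgreSQL 9.2, and I don't use explicit locking anywhere, neither LOCK statements nor SELECT ... FOR UPDATE. Nevertheless, I recently got ERROR: 40P01: deadlock detected. The query on which the deadlock was detected is wrapped in a transaction block, though. How did this come about?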
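
Plain UPDATE statements take row-level locks implicitly and hold them until the end of the transaction, so two transactions that update the same rows in opposite order can deadlock without any LOCK or FOR UPDATE. A minimal two-session sketch with a hypothetical table t; whether this matches the situation described above is an assumption.

```sql
-- Setup (hypothetical table, used only for this illustration).
CREATE TABLE t (id int PRIMARY KEY, v int);
INSERT INTO t VALUES (1, 0), (2, 0);

-- Session A
BEGIN;
UPDATE t SET v = v + 1 WHERE id = 1;   -- takes a row lock on id = 1

-- Session B
BEGIN;
UPDATE t SET v = v + 1 WHERE id = 2;   -- takes a row lock on id = 2
UPDATE t SET v = v + 1 WHERE id = 1;   -- blocks, waiting for session A

-- Session A
UPDATE t SET v = v + 1 WHERE id = 2;   -- blocks, waiting for session B:
                                       -- a cycle, so one session is aborted
                                       -- with ERROR 40P01 (deadlock detected)
```

Updating rows in a consistent order across transactions avoids this particular cycle.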

Vadim Samokhin

Asked: 2013-04-13 00:56:25 +0800 CST

Implementation of atomic database operations

1

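The question is not about queries wrapped in a "begin-commit" block, but about plain inserts and updates, which are atomic in PostgreSQL and MySQL (at least with the InnoDB engine). So how is this implemented internally?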
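
The behaviour in question can be observed directly, even though this does not show the internals: a single statement outside any explicit transaction either applies all of its changes or none. A small PostgreSQL sketch with a hypothetical demo table:

```sql
-- Setup (hypothetical table, used only for this illustration).
CREATE TABLE demo (id int PRIMARY KEY);
INSERT INTO demo VALUES (1);

-- One statement, three rows; the last row violates the primary key,
-- so the whole statement fails and rows 2 and 3 are not kept either.
INSERT INTO demo VALUES (2), (3), (1);

SELECT count(*) FROM demo;   -- 1
```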

Vadim Samokhin

Asked: 2013-03-20 07:45:49 +0800 CST

VACUUM returning disk space to the operating system

34

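VACUUM usually does not return disk space to the operating system, except in some special cases.
From the docs:

The standard form of VACUUM removes dead row versions in tables and indexes and marks the space available for future reuse. However, it will not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be easily obtained. In contrast, VACUUM FULL actively compacts tables by writing a complete new version of the table file with no dead space. This minimizes the size of the table, but can take a long time. It also requires extra disk space for the new copy of the table, until the operation completes.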

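The question is: how can I get the database into the state where one or more pages at the end of a table become entirely free? This can be done with VACUUM FULL, but I don't have enough disk space to run it. So are there any other possibilities?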
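
Whether a plain VACUUM managed to truncate trailing empty pages can be checked by comparing the size of the table's data file before and after. A minimal sketch with a hypothetical table name my_table; it only observes the effect and does not by itself move rows away from the end of the table.

```sql
-- Size of the table's main data file before vacuuming.
SELECT pg_size_pretty(pg_relation_size('my_table'));

-- Plain VACUUM: the file shrinks only if the trailing pages are entirely
-- empty and the exclusive lock can be obtained without waiting.
VACUUM VERBOSE my_table;

-- Size afterwards; unchanged if no trailing pages could be released.
SELECT pg_size_pretty(pg_relation_size('my_table'));
```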
