I have a legacy system running PostgreSQL 9.6 on Ubuntu 16.04. Until a few days ago everything worked fine with the standard PostgreSQL settings. Then I dumped the database with the compression option at the maximum level (-Z 9) to get a smaller file (pg_dump --compress=9 database_name > database_name.sql). Since then I have run into a lot of problems: some queries against certain tables started executing very slowly, while queries against other tables still work fine.
These are the tables I am having trouble with:
isbns:
    id (integer)
    value (string), b-tree index
    type (string)
books:
    id (integer)
    isbn (string), b-tree index
    ...
    (32 columns total)
isbns_statistics:
    id (integer)
    average_price (float)
    average_rating (integer)
    isbn_id (foreign key)
    ...
    (17 columns total)
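For clarity, here is a minimal DDL sketch of how I understand the schema. The exact column types and constraint names are assumptions based on the descriptions above; the real tables have more columns.

```sql
-- Hypothetical DDL approximating the tables described above.
CREATE TABLE isbns (
    id    integer PRIMARY KEY,
    value varchar NOT NULL,
    type  varchar
);
CREATE INDEX ON isbns (value);           -- the b-tree index on value

CREATE TABLE books (
    id   integer PRIMARY KEY,
    isbn varchar NOT NULL
    -- ... 30 more columns
);
CREATE INDEX ON books (isbn);            -- the b-tree index on isbn

CREATE TABLE isbns_statistics (
    id             integer PRIMARY KEY,
    average_price  double precision,
    average_rating integer,
    isbn_id        integer REFERENCES isbns (id)
    -- ... 13 more columns
);
```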
Each of these tables contains about 1,400,000 rows.
Basically, this is the query I have been using, and it used to work well:
(1) SELECT * FROM ISBNS JOIN BOOKS ON BOOKS.ISBN = ISBNS.VALUE JOIN ISBNS_STATISTICS ON ISBNS_STATISTICS.ISBN_ID = ISBNS.ID ORDER BY ISBNS.VALUE LIMIT 100;
After the dump, however, it started executing very slowly. I am not sure the dump itself is the cause, but everything worked fine before it. This query still works well:
SELECT * FROM ISBNS JOIN BOOKS ON BOOKS.ISBN = ISBNS.VALUE JOIN ISBNS_STATISTICS ON ISBNS_STATISTICS.ISBN_ID = ISBNS.ID LIMIT 100;
And this query also executes quickly:
SELECT * FROM ISBNS JOIN BOOKS ON BOOKS.ISBN = ISBNS.VALUE ORDER BY ISBNS.VALUE LIMIT 100;
I changed some performance settings (for example, I increased shared_buffers), but the speed did not improve much.
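Since the plan for query (1) shows the sort spilling to disk ("Sort Method: external merge Disk: 136864kB"), I also experimented with work_mem. A sketch of what I ran (the 256MB value is just a test value I picked, not a recommendation):

```sql
-- Raise the per-operation sort/hash memory for the current session only.
SET work_mem = '256MB';

-- Or persist it server-wide and reload the configuration
-- (ALTER SYSTEM is available since PostgreSQL 9.4):
ALTER SYSTEM SET work_mem = '256MB';
SELECT pg_reload_conf();
```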
I have read that queries combining LIMIT and ORDER BY can run very slowly, but the same kind of query against other tables works fine.
Here is the query plan for query (1):
                                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1048379.37..1048428.33 rows=100 width=498) (actual time=5264.193..5264.444 rows=100 loops=1)
   Buffers: shared hit=40250 read=332472, temp read=16699 written=28392
   ->  Merge Join  (cost=1048379.37..2291557.51 rows=2539360 width=498) (actual time=5264.191..5264.436 rows=100 loops=1)
         Merge Cond: ((books.isbn)::text = (isbns.value)::text)
         Buffers: shared hit=40250 read=332472, temp read=16699 written=28392
         ->  Index Scan using books_isbn_key on books  (cost=0.43..1205494.88 rows=1386114 width=333) (actual time=0.020..0.150 rows=100 loops=1)
               Buffers: shared hit=103
         ->  Materialize  (cost=1042333.77..1055199.75 rows=2573197 width=155) (actual time=5263.901..5263.960 rows=100 loops=1)
               Buffers: shared hit=40147 read=332472, temp read=16699 written=28392
               ->  Sort  (cost=1042333.77..1048766.76 rows=2573197 width=155) (actual time=5263.895..5263.949 rows=100 loops=1)
                     Sort Key: isbns.value
                     Sort Method: external merge  Disk: 136864kB
                     Buffers: shared hit=40147 read=332472, temp read=16699 written=28392
                     ->  Hash Join  (cost=55734.14..566061.44 rows=2573197 width=155) (actual time=403.962..1994.884 rows=1404582 loops=1)
                           Hash Cond: (isbns_statistics.isbn_id = isbns.id)
                           Buffers: shared hit=40147 read=332472, temp read=11281 written=11279
                           ->  Seq Scan on isbns_statistics  (cost=0.00..385193.97 rows=2573197 width=120) (actual time=0.024..779.717 rows=1404582 loops=1)
                                 Buffers: shared hit=26990 read=332472
                           ->  Hash  (cost=27202.84..27202.84 rows=1404584 width=35) (actual time=402.431..402.431 rows=1404584 loops=1)
                                 Buckets: 1048576  Batches: 2  Memory Usage: 51393kB
                                 Buffers: shared hit=13157, temp written=4363
                                 ->  Seq Scan on isbns  (cost=0.00..27202.84 rows=1404584 width=35) (actual time=0.027..152.568 rows=1404584 loops=1)
                                       Buffers: shared hit=13157
 Planning time: 1.160 ms
 Execution time: 5279.983 ms
(25 rows)
So, my questions are:
- Why did everything work well before, and why has it now become slow?
- Why do similar queries against other tables still execute quickly?
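One thing I have not yet ruled out is stale planner statistics: in the plan above, the estimate for isbns_statistics is ~2.5M rows while only ~1.4M actually come back, which roughly doubles the estimated sort size. A sketch of the checks I intend to run next:

```sql
-- Refresh table statistics (and reclaim dead tuples) for the
-- tables involved, since the row estimates look out of date.
VACUUM ANALYZE isbns;
VACUUM ANALYZE books;
VACUUM ANALYZE isbns_statistics;

-- Then re-check the plan with buffer usage included.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM isbns
JOIN books ON books.isbn = isbns.value
JOIN isbns_statistics ON isbns_statistics.isbn_id = isbns.id
ORDER BY isbns.value LIMIT 100;
```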