I'm trying to figure out why one of my tables is using a bitmap heap scan when an index scan is dramatically faster.
The table:
webarchive=# \d web_pages
Table "public.web_pages"
Column | Type | Modifiers
-------------------+-----------------------------+---------------------------------------------------------------------
id | bigint | not null default nextval('web_pages_id_seq'::regclass)
state | dlstate_enum | not null
errno | integer |
url | text | not null
starturl | text | not null
netloc | text | not null
file | bigint |
priority | integer | not null
distance | integer | not null
is_text | boolean |
limit_netloc | boolean |
title | citext |
mimetype | text |
type | itemtype_enum |
content | text |
fetchtime | timestamp without time zone |
addtime | timestamp without time zone |
normal_fetch_mode | boolean | default true
ignoreuntiltime | timestamp without time zone | not null default '1970-01-01 00:00:00'::timestamp without time zone
Indexes:
"web_pages_pkey" PRIMARY KEY, btree (id)
"ix_web_pages_url" UNIQUE, btree (url)
"ix_web_pages_distance" btree (distance)
"ix_web_pages_fetchtime" btree (fetchtime)
"ix_web_pages_id" btree (id)
"ix_web_pages_id_state" btree (id, state)
"ix_web_pages_netloc" btree (netloc)
"ix_web_pages_priority" btree (priority)
"ix_web_pages_state" btree (state)
"web_pages_netloc_fetchtime_idx" btree (netloc, fetchtime)
"web_pages_netloc_id_idx" btree (netloc, id)
Foreign-key constraints:
"web_pages_file_fkey" FOREIGN KEY (file) REFERENCES web_files(id)
Tablespace: "main_webarchive_tablespace"
The query:
EXPLAIN ANALYZE UPDATE
web_pages
SET
state = 'new'
WHERE
(state = 'fetching' OR state = 'processing')
AND
id <= 150000000;
In this case, since I have a covering index (ix_web_pages_id_state), I would expect the query planner to do an index-only scan. Instead, it generates a bitmap heap scan, which is much slower:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages (cost=524.06..532.09 rows=2 width=671) (actual time=2356.900..2356.900 rows=0 loops=1)
-> Bitmap Heap Scan on web_pages (cost=524.06..532.09 rows=2 width=671) (actual time=2356.896..2356.896 rows=0 loops=1)
Recheck Cond: (((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum)) AND (id <= 150000000))
Heap Blocks: exact=6
-> BitmapAnd (cost=524.06..524.06 rows=2 width=0) (actual time=2353.388..2353.388 rows=0 loops=1)
-> BitmapOr (cost=151.98..151.98 rows=6779 width=0) (actual time=2021.635..2021.636 rows=0 loops=1)
-> Bitmap Index Scan on ix_web_pages_state (cost=0.00..147.41 rows=6779 width=0) (actual time=2021.616..2021.617 rows=11668131 loops=1)
Index Cond: (state = 'fetching'::dlstate_enum)
-> Bitmap Index Scan on ix_web_pages_state (cost=0.00..4.57 rows=1 width=0) (actual time=0.015..0.016 rows=0 loops=1)
Index Cond: (state = 'processing'::dlstate_enum)
-> Bitmap Index Scan on web_pages_pkey (cost=0.00..371.83 rows=16435 width=0) (actual time=0.046..0.047 rows=205 loops=1)
Index Cond: (id <= 150000000)
Planning time: 0.232 ms
Execution time: 2406.234 ms
(14 rows)
If I force it not to do a bitmap heap scan (via set enable_bitmapscan to off;), it generates a much faster plan:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages (cost=0.56..38591.75 rows=2 width=671) (actual time=0.284..0.285 rows=0 loops=1)
-> Index Scan using web_pages_pkey on web_pages (cost=0.56..38591.75 rows=2 width=671) (actual time=0.281..0.281 rows=0 loops=1)
Index Cond: (id <= 150000000)
Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
Rows Removed by Filter: 181
Planning time: 0.190 ms
Execution time: 0.334 ms
(7 rows)
I re-ran a VACUUM ANALYZE to see if the table statistics might be stale, but that didn't seem to make any difference. Also, the above plans were captured after re-running the same query multiple times, so caching shouldn't be relevant either.
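One way to sanity-check for stale statistics and accumulated dead rows is the pg_stat_user_tables view (a sketch; these are standard columns of that view):

SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum, last_analyze
FROM pg_stat_user_tables
WHERE relname = 'web_pages';

A large n_dead_tup relative to n_live_tup would point at vacuum not keeping up.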
How can I induce the planner to generate a better-performing plan here?
Edit: As suggested in the comments, I added an index "ix_web_pages_state_id" btree (state, id). Unfortunately, it didn't help.
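For reference, the statement was the obvious one (reconstructed here to match the index definition above):

CREATE INDEX ix_web_pages_state_id ON web_pages (state, id);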
I've also tried reducing random_page_cost (down as far as 0.5), as well as increasing the statistics target, neither of which had any effect.
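For anyone reproducing this, the knobs in question were set along these lines (the specific values and the choice of column are illustrative, not my exact commands):

SET random_page_cost = 0.5;
-- column choice is an assumption; state is the skewed column here
ALTER TABLE web_pages ALTER COLUMN state SET STATISTICS 1000;
ANALYZE web_pages;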
Further edit - removing the OR condition:
EXPLAIN ANALYZE UPDATE
web_pages
SET
state = 'new'
WHERE
state = 'fetching'
AND
id <= 150000000;
yields:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages (cost=311.83..315.84 rows=1 width=589) (actual time=2574.654..2574.655 rows=0 loops=1)
-> Bitmap Heap Scan on web_pages (cost=311.83..315.84 rows=1 width=589) (actual time=2574.650..2574.651 rows=0 loops=1)
Recheck Cond: ((id <= 150000000) AND (state = 'fetching'::dlstate_enum))
Heap Blocks: exact=6
-> BitmapAnd (cost=311.83..311.83 rows=1 width=0) (actual time=2574.556..2574.556 rows=0 loops=1)
-> Bitmap Index Scan on web_pages_pkey (cost=0.00..49.60 rows=1205 width=0) (actual time=0.679..0.680 rows=726 loops=1)
Index Cond: (id <= 150000000)
-> Bitmap Index Scan on ix_web_pages_state (cost=0.00..261.98 rows=7122 width=0) (actual time=2519.950..2519.951 rows=11873888 loops=1)
Index Cond: (state = 'fetching'::dlstate_enum)
Further edit - MOAR WEIRDNESS:
I rewrote the query to use a subquery:
EXPLAIN ANALYZE UPDATE
web_pages
SET
state = 'new'
WHERE
(state = 'fetching' OR state = 'processing')
AND
id IN (
SELECT
id
FROM
web_pages
WHERE
id <= 150000000
);
This produces an execution plan that, so far, beats all the others. Sometimes:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages (cost=1.12..13878.31 rows=1 width=595) (actual time=2.773..2.774 rows=0 loops=1)
-> Nested Loop (cost=1.12..13878.31 rows=1 width=595) (actual time=2.772..2.773 rows=0 loops=1)
-> Index Scan using web_pages_pkey on web_pages web_pages_1 (cost=0.56..3533.34 rows=1205 width=14) (actual time=0.000..0.602 rows=181 loops=1)
Index Cond: (id <= 150000000)
-> Index Scan using web_pages_pkey on web_pages (cost=0.56..8.58 rows=1 width=585) (actual time=0.010..0.010 rows=0 loops=181)
Index Cond: (id = web_pages_1.id)
Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
Rows Removed by Filter: 1
Planning time: 0.891 ms
Execution time: 2.894 ms
(10 rows)
And sometimes this instead:
Update on web_pages (cost=21193.19..48917.78 rows=2 width=595)
-> Hash Semi Join (cost=21193.19..48917.78 rows=2 width=595)
Hash Cond: (web_pages.id = web_pages_1.id)
-> Bitmap Heap Scan on web_pages (cost=270.14..27976.00 rows=7126 width=585)
Recheck Cond: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
-> BitmapOr (cost=270.14..270.14 rows=7126 width=0)
-> Bitmap Index Scan on ix_web_pages_state (cost=0.00..262.01 rows=7126 width=0)
Index Cond: (state = 'fetching'::dlstate_enum)
-> Bitmap Index Scan on ix_web_pages_state (cost=0.00..4.57 rows=1 width=0)
Index Cond: (state = 'processing'::dlstate_enum)
-> Hash (cost=20834.15..20834.15 rows=7112 width=14)
-> Index Scan using web_pages_pkey on web_pages web_pages_1 (cost=0.56..20834.15 rows=7112 width=14)
Index Cond: ((id > 1883250000) AND (id <= 1883300000))
At this point, I have no idea what's going on. All I know is that every one of these cases responds to set enable_bitmapscan to off;.
OK - the extremely long-running transaction I had going overnight finished, and I managed to run a VACUUM VERBOSE ANALYZE on the table:
webarchive=# VACUUM ANALYZE VERBOSE web_pages;
INFO: vacuuming "public.web_pages"
INFO: scanned index "ix_web_pages_distance" to remove 33328301 row versions
DETAIL: CPU 6.85s/21.21u sec elapsed 171.28 sec
INFO: scanned index "ix_web_pages_fetchtime" to remove 33328301 row versions
DETAIL: CPU 6.20s/25.28u sec elapsed 186.53 sec
INFO: scanned index "ix_web_pages_id" to remove 33328301 row versions
DETAIL: CPU 7.37s/29.56u sec elapsed 226.49 sec
INFO: scanned index "ix_web_pages_netloc" to remove 33328301 row versions
DETAIL: CPU 8.47s/41.44u sec elapsed 260.50 sec
INFO: scanned index "ix_web_pages_priority" to remove 33328301 row versions
DETAIL: CPU 5.65s/16.35u sec elapsed 180.78 sec
INFO: scanned index "ix_web_pages_state" to remove 33328301 row versions
DETAIL: CPU 4.51s/21.14u sec elapsed 189.60 sec
INFO: scanned index "ix_web_pages_url" to remove 33328301 row versions
DETAIL: CPU 26.59s/78.52u sec elapsed 969.99 sec
INFO: scanned index "web_pages_netloc_fetchtime_idx" to remove 33328301 row versions
DETAIL: CPU 8.23s/48.39u sec elapsed 301.37 sec
INFO: scanned index "web_pages_netloc_id_idx" to remove 33328301 row versions
DETAIL: CPU 15.52s/43.25u sec elapsed 423.68 sec
INFO: scanned index "web_pages_pkey" to remove 33328301 row versions
DETAIL: CPU 8.12s/33.43u sec elapsed 215.93 sec
INFO: scanned index "ix_web_pages_id_state" to remove 33328301 row versions
DETAIL: CPU 8.22s/33.26u sec elapsed 214.43 sec
INFO: scanned index "ix_web_pages_state_id" to remove 33328301 row versions
DETAIL: CPU 6.01s/28.04u sec elapsed 174.19 sec
INFO: "web_pages": removed 33328301 row versions in 3408348 pages
DETAIL: CPU 89.90s/50.24u sec elapsed 1928.70 sec
INFO: index "ix_web_pages_distance" now contains 29463963 row versions in 215671 pages
DETAIL: 33328301 index row versions were removed.
32914 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_web_pages_fetchtime" now contains 29463982 row versions in 253375 pages
DETAIL: 33328301 index row versions were removed.
40460 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_web_pages_id" now contains 29464000 row versions in 238212 pages
DETAIL: 33328301 index row versions were removed.
21081 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_web_pages_netloc" now contains 29464025 row versions in 358150 pages
DETAIL: 33328301 index row versions were removed.
99235 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_web_pages_priority" now contains 29464032 row versions in 214923 pages
DETAIL: 33328301 index row versions were removed.
21451 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_web_pages_state" now contains 29466359 row versions in 215150 pages
DETAIL: 33328301 index row versions were removed.
81340 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_web_pages_url" now contains 29466350 row versions in 1137027 pages
DETAIL: 33197635 index row versions were removed.
236405 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "web_pages_netloc_fetchtime_idx" now contains 29466381 row versions in 539255 pages
DETAIL: 33328301 index row versions were removed.
220594 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "web_pages_netloc_id_idx" now contains 29466392 row versions in 501276 pages
DETAIL: 33328301 index row versions were removed.
144217 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "web_pages_pkey" now contains 29466394 row versions in 236560 pages
DETAIL: 33173411 index row versions were removed.
20559 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_web_pages_id_state" now contains 29466415 row versions in 256699 pages
DETAIL: 33328301 index row versions were removed.
27194 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "ix_web_pages_state_id" now contains 29466435 row versions in 244076 pages
DETAIL: 33328301 index row versions were removed.
91918 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "web_pages": found 33339704 removable, 29367176 nonremovable row versions in 4224021 out of 4231795 pages
DETAIL: 2541 dead row versions cannot be removed yet.
There were 2079389 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 330.54s/537.34u sec elapsed 7707.90 sec.
INFO: vacuuming "pg_toast.pg_toast_705758310"
INFO: scanned index "pg_toast_705758310_index" to remove 7184381 row versions
DETAIL: CPU 7.32s/13.70u sec elapsed 240.71 sec
INFO: "pg_toast_705758310": removed 7184381 row versions in 2271192 pages
DETAIL: CPU 62.81s/46.41u sec elapsed 1416.12 sec
INFO: index "pg_toast_705758310_index" now contains 114558558 row versions in 338256 pages
DETAIL: 7184381 index row versions were removed.
2033 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_705758310": found 7184381 removable, 40907769 nonremovable row versions in 11388831 out of 29033065 pages
DETAIL: 5 dead row versions cannot be removed yet.
There were 74209 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 433.26s/247.73u sec elapsed 8444.85 sec.
INFO: analyzing "public.web_pages"
INFO: "web_pages": scanned 600000 of 4232727 pages, containing 4191579 live rows and 4552 dead rows; 600000 rows in sample, 29569683 estimated total rows
VACUUM
It's still generating a plan that isn't index-only, though the execution time is now much more in line with the index-only one. I don't understand why the behaviour changed so much. Can a query that runs for a very long time cause that much overhead?
webarchive=# EXPLAIN ANALYZE UPDATE
web_pages
SET
state = 'new'
WHERE
(state = 'fetching' OR state = 'processing')
AND
id IN (
SELECT
id
FROM
web_pages
WHERE
id > 1883250000
AND
id <= 1883300000
);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages (cost=36.00..9936.00 rows=1 width=594) (actual time=37.856..37.857 rows=0 loops=1)
-> Nested Loop Semi Join (cost=36.00..9936.00 rows=1 width=594) (actual time=37.852..37.853 rows=0 loops=1)
-> Bitmap Heap Scan on web_pages (cost=35.44..3167.00 rows=788 width=584) (actual time=23.984..31.489 rows=2321 loops=1)
Recheck Cond: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
Heap Blocks: exact=2009
-> BitmapOr (cost=35.44..35.44 rows=788 width=0) (actual time=22.347..22.348 rows=0 loops=1)
-> Bitmap Index Scan on ix_web_pages_state (cost=0.00..30.47 rows=788 width=0) (actual time=22.326..22.327 rows=9202 loops=1)
Index Cond: (state = 'fetching'::dlstate_enum)
-> Bitmap Index Scan on ix_web_pages_state_id (cost=0.00..4.57 rows=1 width=0) (actual time=0.017..0.017 rows=0 loops=1)
Index Cond: (state = 'processing'::dlstate_enum)
-> Index Scan using ix_web_pages_id_state on web_pages web_pages_1 (cost=0.56..8.58 rows=1 width=14) (actual time=0.001..0.001 rows=0 loops=2321)
Index Cond: ((id = web_pages.id) AND (id > 1883250000) AND (id <= 1883300000))
Planning time: 2.677 ms
Execution time: 37.945 ms
(14 rows)
Interestingly, the value of the ID offset seems to affect planning:
webarchive=# EXPLAIN ANALYZE UPDATE
web_pages
SET
state = 'new'
WHERE
(state = 'fetching' OR state = 'processing')
AND
id IN (
SELECT
id
FROM
web_pages
WHERE
id > 149950000
AND
id <= 150000000
);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages (cost=1.12..17.18 rows=1 width=594) (actual time=0.030..0.031 rows=0 loops=1)
-> Nested Loop (cost=1.12..17.18 rows=1 width=594) (actual time=0.026..0.028 rows=0 loops=1)
-> Index Scan using ix_web_pages_id_state on web_pages web_pages_1 (cost=0.56..8.58 rows=1 width=14) (actual time=0.022..0.024 rows=0 loops=1)
Index Cond: ((id > 149950000) AND (id <= 150000000))
-> Index Scan using ix_web_pages_id_state on web_pages (cost=0.56..8.59 rows=1 width=584) (never executed)
Index Cond: (id = web_pages_1.id)
Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
Planning time: 1.531 ms
Execution time: 0.155 ms
(9 rows)
Does the query planner take the values of the query parameters into account when planning? I would have thought planning would be independent of the parameter values, but now that I think about it, using the values to improve the plan makes sense, so I can see it working that way.
Interestingly, the bitmap scan now seems to perform much better:
webarchive=# set enable_bitmapscan to off;
SET
webarchive=# EXPLAIN ANALYZE UPDATE
web_pages
SET
state = 'new'
WHERE
(state = 'fetching' OR state = 'processing')
AND
id IN (
SELECT
id
FROM
web_pages
WHERE
id > 1883250000
AND
id <= 1883300000
);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages (cost=1.12..82226.59 rows=1 width=594) (actual time=66.993..66.994 rows=0 loops=1)
-> Nested Loop (cost=1.12..82226.59 rows=1 width=594) (actual time=66.992..66.993 rows=0 loops=1)
-> Index Scan using web_pages_pkey on web_pages web_pages_1 (cost=0.56..21082.82 rows=7166 width=14) (actual time=0.055..20.206 rows=8567 loops=1)
Index Cond: ((id > 1883250000) AND (id <= 1883300000))
-> Index Scan using web_pages_pkey on web_pages (cost=0.56..8.52 rows=1 width=584) (actual time=0.004..0.004 rows=0 loops=8567)
Index Cond: (id = web_pages_1.id)
Filter: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
Rows Removed by Filter: 1
Planning time: 1.963 ms
Execution time: 67.112 ms
(10 rows)
webarchive=# set enable_bitmapscan to on;
SET
webarchive=# EXPLAIN ANALYZE UPDATE
web_pages
SET
state = 'new'
WHERE
(state = 'fetching' OR state = 'processing')
AND
id IN (
SELECT
id
FROM
web_pages
WHERE
id > 1883250000
AND
id <= 1883300000
);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on web_pages (cost=36.00..9936.00 rows=1 width=594) (actual time=23.331..23.331 rows=0 loops=1)
-> Nested Loop Semi Join (cost=36.00..9936.00 rows=1 width=594) (actual time=23.327..23.328 rows=0 loops=1)
-> Bitmap Heap Scan on web_pages (cost=35.44..3167.00 rows=788 width=584) (actual time=6.727..17.027 rows=1966 loops=1)
Recheck Cond: ((state = 'fetching'::dlstate_enum) OR (state = 'processing'::dlstate_enum))
Heap Blocks: exact=3825
-> BitmapOr (cost=35.44..35.44 rows=788 width=0) (actual time=3.499..3.499 rows=0 loops=1)
-> Bitmap Index Scan on ix_web_pages_state (cost=0.00..30.47 rows=788 width=0) (actual time=3.471..3.472 rows=21996 loops=1)
Index Cond: (state = 'fetching'::dlstate_enum)
-> Bitmap Index Scan on ix_web_pages_state_id (cost=0.00..4.57 rows=1 width=0) (actual time=0.022..0.023 rows=0 loops=1)
Index Cond: (state = 'processing'::dlstate_enum)
-> Index Scan using ix_web_pages_id_state on web_pages web_pages_1 (cost=0.56..8.58 rows=1 width=14) (actual time=0.001..0.001 rows=0 loops=1966)
Index Cond: ((id = web_pages.id) AND (id > 1883250000) AND (id <= 1883300000))
Planning time: 0.774 ms
Execution time: 23.425 ms
(14 rows)
So I think the issue was just the index containing LOTS of rows that were no longer valid, and filtering those out was the primary time cost. The underlying issue here is (I think) the way the MVCC system interacts with the VACUUM system in the context of extremely long-running transactions.
It would make sense (in retrospect) that entries cannot be removed from an index until every transaction that could use that index has completed. From the documentation:
But there is an additional requirement for any table scan in PostgreSQL: it must verify that each retrieved row be "visible" to the query's MVCC snapshot, as discussed in Chapter 13. Visibility information is not stored in index entries, only in heap entries; so at first glance it would seem that every row retrieval would require a heap access anyway. And this is indeed the case, if the table row has been modified recently. However, for seldom-changing data there is a way around this problem.
In this case, I started a db dump, and then went on and did a bunch of cleanup (which involved a LOT of row churn). That would lead to lots of heap lookups for each index query, since the index contained lots of now-deleted rows.
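If you want to check for this situation yourself, a held-open transaction shows up in pg_stat_activity (a sketch using standard columns of that view):

SELECT pid, state, now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 5;

Anything with an xact_age measured in hours is a candidate for holding back vacuum's dead-tuple cleanup.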
This is mostly hypothetical, though, since I don't have the resources to try to recreate the situation.
Anyways, @jjanes's hint about long-running queries was the key to finding my way down the rabbit hole here.
This part is clearly deranged:
Bitmap Index Scan on ix_web_pages_state (cost=0.00..147.41 rows=6779 width=0) (actual time=2021.616..2021.617 rows=11668131 loops=1)
It is finding over a thousand times more rows than it thinks it will. A VACUUM ANALYZE should have fixed this. Perhaps your vacuum didn't do much because some long-open transaction is preventing it from removing dead tuples?
Doing VACUUM VERBOSE ANALYZE could shed some light on that.
You can also explore this further by using the test query:
Try to get it to run as both a bitmap scan and a regular index scan by fiddling with the enable_* parameters, so we can compare the two reports. In particular, a held-open transaction will lead to a higher actual row count for the bitmap index scan node, since it must count rows before testing them for visibility, while the ordinary index scan discards invisible rows before incrementing the count.
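A sketch of that comparison, using a SELECT count(*) stand-in for the UPDATE's WHERE clause (the exact test query isn't reproduced above):

EXPLAIN ANALYZE SELECT count(*) FROM web_pages
WHERE (state = 'fetching' OR state = 'processing') AND id <= 150000000;
SET enable_bitmapscan = off;
EXPLAIN ANALYZE SELECT count(*) FROM web_pages
WHERE (state = 'fetching' OR state = 'processing') AND id <= 150000000;
SET enable_bitmapscan = on;

Comparing the actual row counts on the two scan nodes then shows how many invisible tuples the bitmap scan is having to chew through.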