Posso ativar o PITR depois que o banco de dados foi usado

Question

Ed Sabol

Asked: 2019-12-05 22:29:47 +0800 CST2019-12-05 22:29:47 +0800 CST 2019-12-05 22:29:47 +0800 CST

O pg_stats é enorme e está deixando o servidor PostgreSQL lento.... O que eu faço?

772

Eu tenho dois servidores rodando PostgreSQL 10.11 que são configurados da mesma forma, rodam em hardware comparável, contêm os mesmos bancos de dados e (principalmente) as mesmas tabelas.

Várias verificações do Nagios usando check_postgresa verificação desses servidores rotineiramente, incluindo uma que verifica se há "inchaço".

Em um servidor, essa verificação de inchaço leva de 3 a 4 segundos.

No outro servidor, a verificação de inchaço leva de 24 a 30 minutos . EDIT: Leva de 55 a 60 minutos a partir de 14/01/2020.

Além disso, essa condição começou há 3 dias. Antes disso, a verificação de inchaço era executada em um período de tempo razoável (menos de ~ 10 segundos) no servidor agora "lento".

Após investigação adicional sobre a consulta "check_bloat" usada pelo check_postgres, descobri o que acho que pode ser o motivo pelo qual é muito mais lento em um servidor do que no outro. A pg_statstabela contém aproximadamente 815.000 entradas no servidor lento. No servidor "rápido", ele contém apenas ~45.000 entradas. EDIT: Quarenta e poucos dias depois, a pg_statstabela contém ~ 838.000 entradas a partir de 14/01/2020.

O que é que eu posso fazer sobre isto? Existe uma maneira de reduzir a tabela pg_stats( pg_statistic)?

Eu tentei reiniciar o servidor "lento" e fazer um vacuum analyze;nele. Nenhum efeito.

Tentei executar select pg_stat_reset()em todos os bancos de dados no servidor "lento", mas não parecia fazer nada. (O número de linhas em pg_statsnão mudou.)

EDIT: Aqui está o SQL:

SELECT  ns.nspname, tbl.relname, hdr, ma, bs,
        SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,
        MAX(coalesce(null_frac,0)) AS maxfracsum,
        hdr+(
          SELECT 1+count(*)/8
          FROM pg_stats s2
          WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname
        ) AS nullhdr
      FROM pg_attribute att
      JOIN pg_class tbl ON att.attrelid = tbl.oid
      JOIN pg_namespace ns ON ns.oid = tbl.relnamespace
      LEFT JOIN pg_stats s ON s.schemaname=ns.nspname
      AND s.tablename = tbl.relname
      AND s.inherited=false
      AND s.attname=att.attname,
      (
        SELECT
          (SELECT current_setting('block_size')::numeric) AS bs,
            CASE WHEN SUBSTRING(SPLIT_PART(v, ' ', 2) FROM '#\"[0-9]+.[0-9]+#\"%' for '#')
              IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
          CASE WHEN v ~ 'mingw32' OR v ~ '64-bit' THEN 8 ELSE 4 END AS ma
        FROM (SELECT version() AS v) AS foo
      ) AS constants
      WHERE att.attnum > 0 AND tbl.relkind='r'
      GROUP BY 1,2,3,4,5;

EDIT: Aqui está a saída do EXPLAIN (ANALYZE,BUFFERS)servidor "lento":

                                                                                                                                            QUERY PLAN                                                                                                                                            
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=1935395.99..4529587255.06 rows=4097995 width=188) (actual time=106609.972..396360.540 rows=96069 loops=1)
   Group Key: ns.nspname, tbl.relname, CASE WHEN ("substring"(split_part(version(), ' '::text, 2), '#\"[0-9]+.[0-9]+#\"%'::text, '#'::text) = ANY ('{8.0,8.1,8.2}'::text[])) THEN 27 ELSE 23 END, CASE WHEN ((version() ~ 'mingw32'::text) OR (version() ~ '64-bit'::text)) THEN 8 ELSE 4 END, $0
   Buffers: shared hit=81434823 read=621336, temp read=340059 written=340054
   InitPlan 1 (returns $0)
     ->  Result  (cost=0.00..0.02 rows=1 width=32) (actual time=0.022..0.023 rows=1 loops=1)
   ->  Merge Left Join  (cost=1935395.97..2063714.73 rows=4097995 width=176) (actual time=106594.140..263322.505 rows=13304172 loops=1)
         Merge Cond: ((ns.nspname = s.schemaname) AND (tbl.relname = s.tablename) AND (att.attname = s.attname))
         Buffers: shared hit=6832941 read=331003, temp read=340059 written=340054
         ->  Sort  (cost=1427752.62..1437997.60 rows=4097995 width=192) (actual time=100214.124..113229.530 rows=13304172 loops=1)
               Sort Key: ns.nspname, tbl.relname, att.attname
               Sort Method: external merge  Disk: 2629600kB
               Buffers: shared hit=52022 read=282240, temp read=328700 written=328701
               ->  Hash Join  (cost=15335.36..599474.35 rows=4097995 width=192) (actual time=242.121..56135.695 rows=13304172 loops=1)
                     Hash Cond: (tbl.relnamespace = ns.oid)
                     Buffers: shared hit=52022 read=282240
                     ->  Hash Join  (cost=15246.41..588609.50 rows=4097995 width=132) (actual time=237.991..35301.264 rows=13304172 loops=1)
                           Hash Cond: (att.attrelid = tbl.oid)
                           Buffers: shared hit=52006 read=282219
                           ->  Seq Scan on pg_attribute att  (cost=0.00..532962.28 rows=15390604 width=68) (actual time=0.004..13225.400 rows=13870158 loops=1)
                                 Filter: (attnum > 0)
                                 Rows Removed by Filter: 1147617
                                 Buffers: shared hit=42449 read=282219
                           ->  Hash  (cost=14050.06..14050.06 rows=95708 width=72) (actual time=237.495..237.496 rows=96069 loops=1)
                                 Buckets: 131072  Batches: 1  Memory Usage: 10782kB
                                 Buffers: shared hit=9557
                                 ->  Seq Scan on pg_class tbl  (cost=0.00..14050.06 rows=95708 width=72) (actual time=0.005..149.710 rows=96069 loops=1)
                                       Filter: (relkind = 'r'::"char")
                                       Rows Removed by Filter: 270863
                                       Buffers: shared hit=9557
                     ->  Hash  (cost=58.42..58.42 rows=2442 width=68) (actual time=4.095..4.096 rows=2393 loops=1)
                           Buckets: 4096  Batches: 1  Memory Usage: 266kB
                           Buffers: shared hit=13 read=21
                           ->  Seq Scan on pg_namespace ns  (cost=0.00..58.42 rows=2442 width=68) (actual time=0.007..2.020 rows=2393 loops=1)
                                 Buffers: shared hit=13 read=21
         ->  Sort  (cost=507643.36..507707.46 rows=25640 width=200) (actual time=6379.663..6409.276 rows=37000 loops=1)
               Sort Key: s.schemaname, s.tablename, s.attname
               Sort Method: quicksort  Memory: 11365kB
               Buffers: shared hit=6780919 read=48763, temp read=11359 written=11353
               ->  Subquery Scan on s  (cost=23845.50..505765.73 rows=25640 width=200) (actual time=1299.167..6301.807 rows=37000 loops=1)
                     Buffers: shared hit=6780919 read=48763, temp read=11359 written=11353
                     ->  Gather  (cost=23845.50..505509.33 rows=25640 width=401) (actual time=1299.164..6213.093 rows=37000 loops=1)
                           Workers Planned: 2
                           Workers Launched: 2
                           Buffers: shared hit=6780919 read=48763, temp read=11359 written=11353
                           ->  Hash Left Join  (cost=22845.50..501945.33 rows=10683 width=401) (actual time=1127.147..5861.412 rows=12333 loops=3)
                                 Hash Cond: (c.relnamespace = n.oid)
                                 Buffers: shared hit=6780919 read=48763, temp read=11359 written=11353
                                 ->  Hash Join  (cost=22756.56..501828.29 rows=10683 width=140) (actual time=1121.277..5826.907 rows=12333 loops=3)
                                       Hash Cond: (s_1.starelid = c.oid)
                                       Join Filter: has_column_privilege(c.oid, a.attnum, 'select'::text)
                                       Rows Removed by Join Filter: 266947
                                       Buffers: shared hit=6780817 read=48763, temp read=11359 written=11353
                                       ->  Nested Loop  (cost=0.43..462367.09 rows=416791 width=82) (actual time=0.061..2480.113 rows=279280 loops=3)
                                             Buffers: shared hit=3354232 read=48763
                                             ->  Parallel Seq Scan on pg_statistic s_1  (cost=0.00..53955.73 rows=236437 width=14) (actual time=0.030..405.942 rows=279280 loops=3)
                                                   Filter: (NOT stainherit)
                                                   Buffers: shared hit=464 read=48763
                                             ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  (cost=0.43..3.02 rows=2 width=70) (actual time=0.004..0.004 rows=1 loops=837840)
                                                   Index Cond: ((attrelid = s_1.starelid) AND (attnum = s_1.staattnum))
                                                   Filter: (NOT attisdropped)
                                                   Buffers: shared hit=3353768
                                       ->  Hash  (cost=14050.06..14050.06 rows=359445 width=72) (actual time=1118.810..1118.811 rows=366932 loops=3)
                                             Buckets: 262144  Batches: 2  Memory Usage: 20667kB
                                             Buffers: shared hit=28671, temp written=6186
                                             ->  Seq Scan on pg_class c  (cost=0.00..14050.06 rows=359445 width=72) (actual time=0.007..551.732 rows=366932 loops=3)
                                                   Filter: ((NOT relrowsecurity) OR (NOT row_security_active(oid)))
                                                   Buffers: shared hit=28671
                                 ->  Hash  (cost=58.42..58.42 rows=2442 width=68) (actual time=5.839..5.840 rows=2393 loops=3)
                                       Buckets: 4096  Batches: 1  Memory Usage: 266kB
                                       Buffers: shared hit=102
                                       ->  Seq Scan on pg_namespace n  (cost=0.00..58.42 rows=2442 width=68) (actual time=0.017..2.882 rows=2393 loops=3)
                                             Buffers: shared hit=102
   SubPlan 2
     ->  Aggregate  (cost=1104.73..1104.74 rows=1 width=8) (actual time=1.149..1.150 rows=1 loops=96069)
           Buffers: shared hit=74601882 read=290333
           ->  Nested Loop  (cost=1.56..1104.72 rows=1 width=401) (actual time=1.144..1.147 rows=0 loops=96069)
                 Join Filter: (c_1.oid = s_2.starelid)
                 Buffers: shared hit=74601882 read=290333
                 ->  Nested Loop  (cost=1.14..1103.30 rows=1 width=10) (actual time=1.141..1.143 rows=0 loops=96069)
                       Buffers: shared hit=74440049 read=284290
                       ->  Nested Loop  (cost=0.70..16.75 rows=1 width=4) (actual time=0.181..0.356 rows=1 loops=96069)
                             Join Filter: (c_1.relnamespace = n_1.oid)
                             Rows Removed by Join Filter: 57
                             Buffers: shared hit=19268934 read=1187
                             ->  Index Scan using pg_class_relname_nsp_index on pg_class c_1  (cost=0.42..8.44 rows=1 width=8) (actual time=0.006..0.069 rows=58 loops=96069)
                                   Index Cond: (relname = tbl.relname)
                                   Filter: ((NOT relrowsecurity) OR (NOT row_security_active(oid)))
                                   Buffers: shared hit=2617035 read=1168
                             ->  Index Scan using pg_namespace_nspname_index on pg_namespace n_1  (cost=0.28..8.30 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=5531986)
                                   Index Cond: (nspname = ns.nspname)
                                   Buffers: shared hit=16651899 read=19
                       ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a_1  (cost=0.43..1084.53 rows=201 width=6) (actual time=0.782..0.782 rows=0 loops=96069)
                             Index Cond: (attrelid = c_1.oid)
                             Filter: ((NOT attisdropped) AND has_column_privilege(c_1.oid, attnum, 'select'::text))
                             Rows Removed by Filter: 144
                             Buffers: shared hit=55171115 read=283103
                 ->  Index Scan using pg_statistic_relid_att_inh_index on pg_statistic s_2  (cost=0.43..1.07 rows=28 width=6) (actual time=0.003..0.004 rows=0 loops=43518)
                       Index Cond: ((starelid = a_1.attrelid) AND (staattnum = a_1.attnum))
                       Filter: (stanullfrac <> '0'::double precision)
                       Rows Removed by Filter: 0
                       Buffers: shared hit=161833 read=6043
 Planning time: 3.341 ms
 Execution time: 397056.086 ms
(103 rows)

E aqui está a saída do EXPLAIN (ANALYZE,BUFFERS)servidor "rápido":

                                                                                                                                            QUERY PLAN                                                                                                                                            
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=18339.88..841921.77 rows=13555 width=188) (actual time=1379.248..8600.924 rows=2694 loops=1)
   Group Key: ns.nspname, tbl.relname, CASE WHEN ("substring"(split_part(version(), ' '::text, 2), '#\"[0-9]+.[0-9]+#\"%'::text, '#'::text) = ANY ('{8.0,8.1,8.2}'::text[])) THEN 27 ELSE 23 END, CASE WHEN ((version() ~ 'mingw32'::text) OR (version() ~ '64-bit'::text)) THEN 8 ELSE 4 END, $0
   Buffers: shared hit=3648911
   InitPlan 1 (returns $0)
     ->  Result  (cost=0.00..0.02 rows=1 width=32) (actual time=0.026..0.027 rows=1 loops=1)
   ->  Merge Left Join  (cost=18339.86..18764.78 rows=13555 width=176) (actual time=1372.967..2097.150 rows=55107 loops=1)
         Merge Cond: ((ns.nspname = s.schemaname) AND (tbl.relname = s.tablename) AND (att.attname = s.attname))
         Buffers: shared hit=262164
         ->  Sort  (cost=5887.15..5921.04 rows=13555 width=192) (actual time=371.212..415.656 rows=55107 loops=1)
               Sort Key: ns.nspname, tbl.relname, att.attname
               Sort Method: quicksort  Memory: 16174kB
               Buffers: shared hit=3056
               ->  Hash Join  (cost=708.49..4956.84 rows=13555 width=192) (actual time=9.314..282.912 rows=55107 loops=1)
                     Hash Cond: (tbl.relnamespace = ns.oid)
                     Buffers: shared hit=3056
                     ->  Hash Join  (cost=670.68..4883.30 rows=13555 width=132) (actual time=7.529..194.704 rows=55107 loops=1)
                           Hash Cond: (att.attrelid = tbl.oid)
                           Buffers: shared hit=3038
                           ->  Seq Scan on pg_attribute att  (cost=0.00..4007.49 rows=78125 width=68) (actual time=0.004..77.499 rows=82849 loops=1)
                                 Filter: (attnum > 0)
                                 Rows Removed by Filter: 37659
                                 Buffers: shared hit=2595
                           ->  Hash  (cost=637.01..637.01 rows=2693 width=72) (actual time=7.497..7.497 rows=2694 loops=1)
                                 Buckets: 4096  Batches: 1  Memory Usage: 306kB
                                 Buffers: shared hit=443
                                 ->  Seq Scan on pg_class tbl  (cost=0.00..637.01 rows=2693 width=72) (actual time=0.007..5.110 rows=2694 loops=1)
                                       Filter: (relkind = 'r'::"char")
                                       Rows Removed by Filter: 12831
                                       Buffers: shared hit=443
                     ->  Hash  (cost=25.14..25.14 rows=1014 width=68) (actual time=1.757..1.758 rows=1036 loops=1)
                           Buckets: 2048 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 118kB
                           Buffers: shared hit=15
                           ->  Seq Scan on pg_namespace ns  (cost=0.00..25.14 rows=1014 width=68) (actual time=0.008..0.841 rows=1036 loops=1)
                                 Buffers: shared hit=15
         ->  Sort  (cost=12452.71..12453.04 rows=131 width=200) (actual time=1001.404..1031.723 rows=36997 loops=1)
               Sort Key: s.schemaname, s.tablename, s.attname
               Sort Method: quicksort  Memory: 11364kB
               Buffers: shared hit=259108
               ->  Subquery Scan on s  (cost=5455.47..12448.10 rows=131 width=200) (actual time=709.076..941.593 rows=36997 loops=1)
                     Buffers: shared hit=259108
                     ->  Hash Left Join  (cost=5455.47..12446.79 rows=131 width=401) (actual time=709.073..886.682 rows=36997 loops=1)
                           Hash Cond: (c.relnamespace = n.oid)
                           Buffers: shared hit=259108
                           ->  Hash Join  (cost=5417.65..12408.63 rows=131 width=140) (actual time=707.296..823.908 rows=36997 loops=1)
                                 Hash Cond: ((s_1.starelid = c.oid) AND (s_1.staattnum = a.attnum))
                                 Buffers: shared hit=259093
                                 ->  Seq Scan on pg_statistic s_1  (cost=0.00..6793.45 rows=19622 width=14) (actual time=0.014..45.484 rows=45420 loops=1)
                                       Filter: (NOT stainherit)
                                       Buffers: shared hit=6401
                                 ->  Hash  (cost=4852.66..4852.66 rows=37666 width=142) (actual time=706.898..706.899 rows=69779 loops=1)
                                       Buckets: 131072 (originally 65536)  Batches: 1 (originally 1)  Memory Usage: 12881kB
                                       Buffers: shared hit=252689
                                       ->  Hash Join  (cost=831.03..4852.66 rows=37666 width=142) (actual time=29.082..629.863 rows=69779 loops=1)
                                             Hash Cond: (a.attrelid = c.oid)
                                             Join Filter: has_column_privilege(c.oid, a.attnum, 'select'::text)
                                             Rows Removed by Join Filter: 50729
                                             Buffers: shared hit=252689
                                             ->  Seq Scan on pg_attribute a  (cost=0.00..3724.99 rows=112999 width=70) (actual time=0.005..117.968 rows=120508 loops=1)
                                                   Filter: (NOT attisdropped)
                                                   Buffers: shared hit=2595
                                             ->  Hash  (cost=637.01..637.01 rows=15521 width=72) (actual time=28.956..28.957 rows=15525 loops=1)
                                                   Buckets: 16384  Batches: 1  Memory Usage: 1705kB
                                                   Buffers: shared hit=443
                                                   ->  Seq Scan on pg_class c  (cost=0.00..637.01 rows=15521 width=72) (actual time=0.007..15.192 rows=15525 loops=1)
                                                         Filter: ((NOT relrowsecurity) OR (NOT row_security_active(oid)))
                                                         Buffers: shared hit=443
                           ->  Hash  (cost=25.14..25.14 rows=1014 width=68) (actual time=1.760..1.761 rows=1036 loops=1)
                                 Buckets: 2048 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 118kB
                                 Buffers: shared hit=15
                                 ->  Seq Scan on pg_namespace n  (cost=0.00..25.14 rows=1014 width=68) (actual time=0.012..0.858 rows=1036 loops=1)
                                       Buffers: shared hit=15
   SubPlan 2
     ->  Aggregate  (cost=60.64..60.66 rows=1 width=8) (actual time=2.376..2.377 rows=1 loops=2694)
           Buffers: shared hit=3386747
           ->  Nested Loop  (cost=1.39..60.63 rows=1 width=401) (actual time=2.280..2.369 rows=6 loops=2694)
                 Join Filter: (c_1.oid = s_2.starelid)
                 Buffers: shared hit=3386747
                 ->  Nested Loop  (cost=1.10..60.06 rows=1 width=10) (actual time=2.228..2.283 rows=16 loops=2694)
                       Buffers: shared hit=3262332
                       ->  Nested Loop  (cost=0.69..16.73 rows=1 width=4) (actual time=1.108..2.211 rows=1 loops=2694)
                             Join Filter: (c_1.relnamespace = n_1.oid)
                             Rows Removed by Join Filter: 364
                             Buffers: shared hit=3247620
                             ->  Index Scan using pg_class_relname_nsp_index on pg_class c_1  (cost=0.41..8.43 rows=1 width=8) (actual time=0.005..0.398 rows=365 loops=2694)
                                   Index Cond: (relname = tbl.relname)
                                   Filter: ((NOT relrowsecurity) OR (NOT row_security_active(oid)))
                                   Buffers: shared hit=290274
                             ->  Index Scan using pg_namespace_nspname_index on pg_namespace n_1  (cost=0.28..8.29 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=983801)
                                   Index Cond: (nspname = ns.nspname)
                                   Buffers: shared hit=2957346
                       ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a_1  (cost=0.42..43.29 rows=4 width=6) (actual time=0.019..0.045 rows=16 loops=2694)
                             Index Cond: (attrelid = c_1.oid)
                             Filter: ((NOT attisdropped) AND has_column_privilege(c_1.oid, attnum, 'select'::text))
                             Rows Removed by Filter: 10
                             Buffers: shared hit=14712
                 ->  Index Scan using pg_statistic_relid_att_inh_index on pg_statistic s_2  (cost=0.29..0.55 rows=1 width=6) (actual time=0.003..0.003 rows=0 loops=43518)
                       Index Cond: ((starelid = a_1.attrelid) AND (staattnum = a_1.attnum))
                       Filter: (stanullfrac <> '0'::double precision)
                       Rows Removed by Filter: 0
                       Buffers: shared hit=124415
 Planning time: 4.337 ms
 Execution time: 8605.391 ms
(102 rows)

1 respostas

Voted

Ed Sabol · Answer 1 · 2020-02-25T16:33:31+08:00

Best Answer

Ed Sabol

2020-02-25T16:33:31+08:002020-02-25T16:33:31+08:00

Bem, mistério resolvido. Graças aos comentários acima, vi que a saída EXPLAIN indicava que pg_attribute também era muito maior do que eu esperava. Depois de olhar para isso, vi qual era o problema. Um de nossos desenvolvedores teve um processo descontrolado que criou mais de cem mil tabelas que deveriam ser temporárias, mas não eram e então "esqueceu" de limpá-las. Eles estavam em um esquema não público, então não os notei até agora. Uma vez que eu larguei essas centenas de milhares de tabelas e executei vacuum full pg_attributee vacuum full pg_classe vacuum full pg_statistic, o problema que eu estava tendo com a consulta de inchaço do Nagios foi finalmente resolvido.

2

O pg_stats é enorme e está deixando o servidor PostgreSQL lento.... O que eu faço?

conectar ao servidor PostgreSQL: FATAL: nenhuma entrada pg_hba.conf para o host

Como fazer a saída do sqlplus aparecer em uma linha?

Selecione qual tem data máxima ou data mais recente

Como faço para listar todos os esquemas no PostgreSQL?

Listar todas as colunas de uma tabela especificada

Como usar o sqlplus para se conectar a um banco de dados Oracle localizado em outro host sem modificar meu próprio tnsnames.ora

Como você mysqldump tabela (s) específica (s)?

Listar os privilégios do banco de dados usando o psql

Como inserir valores em uma tabela de uma consulta de seleção no PostgreSQL?

Como faço para listar todos os bancos de dados e tabelas usando o psql?

O pg_stats é enorme e está deixando o servidor PostgreSQL lento.... O que eu faço?

1 respostas

relate perguntas