Eu tenho um servidor Plesk Ubuntu 22.04 com 120 GB de RAM e mariadb 10.6.16 que uso para executar o banco de dados de um único site. O banco de dados sozinho tem cerca de 8 GB, mas apesar de todas as otimizações que fiz, o mariadb ainda está rodando com mais de 140% da CPU. Isso resulta em tempos de resposta lentos do site.
Já fiz algumas pesquisas no Google e otimizei PHP FPM e MySQL mas apesar de tudo isso a velocidade do site ainda é lenta e estou apontando o dedo para o mariadb cujo processo está em mais de 140% do processador. Vou deixar para vocês a captura de tela abaixo.
Configuração PHP FPM
Eu configurei:
pm.max_children = 90
pm.max_requests = 200
pm = static
memory_limit = 10240M
max_execution_time = 300
max_input_time = 600
post_max_size = 1024M
upload_max_filesize = 512M
Configuração MariaDB
No /etc/mysql/my.cnf
arquivo, configurei esses valores
[mysqld] sql_mode=ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
bind-address = 127.0.0.1
local-infile=0
loose-local-infile=1
innodb_buffer_pool_size = 96G
innodb_log_file_size = 12G
innodb_log_buffer_size = 64M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
query_cache_type = 1
query_cache_size = 256M
query_cache_limit = 512M
query_cache_min_res_unit = 4k
innodb_lock_wait_timeout = 6000
Também é curioso mas na imagem da memória abaixo a parte em cache (cor bege) é muito pequena. Captura de tela da memória
O servidor em si estava funcionando bem em um servidor Ubuntu 20.04 de 64 Gb sem problemas. Mas desde esta migração é lenta. Mesmo que tenhamos passado a RAM de 64 Gb para 120 Go no servidor Ubuntu 22.04.
Alguém já enfrentou esse tipo de problema?
Agradeço antecipadamente por sua ajuda.
Cumprimentos
Aqui estão informações adicionais baseadas em comentários: A CPU do servidor é 24 x Processador AMD EPYC 7282 de 16 núcleos O disco rígido é NVMe
MariaDB [(none)]> SHOW ENGINE INNODB STATUS;
| InnoDB | |
=====================================
2024-02-06 09:18:33 0x7fd0721fe640 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 55 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 0 srv_active, 0 srv_shutdown, 386065 srv_idle
srv_master_thread log flush and writes: 386059
----------
SEMAPHORES
----------
------------
TRANSACTIONS
------------
Trx id counter 5295351
Purge done for trx's n:o < 5295349 undo n:o < 0 state: running
History list length 1
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION (0x7fe8c01d0880), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01cdc80), not started
mysql tables in use 1, locked 0
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01cb080), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c8480), not started
mysql tables in use 6, locked 0
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c4d80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01cbb80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c4280), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01d1380), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01cd180), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01ce780), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01cc680), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c3780), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c0b80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01d1e80), not started
mysql tables in use 2, locked 0
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c9a80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01cfd80), not started
mysql tables in use 1, locked 0
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01cf280), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01ca580), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c7980), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c2c80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c6e80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c2180), not started
mysql tables in use 5, locked 0
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c8f80), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c6380), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c5880), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION (0x7fe8c01c1680), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
--------
FILE I/O
--------
Pending flushes (fsync) log: 0; buffer pool: 0
406980 OS file reads, 1500489 OS file writes, 309011 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 1.07 writes/s, 0.56 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
insert 0, delete mark 0, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 10381178784
Log flushed up to 10381178282
Pages flushed up to 9758411800
Last checkpoint at 9758411788
0 pending log flushes, 0 pending chkp writes
1500490 log i/o's done, 1.07 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 103213432832
Dictionary memory allocated 613451784
Buffer pool size 6230016
Free buffers 5820886
Database pages 409130
Old database pages 151006
Modified db pages 21938
Percent of dirty pages(LRU & free pages): 0.352
Max dirty pages percent: 90.000
Pending reads 0
Pending writes: LRU 0, flush list 0
Pages made young 1300160, not young 5227092
0.04 youngs/s, 0.00 non-youngs/s
Pages read 406341, created 12733, written 0
0.00 reads/s, 0.02 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 409130, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 read views open inside InnoDB
Process ID=0, Main thread ID=0, state: sleeping
Number of rows inserted 774791, updated 948664, deleted 367, read 777328684815
0.85 inserts/s, 0.00 updates/s, 0.00 deletes/s, 1854176.96 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
Os demais comandos são resultados muito longos para serem inseridos na descrição com base no limite fixado pelo Stackexchange.
Outras informações:
MOSTRAR STATUS GLOBAL; https://justpaste.it/4iip0
MOSTRAR VARIÁVEIS GLOBAIS; https://justpaste.it/d9k81
STATUS; https://justpaste.it/fk95g
ulimit -a https://justpaste.it/bc50r
Thanks very much for your response.
I solve my problem by taking a Dedicated server instead of VPS.
The old VPS was : 120Go RAM 24 Core AMD CPU NVMe as storage Ubuntu 22.04
O novo é: 64Go RAM 18 Core Intel CPU NVMe como armazenamento Ubuntu 22.04
Sobre a otimização do Mariadb, usei a configuração abaixo e o site carregou muito melhor
Acho que foi um problema com o VPS.
Taxa por segundo = RPS
Sugestões para o prompt de comando do seu sistema operacional considerar,
Esta é uma variável dinâmica e após o teste precisará ser persistente durante os ciclos de parada/inicialização do sistema operacional. Nós podemos ajudar.
Sugestões a serem consideradas para sua seção 10.6.16 MariaDB my.cnf [mysqld]
There are many other opportunities to improve your configuration and reduce CPU busy percentage.
Observations:
The More Important Issues:
Please rerun SHOW GLOBAL STATUS after several hours. The following analysis will not be very useful because of having run for only 15 minutes.
6,190 queries per second with take a lot of CPU time. But is the qps normally that high? Or is this just during startup?
Suggested setting changes:
Details and other observations:
( ( Key_reads + Key_writes + Innodb_pages_read + Innodb_pages_written + Innodb_dblwr_writes + Innodb_buffer_pool_pages_flushed ) / innodb_io_capacity / Uptime ) = ( 0 + 0 + 400094 + 0 + 0 + 0 ) / 200 / 912 = 219.3%
-- This may be a metric indicating what innodb_io_capacity is set reasonably. -- Increase innodb_io_capacity (now 200) if the hardware can handle it.( table_open_cache ) = 8,000
-- Number of table descriptors to cache -- Several hundred is usually good.( innodb_lru_scan_depth ) = 1,536
-- innodb_lru_scan_depth is a very poorly named variable. A better name would be innodb_free_page_target_per_buffer_pool. It is a number of pages InnoDB tries to keep free in each buffer pool instance to speed up read and page creation operations. -- "InnoDB: page_cleaner: 1000ms intended loop took ..." may be fixed by lowering lru_scan_depth( Innodb_buffer_pool_pages_free * 16384 / innodb_buffer_pool_size ) = 4,791,284 * 16384 / 81920M = 91.4%
-- buffer pool free -- buffer_pool_size is bigger than working set; could decrease it( innodb_io_capacity ) = 200
-- When flushing, use this many IOPs. -- Reads could be slugghish or spiky. Use 2000 if using SSD drive.( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) = 4,791,284 / 5191680 = 92.3%
-- Pct of buffer_pool currently not in use -- innodb_buffer_pool_size (now 85899345920) is bigger than necessary?( innodb_io_capacity_max / innodb_io_capacity ) = 2,000 / 200 = 10
-- Capacity: max/plain -- Recommend 2. Max should be about equal to the IOPs your I/O subsystem can handle. (If the drive type is unknown 2000/200 may be a reasonable pair.)( innodb_change_buffering ) = innodb_change_buffering = none
-- Pre-5.6.11 / 5.5.31, there was a bug that made ="changes" a safer option. MariaDB 10.5.15 is moving toward "none" and deprecating in 10.9( Innodb_buffer_pool_bytes_data / innodb_buffer_pool_size ) = 6,560,088,064 / 81920M = 7.6%
-- Percent of buffer pool taken up by data -- A small percent may indicate that the buffer_pool is unnecessarily big.( innodb_doublewrite ) = innodb_doublewrite = OFF
-- ON leads to extra I/O, but extra safety in crash. -- OFF is OK for FusionIO, Galera, Replicas, ZFS, EXT4.( innodb_log_buffer_size ) = 1024M
-- Suggest 2MB-64MB, and at least as big as biggest blob set in transactions. -- Adjust innodb_log_buffer_size (now 1073741824).( Uptime / 60 * innodb_log_file_size / Innodb_os_log_written ) = 912 / 60 * 16384M / 4567040 = 57,177
-- Minutes between InnoDB log rotations Beginning with 5.6.8, innodb_log_file_size can be changed dynamically; I don't know about MariaDB. Be sure to also change my.cnf -- (The recommendation of 60 minutes between rotations is somewhat arbitrary.) Adjust innodb_log_file_size (now 17179869184). (Cannot change in AWS.)( default_tmp_storage_engine ) = default_tmp_storage_engine =
( innodb_flush_neighbors ) = innodb_flush_neighbors = 1
-- A minor optimization when writing blocks to disk. -- Use 0 for SSD drives; 1 for HDD.( ( Innodb_pages_read + Innodb_pages_written ) / Uptime / innodb_io_capacity ) = ( 400094 + 0 ) / 912 / 200 = 219.3%
-- If > 100%, need more io_capacity. -- Increase innodb_io_capacity (now 200) if the drives can handle it.( innodb_io_capacity ) = 200
-- I/O ops per second capable on disk . 100 for slow drives; 200 for spinning drives; 1000-2000 for SSDs; multiply by RAID factor. Limits write IO requests per second (IOPS). -- For starters: HDD: 200; SSD: 2000.( innodb_flush_log_at_trx_commit ) = 2
-- 1 = secure; 2 = faster -- (You decide) Use 1, along with sync_binlog (now 0)=1 for the greatest level of fault tolerance. 0 is best for speed. 2 is a compromise between 0 and 1.( sync_binlog ) = 0
-- Use 1 for added security, at some cost of I/O =1 may lead to lots of "query end"; =0 may lead to "binlog at impossible position" and lose transactions in a crash, but is faster. 0 is OK for Galera.( innodb_print_all_deadlocks ) = innodb_print_all_deadlocks = OFF
-- Whether to log all Deadlocks. -- If you are plagued with Deadlocks, turn this on. Caution: If you have lots of deadlocks, this may write a lot to disk.( innodb_purge_threads ) = 8
-- Number of threads to clean up history list. -- If you have a lot of writes, recommend 4 in versions 5.6 and 10.0 or later.( local_infile ) = local_infile = ON
-- local_infile (now ON) = ON is a potential security issue( bulk_insert_buffer_size ) = 8 / 122880M = 0.01%
-- Buffer for multi-row INSERTs and LOAD DATA -- Too big could threaten RAM size. Too small could hinder such operations.( tmp_table_size ) = 512M
-- Limit on size of MEMORY temp tables used to support a SELECT -- Decrease tmp_table_size (now 536870912) to avoid running out of RAM. Perhaps no more than 64M.( binlog_format ) = binlog_format = MIXED
-- STATEMENT/ROW/MIXED. -- ROW is preferred by 5.7 (10.3)( slow_query_log ) = slow_query_log = OFF
-- Whether to log slow queries. (5.1.12)( long_query_time ) = 10
-- Cutoff (Seconds) for defining a "slow" query. -- Suggest 2( Uptime_since_flush_status ) = 912 = 15m 12s
-- How long (in seconds) since FLUSH STATUS (or server startup). -- GLOBAL STATUS has not been gathered long enough to get reliable suggestions for many of the issues. Fix what you can, then come back in a several hours.( Uptime ) = 912 = 15m 12s
-- How long (in seconds) the server has been running. -- The system has not been up long enough to get reliable suggestions for many of the issues. Fix what you can, then come back with fresh values after the system has been running several hours.( Max_used_connections / max_connections ) = 37 / 500 = 7.4%
-- Peak % of connections -- Since several memory factors can expand based on max_connections (now 500), it is good not to have that setting too high.( thread_cache_size / Max_used_connections ) = 100 / 37 = 270.3%
-- There is no advantage in having the thread cache bigger than your likely number of connections. Wasting space is the disadvantage.( thread_pool_max_threads ) = 65,536
-- One of many settings for MariaDB's thread pooling -- Lower the value.Abnormally small:
Abnormally large:
Abnormal strings: