我管理一个 Percona XtraDB 集群,该集群使用带有摆动连接的网络存储。我们会周期性地遇到高 iowait 崩溃并以只读方式重新挂载 fs。不幸的是,更换存储目前不是一种选择。
最近我注意到当运行 mysqldump 或 mysqlcheck 时,它们会使节点上的 MySQL 服务器崩溃,并出现错误mysqlcheck: Got error: 2013: Lost connection to MySQL server during query when executing 'CHECK TABLE ... '
以下是mysqld.log
崩溃期间的内容:
InnoDB: Error in pages 9479 and 9480 of index "PRIMARY" of table "foobar"."quux"
InnoDB: broken FIL_PAGE_NEXT or FIL_PAGE_PREV links
2015-09-28 14:39:45 7f015813b700 InnoDB: Page dump in ascii and hex (16384 bytes):
(...)
InnoDB: End of page dump
2015-09-28 14:39:45 7f015813b700 InnoDB: uncompressed page, stored checksum in field1 4038097986, calculated checksums for field1: crc32 2787032309, innodb 4038097986, none 3735928559, stored checksum in field2 1190336748, calculated checksums for field2: crc32 2787032309, innodb 1190336748, none 3735928559, page LSN 4 3652646491, low 4 bytes of LSN at page end 3652646491, page number (if stored to page already) 9479, space id (if created with >= MySQL-4.1.1 and stored already) 18
InnoDB: Page may be an index page where index id is 67
InnoDB: (index "PRIMARY" of table "foobar"."quux")
2015-09-28 14:39:45 7f015813b700 InnoDB: Page dump in ascii and hex (16384 bytes):
(...)
InnoDB: End of page dump
2015-09-28 14:39:46 7f015813b700 InnoDB: uncompressed page, stored checksum in field1 554678569, calculated checksums for field1: crc32 2178598661, innodb 554678569, none 3735928559, stored checksum in field2 1065260512, calculated checksums for field2: crc32 2178598661, innodb 1065260512, none 3735928559, page LSN 10 202985777, low 4 bytes of LSN at page end 202985777, page number (if stored to page already) 6792, space id (if created with >= MySQL-4.1.1 and stored already) 18
InnoDB: Page may be an index page where index id is 67
InnoDB: (index "PRIMARY" of table "foobar"."quux")
InnoDB: Corruption of an index tree: table "foobar"."quux", index "PRIMARY",
InnoDB: father ptr page no 55234, child page no 9479
PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 30; hex 34616434393538322d353232372d653863302d326466662d353461663639; asc 4ad49582-5227-e8c0-2dff-54af69; (total 36 bytes);
1: len 6; hex 000000000dd7; asc ;;
2: len 7; hex c8000001741ea1; asc t ;;
3: len 5; hex 99951259cb; asc Y ;;
4: len 5; hex 99951259cb; asc Y ;;
5: len 30; hex 62633965323864352d383865382d343466322d393337322d353339303931; asc bc9e28d5-88e8-44f2-9372-539091; (total 36 bytes);
6: len 30; hex 62633965323864352d383865382d343466322d393337322d353339303931; asc bc9e28d5-88e8-44f2-9372-539091; (total 36 bytes);
7: len 1; hex 80; asc ;;
8: len 30; hex 64356664666538352d656431652d346465362d383363612d616439663164; asc d5fdfe85-ed1e-4de6-83ca-ad9f1d; (total 36 bytes);
9: len 8; hex 436f6e7461637473; asc Contacts;;
10: len 4; hex 6c696e6b; asc link;;
11: len 30; hex 7b226f626a656374223a7b226e616d65223a224d72204a6f7264616e204e; asc {"object":{"name":"Mr John Hacker; (total 343 bytes);
12: len 4; hex 80000000; asc ;;
13: len 30; hex 7b226e616d65223a22222c22646f635f6f776e6572223a22222c22757365; asc {"name":"","doc_owner":"","use; (total 72 bytes);
n_owned: 0; heap_no: 2; next rec: 751
PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 30; hex 34616435616262302d303535662d333939612d613038652d353439396461; asc 4ad5abb0-055f-399a-a08e-5499da; (total 36 bytes);
1: len 4; hex 0000d7c2; asc ;;
n_owned: 0; heap_no: 277; next rec: 4688
InnoDB: You should dump + drop + reimport the table to fix the
InnoDB: corruption. If the crash happens at the database startup, see
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html about
InnoDB: forcing recovery. Then dump + drop + reimport.
2015-09-28 14:39:46 7f015813b700 InnoDB: Assertion failure in thread 139643749381888 in file btr0btr.cc line 1492
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
12:39:46 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=25165824
read_buffer_size=131072
max_used_connections=7
max_threads=202
thread_count=10
connection_count=5
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 105204 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0xf4de780
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f015813ad38 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8fa965]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x665644]
/lib64/libpthread.so.0(+0xf710)[0x7f0185a25710]
/lib64/libc.so.6(gsignal+0x35)[0x7f0183e6b625]
/lib64/libc.so.6(abort+0x175)[0x7f0183e6ce05]
/usr/sbin/mysqld[0xa10d84]
/usr/sbin/mysqld[0xa16cc8]
/usr/sbin/mysqld[0x917920]
/usr/sbin/mysqld(_ZN7handler8ha_checkEP3THDP15st_ha_check_opt+0x6a)[0x5a422a]
/usr/sbin/mysqld[0x835fc3]
/usr/sbin/mysqld(_ZN19Sql_cmd_check_table7executeEP3THD+0xc2)[0x836cd2]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x33d5)[0x6ed235]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x658)[0x6f0958]
/usr/sbin/mysqld[0x6f0acd]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x19d5)[0x6f2de5]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x22b)[0x6f42cb]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x17f)[0x6bc52f]
/usr/sbin/mysqld(handle_one_connection+0x47)[0x6bc717]
/usr/sbin/mysqld(pfs_spawn_thread+0x12a)[0xaf611a]
/lib64/libpthread.so.0(+0x79d1)[0x7f0185a1d9d1]
/lib64/libc.so.6(clone+0x6d)[0x7f0183f218fd]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7efebc091d30): is an invalid pointer
Connection ID (thread ID): 2336
Status: NOT_KILLED
很明显,桌子foobar.quux
严重损坏。使用数据库的应用程序仍然有效(尽管性能有所降低),SELECT
语句也是如此。
mysqlcheck 工具不能用来修复它,所以我知道的解决方案是做 a SELECT * FROM quux INTO OUTFILE
,删除表,然后LOAD DATA INFILE
在下一个维护窗口期间做 a 。这种处理方式是否存在缺点,是否有其他选择来修复表格?
编辑:我重新启动了 MySQL 服务器,其值innodb_force_recovery
从 1 增加到 4,结果始终相同:
mysqldump 失败并出现错误
mysqldump: Error 2013: Lost connection to MySQL server during query when dumping table quux at row: 156915
MySQL 命令
SELECT * FROM quux INTO OUTFILE '/root/quux.sql';
在出现错误后不久失败ERROR 2013 (HY000): Lost connection to MySQL server during query
我要试试innodb_force_recovery=5
吗innodb_force_recovery=6
?可能有什么缺点?
mysqlcheck
不会修复 InnoDB 表中的损坏。您需要从表中转储数据并重新创建它。innodb_force_recovery
使用选项启动 MySQL 。尝试从 1 到 6 的值,直到 MySQL 启动。用 转储表
mysqldump
。放下桌子。
重启 MySQL w/o
innodb_force_recovery
。重新加载转储。
对于它的价值,我发现它
mysqlcheck -o
可以有效地修复损坏的 InnoDB 表,因为它实际上做了“重新创建 + 分析”。我有一个损坏的表,它阻止了 mysql 启动。首先我试过:返回:
然后我尝试了:
瞧……
不确定这是否会修复各种损坏,但它对我有用。
我一直在处理 innodb 腐败问题。
我想我会在转储和重新导入数据库之前尝试一些事情。
检查所有数据库对我不起作用,但确实如此。
我使用了下面的命令并且没有指定表名。
我相信这会检查/修复数据库中的每个表,根据您的表数量,这可能会非常冗长,但总体而言,事情已经恢复并运行。
注意:登录到 mysql 并运行 show databases;
获取数据库列表,然后在上面的命令中插入名称,然后让它运行一些。
我遇到了 nextcloud 的问题,它为我解决了这些问题。