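I have a database server that uses Samsung 840 Pro disks. Even when there is little activity on the website, the load stays higher than usual, so I suspect the disks are worn out. How can I check whether disk I/O is the bottleneck?
Here are some snapshots that may be relevant: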
top - 03:02:11 up 766 days, 20:45, 1 user, load average: 7.42, 6.89, 6.72
Tasks: 325 total, 1 running, 321 sleeping, 3 stopped, 0 zombie
%Cpu(s): 17.3 us, 0.4 sy, 0.0 ni, 82.1 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem: 13227468+total, 27130284 used, 10514440+free, 94308 buffers
KiB Swap: 3906556 total, 9136 used, 3897420 free. 3833216 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21764 mysql 20 0 27.058g 0.021t 12164 S 576.0 17.2 17369,44 mysqld
574 root 20 0 0 0 0 S 0.3 0.0 280:00.66 jbd2/sda1-8
5585 root 20 0 0 0 0 S 0.3 0.0 0:08.04 kworker/18:0
1 root 20 0 28692 4540 2964 S 0.0 0.0 42:51.98 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.50 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 894:44.38 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 21:07.91 kworker/u64:0
8 root 20 0 0 0 0 S 0.0 0.0 2510:32 rcu_sched
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00
iotop output:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21764 mysql 20 0 27.058g 0.021t 12164 S 576.0 17.2 17369,44 mysqld
574 root 20 0 0 0 0 S 0.3 0.0 280:00.66 jbd2/sda1-8
5585 root 20 0 0 0 0 S 0.3 0.0 0:08.04 kworker/18:0
1 root 20 0 28692 4540 2964 S 0.0 0.0 42:51.98 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.50 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 894:44.38 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 21:07.91 kworker/u64:0
8 root 20 0 0 0 0 S 0.0 0.0 2510:32 rcu_sched
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
10 root rt 0 0 0 0 S 0.0 0.0 5:28.52 migration/0
11 root rt 0 0 0 0 S 0.0 0.0 3:15.12 watchdog/0
12 root rt 0 0 0 0 S 0.0 0.0 3:27.27 watchdog/1
13 root rt 0 0 0 0 S 0.0 0.0 3:19.37 migration/1
14 root 20 0 0 0 0 S 0.0 0.0 190:10.26 ksoftirqd/1
16 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
17 root rt 0 0 0 0 S 0.0 0.0 3:19.65 watchdog/2
18 root rt 0 0 0 0 S 0.0 0.0 2:52.44 migration/2
19 root 20 0 0 0 0 S 0.0 0.0 194:18.02 ksoftirqd/2
21 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/2:0H
22 root rt 0 0 0 0 S 0.0 0.0 3:21.4
iostat -m (the database is on sda, the Linux filesystem is on sdb)
Linux 3.16.0-4-amd64 (back)  03/27/20  _x86_64_  (32 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           6.76    0.00    0.38    0.07    0.00   92.79

Device:      tps   MB_read/s   MB_wrtn/s   MB_read   MB_wrtn
sdd         0.03        0.00        0.01     43327    384521
sdc         0.08        0.00        0.01    166547    748630
sdb         0.37        0.00        0.02     78269   1076710
sda         8.46        0.00        0.11     54407   7463246
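Everything posted above suggests that the CPU is the bottleneck, not the SSD. The clearest sign is mysqld at 576% CPU, which I take to mean it is consuming about 5.76 CPU cores.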
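To make that arithmetic explicit, here is a minimal Python sketch. It assumes top is in its default mode, where 100% means one fully busy core, and it uses only figures copied from the output in the question; the function names are my own, purely for illustration.

```python
def cores_in_use(top_cpu_percent: float) -> float:
    """Convert a top %CPU figure to a core count (assumes 100% == one core)."""
    return top_cpu_percent / 100.0


def load_per_cpu(load_average: float, cpu_count: int) -> float:
    """Normalise a load average by the number of logical CPUs."""
    return load_average / cpu_count


# Figures copied from the top and iostat output in the question.
mysqld_cores = cores_in_use(576.0)      # %CPU of the mysqld process
per_cpu_load = load_per_cpu(7.42, 32)   # 1-minute load average, 32 CPUs

print(f"mysqld is using about {mysqld_cores:.2f} cores")
print(f"load per CPU: {per_cpu_load:.2f}")
```

With these numbers, mysqld alone accounts for most of the load average of roughly 7, while the machine as a whole (32 CPUs) is far from saturated.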
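The problem appears to be related to MySQL: possibly a race condition, a complex query, a corrupted table, or a bad index?
If the disk were the problem, I would expect to see low CPU usage and high iowait, but iostat reports an iowait of only 0.07% and top shows 0.0 wa.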
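Note that iostat run without an interval reports averages since boot, so it can help to measure iowait over a short window instead. The sketch below is one rough way to do that, assuming a Linux system where the first line of /proc/stat is the aggregate "cpu" line with the usual field order (user, nice, system, idle, iowait, irq, softirq, steal). The helper names are hypothetical.

```python
def parse_cpu_line(line: str) -> list:
    """Return the first eight counters from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    # Assumed order: user nice system idle iowait irq softirq steal
    return [int(value) for value in fields[1:9]]


def iowait_percent(before: str, after: str) -> float:
    """Share of CPU time spent in iowait between two /proc/stat samples."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        return 0.0  # no time elapsed between the samples
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the aggregate 'cpu' line (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()
```

To use it, call `read_cpu_line()`, wait a second or so (for example with `time.sleep(1)`), call it again, and pass both lines to `iowait_percent`. A value that stays near zero while mysqld keeps several cores busy would be consistent with a CPU-bound workload rather than a worn-out SSD.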