How the problem started
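I have a dedicated server with a hosting provider, and recently my node exporter detected high disk I/O saturation on my RAID 1 array /dev/md3. I checked the hard drives with smartctl and found that both drives in the array show a large number of read errors: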
[root@ovh-ds03 ~]# smartctl /dev/sda -a | grep Err
Error logging capability: (0x01) Error logging supported.
SCT Error Recovery Control supported.
1 Raw_Read_Error_Rate 0x000b 099 099 016 Pre-fail Always - 65538
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
[root@ovh-ds03 ~]# smartctl /dev/sdb -a | grep Err
Error logging capability: (0x01) Error logging supported.
SCT Error Recovery Control supported.
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 65536
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
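I asked support to replace the 2 disks, but instead of replacing them they added 2 new disks and rebuilt the array on those. Everything works, but the array is now in a degraded state, and I am getting an alert named NodeRAIDDegraded because of it. Checking on the server confirms that it is degraded: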
[root@ovh-ds03 ~]# mdadm --detail /dev/md3
/dev/md3:
Version : 1.2
Creation Time : Sat Mar 30 18:18:26 2024
Raid Level : raid1
Array Size : 1951283200 (1860.89 GiB 1998.11 GB)
Used Dev Size : 1951283200 (1860.89 GiB 1998.11 GB)
Raid Devices : 4
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Sat Sep 14 19:30:44 2024
State : active, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : bitmap
Name : md3
UUID : 939ad077:07c22e9e:ae62fbf9:4df58cf9
Events : 55337
Number Major Minor RaidDevice State
- 0 0 0 removed
- 0 0 1 removed
2 8 35 2 active sync /dev/sdc3
3 8 51 3 active sync /dev/sdd3
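To make the mismatch in that output explicit, here is a minimal sketch that pulls the device counts out of `mdadm --detail` text and compares them. The helper name `md_field` is hypothetical, and the sample text is just a few lines copied from the output above; on the server it would presumably be fed from `mdadm --detail /dev/md3` instead.

```shell
# Hypothetical helper: print the numeric value of one "Key : value" field
# from `mdadm --detail` output read on stdin.
md_field() {
    awk -F' : ' -v key="$1" '
        { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }   # trim indentation around the key
        k == key { print $2 + 0; exit }              # "+ 0" keeps only the leading number
    '
}

# Sample lines copied from the output above (assumption: the real command
# prints the same fields, only with leading indentation).
detail='Raid Devices : 4
Total Devices : 2
Active Devices : 2'

slots=$(printf '%s\n' "$detail" | md_field "Raid Devices")
active=$(printf '%s\n' "$detail" | md_field "Active Devices")
missing=$((slots - active))

echo "slots=$slots active=$active missing=$missing"
```

So, as far as I can tell, the array still expects 4 member slots while only 2 devices are present, which would explain why it reports degraded even though no device has failed.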
How can I fix it?
I have tried various solutions, such as rebuilding the array from scratch:
mdadm --assemble --scan