After a hard power-off and a UPS failure, my ZFS pool is in a state I don't understand:
$ zpool status -c serial
  pool: storage
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 1 days 10:40:40 with 0 errors on Wed Apr 2 10:40:42 2025
config:

  NAME                                                  STATE     READ  WRITE  CKSUM                  serial
  storage                                               DEGRADED  0     0      0
    raidz2-0                                            DEGRADED  0     0      0
      8844532865098720143                               FAULTED   0     0      0      was /dev/sda1   ZL2AJ3S10000C1111G0H
      scsi-35000c500cafcbb67                            ONLINE    0     0      0                      ZL2AJ3S10000C1111G0H
      scsi-35000c500cafc9a63                            ONLINE    0     0      0                      ZL2AKR3F0000C1128SV6
      scsi-35000c500cafcb303                            ONLINE    0     0      0                      ZL2AKQVX0000C1143G62
      scsi-35000c500cafcff33                            ONLINE    0     0      0                      ZL2AKAG10000C11445AW
      scsi-35000c500cafc392b                            ONLINE    0     0      0                      ZL2AKCWB0000C1143ARJ
      wwn-0x5000c500cafa8287                            ONLINE    0     0      0                      ZL2AHSSL0000C107BQWN
      scsi-35000c500cafbec03                            ONLINE    0     0      0                      ZL2AGE6X0000C1122SME
      7647119559265938125                               FAULTED   0     0      0      was /dev/sdi1   ZL2AGE6X0000C1122SME
      scsi-35000c500cafca18b                            ONLINE    0     0      0                      ZL2AKR0B0000C1128RNJ
      scsi-35000c500cafc29c3                            ONLINE    0     0      0                      ZL2AGDN30000C1140NTP
      scsi-35000c500cafbe293                            ONLINE    0     0      0                      ZL2AKDSM0000C11278YB
    raidz2-1                                            DEGRADED  0     0      0
      scsi-SSEAGATE_ST16000NM002G_ZL2AKBXB0000C1126C6X  ONLINE    0     0      0                      ZL2AKBXB0000C1126C6X
      1470086598115969130                               UNAVAIL   0     0      0      was /dev/sdy1   20342A6158FC
      wwn-0x5000c500cae0af8b                            ONLINE    0     0      0                      ZL29T97Q0000C107188W
      12722321230162544658                              FAULTED   0     0      0      was /dev/sdl1   ZL2AKDSM0000C11278YB
      scsi-35000c500cafc3be7                            ONLINE    0     0      0                      ZL2AJJZF0000C1143AH2
      scsi-35000c500cafc611f                            ONLINE    0     0      0                      ZL2AKC6Z0000C11438R8
      scsi-35000c500cafcfb97                            ONLINE    0     0      0                      ZL2AHY5R0000C11441Z5
      scsi-35000c500cafc8663                            ONLINE    0     0      0                      ZL2AKBNX0000C1128RLR
      scsi-35000c500cafc9fa3                            ONLINE    0     0      0                      ZL2AKR0Y0000C1128RN1
      scsi-35000c500cafc96b3                            ONLINE    0     0      0                      ZL2AKR6T0000C1128SQP
      scsi-35000c500cafc2f23                            ONLINE    0     0      0                      ZL2AK1NP0000C1143FXB
      scsi-35000c500cafc4ccf                            ONLINE    0     0      0                      ZL2AKCKG0000C1143ETM
  logs
    nvme-INTEL_SSDPED1K375GA_PHKS01530050375AGN         ONLINE    0     0      0                      PHKS01530050375AGN
  cache
    sdu                                                 FAULTED   0     0      0      corrupted data  ZL2AKR6T0000C1128SQP
    sdw                                                 FAULTED   0     0      0      corrupted data  ZL2AK1NP0000C1143FXB
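There are several things about this status that I don't understand:
- Every FAULTED disk shows the same serial number as one of the ONLINE disks.
- The UNAVAIL disk shows the serial number of one of the two SSDs I use for cache, not the serial number of one of the pool's HDDs.
- The two FAULTED cache disks show the serial numbers of two of the storage HDDs.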
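
To make the overlaps above easier to see, here is a minimal Python sketch that groups the affected rows by serial number. The `ROWS` list is hand-copied from the status output above (only the rows involved), and the helper names `group_by_serial` and `duplicated_serials` are made up for illustration. It only shows which vdev entries report the same serial; it does not tell me which physical disk is actually at fault.

```python
from collections import defaultdict

# (vdev name, state, serial) triples hand-copied from the
# `zpool status -c serial` output above; only the rows that are
# FAULTED/UNAVAIL or share a serial with such a row are listed.
ROWS = [
    ("8844532865098720143", "FAULTED", "ZL2AJ3S10000C1111G0H"),
    ("scsi-35000c500cafcbb67", "ONLINE", "ZL2AJ3S10000C1111G0H"),
    ("scsi-35000c500cafbec03", "ONLINE", "ZL2AGE6X0000C1122SME"),
    ("7647119559265938125", "FAULTED", "ZL2AGE6X0000C1122SME"),
    ("scsi-35000c500cafbe293", "ONLINE", "ZL2AKDSM0000C11278YB"),
    ("1470086598115969130", "UNAVAIL", "20342A6158FC"),
    ("12722321230162544658", "FAULTED", "ZL2AKDSM0000C11278YB"),
    ("scsi-35000c500cafc96b3", "ONLINE", "ZL2AKR6T0000C1128SQP"),
    ("scsi-35000c500cafc2f23", "ONLINE", "ZL2AK1NP0000C1143FXB"),
    ("sdu", "FAULTED", "ZL2AKR6T0000C1128SQP"),
    ("sdw", "FAULTED", "ZL2AK1NP0000C1143FXB"),
]


def group_by_serial(rows):
    """Map each serial to the list of (name, state) entries reporting it."""
    groups = defaultdict(list)
    for name, state, serial in rows:
        groups[serial].append((name, state))
    return dict(groups)


def duplicated_serials(rows):
    """Keep only the serials reported by more than one vdev entry."""
    return {s: m for s, m in group_by_serial(rows).items() if len(m) > 1}


if __name__ == "__main__":
    for serial, members in duplicated_serials(ROWS).items():
        print(serial)
        for name, state in members:
            print(f"    {state:8} {name}")
```

Every duplicated serial pairs exactly one ONLINE entry with one FAULTED entry, while the UNAVAIL entry's serial (`20342A6158FC`) appears only once in the pool listing.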
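What could cause the pool to degrade into this state, and can it be recovered? I have considered trying the steps described here, but I can't even work out which disks are actually the faulty ones. Would the simple steps described in that article resolve my situation? I really need help, so thanks in advance.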