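A directory on my FreeBSD 10.2 server has somehow become corrupted (isn't ZFS supposed to prevent this?). Running ls or any other command against it freezes the current session at the kernel level (even SIGKILL has no effect).
A ZFS scrub found no problems.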
# zpool status zroot
pool: zroot
state: ONLINE
scan: scrub repaired 0 in 0h17m with 0 errors on Sun Dec 18 18:25:04 2016
config:
NAME STATE READ WRITE CKSUM
zroot ONLINE 0 0 0
gpt/zfs0 ONLINE 0 0 0
errors: No known data errors
smartctl reports that the disk is healthy.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0
2 Throughput_Performance 0x0005 137 137 054 Pre-fail Offline - 89
3 Spin_Up_Time 0x0007 128 128 024 Pre-fail Always - 314 (Average 277)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 78
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 142 142 020 Pre-fail Offline - 29
9 Power_On_Hours 0x0012 097 097 000 Old_age Always - 24681
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 78
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 306
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 306
194 Temperature_Celsius 0x0002 171 171 000 Old_age Always - 35 (Min/Max 20/46)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
Even zdb found nothing wrong.
# zdb -c zroot
Traversing all blocks to verify metadata checksums and verify nothing leaked ...
loading space map for vdev 0 of 1, metaslab 44 of 116 ...
12.2G completed ( 60MB/s) estimated time remaining: 0hr 00min 00sec
No leaks (block sum matches space maps exactly)
bp count: 956750
ganged count: 0
bp logical: 43512090624 avg: 45479
bp physical: 11620376064 avg: 12145 compression: 3.74
bp allocated: 13143715840 avg: 13737 compression: 3.31
bp deduped: 0 ref>1: 0 deduplication: 1.00
SPA allocated: 13143715840 used: 1.32%
additional, non-pointer bps of type 0: 123043
Dittoed blocks on same vdev: 62618
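The directory contains no important data, so I would be happy to delete it and get back to a clean state.
One solution that comes to mind is to create a new ZFS pool, copy all the healthy data over, and then destroy the old one. But that feels very risky. What if the system hangs partway through and takes my server down?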
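To make the "copy the healthy data" step concrete, here is a rough sketch of what I have in mind. It is untested on the affected machine. The function name copy_tree_skipping and every path passed to it are placeholders I made up, and it assumes the new pool already exists and is mounted. I also don't know whether tar would hang just from reading the parent directory's entry for the damaged directory, so treat this as an idea rather than a verified procedure.

```shell
#!/bin/sh
# Sketch only: copy a tree to a new location while skipping one directory.
# All paths passed in are hypothetical placeholders.

# copy_tree_skipping SRC DST SKIP
#   SRC  - root of the tree to copy (e.g. a mountpoint on the old pool)
#   DST  - existing directory to copy into (e.g. a mountpoint on the new pool)
#   SKIP - directory to leave out, given relative to SRC
copy_tree_skipping() {
    src=$1
    dst=$2
    skip=$3
    # Stream the tree through tar instead of writing an archive file.
    # --exclude comes before "." so it applies to everything archived;
    # -p on the extracting side preserves permissions.
    tar -cf - -C "$src" --exclude="./$skip" . | tar -xpf - -C "$dst"
}
```

It would be called as something like copy_tree_skipping /usr/home /newpool/home broken-dir, where all three arguments are invented examples. Only after checking the copy would I consider destroying the old pool.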
Can you think of a way to get rid of the corrupted directory without causing too much disruption?