Questions tagged [raidz] (server)

Sreekanth Chityala
Asked: 2021-02-10 04:50:58 +0800 CST

I was able to create a raidz with different-sized disks. What am I missing?

  • 0

I read that a raidz cannot be created with disks of different sizes, yet I was able to create one from different-sized disks. Please tell me what I am missing.

da0              0:107  12G zfs                                   - -
da1              0:109 8.0G zfs                                   - -
md0              0:13  456M ufs                                   - /
vtbd0            0:53   80G GPT                                   - /
  vtbd0p1        0:115 512K freebsd-boot       gptid/681ee6a8-6ab8-11eb-b72e-558e06157d17 -
  <FREE>         -:-   492K -                                     - -
  vtbd0p2        0:122 2.0G freebsd-swap                  gpt/swap0 -
  vtbd0p3        0:127  78G freebsd-zfs                   gpt/disk0 <ZFS>
  <FREE>         -:-   1.0M -                                     - -
  pool: edjstorage
 state: ONLINE
  scan: none requested
config:

    NAME         STATE     READ WRITE CKSUM
    edjstorage   ONLINE       0     0     0
      raidz1-0   ONLINE       0     0     0
        vtbd0p3  ONLINE       0     0     0
        da0      ONLINE       0     0     0
        da1      ONLINE       0     0     0
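For reference, OpenZFS does let a raidz be assembled from disks of unequal size (zpool create may warn and ask for -f); the vdev's capacity is simply based on its smallest member, so the extra space on the larger devices goes unused. A quick way to see this on the pool from the question:

~$ zpool list -v edjstorage
# the raidz1-0 row reports a SIZE of roughly 3 x the smallest member (the 8 G da1 here);
# the rest of the 12 G and 78 G devices is not used by the vdev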
freebsd zfs raidz
  • 1 answer
  • 38 Views
Steve81
Asked: 2021-01-28 09:33:44 +0800 CST

Creative solution for a crippled zpool (a.k.a. helping a dying raid survive 1-2 weeks)

  • 0

I have an 8 x 4 TB ZFS RAIDZ-3 on Proxmox 6 (Linux kernel 5.4.78-2-pve, OpenZFS 0.8.5-pve1).

An unfortunate alignment of the stars forced me to remove 3 disks from my RAIDZ. In about 1 to 2 weeks I will receive 5 new disks.

Not too bad: a raidz3 can absorb three losses. But one of the remaining disks is dying (slowly; I think it can survive for a while).

Right now I have a 2 TB and a 3 TB disk available.

So my idea is to build an mdadm stripe out of the 2 TB and 3 TB disks and use the resulting 5 TB array as a disk in the raidz.

Is there a pure ZFS alternative? Is it possible to create a vdev from the 2 TB + 3 TB disks and use it as a member of the RAID-Z?
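ZFS cannot nest one vdev inside another, so the workaround the asker describes has to be built below ZFS (mdadm, LVM, and so on). A minimal sketch of the mdadm route, with hypothetical device, pool and member names:

~$ mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdx /dev/sdy   # 2 TB + 3 TB -> ~5 TB stripe
~$ zpool replace tank <dying-member> /dev/md1                             # swap the dying disk for the stripe

The md device carries no redundancy of its own, so this only makes sense as a stopgap until the real replacement disks arrive, which is exactly the 1-2 week window described above.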

zfs mdadm zpool raidz
  • 1 answer
  • 69 Views
ascendants
Asked: 2020-10-20 09:31:59 +0800 CST

ZFS - Is it safe to delete files served by ZFS during a resilver/rebuild?

  • 1

I have a Nexenta ZFS system serving a large NFS volume (about 85% of 250 TB in use). A week ago one of the 70 disks failed, and the system has been resilvering onto a hot spare without any problems (apart from a large performance hit from the resilver I/O).

I know that having more free space will reduce the time future resilvers take, so I plan to clear roughly 70 TB from the NFS volume soon.

However, I am not sure whether cleaning up during the current resilver could cause problems, or whether it might shorten the current resilver, which brings me to my question:

  • Will deleting files negatively affect an in-progress ZFS resilver?

System info:

# uname -a
SunOS stor-nas02a 5.11 NexentaOS_4:55745843a2 i86pc i386 i86pc

Array info:

  • 7x raidz2 (10 disks each)
  • 2x SLOG mirrors
  • various spares and cache devices
  • Status:
action: Wait for the resilver to complete.
  scan: resilver in progress since _________
    29.7T scanned out of 199T at 60.2M/s, (scan is slow, no estimated time)    # the 199 TB is compressed
    451G resilvered, 14.92% done
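Whichever way the answer goes, both the resilver and the free space can be watched while the cleanup runs; a small sketch, assuming a pool named tank:

~$ zpool status tank                                     # the scan line shows scanned/total and the amount resilvered
~$ zpool list -o name,size,allocated,free,capacity tank  # free space as files are deleted
~$ zpool iostat -v tank 5                                # per-vdev view of the resilver I/O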
zfs opensolaris raidz sunos nexenta
  • 1 answer
  • 296 Views
Benoit
Asked: 2020-09-22 05:49:29 +0800 CST

Using QLC SSDs for a RAIDZ (scientific archive)?

  • 2

We are building a system for archiving and scientific analysis of some weather data.

The setup is redundant: two HP DL580s, Proxmox (ZoL), and some GPUs for the analysis. On each server we plan five pools of roughly 50 TB. We use SSDs for density and read speed, and for the past two years we have been using HPE read-intensive SSDs. For the next archive pool we are considering the following changes:

  • Use HPE QLC "very read optimized" SSDs. They come with a reduced DWPD rating, especially for random writes.
  • Move from striped mirrors to raidZ2 (8 x 7.68 TB)

The data is stored as files (25%) and a database (InnoDB, 75%), and is effectively written only once.

Is the raidZ2 + QLC SSD combination suitable for this type of archive?

Are there ZFS-specific good practices or pitfalls regarding QLC SSD endurance?
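Not an answer to the endurance question itself, but the ZFS-side settings usually discussed for write-once archive pools are easy to sketch. The pool, dataset and device names below are hypothetical:

~$ zpool create -o ashift=12 archive raidz2 sda sdb sdc sdd sde sdf sdg sdh
~$ zfs create archive/weather
~$ zfs set recordsize=1M archive/weather
~$ zfs set compression=lz4 archive/weather
~$ zfs set atime=off archive/weather
# large records and no atime updates keep write amplification on QLC as low as ZFS allows;
# whether that is enough for the drives' DWPD rating is exactly the open question above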

Edit: example smartctl output from one of the current TLC SSDs in the striped mirror

Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===  
Device Model:     VK007680GWSXN  
Serial Number:      
LU WWN Device Id: 5 00a075 1266adce4  
Firmware Version: HPG2  
User Capacity:    7,681,501,126,656 bytes [7.68 TB]  
Sector Sizes:     512 bytes logical, 4096 bytes physical  
Rotation Rate:    Solid State Device  
Form Factor:      2.5 inches  
Device is:        Not in smartctl database [for details use: -P showall]  
ATA Version is:   ACS-3 T13/2161-D revision 5  
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)  
Local Time is:    Mon Sep 21 21:11:42 2020 CEST  
SMART support is: Available - device has SMART capability.  
SMART support is: Enabled  
=== START OF READ SMART DATA SECTION ===  
SMART overall-health self-assessment test result: PASSED  
General SMART Values:  
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.  
                    Auto Offline Data Collection: Disabled.  
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.  
Total time to complete Offline   
data collection:        (26790) seconds.  
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    (  45) minutes.
Conveyance self-test routine
recommended polling time:    (   3) minutes.
SCT capabilities:          (0x0035) SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   050    Pre-fail  Always       -       0  
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0  
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4514  
 11 Unknown_SSD_Attribute   0x0012   100   100   000    Old_age   Always       -       5  
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       6  
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0  
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0  
173 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       26  
174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       5  
175 Program_Fail_Count_Chip 0x0033   100   100   001    Pre-fail  Always       -       0  
180 Unused_Rsvd_Blk_Cnt_Tot 0x003b   100   100   001    Pre-fail  Always       -       0  
184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0  
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0  
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       7  
194 Temperature_Celsius     0x0022   067   057   000    Old_age   Always       -       33 (Min/Max 22/43)  
196 Reallocated_Event_Count 0x0033   100   100   001    Pre-fail  Always       -       0  
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0  
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0  
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0  
SMART Error Log not supported  
SMART Self-test Log not supported  
SMART Selective self-test log data structure revision number 1  
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS  
    1        0        0  Not_testing  
    2        0        0  Not_testing  
    3        0        0  Not_testing  
    4        0        0  Not_testing  
    5        0        0  Not_testing  
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
ssd archive zfsonlinux raidz
  • 2 answers
  • 675 Views
Northern Brewer
Asked: 2020-08-25 07:52:08 +0800 CST

ZFS Raid-Z2 failing, need to build a new pool but short on SAS ports. Can I disconnect one drive from the Raid-Z2?

  • 3

I have a Raid-Z2 pool with 6 * 4 TB drives. All of the drives have more than 40,000 hours of runtime, and now they all seem to be degrading at the same time. The pool is degraded, and every drive is marked as degraded with many errors. Luckily, no data has been lost so far.

        NAME        STATE     READ WRITE CKSUM
        File        DEGRADED     0     0     0
          raidz2-0  DEGRADED     0     0     0
            sda     DEGRADED     0     0     0  too many errors
            sdb     DEGRADED     0     0     0  too many errors
            sdc     DEGRADED     0     0     0  too many errors
            sdd     DEGRADED     0     0     0  too many errors
            sde     DEGRADED     0     0     0  too many errors
            sdf     DEGRADED     0     0     0  too many errors

I would like to build a new pool with Raid-Z1 and 3 * 6 TB drives, since I do not need all the space of the original pool. My problem is that the old pool has 6 drives and the new pool will have 3, but my SAS controller only has 8 ports. So I want to disconnect one disk from my Raid-Z2 pool, attach my 3 new drives, create a new pool with them, and then save my data by copying it to the new pool before the old pool fails.

Is that possible? My thinking is that the old pool should keep working with one disk missing, but when I tried disconnecting a disk I could not access any data in the old pool.

Does anyone know how to solve this?

zpool status -v:

  pool: File
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: resilvered 6.82G in 0 days 00:04:00 with 0 errors on Sun Aug 23 21:21:15 2020
config:

        NAME        STATE     READ WRITE CKSUM
        File        DEGRADED     0     0     0
          raidz2-0  DEGRADED     0     0     0
            sda     DEGRADED     0     0     0  too many errors
            sdb     DEGRADED     0     0     0  too many errors
            sdc     DEGRADED     0     0     0  too many errors
            sdd     DEGRADED     0     0     0  too many errors
            sde     DEGRADED     0     0     0  too many errors
            sdf     DEGRADED     0     0     0  too many errors

errors: No known data errors

All disks report a healthy SMART status:

smartctl -H /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.55-1-pve] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK


The syslog appears to be empty:

root@boxvm:/var/log# cat syslog | grep sda
root@boxvm:/var/log#

The dmesg output also looks fine:

dmesg | grep sda
[    8.997624] sd 1:0:0:0: [sda] Enabling DIF Type 2 protection
[    8.998488] sd 1:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[    8.998847] sd 1:0:0:0: [sda] Write Protect is off
[    8.998848] sd 1:0:0:0: [sda] Mode Sense: df 00 10 08
[    8.999540] sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[    9.093385]  sda: sda1 sda9
[    9.096819] sd 1:0:0:0: [sda] Attached SCSI disk


dmesg | grep sdb
[    8.997642] sd 1:0:1:0: [sdb] Enabling DIF Type 2 protection
[    8.998467] sd 1:0:1:0: [sdb] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[    8.998828] sd 1:0:1:0: [sdb] Write Protect is off
[    8.998830] sd 1:0:1:0: [sdb] Mode Sense: df 00 10 08
[    8.999524] sd 1:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
[    9.087056]  sdb: sdb1 sdb9
[    9.090465] sd 1:0:1:0: [sdb] Attached SCSI disk


dmesg | grep sdc
[    8.997812] sd 1:0:2:0: [sdc] Enabling DIF Type 2 protection
[    8.998639] sd 1:0:2:0: [sdc] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[    8.998998] sd 1:0:2:0: [sdc] Write Protect is off
[    8.998999] sd 1:0:2:0: [sdc] Mode Sense: df 00 10 08
[    8.999692] sd 1:0:2:0: [sdc] Write cache: disabled, read cache: enabled, supports DPO and FUA
[    9.084259]  sdc: sdc1 sdc9
[    9.088030] sd 1:0:2:0: [sdc] Attached SCSI disk


dmesg | grep sdd
[    8.997932] sd 1:0:3:0: [sdd] Enabling DIF Type 2 protection
[    8.998761] sd 1:0:3:0: [sdd] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[    8.999120] sd 1:0:3:0: [sdd] Write Protect is off
[    8.999121] sd 1:0:3:0: [sdd] Mode Sense: df 00 10 08
[    8.999818] sd 1:0:3:0: [sdd] Write cache: disabled, read cache: enabled, supports DPO and FUA
[    9.103840]  sdd: sdd1 sdd9
[    9.107482] sd 1:0:3:0: [sdd] Attached SCSI disk


dmesg | grep sde
[    8.998017] sd 1:0:4:0: [sde] Enabling DIF Type 2 protection
[    8.998839] sd 1:0:4:0: [sde] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[    8.999234] sd 1:0:4:0: [sde] Write Protect is off
[    8.999235] sd 1:0:4:0: [sde] Mode Sense: df 00 10 08
[    8.999933] sd 1:0:4:0: [sde] Write cache: disabled, read cache: enabled, supports DPO and FUA
[    9.088282]  sde: sde1 sde9
[    9.091665] sd 1:0:4:0: [sde] Attached SCSI disk


dmesg | grep sdf
[    8.998247] sd 1:0:5:0: [sdf] Enabling DIF Type 2 protection
[    8.999076] sd 1:0:5:0: [sdf] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
[    8.999435] sd 1:0:5:0: [sdf] Write Protect is off
[    8.999436] sd 1:0:5:0: [sdf] Mode Sense: df 00 10 08
[    9.000136] sd 1:0:5:0: [sdf] Write cache: disabled, read cache: enabled, supports DPO and FUA
[    9.090609]  sdf: sdf1 sdf9
[    9.094235] sd 1:0:5:0: [sdf] Attached SCSI disk

dmesg for the SAS controller:

root@boxvm:/var/log# dmesg | grep mpt2
[    1.151805] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (65793672 kB)
[    1.200012] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[    1.200023] mpt2sas_cm0: MSI-X vectors supported: 1
[    1.200024] mpt2sas_cm0:  0 1
[    1.200098] mpt2sas_cm0: High IOPs queues : disabled
[    1.200099] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 51
[    1.200100] mpt2sas_cm0: iomem(0x00000000fc740000), mapped(0x00000000629d5dd1), size(65536)
[    1.200101] mpt2sas_cm0: ioport(0x000000000000d000), size(256)
[    1.254826] mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[    1.281681] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
[    1.281746] mpt2sas_cm0: request pool(0x0000000074c49e3e) - dma(0xfcd700000): depth(3492), frame_size(128), pool_size(436 kB)
[    1.289333] mpt2sas_cm0: sense pool(0x00000000693be9f4)- dma(0xfcba00000): depth(3367),element_size(96), pool_size(315 kB)
[    1.289400] mpt2sas_cm0: config page(0x00000000f6926acf) - dma(0xfcb9ad000): size(512)
[    1.289401] mpt2sas_cm0: Allocated physical memory: size(1687 kB)
[    1.289401] mpt2sas_cm0: Current Controller Queue Depth(3364),Max Controller Queue Depth(3432)
[    1.289402] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[    1.333780] mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[    1.333781] mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[    1.334527] mpt2sas_cm0: sending port enable !!
[    2.861790] mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x590b11c0155b3300), phys(8)
[    8.996385] mpt2sas_cm0: port enable: SUCCESS
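For what it's worth, a minimal sketch of the disconnect-and-copy procedure the asker describes. The usually recommended first step is to offline the disk in ZFS before pulling it, so the pool treats the absence as expected; the new drive and new pool names below are hypothetical:

~$ zpool offline File sdf                    # take one member of the raidz2 out of service first
# ...physically remove that disk and attach the three 6 TB drives...
~$ zpool create newpool raidz1 /dev/disk/by-id/<new1> /dev/disk/by-id/<new2> /dev/disk/by-id/<new3>
~$ zfs snapshot -r File@move
~$ zfs send -R File@move | zfs recv newpool/copy   # copy everything while the old pool is still readable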

linux zfs sas storage raidz
  • 1 answer
  • 158 Views
ALchEmiXt
Asked: 2020-03-17 13:41:40 +0800 CST

ZFS: fixing the zpool drive order of a failed pool on Ubuntu

  • 1

I'm a bit lost about what exactly happened with my recently expanded ZFS setup on Ubuntu 18.04, and how to proceed.

I have a storage server that has been running ZFS for years, with 2 pools each containing 10+ drives. All was well, until... we decided to expand one pool by adding a new vdev of 10 disks. Everything worked after plugging them in. This is what I did to add the devices (which I now know I should have done using the disks' by-id names :-( ):

~$ sudo modprobe zfs
~$ dmesg|grep ZFS
[   17.948569] ZFS: Loaded module v0.6.5.6-0ubuntu26, ZFS pool version 5000, ZFS filesystem version 5
~$ lsscsi
[0:0:0:0]    disk    HGST     HUS724020ALS640  A1C4  /dev/sda
[0:0:1:0]    disk    HGST     HUS724020ALS640  A1C4  /dev/sdb
[0:0:2:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdc
[0:0:3:0]    enclosu LSI      SAS2X28          0e12  -
[1:0:0:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdd
[1:0:1:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sde
[1:0:2:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdf
[1:0:3:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdg
[1:0:4:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdh
[1:0:5:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdi
[1:0:6:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdj
[1:0:7:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdk
[1:0:8:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdl
[1:0:9:0]    disk    HGST     HUS726040AL5210  A7J0  /dev/sdm
[1:0:10:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdn
[1:0:11:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdo
[1:0:12:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdp
[1:0:13:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdq
[1:0:14:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdr
[1:0:15:0]   disk    HGST     HUS726060AL5210  A519  /dev/sds
[1:0:16:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdt
[1:0:17:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdu
[1:0:18:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdv
[1:0:19:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdw
[1:0:20:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdx
[1:0:21:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdy
[1:0:22:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdz
[1:0:23:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdaa
[1:0:24:0]   enclosu LSI CORP SAS2X36          0717  -
[1:0:25:0]   disk    HGST     HUS726040AL5210  A7J0  /dev/sdab
[1:0:26:0]   enclosu LSI CORP SAS2X36          0717  -
[1:0:27:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdac      ===>from here below the new plugged disks
[1:0:28:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdad
[1:0:30:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdae
[1:0:31:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdaf
[1:0:32:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdag
[1:0:33:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdah
[1:0:34:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdai
[1:0:35:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdaj
[1:0:36:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdak
[1:0:37:0]   disk    HGST     HUH721010AL4200  A384  /dev/sdal

Next, I added the drives to the existing archive pool as a new raidz2 vdev. Everything seemed to run smoothly afterwards:

~$ sudo zpool add -f archive raidz2 sdac sdad sdae sdaf sdag sdah sdai sdaj sdak sdal
~$ sudo zpool status
  pool: archive
state: ONLINE
  scan: scrub repaired 0 in 17h18m with 0 errors on Sun Dec  8 17:42:17 2019
config:
        NAME                        STATE     READ WRITE CKSUM
        archive                     ONLINE       0     0     0
          raidz2-0                  ONLINE       0     0     0
            scsi-35000cca24311c340  ONLINE       0     0     0
            scsi-35000cca24311ecbc  ONLINE       0     0     0
            scsi-35000cca24d019248  ONLINE       0     0     0
            scsi-35000cca24311e30c  ONLINE       0     0     0
            scsi-35000cca243113ab0  ONLINE       0     0     0
            scsi-35000cca24311c188  ONLINE       0     0     0
            scsi-35000cca24311e7c8  ONLINE       0     0     0
            scsi-35000cca24311e3f0  ONLINE       0     0     0
            scsi-35000cca24311e7bc  ONLINE       0     0     0
            scsi-35000cca24311e40c  ONLINE       0     0     0
            scsi-35000cca243118054  ONLINE       0     0     0
            scsi-35000cca243115cb8  ONLINE       0     0     0
          raidz2-1                  ONLINE       0     0     0
            sdac                    ONLINE       0     0     0
            sdad                    ONLINE       0     0     0
            sdae                    ONLINE       0     0     0
            sdaf                    ONLINE       0     0     0
            sdag                    ONLINE       0     0     0
            sdah                    ONLINE       0     0     0
            sdai                    ONLINE       0     0     0
            sdaj                    ONLINE       0     0     0
            sdak                    ONLINE       0     0     0
            sdal                    ONLINE       0     0     0

errors: No known data errors

  pool: scratch
state: ONLINE
  scan: scrub repaired 0 in 9h8m with 0 errors on Sun Dec  8 09:32:15 2019
config:
        NAME                        STATE     READ WRITE CKSUM
        scratch                     ONLINE       0     0     0
          raidz2-0                  ONLINE       0     0     0
            scsi-35000cca24311e2e8  ONLINE       0     0     0
            scsi-35000cca24311e858  ONLINE       0     0     0
            scsi-35000cca24311ea5c  ONLINE       0     0     0
            scsi-35000cca24311c344  ONLINE       0     0     0
            scsi-35000cca24311e7ec  ONLINE       0     0     0
            scsi-35000cca24311bcb8  ONLINE       0     0     0
            scsi-35000cca24311e8a8  ONLINE       0     0     0
            scsi-35000cca2440b4f98  ONLINE       0     0     0
            scsi-35000cca24311e8f0  ONLINE       0     0     0
            scsi-35000cca2440b4ff0  ONLINE       0     0     0
            scsi-35000cca243113e30  ONLINE       0     0     0
            scsi-35000cca24311e9b4  ONLINE       0     0     0
            scsi-35000cca243137e80  ONLINE       0     0     0

errors: No known data errors

However, a reboot most likely shuffled the disk drive order (the device assignment; I'm not certain that is what happened, but it seems very likely). At least that is what I could make of it after reading many docs and questions. The current state is as follows: the scratch pool works fine, the archive pool does not:

~$ sudo zpool status -v
  pool: archive
state: UNAVAIL
status: One or more devices could not be used because the label is missing
or invalid.  There are insufficient replicas for the pool to continue
functioning.
action: Destroy and re-create the pool from
a backup source.
  see: http://zfsonlinux.org/msg/ZFS-8000-5E
  scan: none requested
config:

NAME                        STATE    READ WRITE CKSUM
archive                    UNAVAIL      0    0    0  insufficient replicas
  raidz2-0                  ONLINE      0    0    0
    scsi-35000cca24311c340  ONLINE      0    0    0
    scsi-35000cca24311ecbc  ONLINE      0    0    0
    scsi-35000cca24d019248  ONLINE      0    0    0
    scsi-35000cca24311e30c  ONLINE      0    0    0
    scsi-35000cca243113ab0  ONLINE      0    0    0
    scsi-35000cca24311c188  ONLINE      0    0    0
    scsi-35000cca24311e7c8  ONLINE      0    0    0
    scsi-35000cca24311e3f0  ONLINE      0    0    0
    scsi-35000cca24311e7bc  ONLINE      0    0    0
    scsi-35000cca24311e40c  ONLINE      0    0    0
    scsi-35000cca243118054  ONLINE      0    0    0
    scsi-35000cca243115cb8  ONLINE      0    0    0
  raidz2-1                  UNAVAIL      0    0    0  insufficient replicas
    sdac                    FAULTED      0    0    0  corrupted data
    sdad                    FAULTED      0    0    0  corrupted data
    sdae                    FAULTED      0    0    0  corrupted data
    sdaf                    FAULTED      0    0    0  corrupted data
    sdag                    FAULTED      0    0    0  corrupted data
    sdah                    FAULTED      0    0    0  corrupted data
    sdai                    FAULTED      0    0    0  corrupted data
    sdaj                    FAULTED      0    0    0  corrupted data
    sdak                    FAULTED      0    0    0  corrupted data
    sdal                    FAULTED      0    0    0  corrupted data

  pool: scratch
state: ONLINE
  scan: scrub repaired 0 in 16h36m with 0 errors on Sun Feb  9 17:00:25 2020
config:

NAME                        STATE    READ WRITE CKSUM
scratch                    ONLINE      0    0    0
  raidz2-0                  ONLINE      0    0    0
    scsi-35000cca24311e2e8  ONLINE      0    0    0
    scsi-35000cca24311e858  ONLINE      0    0    0
    scsi-35000cca24311ea5c  ONLINE      0    0    0
    scsi-35000cca24311c344  ONLINE      0    0    0
    scsi-35000cca24311e7ec  ONLINE      0    0    0
    scsi-35000cca24311bcb8  ONLINE      0    0    0
    scsi-35000cca24311e8a8  ONLINE      0    0    0
    scsi-35000cca2440b4f98  ONLINE      0    0    0
    scsi-35000cca24311e8f0  ONLINE      0    0    0
    scsi-35000cca2440b4ff0  ONLINE      0    0    0
    scsi-35000cca243113e30  ONLINE      0    0    0
    scsi-35000cca24311e9b4  ONLINE      0    0    0
    scsi-35000cca243137e80  ONLINE      0    0    0

errors: No known data errors

I tried zpool export archive (also with -f), but it complains about missing devices.

~$ sudo zpool export -f archive
cannot export 'archive': one or more devices is currently unavailable

Obviously importing fails as well...

What else can I try? I simply can't believe that a "simple" disk reordering would mess up all the data in the archive pool.

Edit, March 23

The problem is indeed that the drive order changed.
If I run zdb on the pool, it shows all the information stored in the labels, and the big new disks are referenced by the wrong /dev/sdxx devices. I determined this by listing the drives' guids together with the currently assigned /dev/sdxx devices and their ids. That gave me the mapping below:

(mapping table of the old versus current /dev/sdxx device assignments)

But how do I fix this? In theory, rewriting corrected zdb data to the disks should solve it.
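Rewriting label data by hand is rarely needed for this failure mode; the usual way out of a shuffled /dev/sdX assignment is to have ZFS rescan the disks and import the pool by its stable ids. A small sketch, assuming the labels themselves are intact as the zdb output suggests:

~$ zdb -l /dev/sdac1                           # check that the label on one of the new disks is readable
~$ zpool export archive                        # may still refuse while devices are misassigned
~$ zpool import -d /dev/disk/by-id             # no pool name: just list what ZFS can assemble
~$ zpool import -d /dev/disk/by-id archive     # import using the by-id paths instead of sdXX names

If the export keeps failing, a commonly suggested fallback is to move /etc/zfs/zpool.cache aside and reboot so nothing is auto-imported, then run the by-id import.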

zfs ubuntu-18.04 pool raidz
  • 1 answer
  • 244 Views
Brian Thomas
Asked: 2020-01-23 15:42:12 +0800 CST

How did ZFS raidz-2 recover from 3 drive failures?

  • 2

I'd like to know what happened, how ZFS recovered so completely, and whether my data is really still intact.
When I came in last night I was dismayed, and then confused.

zpool status
  pool: san
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: resilvered 392K in 0h0m with 0 errors on Tue Jan 21 16:36:41 2020
config:

        NAME                                          STATE     READ WRITE CKSUM
        san                                           DEGRADED     0     0     0
          raidz2-0                                    DEGRADED     0     0     0
            ata-WDC_WD20EZRX-00DC0B0_WD-WMC1T3458346  ONLINE       0     0     0
            ata-ST2000DM001-9YN164_W1E07E0G           DEGRADED     0     0    38  too many errors
            ata-WDC_WD20EZRX-19D8PB0_WD-WCC4M0428332  DEGRADED     0     0    63  too many errors
            ata-ST2000NM0011_Z1P07NVZ                 ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WCAZAJ490344  ONLINE       0     0     0
            wwn-0x50014ee20949b6f9                    DEGRADED     0     0    75  too many errors

errors: No known data errors 

How can there be no data errors, and the whole pool not be faulted?

One drive, sdf, failed smartctl's SMART test with a read failure; the other drives have somewhat smaller issues: uncorrectable/pending sectors or UDMA CRC errors.

I tried taking each degraded drive offline and then back online, one at a time, but that did not help.

    $ zpool status
  pool: san
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: resilvered 392K in 0h0m with 0 errors on Tue Jan 21 16:36:41 2020
config:

        NAME                                          STATE     READ WRITE CKSUM
        san                                           DEGRADED     0     0     0
          raidz2-0                                    DEGRADED     0     0     0
            ata-WDC_WD20EZRX-00DC0B0_WD-WMC1T3458346  ONLINE       0     0     0
            ata-ST2000DM001-9YN164_W1E07E0G           DEGRADED     0     0    38  too many errors
            ata-WDC_WD20EZRX-19D8PB0_WD-WCC4M0428332  OFFLINE      0     0    63
            ata-ST2000NM0011_Z1P07NVZ                 ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WCAZAJ490344  ONLINE       0     0     0
            wwn-0x50014ee20949b6f9                    DEGRADED     0     0    75  too many errors

So, either feeling very lucky that my data really is all still there, or somewhat confused, after checking which drive was worst I replaced it with my only spare.

    $ zpool status
  pool: san
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jan 21 17:33:15 2020
        467G scanned out of 8.91T at 174M/s, 14h10m to go
        77.6G resilvered, 5.12% done
config:

        NAME                                              STATE     READ WRITE CKSUM
        san                                               DEGRADED     0     0     0
          raidz2-0                                        DEGRADED     0     0     0
            ata-WDC_WD20EZRX-00DC0B0_WD-WMC1T3458346      ONLINE       0     0     0
            replacing-1                                   DEGRADED     0     0     0
              ata-ST2000DM001-9YN164_W1E07E0G             OFFLINE      0     0    38
              ata-WDC_WD2000FYYZ-01UL1B1_WD-WCC1P1171516  ONLINE       0     0     0  (resilvering)
            ata-WDC_WD20EZRX-19D8PB0_WD-WCC4M0428332      DEGRADED     0     0    63  too many errors
            ata-ST2000NM0011_Z1P07NVZ                     ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WCAZAJ490344      ONLINE       0     0     0
            wwn-0x50014ee20949b6f9                        DEGRADED     0     0    75  too many errors

The resilver did complete successfully.

$ zpool status
  pool: san
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: resilvered 1.48T in 12h5m with 0 errors on Wed Jan 22 05:38:48 2020
config:

        NAME                                            STATE     READ WRITE CKSUM
        san                                             DEGRADED     0     0     0
          raidz2-0                                      DEGRADED     0     0     0
            ata-WDC_WD20EZRX-00DC0B0_WD-WMC1T3458346    ONLINE       0     0     0
            ata-WDC_WD2000FYYZ-01UL1B1_WD-WCC1P1171516  ONLINE       0     0     0
            ata-WDC_WD20EZRX-19D8PB0_WD-WCC4M0428332    DEGRADED     0     0    63  too many errors
            ata-ST2000NM0011_Z1P07NVZ                   ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WCAZAJ490344    ONLINE       0     0     0
            wwn-0x50014ee20949b6f9                      DEGRADED     0     0    75  too many errors

I'm now at a crossroads. I usually dd-zero the first 2 MB of a failed drive and then replace it with itself. I could do that, but if there really is data missing I might need these last two drives to recover it.

I now have that sdf sitting on my desk, removed. I figure that, worst case, I can use it to help with a recovery.

In the meantime, I think I'm going to dd/zero the first few MB of one of the degraded drives and replace it with itself (sketched below), and I expect things to work out; rinse and repeat for the second degraded drive until I can get some replacements in hand.
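A minimal sketch of that wipe-and-reuse cycle using ZFS-level commands instead of raw dd (pool and member names as in the status output above; /dev/sdX stands in for whichever block device that id maps to):

$ zpool offline san ata-WDC_WD20EZRX-19D8PB0_WD-WCC4M0428332    # take the degraded member out of service first
$ zpool labelclear -f /dev/sdX                                  # or dd the first couple of MB, as described above
$ zpool replace san ata-WDC_WD20EZRX-19D8PB0_WD-WCC4M0428332    # with no new device given, ZFS resilvers the same slot
$ zpool status san                                              # watch the resilver; zpool clear san resets error counters afterwards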

Question: What happened, and how was the pool able to hang on? Or did I possibly lose some data (doubtful, given ZFS and the integrity it reports)?

Could it have been a lucky order of failures, for example which drive in the stack happened to fail first?

Question (just FYI, not really on topic): What caused all 3 to fail at the same time? I think it was a scrub that acted as the catalyst. I checked the night before and all drives were online.

Note that cabling has been an issue recently; the office gets cold at night, but those problems only ever showed up as drive unavailable, never as checksum errors. I don't think it's the cabling but rather aging drives, they are 5 years old. But 3 failures in one day? Come on, that's enough to scare a lot of us!

zfs redundancy zfsonlinux raidz
  • 1 answer
  • 1258 Views
Greg
Asked: 2017-01-18 18:16:34 +0800 CST

Using an exotic setup in ZFS raidz to maximize capacity (with different-sized disks)

  • 3

I have 2 x 4 TB disks and 3 x 6 TB disks that I want to use with ZFS. My goal is to maximize usable storage while still tolerating a single disk failure.

Ideally a raidz setup would be used, but from my research drives of different sizes leave the larger drives under-utilized; that is, only 4 TB of each 6 TB drive would be used.

Is it possible to stripe (raid 0) the following (see the sketch after the lists below):

  • the two 4 TB disks in a mirror (raid 1) configuration
  • the three 6 TB disks in a raidz (raid 5) configuration

Alternatively, is it possible to stripe the two 4 TB disks and then use that stripe in a raidz configuration with the 6 TB drives? That is:

  • stripe the two 4 TB drives
  • raidz over the 3 x 6 TB disks and the striped 4 TB pair
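The first layout maps directly onto ZFS, because a pool stripes across all of its top-level vdevs; zpool create will warn about the mismatched redundancy levels, so -f is needed. A minimal sketch with hypothetical device names:

~$ zpool create -f tank \
      mirror /dev/disk/by-id/ata-4TB-A /dev/disk/by-id/ata-4TB-B \
      raidz  /dev/disk/by-id/ata-6TB-A /dev/disk/by-id/ata-6TB-B /dev/disk/by-id/ata-6TB-C

The second layout (a plain stripe used as a single raidz member) is not something ZFS can express by itself; it would need an mdadm or LVM device underneath, much like the crippled-zpool question earlier on this page.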
zfs mirror zfsonlinux raidz
  • 2 answers
  • 6475 Views
swood
Asked: 2012-06-16 04:50:44 +0800 CST

Raidz in FreeNAS taking up more space than expected

  • 5

I just got 6 new 2 TB drives and added them to my FreeNAS box. I had only ever dealt with RAID1 before, and every setup matched my expectations.

With 6 * 2 TB drives, however, I wanted to maximize usable space, so I went with raidz. But I seem to be missing space: after building the raidz I have 8.6 TB available. Maybe my math is wildly off, but (N-1) x S(min), with N = 6 and S(min) = 2 TB, should be 10 TB. (I know it's really more like 9-point-something.)

Does raidz really consume more than one drive's worth of space, or could there be another problem? (All drives were independently verified to have 2 TB available.)
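Part of the gap is simply decimal versus binary units: a "2 TB" drive is 2 x 10^12 bytes, while pool capacity is typically reported in TiB (2^40 bytes). Re-running the asker's own (N-1) x S(min) estimate in TiB:

$ echo 'scale=2; 5 * 2 * 10^12 / 2^40' | bc
9.09

So roughly 9.1 TiB is the raw expectation before ZFS metadata, reservations and raidz allocation overhead are taken into account, which closes much of the distance to the reported 8.6; whether the remainder is normal overhead or a real problem is what an answer would need to settle.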

zfs storage truenas raidz
  • 1 answer
  • 2708 Views
