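A bit of history: 2 years ago I was excited to discover that mdadm is so powerful that it can even reshape arrays, so you can start with a small array and grow it as needed. I bought 3x1 TB drives and built a RAID-5. It worked fine for a year.
Then I bought 2 more and tried to reshape to a 5-drive RAID-6, but because of some confusion over superblock versions I lost everything. I had to rebuild the array from scratch, and 2 TB of data was gone.
Yesterday I bought 2 more drives, and this time I had everything in place: a properly built array and a UPS. I disabled the write-intent bitmap, added the 2 new drives as spares, and ran the command to grow the array to 7 disks.
It started working, but ridiculously slowly, at ~100 KB/sec. After it had processed the first 37 MB at that amazing speed, one of the old HDDs failed. I shut the PC down properly and disconnected the failed drive. After booting, it appeared to have recreated the write-intent bitmap, because the bitmap was still in the mdadm config, so I removed it from the config and rebooted again.
What I see now is that all mdadm processes are deadlocked and doing nothing.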
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1937 root 20 0 12992 608 444 D 0 0.1 0:00.00 mdadm
2283 root 20 0 12992 852 704 D 0 0.1 0:00.01 mdadm
2287 root 20 0 0 0 0 D 0 0.0 0:00.01 md0_reshape
2288 root 18 -2 12992 820 676 D 0 0.1 0:00.01 mdadm
All I see in mdstat is:
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdb1[1] sdg1[4] sdf1[7] sde1[6] sdd1[0] sdc1[5]
2929683456 blocks super 1.2 level 6, 1024k chunk, algorithm 2 [7/6] [UU_UUUU]
[>....................] reshape = 0.0% (37888/976561152) finish=567604147.2min speed=0K/sec
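The reshape line in /proc/mdstat carries both the percentage done and the current speed, so a stall like the one above can be detected mechanically. A minimal sketch, assuming the line keeps the `reshape = N% ... speed=NK/sec` shape shown here; `reshape_status` and `reshape_stalled` are hypothetical helper names, not part of mdadm:

```shell
# Print "<percent> <speed in K/sec>" for each reshape line in an mdstat-format file.
# $1: file to read (defaults to /proc/mdstat).
reshape_status() {
    sed -n 's/.*reshape = *\([0-9.]*\)%.*speed=\([0-9]*\)K\/sec.*/\1 \2/p' "${1:-/proc/mdstat}"
}

# Succeed only if exactly one reshape is in progress and its speed is 0K/sec.
reshape_stalled() {
    # Word-split the "<percent> <speed>" output into positional parameters.
    set -- $(reshape_status "$1")
    [ "$#" -eq 2 ] && [ "$2" -eq 0 ]
}
```

For example, `reshape_stalled && echo "reshape is not making progress"` could be run from a monitoring script. A single reading of 0K/sec may just be a momentary pause, so it is probably worth checking a few times before drawing conclusions.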
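I have tried mdadm 2.6.7, 3.1.4 and 3.2; nothing helped. Have I lost my data again? Any suggestions on how to make this work?
The OS is Ubuntu Server 10.04.2.
P.S. Needless to say, the data is inaccessible: I cannot mount /dev/md0 to save the most valuable data.
You can see my disappointment: the one very specific thing I was so excited about has failed twice and taken 5 TB of my data with it.
Update: there seems to be some useful information in kern.log: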
21:38:48 ...: [ 166.522055] raid5: reshape will continue
21:38:48 ...: [ 166.522085] raid5: device sdb1 operational as raid disk 1
21:38:48 ...: [ 166.522091] raid5: device sdg1 operational as raid disk 4
21:38:48 ...: [ 166.522097] raid5: device sdf1 operational as raid disk 5
21:38:48 ...: [ 166.522102] raid5: device sde1 operational as raid disk 6
21:38:48 ...: [ 166.522107] raid5: device sdd1 operational as raid disk 0
21:38:48 ...: [ 166.522111] raid5: device sdc1 operational as raid disk 3
21:38:48 ...: [ 166.523942] raid5: allocated 7438kB for md0
21:38:48 ...: [ 166.524041] 1: w=1 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524050] 4: w=2 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524056] 5: w=3 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524062] 6: w=4 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524068] 0: w=5 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524073] 3: w=6 pa=2 pr=5 m=2 a=2 r=7 op1=0 op2=0
21:38:48 ...: [ 166.524079] raid5: raid level 6 set md0 active with 6 out of 7 devices, algorithm 2
21:38:48 ...: [ 166.524519] RAID5 conf printout:
21:38:48 ...: [ 166.524523] --- rd:7 wd:6
21:38:48 ...: [ 166.524528] disk 0, o:1, dev:sdd1
21:38:48 ...: [ 166.524532] disk 1, o:1, dev:sdb1
21:38:48 ...: [ 166.524537] disk 3, o:1, dev:sdc1
21:38:48 ...: [ 166.524541] disk 4, o:1, dev:sdg1
21:38:48 ...: [ 166.524545] disk 5, o:1, dev:sdf1
21:38:48 ...: [ 166.524550] disk 6, o:1, dev:sde1
21:38:48 ...: [ 166.524553] ...ok start reshape thread
21:38:48 ...: [ 166.524727] md0: detected capacity change from 0 to 2999995858944
21:38:48 ...: [ 166.524735] md: reshape of RAID array md0
21:38:48 ...: [ 166.524740] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
21:38:48 ...: [ 166.524745] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
21:38:48 ...: [ 166.524756] md: using 128k window, over a total of 976561152 blocks.
21:39:05 ...: [ 166.525013] md0:
21:42:04 ...: [ 362.520063] INFO: task mdadm:1937 blocked for more than 120 seconds.
21:42:04 ...: [ 362.520068] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:42:04 ...: [ 362.520073] mdadm D 00000000ffffffff 0 1937 1 0x00000000
21:42:04 ...: [ 362.520083] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0
21:42:04 ...: [ 362.520092] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0
21:42:04 ...: [ 362.520100] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198
21:42:04 ...: [ 362.520107] Call Trace:
21:42:04 ...: [ 362.520133] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:42:04 ...: [ 362.520148] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:42:04 ...: [ 362.520159] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456]
21:42:04 ...: [ 362.520169] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456]
21:42:04 ...: [ 362.520179] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:42:04 ...: [ 362.520188] [<ffffffff81414df0>] md_make_request+0xc0/0x130
21:42:04 ...: [ 362.520194] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130
21:42:04 ...: [ 362.520205] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0
21:42:04 ...: [ 362.520214] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20
21:42:04 ...: [ 362.520222] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60
21:42:04 ...: [ 362.520230] [<ffffffff8129fc80>] submit_bio+0x80/0x110
21:42:04 ...: [ 362.520236] [<ffffffff8116c849>] submit_bh+0xf9/0x140
21:42:04 ...: [ 362.520244] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0
21:42:04 ...: [ 362.520251] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70
21:42:04 ...: [ 362.520258] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40
21:42:04 ...: [ 362.520265] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160
21:42:04 ...: [ 362.520272] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20
21:42:04 ...: [ 362.520279] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0
21:42:04 ...: [ 362.520285] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:42:04 ...: [ 362.520290] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:42:04 ...: [ 362.520297] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120
21:42:04 ...: [ 362.520304] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20
21:42:04 ...: [ 362.520310] [<ffffffff810f591e>] read_cache_page+0xe/0x20
21:42:04 ...: [ 362.520317] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0
21:42:04 ...: [ 362.520324] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460
21:42:04 ...: [ 362.520331] [<ffffffff811a7938>] check_partition+0x138/0x190
21:42:04 ...: [ 362.520338] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0
21:42:04 ...: [ 362.520344] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0
21:42:04 ...: [ 362.520350] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:42:04 ...: [ 362.520356] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:42:04 ...: [ 362.520362] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:42:04 ...: [ 362.520369] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:42:04 ...: [ 362.520377] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:42:04 ...: [ 362.520385] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:42:04 ...: [ 362.520391] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:42:04 ...: [ 362.520398] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:42:04 ...: [ 362.520406] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310
21:42:04 ...: [ 362.520414] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:42:04 ...: [ 362.520421] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:42:04 ...: [ 362.520428] [<ffffffff811418b0>] sys_open+0x20/0x30
21:42:04 ...: [ 362.520437] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:42:04 ...: [ 362.520446] INFO: task mdadm:2283 blocked for more than 120 seconds.
21:42:04 ...: [ 362.520450] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:42:04 ...: [ 362.520454] mdadm D 0000000000000000 0 2283 2212 0x00000000
21:42:04 ...: [ 362.520462] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0
21:42:04 ...: [ 362.520470] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0
21:42:04 ...: [ 362.520478] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78
21:42:04 ...: [ 362.520485] Call Trace:
21:42:04 ...: [ 362.520495] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:42:04 ...: [ 362.520502] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:42:04 ...: [ 362.520508] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190
21:42:04 ...: [ 362.520514] [<ffffffff811741b0>] blkdev_put+0x10/0x20
21:42:04 ...: [ 362.520520] [<ffffffff811741f3>] blkdev_close+0x33/0x60
21:42:04 ...: [ 362.520527] [<ffffffff81145375>] __fput+0xf5/0x210
21:42:04 ...: [ 362.520534] [<ffffffff811454b5>] fput+0x25/0x30
21:42:04 ...: [ 362.520540] [<ffffffff811415ad>] filp_close+0x5d/0x90
21:42:04 ...: [ 362.520546] [<ffffffff81141697>] sys_close+0xb7/0x120
21:42:04 ...: [ 362.520553] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:42:04 ...: [ 362.520559] INFO: task md0_reshape:2287 blocked for more than 120 seconds.
21:42:04 ...: [ 362.520563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:42:04 ...: [ 362.520567] md0_reshape D ffff88003aee96f0 0 2287 2 0x00000000
21:42:04 ...: [ 362.520575] ffff88003cf05a70 0000000000000046 0000000000015bc0 0000000000015bc0
21:42:04 ...: [ 362.520582] ffff88003aee9aa8 ffff88003cf05fd8 0000000000015bc0 ffff88003aee96f0
21:42:04 ...: [ 362.520590] 0000000000015bc0 ffff88003cf05fd8 0000000000015bc0 ffff88003aee9aa8
21:42:04 ...: [ 362.520597] Call Trace:
21:42:04 ...: [ 362.520608] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:42:04 ...: [ 362.520616] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:42:04 ...: [ 362.520626] [<ffffffffa0226f80>] reshape_request+0x4c0/0x9a0 [raid456]
21:42:04 ...: [ 362.520634] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:42:04 ...: [ 362.520644] [<ffffffffa022777a>] sync_request+0x31a/0x3a0 [raid456]
21:42:04 ...: [ 362.520651] [<ffffffff81052713>] ? __wake_up+0x53/0x70
21:42:04 ...: [ 362.520658] [<ffffffff814156b1>] md_do_sync+0x621/0xbb0
21:42:04 ...: [ 362.520668] [<ffffffff810387b9>] ? default_spin_lock_flags+0x9/0x10
21:42:04 ...: [ 362.520675] [<ffffffff8141640c>] md_thread+0x5c/0x130
21:42:04 ...: [ 362.520681] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:42:04 ...: [ 362.520688] [<ffffffff814163b0>] ? md_thread+0x0/0x130
21:42:04 ...: [ 362.520694] [<ffffffff81084416>] kthread+0x96/0xa0
21:42:04 ...: [ 362.520701] [<ffffffff810131ea>] child_rip+0xa/0x20
21:42:04 ...: [ 362.520707] [<ffffffff81084380>] ? kthread+0x0/0xa0
21:42:04 ...: [ 362.520713] [<ffffffff810131e0>] ? child_rip+0x0/0x20
21:42:04 ...: [ 362.520718] INFO: task mdadm:2288 blocked for more than 120 seconds.
21:42:04 ...: [ 362.520721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:42:04 ...: [ 362.520725] mdadm D 0000000000000000 0 2288 1 0x00000000
21:42:04 ...: [ 362.520733] ffff88002cca9c18 0000000000000086 0000000000015bc0 0000000000015bc0
21:42:04 ...: [ 362.520741] ffff88003aee83b8 ffff88002cca9fd8 0000000000015bc0 ffff88003aee8000
21:42:04 ...: [ 362.520748] 0000000000015bc0 ffff88002cca9fd8 0000000000015bc0 ffff88003aee83b8
21:42:04 ...: [ 362.520755] Call Trace:
21:42:04 ...: [ 362.520763] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:42:04 ...: [ 362.520771] [<ffffffff812a6d50>] ? exact_match+0x0/0x10
21:42:04 ...: [ 362.520777] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:42:04 ...: [ 362.520783] [<ffffffff811742c8>] __blkdev_get+0x68/0x3d0
21:42:04 ...: [ 362.520790] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:42:04 ...: [ 362.520795] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:42:04 ...: [ 362.520801] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:42:04 ...: [ 362.520808] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:42:04 ...: [ 362.520815] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:42:04 ...: [ 362.520821] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:42:04 ...: [ 362.520828] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:42:04 ...: [ 362.520834] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:42:04 ...: [ 362.520841] [<ffffffff810ff0e1>] ? lru_cache_add_lru+0x21/0x40
21:42:04 ...: [ 362.520848] [<ffffffff8111109c>] ? do_anonymous_page+0x11c/0x330
21:42:04 ...: [ 362.520855] [<ffffffff81115d5f>] ? handle_mm_fault+0x31f/0x3c0
21:42:04 ...: [ 362.520862] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:42:04 ...: [ 362.520868] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:42:04 ...: [ 362.520874] [<ffffffff811418b0>] sys_open+0x20/0x30
21:42:04 ...: [ 362.520882] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:44:04 ...: [ 482.520065] INFO: task mdadm:1937 blocked for more than 120 seconds.
21:44:04 ...: [ 482.520071] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:44:04 ...: [ 482.520077] mdadm D 00000000ffffffff 0 1937 1 0x00000000
21:44:04 ...: [ 482.520087] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0
21:44:04 ...: [ 482.520096] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0
21:44:04 ...: [ 482.520104] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198
21:44:04 ...: [ 482.520112] Call Trace:
21:44:04 ...: [ 482.520139] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:44:04 ...: [ 482.520154] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:44:04 ...: [ 482.520165] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456]
21:44:04 ...: [ 482.520175] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456]
21:44:04 ...: [ 482.520185] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:44:04 ...: [ 482.520194] [<ffffffff81414df0>] md_make_request+0xc0/0x130
21:44:04 ...: [ 482.520201] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130
21:44:04 ...: [ 482.520212] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0
21:44:04 ...: [ 482.520221] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20
21:44:04 ...: [ 482.520229] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60
21:44:04 ...: [ 482.520237] [<ffffffff8129fc80>] submit_bio+0x80/0x110
21:44:04 ...: [ 482.520244] [<ffffffff8116c849>] submit_bh+0xf9/0x140
21:44:04 ...: [ 482.520252] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0
21:44:04 ...: [ 482.520258] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70
21:44:04 ...: [ 482.520266] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40
21:44:04 ...: [ 482.520273] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160
21:44:04 ...: [ 482.520280] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20
21:44:04 ...: [ 482.520286] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0
21:44:04 ...: [ 482.520293] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:44:04 ...: [ 482.520299] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:44:04 ...: [ 482.520306] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120
21:44:04 ...: [ 482.520313] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20
21:44:04 ...: [ 482.520319] [<ffffffff810f591e>] read_cache_page+0xe/0x20
21:44:04 ...: [ 482.520327] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0
21:44:04 ...: [ 482.520334] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460
21:44:04 ...: [ 482.520341] [<ffffffff811a7938>] check_partition+0x138/0x190
21:44:04 ...: [ 482.520348] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0
21:44:04 ...: [ 482.520355] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0
21:44:04 ...: [ 482.520361] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:44:04 ...: [ 482.520367] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:44:04 ...: [ 482.520373] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:44:04 ...: [ 482.520380] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:44:04 ...: [ 482.520388] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:44:04 ...: [ 482.520396] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:44:04 ...: [ 482.520403] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:44:04 ...: [ 482.520410] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:44:04 ...: [ 482.520417] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310
21:44:04 ...: [ 482.520426] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:44:04 ...: [ 482.520432] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:44:04 ...: [ 482.520438] [<ffffffff811418b0>] sys_open+0x20/0x30
21:44:04 ...: [ 482.520447] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:44:04 ...: [ 482.520458] INFO: task mdadm:2283 blocked for more than 120 seconds.
21:44:04 ...: [ 482.520462] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:44:04 ...: [ 482.520467] mdadm D 0000000000000000 0 2283 2212 0x00000000
21:44:04 ...: [ 482.520475] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0
21:44:04 ...: [ 482.520483] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0
21:44:04 ...: [ 482.520490] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78
21:44:04 ...: [ 482.520498] Call Trace:
21:44:04 ...: [ 482.520508] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:44:04 ...: [ 482.520515] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:44:04 ...: [ 482.520521] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190
21:44:04 ...: [ 482.520527] [<ffffffff811741b0>] blkdev_put+0x10/0x20
21:44:04 ...: [ 482.520533] [<ffffffff811741f3>] blkdev_close+0x33/0x60
21:44:04 ...: [ 482.520541] [<ffffffff81145375>] __fput+0xf5/0x210
21:44:04 ...: [ 482.520547] [<ffffffff811454b5>] fput+0x25/0x30
21:44:04 ...: [ 482.520554] [<ffffffff811415ad>] filp_close+0x5d/0x90
21:44:04 ...: [ 482.520560] [<ffffffff81141697>] sys_close+0xb7/0x120
21:44:04 ...: [ 482.520568] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:44:04 ...: [ 482.520574] INFO: task md0_reshape:2287 blocked for more than 120 seconds.
21:44:04 ...: [ 482.520578] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:44:04 ...: [ 482.520582] md0_reshape D ffff88003aee96f0 0 2287 2 0x00000000
21:44:04 ...: [ 482.520590] ffff88003cf05a70 0000000000000046 0000000000015bc0 0000000000015bc0
21:44:04 ...: [ 482.520597] ffff88003aee9aa8 ffff88003cf05fd8 0000000000015bc0 ffff88003aee96f0
21:44:04 ...: [ 482.520605] 0000000000015bc0 ffff88003cf05fd8 0000000000015bc0 ffff88003aee9aa8
21:44:04 ...: [ 482.520612] Call Trace:
21:44:04 ...: [ 482.520623] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:44:04 ...: [ 482.520633] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:44:04 ...: [ 482.520643] [<ffffffffa0226f80>] reshape_request+0x4c0/0x9a0 [raid456]
21:44:04 ...: [ 482.520651] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:44:04 ...: [ 482.520661] [<ffffffffa022777a>] sync_request+0x31a/0x3a0 [raid456]
21:44:04 ...: [ 482.520668] [<ffffffff81052713>] ? __wake_up+0x53/0x70
21:44:04 ...: [ 482.520675] [<ffffffff814156b1>] md_do_sync+0x621/0xbb0
21:44:04 ...: [ 482.520685] [<ffffffff810387b9>] ? default_spin_lock_flags+0x9/0x10
21:44:04 ...: [ 482.520692] [<ffffffff8141640c>] md_thread+0x5c/0x130
21:44:04 ...: [ 482.520699] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:44:04 ...: [ 482.520705] [<ffffffff814163b0>] ? md_thread+0x0/0x130
21:44:04 ...: [ 482.520711] [<ffffffff81084416>] kthread+0x96/0xa0
21:44:04 ...: [ 482.520718] [<ffffffff810131ea>] child_rip+0xa/0x20
21:44:04 ...: [ 482.520725] [<ffffffff81084380>] ? kthread+0x0/0xa0
21:44:04 ...: [ 482.520730] [<ffffffff810131e0>] ? child_rip+0x0/0x20
21:44:04 ...: [ 482.520735] INFO: task mdadm:2288 blocked for more than 120 seconds.
21:44:04 ...: [ 482.520739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:44:04 ...: [ 482.520743] mdadm D 0000000000000000 0 2288 1 0x00000000
21:44:04 ...: [ 482.520751] ffff88002cca9c18 0000000000000086 0000000000015bc0 0000000000015bc0
21:44:04 ...: [ 482.520759] ffff88003aee83b8 ffff88002cca9fd8 0000000000015bc0 ffff88003aee8000
21:44:04 ...: [ 482.520767] 0000000000015bc0 ffff88002cca9fd8 0000000000015bc0 ffff88003aee83b8
21:44:04 ...: [ 482.520774] Call Trace:
21:44:04 ...: [ 482.520782] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:44:04 ...: [ 482.520790] [<ffffffff812a6d50>] ? exact_match+0x0/0x10
21:44:04 ...: [ 482.520797] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:44:04 ...: [ 482.520804] [<ffffffff811742c8>] __blkdev_get+0x68/0x3d0
21:44:04 ...: [ 482.520810] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:44:04 ...: [ 482.520816] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:44:04 ...: [ 482.520822] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:44:04 ...: [ 482.520829] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:44:04 ...: [ 482.520837] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:44:04 ...: [ 482.520843] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:44:04 ...: [ 482.520850] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:44:04 ...: [ 482.520857] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:44:04 ...: [ 482.520864] [<ffffffff810ff0e1>] ? lru_cache_add_lru+0x21/0x40
21:44:04 ...: [ 482.520871] [<ffffffff8111109c>] ? do_anonymous_page+0x11c/0x330
21:44:04 ...: [ 482.520878] [<ffffffff81115d5f>] ? handle_mm_fault+0x31f/0x3c0
21:44:04 ...: [ 482.520885] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:44:04 ...: [ 482.520891] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:44:04 ...: [ 482.520897] [<ffffffff811418b0>] sys_open+0x20/0x30
21:44:04 ...: [ 482.520905] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:46:04 ...: [ 602.520053] INFO: task mdadm:1937 blocked for more than 120 seconds.
21:46:04 ...: [ 602.520059] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:46:04 ...: [ 602.520065] mdadm D 00000000ffffffff 0 1937 1 0x00000000
21:46:04 ...: [ 602.520075] ffff88002ef4f5d8 0000000000000082 0000000000015bc0 0000000000015bc0
21:46:04 ...: [ 602.520084] ffff88002eb5b198 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5ade0
21:46:04 ...: [ 602.520091] 0000000000015bc0 ffff88002ef4ffd8 0000000000015bc0 ffff88002eb5b198
21:46:04 ...: [ 602.520099] Call Trace:
21:46:04 ...: [ 602.520127] [<ffffffffa0224892>] get_active_stripe+0x312/0x3f0 [raid456]
21:46:04 ...: [ 602.520142] [<ffffffff81059ae0>] ? default_wake_function+0x0/0x20
21:46:04 ...: [ 602.520153] [<ffffffffa0228413>] make_request+0x243/0x4b0 [raid456]
21:46:04 ...: [ 602.520162] [<ffffffffa0221a90>] ? release_stripe+0x50/0x70 [raid456]
21:46:04 ...: [ 602.520171] [<ffffffff81084790>] ? autoremove_wake_function+0x0/0x40
21:46:04 ...: [ 602.520180] [<ffffffff81414df0>] md_make_request+0xc0/0x130
21:46:04 ...: [ 602.520187] [<ffffffff81414df0>] ? md_make_request+0xc0/0x130
21:46:04 ...: [ 602.520197] [<ffffffff8129f8c1>] generic_make_request+0x1b1/0x4f0
21:46:04 ...: [ 602.520206] [<ffffffff810f6515>] ? mempool_alloc_slab+0x15/0x20
21:46:04 ...: [ 602.520215] [<ffffffff8116c2ec>] ? alloc_buffer_head+0x1c/0x60
21:46:04 ...: [ 602.520222] [<ffffffff8129fc80>] submit_bio+0x80/0x110
21:46:04 ...: [ 602.520229] [<ffffffff8116c849>] submit_bh+0xf9/0x140
21:46:04 ...: [ 602.520237] [<ffffffff8116f124>] block_read_full_page+0x274/0x3b0
21:46:04 ...: [ 602.520244] [<ffffffff81172c90>] ? blkdev_get_block+0x0/0x70
21:46:04 ...: [ 602.520252] [<ffffffff8110d875>] ? __inc_zone_page_state+0x35/0x40
21:46:04 ...: [ 602.520259] [<ffffffff810f46d8>] ? add_to_page_cache_locked+0xe8/0x160
21:46:04 ...: [ 602.520266] [<ffffffff81173d78>] blkdev_readpage+0x18/0x20
21:46:04 ...: [ 602.520273] [<ffffffff810f484b>] __read_cache_page+0x7b/0xe0
21:46:04 ...: [ 602.520279] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:46:04 ...: [ 602.520285] [<ffffffff81173d60>] ? blkdev_readpage+0x0/0x20
21:46:04 ...: [ 602.520292] [<ffffffff810f57dc>] do_read_cache_page+0x3c/0x120
21:46:04 ...: [ 602.520300] [<ffffffff810f5909>] read_cache_page_async+0x19/0x20
21:46:04 ...: [ 602.520306] [<ffffffff810f591e>] read_cache_page+0xe/0x20
21:46:04 ...: [ 602.520314] [<ffffffff811a6cb0>] read_dev_sector+0x30/0xa0
21:46:04 ...: [ 602.520321] [<ffffffff811a7fcd>] amiga_partition+0x6d/0x460
21:46:04 ...: [ 602.520328] [<ffffffff811a7938>] check_partition+0x138/0x190
21:46:04 ...: [ 602.520335] [<ffffffff811a7a7a>] rescan_partitions+0xea/0x2f0
21:46:04 ...: [ 602.520342] [<ffffffff811744c7>] __blkdev_get+0x267/0x3d0
21:46:04 ...: [ 602.520348] [<ffffffff81174650>] ? blkdev_open+0x0/0xc0
21:46:04 ...: [ 602.520354] [<ffffffff81174640>] blkdev_get+0x10/0x20
21:46:04 ...: [ 602.520359] [<ffffffff811746c1>] blkdev_open+0x71/0xc0
21:46:04 ...: [ 602.520367] [<ffffffff811419f3>] __dentry_open+0x113/0x370
21:46:04 ...: [ 602.520375] [<ffffffff81253f8f>] ? security_inode_permission+0x1f/0x30
21:46:04 ...: [ 602.520383] [<ffffffff8114de3f>] ? inode_permission+0xaf/0xd0
21:46:04 ...: [ 602.520390] [<ffffffff81141d67>] nameidata_to_filp+0x57/0x70
21:46:04 ...: [ 602.520397] [<ffffffff8115207a>] do_filp_open+0x2da/0xba0
21:46:04 ...: [ 602.520404] [<ffffffff811134a8>] ? unmap_vmas+0x178/0x310
21:46:04 ...: [ 602.520413] [<ffffffff8115dbfa>] ? alloc_fd+0x10a/0x150
21:46:04 ...: [ 602.520419] [<ffffffff81141769>] do_sys_open+0x69/0x170
21:46:04 ...: [ 602.520425] [<ffffffff811418b0>] sys_open+0x20/0x30
21:46:04 ...: [ 602.520434] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
21:46:04 ...: [ 602.520443] INFO: task mdadm:2283 blocked for more than 120 seconds.
21:46:04 ...: [ 602.520447] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
21:46:04 ...: [ 602.520451] mdadm D 0000000000000000 0 2283 2212 0x00000000
21:46:04 ...: [ 602.520460] ffff88002cca7d98 0000000000000086 0000000000015bc0 0000000000015bc0
21:46:04 ...: [ 602.520468] ffff88002ededf78 ffff88002cca7fd8 0000000000015bc0 ffff88002ededbc0
21:46:04 ...: [ 602.520475] 0000000000015bc0 ffff88002cca7fd8 0000000000015bc0 ffff88002ededf78
21:46:04 ...: [ 602.520483] Call Trace:
21:46:04 ...: [ 602.520492] [<ffffffff81543a97>] __mutex_lock_slowpath+0xf7/0x180
21:46:04 ...: [ 602.520500] [<ffffffff8154397b>] mutex_lock+0x2b/0x50
21:46:04 ...: [ 602.520506] [<ffffffff8117404d>] __blkdev_put+0x3d/0x190
21:46:04 ...: [ 602.520512] [<ffffffff811741b0>] blkdev_put+0x10/0x20
21:46:04 ...: [ 602.520518] [<ffffffff811741f3>] blkdev_close+0x33/0x60
21:46:04 ...: [ 602.520526] [<ffffffff81145375>] __fput+0xf5/0x210
21:46:04 ...: [ 602.520533] [<ffffffff811454b5>] fput+0x25/0x30
21:46:04 ...: [ 602.520539] [<ffffffff811415ad>] filp_close+0x5d/0x90
21:46:04 ...: [ 602.520545] [<ffffffff81141697>] sys_close+0xb7/0x120
21:46:04 ...: [ 602.520552] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
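I was able to contact Neil Brown (the developer), and he immediately suggested increasing stripe_cache_size to at least 2048. This is similar to my previous question, where I was unable to make that setting permanent.
So after I set it to 8192 the reshape continued, and the problem was solved. God bless Neil Brown :-)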
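For reference, stripe_cache_size is exposed through sysfs for RAID4/5/6 arrays, under the array's `md` directory. A minimal sketch of the change described above, assuming the array is `md0`; the function name `set_stripe_cache_size` and its third argument (an alternate sysfs root, useful only for trying the function out) are hypothetical:

```shell
# Raise the stripe cache of an md RAID4/5/6 array by writing to sysfs.
# $1: array name (e.g. md0)   $2: new size (e.g. 8192)
# $3: sysfs root, defaults to /sys/block (hypothetical parameter for testing)
set_stripe_cache_size() {
    md_name=$1
    size=$2
    sysfs_root=${3:-/sys/block}
    target="$sysfs_root/$md_name/md/stripe_cache_size"

    # The file only exists for RAID4/5/6 arrays and is writable only by root.
    if [ ! -w "$target" ]; then
        echo "cannot write $target (not root, or not a RAID4/5/6 array?)" >&2
        return 1
    fi

    echo "old stripe_cache_size: $(cat "$target")"
    echo "$size" > "$target"
    echo "new stripe_cache_size: $(cat "$target")"
}
```

Run as root, `set_stripe_cache_size md0 8192` would apply the value that got the reshape moving here. The value is written to sysfs only, so it does not survive a reboot. According to the kernel md documentation, the cache uses roughly page size x number of disks x stripe_cache_size of memory, which for 7 disks and 4 KiB pages at 8192 comes to about 224 MiB.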
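Sometimes a reshape will proceed at speed=0K/sec because the backup file failed to be created or was lost somewhere in the process.
In this case, a solution was provided by Neil Brown, in reply to an email from [email protected].
For a RAID5 array as device /dev/md0, with 7 disks, mounted at /mnt/data, the procedure he replied with is as follows. All commands below must be run as root or equivalent.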
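Find any open connections to the drive:
Close them, or stop any services that may be interacting with it.
Typically:
or
Unmount, stop, then reassemble:
Depending on the prior configuration, the device may be remounted automatically after the assemble command. If not, mount it:
It is then safe to restart any services or connections that run from there.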
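The individual commands for these steps are not shown above, so the following is only a sketch of what the sequence might look like, not the exact procedure from the reply. It assumes `lsof`, `umount`, `mountpoint` and `mdadm` are available, that /mnt/data has an fstab entry, and that /dev/md0 is described in the mdadm config so that `mdadm --assemble /dev/md0` can find its members. The names `run` and `restart_array` are hypothetical. By default it only prints what it would do; stopping services is left out because it depends on the system.

```shell
# Dry-run by default: set DRY_RUN=0 to actually execute the commands (as root).
DRY_RUN=${DRY_RUN:-1}
MD_DEV=${MD_DEV:-/dev/md0}
MOUNT_POINT=${MOUNT_POINT:-/mnt/data}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Unmount, stop and reassemble the array, then mount it if needed.
# Any arguments are passed through to "mdadm --assemble" unchanged.
restart_array() {
    # List open files on the filesystem; lsof exits non-zero when it finds none.
    run lsof "$MOUNT_POINT" || true
    run umount "$MOUNT_POINT" || return 1
    run mdadm --stop "$MD_DEV" || return 1
    run mdadm --assemble "$MD_DEV" "$@" || return 1
    # Mount only if assembly did not already bring the filesystem back.
    mountpoint -q "$MOUNT_POINT" 2>/dev/null || run mount "$MOUNT_POINT"
}
```

Calling `restart_array` prints the plan. For the lost-backup-file case, the mdadm man page documents `--backup-file=` and `--invalid-backup` as assemble options; they could be passed as `restart_array --backup-file=/path/to/file --invalid-backup`, where the path is a placeholder. Whether the original reply used those options is not something this text confirms, and the man page warns that data being rearranged at the time of the interruption may be lost when `--invalid-backup` is used.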