Tae Young Hong
2012-May-07 04:13 UTC
[Lustre-discuss] recovery from multiple disks failure on the same md
Hi,
We have run into a terrible situation on our Lustre system.
An OST (RAID 6: 8+2, 1 spare) had two disk failures at almost the same time. While
the array was recovering, another disk failed, so the recovery appears to have
halted and the spare disk that was resyncing fell back into "spare" status.
(I estimate the resync was more than 95% complete.)
Right now we have just 7 working disks in this md. Is there any way to recover
from this situation?
The following is the detailed log.
#1 the original configuration before any failure
Number Major Minor RaidDevice State
0 8 176 0 active sync /dev/sdl
1 8 192 1 active sync /dev/sdm
2 8 208 2 active sync /dev/sdn
3 8 224 3 active sync /dev/sdo
4 8 240 4 active sync /dev/sdp
5 65 0 5 active sync /dev/sdq
6 65 16 6 active sync /dev/sdr
7 65 32 7 active sync /dev/sds
8 65 48 8 active sync /dev/sdt
9 65 96 9 active sync /dev/sdw
10 65 64 - spare /dev/sdu
#2 a disk (sdl) failed, and resync started after adding the spare disk (sdu)
May 7 04:53:33 oss07 kernel: sd 1:0:10:0: SCSI error: return code = 0x08000002
May 7 04:53:33 oss07 kernel: sdl: Current: sense key: Medium Error
May 7 04:53:33 oss07 kernel: Add. Sense: Unrecovered read error
May 7 04:53:33 oss07 kernel:
May 7 04:53:33 oss07 kernel: Info fld=0x74241ace
May 7 04:53:33 oss07 kernel: end_request: I/O error, dev sdl, sector 1948523214
... ...
May 7 04:54:15 oss07 kernel: RAID5 conf printout:
May 7 04:54:16 oss07 kernel: --- rd:10 wd:9 fd:1
May 7 04:54:16 oss07 kernel: disk 1, o:1, dev:sdm
May 7 04:54:16 oss07 kernel: disk 2, o:1, dev:sdn
May 7 04:54:16 oss07 kernel: disk 3, o:1, dev:sdo
May 7 04:54:16 oss07 kernel: disk 4, o:1, dev:sdp
May 7 04:54:16 oss07 kernel: disk 5, o:1, dev:sdq
May 7 04:54:16 oss07 kernel: disk 6, o:1, dev:sdr
May 7 04:54:16 oss07 kernel: disk 7, o:1, dev:sds
May 7 04:54:16 oss07 kernel: disk 8, o:1, dev:sdt
May 7 04:54:16 oss07 kernel: disk 9, o:1, dev:sdw
May 7 04:54:16 oss07 kernel: RAID5 conf printout:
May 7 04:54:16 oss07 kernel: --- rd:10 wd:9 fd:1
May 7 04:54:16 oss07 kernel: disk 0, o:1, dev:sdu
May 7 04:54:16 oss07 kernel: disk 1, o:1, dev:sdm
May 7 04:54:16 oss07 kernel: disk 2, o:1, dev:sdn
May 7 04:54:16 oss07 kernel: disk 3, o:1, dev:sdo
May 7 04:54:16 oss07 kernel: disk 4, o:1, dev:sdp
May 7 04:54:16 oss07 kernel: disk 5, o:1, dev:sdq
May 7 04:54:16 oss07 kernel: disk 6, o:1, dev:sdr
May 7 04:54:16 oss07 kernel: disk 7, o:1, dev:sds
May 7 04:54:16 oss07 kernel: disk 8, o:1, dev:sdt
May 7 04:54:16 oss07 kernel: disk 9, o:1, dev:sdw
May 7 04:54:16 oss07 kernel: md: syncing RAID array md12
#3 another disk (sdp) failed
May 7 04:54:42 oss07 kernel: end_request: I/O error, dev sdp, sector 1949298688
May 7 04:54:42 oss07 kernel: mptbase: ioc1: LogInfo(0x31080000):
Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000)
May 7 04:54:42 oss07 last message repeated 3 times
May 7 04:54:42 oss07 kernel: raid5:md12: read error not correctable (sector
1949298688 on sdp).
May 7 04:54:42 oss07 kernel: raid5: Disk failure on sdp, disabling device.
Operation continuing on
May 7 04:54:43 oss07 kernel: end_request: I/O error, dev sdp, sector 1948532499
... ...
May 7 04:54:44 oss07 kernel: raid5:md12: read error not correctable (sector
1948532728 on sdp).
May 7 04:54:44 oss07 kernel: md: md12: sync done.
May 7 04:54:53 oss07 kernel: RAID5 conf printout:
May 7 04:54:53 oss07 kernel: --- rd:10 wd:8 fd:2
May 7 04:54:53 oss07 kernel: disk 0, o:1, dev:sdu
May 7 04:54:53 oss07 kernel: disk 1, o:1, dev:sdm
May 7 04:54:53 oss07 kernel: disk 2, o:1, dev:sdn
May 7 04:54:53 oss07 kernel: disk 3, o:1, dev:sdo
May 7 04:54:53 oss07 kernel: disk 4, o:0, dev:sdp
May 7 04:54:53 oss07 kernel: disk 5, o:1, dev:sdq
May 7 04:54:53 oss07 kernel: disk 6, o:1, dev:sdr
May 7 04:54:53 oss07 kernel: disk 7, o:1, dev:sds
May 7 04:54:53 oss07 kernel: disk 8, o:1, dev:sdt
May 7 04:54:53 oss07 kernel: disk 9, o:1, dev:sdw
... ...
May 7 04:54:54 oss07 kernel: RAID5 conf printout:
May 7 04:54:54 oss07 kernel: --- rd:10 wd:8 fd:2
May 7 04:54:54 oss07 kernel: disk 1, o:1, dev:sdm
May 7 04:54:54 oss07 kernel: disk 2, o:1, dev:sdn
May 7 04:54:54 oss07 kernel: disk 3, o:1, dev:sdo
May 7 04:54:54 oss07 kernel: disk 5, o:1, dev:sdq
May 7 04:54:54 oss07 kernel: disk 6, o:1, dev:sdr
May 7 04:54:54 oss07 kernel: disk 7, o:1, dev:sds
May 7 04:54:54 oss07 kernel: disk 8, o:1, dev:sdt
May 7 04:54:54 oss07 kernel: disk 9, o:1, dev:sdw
May 7 04:54:54 oss07 kernel: RAID5 conf printout:
May 7 04:54:54 oss07 kernel: --- rd:10 wd:8 fd:2
May 7 04:54:54 oss07 kernel: disk 0, o:1, dev:sdu
May 7 04:54:54 oss07 kernel: disk 1, o:1, dev:sdm
May 7 04:54:54 oss07 kernel: disk 2, o:1, dev:sdn
May 7 04:54:54 oss07 kernel: disk 3, o:1, dev:sdo
May 7 04:54:54 oss07 kernel: disk 5, o:1, dev:sdq
May 7 04:54:54 oss07 kernel: disk 6, o:1, dev:sdr
May 7 04:54:54 oss07 kernel: disk 7, o:1, dev:sds
May 7 04:54:55 oss07 kernel: disk 8, o:1, dev:sdt
May 7 04:54:55 oss07 kernel: disk 9, o:1, dev:sdw
May 7 04:54:55 oss07 kernel: md: syncing RAID array md12
#4 the third disk (sdm) failed while resyncing
May 7 09:41:53 oss07 kernel: mptbase: ioc1: LogInfo(0x31080000):
Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000)
May 7 09:41:57 oss07 kernel: mptbase: ioc1: LogInfo(0x31110e00):
Originator={PL}, Code={Reset}, SubCode(0x0e00)
May 7 09:41:59 oss07 last message repeated 24 times
May 7 09:42:04 oss07 kernel: mptbase: ioc1: LogInfo(0x31080000):
Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000)
May 7 09:42:34 oss07 last message repeated 43 times
May 7 09:42:34 oss07 kernel: sd 1:0:11:0: SCSI error: return code = 0x000b0000
May 7 09:42:34 oss07 kernel: end_request: I/O error, dev sdm, sector 1948444160
May 7 09:42:34 oss07 kernel: mptbase: ioc1: LogInfo(0x31080000):
Originator={PL}, Code={SATA NCQ Fail All Commands After Error}, SubCode(0x0000)
May 7 09:42:34 oss07 last message repeated 3 times
May 7 09:42:34 oss07 kernel: raid5:md12: read error not correctable (sector
1948444160 on sdm).
May 7 09:42:34 oss07 kernel: raid5: Disk failure on sdm, disabling device.
Operation continuing on 7 devices
May 7 09:42:34 oss07 kernel: raid5:md12: read error not correctable (sector
1948444168 on sdm).
May 7 09:42:34 oss07 kernel: raid5:md12: read error not correctable (sector
1948444176 on sdm).
... ...
May 7 09:42:49 oss07 kernel: --- rd:10 wd:7 fd:3
May 7 09:42:49 oss07 kernel: disk 0, o:1, dev:sdu
May 7 09:42:49 oss07 kernel: disk 1, o:0, dev:sdm
May 7 09:42:49 oss07 kernel: disk 2, o:1, dev:sdn
May 7 09:42:49 oss07 kernel: disk 3, o:1, dev:sdo
May 7 09:42:49 oss07 kernel: disk 5, o:1, dev:sdq
May 7 09:42:49 oss07 kernel: disk 6, o:1, dev:sdr
May 7 09:42:49 oss07 kernel: disk 7, o:1, dev:sds
May 7 09:42:49 oss07 kernel: disk 8, o:1, dev:sdt
May 7 09:42:49 oss07 kernel: disk 9, o:1, dev:sdw
... ...
May 7 09:42:58 oss07 kernel: RAID5 conf printout:
May 7 09:42:58 oss07 kernel: --- rd:10 wd:7 fd:3
May 7 09:42:58 oss07 kernel: disk 1, o:0, dev:sdm
May 7 09:42:58 oss07 kernel: disk 2, o:1, dev:sdn
May 7 09:42:58 oss07 kernel: disk 3, o:1, dev:sdo
May 7 09:42:58 oss07 kernel: disk 5, o:1, dev:sdq
May 7 09:42:58 oss07 kernel: disk 6, o:1, dev:sdr
May 7 09:42:58 oss07 kernel: disk 7, o:1, dev:sds
May 7 09:42:58 oss07 kernel: disk 8, o:1, dev:sdt
May 7 09:42:58 oss07 kernel: disk 9, o:1, dev:sdw
#5 current md status
[root@oss07 ~]# mdadm --detail /dev/md12
/dev/md12:
Version : 00.90.03
Creation Time : Mon Oct 4 15:30:53 2010
Raid Level : raid6
Array Size : 7814099968 (7452.11 GiB 8001.64 GB)
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Raid Devices : 10
Total Devices : 11
Preferred Minor : 12
Persistence : Superblock is persistent
Intent Bitmap : /mnt/scratch/bitmaps/ost02/bitmap
Update Time : Mon May 7 11:38:51 2012
State : clean, degraded
Active Devices : 7
Working Devices : 8
Failed Devices : 3
Spare Devices : 1
Chunk Size : 128K
UUID : 63eb5b15:294c1354:f0c167bd:f8e81f47
Events : 0.7382
Number Major Minor RaidDevice State
0 0 0 0 removed
1 0 0 1 removed
2 8 208 2 active sync /dev/sdn
3 8 224 3 active sync /dev/sdo
4 0 0 4 removed
5 65 0 5 active sync /dev/sdq
6 65 16 6 active sync /dev/sdr
7 65 32 7 active sync /dev/sds
8 65 48 8 active sync /dev/sdt
9 65 96 9 active sync /dev/sdw
10 8 176 - faulty spare /dev/sdl
11 65 64 - spare /dev/sdu
12 8 240 - faulty spare /dev/sdp
13 8 192 - faulty spare /dev/sdm
Best regards,
Taeyoung Hong
Senior Researcher
Supercomputing Center of KISTI
Kevin Van Maren
2012-May-07 15:38 UTC
[Lustre-discuss] recovery from multiple disks failure on the same md
On May 6, 2012, at 10:13 PM, Tae Young Hong wrote:
> Hi,
> We have run into a terrible situation on our Lustre system.
> An OST (RAID 6: 8+2, 1 spare) had two disk failures at almost the same time.
> While the array was recovering, another disk failed, so the recovery appears
> to have halted and the spare disk that was resyncing fell back into "spare"
> status. (I estimate the resync was more than 95% complete.)
> Right now we have just 7 working disks in this md. Is there any way to
> recover from this situation?
It might be possible, but it is not something I've done. If the array has
not been written to since the drives failed, you might be able to power-cycle the
failed drives (to reset their firmware) and force-re-add them (without a rebuild).
If the array _has_ been modified (most likely), you could write a sector of
0's over the bad sector, which will corrupt just that stripe, then
force-re-add the last failed drive and attempt the rebuild again.
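A rough, untested sketch of what that could look like with mdadm (the device
names and the sector number are taken from the logs above, but the exact
commands are my assumption, not something from this thread; the OST must be
stopped and the array unmounted first):

  # option 1: after power-cycling the failed drives, try to bring them back
  # without a full rebuild; with your external write-intent bitmap in place,
  # --re-add may only replay the regions written since the failure
  mdadm /dev/md12 --re-add /dev/sdl
  # or stop the array and re-assemble it, forcing mdadm to accept the stale
  # metadata on the failed members
  mdadm --stop /dev/md12
  mdadm --assemble --force /dev/md12 /dev/sd[lmnopqrst] /dev/sdw

  # option 2 (array already modified): zero the unreadable sector so a rebuild
  # can get past it (this corrupts only that one stripe), then force the most
  # recently failed drive back in as above
  dd if=/dev/zero of=/dev/sdm bs=512 count=1 seek=1948444160 oflag=direct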
Certainly if you have a support contract, I'd recommend that you get
professional assistance.
Unfortunately, the failure mode you encountered is all too common. Because the
Linux SW RAID code does not read the parity blocks unless there is a problem,
hard drive failures are NOT independent: drives appear to fail more often during
a rebuild than at any other time. The only way to work around this is to
periodically run a "verify" of the MD array.
A verify allows a drive that is failing in the 20% of its space that holds
parity to fail _before_ the data becomes unreadable, rather than _after_.
Don't run it on a degraded array, but it is a good way to ensure that healthy
arrays really are healthy.
"echo check > /sys/block/mdX/md/sync_action" to force a verify.
Parity mis-matches will be reported (not corrected), but drive failures can be
dealt with sooner, rather than letting them stack up. Do "man md" and
see the "sync_action" section.
Also note that Lustre 1.8.7 has a fix to the SW RAID code (corruption when
rebuilding under load). Oracle's release called the patch
md-avoid-corrupted-ldiskfs-after-rebuild.patch, while Whamcloud called it
raid5-rebuild-corrupt-bug.patch.
Kevin
Adrian Ulrich
2012-May-07 19:17 UTC
[Lustre-discuss] recovery from multiple disks failure on the same md
Hi,

> An OST (RAID 6: 8+2, 1 spare) had two disk failures at almost the same time.
> While the array was recovering, another disk failed, so the recovery appears
> to have halted,

So did the md array stop itself on the third disk failure (or at least turn read-only)? If it did, you might be able to get it running again without catastrophic corruption.

This is what I would try (without any warranty; a command sketch follows below):

-> Forget about the two syncing spares.
-> Take the third failed disk and attach it to some PC.
-> Copy as much data as possible to a new spare using dd_rescue (-r might help).
-> Put the drive with the fresh copy (= the good, new drive) into the array and assemble + start it. Use --force if mdadm complains about outdated metadata. (Starting it read-only for now would also be a good idea.)
-> Add a new spare to the array and sync it as fast as possible to get at least one parity disk back.
-> Run 'fsck -n /dev/mdX' to see how badly damaged your filesystem is. If you think fsck can fix the errors (and will not cause more damage), run it without '-n'.
-> Add the second parity disk, sync it, mount the filesystem and pray.

The amount of data corruption will depend on the success of dd_rescue: you are probably lucky if it only failed to read a few sectors.

And I agree with Kevin: if you have a support contract, ask them to fix it. (And if you have enough hardware and time, create a backup of ALL drives in the failed RAID via 'dd' before touching anything!)

I'd also recommend starting periodic scrubbing: we do this once per month at low priority (~5 MB/s) with little impact on the users.

Regards and good luck,
Adrian
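A command-level sketch of the steps above (a sketch only, not a tested procedure;
/dev/sdX is a placeholder for the fresh disk that receives the copy, the other
device names come from the thread, and the OST filesystem stays unmounted
throughout):

  # on a separate machine: salvage the last-failed disk onto the fresh one
  # (-r copies in reverse, which can sometimes recover more around a bad region)
  dd_rescue -r /dev/sdm /dev/sdX

  # back on the OSS: assemble from the seven surviving members plus the copy;
  # --force accepts the slightly stale metadata on the copied disk
  mdadm --assemble --force /dev/md12 /dev/sdX /dev/sdn /dev/sdo /dev/sdq \
        /dev/sdr /dev/sds /dev/sdt /dev/sdw

  # report-only filesystem check before deciding whether to repair
  fsck -n /dev/md12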
Mark Hahn
2012-May-07 19:24 UTC
[Lustre-discuss] recovery from multiple disks failure on the same md
> I'd also recommend starting periodic scrubbing: we do this once per month
> at low priority (~5 MB/s) with little impact on the users.

Yes. And if you think a rebuild might overstress marginal disks, throttling via the dev.raid.speed_limit_max sysctl can help.
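For example (the values below are only illustrative, in KB/s per device; pick
limits that suit your hardware and users):

  # cap md resync/rebuild/check throughput to avoid hammering marginal disks
  sysctl -w dev.raid.speed_limit_max=10000
  # floor, so the rebuild still makes progress under load
  sysctl -w dev.raid.speed_limit_min=1000

  # the same knobs are available under /proc
  echo 10000 > /proc/sys/dev/raid/speed_limit_max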
Tae Young Hong
2012-May-10 09:24 UTC
[Lustre-discuss] recovery from multiple disks failure on the same md
Thank you all for your valuable information. We survived, and about 1 million files survived. At first I wanted to get professional recovery help under our support contract, but it was not possible to reach the right person in time, so we had to do it on our own, roughly following the procedure Adrian described. It still felt risky and we needed good luck; I do not want to do this ever again.

For your information, dd_rescue showed that about 4 MB near the very end of the disk had bad sectors. It took about 20 hours to run over the 1 TB SATA disk; we ran it on an OSS whose load was relatively light.

After inserting the fresh copy into the original OSS in question (oss07), we found that mdadm with "-A --force" could assemble the array with some errors. Its state was "active, degraded, Not Started", and we had to use the following to start it and kick off the resync:

echo "clean" > /sys/block/md12/md/array_state

I didn't know of any other way to start it. On the first try we failed and two disks fell into the faulty state, perhaps because at that time (during a periodic maintenance) we rebooted the partner OSS node (oss08) to patch the Lustre 1.8.5 kernel with the raid5 one-line fix that Kevin mentioned earlier. For the next try I installed the raid5-patched Lustre kernel on oss07 as well and simply power-cycled the JBOD (a J4400) and oss07; this time the resync completed without any error, and e2fsck found that only 2 inodes were stale.

Thank you also for the detailed explanation of why we need periodic scrubbing.

Taeyoung Hong
Senior Researcher
Supercomputing Center, KISTI

On 2012-05-08, at 4:24, Mark Hahn wrote:
>> I'd also recommend starting periodic scrubbing: we do this once per month
>> at low priority (~5 MB/s) with little impact on the users.
>
> Yes. And if you think a rebuild might overstress marginal disks,
> throttling via the dev.raid.speed_limit_max sysctl can help.