Alexander Bugl
2010-Oct-26 19:42 UTC
[Lustre-discuss] system disk with external journals for OSTs formatted
Hi,

we had an accident with a Sun Fire X4540 "Thor" system with 48 HDDs:

The first two disks, sda and sdb, contain several partitions: one for the / file system, one for swap (not used), and 5 small partitions used as external journals for the OSTs, which reside on the 46 other HDDs.

[root@soss10 ~]# fdisk -l /dev/sda /dev/sdb

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1        6527    52428096   fd  Linux raid autodetect
/dev/sda2            6528       10704    33551752+  fd  Linux raid autodetect
/dev/sda3           10705      121601   890780152+   5  Extended
/dev/sda5           10705       10953     2000061   fd  Linux raid autodetect
/dev/sda6           10954       11202     2000061   fd  Linux raid autodetect
/dev/sda7           11203       11451     2000061   fd  Linux raid autodetect
/dev/sda8           11452       11700     2000061   fd  Linux raid autodetect
/dev/sda9           11701       11949     2000061   fd  Linux raid autodetect

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1        6527    52428096   fd  Linux raid autodetect
/dev/sdb2            6528       10704    33551752+  fd  Linux raid autodetect
/dev/sdb3           10705      121601   890780152+   5  Extended
/dev/sdb5           10705       10953     2000061   fd  Linux raid autodetect
/dev/sdb6           10954       11202     2000061   fd  Linux raid autodetect
/dev/sdb7           11203       11451     2000061   fd  Linux raid autodetect
/dev/sdb8           11452       11700     2000061   fd  Linux raid autodetect
/dev/sdb9           11701       11949     2000061   fd  Linux raid autodetect

The md devices are:

md14 : active raid6 sdw[0] sdav[9] sdan[8] sdaf[7] sdx[6] sdp[5] sdh[4] sdau[3] sdam[2] sdae[1]
      7814099968 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]

md13 : active raid6 sdak[0] sdo[9] sdg[8] sdat[7] sdal[6] sdad[5] sdv[4] sdn[3] sdf[2] sdas[1]
      7814099968 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]

md12 : active raid6 sdd[0] sdac[9] sdu[8] sdm[7] sde[6] sdar[5] sdaj[4] sdab[3] sdt[2] sdl[1]
      7814099968 blocks level 6, 64k chunk, algorithm 2 [10/10] [UUUUUUUUUU]

md11 : active raid6 sdah[0] sdaq[7] sdai[6] sdaa[5] sds[4] sdk[3] sdc[2] sdap[1]
      5860574976 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

md10 : active raid6 sdi[0] sdz[7] sdao[6] sdag[5] sdy[4] sdr[3] sdq[2] sdj[1]
      5860574976 blocks level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]

md1 : active raid1 sdb2[1] sda2[0]
      33551680 blocks [2/2] [UU]

md20 : active raid1 sdb5[1] sda5[0]
      1999936 blocks [2/2] [UU]

md21 : active raid1 sdb6[1] sda6[0]
      1999936 blocks [2/2] [UU]

md22 : active raid1 sdb7[1] sda7[0]
      1999936 blocks [2/2] [UU]

md23 : active raid1 sdb8[1] sda8[0]
      1999936 blocks [2/2] [UU]

md24 : active raid1 sdb9[1] sda9[0]
      1999936 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
      52428032 blocks [2/2] [UU]

The original OSTs had been created using a command like:

mkfs.lustre --ost --fsname=${FSNAME} --mgsnode=${MGSNODE}@o2ib \
    --reformat --mkfsoptions="-m 0 -J device=/dev/md20" \
    --param ost.quota_type=ug /dev/md10 &

(and the pairs md21/md11, md22/md12, ..., respectively)

By accident we started a fresh installation, which could not be aborted fast enough: the partition information on sda and sdb was erased. The other 46 disks should not have been harmed, though.

We then started a reinstallation which formatted only the first 2 partitions and recreated the partition layout on sda and sdb. All of the md devices resynced without problems.

When we now try to mount any of the 5 OSTs, we get the following error:

[root@soss10 ~]# mount /dev/md14
mount.lustre: mount /dev/md14 at /lustre/ost4 failed: Invalid argument
This may have multiple causes.
Are the mount options correct?
Check the syslog for more info.

syslog says:

Oct 26 21:34:55 soss10 kernel: LDISKFS-fs error (device md14): ldiskfs_check_descriptors: Block bitmap for group 1920 not in group (block 268482810)!
Oct 26 21:34:55 soss10 kernel: LDISKFS-fs: group descriptors corrupted!
Oct 26 21:34:55 soss10 kernel: LustreError: 10719:0: (obd_mount.c:1292:server_kernel_mount()) premount /dev/md14:0x0 ldiskfs failed: -22, ldiskfs2 failed: -19. Is the ldiskfs module available?
Oct 26 21:34:56 soss10 kernel: LustreError: 10719:0: (obd_mount.c:1618:server_fill_super()) Unable to mount device /dev/md14: -22
Oct 26 21:34:56 soss10 kernel: LustreError: 10719:0: (obd_mount.c:2050:lustre_fill_super()) Unable to mount (-22)

Trying to mount the partition as ldiskfs does not work, either:

[root@soss10 ~]# mount -t ldiskfs /dev/md14 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/md14,
       missing codepage or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so

syslog only says:

Oct 26 21:35:54 soss10 kernel: LDISKFS-fs error (device md14): ldiskfs_check_descriptors: Block bitmap for group 1920 not in group (block 268482810)!
Oct 26 21:35:54 soss10 kernel: LDISKFS-fs: group descriptors corrupted!

Trying to run e2fsck -n yields:

[root@soss10 ~]# e2fsck -n /dev/md10
e2fsck 1.41.10.sun2 (24-Feb-2010)
e2fsck: Group descriptors look bad... trying backup blocks...
Error writing block 1 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 2 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 3 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 4 (Attempt to write block from filesystem resulted in short write). Ignore error? no
... [continues up to block 344]
One or more block group descriptor checksums are invalid. Fix? no
Group descriptor 0 checksum is invalid. IGNORED.
Group descriptor 1 checksum is invalid. IGNORED.
Group descriptor 2 checksum is invalid. IGNORED.
Group descriptor 3 checksum is invalid. IGNORED.
... [continues up to Group descriptor 44712]
squall-OST0019 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes

(the rest of e2fsck is still running ...)

Question: What could be the problem? I thought that no data on the OSTs or inside the journal partitions should have been overwritten. Is there any chance to repair these problems without data loss?

Thank you in advance for any suggestions about how to continue ...

With regards, Alex
--
Alexander Bugl, Central IT Services, ZMAW
Max Planck Institute for Meteorology
Bundesstrasse 53, D-20146 Hamburg, Germany
tel +49-40-41173-351, fax -298, room PE048
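A possible first diagnostic step, sketched under assumptions: the pairing of OST and journal devices (md10/md20 through md14/md24) is taken from the post, while the helper name `journal_for` is made up for illustration. The loop only prints read-only `dumpe2fs -h` commands; it does not run them. On an OST the header should list the journal UUID and device the filesystem expects, and on the journal device it should show a journal superblock if one is still there.

```shell
#!/bin/sh
# Hypothetical helper: map an OST device to its external journal device,
# following the pairing described in the post
# (md10 <-> md20, md11 <-> md21, ..., md14 <-> md24).
journal_for() {
    n=${1#/dev/md}              # "/dev/md14" -> "14"
    echo "/dev/md$((n + 10))"   # "14" -> "/dev/md24"
}

# Print (do not execute) read-only inspection commands for each pair.
for ost in /dev/md10 /dev/md11 /dev/md12 /dev/md13 /dev/md14; do
    jdev=$(journal_for "$ost")
    echo "dumpe2fs -h $ost | grep -i journal   # what the OST expects"
    echo "dumpe2fs -h $jdev                    # what the journal device holds now"
done
```

If the journal device no longer carries a journal superblock, that would be consistent with the installer having overwritten it, but this sketch cannot prove it.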
Wojciech Turek
2010-Oct-26 19:52 UTC
[Lustre-discuss] system disk with external journals for OSTs formatted
Hi Alex,

So if I understand you correctly, you have accidentally destroyed your external journals, so it seems that your OSTs are missing their journals. Maybe the fix will be to recreate the journals on the OSTs.

regards,
Wojciech
Alexander Bugl
2010-Oct-27 07:02 UTC
[Lustre-discuss] system disk with external journals for OSTs formatted
Hi!

We now have a full e2fsck -fn log. Perhaps one of you could estimate how severe the problems reported by e2fsck are?

> Trying to run e2fsck -n yields:

[root@soss10 ~]# e2fsck -fn /dev/md14
e2fsck 1.41.10.sun2 (24-Feb-2010)
e2fsck: Group descriptors look bad... trying backup blocks...
Error writing block 1 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 2 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 3 (Attempt to write block from filesystem resulted in short write). Ignore error? no
[... and all integers between]
Error writing block 463 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 464 (Attempt to write block from filesystem resulted in short write). Ignore error? no
One or more block group descriptor checksums are invalid. Fix? no
Group descriptor 0 checksum is invalid. IGNORED.
Group descriptor 1 checksum is invalid. IGNORED.
Group descriptor 2 checksum is invalid. IGNORED.
Group descriptor 3 checksum is invalid. IGNORED.
[... and all integers between]
Group descriptor 59615 checksum is invalid. IGNORED.
Group descriptor 59616 checksum is invalid. IGNORED.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: +(1394212864--1394213377) +(1394245632--1394246145) +(1394278400--1394278913) +(1394311168--1394311681) +(1394343936--1394344449) [... this line is veeeery long ...]
Fix? no
Free blocks count wrong for group #0 (31223, counted=1625). Fix? no
Free blocks count wrong for group #1 (31229, counted=395). Fix? no
Free blocks count wrong for group #2 (32254, counted=848). Fix? no
Free blocks count wrong for group #3 (31229, counted=1998). Fix? no
[... and many groups between]
Free blocks count wrong for group #59615 (32254, counted=1088). Fix? no
Free blocks count wrong for group #59616 (27390, counted=26878). Fix? no
Free blocks count wrong (752943653, counted=753681334). Fix? no
Inode bitmap differences: -(8193--8197) -(8200--8201) -(8206--8209) -(8211--8217) -(8219--8222) -8224 -(8226--8228) -8231 ... [... this line is even longer than the Block bitmap differences ...]
Fix? no
Free inodes count wrong for group #0 (8181, counted=8177). Fix? no
Free inodes count wrong for group #1 (8192, counted=4382). Fix? no
Free inodes count wrong for group #3 (8192, counted=3602). Fix? no
[... and many groups more]
Free inodes count wrong for group #7418 (8192, counted=2878). Fix? no
Free inodes count wrong for group #19795 (8192, counted=8189). Fix? no
Directories count wrong for group #19795 (0, counted=1). Fix? no
Free inodes count wrong (488206819, counted=485877645). Fix? no

squall-OST001d: ***** FILE SYSTEM WAS MODIFIED *****
squall-OST001d: ********** WARNING: Filesystem still has errors **********
squall-OST001d: 175645/488382464 files (25.3% non-contiguous), 1200581339/1953524992 blocks
Error writing block 1 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 2 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 3 (Attempt to write block from filesystem resulted in short write). Ignore error? no
[... and all integers between]
Error writing block 463 (Attempt to write block from filesystem resulted in short write). Ignore error? no
Error writing block 464 (Attempt to write block from filesystem resulted in short write). Ignore error? no

The full log is 24 MB in size, too big for a mailing list. It is available online at <ftp://ftp.zmaw.de/outgoing/alex/e2fsck-md14.txt>.

Thanks in advance, with regards, Alex
--
Alexander Bugl, Central IT Services, ZMAW
Max Planck Institute for Meteorology
Bundesstrasse 53, D-20146 Hamburg, Germany
tel +49-40-41173-351, fax -298, room PE048
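A log of this size is easier to judge once the messages are counted by class. The following is only a sketch: the function name `summarize_fsck_log` is made up, the patterns are copied from the excerpt above, and it assumes each message sits on its own line of the saved log.

```shell
#!/bin/sh
# Count how often each class of e2fsck message occurs in a saved log.
# $1 = path to the log file (for example a local copy of the full log).
summarize_fsck_log() {
    for pat in \
        'Error writing block' \
        'Group descriptor [0-9]* checksum is invalid' \
        'Free blocks count wrong for group' \
        'Free inodes count wrong for group'
    do
        # grep -c prints the number of matching lines (0 if none).
        printf '%s %s\n' "$(grep -c -- "$pat" "$1")" "$pat"
    done
}

# Usage (path is an example): summarize_fsck_log ./e2fsck-md14.txt
```

Counts dominated by bitmap and free-count mismatches, with nothing reported in passes 1 to 4, would point at stale summary data rather than damaged file contents, which matches the assessment given later in the thread.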
Andreas Dilger
2010-Oct-27 12:46 UTC
[Lustre-discuss] system disk with external journals for OSTs formatted
On 2010-10-27, at 15:02, Alexander Bugl wrote:
>> Trying to run e2fsck -n yields:
>
> [root@soss10 ~]# e2fsck -fn /dev/md14
> e2fsck 1.41.10.sun2 (24-Feb-2010)
> e2fsck: Group descriptors look bad... trying backup blocks...
> Error writing block 1 (Attempt to write block from filesystem resulted in short write). Ignore error? no
> Error writing block 2 (Attempt to write block from filesystem resulted in short write). Ignore error? no
> Error writing block 3 (Attempt to write block from filesystem resulted in short write). Ignore error? no
> [... and all integers between]
> Error writing block 463 (Attempt to write block from filesystem resulted in short write). Ignore error? no
> Error writing block 464 (Attempt to write block from filesystem resulted in short write). Ignore error? no

I don't know what these errors are, possibly trying to write into the broken journal device? The rest of the filesystem errors are very minor. You should probably delete the journal device via "tune2fs -O ^has_journal", run a full "e2fsck -f" and then recreate the journal with "tune2fs -J size=400".

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
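The three steps suggested here can be sketched as a dry-run script. Assumptions: the `run` wrapper and the function name `replace_journal` are made up for illustration, `/dev/md14` is just the device from the log above, and the last step is written with `-J size=400`, since `-J` is the tune2fs option that accepts journal parameters. Nothing is executed unless `RUN=1` is set.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set RUN=1 only after reviewing the output (needs root and e2fsprogs).
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Hypothetical helper: $1 = OST device.
replace_journal() {
    ost=$1
    # 1. Drop the reference to the lost journal.
    run tune2fs -O '^has_journal' "$ost" || return 1
    # 2. Full check. e2fsck exits with 1 or 2 when it fixed errors,
    #    so only a status of 4 or more is treated as failure here.
    run e2fsck -f "$ost"
    rc=$?
    [ "$rc" -lt 4 ] || return "$rc"
    # 3. Create a new journal (size in megabytes).
    run tune2fs -J size=400 "$ost"
}

replace_journal /dev/md14
```

On a filesystem that is flagged as having errors, the first step may need an additional `-f`. To get an external journal again, as in the original setup, the last step would presumably use `-J device=...` with a freshly formatted journal device instead of `size=`.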
Daniel Kobras
2010-Oct-27 13:45 UTC
[Lustre-discuss] system disk with external journals for OSTs formatted
On Wed, Oct 27, 2010 at 08:46:41PM +0800, Andreas Dilger wrote:
> I don't know what these errors are, possibly trying to write into the broken
> journal device? The rest of the filesystem errors are very minor. You should
> probably delete the journal device via "tune2fs -O ^has_journal", run a full
> "e2fsck -f" and then recreate the journal with "tune2fs -J size=400".

On a filesystem with errors, you'll have to use "tune2fs -f -O ^has_journal" to force removal of the journal. At least that's what the man page says. When I had to pull such a stunt a while ago, though, tune2fs refused to remove the journal even when given the force flag. It might work as documented now.

Otherwise, my workaround was to retrieve the journal UUID from the main filesystem with tune2fs -l, then create a new external journal with this UUID (mke2fs -O journal_dev -U <UUID> ...). At this point I was able to run e2fsck on the main filesystem to get back to a clean state. Finally, I could remove and add back the journal with normal tune2fs calls to get it properly linked back to the filesystem.

Regards,
Daniel.
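The workaround described above might look roughly like this as a dry-run script. Everything beyond the commands named in the message is an assumption: the helper names, the `run` wrapper, the device pair `/dev/md14` and `/dev/md24`, and the placeholder UUID are illustrative only, and the mke2fs options elided with "..." in the message are not filled in.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set RUN=1 only after reviewing the output (needs root and e2fsprogs).
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

# Extract the journal UUID from "tune2fs -l" output read on stdin.
journal_uuid() {
    awk -F': *' '/^Journal UUID/ { print $2 }'
}

# Hypothetical helper: $1 = OST device, $2 = journal device, $3 = UUID.
rebuild_external_journal() {
    ost=$1 jdev=$2 uuid=$3
    run mke2fs -O journal_dev -U "$uuid" "$jdev"  # new journal, old UUID
    run e2fsck -f "$ost"                          # bring the OST back to clean
    run tune2fs -O '^has_journal' "$ost"          # unlink the journal ...
    run tune2fs -J device="$jdev" "$ost"          # ... and link it again
}

# Example with a made-up UUID; in practice something like
#   uuid=$(tune2fs -l /dev/md14 | journal_uuid)
rebuild_external_journal /dev/md14 /dev/md24 \
    00000000-1111-2222-3333-444444444444
```

An external journal has to use the same block size as the filesystem it serves, so the mke2fs call may need a matching `-b` value; that is left out here because the message does not state it.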
Ben Evans
2010-Nov-01 19:49 UTC
[Lustre-discuss] system disk with external journals for OSTs formatted
If removing the journal by using tune2fs does not work (and others have mentioned cases where it doesn't), the following should do the trick (but take much longer):

debugfs -w {OST device name (/dev/sda, etc.)}
debugfs: features
Filesystem features: has_journal ext_attr resize_inode dir_index ...
(you need to make sure that 'has_journal' is listed as part of the features)
debugfs: feature ^has_journal
debugfs: features
Filesystem features: ext_attr resize_inode dir_index ...
(the list is the same as above, but without has_journal)
debugfs: quit

You then need to recreate your journals as others have said and add them back into the FS.
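The same debugfs session can probably be scripted with debugfs's `-R` option, which runs a single request non-interactively. A minimal sketch, assuming made-up helper names (`has_feature`, `clear_journal_flag`) and that the `features` request prints a single "Filesystem features:" line as shown above:

```shell
#!/bin/sh
# Succeeds if feature $2 appears in the "Filesystem features:" line $1.
has_feature() {
    case " ${1#Filesystem features:} " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: clear has_journal on device $1, but only if set.
# Defined here, not called; it writes to the device when invoked.
clear_journal_flag() {
    ost=$1
    line=$(debugfs -R features "$ost" 2>/dev/null)
    if has_feature "$line" has_journal; then
        debugfs -w -R 'feature ^has_journal' "$ost"
    else
        echo "$ost: has_journal is not set, nothing to do"
    fi
}

# Usage (as root, on an unmounted device): clear_journal_flag /dev/md14
```

Checking the feature list first keeps the write-enabled debugfs call from running on a device where the flag is already gone.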
Alexander Bugl
2010-Nov-01 21:13 UTC
[Lustre-discuss] system disk with external journals for OSTs formatted
Hi!

On Wednesday 27 October 2010 14:46:41 Andreas Dilger wrote:
> On 2010-10-27, at 15:02, Alexander Bugl wrote:
> >> Trying to run e2fsck -n yields:
> > [root@soss10 ~]# e2fsck -fn /dev/md14
> > e2fsck 1.41.10.sun2 (24-Feb-2010)
> > e2fsck: Group descriptors look bad... trying backup blocks...
> > Error writing block 1 (Attempt to write block from filesystem resulted in
> > short write). Ignore error? no
> > Error writing block 2 (Attempt to write block from filesystem resulted in
> > short write). Ignore error? no
> > Error writing block 3 (Attempt to write block from filesystem resulted in
> > short write). Ignore error? no
> > [... and all integers between]
> > Error writing block 463 (Attempt to write block from filesystem resulted
> > in short write). Ignore error? no
> > Error writing block 464 (Attempt to write block from filesystem resulted
> > in short write). Ignore error? no
>
> I don't know what these errors are, possibly trying to write into the
> broken journal device? The rest of the filesystem errors are very minor.
> You should probably delete the journal device via "tune2fs -O
> ^has_journal", run a full "e2fsck -f" and then recreate the journal with
> "tune2fs -J size=400".
>
> Cheers, Andreas

Thanks for all the tips. We had started the e2fsck some minutes before Andreas' mail arrived, without deleting the journal:

# e2fsck -fp /dev/md14
squall-OST001d: Note: if several inode or block bitmap blocks or part
of the inode table require relocation, you may wish to try
running e2fsck with the '-b 32768' option first. The problem
may lie only with the primary block group descriptors, and
the backup block group descriptors may be OK.

squall-OST001d: Block bitmap for group 1920 is not in group. (block 268482810)

squall-OST001d: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)

# e2fsck -fp -b 32768 /dev/md14
squall-OST001d: One or more block group descriptor checksums are invalid. FIXED.
squall-OST001d: Group descriptor 0 checksum is invalid.

squall-OST001d: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)

# e2fsck -fy /dev/md14
e2fsck 1.41.10.sun2 (24-Feb-2010)
One or more block group descriptor checksums are invalid. Fix? yes
Group descriptor 0 checksum is invalid. FIXED.
Group descriptor 1 checksum is invalid. FIXED.
Group descriptor 2 checksum is invalid. FIXED.
[... continuing]

Luckily, the file system checks on the OSTs finished, and the OSTs could be mounted both as ldiskfs and as lustre. It looks like no file has been mangled or deleted: we did not find any problems after careful checking, and our users did not report any, either.

Next time we will know how to remove and re-add the external journals, so thank you again for all the tips.

With regards, Alex
--
Alexander Bugl, Central IT Services, ZMAW
Max Planck Institute for Meteorology
Bundesstrasse 53, D-20146 Hamburg, Germany
tel +49-40-41173-351, fax -298, room PE048
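The escalation above, from `-fp` to a manual `-fy` run, follows from how e2fsck reports its result. Its exit status is a bit mask documented in the e2fsck man page, and the preen mode gives up with bit 4 set when it meets something it will not fix on its own. A small decoder, with a made-up function name and made-up labels for the bits, could look like this:

```shell
#!/bin/sh
# Decode e2fsck's exit status, which is a bit mask (see e2fsck(8)).
# $1 = exit status; prints a space-separated list of conditions.
fsck_status() {
    rc=$1
    if [ "$rc" -eq 0 ]; then
        echo "clean"
        return 0
    fi
    msg=""
    [ $((rc & 1)) -ne 0 ]   && msg="$msg errors-corrected"
    [ $((rc & 2)) -ne 0 ]   && msg="$msg reboot-needed"
    [ $((rc & 4)) -ne 0 ]   && msg="$msg errors-left-uncorrected"
    [ $((rc & 8)) -ne 0 ]   && msg="$msg operational-error"
    [ $((rc & 16)) -ne 0 ]  && msg="$msg usage-error"
    [ $((rc & 32)) -ne 0 ]  && msg="$msg cancelled"
    [ $((rc & 128)) -ne 0 ] && msg="$msg shared-library-error"
    echo "${msg# }"
}

# Usage:
#   e2fsck -fp /dev/md14; fsck_status $?
```

A result of errors-left-uncorrected after a `-p` run is the scripted equivalent of the "RUN fsck MANUALLY" message shown above.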