Werner Dilling
2011-Apr-06 11:26 UTC
[Lustre-discuss] e2fsck and related errors during recovering
Hello,

after a crash of our Lustre system (1.6.4) we are having problems repairing the filesystem. Running the 1.6.4 e2fsck failed on the MDS filesystem, so we tried the latest 1.8 version, which succeeded. But mounting the MDS as an ldiskfs filesystem then failed with the standard error message: bad superblock on ....

We tried to get more information. The command file -s -L /dev/.... reported "ext2 filesystem" for the MDS, whereas it reports ext3 for all OST filesystems.

We were able to produce the MDS database that lfsck needs. But using this database to create the OST databases failed with the error message: error getting mds_hdr (large number:8) in /tmp/msdb: Cannot allocate memory ..

So I assume the msdb is in bad shape, and my question is how we can proceed. I assume we have to create a correct version of the MDS filesystem, but we do not know how to do this. Any help and information is appreciated.

Thanks
w.dilling
Andreas Dilger
2011-Apr-06 18:29 UTC
[Lustre-discuss] e2fsck and related errors during recovering
Having the actual error messages makes this kind of problem much easier to solve.

At a guess, if the journal was removed by e2fsck you can re-add it with "tune2fs -J size=400 /dev/{mdsdev}".

As for lfsck, if you still need to run it, you need to make sure the same version of e2fsprogs is on all OSTs and MDS.

Cheers, Andreas
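A missing journal would also be consistent with file -s reporting "ext2 filesystem", since the ext2/ext3 distinction depends on the has_journal feature flag. The following is a minimal sketch of checking that flag before re-adding the journal. It assumes dumpe2fs from e2fsprogs is installed; has_journal is a hypothetical helper name, and the device path remains the placeholder used in the reply.

```shell
#!/bin/sh
# has_journal: reads `dumpe2fs -h` output on stdin and succeeds only if the
# "Filesystem features:" line lists has_journal. Hypothetical helper, written
# to read stdin so it can be tried without touching a real device.
has_journal() {
    features=$(grep '^Filesystem features:')
    # Pad with spaces so the flag is matched as a whole word.
    case " ${features#*:} " in
        *" has_journal "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed usage on the MDS node (device name is a placeholder):
#   if dumpe2fs -h /dev/{mdsdev} 2>/dev/null | has_journal; then
#       echo "journal present, nothing to re-add"
#   else
#       tune2fs -J size=400 /dev/{mdsdev}
#   fi
```

Checking first avoids running tune2fs against a filesystem whose mount failure has some other cause.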
Larry
[Lustre-discuss] e2fsck and related errors during recovering
Would updating e2fsprogs to the newest version help? I once had a problem during e2fsck, and after updating e2fsprogs it was fine.
Andreas Dilger
2011-Apr-07 20:01 UTC
[Lustre-discuss] e2fsck and related errors during recovering
Yes, upgrading to the latest e2fsprogs is generally safe.

Cheers, Andreas

On 2011-04-06, at 8:18 PM, Larry <tsrjzq at gmail.com> wrote:
> Is it helpful updating the e2fsprogs to the newest version? I have
> ever had a problem during e2fsck, after updating the e2fsprogs, it's
> ok.
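When upgrading, the earlier advice about running the same e2fsprogs version on the MDS and all OSTs still applies. As a rough sketch, the check could be scripted as below. The helper name all_same_version, the host names, and the use of ssh are assumptions for illustration and do not come from the thread.

```shell
#!/bin/sh
# all_same_version: reads lines of "<host> <version string>" on stdin and
# succeeds only if every host reports an identical version string.
# Hypothetical helper; empty input is treated as a failure.
all_same_version() {
    awk '{ $1 = ""; seen[$0] = 1 }
         END { n = 0; for (v in seen) n++; exit (n == 1 ? 0 : 1) }'
}

# Assumed usage (mds1, oss1, oss2 are made-up host names; e2fsck -V prints
# its version banner on stderr, hence the redirect):
#   for h in mds1 oss1 oss2; do
#       printf '%s %s\n' "$h" "$(ssh "$h" 'e2fsck -V 2>&1 | head -n 1')"
#   done | all_same_version || echo "e2fsprogs versions differ"
```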