Samuel Aparicio
2012-Jul-04 01:09 UTC
[Lustre-discuss] corrupted OSTs on server, advice needed
We had an OST server go down somewhat uncleanly, and it appears we have problems with two of the OSTs. We are running the maintenance release of lustre-2.1.1.

One OST reports the following:

------------
e2fsck -p -j /dev/etherd/e18.21p2 /dev/md141
lustre2-OST0006: Note: if several inode or block bitmap blocks or part of the inode table require relocation, you may wish to try running e2fsck with the '-b 32768' option first. The problem may lie only with the primary block group descriptors, and the backup block group descriptors may be OK.
lustre2-OST0006: Block bitmap for group 960 is not in group. (block 18446744073709551615)
lustre2-OST0006: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)
-----------

I notice that in an old thread something similar happened elsewhere and was recovered with

e2fsck -fp -b 32768 <device>

followed by

e2fsck -fy <device>

Would this be safe to do? An alternative suggested on that thread by Andreas Dilger was to delete the external journal, run e2fsck, and then re-add the journal afterwards.

The second OST reports the following:

-----------
e2fsck -p -j /dev/etherd/e18.21p6 /dev/md142
lustre2-OST0009: External journal does not support this filesystem
-----------

This is strange, because this IS the external journal for this filesystem. Any idea on how to proceed with this one would be gratefully received.

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
675 West 10th, Vancouver V5Z 1L3, Canada.
office: +1 604 675 8200
lab website http://molonc.bccrc.ca