We have come across two instances of what may be e2fsck bugs. The situation arose while trying to repair some OSTs that suffered outages. The system is running lustre-2.1.2 (latest maintenance release); e2fsprogs is 1.42.3-wc3.

Case 1: e2fsck reports the following on an OST:

    e2fsck 1.42.3.wc3 (15-Aug-2012)
    lustre-OST0001: recovering journal
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Entry '5753172' in /O/0/d20 (31064071) has deleted/unused inode 6178421.  Clear? yes
    Entry '5753173' in /O/0/d21 (31064072) has deleted/unused inode 6178422.  Clear? yes
    Entry '5753175' in /O/0/d23 (31096834) has deleted/unused inode 6178424.  Clear? yes
    Entry '5753174' in /O/0/d22 (31096833) has deleted/unused inode 6178423.  Clear? yes
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information

    lustre-OST0001: ***** FILE SYSTEM WAS MODIFIED *****
    lustre-OST0001: 3799906/32431488 files (1.2% non-contiguous), 2863449917/8302436879 blocks

When the disk is remounted on the OSS, however, the following appears after a short interval:

    Lustre: lustre-OST0001: sending delayed replies to recovered clients
    Lustre: lustre-OST0001: received MDS connection from 10.9.89.51 at tcp
    Lustre: Skipped 1 previous similar message
    LDISKFS-fs error (device etherd!e9.0): ldiskfs_lookup: deleted inode referenced: 6178422
    Aborting journal on device etherd!e9.24p2.
    LDISKFS-fs (etherd!e9.0): Remounting filesystem read-only
    LustreError: 14555:0:(filter.c:1506:filter_fid2dentry()) lustre-OST0001: object 5753173:0 lookup error: rc -5
    LustreError: 14555:0:(filter.c:3129:__filter_oa2dentry()) filter_setattr error looking up object: 5753173:0
    LustreError: 14551:0:(llog_cat.c:485:llog_cat_process_thread()) llog_cat_process() failed -5

The inode in the kernel error (6178422) is the one e2fsck reported for entry '5753173' in /O/0/d21, and the failing Lustre object is 5753173:0, so it seems the dangling entry has not been fixed even though e2fsck answered "yes" to clearing it. It would appear we have no way to fix this disk in its current state: e2fsck will not rectify the issue.
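For anyone who wants to check the same correlation on their own logs, here is a minimal Python sketch that cross-references the inodes e2fsck offered to clear against the inodes the kernel later complained about. It assumes the exact message wording shown in the two logs above; the function names `cleared_entries` and `still_referenced` are made up for this illustration and are not part of e2fsprogs or Lustre.

```python
import re

# Patterns assume the message wording shown in the logs above.
# '+ tolerates doubled quote characters left behind by some mail archives.
FSCK_ENTRY = re.compile(
    r"Entry '+(?P<name>\d+)'+ in (?P<dir>\S+) \((?P<dir_ino>\d+)\) "
    r"has deleted/unused inode (?P<ino>\d+)\."
)
KERNEL_ERR = re.compile(r"deleted inode referenced: (?P<ino>\d+)")


def cleared_entries(fsck_output):
    """Map inode number -> (directory, entry name) for entries e2fsck offered to clear."""
    return {
        int(m["ino"]): (m["dir"], m["name"])
        for m in FSCK_ENTRY.finditer(fsck_output)
    }


def still_referenced(fsck_output, kernel_log):
    """Return entries that e2fsck reported clearing but the kernel still tripped over."""
    cleared = cleared_entries(fsck_output)
    hits = {int(m["ino"]) for m in KERNEL_ERR.finditer(kernel_log)}
    return {ino: cleared[ino] for ino in sorted(hits) if ino in cleared}


# Sample input copied from the logs in this message.
fsck_output = """
Entry '5753172' in /O/0/d20 (31064071) has deleted/unused inode 6178421.  Clear? yes
Entry '5753173' in /O/0/d21 (31064072) has deleted/unused inode 6178422.  Clear? yes
Entry '5753175' in /O/0/d23 (31096834) has deleted/unused inode 6178424.  Clear? yes
Entry '5753174' in /O/0/d22 (31096833) has deleted/unused inode 6178423.  Clear? yes
"""
kernel_log = """
LDISKFS-fs error (device etherd!e9.0): ldiskfs_lookup: deleted inode referenced: 6178422
LDISKFS-fs (etherd!e9.0): Remounting filesystem read-only
"""

print(still_referenced(fsck_output, kernel_log))
# prints {6178422: ('/O/0/d21', '5753173')}
```

On the logs from this report, the only match is inode 6178422 under /O/0/d21, whose entry name 5753173 is also the object ID in the LustreError lines.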
Is this a bug, or a feature of a terminally damaged disk?

Case 2: e2fsck of a disk that was cleanly unmounted but came back up with errors reports some inodes with multiply-claimed blocks. However, e2fsck reports the following internal error when trying to delete them:

    File /O/0/d1/4921697 (inode #14123014, mod time Wed Aug 15 10:45:12 2012)
      has 666 multiply-claimed block(s), shared with 1 file(s):
            /O/0/d11 (inode #30900230, mod time Thu Aug 16 17:49:45 2012)
    Delete file? yes
    delete_file_block: internal error: can't find dup_blk for 7910459945

    File ??? (inode #14123015, mod time Wed Aug 15 10:27:33 2012)
      has 648 multiply-claimed block(s), shared with 1 file(s):
            /O/0/d12 (inode #30900231, mod time Thu Aug 16 17:49:45 2012)
    Delete file? yes
    delete_file_block: internal error: can't find dup_blk for 7910459968

    File ??? (inode #14123016, mod time Wed Aug 15 10:45:12 2012)
      has 657 multiply-claimed block(s), shared with 1 file(s):
            /O/0/d13 (inode #30900232, mod time Thu Aug 16 17:49:45 2012)
    Delete file? yes
    delete_file_block: internal error: can't find dup_blk for 7910459957

Professor Samuel Aparicio BM BCh PhD FRCPath
Nan and Lorraine Robertson Chair UBC/BC Cancer Agency
675 West 10th, Vancouver V5Z 1L3, Canada.
office: +1 604 675 8200
lab website http://molonc.bccrc.ca