Dear all,
We have run into the same situation as the problem discussed here:
http://lists.lustre.org/pipermail/lustre-discuss/2009-January/009512.html
One OST was set read-only the first time it was remounted after a server
crash:
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1):
ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664
blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: Remounting filesystem read-only
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1):
ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664
blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1):
ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664
blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1):
ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664
blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs error (device sdd1):
ldiskfs_mb_check_ondisk_bitmap: on-disk bitmap for group 32486 corrupted: 8664
blocks free in bitmap, 13310 - in gd
Apr 16 17:40:31 boss27 kernel:
Apr 16 17:40:31 boss27 kernel: LustreError:
6158:0:(fsfilt-ldiskfs.c:1288:fsfilt_ldiskfs_write_record()) can't
start transaction for 37 blocks (128 bytes)
Apr 16 17:40:31 boss27 kernel: LustreError:
6240:0:(fsfilt-ldiskfs.c:1288:fsfilt_ldiskfs_write_record()) can't
start transaction for 37 blocks (128 bytes)
Apr 16 17:40:31 boss27 kernel: LustreError:
6166:0:(fsfilt-ldiskfs.c:470:fsfilt_ldiskfs_brw_start()) can't get
handle for 555 credits: rc = -30
This OST has been unwritable since last Friday (the rc = -30 above is -EROFS,
consistent with the read-only remount), and its disk usage is now quite
different from that of the neighbouring OSTs:
[root@boss27 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       4.1T  3.8T  105G  98% /lustre/ost1
/dev/sda2       4.1T  3.7T  158G  96% /lustre/ost2
/dev/sdb1       4.1T  3.8T  124G  97% /lustre/ost3
/dev/sdb2       4.1T  3.8T   84G  98% /lustre/ost4
/dev/sdc1       4.1T  3.7T  128G  97% /lustre/ost5
/dev/sdc2       4.1T  3.7T  131G  97% /lustre/ost6
/dev/sdd1       4.1T  3.3T  591G  85% /lustre/ost7
/dev/sdd2       4.1T  3.8T   52G  99% /lustre/ost8
Is it possible to fix this problem without running lfsck on the whole file
system? Our system is about 500 TB (93% full). We are running Lustre 1.8.1.1,
and files are striped with stripe count = 1.
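
For reference, the approach we are considering is sketched below. It is not
yet attempted: the OSC device number on the MDS is a placeholder to be read
from "lctl dl", and e2fsck would have to be the Lustre-patched e2fsprogs that
understands ldiskfs. Is something like this safe, or is a full lfsck
unavoidable?

# On the MDS: deactivate the OSC for the affected OST so no new
# objects are allocated on it (device number comes from "lctl dl").
lctl dl | grep osc
lctl --device <osc_devno> deactivate

# On the OSS (boss27): unmount only this OST and check only this device.
umount /lustre/ost7
e2fsck -fy /dev/sdd1

# Remount the OST and reactivate it on the MDS.
mount -t lustre /dev/sdd1 /lustre/ost7
lctl --device <osc_devno> activate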
Best Regards
Lu Wang
--------------------------------------------------------------
Computing Center
IHEP                          Office: Computing Center, 123
19B Yuquan Road               Tel: (+86) 10 88236012-607
P.O. Box 918-7                Fax: (+86) 10 8823 6839
Beijing 100049, China         Email: Lu.Wang@ihep.ac.cn
--------------------------------------------------------------