I've got an approximately 100GB ext3 FS which we recently sized down from 300GB using e2fsadm (with the disc offline, obviously). I noticed the following in dmesg the other day:

EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14827639
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14041793
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14827672
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14827754
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14827752
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14827753
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 15074206
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14205567
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 29573367
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 13647877
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 15253505
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 15106867
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 15073284
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 15140401
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 37093790
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 15140403
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 15140402

I took this disc offline and ran fsck twice. It found a few errors the first time and none the second. I then rebooted the server, but the errors returned. The obvious next step would seem to be to build a new file-system and tar copy the data. Does it seem reasonable that the shrink could have corrupted the inode table? There is a lot going on with this setup (could be LVM, ext3online_resize, etc), so mostly I'm just curious if it is reasonable that the shrink could have been responsible or if there is some other component I should examine. Thoughts?

RedHat 7.2
kernel linux-2.4.19
lvm-1.0.6
ext3-2.4 + online_resize patch
e2fsprogs-1.28 w/ ext3resize
ext2resize-1.1.18 (CVS actually)
RAID5 - 4 discs on SAN, qla2200 driver (6.04.00)

Thanks,

-poul
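When a log contains many of these messages, it can help to reduce them to the distinct inode numbers per device before going further. The following is only a minimal sketch: the function name `bad_inodes` and the sample text are illustrative, and the regular expression assumes the message format is exactly the one quoted in the post above.

```python
import re

# Matches the message format quoted in the post. The device name itself
# contains parentheses, e.g. "lvm(58,8)", so it is matched lazily up to "):".
BAD_INODE_RE = re.compile(
    r"EXT3-fs error \(device (.+?)\): "
    r"ext3_get_inode_loc: bad inode number:\s*(\d+)"
)


def bad_inodes(log_text):
    """Return {device: sorted list of distinct bad inode numbers}."""
    found = {}
    for dev, ino in BAD_INODE_RE.findall(log_text):
        found.setdefault(dev, set()).add(int(ino))
    return {dev: sorted(inos) for dev, inos in found.items()}


# Three of the lines from the post, used here as sample input.
SAMPLE = """\
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14827639
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14041793
EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14827672
"""

if __name__ == "__main__":
    print(bad_inodes(SAMPLE))
```

In practice the input would be the saved output of `dmesg` or the relevant syslog file rather than an inline string.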
On Wed, Dec 03, 2003 at 04:26:07PM -0800, Poul Petersen wrote:
> I took this disc offline and ran fsck twice. It found a few errors
> the first time and none the second. I then rebooted the server, but the
> errors returned. The obvious next step would seem to be to build a new
> file-system and tar copy the data. Does it seem reasonable that the shrink
> could have corrupted the inode table? There is a lot going on with this
> setup (could be LVM, ext3online_resize, etc), so mostly I'm just curious if
> it is reasonable that the shrink could have been responsible or if there is
> some other component I should examine. Thoughts?

If fsck found filesystem corruptions, and the errors returned after running fsck, I would be very suspicious about hardware problems; perhaps a memory problem, or a loose SCSI/IDE cable, or a flaky disk about to go bad. So the *first* thing I would do is a full backup, before doing any more investigation, just in case this is a warning sign before a full-fledged hard disk head crash.

It doesn't seem reasonable that the shrink has anything to do with this; even if the shrink had corrupted the filesystem, fsck would have fixed it. That's why the fact that filesystem errors occurred right after an fsck makes me very suspicious about your hardware.

It's possible it might also be caused by a kernel bug, but if you haven't changed your kernel in a while, and it's been working fine, the component to be most suspicious of is your hardware....

- Ted
On Dec 03, 2003 16:26 -0800, Poul Petersen wrote:
> I've got an approximately 100GB ext3 FS which we recently sized down
> from 300GB using e2fsadm (with the disc offline obviously). I noticed the
> following in dmesg the other day:
>
> EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14827639
> EXT3-fs error (device lvm(58,8)): ext3_get_inode_loc: bad inode number: 14041793

Are you using this machine as an NFS server, and possibly the clients have cached the inode numbers for no-longer-existent inode numbers?

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/
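To make the NFS suggestion concrete: a shrink removes block groups and with them their inodes, so an inode number that was valid on the 300GB filesystem can lie beyond the inode count of the 100GB one. A client still holding a file handle for such an inode would then trip the range check on the server. The sketch below assumes the kernel check is essentially a range test against the superblock's inode count, with the root and journal inodes exempted; that is an assumption about the 2.4-era ext3 code, not a quotation of it. The function name and both inode counts are hypothetical.

```python
# Reserved inode numbers and the first ordinary inode on a traditional
# ext2/ext3 layout.
EXT3_ROOT_INO = 2
EXT3_JOURNAL_INO = 8
EXT3_FIRST_INO = 11


def is_bad_inode_number(ino, inodes_count, first_ino=EXT3_FIRST_INO):
    """Approximate the range test behind 'bad inode number' (an assumption)."""
    if ino in (EXT3_ROOT_INO, EXT3_JOURNAL_INO):
        return False
    return ino < first_ino or ino > inodes_count


# HYPOTHETICAL inode counts, chosen only for illustration. The real values
# would come from the "Inode count:" line of `dumpe2fs -h` or `tune2fs -l`.
INODES_BEFORE_SHRINK = 39000000
INODES_AFTER_SHRINK = 13000000

if __name__ == "__main__":
    # Inode numbers taken from the dmesg output in the original post.
    for ino in (14827639, 14041793, 37093790):
        print(
            ino,
            "bad before shrink:", is_bad_inode_number(ino, INODES_BEFORE_SHRINK),
            "bad after shrink:", is_bad_inode_number(ino, INODES_AFTER_SHRINK),
        )
```

If the real inode count after the shrink turns out to be smaller than every number in the log, that would be consistent with stale client handles rather than on-disk corruption.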
I have recently done some testing of ext3 on large file systems (> 1 TB), and ext3 has shown no problems as such. One thing I can assure you of is that size plays no role in your problem.

regards,
Salil

"God give me work, till my life shall end. And life, till my work is done"

-----Original Message-----
From: Poul Petersen [mailto:petersp@roguewave.com]
Sent: Thursday, December 04, 2003 5:56 AM
To: ext3-users@redhat.com
Subject: ext3_get_inode_loc: bad inode number:

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
> Are you using this machine as an NFS server, and possibly the clients
> have cached the inode numbers for no-longer-existent inode numbers?

Yes - this is an NFS server, a rather busy one at that. I'm currently seeing this error occur 5 or 8 times per day, though the frequency seems to be tapering off since the last fsck/resize/fsck which occurred on Dec 3 15:14. I've included the Date/inode#/time data below. Looking at the times, at least one of those has got to be triggered by an automated process, given that it seems to occur with startling consistency around 4:20 am. Should I try to figure out which clients might have cached these inodes and simply reboot them?

-poul

Dec 9:
  17613464 (04:15:03)  24822372 (04:15:03)  33538067 (09:34:43)  33538067 (10:27:13)
  33538067 (10:55:58)

Dec 8:
  17613464 (04:15:00)  24822372 (04:15:00)  17613464 (04:25:03)  24822372 (04:25:03)
  17613464 (04:30:12)  24822372 (04:30:12)  13647877 (15:45:45)  13647877 (20:16:46)

Dec 7:
  17613464 (04:15:02)  24822372 (04:15:02)  17613464 (04:30:04)  24822372 (04:30:04)
  13647877 (12:18:26)

Dec 6:
  17613464 (04:15:00)  24822372 (04:15:00)  17613464 (04:30:03)  24822372 (04:30:03)
  13647877 (14:03:20)

Dec 5:
  17613464 (04:15:02)  24822372 (04:15:02)  37093790 (09:05:06)  33538067 (10:35:31)
  15024387 (12:47:15)  15024400 (12:47:15)  32768008 (16:26:27)  25427971 (17:24:09)

Dec 4:
  13647877 (00:00:00)  14270467 (00:00:02)  17613464 (04:15:00)  24822372 (04:15:00)
  17613464 (04:20:02)  24822372 (04:20:02)  17613464 (04:25:02)  24822372 (04:25:02)
  17613464 (04:30:07)  24822372 (04:30:07)  15140403 (10:52:37)  37093790 (16:12:09)
  25427971 (17:54:29)

Dec 3:
  17613464 (04:15:00)  24822372 (04:15:00)  37093790 (09:49:30)  14827639 (15:14:19)
  14041793 (15:14:20)  14827672 (15:14:20)  14827754 (15:14:20)  14827752 (15:14:20)
  14827753 (15:14:20)  15074206 (15:14:20)  14205567 (15:14:22)  29573367 (15:14:25)
  13647877 (15:14:29)  15253505 (15:14:29)  15106867 (15:14:30)  15073284 (15:14:31)
  15140401 (15:14:57)  37093790 (15:19:58)  15140403 (15:36:06)  15140402 (15:36:07)
  15221312 (16:20:59)  15221355 (16:20:59)  15073284 (16:26:14)  25427971 (17:39:44)
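The "automated process" observation can be checked mechanically by grouping the reports by inode and looking for a time of day that recurs on every day. This is a rough sketch only: `SAMPLE` re-types three days of the data above with one day per line, and the function names are illustrative rather than part of any existing tool.

```python
import re
from collections import defaultdict

# Three days of the data above, one day per line: "Dec N: inode (HH:MM:SS) ..."
SAMPLE = """\
Dec 9: 17613464 (04:15:03) 24822372 (04:15:03) 33538067 (09:34:43) 33538067 (10:27:13) 33538067 (10:55:58)
Dec 7: 17613464 (04:15:02) 24822372 (04:15:02) 17613464 (04:30:04) 24822372 (04:30:04) 13647877 (12:18:26)
Dec 6: 17613464 (04:15:00) 24822372 (04:15:00) 17613464 (04:30:03) 24822372 (04:30:03) 13647877 (14:03:20)
"""

# One report: inode number followed by a time; seconds are ignored.
ENTRY_RE = re.compile(r"(\d+) \((\d\d):(\d\d):\d\d\)")


def minutes_by_inode(text):
    """Map inode -> {day label -> set of 'HH:MM' values reported that day}."""
    seen = defaultdict(lambda: defaultdict(set))
    for line in text.splitlines():
        day, _, rest = line.partition(":")
        for ino, hh, mm in ENTRY_RE.findall(rest):
            seen[int(ino)][day.strip()].add(hh + ":" + mm)
    return seen


def recurring(text):
    """Return {inode: sorted 'HH:MM' values seen on every day in the text}."""
    all_days = {
        line.partition(":")[0].strip()
        for line in text.splitlines()
        if line.strip()
    }
    result = {}
    for ino, per_day in minutes_by_inode(text).items():
        if set(per_day) != all_days:
            continue  # not reported on every day, so not a daily pattern
        common = set.intersection(*per_day.values())
        if common:
            result[ino] = sorted(common)
    return result


if __name__ == "__main__":
    print(recurring(SAMPLE))
```

On this sample, the two inodes reported together at 04:15 every day stand out, which fits a scheduled job on one client. Matching that time against client cron schedules is one way to narrow down which machine holds the stale handles.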