Ed W
2002-Aug-30 10:58 UTC
What to do? Error:(Attempt to read block from filesystem resulted in short read) while doing inode scan.)
Hi folks,

I have some sort of corruption on my hard disk, but it is difficult to find documentation on how to deal with it.

The system is an old AMD K6/400 running Red Hat 7.2 with a 2.4.9 kernel. The disk is clipped to 33 GB due to the old disk controller, and is running ext3.

Some time back I suffered a number of cases where the machine locked up hard and I had to power cycle it to get control back. However, I recently began to suspect disk corruption and ran a manual fsck in single-user mode. It came back with a number of errors like the one shown here (possibly one per hard reboot?):

    Pass 1: Checking inodes, blocks, and sizes
    Error reading block xxxxxxx (Attempt to read block from filesystem resulted in short read) while doing inode scan.

I don't recall the exact block numbers (I seem to have misplaced the notes where I wrote them down). However, they were reproducible every time I ran fsck.

So the question is how do I take this forward? I assume that it means a dead sector? Is there any way to find out which files occupy that sector so that I can be suspicious of their quality? Can I mark the sector as bad and recover any of the data? Is this definitely a symptom of a failing disk? (It seems unlikely, bearing in mind the age and history of the disk.)

fsck simply ignored the problem, and it wasn't obvious that there was any way to make it correct the problem (I'm running it manually, not with -y).

The only hint I really have that something is wrong is that my backup via tar always ends with:

    tar: Error exit delayed from previous errors

(As an aside, is tar a satisfactory method of doing a full backup of an ext3 filesystem?)

Pointers, and preferably notes as to what to be careful not to do, would be really appreciated (this is a live system and I would rather not need to test my backups).

Thanks all

Ed W
Stephen C. Tweedie
2002-Aug-30 12:03 UTC
Re: What to do? Error:(Attempt to read block from filesystem resulted in short read) while doing inode scan.)
Hi,

On Fri, Aug 30, 2002 at 11:58:49AM +0100, Ed W wrote:

> So the question is how do I take this forward? I assume that it means a
> dead sector?

Probably, yes.

> Is there any way to find out which files occupy that sector so
> that I can be suspicious of their quality?

debugfs can do that if you know the block number.

> Can I mark the sector as bad and
> recover any of the data?

You can't recover data from the bad sector, but you can map it out with "e2fsck -c" if you want to. If you don't do that, the drive should map the bad block out anyway the next time you try to write to it.

> Is this definitely a symptom of a failing disk?

Looks like it, although if it's just one sector it might be a one-off thing.

Cheers,
Stephen
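For reference, the debugfs lookup mentioned above is usually a two-step affair: icheck maps a block number to the inode that owns it, and ncheck maps that inode to a path name. This is only a minimal sketch and not something posted in the thread. The function names are made up for illustration, the device and block number in the usage note are placeholders, and the awk parsing assumes the two-column layout that icheck prints. A block that sits in the inode table itself, rather than in a file's data, will not resolve to an owning inode this way.

```shell
#!/bin/sh
# Sketch: map a filesystem block number to the path(s) that use it.
# Function names are hypothetical helpers, not part of e2fsprogs.

# Read "debugfs icheck" output on stdin and print the owning inode number.
# Assumes a header line followed by "<block> <inode>" rows; prints nothing
# when debugfs reports that no inode owns the block.
inode_from_icheck() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }'
}

# block_to_paths DEVICE BLOCK
# Step 1: icheck finds the inode for the block.
# Step 2: ncheck lists the path names for that inode.
block_to_paths() {
    device=$1
    block=$2
    inode=$(debugfs -R "icheck $block" "$device" 2>/dev/null | inode_from_icheck)
    if [ -z "$inode" ]; then
        echo "block $block: no owning inode found" >&2
        return 1
    fi
    debugfs -R "ncheck $inode" "$device" 2>/dev/null
}
```

With the real values filled in, something like `block_to_paths /dev/hda1 123456` would print the path name, if there is one. debugfs opens the filesystem read-only unless it is given -w, so the lookup itself does not write to the disk.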
Theodore Ts'o
2002-Aug-30 12:34 UTC
Re: What to do? Error:(Attempt to read block from filesystem resulted in short read) while doing inode scan.)
On Fri, Aug 30, 2002 at 11:58:49AM +0100, Ed W wrote:

> Pass 1: Checking inodes, blocks, and sizes
> Error reading block xxxxxxx (Attempt to read block from filesystem resulted
> in short read) while doing inode scan.
>
> I don't recall the exact block numbers (seem to have misplaced my notes
> where I wrote them down...). However, they were reproducible every time I
> ran fsck
>
> So the question is how do I take this forward? I assume that it means a
> dead sector? Is there any way to find out which files occupy that sector so
> that I can be suspicious of their quality?

This specific error means that you had a bad block inside your inode table. Depending on where in the inode table you had the bad block, you could have lost as many as 32 files. (If you saw complaints about directory entries pointing to deleted files, that would be confirmation.)

> Can I mark the sector as bad and recover any of the data?

You can mark the sector as bad by doing "e2fsck -c". You can try to see if the disk drive will remap the sector by doing a non-destructive read/write badblocks check. With e2fsck versions 1.26 or later, this can be done by using the command "e2fsck -cc". With older versions of e2fsprogs, you'll need to run "badblocks -n" manually.

Note that this can be dangerous! If your disk is starting to die, more disk activity can make things worse, not better. I'd normally suggest that people do a full disk-to-disk image backup before starting.

> The only hint I really have that something is wrong is my backup via tar
> always ends with:
>
> tar: Error exit delayed from previous errors
>
> (As an aside is tar a satisfactory method of doing a full backup of an ext3
> filesystem?)

Under normal circumstances, yes. A much more paranoid way to do a disk backup, which is what I recommend in these sorts of circumstances, is to do a disk-to-disk image backup. In order to do this you need a disk partition at least as big as the one which you are backing up.
You then issue this command, while the filesystem is unmounted:

    dd if=/dev/hd_old_disk of=/dev/hd_backup_disk bs=1k conv=sync,noerror

If you don't have a spare disk, consider getting one. Disk drives are terribly cheap these days, and the data on the disk is generally worth a factor of 10 or 100 more than purchasing a new disk might cost. This is why sometimes people will simply replace a disk at the drop of a hat as soon as they start seeing soft errors (i.e., warnings from the disk that there were read errors that were correctable using ECC), never mind the hard errors which you're clearly seeing here. The time it takes to dick around and figure out whether or not the errors on the disk are stable, or to replace the disk and deal with recovering from backups, just simply isn't worth it. It's often better to spend the $100 or so for a new 80 gig drive, and just move on.

> Pointers and preferably notes as to what to be careful not to do would be
> really appreciated (this is a live system and I would rather not need to
> test my backups).

Good luck!!

- Ted
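A note on the dd flags used above: noerror tells dd to carry on after a read error instead of stopping, and sync pads every short input block with NUL bytes up to the block size, so the data after a bad spot still lands at the same offset in the image. The following is a small sketch of the padding behaviour only. It runs on ordinary temporary files rather than a disk, and the function name and the 1500-byte test size are made up for illustration.

```shell
#!/bin/sh
# Sketch: show how conv=sync pads a short input block to the full block size.
# Works on scratch files only; nothing here touches a real disk.

demo_sync_padding() {
    tmp=$(mktemp -d) || return 1

    # 1500 bytes of input: one full 1 KiB block plus a partial 476-byte block.
    head -c 1500 /dev/zero > "$tmp/src"

    # Same flags as the image backup command; the short final read is
    # padded with NULs to a full 1 KiB block.
    dd if="$tmp/src" of="$tmp/img" bs=1k conv=sync,noerror 2>/dev/null

    # Print the size of the copy in bytes: 2048, not 1500.
    size=$(wc -c < "$tmp/img")
    echo $((size))

    rm -rf "$tmp"
}
```

Calling `demo_sync_padding` prints 2048. On a real disk the same mechanism means a block that cannot be read is replaced in the image rather than skipped, which keeps everything after it in the right place for a later fsck of the copy.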