Hey all,

Had a power failure and subsequent ext3 disk corruption. Attempting
to fix it, but it's not working.

It's a 120 gig IDE disk with 3 partitions: /boot, /, and swap.
Basically the box can't boot since it can't get to the system files
in /usr or anything else. So I'm booting off FC2 disk 1 in recovery
mode and trying to fix the filesystem with e2fsck.

The boot partition cleaned up fine. The swap partition came back up
when I did "/sbin/mkswap /dev/hda3" and "/sbin/swapon -a".

I cannot, however, get fsck to run fully on /dev/hda2 (the main part
of the drive). I run it and it goes through the first few recovery
steps (1A, 1B, 1C), and then it comes to the problem. It lists a
bunch of inodes (like 150) and asks: "Clone duplicate/bad blocks? (y)"

I get about 6 of these messages with varying lists of inodes. I say
yes, and fsck continues. But after about 6 of these messages, the
cycle of "Clone duplicate/bad blocks?" repeats. I looked closely at
the listing of inodes, and the same 6 or so groupings of blocks are
(I guess) asking to be cloned again. So the program just loops
endlessly through the yes answers (2 hours now). No difference if I
go through manually and say no, don't clone the blocks. It just loops
through the same lists.

Some data:
+ I've got a gig of RAM in the box, and (I think) swap space is on.
+ There may be several 4 gig files that were corrupted.
+ Using top in another terminal, I find the processor pegged at 99%.

What can I do to recover this drive?

Thanks,
John.

====-----------------------------------
John J Freer
518.441.9647 / john_freer at yahoo.com
> I cannot, however, get fsck to run fully on /dev/hda2 (the main part
> of the drive). I run it and it goes through the first few recovery
> steps (1A, 1B, 1C), and then it comes to the problem.

I had the same problem with Debian's e2fsck 1.35 (28-Feb-2004) on a
1 terabyte volume (left it fscking for 1 week, it was still looping).

I could still mount the volume, however, so I mounted it read-only
and copied all the data onto another disk. Some files were corrupted
of course, and some of them had a wrong size (like 10 GB or even some
TB), so I built an exclusion list when copying the data. I had
checksums of the important files, so afterwards I could find out
which were corrupted and which were not.

I *think* the problem was that the volume was full, and fsck needed
free space to clone blocks or something like that. I couldn't free up
space on the volume, however: when I tried, it sometimes caused
kernel oopses, sometimes made the volume go read-only, and when it
did neither, it didn't free a single block (from du's point of view
anyway).

I still have an image of the damaged filesystem here (but I think I
will destroy it soon, since 1 TB isn't that cheap, as you can guess).
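
The salvage approach described above (mount read-only, leave out files
with implausible sizes, copy the rest) might look roughly like the
sketch below. The helper name list_oversized, the 4 GB threshold, the
mount points /mnt/damaged and /mnt/rescue, and the use of rsync are
all assumptions for illustration, not what was actually run. Note
that rsync treats each line of the exclusion file as a pattern, so
file names containing wildcard characters would need extra care.

```shell
#!/bin/sh
# list_oversized DIR LIMIT_KB
# Print the paths (relative to DIR) of regular files larger than
# LIMIT_KB KiB, one per line, so they can serve as an exclusion list.
list_oversized() {
    dir=$1
    limit_kb=$2
    ( cd "$dir" && find . -type f -size +"${limit_kb}"k -print | sed 's|^\./||' )
}

# Hypothetical usage (device, paths and threshold are assumptions):
#   mount -o ro /dev/hda2 /mnt/damaged
#   list_oversized /mnt/damaged 4194304 > /tmp/exclude.txt
#   rsync -a --exclude-from=/tmp/exclude.txt /mnt/damaged/ /mnt/rescue/
```

If checksums of the important files exist from before the crash, the
copied files can then be compared against them to see which ones
survived intact.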
Hey All,
Thanks so much for the responses. Here's some more details.
Recap:
1. 120G ATA133 Drive. Unclean shutdown during power failure.
2. XP2200, Via chipset ECS K7S5A Mobo, 1G ram.
3. Drive broken into 3 partitions, / (hda2), /boot (hda1), and swap
(hda3). Bulk of the drive is in / (hda2).
4. On reboot, unable to mount the / partition; boot halted.
5. On next boot, ran e2fsck after booting off RHFC2 Disk 1 in
"linux rescue" mode, i.e. e2fsck -vy /dev/hda2. Many errors reported,
including a statement that the journal was corrupted and needed to be
removed. Think it was fsck ver. 1.35.
+Crunched for about 10 minutes through passes 1A, 1B, 1C, and
started 1D.
+fsck hangs in the middle of pass 1D, then after hanging for approx.
20 seconds seems to restart pass 1D.
+I know it is looping/restarting because in pass 1D I get a list
of 50-150 bad/duplicate blocks at different inodes. At the bottom of
the list I get "Clone duplicate/bad blocks?" I then press y and get
another list of blocks/inodes with the same prompt. Press y again,
repeat, say, 5 times. After the last 'y', fsck hangs for like 30
seconds and then goes back to the same first list of bad
blocks/inodes. The bad block/inode numbers match exactly to the
first time through.
+I tried answering no manually to the duplication and yes to
delete duplicate. No difference. Still loops.
+I downloaded the source for fsck and looked at pass 1D. Not sure
what is looping or why, though the source is fairly easy to parse.
+++I think there may be an error message scrolling off the top of
the screen really fast after I answer 'y' to the prompt, but I can't
see it on the terminal.
IS THERE ANY WAY I CAN RUN e2fsck AND TRAP THE OUTPUT TO A FILE,
SAY, LIKE e2fsck -vy /dev/hda2 >> /tmp/output.txt ? What would the
correct syntax be?
I'll be happy to do this and make all the data available if it
would help.
+ I also tried re-running e2fsck with -b 32764 to use another
superblock for rebuilding. This resulted in the same thing. It said
the journal was corrupt and that it would be removed. Then it tried
to fix the drive and the same symptoms came up.
6. Since I thought this might be a memory problem, I also
tried running it only after re-enabling the swap partition, i.e. mkswap
/dev/hda3; swapon -a;
+No difference.
7. I just ordered a new 200G drive. I'll probably install FC3 on it
this weekend and then try to recover the data off the 120G drive.
But I'd sure like to get this drive back up to recover some
stuff.
What do you suggest next?
Thanks,
John.
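
On the question in item 5 about trapping e2fsck output in a file: a
plain >> redirect captures only standard output and takes it off the
screen, so one common approach is to merge stderr into stdout and
pipe the result through tee. The sketch below wraps that in a small
helper. The name log_run is made up here, and /tmp/output.txt is
simply the path from the question. In a rescue environment /tmp may
well live in RAM (an assumption about the rescue image), so the log
would need to be copied somewhere persistent before rebooting.

```shell
#!/bin/sh
# log_run LOGFILE COMMAND [ARGS...]
# Run COMMAND, showing its output on screen while appending both
# stdout and stderr to LOGFILE.
# Note: the exit status returned is that of tee, not of COMMAND.
log_run() {
    logfile=$1
    shift
    "$@" 2>&1 | tee -a "$logfile"
}

# Hypothetical usage for the case in this thread:
#   log_run /tmp/output.txt e2fsck -vy /dev/hda2
```

With everything in a file, any message that scrolls off the top of
the terminal after answering 'y' can be read back afterwards.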