Hey all,

I had a power failure and subsequent ext3 disk corruption. I'm attempting to fix it, but it's not working.

It's a 120 gig IDE disk with 3 partitions: /boot, /, and swap. Basically the box can't boot, since it can't get to the system files in /usr or anywhere else. So I'm booting off FC2 disk 1 in recovery mode and trying to fix the filesystem with e2fsck.

The boot partition cleaned up fine. The swap partition came back up when I did "/sbin/mkswap /dev/hda3" and "/sbin/swapon -a".

I cannot, however, get fsck to run fully on /dev/hda2 (the main part of the drive). I run it and it goes through the first few recovery steps (1A, 1B, 1C), and then it comes to the problem. It lists a bunch of inodes (like 150) and asks:

"Clone duplicate/bad blocks? (y)"

I get about 6 of these messages with varying lists of inodes. I say yes, and fsck continues. But after about 6 of these messages, the cycle of "Clone duplicate/bad blocks?" repeats. I looked closely at the listings of inodes, and the same 6 or so groupings of blocks are (I guess) being offered for cloning again. So the program just loops endlessly through the yes answers (2 hours now). It makes no difference if I go through manually and say no, don't clone the blocks; it just loops through the same lists.

Some data:
+ I've got a gig of RAM in the box, and (I think) swap space is on.
+ There may be several 4 gig files that were corrupted.
+ Using top in another terminal, I find the processor pegged at 99%.

What can I do to recover this drive?

Thanks,
John.

====-----------------------------------
John J Freer
518.441.9647 / john_freer at yahoo.com
> I cannot, however, get fsck to run fully on /dev/hda2 (the main part
> of the drive). I run it and it goes through the first few recovery
> steps (1A, 1B, 1C), and then it comes to the problem.

I had the same problem with Debian's e2fsck 1.35 (28-Feb-2004) on a 1 terabyte volume (I left it fscking for 1 week, and it was still looping).

I could still mount the volume, however, so I mounted it read-only and copied all the data to another disk. Some files were corrupted, of course, and some of them had a wrong size (like 10 GB or even some TB), so I built an exclusion list when copying the data. I also had checksums of the important files, so after the copy I could find out which were corrupted and which were not.

I *think* the problem was that the volume was full, and fsck needed free space to clone blocks or something like that. I couldn't free up space on the volume, however: when I tried, it sometimes caused kernel oopses, sometimes made the volume go read-only, and when it did neither, it didn't free a single block (from du's point of view anyway).

I still have an image of the damaged filesystem here (but I think I will destroy it soon, since 1 TB isn't that cheap, as you can guess).
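The copy-with-exclusions approach described above could be sketched roughly as follows. This is only an illustration, not what the poster actually ran: the helper names build_exclude_list and verify_checksums, the size threshold, and all paths are assumptions, and the rsync line assumes rsync is available on the rescue system.

```shell
#!/bin/sh
# Hypothetical helper: list regular files under directory $1 that are larger
# than $2 bytes, one per line, as paths anchored at the top of $1
# (e.g. /sub/big.bin). Implausibly large files are exclusion candidates.
build_exclude_list() {
    src=$1
    max_bytes=$2
    ( cd "$src" && find . -type f -size +"${max_bytes}c" -print | sed 's|^\.||' )
}

# Hypothetical helper: check the files under directory $1 against an
# md5sum-format checksum file $2 (absolute path, relative file names inside).
# Prints "name: OK" or "name: FAILED" for each entry.
verify_checksums() {
    dir=$1
    sums=$2
    ( cd "$dir" && md5sum -c "$sums" )
}

# Example session (device name and mount points are assumptions):
#   mount -o ro /dev/hda2 /mnt/damaged
#   build_exclude_list /mnt/damaged 10000000000 > /tmp/exclude.txt
#   rsync -a --exclude-from=/tmp/exclude.txt /mnt/damaged/ /mnt/newdisk/
#   verify_checksums /mnt/newdisk /root/known-good.md5
```

File names containing rsync wildcard characters (such as * or [) would need escaping in the exclusion list, so treat the generated list as a starting point and review it by hand.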
Hey All,

Thanks so much for the responses. Here are some more details.

Recap:

1. 120G ATA133 drive. Unclean shutdown during power failure.

2. XP2200, Via chipset ECS K7S5A mobo, 1G RAM.

3. Drive broken into 3 partitions: / (hda2), /boot (hda1), and swap (hda3). Bulk of the drive is in / (hda2).

4. On reboot, unable to mount the / partition; boot halted.

5. On next boot, used e2fsck by booting off RHFC2 disk 1 in "linux rescue" mode, i.e.: e2fsck -vy /dev/hda2. Many errors reported, including a statement that the journal was corrupted and needed to be removed. I think it was fsck ver. 1.35.
+ Crunched for about 10 minutes through passes 1A, 1B, 1C, and started 1D.
+ fsck hangs in the middle of pass 1D, then after being hung for approx. 20 seconds seems to restart pass 1D.
+ I know it is looping/restarting because in pass 1D I get a list of 50-150 bad/duplicate blocks at different inodes. At the bottom of the list I get "Clone duplicate/bad blocks?" I then press y and get another list of blocks/inodes with the same prompt. Press y again, repeat, say, 5 times. After the last 'y', fsck hangs for like 30 seconds and then goes back to the same first list of bad blocks/inodes. The bad block/inode numbers match the first time through exactly.
+ I tried answering no manually to the duplication and yes to delete duplicate. No difference. Still loops.
+ I downloaded the source for fsck and looked at pass 1D. Not sure what is looping or why, though the source is fairly easy to parse.
+++ I think there may be an error message scrolling off the top of the screen really fast after I answer 'y' to the prompt, but I can't see it on the terminal. IS THERE ANY WAY I CAN RUN e2fsck AND TRAP THE OUTPUT TO A FILE, SAY, LIKE e2fsck -vy /dev/hda2 >> /tmp/output.txt ? How would I format the syntax? I'll be happy to do this and make all the data available if it would help.
+ I also tried re-running e2fsck with -b 32764 to use another superblock for rebuilding. This resulted in the same thing. It said the journal was corrupt and that it would be removed. Then it tried to fix the drive and the same symptoms came up.

6. Since I thought this might be a memory problem, I also tried running it only after re-enabling the swap partition, i.e.: mkswap /dev/hda3; swapon -a;
+ No difference.

7. I just ordered a new 200G drive. I'll probably install FC3 on it this weekend and then try to recover the data off the 120G drive. But I'd sure like to get this drive back up to recover some stuff.

What do you suggest next?

Thanks,
John.
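On the question of trapping the output: a plain >> redirect only captures standard output, so any error messages written to standard error would still scroll past. A minimal sketch that sends both streams to the screen and to a file is below. The helper name run_logged and the log path are assumptions, and it relies on tee being present in the rescue environment.

```shell
#!/bin/sh
# Hypothetical helper: run a command, show its output on the terminal, and
# append both stdout and stderr to the log file given as the first argument.
run_logged() {
    log=$1
    shift
    "$@" 2>&1 | tee -a "$log"
}

# Example (the log path is an assumption; in rescue mode /tmp usually lives
# in RAM, so copy the log to a floppy or another disk before rebooting):
#   run_logged /tmp/e2fsck-hda2.log e2fsck -vy /dev/hda2
```

Because -y answers every prompt automatically, nothing needs to be typed while the output is piped through tee.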