On Sun, Nov 10, 2002 at 08:55:26PM +0100, Francesco Peeters wrote:
> Hi all,
>
> I know this is the EXT3 list, and my problem is with an EXT2
> filesys, but I cannot seem to find a more suitable list on this
> server, and I have seen a lot of knowledge go by on this list and in
> the archives, so I thought I'd give it a try anyway...
>
> Here goes nothing:
>
> I am in a terrible problem: My data disk on my Linux server has gone
> bad, with approx 18 GB of data on it, and I never got round to
> installing a backup system! :-( (I know: very stupid!) I never
> noticed anything before, but I went on vacation, and after returning
> I simply turned the box on again, and now I have this problem!!!
>
> It gave an error on a short read (attempt to read block from
> filesystem resulted in short read while trying to open
> /dev/hdc1. Could this be a zero-length partition?) and I ran e2fsck
> -cc on it, which seems to have fixed that, however the following
> inode sweep gives so many 'bad blocks in inode XXXXX', that I am
> afraid that I'll be left with an empty disk once the check is
> done...

I suppose one of these days someone really should write a "hard disk
catastrophe" HOWTO.....

When you have a lot of precious data on a disk that hasn't been backed
up, the very **first** thing you should do is to get the cursing
yourself for being twenty different kinds of fool for not having a
backup system out of your system. Get that over with, so that you
don't make any further mistakes.....

Next, get yourself a backup hard drive which is at least as big as the
disk which is in trouble, and do a full disk-to-disk copy of the disk
that's in trouble:

    dd if=/dev/hdc of=/dev/hdd bs=1k conv=sync,noerror

Do this right away, because if the problem was due to hardware
failure, you want to grab a snapshot before the disk gets any worse.

For experimental purposes, if you're not sure what you're doing, it's
useful to get another spare disk, and make a second-generation copy
from your first primary backup. That way, you can experiment on the
second-generation copy, and if one recovery technique doesn't work,
you can try again with a different technique, and not have to worry
about making any irrecoverable mistakes.
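
The two-generation copy could be scripted roughly like this. It's only a sketch: clone_disk is a made-up helper name, and /dev/hde stands in for whatever device name your second spare disk ends up with; only /dev/hdc and /dev/hdd come from the example above.

```shell
# Hypothetical helper: block-for-block copy using the dd options
# recommended above (dd spells the second conversion "noerror").
# noerror keeps going past read errors, and sync pads every input
# block with zeros to the full block size, so data after a bad spot
# stays at the same offset in the copy.
clone_disk() {
    src=$1
    dst=$2
    if [ "$src" = "$dst" ]; then
        echo "clone_disk: source and destination must differ" >&2
        return 1
    fi
    dd if="$src" of="$dst" bs=1k conv=sync,noerror
}

# Intended use (device names are assumptions; check them twice,
# because dd overwrites the destination without asking):
#   clone_disk /dev/hdc /dev/hdd    # failing disk -> first backup
#   clone_disk /dev/hdd /dev/hde    # first backup -> scratch copy
```

Note that the second call reads from the first backup, not from the failing disk, so the original only has to survive one full read.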

The first thing I would try at this point is an "e2fsck -y" on the
second generation backup. See what you can save when it's all done;
don't forget to check the lost+found directory in the root of the
filesystem. Sometimes files will end up there.

If that doesn't work, the next steps will require a lot more expertise
and special work. So I'd start with that, and see how much you can
recover from that.
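
When you run e2fsck on the copy, its exit status tells you roughly how things went. The man page documents it as a sum of flags: 1 means errors were corrected, 2 means corrected with a reboot suggested, 4 means errors were left uncorrected, and 8 and up mean operational or usage problems. Here is a small sketch for reading it; fsck_summary is a hypothetical name, and /dev/hde1 and /mnt/recover are assumed names for the partition on the second-generation copy and a mount point.

```shell
# Hypothetical helper: turn an e2fsck exit status into a one-line verdict.
fsck_summary() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors found"
    elif [ "$status" -lt 4 ]; then
        # Only the "corrected" flags (1 and/or 2) are set.
        echo "errors found and corrected"
    else
        # 4 (uncorrected errors) or one of the failure flags is set.
        echo "errors remain or e2fsck could not finish"
    fi
}

# Intended use, on the second-generation copy only:
#   e2fsck -y /dev/hde1
#   fsck_summary $?
#   mount -o ro /dev/hde1 /mnt/recover && ls /mnt/recover/lost+found
```

Files that e2fsck reconnects show up in lost+found under names like #12345 (the inode number), so you will have to identify them by their contents.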

> Now when I try to do e2fsck /dev/hdc1 I get 'a corruption was found
> in the superblock' When I try e2fsck -b 8193 /dev/hdc1 It claims it
> is not a valid superblock... The same for for instance 32679, a.s.o.

For a 4k filesystem, the first backup superblock is at block 32768.
But please, make the full disk-to-disk backups first, and experiment
on the backups. That way, you don't need to worry about panic-induced
mistakes making the problem any worse.
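
For reference, here is where those block numbers come from, as a sketch that assumes the filesystem was created with the mke2fs defaults: each block group holds 8 * blocksize blocks (one bitmap block's worth of bits), and the first backup superblock sits at the start of group 1. With 1k blocks the first data block is block 1 rather than 0, which is why the number is 8193 and not 8192. first_backup_sb is a hypothetical name.

```shell
# Hypothetical helper: first backup superblock for a given block size,
# assuming the default of 8 * blocksize blocks per group.
first_backup_sb() {
    bs=$1
    blocks_per_group=$((8 * bs))
    # On a 1k-block filesystem block 0 is the boot block, so the data
    # blocks (and the group boundaries) start at block 1.
    if [ "$bs" -eq 1024 ]; then
        first_data_block=1
    else
        first_data_block=0
    fi
    echo $((blocks_per_group + first_data_block))
}

for size in 1024 2048 4096; do
    echo "$size: $(first_backup_sb "$size")"
done
# prints:
#   1024: 8193
#   2048: 16384
#   4096: 32768
```

So on a 4k filesystem the thing to try would be "e2fsck -b 32768 -B 4096" against the partition on your scratch copy.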

- Ted

P.S. For those people for whom backups are just too much effort,
*please* consider using the "e2image" program to snapshot and backup
critical filesystem metadata. It's not a replacement for doing full
data backups, but at least if you have an e2image dump, in the worst
case you'll be able to recover more files if a disk failure damages
your inode table. The problem is that without the inode table there
is no record of which blocks go with which files, which means that
recovering files becomes a very, very painful manual process. e2image
will create a backup copy of the inode table, which even if it is not
fully up-to-date, will be a help in trying to reconstruct data from a
filesystem after a disk failure. Of course, the real answer is to do
real backups.....
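
A minimal sketch of what that could look like, for instance run periodically from cron. snapshot_metadata, the .e2i suffix, and the /backup/meta directory are made up for illustration; the only real interface used is "e2image device image-file". The E2IMAGE variable is there purely so the function can be dry-run with echo.

```shell
# Hypothetical helper: save an e2image metadata dump of one filesystem.
snapshot_metadata() {
    dev=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    # Set E2IMAGE=echo to print the command instead of running it.
    "${E2IMAGE:-e2image}" "$dev" "$outdir/$(basename "$dev").e2i"
}

# Intended use (as root; keep the dump on a *different* disk than the
# filesystem it describes):
#   snapshot_metadata /dev/hdc1 /backup/meta
```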