On Thu, Feb 12, 2009 at 10:54:40AM +0100, Vegard Svanberg wrote:
> After a power failure, a ~500G filesystem crashed. Fsck has been running
> for days. The problem seems to be multiply-claimed blocks. Example:
>
> File /directory/file.name/foo (inode #1234567, mod time Tue Feb
> 10 08:14:40 2008)
> has 1800000 multiply-claimed block(s), shared with 1 file(s):
>
> /directory/file.name/bar
> (inode #1234567, mod time Wed Dec 1 15:30:00 2008)
> Clone multiply-claimed blocks? y
>
> This takes like forever, probably due to the large number of
> multiply-claimed blocks.
You are using a version of e2fsprogs/e2fsck newer than 1.28, right?
If not, there's your problem; upgrade to something newer. Older
e2fsck's had O(n**2) algorithms that made this very slow, causing this
pass to be CPU-bound. It could also be slow because of memory pressure;
the data structures for keeping track of all of those blocks aren't
small.
> I was wondering if:
>
> - I can get a list of the impacted files/inodes
Yes, you can; e2fsck listed them during pass 1B. Look for entries
like this:

    Pass 1B: Rescanning for multiply-claimed blocks
    Multiply-claimed block(s) in inode 12: 25 26
    Multiply-claimed block(s) in inode 13: 25 26 57 58
    Multiply-claimed block(s) in inode 14: 57 58

> - Wipe them with debugfs
You could wipe them all out via debugfs's clri function, like this:

    debugfs -w -R "clri <12> <13> <14>" /dev/sdXX

The -w option opens the filesystem read-write, which clri requires.
The angle brackets indicate that you are passing in an inode number,
instead of a pathname; and I've left it as an exercise to the reader
how to use your choice of tools (emacs, grep/awk, perl) to pull out
the necessary inode numbers from e2fsck's Pass1B output.
Then run e2fsck again; it will clean up the directory entries and
bitmaps left behind by the cleared inodes.
To get the filenames, do this first, before the clri command:

    debugfs -R "ncheck 12 13 14" /dev/sdXX

(No angle brackets are needed because ncheck only takes inode numbers
and converts them to pathnames.)
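
As one possible take on the exercise above, here is a minimal Python sketch that pulls the inode numbers out of saved e2fsck output and builds the matching debugfs requests. The function names (parse_pass1b, ncheck_request, clri_requests) are hypothetical, and the sketch assumes each Pass 1B entry sits on a single line in the format shown earlier. Emitting one clri request per line is a choice made here, not something the commands above require.

```python
import re

# Matches e2fsck Pass 1B lines such as
# "Multiply-claimed block(s) in inode 12: 25 26".
PASS1B_LINE = re.compile(r"Multiply-claimed block\(s\) in inode (\d+):(.*)$")


def parse_pass1b(log_text):
    """Map each inode number to the block numbers reported for it."""
    claims = {}
    for line in log_text.splitlines():
        match = PASS1B_LINE.match(line.strip())
        if match:
            inode = int(match.group(1))
            # Only plain decimal tokens are kept; anything else is ignored.
            blocks = [int(tok) for tok in match.group(2).split() if tok.isdigit()]
            claims.setdefault(inode, []).extend(blocks)
    return claims


def ncheck_request(inodes):
    """ncheck takes bare inode numbers, so no angle brackets here."""
    return "ncheck " + " ".join(str(i) for i in sorted(inodes))


def clri_requests(inodes):
    """One clri request per inode; angle brackets mark an inode number."""
    return ["clri <%d>" % i for i in sorted(inodes)]


sample = """\
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 12: 25 26
Multiply-claimed block(s) in inode 13: 25 26 57 58
Multiply-claimed block(s) in inode 14: 57 58
"""
claims = parse_pass1b(sample)
print(ncheck_request(claims))
print("\n".join(clri_requests(claims)))
```

The ncheck line can be passed to debugfs -R as shown above. The clri lines could be saved to a file and fed to debugfs with its -f option (together with -w), once the list has been reviewed by hand.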
> Is this safe? How do I do it? Fsck says it's 538 inodes with this
> problem. If I could get a file list and be able to wipe the inodes, I
> could restore the missing files from backup and get the machine online
> again quickly.
However, it's not strictly necessary to wipe all 538 inodes. It's
likely that you only need to wipe approximately half of them. What
happened is that somehow, the disk drive got confused and wrote data
to the wrong location on disk. Or, the journal was corrupted (one of
the reasons why ext4 has journal checksums) so inode table blocks got
written to the wrong place on disk. So that means what you'll see is
something like this:

    Multiply-claimed block(s) in inode 32: 200 201 203
    Multiply-claimed block(s) in inode 33: 210 211 212 213 214
    Multiply-claimed block(s) in inode 34: 215 216 217 218
    ...
    Multiply-claimed block(s) in inode 128: 200 201 203
    Multiply-claimed block(s) in inode 129: 210 211 212 213 214
    Multiply-claimed block(s) in inode 130: 215 216 217 218

You may not see a full 16 or 32 inodes in each group of duplicate
inodes (a 4k block holds 32 inodes at 128 bytes each, or 16 if you are
using 256-byte inodes), since some inodes may have been deleted or
never allocated.
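
The pairing in the listing above can also be found mechanically by grouping inodes that report exactly the same multiply-claimed blocks. The sketch below is only an illustration: group_by_blocks is a hypothetical helper, and the claims dictionary is typed in by hand from the sample listing rather than taken from a real filesystem.

```python
from collections import defaultdict


def group_by_blocks(claims):
    """Group inodes that report exactly the same multiply-claimed blocks."""
    groups = defaultdict(list)
    for inode, blocks in claims.items():
        groups[tuple(sorted(blocks))].append(inode)
    # Keep only block lists that more than one inode claims.
    return {blocks: sorted(inodes)
            for blocks, inodes in groups.items() if len(inodes) > 1}


# Inode -> multiply-claimed blocks, copied from the sample listing.
claims = {
    32: [200, 201, 203],
    33: [210, 211, 212, 213, 214],
    34: [215, 216, 217, 218],
    128: [200, 201, 203],
    129: [210, 211, 212, 213, 214],
    130: [215, 216, 217, 218],
}
for blocks, inodes in sorted(group_by_blocks(claims).items()):
    print(inodes, "share blocks", blocks)
```

For the sample data this pairs inode 32 with 128, 33 with 129, and 34 with 130. It only identifies the pairs; deciding which member of each pair is the good copy still means checking pathnames against file contents.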
In any case, only one set of inodes will be correct; once you work out
which set that is, by checking whether each pathname matches the
contents of its file, you can clri the other set.
Or if that's too much effort, you can clri them all and recover them
from backups....
- Ted