thr3ads.net - Ext3 users - ext3 fsck question [Feb 2002]

mb/ext3@dcs.qmul.ac.uk

2002-Feb-15 11:56 UTC

ext3 fsck question

Hi,

After our big ext3 file server crashes, I notice the fsck spends some time
replaying the journals (about 5-10 mins for all volumes on the server in
question). I guess it must do this should you want to mount the volumes as
ext2.

My question--is it (theoretically) possible to tell fsck only to replay
half-finished and to knock out incomplete transactions from the journals,
leaving the kernel to replay the good ones in its own time, possibly
reducing downtime by a few minutes? Or might this break assumptions the
kernel code makes? Or is it totally impossible and ridiculous? :)

Matt

Andrew Morton

2002-Feb-15 19:27 UTC

Re: ext3 fsck question

mb/ext3@dcs.qmul.ac.uk wrote:
> Hi,
> 
> After our big ext3 file server crashes, I notice the fsck spends some time
> replaying the journals (about 5-10 mins for all volumes on the server in
> question). I guess it must do this should you want to mount the volumes as
> ext2.

That must be one big server.  Are the fscks running in parallel?

The ideal situation is that individual partitions on each disk
are fsck'ed sequentially, and that all disks are fsck'ed in parallel.
So you end up with a *single* fsck running against each disk, but
all disks being worked on in parallel.  This is all possible, but
requires some genuflection over fsck and fstab manpages.
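
To illustrate the scheduling described above, here is a minimal Python sketch (not from the thread) that groups partitions into one queue per disk: each queue would be checked sequentially, and the queues themselves could run in parallel. The function name `plan_fsck` and the device names are hypothetical, and the rule "disk name = partition name minus trailing digits" is an assumption that holds for /dev/sdXN-style names only. In practice, `fsck -A` aims for this behaviour on its own when the filesystems share the same pass number in fstab.

```python
import re
from collections import defaultdict


def plan_fsck(partitions):
    """Group partitions by parent disk.

    Each returned list is meant to be checked sequentially;
    different disks can be worked on in parallel.
    """
    queues = defaultdict(list)
    for part in partitions:
        # Assumption: strip trailing digits to get the disk name
        # (works for /dev/sda1 -> /dev/sda, not for every naming scheme).
        disk = re.sub(r"\d+$", "", part)
        queues[disk].append(part)
    return dict(queues)


# Hypothetical layout: two partitions on one disk, one each on two others.
plan = plan_fsck(["/dev/sda1", "/dev/sda2", "/dev/sdb1", "/dev/sdc1"])
for disk, parts in plan.items():
    print(disk, "->", parts)
```

With this layout there would be three concurrent checks, and only /dev/sda would have two partitions waiting on each other.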

Oh, and how come your server crashes?

> My question--is it (theoretically) possible to tell fsck only to replay
> half-finished and to knock out incomplete transactions from the journals,
> leaving the kernel to replay the good ones in its own time, possibly
> reducing downtime by a few minutes? Or might this break assumptions the
> kernel code makes? Or is it totally impossible and ridiculous? :)
> 
No, this is quite possible, but not desirable.

What happens at present is that recovery can replay data unnecessarily - it
will rewrite transactions which were in fact fully checkpointed at the
time of the crash.

But addressing this shortcoming would require that the ext3 commit phase
seek to the head of the journal to update the journal superblock each
time we've fully checkpointed a transaction.  Which would slow down
normal operation to gain a recovery-time speedup.  Which is a bad
tradeoff.

Possibly we could optimise this by putting additional information into
the journal commit blocks - record the highest known-to-be-committed
transaction ID within the commit block.  hmm.

I suggest that you ensure that you're getting the best possible
parallelism in the recovery, and perhaps experiment with smaller
journals.

Stephen C. Tweedie

2002-Feb-15 19:28 UTC

Re: ext3 fsck question

Hi,

On Fri, Feb 15, 2002 at 11:56:38AM +0000, mb/ext3@dcs.qmul.ac.uk wrote:
> After our big ext3 file server crashes, I notice the fsck spends some time
> replaying the journals (about 5-10 mins for all volumes on the server in
> question). I guess it must do this should you want to mount the volumes as
> ext2.

Yes, but 5-10 minutes is a long time.  How many volumes are there?
How large are the journals?  Can you not parallelise the fscks a bit?

> My question--is it (theoretically) possible to tell fsck only to replay
> half-finished and to knock out incomplete transactions from the journals,
> leaving the kernel to replay the good ones in its own time, possibly
> reducing downtime by a few minutes? Or might this break assumptions the
> kernel code makes? Or is it totally impossible and ridiculous? :)

Incomplete transactions are always ignored completely, both by kernel
and e2fsck recovery.  The replay is _always_ going to get done,
because if e2fsck doesn't do it, then the kernel will do exactly the
same thing when you try to mount the filesystems.  The kernel and
e2fsprogs actually share the same recovery.c file for that.
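
The rule that incomplete transactions are ignored can be shown with a toy model in Python. This is a sketch only, not the logic of the shared recovery.c: the record layout (`"block"` and `"commit"` tuples), the dictionary standing in for the disk, and the function name `replay` are all assumptions made for illustration.

```python
def replay(journal, disk):
    """Apply journalled block writes, but only for committed transactions.

    journal: list of ("block", tid, blockno, data) or ("commit", tid) records.
    disk:    dict mapping block number -> contents (toy stand-in for a device).
    """
    # A transaction counts only if its commit record reached the journal.
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    for rec in journal:
        if rec[0] == "block" and rec[1] in committed:
            _, _tid, blockno, data = rec
            disk[blockno] = data
    return disk


# Transactions 1 and 2 committed; transaction 3 was cut off by the crash.
journal = [
    ("block", 1, 10, "A"),
    ("commit", 1),
    ("block", 2, 10, "B"),
    ("block", 2, 11, "C"),
    ("commit", 2),
    ("block", 3, 12, "D"),
]
disk = {10: "old", 11: "old", 12: "old"}
once = replay(journal, dict(disk))
twice = replay(journal, dict(once))
print(once)
```

Block 12 keeps its old contents because transaction 3 never committed. Replaying a second time changes nothing, which is why rewriting a transaction that was already checkpointed costs time but does no harm.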

Unless Something Weird is happening, doing the recovery in fsck should
be better because you'll be able to parallelise the recovery on
different disks.  Using kernel recovery, recovery happens at mount
time, and mounts are typically done sequentially.
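
A rough way to see the difference, as a Python sketch with made-up numbers: sequential recovery costs the sum of all per-volume replay times, while recovery that is parallel across disks costs only as much as the slowest disk. The disk names and the timings below are hypothetical, not measurements from the server in question.

```python
def sequential_time(per_disk_seconds):
    """Total time when every volume is recovered one after another."""
    return sum(sum(times) for times in per_disk_seconds.values())


def parallel_time(per_disk_seconds):
    """Total time when disks run in parallel and the volumes on each
    disk are recovered one after another."""
    return max(sum(times) for times in per_disk_seconds.values())


# Hypothetical replay times in seconds, one list of volumes per disk.
timings = {"diskA": [120, 60], "diskB": [150], "diskC": [90]}
seq = sequential_time(timings)
par = parallel_time(timings)
print("sequential:", seq, "s  parallel:", par, "s")
```

For these figures the sequential case takes 420 s and the parallel case 180 s, bounded by the disk that holds two volumes.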

Cheers,
 Stephen
