thr3ads.net - Ext3 users - Check journal is replayable ? [Dec 2002]

John Vickers

2002-Dec-02 17:10 UTC

Check journal is replayable ?

Hello.

Is there a simple way, at a shell script level, of finding out whether an
ext3 fs has a sane journal, other than mounting it or running a full fsck ?

I may quite well be missing a few things here, but what I think I'd like is
some option extra to e2fsck that says "if this is a journalled filesystem,
and it was shut down uncleanly, just replay the journal and check for
immediately obvious problems, but don't bother scanning the whole filesystem
unless there's a '-f' in sight".

AFAICT, the usual way of handling ext3 filesystems seems to be to mark them
with fs_passno=0, so they never get fscked from the init scripts - but the
journal gets replayed, and a few things get checked at mount time.

If mount fails - because something horrible really did happen - then the
/etc/rc.sysinit doesn't seem to have any way of coping, or dropping to an
interactive shell.

John.

Andreas Dilger

2002-Dec-02 17:44 UTC

Re: Check journal is replayable ?

On Dec 02, 2002  17:10 +0000, John Vickers wrote:
> Is there a simple way, at a shell script level, of finding out whether an
> ext3 fs has a sane journal, other than mounting it or running a full fsck ?

Yes, "tune2fs -l <dev> | grep 'features:.*needs_recovery'", but reading
further you do not actually need it.
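
As a rough sketch, that one-liner can be wrapped for use in a script. The
function name and the device path are illustrative placeholders, not part
of e2fsprogs, and the check only reports the needs_recovery flag; it says
nothing about whether the filesystem is otherwise healthy.

```shell
#!/bin/sh
# Hypothetical helper: reads `tune2fs -l` output on stdin and succeeds
# (exit status 0) only if the feature list includes needs_recovery.
journal_needs_recovery() {
    grep -q 'features:.*needs_recovery'
}

# Example call; /dev/hdXN is a placeholder, not a real device:
#   tune2fs -l /dev/hdXN 2>/dev/null | journal_needs_recovery \
#       && echo "unclean shutdown: journal will be replayed"
```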

> I may quite well be missing a few things here, but what I think I'd like is
> some option extra to e2fsck that says "if this is a journalled filesystem,
> and it was shut down uncleanly, just replay the journal and check for
> immediately obvious problems, but don't bother scanning the whole filesystem
> unless there's a '-f' in sight".

That is how e2fsck already works, no need to change anything.  By default
it will replay the journal, and then check the superblock for errors.  If
no error is marked in the superblock, it is done in a second or so[*].

Just doing this with the above script isn't enough, since errors can also
be stored in the journal header in case of very serious errors, and the
un-recovered filesystem superblock will _appear_ to be fine, but the
filesystem really needs a full check.
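
For a boot script, the practical upshot is to run e2fsck without -f and act
on its exit status. Below is a minimal sketch, assuming the exit codes
documented in e2fsck(8): a bitmask in which 1 means errors were corrected,
2 means a reboot is advised, and 4 or higher means uncorrected errors or an
operational failure. The name fsck_verdict and the device path are
illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: map an e2fsck exit status to a coarse verdict.
fsck_verdict() {
    status=$1
    if [ "$status" -ge 4 ]; then
        echo "failed"      # uncorrected errors, operational or usage error
    elif [ $((status & 2)) -ne 0 ]; then
        echo "reboot"      # errors corrected, system should be rebooted
    else
        echo "ok"          # clean, or errors corrected automatically
    fi
}

# Example call; /dev/hdXN is a placeholder, not a real device:
#   e2fsck -p /dev/hdXN; fsck_verdict $?
```

A script could drop to an interactive shell on "failed" instead of going on
to mount the filesystem.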

[*] There is also a feature of ext2/3 that allows you to specify full
    filesystem checks after a certain number of mounts/time.  Some
    people turn this off in the mistaken thought that "it has a journal,
    I don't need no stinking fsck on my filesystems".  However, a journal
    is no protection against disk, memory, CPU, or kernel errors, so doing
    periodic full fscks can help find errors before they cause cascading
    data corruption on your filesystem, or get detected right in the middle
    of some important work and make your system unusable.  If you don't like
    the "every 20 mounts" full fsck, change it with "tune2fs -c" to be some
    longer interval.

> AFAICT, the usual way of handling ext3 filesystems seems to be to mark them
> with fs_passno=0, so they never get fscked from the init scripts - but the
> journal gets replayed, and a few things get checked at mount time.

That is just plain wrong, since it will skip full checking if there was an
error detected in the filesystem.

> If mount fails - because something horrible really did happen - then the
> /etc/rc.sysinit doesn't seem to have any way of coping, or dropping to an
> interactive shell.

That's why you should have passno != 0 for all ext3 filesystems, so that
e2fsck has a chance to check the superblock before the filesystem is
mounted.
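
To find entries that break this rule, something like the following could be
run against /etc/fstab. It is only a sketch: the function name is made up
for illustration, and it assumes the standard six-field fstab layout with
fs_passno as the sixth field (a missing field counts as 0).

```shell
#!/bin/sh
# Hypothetical helper: reads fstab-format lines on stdin and prints the
# mount point of every ext3 entry whose fs_passno is 0 or absent.
ext3_without_fsck() {
    awk '$1 !~ /^#/ && $3 == "ext3" && ($6 == "" || $6 == 0) { print $2 }'
}

# Example call:
#   ext3_without_fsck < /etc/fstab
```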

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

Stephen C. Tweedie

2002-Dec-02 17:50 UTC

Re: Check journal is replayable ?

On Mon, 2002-12-02 at 17:10, John Vickers wrote:
> Is there a simple way, at a shell script level, of finding out whether an
> ext3 fs has a sane journal, other than mounting it or running a full fsck ?

Define a "sane journal"?

The journal just contains copies of disk blocks.  It's nothing more than
a list of "here's a copy of the new block number FOO of the
filesystem."  And the journal is *supposed* to contain gaps after an
unexpected reboot --- it's by looking for missing bits that we work out
just how much of the journal did get successfully written out to disk
when things crashed.

In other words, the journal is really really dumb, and there's next to
no validation you can sensibly do on its contents without invoking a lot
of filesystem layout knowledge (and at that point you're into full fsck
territory.)

> AFAICT, the usual way of handling ext3 filesystems seems to be to mark them
> with fs_passno=0, so they never get fscked from the init scripts - but the
> journal gets replayed, and a few things get checked at mount time.

No, you should give them a valid pass number to force fsck to run, but
when fsck sees an ext3 filesystem needing recovery, it skips the full
check and just does the recovery stage.

You still want the fsck to run because in case of a filesystem error
being detected at run time, the kernel can mark the partition as having
an error, and the subsequent fsck can pick that up and force a full fsck
to fix it.  That mechanism fails if you set the pass number to zero.

You can disable forced fscks while preserving that error-recovery
behaviour by leaving the passno intact but setting the fsck mount-count
and check-intervals to zero with tune2fs.
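
A sketch of that tune2fs call, using the -c (maximum mount count) and -i
(check interval) options with 0 to disable each. The wrapper name, the
TUNE2FS override variable and the device path are illustrative assumptions
and not part of e2fsprogs.

```shell
#!/bin/sh
# Hypothetical wrapper: turn off the mount-count and time-based forced
# checks on one device. TUNE2FS is an illustrative override so the command
# can be previewed (e.g. TUNE2FS=echo) before touching a real filesystem.
disable_periodic_fsck() {
    "${TUNE2FS:-tune2fs}" -c 0 -i 0 "$1"
}

# Example call; /dev/hdXN is a placeholder, not a real device:
#   disable_periodic_fsck /dev/hdXN
```

The fstab pass number is left alone, so fsck still runs at boot and can act
on an error flag set by the kernel.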

Cheers,
 Stephen
