thr3ads.net - Ext3 users - ext3-2.4-0.9.4 [Jul 2001]

Norbert Preining

2001-Jul-26 07:34 UTC

Re: ext3-2.4-0.9.4

On Thu, 26 Jul 2001, Andrew Morton wrote:
> Ted has put out a prerelease of e2fsprogs-1.23 which supports
Where to get it? On the sourceforge page there is no prerelease.

Best wishes

Norbert

-- 
ciao
norb

+-------------------------------------------------------------------+
| Norbert Preining              http://www.logic.at/people/preining |
| University of Technology Vienna, Austria        preining@logic.at |
| DSA: 0x09C5B094 (RSA: 0xCF1FA165) mail subject: get [DSA|RSA]-key |
+-------------------------------------------------------------------+

Andrew Morton

2001-Jul-26 07:34 UTC

ext3-2.4-0.9.4

An update to the ext3 filesystem for 2.4 kernels is available at

	http://www.uow.edu.au/~andrewm/linux/ext3/

The diffs are against linux-2.4.7 and linux-2.4.6-ac5.

The changelog is there.  One rarely-occurring but oopsable bug
was fixed and several quite significant performance enhancements
have been made.  These are in addition to the performance fixes
which went into 0.9.3.

Ted has put out a prerelease of e2fsprogs-1.23 which supports
filesystem type `auto' in /etc/fstab, so it is now possible to
switch between ext3- and non-ext3-kernels without changing
any configuration.
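
One way to confirm which driver the kernel actually picked for an
`auto' entry is to look at /proc/mounts after boot.  The following is
only a minimal Python sketch: it assumes the usual whitespace-separated
/proc/mounts layout (device, mount point, type, options, ...), and the
function name and the sample text are made up for illustration.

```python
def mounted_fs_type(mount_point, mounts_text):
    """Return the filesystem type mounted at mount_point, or None.

    mounts_text is expected in /proc/mounts layout:
    device, mount point, type, options, dump, pass.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            # Later entries shadow earlier ones, so keep the last match.
            found = fields[2]
    return found


# Hypothetical sample; on a live system use open("/proc/mounts").read().
SAMPLE = """\
/dev/hda1 / ext3 rw 0 0
/dev/hda2 /home ext2 rw 0 0
"""

print(mounted_fs_type("/", SAMPLE))
print(mounted_fs_type("/home", SAMPLE))
```

If the reported type is ext2 on a kernel that was expected to use
ext3, the journal is not in use for that mount.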

It is recommended that users of earlier ext3 releases upgrade
to 0.9.4.

For people who are undertaking performance testing, it is perhaps
useful to point out that ext3 operates in one of three different
journalling modes, and that these modes have very different
functionality and very different performance characteristics.
Really, you need to test all three and balance the functionality
which each mode offers against the throughput which you obtain
in your application.


The modes are:

data=writeback

  This is classic metadata-only journalling.  File data is written
  back to the main fs lazily.  After a crash+recovery the fs's
  structural integrity is preserved, but the *contents* of files
  can and will contain old, stale data.  Potentially hundreds of
  megabytes of it.

  This is the fastest mode for normal filesystem applications.

data=ordered

  The fs ensures that file data is written into the main fs prior
  to committing its metadata.  Hence after a crash+recovery, your
  files will contain the correct data.

  This is the default operating mode and throughput is good. It
  adds about one second to a four minute kernel compile when
  compared with ext2.   Under heavier loads the difference
  becomes larger.

data=journal

  All data (as well as metadata) is written to the journal
  before it is released to the main fs for writeback.
  
  This is a specialised mode - for normal fs usage you're better
  off using ordered data, which has the same benefits of not corrupting
  data after crash+recovery.  However for applications which require
  synchronous operation such as mail spools and synchronously exported
  NFS servers, this can be a performance win.  I have seen dbench
  figures in this mode (where the files were opened O_SYNC) running
  at ten times the throughput of ext2.  Not that this is the expected
  benefit for other applications!
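
For anyone comparing the three modes on their own workload, a rough
timing harness along the following lines may be a starting point.  It
is a sketch, not a substitute for dbench: the function names, file
names and default sizes are arbitrary choices, and it assumes a
platform where Python exposes os.O_SYNC (Linux does).  The idea is to
run it against a directory on the filesystem under test, once for each
data= mount option.

```python
import os
import tempfile
import time


def write_all(fd, data):
    """Write every byte of data to fd, retrying on short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def time_small_writes(directory, count=100, size=4096, use_osync=False):
    """Create count files of size bytes in directory; return elapsed seconds."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if use_osync:
        flags |= os.O_SYNC  # synchronous writes, as in the O_SYNC dbench run
    payload = b"x" * size
    start = time.monotonic()
    for i in range(count):
        fd = os.open(os.path.join(directory, "bench-%d.dat" % i), flags, 0o600)
        try:
            write_all(fd, payload)
        finally:
            os.close(fd)
    return time.monotonic() - start


if __name__ == "__main__":
    # The temporary directory is a stand-in; point this at a directory on
    # the filesystem under test, remounted with each data= mode in turn.
    with tempfile.TemporaryDirectory() as scratch:
        for osync in (False, True):
            elapsed = time_small_writes(scratch, use_osync=osync)
            print("O_SYNC=%s: %.3f s" % (osync, elapsed))
```

Without O_SYNC the numbers mostly reflect the page cache, so the
synchronous run is the one that says something about the journalling
mode.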


Looking at the above issues, one may initially think that the
post-recovery data corruption is a serious issue with writeback mode,
and that there are big advantages to using journalled or ordered data.

However, even in these modes the affected files may be shorter-than-expected
after recovery, because the app hadn't finished writing them yet.  And
usually, a truncated file is just as useless as one which contains
garbage - it needs to be deleted.

It's not really as simple as that - for small (< a few hundred k) files,
it tends to be the case that either the whole file is intact after a crash,
or none of it is.  This is because the journalling mechanism starts a
new transaction every five seconds, and a typical open/write/close operation
usually fits entirely inside this window.

There is also a security issue to be considered: a recovered writeback-mode
filesystem will expose other people's old data to unintended recipients.


Hopefully this description will help people make their deployment choices.
If not, assistance is available on the ext3-users@redhat.com mailing list.

Norbert Preining

2001-Jul-26 07:37 UTC

Re: ext3-2.4-0.9.4

On Thu, 26 Jul 2001, Norbert Preining wrote:
> Where to get it? On the sourceforge page there is no prerelease.
Forget it, found it ;-)

-- 
ciao
norb

+-------------------------------------------------------------------+
| Norbert Preining              http://www.logic.at/people/preining |
| University of Technology Vienna, Austria        preining@logic.at |
| DSA: 0x09C5B094 (RSA: 0xCF1FA165) mail subject: get [DSA|RSA]-key |
+-------------------------------------------------------------------+

Matthias Andree

2001-Jul-26 11:08 UTC

Re: ext3-2.4-0.9.4

On Thu, 26 Jul 2001, Andrew Morton wrote:
> data=journal
> 
>   All data (as well as metadata) is written to the journal
>   before it is released to the main fs for writeback.
>   
>   This is a specialised mode - for normal fs usage you're better
>   off using ordered data, which has the same benefits of not corrupting
>   data after crash+recovery.  However for applications which require
>   synchronous operation such as mail spools and synchronously exported
>   NFS servers, this can be a performance win.  I have seen dbench
In ordered and journal mode, are meta data operations, namely creating a
file, rename(), link(), unlink() "synchronous" in the sense that after
the call has returned, the effect of this call is never lost, i. e., if
link(2) has returned and the machine crashes immediately, will the next
recovery ALWAYS recover the link?

Or will ext3 still need chattr +S?

Does it still support chattr +S at all?

Synchronous meta data operations are crucial for mail transfer agents
such as Postfix or qmail. Postfix has up until now been setting
chattr +S /var/spool/postfix, making original (esp. soft-updating) BSD
file systems significantly faster for data (payload) writes in this
directory than ext2.

Note: I'm not on the ext3-users list. Please Cc: back replies.

-- 
Matthias Andree

Andrew Morton

2001-Jul-26 11:42 UTC

Re: ext3-2.4-0.9.4

Matthias Andree wrote:
>
> On Thu, 26 Jul 2001, Andrew Morton wrote:
> 
> > data=journal
> >
> >   All data (as well as metadata) is written to the journal
> >   before it is released to the main fs for writeback.
> >
> >   This is a specialised mode - for normal fs usage you're better
> >   off using ordered data, which has the same benefits of not corrupting
> >   data after crash+recovery.  However for applications which require
> >   synchronous operation such as mail spools and synchronously exported
> >   NFS servers, this can be a performance win.  I have seen dbench
> 
> In ordered and journal mode, are meta data operations, namely creating a
> file, rename(), link(), unlink() "synchronous" in the sense that after
> the call has returned, the effect of this call is never lost, i. e., if
> link(2) has returned and the machine crashes immediately, will the next
> recovery ALWAYS recover the link?
No, they're not synchronous by default.  After recovery they
will either be wholly intact, or wholly absent.
> Or will ext3 still need chattr +S?
Yes, if the app doesn't support O_SYNC or fsync().  I believe
that MTAs *do* support those things.
> Does it still support chattr +S at all?
Yes.
> Synchronous meta data operations are crucial for mail transfer agents
> such as Postfix or qmail. Postfix has up until now been setting
> chattr +S /var/spool/postfix, making original (esp. soft-updating) BSD
> file systems significantly faster for data (payload) writes in this
> directory than ext2.
If postfix is capable of opening the files O_SYNC or of doing
fsync() on them then the `chattr +S' is no longer necessary - unlike
ext2, when the O_SYNC write() or the fsync() return, the directory
contents (as well as the inode, bitmaps, data, etc) will all be tight on
disk and will be restored after a crash.
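
The fsync() variant of this could look roughly like the sketch below.
It is an illustration only and not how Postfix or qmail are actually
implemented; the function name, the queue file name and the message
are placeholders.  The file is created exclusively, written in full,
and the function returns only once fsync() has succeeded.

```python
import os
import tempfile


def deliver_to_spool(path, message):
    """Create path, write message, and return only after fsync() succeeds."""
    # O_EXCL: fail rather than overwrite an existing queue file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(message)
        while view:
            written = os.write(fd, view)  # may be a short write
            view = view[written:]
        os.fsync(fd)  # block until the kernel reports the file as stable
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Placeholder spool directory; a real MTA would use its queue directory.
    with tempfile.TemporaryDirectory() as spool:
        target = os.path.join(spool, "msg.0001")
        deliver_to_spool(target, b"Subject: test\n\nhello\n")
        print(os.path.getsize(target))
```

Only after deliver_to_spool() returns without raising would the
caller acknowledge the message to the sender.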

This should speed things up considerably, especially with journalled-data
mode.  I need to test and characterise this some more to come up with some
quantitative results and configuration recommendations.

BTW, if you have more-than-modest throughput requirements, don't
even *think* of mounting the fs with `mount -o sync'. Our performance
in this mode is terrible :(

I have a hack somewhere which fixes this as much as it can be fixed, but
I didn't even bother committing it.  It's feasible, but tiresome.

A better solution is to fix some lock inversion problems in the core
kernel which prevent optimal implementation of data-journalling
filesystems.  I don't really expect this to occur medium-term or ever.

A middle-ground solution may be to add an fs-private `osync' mount
option, so all files are treated similarly to O_SYNC, which would
work well.

Philipp Matthias Hahn

2001-Jul-30 06:37 UTC

Re: ext3-2.4-0.9.4

On Thu, 26 Jul 2001, Andrew Morton wrote:
> An update to the ext3 filesystem for 2.4 kernels is available at
>
> 	http://www.uow.edu.au/~andrewm/linux/ext3/
I'm using ext3-0.9.4 with linux-2.4.7 / 2.4.8-pre1 and get some hangs on
my dual P2-350:

From time to time I will have multiple CRON-Daemons in D-state and login
hangs when logging in. It even happens during boot before my MTA is
started.

I have a single ext3 partition which is exported by kernel-nfs-server.

As soon as I do an Alt-SysRq-S forced sync the hang goes away and
everything works normally.

If you need further information send me an e-mail. SGI's kdb is already
compiled in so if we need it ...

BYtE
Philipp
-- 
  / /  (_)__  __ ____  __ Philipp Hahn
 / /__/ / _ \/ // /\ \/ /
/____/_/_//_/\_,_/ /_/\_\ pmhahn@titan.lahn.de
