On 2009-12-17, at 15:25, Bernd Schubert wrote:
> I'm presently a bit puzzled about asynchronous journal patches.
>
> While I was just reading the jbd-journal-chksum-rhel53.patch patch,
> I noticed it also adds a new option and feature
> "journal_async_commit".
>
> But then ever since lustre-1.8.0 there is also a patch included for
> async journals from obdfilter. This patch is presently disabled, since
> it could cause data corruption on fail over.
>
> I now wonder how these two patches/features are related, so
> jbd/ldiskfs/ext4 (journal_async_commit) vs. obdfilter
> (obdfilter.*.sync_journal=0).
These are two completely independent changes, though they have
confusingly similar names.
The jbd-journal-checksum patch adds a generic feature to the JBD
transaction commit that avoids one synchronous operation per
transaction commit. Originally, JBD would need to write out all of
the transaction data blocks, sync them to disk, then submit the
transaction commit block and sync it to disk before the transaction
could be considered as committed.
The addition of a checksum to the transaction commit block allows the
journal replay code to determine for itself whether all of the
transaction data blocks reached disk before the commit block, and to
replay or discard the transaction accordingly. If the
journal_async_commit feature is enabled on the journal, then JBD will
skip this pre-commit-block sync. However, this is not enabled by
default, since there was a problem found in the upstream ext4 code due
to blocks being modified outside of the journal and causing checksum
failures during replay.
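
The replay decision described above can be sketched roughly as follows. This is a toy model, not the JBD code: the CommitBlock class, the helper names, and the use of zlib's CRC32 chained over the blocks are assumptions made for illustration only.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass
class CommitBlock:
    """Hypothetical commit record: transaction ID plus a checksum of its blocks."""
    tid: int
    checksum: int


def checksum_blocks(blocks: List[bytes]) -> int:
    """Chain a CRC32 over every block of the transaction, in order."""
    crc = 0
    for block in blocks:
        crc = zlib.crc32(block, crc)
    return crc


def make_commit(tid: int, blocks: List[bytes]) -> CommitBlock:
    """Build the commit block at commit time, without syncing the blocks first."""
    return CommitBlock(tid, checksum_blocks(blocks))


def should_replay(commit: CommitBlock, blocks_on_disk: List[bytes]) -> bool:
    """At replay, accept the transaction only if the blocks match the checksum."""
    return checksum_blocks(blocks_on_disk) == commit.checksum


blocks = [b"inode table update", b"bitmap update"]
commit = make_commit(7, blocks)

# Case 1: every block reached disk before the crash.
intact = should_replay(commit, blocks)

# Case 2: the last block was only partly written (its final byte is stale).
stale = blocks[1][:-1] + b"\x00"
torn = should_replay(commit, [blocks[0], stale])
```

In this sketch `intact` is True and `torn` is False, so only the first transaction would be replayed. It also suggests why blocks modified outside of the journal are a problem: any change to a block after the checksum is computed makes a valid transaction look torn.
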
The Lustre async journal commit essentially lets Lustre clients
submit write requests to the OST without requiring the server to do
that IO synchronously. Instead, the client
will keep a copy of the data in cache until it gets a commit
notification from the OST, and will rewrite the data if the OST
crashes. This allows a single client to submit a large number of
writes without having to commit the journal transaction. This feature
is still under testing, so it is not enabled by default, and a bug was
fixed in 1.8.2 related to recovery.
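
A minimal sketch of that client-side behaviour, assuming a toy in-memory server, might look like this. The Ost and Client classes, the transaction numbering, and the method names are hypothetical and do not reflect the actual Lustre protocol or code.

```python
from typing import Dict


class Ost:
    """Toy server: acknowledges writes at once and commits them later."""

    def __init__(self) -> None:
        self.last_committed = 0             # highest transaction number on disk
        self.next_transno = 1
        self.uncommitted: Dict[int, bytes] = {}
        self.disk: Dict[int, bytes] = {}

    def write(self, data: bytes) -> int:
        """Accept a write without forcing a journal commit."""
        transno = self.next_transno
        self.next_transno += 1
        self.uncommitted[transno] = data
        return transno

    def commit(self) -> int:
        """Flush everything accepted so far and report the commit point."""
        self.disk.update(self.uncommitted)
        self.uncommitted.clear()
        self.last_committed = self.next_transno - 1
        return self.last_committed

    def crash(self) -> None:
        """Lose all writes that were acknowledged but never committed."""
        self.uncommitted.clear()
        self.next_transno = self.last_committed + 1


class Client:
    """Toy client: keeps a copy of each write until the server commits it."""

    def __init__(self, ost: Ost) -> None:
        self.ost = ost
        self.pending: Dict[int, bytes] = {}  # transno -> cached data

    def write(self, data: bytes) -> None:
        self.pending[self.ost.write(data)] = data

    def on_commit(self, last_committed: int) -> None:
        """Drop cached copies covered by the commit notification."""
        for transno in [t for t in self.pending if t <= last_committed]:
            del self.pending[transno]

    def replay(self) -> None:
        """After a server crash, resend every write still held in cache."""
        lost = [self.pending[t] for t in sorted(self.pending)]
        self.pending.clear()
        for data in lost:
            self.write(data)


ost = Ost()
client = Client(ost)
client.write(b"a")
client.write(b"b")
client.on_commit(ost.commit())   # a and b are on disk, cache is released
client.write(b"c")               # acknowledged but not yet committed
ost.crash()                      # the server loses c
client.replay()                  # the client resends c from its cache
client.on_commit(ost.commit())
recovered = sorted(ost.disk.values())
```

Here `recovered` ends up holding all three writes even though the server crashed before committing the last one, because the client never released its copy before the commit notification arrived.
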
> When I did some tests with lctl set_param
> obdfilter.*.sync_journal=0, it even slightly reduced performance. So
> I wonder if one additionally needs to enable jbd-async journals?
This is only expected to improve performance when a small number of
clients are doing IO to each OST. Once there are many clients doing
IO at the same time, there is enough IO per transaction that the
commit does not noticeably affect performance.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.