thr3ads.net - Ocfs2 devel - [Ocfs2-devel] A patch to improve the metadata reading throughput (against svn1267) [Jul 2004]

Ling, Xiaofeng

2004-Jul-20 17:08 UTC

[Ocfs2-devel] A patch to improve the metadata reading throughput (against svn1267)

>-----Original Message-----
>Another thing that's on the list which you might be interested in looking at
>is not sending all lock release messages. Some of them do basically nothing
>on the other end in process_vote, so there's really no reason to send them
>to the nodes at all. This should help a lot when you've batched up a ton of
>locks to release in commit_cache.

Now, in our patch, the release message notifies the other node
to throw away its metadata caches, so these messages are no longer no-ops.
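
A minimal sketch of the filtering idea under discussion: skip a release message only when the receiving node would have nothing to do with it. The struct, its fields, and both function names are hypothetical illustrations, not OCFS2 code; in particular, tracking per lock whether the peer caches metadata is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one pending lock release (not an OCFS2 type). */
struct pending_release {
    unsigned long lock_id;
    bool peer_has_cached_metadata; /* assumption: tracked per lock */
    bool peer_has_other_work;      /* e.g. the peer must update lock state */
};

/* A release message is worth sending only if the peer will act on it. */
static bool release_needs_message(const struct pending_release *r)
{
    return r->peer_has_cached_metadata || r->peer_has_other_work;
}

/* Count how many messages a batch would send once no-ops are filtered out. */
static size_t count_release_messages(const struct pending_release *batch,
                                     size_t n)
{
    size_t sent = 0;

    for (size_t i = 0; i < n; i++)
        if (release_needs_message(&batch[i]))
            sent++;
    return sent;
}
```

With cache invalidation riding on the release message, a release can be dropped only when the peer holds no cached metadata for that lock.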
>So are you planning to turn off immediate checkpointing for all the other
>journal transactions? This is also on the list :) The only one that *may* be
>troublesome I believe is truncate. Otherwise, the ones that are left are:
>link, symlink, and rename.

Yes, immediate checkpointing is the main reason we found for the
low performance of these operations.
>> 4. readdir() may get old data after the data is written back to disk in
>> journal asynchronously. It is not a bug. But which way is better, sync
>> the new data to disk when other nodes notify READONLY message or just
>> let them get old data?
>No, we consider it a bug :)  The other nodes should be getting up to date
>directory contents.

Now, in our patch, the release message is sent asynchronously in the journal,
so until it is sent we can consider the write unfinished. That is why we think
this behavior is acceptable and not a bug. Of course, resolving it is also OK.

Index: src/journal.c
===================================================================
--- src/journal.c	(revision 1267)
+++ src/journal.c	(working copy)
@@ -148,6 +148,8 @@
 	}
 	spin_unlock(&journal->cmt_lock);
 
+	if (osb->needs_flush)
+		ocfs_sync_blockdev(osb->sb);

>Is this necessary? It seems awfully heavy, and since we journal *all*
>metadata (so it should be synced up to disk via the journal_flush just a
>couple lines above that), I don't see the point... I was actually meaning to
>take the other call to sync_blockdev out as it's never used :)

We added this only because we found that sometimes a newly created
directory cannot be seen from another node, but with this change it can
always be seen. It seems that some buffers in the block device's cache list
are not flushed to disk after journal_flush.
And after the lock release message is sent, the metadata cache on
another node cannot be thrown away any more. So we must ensure all data is
synced to disk on this node before sending the message.
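
The ordering argued for here can be sketched as a small userspace simulation: flush the journal, optionally sync the block device, and only then send the release. Only the `needs_flush` flag comes from the patch; the `trace` type, the step names, and `commit_and_release` are hypothetical and merely record the order of operations.

```c
#include <stdbool.h>

/* Hypothetical step names; they only label the order of operations. */
enum step { STEP_JOURNAL_FLUSH, STEP_BLOCKDEV_SYNC, STEP_SEND_RELEASE };

#define MAX_STEPS 8

struct trace {
    enum step steps[MAX_STEPS];
    int len;
};

static void record(struct trace *t, enum step s)
{
    if (t->len < MAX_STEPS)
        t->steps[t->len++] = s;
}

/*
 * Everything that must reach disk is written before the release message
 * goes out, because the peer drops its cache when it receives the message
 * and rereads from disk afterwards.
 */
static void commit_and_release(struct trace *t, bool needs_flush)
{
    record(t, STEP_JOURNAL_FLUSH);
    if (needs_flush)
        record(t, STEP_BLOCKDEV_SYNC); /* the extra sync added by the patch */
    record(t, STEP_SEND_RELEASE);
}
```

Whether the extra block device sync is needed at all is exactly what the thread disputes; the sketch only shows that the release is always the last step.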

Mark Fasheh

2004-Jul-22 23:46 UTC

On Wed, Jul 21, 2004 at 05:58:38AM +0800, Ling, Xiaofeng wrote:
> --- src/journal.c	(revision 1267)
> +++ src/journal.c	(working copy)
> @@ -148,6 +148,8 @@
>  	}
>  	spin_unlock(&journal->cmt_lock);
>  
> +	if (osb->needs_flush)
> +		ocfs_sync_blockdev(osb->sb);
> 
> >Is this necessary? It seems awfully heavy, and since we journal *all*
> >metadata (so it should be synced up to disk via the journal_flush just a
> >couple lines above that), I don't see the point... I was actually meaning to
> >take the other call to sync_blockdev out as it's never used :)
> 
> We added this just because we found that some times we can not see the
> new created directory
> from another node, but by adding this, we can always see. Seems
> some buffer in block device's cache list are not flushed to disk after
> journal_flush.

Actually, the bug (which I just fixed) was that we weren't telling the other
node to wait *on* the journal flush for a busy directory. Once I fixed it, I
haven't had any directory contents consistency issues. See svn revision 1302
for the patch.
> And after the lock release message is sent, the meta data cache on
> another node can not be
> throw away any more. So we must ensure all data is synced to disk on this
> node before sending message.

Again, JBD handles this for us via journal_flush. If you're seeing metadata
inconsistency it's much more likely that it's a DLM bug (in the case of the
readdir) or a caching issue.
	--Mark

--
Mark Fasheh
Software Developer, Oracle Corp
mark.fasheh@oracle.com

Ling, Xiaofeng

2004-Jul-23 18:41 UTC

>-----Original Message-----
>From: Mark Fasheh [mailto:mark.fasheh@oracle.com]
>Sent: July 22, 2004 21:46
>To: Ling, Xiaofeng
>Cc: Zhang, Sonic; Fu, Michael; Yang, Elton; Ocfs2-Devel
>Subject: Re: [Ocfs2-devel] A patch to improve the metadata
>reading throughput (against svn1267)
>Actually, the bug (which I just fixed) was that we weren't telling the other
>node to wait *on* the journal flush for a busy directory. Once I fixed it, I
>haven't had any directory contents consistency issues. See svn revision 1302
>for the patch.

Yes, I've run the test, and it's OK now.
So currently, when a node gets a READONLY message, it flushes the journal
and then answers it. Is my description correct?
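
If that description holds, the receiving side can be sketched as below. This is a userspace illustration with hypothetical names (`node_state`, `flush_journal`, `handle_readonly_request`), not the code from svn revision 1302, and the thread does not confirm that the real handler is shaped this way.

```c
#include <stdbool.h>

/* Hypothetical per-node state: transactions not yet flushed to disk. */
struct node_state {
    int uncommitted_txns;
    int flush_count;
};

/* Hypothetical reply; records whether the journal was clean when sent. */
struct reply {
    bool granted;
    bool journal_clean_at_reply;
};

/* Stand-in for a journal flush: after it, nothing is left uncommitted. */
static void flush_journal(struct node_state *n)
{
    n->uncommitted_txns = 0;
    n->flush_count++;
}

/* Flush first, answer second, so the requester reads current data. */
static struct reply handle_readonly_request(struct node_state *n)
{
    struct reply r;

    flush_journal(n);
    r.granted = true;
    r.journal_clean_at_reply = (n->uncommitted_txns == 0);
    return r;
}
```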
