Hi ocfs2 guys,
Actually, ocfs2 is not doomed - far from it! I just wanted to grab a few
more of those eyeballs that normally glaze over each time they see a
cluster mail run by on lkml[1]. Today's news: you are still not finished
with global lock systime badness, and I already have more performance bugs
for you.
For interested observers, the story so far:
A couple of days ago I reported a bug to the ocfs2 team via irc[2]. If I
untar a kernel tree several times, each into its own directory, ocfs2
system time (already four times higher than ext3 to start with) increases
linearly on each successive iteration, up to a very silly amount where it
finally stabilizes.
This is with ocfs2 running uniprocessor on a single node with a local disk.
The same performance bug also bites on a cluster of course.
In a dazzling display of developer studlyness, the ocfs2 team found the hash
table bottleneck while I was still trying to figure out how to make oprofile
behave.
http://sosdg.org/~coywolf/lxr/source/fs/ocfs2/dlm/dlmdomain.c?v=2.6.16-rc3#L102
So the ocfs2 guys switched from double word to single word hash table heads
and quadrupled the size of the hash table. Problem solved, right? No!
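For anybody not fluent in list.h, here is a minimal userspace sketch (mine,
not theirs, and PAGE_SIZE is assumed to be 4K) of why the head type matters:

/*
 * Minimal sketch, not from the ocfs2 patch: a doubly linked list_head
 * costs two pointers per bucket while an hlist_head costs one, so the
 * same page of memory holds twice as many hash buckets.
 */
#include <stdio.h>

struct list_head  { struct list_head *next, *prev; };	/* double word */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };	/* single word */

int main(void)
{
	unsigned long page = 4096;	/* assumed PAGE_SIZE */

	printf("list_head buckets per page:  %lu\n", page / sizeof(struct list_head));
	printf("hlist_head buckets per page: %lu\n", page / sizeof(struct hlist_head));
	return 0;
}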
I got the patch from Git (note to Gitweb devs: could "plain" please be
_really plain_ just like marc's "raw" instead of colorless html?).
While system time is indeed reduced, it still sucks chamber pot juice:
CPU: P4 / Xeon, speed 2793.41 MHz (estimated)
samples   %        image name       app name         symbol name
-------------------------------------------------------------------------------
4399778   64.0568  libbz2.so.1.0.2  libbz2.so.1.0.2  (no symbols)
  4399778  100.000  libbz2.so.1.0.2  libbz2.so.1.0.2  (no symbols) [self]
-------------------------------------------------------------------------------
1410173   20.5308  vmlinux          vmlinux          __dlm_lookup_lockres
  1410173  100.000  vmlinux          vmlinux          __dlm_lookup_lockres [self]
-------------------------------------------------------------------------------
82840      1.2061  oprofiled        oprofiled        (no symbols)
  82840    100.000  oprofiled        oprofiled        (no symbols) [self]
-------------------------------------------------------------------------------
68368      0.9954  vmlinux          vmlinux          ocfs2_local_alloc_count_bits
  68368    100.000  vmlinux          vmlinux          ocfs2_local_alloc_count_bits [self]
-------------------------------------------------------------------------------
47639      0.6936  tar              tar              (no symbols)
  47639    100.000  tar              tar              (no symbols) [self]
-------------------------------------------------------------------------------
  1          0.0488  vmlinux        vmlinux          ide_build_sglist
  6          0.2925  vmlinux        vmlinux          ide_do_request
  70         3.4130  vmlinux        vmlinux          __ide_dma_test_irq
  128        6.2409  vmlinux        vmlinux          ata_vlb_sync
  134        6.5334  vmlinux        vmlinux          proc_ide_read_imodel
  147        7.1672  vmlinux        vmlinux          ide_dma_verbose
  1565      76.3042  vmlinux        vmlinux          ide_do_rw_disk
20390      0.2969  vmlinux          vmlinux          ide_dma_speed
  20390    100.000  vmlinux          vmlinux          ide_dma_speed [self]
-------------------------------------------------------------------------------
Hash probe cleanup and hand optimization
----------------------------------------
I included two patches below. The first vmallocs a megabyte hash table
instead of a single page. This definitively squishes away the global
lock hash table overhead so that systime stays about twice as high as
ext3 instead of starting at three times and increasing to four times
with the (new improved) 4K hash table.
Re vmalloc... even with the extra TLB load, this is better than
kmallocing a smaller vector and running the very high risk that the
allocation will fail or be forced to fall back because of our lame
fragmentation handling. Of course the Right Thing To Do is kmalloc a
vector of pages and mask the hash into it, but as a lazy person I
would rather sink the time into writing this deathless prose.
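For the terminally curious, here is roughly what I have in mind, as a
sketch only -- the names (dlm_hash_pages, dlm_lockres_bucket,
dlm_alloc_hash) are made up and none of this is in the patches below:

#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/list.h>

#define DLM_HASH_PAGES		(DLM_HASH_SIZE / PAGE_SIZE)
#define DLM_BUCKETS_PER_PAGE	(PAGE_SIZE / sizeof(struct hlist_head))

/* one separately allocated page of buckets per entry */
static struct hlist_head *dlm_hash_pages[DLM_HASH_PAGES];

static struct hlist_head *dlm_lockres_bucket(unsigned int hash)
{
	unsigned int bucket = hash % (DLM_HASH_PAGES * DLM_BUCKETS_PER_PAGE);

	/* high bits pick the page, low bits pick the bucket within it */
	return dlm_hash_pages[bucket / DLM_BUCKETS_PER_PAGE] +
	       bucket % DLM_BUCKETS_PER_PAGE;
}

static int dlm_alloc_hash(void)
{
	int i, j;

	/* single page allocations: no vmalloc, no high-order requests */
	for (i = 0; i < DLM_HASH_PAGES; i++) {
		dlm_hash_pages[i] = (struct hlist_head *)__get_free_page(GFP_KERNEL);
		if (!dlm_hash_pages[i])
			goto oom;
		for (j = 0; j < DLM_BUCKETS_PER_PAGE; j++)
			INIT_HLIST_HEAD(dlm_hash_pages[i] + j);
	}
	return 0;
oom:
	while (i--)
		free_page((unsigned long)dlm_hash_pages[i]);
	return -ENOMEM;
}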
The second patch is a rewrite of __dlm_lookup_lockres to get rid of the
bogus clearing of tmpres on every loop iteration. I also threw in a
little hand optimization, which gcc really ought to be doing but isn't.
Comparing the first character of the string before doing the full
memcmp seems to help measurably. I also tried to provide some more
optimization guidance for the compiler, though I do fear it will fall
on deaf ears. Finally, I cleaned up some stylistic oddities[3]. I
also left one screaming lindent violation intact. Executive summary:
could you please install some editors that show whitespace? And what
is it with this one line per function parameter? After a few edits
of the function name you end up with a mess anyway, so why not do as
the Romans do and stick to lindent style?
The __dlm_lookup_lockres rewrite shows a measurable improvement in
hash chain traversal overhead. Every little bit helps. I failed to
quantify the improvement precisely because there are other glitchy
things going on that interfere with accurate measurement. So now we
get to...
How fast is it now?
-------------------
Here is ocfs2 from yesterday's git:
                    Untar                        Sync
   Trial    Real    User    Sys         Real    User    Sys
     1.    31.42   24.90   3.81
     2.    31.97   25.15   5.03         (similar to below)
     3.    94.72   25.29   5.79
     4.    26.27    0.28   4.43
     5.    34.24   25.21   6.41
     6.    50.71   25.26   7.23
     7.    33.73   25.44   6.83
Here is what happens with the shiny new patches:
                    Untar                        Sync
   Trial    Real    User    Sys         Real    User    Sys
     1.    34.12   24.92   3.17        46.70    0.00   0.17
     2.    29.63   24.96   3.36        47.25    0.00   0.16
     3.   141.18   25.06   3.11        28.43    0.00   0.12
     4.    45.20   25.38   3.07        49.04    0.00   0.18
     5.   160.70   25.11   3.24        21.74    0.00   0.07
     6.    31.65   24.99   3.34        46.30    0.00   0.19
     7.   145.45   25.28   3.16        27.26    0.00   0.12
     8.    30.87   25.16   3.29        44.60    0.00   0.20
     9.    58.31   25.32   3.20        22.82    0.00   0.09
    10.    31.01   25.29   3.09        43.93    0.00   0.22
Things to notice: with the still-teensy hash table, systime for global
lock lookups is still way too high, eventually stabilizing at about four
times ext3's systime once lock purging kicks in. Not good (but indeed,
not as excruciatingly painful as two days ago).
With the attached patches we kill the hash table overhead dead, dead,
dead. But look at real time for the untar and sync!
A) We have huge swings in idle time during the untar
B) We have great whacks of dirty memory sitting around after the
untar
So that brings us to...
More performance bugs
---------------------
Ocfs2 starts little or no writeout during the untar itself. Ext3 does.
Solution: check ext3 to see what secret incantations Andrew conjured up
to fiddle it into doing something reasonable.
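If memory serves, the main incantation is not even a secret: ext3 simply
rides generic_file_write, which calls balance_dirty_pages_ratelimited() on
every page it dirties, so background writeout kicks in early. Something of
this flavor in the ocfs2 write path (the helper name below is made up, this
is a sketch and not a patch) ought to get the disk moving during the untar:

#include <linux/fs.h>
#include <linux/writeback.h>

/*
 * Sketch only: after dirtying each page, throttle the writer the same
 * way generic_file_write does, so dirty memory gets pushed out during
 * the untar instead of piling up for sync to deal with.
 */
static void ocfs2_throttle_dirty_sketch(struct address_space *mapping)
{
	balance_dirty_pages_ratelimited(mapping);
}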
Ocfs2 sometimes sits and gazes at its navel for minutes at a time, doing
nothing at all. Timer bug? A quick glance at SysRq-t shows ocfs2 waiting
in io_schedule. Waiting for io that never happens? This needs more
investigation.
You also might want to look at finer grained lock purging instead of
the current burn-the-inbox approach. One can closely approximate the
performance of an LRU without paying the penalty of a separate per-lock
LRU field by treating each hash chain as an LRU list. Here is the
equation: either the hash chains are short, in which case you don't care
about poor LRU behavior because you are not purging locks, or they are
long and you do care. Luckily, the longer the hash chains get, the more
closely they approximate the behavior of a dedicated LRU list. Amazing.
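In concrete terms, all that takes is move-to-front on each successful
lookup; the tail end of each chain then collects the coldest locks for the
purge pass to eat. A sketch (the helper is hypothetical and unbenchmarked):

#include <linux/list.h>

/*
 * Sketch: approximate LRU by keeping each hash chain in recency order.
 * Touch a resource on every successful lookup; the purge pass can then
 * prefer entries toward the tail of each chain.  No extra per-lock LRU
 * field required.
 */
static void dlm_lockres_touch(struct dlm_lock_resource *res,
			      struct hlist_head *bucket)
{
	if (bucket->first != &res->hash_node) {		/* not already hottest */
		hlist_del(&res->hash_node);
		hlist_add_head(&res->hash_node, bucket);
	}
}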
Delete performance remains horrible, even with a 256 meg journal[4] which
is unconscionably big anyway. Compare to ext3, which deletes kernel trees
at a steady 2 seconds per, with a much smaller journal. Ocfs2 takes more
like a minute.
I would chase down these glitches too, except that I really need to get
back to my cluster block devices, without which we cannot fully bake
our cluster cake. After all, there are six of you and only one of me, so I
had better go work on the thing nobody else is working on.
[1] People would be much more interested in ocfs2 if they could get it
running in less than a day, right? Well, you can, or at least you could
if Oracle's startup scripts were ever so slightly less user hostile. For
example, how about scripts/ocfs2 instead of vendor/common/o2cb.init? And
it would be nice to expunge the evil of using lsmod to see if a subsystem
is available. Also, an actual shared disk is not recommended for your
first run. That way you avoid the pain of iscsi, nbd, fibre channel
et al. (At least for a little while.) Once you get the init script
insanity under control, ocfs2 comes up just like a normal filesystem on
each node: mount -t ocfs2.
[2] It is high time we pried loose the ocfs2 design process from secret
irc channels and dark tunnels running deep beneath Oracle headquarters,
and started sharing the squishy goodness of filesystem clustering
development with some more of the smart people lurking here.
[3] Aha! I hear you cry, he wrote an 83 character line! Actually, I am
going to blame the person who invented the name DLM_HASH_BUCKETS. Can
you spot the superfluous redundancy? (Hint: name three other kinds of
kernel objects that have buckets.)
[4] Could you please have fsck.ocfs2 report the size of the journal?
Patches apply to current linux-git.
Regards,
Daniel
[PATCH] Vmalloc ocfs2 global lock hash table and make it way bigger

Signed-off-by: Daniel "the wayward" Phillips <phillips at google.com>
diff -urp 2.6.16.clean/fs/ocfs2/dlm/dlmcommon.h 2.6.16/fs/ocfs2/dlm/dlmcommon.h
--- 2.6.16.clean/fs/ocfs2/dlm/dlmcommon.h	2006-03-02 19:05:13.000000000 -0800
+++ 2.6.16/fs/ocfs2/dlm/dlmcommon.h	2006-03-03 09:52:54.000000000 -0800
@@ -36,8 +36,8 @@
 #define DLM_LOCK_RES_OWNER_UNKNOWN   O2NM_MAX_NODES
 #define DLM_THREAD_SHUFFLE_INTERVAL  5     // flush everything every 5 passes
 #define DLM_THREAD_MS                200   // flush at least every 200 ms
-
-#define DLM_HASH_BUCKETS     (PAGE_SIZE / sizeof(struct hlist_head))
+#define DLM_HASH_SIZE (1 << 20)
+#define DLM_HASH_BUCKETS (DLM_HASH_SIZE / sizeof(struct hlist_head))
 
 enum dlm_ast_type {
 	DLM_AST = 0,
diff -urp 2.6.16.clean/fs/ocfs2/dlm/dlmdomain.c 2.6.16/fs/ocfs2/dlm/dlmdomain.c
--- 2.6.16.clean/fs/ocfs2/dlm/dlmdomain.c	2006-03-02 16:47:47.000000000 -0800
+++ 2.6.16/fs/ocfs2/dlm/dlmdomain.c	2006-03-03 09:53:21.000000000 -0800
@@ -194,7 +191,7 @@ static int dlm_wait_on_domain_helper(con
 static void dlm_free_ctxt_mem(struct dlm_ctxt *dlm)
 {
 	if (dlm->lockres_hash)
-		free_page((unsigned long) dlm->lockres_hash);
+		vfree(dlm->lockres_hash);
 
 	if (dlm->name)
 		kfree(dlm->name);
@@ -1191,7 +1188,7 @@ static struct dlm_ctxt *dlm_alloc_ctxt(c
 		goto leave;
 	}
 
-	dlm->lockres_hash = (struct hlist_head *) __get_free_page(GFP_KERNEL);
+	dlm->lockres_hash = (struct hlist_head *)vmalloc(DLM_HASH_SIZE);
 	if (!dlm->lockres_hash) {
 		mlog_errno(-ENOMEM);
 		kfree(dlm->name);
@@ -1200,7 +1197,7 @@ static struct dlm_ctxt *dlm_alloc_ctxt(c
 		goto leave;
 	}
 
-	for (i=0; i<DLM_HASH_BUCKETS; i++)
+	for (i = 0; i < DLM_HASH_BUCKETS; i++)
 		INIT_HLIST_HEAD(&dlm->lockres_hash[i]);
 
 	strcpy(dlm->name, domain);
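For scale, a back-of-envelope worked out in comment form (my arithmetic,
not part of the patch; 8-byte pointers and a 4K page assumed):

/*
 * Assuming sizeof(struct hlist_head) == 8 and PAGE_SIZE == 4096:
 *
 *   old table:  PAGE_SIZE / 8  =    512 buckets (one page)
 *   new table:  (1 << 20) / 8  = 131072 buckets (one megabyte)
 *
 * That is 256 times as many buckets, so chains that used to hold
 * hundreds of lock resources now hold at most a handful.
 */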
[PATCH] Clean up ocfs2 hash probe and make it faster

Signed-off-by: Daniel "He's Baaaaack" Phillips <phillips at google.com>
diff -urp 2.6.16.clean/fs/ocfs2/dlm/dlmdomain.c 2.6.16/fs/ocfs2/dlm/dlmdomain.c
--- 2.6.16.clean/fs/ocfs2/dlm/dlmdomain.c	2006-03-02 16:47:47.000000000 -0800
+++ 2.6.16/fs/ocfs2/dlm/dlmdomain.c	2006-03-03 09:53:21.000000000 -0800
@@ -103,31 +103,28 @@ struct dlm_lock_resource * __dlm_lookup_
 					  const char *name,
 					  unsigned int len)
 {
-	unsigned int hash;
-	struct hlist_node *iter;
-	struct dlm_lock_resource *tmpres=NULL;
 	struct hlist_head *bucket;
+	struct hlist_node *list;
 
 	mlog_entry("%.*s\n", len, name);
 
 	assert_spin_locked(&dlm->spinlock);
+	bucket = dlm->lockres_hash + full_name_hash(name, len) % DLM_HASH_BUCKETS;
 
-	hash = full_name_hash(name, len);
+	hlist_for_each(list, bucket) {
+		struct dlm_lock_resource *res = hlist_entry(list,
+			struct dlm_lock_resource, hash_node);
 
-	bucket = &(dlm->lockres_hash[hash % DLM_HASH_BUCKETS]);
-
-	/* check for pre-existing lock */
-	hlist_for_each(iter, bucket) {
-		tmpres = hlist_entry(iter, struct dlm_lock_resource, hash_node);
-		if (tmpres->lockname.len == len &&
-		    memcmp(tmpres->lockname.name, name, len) == 0) {
-			dlm_lockres_get(tmpres);
-			break;
-		}
-
-		tmpres = NULL;
+		if (likely(res->lockname.name[0] != name[0]))
+			continue;
+		if (likely(res->lockname.len != len))
+			continue;
+		if (likely(memcmp(res->lockname.name + 1, name + 1, len - 1)))
+			continue;
+		dlm_lockres_get(res);
+		return res;
 	}
-	return tmpres;
+	return NULL;
 }
 
 struct dlm_lock_resource * dlm_lookup_lockres(struct dlm_ctxt *dlm,
Hi Daniel,

On Fri, Mar 03, 2006 at 02:27:52PM -0800, Daniel Phillips wrote:
> Hi ocfs2 guys,
>
> Actually, ocfs2 is not doomed - far from it! I just wanted to grab a few
> more of those eyeballs that normally glaze over each time they see a
> cluster mail run by on lkml[1]. Today's news: you are still not finished
> with global lock systime badness, and I already have more performance bugs
> for you.

Thanks for continuing to profile this stuff. We've definitely got lots of
profiling to do, so any help is welcome.

> So the ocfs2 guys switched from double word to single word hash table heads
> and quadrupled the size of the hash table. Problem solved, right? No!

Heh, well if you remember from irc I told you that there was more to do -
specifically that I was looking at increasing the memory dedicated to the
hash.

> I included two patches below. The first vmallocs a megabyte hash table
> instead of a single page. This definitively squishes away the global
> lock hash table overhead so that systime stays about twice as high as
> ext3 instead of starting at three times and increasing to four times
> with the (new improved) 4K hash table.

I completely agree that we need to increase the memory dedicated to the
hash, but I fear that vmallocing a 1 megabyte table per domain (effectively
per mount) is going overboard. I will assume that this was a straw man
patch :)

> Of course the Right Thing To Do is kmalloc a vector of pages and mask the
> hash into it, but as a lazy person I would rather sink the time into
> writing this deathless prose.

Right. That's one possible approach - I was originally thinking of
allocating a large global lockres hash table at module init time. This way
we don't have to make large allocs for each domain. Before anything else
though, I'd really like to get an idea of how large we want things. This
might very well dictate the severity of the solution.

> The __dlm_lookup_lockres rewrite shows a measurable improvement in
> hash chain traversal overhead. Every little bit helps.

Yes, this patch seems much easier to swallow ;) Thanks for doing that work.
I have some comments about it below.

> I failed to quantify the improvement precisely because there are other
> glitchy things going on that interfere with accurate measurement. So now
> we get to...

Definitely if you get a chance to show how much just the lookup
optimization helps, I'd like to know. I'll also try to gather some numbers.

> Things to notice: with the still-teensy hash table, systime for global
> lock lookups is still way too high, eventually stabilizing when lock
> purging kicks in at about 4 times ext3's systime. Not good (but indeed,
> not as excruciatingly painful as two days ago).
>
> With the attached patches we kill the hash table overhead dead, dead,
> dead. But look at real time for the untar and sync!

Indeed. The real time numbers are certainly confusing. I actually saw real
time decrease on most of my tests (usually I hack things to increase the
hash allocation to something like a few pages). I want to do some more
consecutive untars though.

Btw, for what it's worth, here's some single untar numbers I have lying
around. Sync numbers are basically the same.

Before the hlist patch:
real    0m21.279s
user    0m3.940s
sys     0m18.884s

With the hlist patch:
real    0m15.717s
user    0m3.924s
sys     0m13.324s

> Ocfs2 starts little or no writeout during the untar itself. Ext3 does.
> Solution: check ext3 to see what secret incantations Andrew conjured up
> to fiddle it into doing something reasonable.

Good idea. As you probably already know, we've never been above checking
to see how ext3 handles a given problem :)

> Ocfs2 sometimes sits and gazes at its navel for minutes at a time, doing
> nothing at all. Timer bug? A quick glance at SysRq-t shows ocfs2 waiting
> in io_schedule. Waiting for io that never happens? This needs more
> investigation.

Did you notice this during the untar? If not, do you have any reproducible
test case?

> Delete performance remains horrible, even with a 256 meg journal[4] which
> is unconscionably big anyway. Compare to ext3, which deletes kernel trees
> at a steady 2 seconds per, with a much smaller journal. Ocfs2 takes more
> like a minute.

This doesn't surprise me - our unlink performance leaves much to be desired
at the moment. How many nodes did you have mounted when you ran that test?

Off the top of my head, the two things which I would guess are hurting
delete the most right now are node messaging and lack of directory read
ahead. The first will be solved by more intelligent use of the DLM so that
the network will only be hit for those nodes that actually care about a
given inode -- unlink, rename and mount/unmount are the only things left
that still use the slower 'vote' mechanism. Directory readahead is much
easier to solve, it's just that nobody has gotten around to fixing that
yet :/ I bet there's more to figure out with respect to unlink performance.

> I would chase down these glitches too, except that I really need to get
> back to my cluster block devices, without which we cannot fully bake
> our cluster cake. After all, there are six of you and only one of me, I
> had better go work on the thing nobody else is working on.

Hey, we appreciate the help so far - thanks.

> [2] It is high time we pried loose the ocfs2 design process from secret
> irc channels and dark tunnels running deep beneath Oracle headquarters,
> and started sharing the squishy goodness of filesystem clustering
> development with some more of the smart people lurking here.

We're not trying to hide anything from anyone :) I'm always happy to talk
about design. We've been in bugfix (and more recently, performance fix)
mode for a while now, so there hasn't been much new design yet.

> -	bucket = &(dlm->lockres_hash[hash % DLM_HASH_BUCKETS]);
> -
> -	/* check for pre-existing lock */
> -	hlist_for_each(iter, bucket) {
> -		tmpres = hlist_entry(iter, struct dlm_lock_resource, hash_node);
> -		if (tmpres->lockname.len == len &&
> -		    memcmp(tmpres->lockname.name, name, len) == 0) {
> -			dlm_lockres_get(tmpres);
> -			break;
> -		}
> -
> -		tmpres = NULL;
> +		if (likely(res->lockname.name[0] != name[0]))

Is the likely() here necessary? It actually surprises me that the check
even helps so much - if you check the way OCFS2 lock names are built, the
first few bytes are very regular - the first char is going to almost always
be one of M, D, or W. Hmm, I guess if you're hitting it 1/3 fewer times...

> +			continue;
> +		if (likely(res->lockname.len != len))

Also here, lengths of OCFS2 locks are also actually fairly regular, so
perhaps the likely() isn't right?
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh at oracle.com
Daniel Phillips <phillips at google.com> wrote:
>
> @@ -103,31 +103,28 @@ struct dlm_lock_resource * __dlm_lookup_
>  					  const char *name,
>  					  unsigned int len)
>  {
> -	unsigned int hash;
> -	struct hlist_node *iter;
> -	struct dlm_lock_resource *tmpres=NULL;
>  	struct hlist_head *bucket;
> +	struct hlist_node *list;
>
>  	mlog_entry("%.*s\n", len, name);
>
>  	assert_spin_locked(&dlm->spinlock);
> +	bucket = dlm->lockres_hash + full_name_hash(name, len) % DLM_HASH_BUCKETS;
>
> -	hash = full_name_hash(name, len);

err, you might want to calculate that hash outside the spinlock. Maybe have
a lock per bucket, too.

A 1MB hashtable is verging on comical. How much data is there in total?
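For illustration, the two suggestions above might combine into something
like the following sketch -- dlm_hash_bucket, dlm_buckets and
dlm_lookup_sketch are made-up names, not ocfs2 code, and the per-bucket
locks would of course need spin_lock_init() at setup time:

#include <linux/dcache.h>	/* full_name_hash() */
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/string.h>

/* Hypothetical: one spinlock per bucket instead of dlm->spinlock. */
struct dlm_hash_bucket {
	spinlock_t lock;
	struct hlist_head chain;
};

static struct dlm_hash_bucket dlm_buckets[DLM_HASH_BUCKETS];

static struct dlm_lock_resource *dlm_lookup_sketch(const char *name,
						   unsigned int len)
{
	/* hash computed with no lock held */
	unsigned int n = full_name_hash(name, len) % DLM_HASH_BUCKETS;
	struct dlm_hash_bucket *b = dlm_buckets + n;
	struct dlm_lock_resource *res = NULL;
	struct hlist_node *p;

	spin_lock(&b->lock);		/* contention confined to one bucket */
	hlist_for_each(p, &b->chain) {
		struct dlm_lock_resource *tmp =
			hlist_entry(p, struct dlm_lock_resource, hash_node);

		if (tmp->lockname.len == len &&
		    !memcmp(tmp->lockname.name, name, len)) {
			dlm_lockres_get(tmp);
			res = tmp;
			break;
		}
	}
	spin_unlock(&b->lock);
	return res;
}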
Please don't drop the ocfs2-devel Cc: I'm bouncing this there.

On Fri, Mar 10, 2006 at 02:14:08AM +0100, Bernd Eckenfels wrote:
> Mark Fasheh <mark.fasheh at oracle.com> wrote:
> > Your hash sizes are still ridiculously large.
>
> How long are those entries in the buckets kept? I mean if I untar a tree
> the files are only locked while extracted, afterwards they are owner-less...
> (I must admit I don't understand ocfs2 very deeply, but maybe explaining
> why so many active locks need to be cached might help to find an optimized
> way.)
>
> > By the way, an interesting thing happened when I recently switched disk
> > arrays - the fluctuations in untar times disappeared. The new array is
> > much nicer, while the old one was basically Just A Bunch Of Disks. Also,
> > sync times dropped dramatically.
>
> Writeback Cache?
>
> Gruss
> Bernd

--
"Depend on the rabbit's foot if you will, but remember, it didn't help
 the rabbit."
	- R. E. Shay

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
On Fri, Mar 10, 2006 at 02:14:08AM +0100, Bernd Eckenfels wrote:
> Mark Fasheh <mark.fasheh at oracle.com> wrote:
> > Your hash sizes are still ridiculously large.
>
> How long are those entries in the buckets kept?

Shortly after the inode is destroyed. Basically the entries are "lock
resources" which the DLM tracks. OCFS2 only ever gets lock objects attached
to those resources. In OCFS2 the lock objects obey inode lifetimes, so the
DLM can't purge the lock resources until OCFS2 destroys the inode, etc. If
other nodes take locks on that resource and your local node is the "master"
(as in, arbitrates access to the resource) it may stay around longer.

> I mean if I untar a tree the files are only locked while extracted,
> afterwards they are owner-less... (I must admit I don't understand ocfs2
> very deeply, but maybe explaining why so many active locks need to be
> cached might help to find an optimized way.)

Well, OCFS2 caches locks. That is, once you've gone to the DLM to acquire a
lock at a given level, OCFS2 will just hold onto it and manage access to it
until the lock needs to be upgraded or downgraded. This provides a very
large performance increase over always asking the DLM for a new lock.
Anyway, at the point that we've acquired the lock the first time, the file
system isn't really forcing the dlm to hit the hash much.

> > By the way, an interesting thing happened when I recently switched disk
> > arrays - the fluctuations in untar times disappeared. The new array is
> > much nicer, while the old one was basically Just A Bunch Of Disks. Also,
> > sync times dropped dramatically.
>
> Writeback Cache?

Yep, and a whole slew of other nice features :)
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh at oracle.com
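To make the caching point above concrete, the pattern is roughly the
following sketch, with invented names -- cluster_lock_sketch, cached_lock
and dlm_upconvert are not the real ocfs2/dlm API:

enum lock_level { LOCK_NONE, LOCK_READ, LOCK_EX };	/* none < read < exclusive */

struct cached_lock {
	enum lock_level	held;		/* strongest level already granted by the DLM */
	int		holders;	/* local users piggybacking on that grant */
};

/* hypothetical DLM call to raise the granted level */
int dlm_upconvert(struct cached_lock *l, enum lock_level want);

static int cluster_lock_sketch(struct cached_lock *l, enum lock_level want)
{
	if (l->held >= want) {		/* cache hit: no DLM round trip, no hash probe */
		l->holders++;
		return 0;
	}
	/* cache miss: go to the DLM only when the cached grant is too weak */
	return dlm_upconvert(l, want);
}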
On Fri, Mar 10, 2006 at 05:09:13PM -0800, Mark Fasheh wrote:
> until the lock needs to be upgraded or downgraded. This provides a very
> large performance increase over always asking the DLM for a new lock.

Yes, it is basically the same problem as the buffer cache. Excessive
single-use patterns dirty the small cache or require a too big cache to be
useful. Maybe a user specific limit of percentage of hash (and locks) used?

I mean the untar test case is a bit synthetic, but think of concurrent read
access in a cluster of nntp servers (news article pool).

Gruss
Bernd