Jürgen Keil
2006-Mar-20 10:19 UTC
[zfs-discuss] ARC cache issues with b35/b36; Bugs 6397610 / 6398177
> Bug ID: 6398177
> Synopsis: zfs: poor nightly build performance in 32-bit mode (high disk activity)

Part of the problem appears to be these kmem caches:

# mdb -k
...
> ::kmastat
cache                        buf    buf    buf    memory     alloc alloc
name                        size in use  total    in use   succeed  fail
------------------------- ------ ------ ------ --------- --------- -----
...
dmu_buf_impl_t               192   2029 104328  20348928    326434     0
dnode_t                      388   1529  84090  34443264    157309     0
arc_buf_hdr_t                 92  13276  26460   2580480    140047     0
arc_buf_t                     20    957   5915    143360    225786     0
zil_lwb_cache                184      3     44      8192       149     0
zfs_znode_cache              104   1226  83460   8765440    142715     0
...

Note that the dmu_buf_impl_t, dnode_t and zfs_znode_cache kmem caches are sitting on memory that is ~98% unused (compare "buf in use" with "buf total"). Although the arc_reclaim_thread detects a kmem heap shortage all the time and keeps shrinking the zio_buf_* and arc_buf_* caches, it doesn't return the free memory held by the dmu_buf_impl_t, dnode_t and zfs_znode_cache caches to the kernel's heap arena. In the situation above, ~60 Mbyte could easily be returned to the kmem heap arena by trimming the dmu_buf_impl_t, dnode_t and zfs_znode_cache caches.

A possible fix for the issue would be to call kmem_reap() from arc_kmem_reap_now(), like this:

diff -ru ../opensolaris-20060313/usr/src/uts/common/fs/zfs/arc.c usr/src/uts/common/fs/zfs/arc.c
--- ../opensolaris-20060313/usr/src/uts/common/fs/zfs/arc.c	2006-03-14 21:05:00.000000000 +0100
+++ usr/src/uts/common/fs/zfs/arc.c	2006-03-18 18:06:02.959972174 +0100
@@ -1212,6 +1212,11 @@
 	 * up too much memory.
 	 */
 	dnlc_reduce_cache((void *)(uintptr_t)arc_reduce_dnlc_percent);
+
+	/*
+	 * Reclaim unused memory from all kmem caches.
+	 */
+	kmem_reap();
 #endif

This message posted from opensolaris.org
Jürgen Keil
2006-Mar-20 12:39 UTC
[zfs-discuss] Re: ARC cache issues with b35/b36; Bugs 6397610 / 6398177
> A possible fix for the issue would be to call kmem_reap() from arc_kmem_reap_now(), ...

Btw, a side effect of this fix is that you often get a BAD TRAP panic in htable_steal() when trying to halt or poweroff the system. This seems to be a known issue:

Bug: 6392698
Synopsis: double panic during reboot when stealing htables (htable_steal)
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6392698

Unfortunately the kernel is unable to write a valid crash dump, because it panics a second time while trying to write the dump (in function hat_dump(); also mentioned in bug 6392698).

I'm using this fix for now; don't panic when kas.a_hat->hat_next == NULL:

diff -ru ../opensolaris-20060313/usr/src/uts/i86pc/vm/htable.c usr/src/uts/i86pc/vm/htable.c
--- ../opensolaris-20060313/usr/src/uts/i86pc/vm/htable.c	2006-03-14 21:05:09.000000000 +0100
+++ usr/src/uts/i86pc/vm/htable.c	2006-03-20 00:11:31.912956587 +0100
@@ -423,6 +423,8 @@
 	for (pass = 1; pass <= htable_steal_passes && stolen < cnt; ++pass) {
 		threshhold = pass / htable_steal_passes;
 		hat = kas.a_hat->hat_next;
+		if (hat == NULL)
+			break;

 		for (;;) {

 			/*
@@ -2171,8 +2173,9 @@
 	 * the list. Once we pass kas.a_hat->hat_next a second time, we
 	 * know we've iterated through every hat structure.
 	 */
-	for (hat = kas.a_hat, count = 0; hat != kas.a_hat->hat_next ||
-	    count++ == 0; hat = hat->hat_next) {
+	for (hat = kas.a_hat, count = 0; hat != NULL &&
+	    (hat != kas.a_hat->hat_next || count++ == 0); hat = hat->hat_next)
+	{
 		for (h = 0; h < hat->hat_num_hash; ++h) {
 			for (ht = hat->hat_ht_hash[h]; ht; ht = ht->ht_next) {
 				if ((ht->ht_flags & HTABLE_VLP) == 0) {