- Tecra S1, 32-bit Pentium-M, 1 cpu core, 768 MB memory
- xen dom0 root filesystem on zfs
- opensolaris sources on zfs

1. Consume kernel memory, e.g. by running "sum" on some kernel crash
   dumps, and a "find" in an opensolaris source tree

   # ls -l /var/crash/max/
   total 1245361
   -rw-r--r--   1 root     root             3 Aug 25 19:12 bounds
   -rw-r--r--   1 root     root       1270961 Aug 25 18:04 unix.7
   -rw-r--r--   1 root     root       1287667 Aug 25 19:00 unix.8
   -rw-r--r--   1 root     root       1271353 Aug 25 19:10 unix.9
   -rw-r--r--   1 root     root     505765888 Aug 25 18:06 vmcore.7
   -rw-r--r--   1 root     root     473620480 Aug 25 19:03 vmcore.8
   -rw-r--r--   1 root     root     478453760 Aug 25 19:12 vmcore.9
   # sum /var/crash/max/*
   # find /files/wos_b66/ -name core

   # xm list
   Name                              ID   Mem VCPUs      State   Time(s)
   Domain-0                           0   701     1     r-----   2948.9

   # mdb -k
   ...
   > ::memstat
   Page Summary                Pages                MB  %Tot
   ------------     ----------------  ----------------  ----
   Kernel                     110546               431   63%
   Anon                        34816               136   20%
   Exec and libs                8949                34    5%
   Page cache                   3756                14    2%
   Free (cachelist)             2893                11    2%
   Free (freelist)             15352                59    9%
   Balloon                         0                 0    0%
   Total                      176312               688

2. Try to reduce the memory usage for Domain-0 from 701MB to 512MB,
   using:

   xm set 0 512

   System hangs (has to be power cycled).

=======================================================================

The balloon_worker_thread kernel thread is looping and never releases
the cpu.

- usr/src/uts/i86xpv/os/balloon.c, balloon_worker_thread(),
  lines 600, 620:

	/*
	 * This can be used to throttle the hv calls, but by default it's turned off.
	 */
	uint_t bln_wait_sec = 0;

			/*
			 * We weren't able to fully complete the request
			 * last time through, so try again.
			 */
			(void) cv_timedwait(&bln_cv, &bln_mutex,
			    lbolt + (bln_wait_sec * hz));

  With bln_wait_sec == 0 this is a
  ``cv_timedwait(&bln_cv, &bln_mutex, lbolt)'' call; it returns
  immediately with a return value of -1 (the thread does not release
  the cpu).

- balloon_worker_thread(), line 636 calls balloon_dec_reservation().

  In balloon_dec_reservation() ``page_resv(debit, KM_NOSLEEP)'' is
  called. Typically, debit == 1024.

  Since we've made the kernel use lots of kernel memory, availrmem is
  low (e.g. availrmem == 640); ``page_resv(debit, KM_NOSLEEP)''
  returns 0 immediately, without ever releasing the cpu.

	/*
	 * This routine reserves availrmem for npages;
	 *	flags: KM_NOSLEEP or KM_SLEEP
	 *	returns 1 on success or 0 on failure
	 */
	int
	page_resv(pgcnt_t npages, uint_t flags)
	{
		mutex_enter(&freemem_lock);
		while (availrmem < tune.t_minarmem + npages) {
			if (flags & KM_NOSLEEP) {
				mutex_exit(&freemem_lock);
				return (0);
			}

- the process repeats


Workaround:
==========

Set "bln_wait_sec = 1", so that the cv_timedwait() in
balloon_worker_thread() releases the cpu. This makes balloon memory
reservation changes quite slow (4MB / second), but the machine
survives them!


Suggested fix:
=============

We could wait one tick minimum, in balloon_worker_thread():

			(void) cv_timedwait(&bln_cv, &bln_mutex,
			    lbolt + 1 + (bln_wait_sec * hz));

As a refinement, don't wait when the previous
balloon_dec_reservation() call has worked; only wait for one clock
tick when balloon_dec_reservation() was unable to release any memory
to the hypervisor.

--- wos_b66_xen/usr/src/uts/i86xpv/os/balloon.c	2007-07-22 21:16:30.913352547 +0200
+++ wos_b66_xen_uppc/usr/src/uts/i86xpv/os/balloon.c	2007-08-26 00:14:59.514847774 +0200
@@ -611,6 +611,8 @@
 balloon_worker_thread(void)
 {
 	callb_cpr_t cprinfo;
+	spgcnt_t pages;
+	int dec_failed = 0;
 
 	CALLB_CPR_INIT(&cprinfo, &bln_mutex, callb_generic_cpr, "balloon");
 	for (;;) {
@@ -622,20 +624,24 @@
 			 * last time through, so try again.
 			 */
 			(void) cv_timedwait(&bln_cv, &bln_mutex,
-			    lbolt + (bln_wait_sec * hz));
+			    lbolt + dec_failed + (bln_wait_sec * hz));
 		} else {
 			cv_wait(&bln_cv, &bln_mutex);
 		}
 		CALLB_CPR_SAFE_END(&cprinfo, &bln_mutex);
 
+		dec_failed = 0;
+
 		if (bln_stats.bln_new_target != bln_stats.bln_current_pages) {
 			if (bln_stats.bln_new_target <
 			    bln_stats.bln_current_pages) {
 				/* reservation shrunk */
 				bln_stats.bln_current_pages -=
+				    pages =
 				    balloon_dec_reservation(
 				    bln_stats.bln_current_pages -
 				    bln_stats.bln_new_target);
+				dec_failed = (pages == 0);
 			} else if (bln_stats.bln_new_target >
 			    bln_stats.bln_current_pages) {
 				/* reservation grew */

This message posted from opensolaris.org
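
The deadline arithmetic behind the report above can be checked outside the kernel. The following is only a user-space sketch, not kernel code: retry_deadline(), wait_yields_cpu() and SKETCH_HZ are hypothetical names made up for illustration, with `now` standing in for lbolt and `wait_sec` for bln_wait_sec. It relies on the behaviour described in the post, namely that a timed wait whose deadline is not in the future returns at once.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's ticks-per-second value (hz). */
enum { SKETCH_HZ = 100 };

/*
 * Absolute deadline, in clock ticks, for the retry wait.
 *	now        - current tick count (plays the role of lbolt)
 *	wait_sec   - throttle in seconds (plays the role of bln_wait_sec)
 *	dec_failed - nonzero if the previous attempt released no pages
 */
long
retry_deadline(long now, unsigned int wait_sec, int dec_failed)
{
	assert(now >= 0);
	return (now + (dec_failed ? 1 : 0) + (long)wait_sec * SKETCH_HZ);
}

/*
 * Assumption taken from the post: a timed wait only gives up the cpu
 * when its deadline lies strictly in the future.
 */
int
wait_yields_cpu(long now, long deadline)
{
	return (deadline > now);
}
```

With wait_sec == 0 and dec_failed == 0 the deadline equals `now`, so the wait cannot block; that is the spin seen in the hang. Setting dec_failed to 1 after a failed attempt pushes the deadline one tick ahead, which is all the proposed refinement needs.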
On Mon, Aug 27, 2007 at 03:00:39AM -0700, Jürgen Keil wrote:

> - Tecra S1, 32-bit Pentium-M, 1 cpu core, 768 MB memory
> - xen dom0 root filesystem on zfs
> - opensolaris sources on zfs

> [snip]

I think that this has been fixed under:

6570855 Running xen out of memory causes Dom0 to become perpetually busy

I'm sure Frank can tell you about the details if you're interested...

regards
john
Ryan Scott
2007-Aug-28 01:45 UTC
[xen-discuss] balloon_worker_thread could hang Solaris dom0
John Levon wrote:
> On Mon, Aug 27, 2007 at 03:00:39AM -0700, Jürgen Keil wrote:
>
>> - Tecra S1, 32-bit Pentium-M, 1 cpu core, 768 MB memory
>> - xen dom0 root filesystem on zfs
>> - opensolaris sources on zfs
>
>> [snip]
>
> I think that this has been fixed under:
>
> 6570855 Running xen out of memory causes Dom0 to become perpetually busy

Also, the fix for:

6576429 xmstress combining with Cthon_stress(domU server, dom0 client) will cause domU hang

implements an exponential backoff when the balloon thread isn't
releasing any pages.

-Ryan

>
> I'm sure Frank can tell you about the details if you're interested...
>
> regards
> john
> _______________________________________________
> xen-discuss mailing list
> xen-discuss at opensolaris.org
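
For readers unfamiliar with the term, an exponential backoff of the kind Ryan mentions could look roughly like the sketch below. This is not the code from the fix for 6576429, which is not shown in this thread; next_backoff() and the BACKOFF_* bounds are hypothetical names and values chosen only to illustrate the idea of doubling the wait while no pages are being released.

```c
#include <assert.h>

/* Hypothetical bounds, in clock ticks; not taken from the actual fix. */
enum { BACKOFF_MIN_TICKS = 1, BACKOFF_MAX_TICKS = 1024 };

/*
 * Next wait interval, in ticks.
 *	prev_ticks     - interval used for the previous wait (0 if none)
 *	pages_released - pages handed back by the last attempt
 */
long
next_backoff(long prev_ticks, long pages_released)
{
	assert(prev_ticks >= 0);
	if (pages_released > 0)
		return (0);			/* progress: retry without delay */
	if (prev_ticks < BACKOFF_MIN_TICKS)
		return (BACKOFF_MIN_TICKS);	/* first failure: shortest wait */
	if (prev_ticks >= BACKOFF_MAX_TICKS / 2)
		return (BACKOFF_MAX_TICKS);	/* cap the delay */
	return (prev_ticks * 2);		/* repeated failure: double it */
}
```

Compared with a fixed one-tick wait, a scheme like this keeps the balloon thread responsive when memory frees up quickly, while backing off further the longer the kernel is unable to give pages back.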