- Tecra S1, 32-bit Pentium-M, 1 cpu core, 768 MB memory
- xen dom0 root filesystem on zfs
- opensolaris sources on zfs
1. Consume kernel memory, e.g. by running "sum" on some kernel
crash dumps, and a "find" in an opensolaris source tree
# ls -l /var/crash/max/
total 1245361
-rw-r--r-- 1 root root 3 Aug 25 19:12 bounds
-rw-r--r-- 1 root root 1270961 Aug 25 18:04 unix.7
-rw-r--r-- 1 root root 1287667 Aug 25 19:00 unix.8
-rw-r--r-- 1 root root 1271353 Aug 25 19:10 unix.9
-rw-r--r-- 1 root root 505765888 Aug 25 18:06 vmcore.7
-rw-r--r-- 1 root root 473620480 Aug 25 19:03 vmcore.8
-rw-r--r-- 1 root root 478453760 Aug 25 19:12 vmcore.9
# sum /var/crash/max/*
# find /files/wos_b66/ -name core
# xm list
Name        ID   Mem  VCPUs  State   Time(s)
Domain-0     0   701      1  r-----   2948.9
# mdb -k
...
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     110546               431   63%
Anon                        34816               136   20%
Exec and libs                8949                34    5%
Page cache                   3756                14    2%
Free (cachelist)             2893                11    2%
Free (freelist)             15352                59    9%
Balloon                         0                 0    0%
Total                      176312               688
2. Try to reduce the memory usage for Domain-0 from 701MB to 512MB, using:
xm set 0 512
System hangs (has to be power cycled).
=======================================================================
The balloon_worker_thread kernel thread is looping and never releases the
cpu.
- usr/src/uts/i86xpv/os/balloon.c, balloon_worker_thread(), lines 600, 620:
/*
 * This can be used to throttle the hv calls, but by default it's
 * turned off.
*/
uint_t bln_wait_sec = 0;
/*
 * We weren't able to fully complete the request
* last time through, so try again.
*/
(void) cv_timedwait(&bln_cv, &bln_mutex,
lbolt + (bln_wait_sec * hz));
With bln_wait_sec == 0 this is a ``cv_timedwait(&bln_cv, &bln_mutex, lbolt)''
call; it returns immediately with a return value of -1 (the thread does not
release the cpu).
- balloon_worker_thread(), line 636 calls balloon_dec_reservation().
  In balloon_dec_reservation(), ``page_resv(debit, KM_NOSLEEP)'' is called.
  Typically, debit == 1024.
  Since we've made the kernel use lots of kernel memory, availrmem is low
  (e.g. availrmem == 640); ``page_resv(debit, KM_NOSLEEP)'' returns 0
  immediately, without ever releasing the cpu.
/*
* This routine reserves availrmem for npages;
* flags: KM_NOSLEEP or KM_SLEEP
* returns 1 on success or 0 on failure
*/
int
page_resv(pgcnt_t npages, uint_t flags)
{
	mutex_enter(&freemem_lock);
	while (availrmem < tune.t_minarmem + npages) {
		if (flags & KM_NOSLEEP) {
			mutex_exit(&freemem_lock);
			return (0);
		}
- the process repeats
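The spin described above can be reproduced in user space with a small model. This is only a sketch: the sim_* names are hypothetical, the minarmem value used in any call is an assumed stand-in for tune.t_minarmem, and only the KM_NOSLEEP branch of page_resv() is modelled. It counts the passes in which the reservation fails and the timed wait has a zero-tick timeout, i.e. passes where the thread never yields.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel state described above. */
struct sim_state {
	long availrmem;	/* pages currently available for reservation */
	long minarmem;	/* assumed stand-in for tune.t_minarmem */
};

/*
 * Models only the KM_NOSLEEP branch of page_resv(): when memory is
 * short it fails at once and never blocks.
 */
static int
sim_page_resv_nosleep(struct sim_state *s, long npages)
{
	assert(npages >= 0);
	if (s->availrmem < s->minarmem + npages)
		return (0);
	s->availrmem -= npages;
	return (1);
}

/*
 * Runs up to max_iters passes of a simplified worker loop and returns
 * the number of passes that made no progress and did not yield the cpu.
 * wait_ticks is the timeout added to the current time for the timed
 * wait; 0 means the deadline has already passed when the wait starts.
 */
static long
sim_count_spins(struct sim_state *s, long debit, long wait_ticks,
    long max_iters)
{
	long spins = 0;

	for (long i = 0; i < max_iters; i++) {
		if (sim_page_resv_nosleep(s, debit))
			break;		/* reservation succeeded */
		if (wait_ticks == 0)
			spins++;	/* no yield: loop again at once */
	}
	return (spins);
}
```

With availrmem == 640 and debit == 1024, every pass fails, so with a zero timeout all passes are spins; a timeout of one tick or more turns each failed pass into a real wait.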
Workaround:
==========
Set "bln_wait_sec = 1", so that the cv_timedwait() in
balloon_worker_thread() releases the cpu. This makes balloon
memory reservation changes quite slow (4MB / second),
but the machine survives them!
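The 4MB / second figure follows from debit == 1024 pages of 4 KB each, released at most once per bln_wait_sec. A rough estimate of the resulting shrink time, as a sketch only: the sim_* names are hypothetical, and the estimate assumes every pass releases the full debit, which is optimistic when memory is short.

```c
#include <assert.h>

#define	SIM_PAGE_SIZE	4096L	/* x86 base page size in bytes */

/* Megabytes handed back per pass of the worker loop. */
static long
sim_mb_per_pass(long pages_per_pass)
{
	return (pages_per_pass * SIM_PAGE_SIZE / (1024L * 1024L));
}

/*
 * Estimated seconds to shrink a domain from from_mb to to_mb when one
 * pass runs every wait_sec seconds. Assumes each pass releases the
 * full pages_per_pass.
 */
static long
sim_shrink_seconds(long from_mb, long to_mb, long pages_per_pass,
    long wait_sec)
{
	long mb_per_pass = sim_mb_per_pass(pages_per_pass);
	long passes;

	assert(from_mb >= to_mb && wait_sec >= 0);
	assert(mb_per_pass > 0);
	/* Round up: a partial last pass still costs one wait. */
	passes = (from_mb - to_mb + mb_per_pass - 1) / mb_per_pass;
	return (passes * wait_sec);
}
```

For the "xm set 0 512" case above (701MB down to 512MB) this gives sim_shrink_seconds(701, 512, 1024, 1) == 48, i.e. a bit under a minute.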
Suggested fix:
=============
We could wait one tick minimum, in balloon_worker_thread():
(void) cv_timedwait(&bln_cv, &bln_mutex,
lbolt + 1 + (bln_wait_sec * hz));
As a refinement, don't wait when the previous
balloon_dec_reservation() call has worked; only wait
for one clock tick when balloon_dec_reservation() was
unable to release any memory to the hypervisor.
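The deadline arithmetic of the original code and of the two variants described above can be compared in isolation. This is a user-space sketch with hypothetical sim_* names; it only models the absolute deadline passed to cv_timedwait() and relies on the behaviour noted earlier, namely that a deadline that is not in the future makes the call return at once.

```c
#include <assert.h>

typedef long sim_clock_t;	/* stand-in for the kernel tick counter type */

/* Original: with wait_sec == 0 the deadline equals the current time. */
static sim_clock_t
sim_deadline_original(sim_clock_t lbolt, long hz, long wait_sec)
{
	return (lbolt + wait_sec * hz);
}

/* First suggestion: always wait at least one tick. */
static sim_clock_t
sim_deadline_min_one_tick(sim_clock_t lbolt, long hz, long wait_sec)
{
	return (lbolt + 1 + wait_sec * hz);
}

/* Refinement: add the extra tick only after a failed decrease. */
static sim_clock_t
sim_deadline_refined(sim_clock_t lbolt, long hz, long wait_sec,
    int dec_failed)
{
	assert(dec_failed == 0 || dec_failed == 1);
	return (lbolt + dec_failed + wait_sec * hz);
}

/* A timed wait can only block when its deadline is in the future. */
static int
sim_can_block(sim_clock_t lbolt, sim_clock_t deadline)
{
	return (deadline > lbolt);
}
```

With wait_sec == 0, only the refined variant behaves differently depending on progress: it blocks for a tick after a failed decrease and returns at once otherwise, so a successful shrink is not slowed down.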
--- wos_b66_xen/usr/src/uts/i86xpv/os/balloon.c 2007-07-22 21:16:30.913352547 +0200
+++ wos_b66_xen_uppc/usr/src/uts/i86xpv/os/balloon.c 2007-08-26 00:14:59.514847774 +0200
@@ -611,6 +611,8 @@
balloon_worker_thread(void)
{
callb_cpr_t cprinfo;
+ spgcnt_t pages;
+ int dec_failed = 0;
CALLB_CPR_INIT(&cprinfo, &bln_mutex, callb_generic_cpr,
"balloon");
for (;;) {
@@ -622,20 +624,24 @@
* last time through, so try again.
*/
(void) cv_timedwait(&bln_cv, &bln_mutex,
- lbolt + (bln_wait_sec * hz));
+ lbolt + dec_failed + (bln_wait_sec * hz));
} else {
cv_wait(&bln_cv, &bln_mutex);
}
CALLB_CPR_SAFE_END(&cprinfo, &bln_mutex);
+ dec_failed = 0;
+
if (bln_stats.bln_new_target != bln_stats.bln_current_pages) {
if (bln_stats.bln_new_target <
bln_stats.bln_current_pages) {
/* reservation shrunk */
- bln_stats.bln_current_pages -=
+ pages =
balloon_dec_reservation(
bln_stats.bln_current_pages -
bln_stats.bln_new_target);
+ bln_stats.bln_current_pages -= pages;
+ dec_failed = (pages == 0);
} else if (bln_stats.bln_new_target >
bln_stats.bln_current_pages) {
/* reservation grew */
This message posted from opensolaris.org
On Mon, Aug 27, 2007 at 03:00:39AM -0700, Jürgen Keil wrote:
> - Tecra S1, 32-bit Pentium-M, 1 cpu core, 768 MB memory
> - xen dom0 root filesystem on zfs
> - opensolaris sources on zfs
> [snip]

I think that this has been fixed under:

6570855 Running xen out of memory causes Dom0 to become perpetually busy

I'm sure Frank can tell you about the details if you're interested...

regards
john
Ryan Scott
2007-Aug-28 01:45 UTC
[xen-discuss] balloon_worker_thread could hang Solaris dom0
John Levon wrote:
> On Mon, Aug 27, 2007 at 03:00:39AM -0700, Jürgen Keil wrote:
>
>> - Tecra S1, 32-bit Pentium-M, 1 cpu core, 768 MB memory
>> - xen dom0 root filesystem on zfs
>> - opensolaris sources on zfs
>
>> [snip]
>
> I think that this has been fixed under:
>
> 6570855 Running xen out of memory causes Dom0 to become perpetually busy

Also, the fix for:

6576429 xmstress combining with Cthon_stress(domU server, dom0 client) will cause domU hang

implements an exponential backoff when the balloon thread isn't releasing any pages.

-Ryan

> I'm sure Frank can tell you about the details if you're interested...
>
> regards
> john
> _______________________________________________
> xen-discuss mailing list
> xen-discuss at opensolaris.org
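For readers unfamiliar with the idea, an exponential backoff of the kind mentioned above could look roughly like the following. This is a generic sketch and not the code from the fix for 6576429; the function name, the tick units and the cap are all assumptions made for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: given the wait used on the previous pass (in
 * ticks) and whether that pass released any pages, return the wait to
 * use on the next pass. The wait resets to 0 on progress, starts at
 * one tick on the first failure and doubles on each further failure,
 * up to max_ticks.
 */
static long
sim_next_backoff(long cur_ticks, int released_pages, long max_ticks)
{
	assert(cur_ticks >= 0 && max_ticks >= 1);

	if (released_pages)
		return (0);		/* progress: no need to wait */
	if (cur_ticks == 0)
		return (1);		/* first failure: one tick */
	if (cur_ticks > max_ticks / 2)
		return (max_ticks);	/* doubling would pass the cap */
	return (cur_ticks * 2);
}
```

Compared with a fixed one-tick wait, this keeps the thread responsive when memory frees up quickly, while reducing wakeups when the domain stays short of memory for a long time.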