Hi,

during some experiments with many guests I get crashing Dom0s because of too little memory. Actually the OOM killer goes 'round and kills random things, preferably qemu-dm's ;-)

The box in question has 128GB of memory, I start with dom0_mem=8192M (or 16384M, doesn't matter). I also used "dom0_mem=8192M,min:1536M", but that didn't make any difference. Xen is c/s 25688.

Then I start some guests with 2GB each. This works fine until about 55 guests, then I get some denies from xl when starting guests (which would be OK). But sometimes the guest start works (even after having failed before), but it has obviously ripped off precious memory from Dom0. With around 55 guests Dom0 has about 500MB in use. The whole Dom0 is in trouble then, I get "fork: cannot allocate memory" messages for a simple "ls" and have to reboot the box.

This is with xl.conf:autoballooning=1 (= the commented default). Setting it to 0 works, but is obviously not a real option as a default.

I found the hardcoded 128MB limit in libxl_internal.h, I guess this is way too small for this type of machine. Either we change this to something higher (768 MB worked for me) or we make this a config option in xl.conf (like it was in xend-config.sxp).

Another option would be to make it dynamic, by looking at the actual memory currently used in Dom0 and don't balloon down to 110% or so of it.

Sadly (well..) I am about to leave for vacation, so no patch this time, I leave this as an exercise to the tool buffs ;-)

In any case we should do something still for Xen 4.2, as I guess people dislike crashing Dom0, tearing down all the domains with it...

Regards,
Andre.

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany
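The "dynamic" idea from the message above (never balloon Dom0 below roughly 110% of what it currently uses, and never below a fixed minimum) can be sketched as follows. This is only an illustration, not libxl code: the function names, the use of /proc/meminfo-style fields, and the `count_cache_as_used` switch are assumptions made for the example. The 128MB default mirrors the hardcoded limit mentioned in the message.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> KiB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dom0_balloon_floor_kib(meminfo, hard_min_kib=128 * 1024,
                           headroom_percent=110,
                           count_cache_as_used=False):
    """Return the lowest Dom0 size (KiB) that ballooning should go down to.

    Hypothetical helper: the floor is the larger of a fixed minimum and
    headroom_percent of the memory Dom0 currently uses.
    """
    used = meminfo["MemTotal"] - meminfo["MemFree"]
    if not count_cache_as_used:
        # Assume buffers and page cache can be reclaimed under pressure.
        used -= meminfo.get("Buffers", 0) + meminfo.get("Cached", 0)
    used = max(used, 0)
    # Integer arithmetic avoids floating-point rounding surprises.
    return max(hard_min_kib, used * headroom_percent // 100)
```

On a real Dom0 one would feed it the contents of /proc/meminfo, e.g. `dom0_balloon_floor_kib(parse_meminfo(open("/proc/meminfo").read()))`. Setting `count_cache_as_used=True` is the conservative choice when cached pages may not actually be reclaimable.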
Andre Przywara writes ("auto-ballooning crashing Dom0?"):
> during some experiments with many guests I get crashing Dom0s because of
> too little memory. Actually the OOM killer goes 'round and kills random
> things, preferably qemu-dm's ;-)
> The box in question has 128GB of memory, I start with dom0_mem=8192M (or
> 16384M, doesn't matter). I also used "dom0_mem=8192M,min:1536M", but
> that didn't make any difference. Xen is c/s 25688.

I have seen similar effects occasionally but have usually been too busy in the middle of something else to do anything about it. The autoballooning arrangements aren't very good TBH and we are intending to improve things in 4.3.

> Either we change this to something higher (768 MB worked for me) or we
> make this a config option in xl.conf (like it was in xend-config.sxp)

Certainly it should be a config option.

> Another option would be to make it dynamic, by looking at the actual
> memory currently used in Dom0 and don't balloon down to 110% or so of it.

That would be a possibility.

> In any case we should do something still for Xen 4.2, as I guess people
> dislike crashing Dom0, tearing down all the domains with it...

Yes.

Ian.
(Resurrecting a thread from the past)

--On 2 August 2012 15:45:00 +0100 Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:

> Andre Przywara writes ("auto-ballooning crashing Dom0?"):
>> during some experiments with many guests I get crashing Dom0s because of
>> too little memory. Actually the OOM killer goes 'round and kills random
>> things, preferably qemu-dm's ;-)
>> The box in question has 128GB of memory, I start with dom0_mem=8192M (or
>> 16384M, doesn't matter). I also used "dom0_mem=8192M,min:1536M", but
>> that didn't make any difference. Xen is c/s 25688.
>
> I have seen similar effects occasionally but have usually been too busy
> in the middle of something else to do anything about it. The
> autoballooning arrangements aren't very good TBH and we are intending
> to improve things in 4.3.
>
>> Either we change this to something higher (768 MB worked for me) or we
>> make this a config option in xl.conf (like it was in xend-config.sxp)
>
> Certainly it should be a config option.
>
>> Another option would be to make it dynamic, by looking at the actual
>> memory currently used in Dom0 and don't balloon down to 110% or so of it.
>
> That would be a possibility.
>
>> In any case we should do something still for Xen 4.2, as I guess people
>> dislike crashing Dom0, tearing down all the domains with it...
>
> Yes.
>

We got hit by this. Our fix is to turn off autoballooning of dom0.

However, for the record, xen4.2.2 seems to perform very strangely here. Things worked with 2.5GB of RAM, failed with 3GB, but worked again with 4GB. The symptom was a hang initialising the balloon driver.

Our 'compounding factor' is that we run with a large initrd that stays alive during normal running as a ramdisk. Those pages are marked as buffer/page cache (I forget which) but never get flushed. I suspect this confuses the free memory calculations, as xen's balloon driver thinks there are pages that can be freed that actually can't.

We'll probably turn autoballooning off with xen4.3 (as our dom0 memory usage is pretty static), but it might affect that too.

--
Alex Bligh
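For reference, the workaround described above (turning autoballooning off and giving Dom0 a fixed size) would look roughly like the fragment below. It is a sketch rather than a tested configuration: the xl.conf option is spelled `autoballoon` in xl.conf(5), /etc/xen/xl.conf is the usual location but may differ per distribution, and the 8192M figure is simply the value used earlier in the thread. The `max:` part is an assumption added here to stop Dom0 from ballooning back up.

```
# /etc/xen/xl.conf
# Do not shrink Dom0 to make room for new guests.
autoballoon=0

# Xen hypervisor command line (in the bootloader entry):
# fix Dom0 at an explicit size; the value is only an example.
dom0_mem=8192M,max:8192M
```

With this setup, starting a guest fails cleanly once the hypervisor's free memory is exhausted, instead of taking memory away from Dom0.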