Is there a known bug in xen-unstable with rebooting of HVMs w/stubdom, and having the new domU come up before the old domU has been shut down and its memory fully reclaimed? I have had a couple of problems recently when customers rebooted their HVM domU through the OS; the last time, Xend froze, complaining in the log that it couldn't free up enough memory to start the rebooted domain. Of course, that memory should have been free, if the old domain had been fully shut down.

The xend.log looks like what I'd expect with an out of memory situation:

[2009-10-11 14:23:19 6419] DEBUG (XendDomainInfo:121) XendDomainInfo.create_from_dict(...
[2009-10-11 14:23:19 6419] DEBUG (XendDomainInfo:2369) XendDomainInfo.constructDomain
[2009-10-11 14:23:19 6419] DEBUG (balloon:181) Balloon: 602420 KiB free; need 4096; done.
[2009-10-11 14:23:19 6419] DEBUG (XendDomain:454) Adding Domain: 21
[2009-10-11 14:23:19 6419] DEBUG (XendDomainInfo:2614) XendDomainInfo.initDomain: 21 256
[2009-10-11 14:23:19 6419] DEBUG (image:336) Stored a VNC password for vfb access
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: boot, val: cd
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: fda, val: None
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: fdb, val: None
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: soundhw, val: None
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: localtime, val: 0
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: serial, val: None
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: std-vga, val: 0
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: isa, val: 0
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: acpi, val: 1
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: usb, val: 0
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: usbdevice, val: tablet
[2009-10-11 14:23:19 6419] DEBUG (image:843) args: gfx_passthru, val: None
[2009-10-11 14:23:19 6419] INFO (image:779) Need to create platform device.[domid:21]
[2009-10-11 14:23:19 6419] DEBUG (XendDomainInfo:2641) _initDomain:shadow_memory=0x9, memory_static_max=0x40000000, memory_static_min=0x0.
[2009-10-11 14:23:19 6419] DEBUG (balloon:135) Balloon: tmem relinquished -1 KiB of 464976 KiB requested.
[2009-10-11 14:23:19 6419] DEBUG (balloon:187) Balloon: 601008 KiB free; 0 to scrub; need 1065984; retries: 20.
[2009-10-11 14:23:19 6419] DEBUG (balloon:187) Balloon: 601008 KiB free; 0 to scrub; need 1065984; retries: 20.
[2009-10-11 14:23:42 6419] ERROR (XendDomainInfo:2694) XendDomainInfo.initDomain: exception occurred
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2655, in _initDomain
    balloon.free(memory + shadow + vtd_mem, self)
  File "/usr/lib64/python2.6/site-packages/xen/xend/balloon.py", line 225, in free
    free_mem + scrub_mem + dom0_alloc - dom0_min_mem))
VmError: I need 1065984 KiB, but dom0_min_mem is 196608 and shrinking to 196608 KiB would leave only 924848 KiB free.
[2009-10-11 14:23:42 6419] ERROR (XendDomainInfo:467) VM start failed
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 453, in start
    XendTask.log_progress(31, 60, self._initDomain)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendTask.py", line 209, in log_progress
    retval = func(*args, **kwds)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2697, in _initDomain
    raise exn
VmError: I need 1065984 KiB, but dom0_min_mem is 196608 and shrinking to 196608 KiB would leave only 924848 KiB free.
[2009-10-11 14:23:42 6419] DEBUG (XendDomainInfo:2844) XendDomainInfo.destroy: domid=21
[2009-10-11 14:23:42 6419] DEBUG (XendDomainInfo:2279) Destroying device model
[2009-10-11 14:23:42 6419] ERROR (XendDomainInfo:2282) Device model destroy failed X86_HVM_ImageHandler instance has no attribute 'sentinel_lock'
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2280, in _releaseDevices
    self.image.destroyDeviceModel()
[2009-10-11 14:23:42 6419] DEBUG (XendDomainInfo:2286) Releasing devices
[2009-10-11 14:23:42 6419] ERROR (XendDomainInfo:126) Domain construction failed
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 124, in create_from_dict
    vm.start()
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 453, in start
    XendTask.log_progress(31, 60, self._initDomain)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendTask.py", line 209, in log_progress
    retval = func(*args, **kwds)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2697, in _initDomain
    raise exn
VmError: I need 1065984 KiB, but dom0_min_mem is 196608 and shrinking to 196608 KiB would leave only 924848 KiB free.
[2009-10-11 14:23:42 6419] DEBUG (XendDomainInfo:2844) XendDomainInfo.destroy: domid=20
[2009-10-11 14:23:42 6419] ERROR (XendDomainInfo:2117) Failed to restart domain 19.
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2100, in _restart
    new_dom_info)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomain.py", line 989, in domain_create_from_dict
    dominfo = XendDomainInfo.create_from_dict(config_dict)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 124, in create_from_dict
    vm.start()
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 453, in start
    XendTask.log_progress(31, 60, self._initDomain)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendTask.py", line 209, in log_progress
    retval = func(*args, **kwds)
  File "/usr/lib64/python2.6/site-packages/xen/xend/XendDomainInfo.py", line 2697, in _initDomain
    raise exn
VmError: I need 1065984 KiB, but dom0_min_mem is 196608 and shrinking to 196608 KiB would leave only 924848 KiB free.

After that, Xend was unresponsive. Killing it and restarting it killed all the other domUs, but new ones still couldn't be created, forcing me to reboot the physical machine.

Unfortunately, this doesn't seem to happen every time. I'm still trying to determine how I can reproduce it reliably.

-John

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
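The numbers in the VmError can be checked by hand. What follows is a minimal Python sketch, assuming the message is computed the way the traceback line `free_mem + scrub_mem + dom0_alloc - dom0_min_mem` suggests. The function names are hypothetical and are not part of xend. It back-solves the dom0 allocation implied by the log and the resulting shortfall.

```python
# Hypothetical helpers; only the formula is taken from the traceback in the log.

def max_reclaimable_kib(free_kib, scrub_kib, dom0_alloc_kib, dom0_min_kib):
    """Memory that would be free if dom0 ballooned down to dom0_min_mem."""
    return free_kib + scrub_kib + dom0_alloc_kib - dom0_min_kib


def implied_dom0_alloc_kib(reported_kib, free_kib, scrub_kib, dom0_min_kib):
    """Solve the same formula for dom0_alloc, given the total xend reported."""
    return reported_kib - free_kib - scrub_kib + dom0_min_kib


# Values copied from the log excerpt above.
FREE_KIB = 601008        # "Balloon: 601008 KiB free"
SCRUB_KIB = 0            # "0 to scrub"
DOM0_MIN_KIB = 196608    # "dom0_min_mem is 196608"
NEED_KIB = 1065984       # "I need 1065984 KiB"
REPORTED_KIB = 924848    # "would leave only 924848 KiB free"

dom0_alloc_kib = implied_dom0_alloc_kib(
    REPORTED_KIB, FREE_KIB, SCRUB_KIB, DOM0_MIN_KIB
)
shortfall_kib = NEED_KIB - max_reclaimable_kib(
    FREE_KIB, SCRUB_KIB, dom0_alloc_kib, DOM0_MIN_KIB
)

print("implied dom0 allocation: %d KiB" % dom0_alloc_kib)
print("shortfall: %d KiB" % shortfall_kib)
```

With the values from the log this gives an implied dom0 allocation of 520448 KiB (about 508 MiB) and a shortfall of 141136 KiB. In other words, even with dom0 ballooned down to its minimum, the host was about 138 MiB short of what the rebooted domain needed, which fits the old domain's memory not having been returned yet.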
On 11/10/2009 21:54, "John" <lists.xen@nuclearfallout.net> wrote:

> Is there a known bug in xen-unstable with rebooting of HVMs w/stubdom, and
> having the new domU come up before the old domU has been shut down and its
> memory fully reclaimed? I have had a couple of problems recently when
> customers rebooted their HVM domU through the OS; the last time, Xend froze,
> complaining in the log that it couldn't free up enough memory to start the
> rebooted domain. Of course, that memory should have been free, if the old
> domain had been fully shut down.

Yes. Perhaps there is a race between shutting down the stubdom and creating the new domU? The stubdom maps a lot of the old domU's memory, and that wouldn't get freed until the stubdom is destroyed.

> After that, Xend was unresponsive. Killing it and restarting it killed all the
> other domUs, but new ones still couldn't be created, forcing me to reboot the
> physical machine.

Odd, since it should be possible to restart just xend (but not xenstored and other daemons) and have the system continue to work okay. Could new ones still not be created due to lack of memory? If you use Xen's 'q' debug key, how many domains does Xen itself think are still alive?

 -- Keir

> Unfortunately, this doesn't seem to happen every time. I'm still trying to
> determine how I can reproduce it reliably.
>
> -John
>> Is there a known bug in xen-unstable with rebooting of HVMs w/stubdom, and
>> having the new domU come up before the old domU has been shut down and its
>> memory fully reclaimed?
>
> Yes. Perhaps there is a race between shutting down the stubdom and creating
> the new domU? The stubdom maps a lot of the old domU's memory, and that
> wouldn't get freed until the stubdom is destroyed.

Nod, that makes sense. Would this be a simple fix?

>> After that, Xend was unresponsive. Killing it and restarting it killed all
>> the other domUs, but new ones still couldn't be created, forcing me to
>> reboot the physical machine.
>
> Odd, since it should be possible to restart just xend (but not xenstored and
> other daemons) and have the system continue to work okay. Could new ones
> still not be created due to lack of memory? If you use Xen's 'q' debug key,
> how many domains does Xen itself think are still alive?

If I remember correctly, xend wasn't responding to any commands, so I restarted it with "xend restart". The domUs stayed online, but "xm top" then showed a blank screen (no domains listed at all), and when I tried to recreate the crashed domain again with "xm create", that still timed out (and "xm list" also timed out).

I discovered that the old xend was still running, frozen, so I nuked it with "killall -9 xend" and then started up a new instance with "xend start". At that point, the domUs dropped offline and a fresh "xm top" showed only dom0 and one other domain, which it called "Domain-unnamed" or something similar; all the rest of the memory was listed as free. I tried "xm create" again and it timed out, so I chose to reboot the machine. Rebooting it hung the box (something that has not happened to me before), so I had to remotely power cycle to bring it back to a usable state.

-John
On 12/10/2009 08:49, "John" <lists.xen@nuclearfallout.net> wrote:

>>> Is there a known bug in xen-unstable with rebooting of HVMs w/stubdom, and
>>> having the new domU come up before the old domU has been shut down and its
>>> memory fully reclaimed?
>>
>> Yes. Perhaps there is a race between shutting down the stubdom and creating
>> the new domU? The stubdom maps a lot of the old domU's memory, and that
>> wouldn't get freed until the stubdom is destroyed.
>
> Nod, that makes sense. Would this be a simple fix?

Perhaps Stefano can comment on whether this is possibly the cause at all? I'm only hypothesising.

 -- Keir
Stefano Stabellini
2009-Oct-13 15:32 UTC
Re: [Xen-devel] Memory allocation with rebooted HVM
On Sun, 11 Oct 2009, John wrote:
> Is there a known bug in xen-unstable with rebooting of HVMs w/stubdom, and having the new domU come up before the old
> domU has been shut down and its memory fully reclaimed? I have had a couple of problems recently when customers rebooted
> their HVM domU through the OS; the last time, Xend froze, complaining in the log that it couldn't free up enough memory
> to start the rebooted domain. Of course, that memory should have been free, if the old domain had been fully shut down.
>
> The xend.log looks like what I'd expect with an out of memory situation:
>
> [...]
>
> After that, Xend was unresponsive. Killing it and restarting it killed all the other domUs, but new ones still couldn't
> be created, forcing me to reboot the physical machine.
>
> Unfortunately, this doesn't seem to happen every time. I'm still trying to determine how I can reproduce it reliably.

Could you please post the log of what happened right before? From that it should be clear if the destruction of the old guest or stubdom didn't go smoothly.

Are you sure that this is a stubdom-specific issue? Because from the logs it seems that your host ran out of memory. For example, were you trying to start another guest while rebooting the first one?
Stefano,

Thanks for posting.

> Could you please post the log of what happened right before?
> From that it should be clear if the destruction of the old guest or
> stubdom didn't go smoothly.

Here are the entries before what I gave before, starting with an old line that was unrelated (afaik):

[2009-10-11 13:14:50 6419] INFO (image:586) testvds device model terminated
[2009-10-11 14:23:17 6419] INFO (XendDomainInfo:1961) Domain has shutdown: name=testvds id=19 reason=reboot.
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2844) XendDomainInfo.destroy: domid=19
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2279) Destroying device model
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2286) Releasing devices
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2292) Removing vif/0
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2292) Removing console/0
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2292) Removing vbd/768
[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/768
[2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:2292) Removing vbd/5632
[2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/5632
[2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:2292) Removing vbd/5696
[2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/5696
[2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:2292) Removing vfb/0
[2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2009-10-11 14:23:19 6419] DEBUG (XendDomainInfo:121) XendDomainInfo.create_from_dict({

> Are you sure that this is a stubdom specific issue?

I didn't see problems like this before I switched to using stubdoms a few days ago, but back then, I was also using a dom0 with ballooning, which might have affected things (when I switched to stubdoms, I also set dom0_mem to 512M).

> Because from the logs it seems that your host ran out of memory.
> For example, were you trying to start another guest while rebooting the
> first one?

This seems to have been triggered just by that one domU being restarted. I've seen this before as well, when a 4G domU tried to restart and hosed a box that only had 3G of memory free (on top of the 4G it was using).

Thanks,
John
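One way to check the ordering discussed here is to pull the timestamps out of xend.log and measure the gap between the old domain's destroy and the next create. This is only a sketch, assuming the line format seen in the excerpts in this thread; the regular expression and the helper names are hypothetical, not part of any Xen tool.

```python
import re
from datetime import datetime

# Assumed xend.log layout: "[YYYY-MM-DD HH:MM:SS pid] LEVEL (source:line) message"
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<pid>\d+)\] "
    r"(?P<level>[A-Z]+) +\((?P<src>[^)]+)\) (?P<msg>.*)"
)


def parse(line):
    """Return (timestamp, level, source, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
    return ts, m.group("level"), m.group("src"), m.group("msg")


def destroy_to_create_gap(lines, domid):
    """Seconds between 'destroy: domid=N' and the first create_from_dict after it."""
    destroyed = created = None
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue  # traceback lines and other continuation text
        ts, _level, _src, msg = rec
        if msg == "XendDomainInfo.destroy: domid=%d" % domid:
            destroyed = ts
        elif (destroyed is not None and created is None
              and msg.startswith("XendDomainInfo.create_from_dict(")):
            created = ts
    if destroyed is None or created is None:
        return None
    return (created - destroyed).total_seconds()


# A few lines taken from the log excerpts in this thread.
SAMPLE = [
    "[2009-10-11 14:23:17 6419] INFO (XendDomainInfo:1961) Domain has shutdown: name=testvds id=19 reason=reboot.",
    "[2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2844) XendDomainInfo.destroy: domid=19",
    "[2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:2292) Removing vfb/0",
    "[2009-10-11 14:23:19 6419] DEBUG (XendDomainInfo:121) XendDomainInfo.create_from_dict({",
]

gap_seconds = destroy_to_create_gap(SAMPLE, 19)
print("destroy -> create gap: %s s" % gap_seconds)
```

The log only has one-second resolution, and it records when xend issued the destroy, not when the hypervisor actually released the pages, so a short gap here is a hint rather than proof of a race.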
Stefano Stabellini
2009-Oct-13 18:13 UTC
Re: [Xen-devel] Memory allocation with rebooted HVM
On Tue, 13 Oct 2009, John wrote:
> Stefano,
>
> Thanks for posting.
>
> > Could you please post the log of what happened right before?
> > From that it should be clear if the destruction of the old guest or
> > stubdom didn't go smoothly.
>
> Here are the entries before what I gave before, starting with an old line
> that was unrelated (afaik):
>
> [2009-10-11 13:14:50 6419] INFO (image:586) testvds device model terminated
> [2009-10-11 14:23:17 6419] INFO (XendDomainInfo:1961) Domain has shutdown: name=testvds id=19 reason=reboot.
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2844) XendDomainInfo.destroy: domid=19
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2279) Destroying device model
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2286) Releasing devices
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2292) Removing vif/0
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2292) Removing console/0
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = console, device = console/0
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:2292) Removing vbd/768
> [2009-10-11 14:23:17 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/768
> [2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:2292) Removing vbd/5632
> [2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/5632
> [2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:2292) Removing vbd/5696
> [2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/5696
> [2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:2292) Removing vfb/0
> [2009-10-11 14:23:18 6419] DEBUG (XendDomainInfo:1185) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
> [2009-10-11 14:23:19 6419] DEBUG (XendDomainInfo:121) XendDomainInfo.create_from_dict({

From these logs it seems that the destruction of the old guest went fine but the old stubdom is still alive.

> > Are you sure that this is a stubdom specific issue?
>
> I didn't see problems like this before I switched to using stubdoms a few
> days ago, but back then, I was also using a dom0 with ballooning, which
> might have affected things (when I switched to stubdoms, I also set dom0_mem
> to 512M).
>
> > Because from the logs it seems that your host ran out of memory.
> > For example, were you trying to start another guest while rebooting the
> > first one?
>
> This seems to have been triggered just by that one domU being restarted.
> I've seen this before as well, when a 4G domU tried to restart and hosed a
> box that only had 3G of memory free (on top of the 4G it was using).

It seems to me that the new domain is always created after the old one has been destroyed, with the exception that the old stubdom may still be alive. But one stubdom uses only 32MB of RAM, so it cannot be what is preventing you from restarting the domain, especially in the case above where you had 3G free.

Maybe someone more familiar with xend memory management can comment on this.
On 13/10/2009 19:13, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com> wrote:

>> This seems to have been triggered just by that one domU being restarted.
>> I've seen this before as well, when a 4G domU tried to restart and hosed a
>> box that only had 3G of memory free (on top of the 4G it was using).
>
> It seems to me that the new domain is always created after the old one
> has been destroyed with the exception that the old stubdom may still be
> alive but one stubdom uses only 32MB of ram, therefore cannot be the one
> preventing you from restarting the domain, especially in the case above
> where you had 3G free.

How much guest memory might the stubdom map? Could the stubdom be preventing the guest's memory from being freed, by holding references to the pages?

 -- Keir
Stefano Stabellini
2009-Oct-13 18:30 UTC
Re: [Xen-devel] Memory allocation with rebooted HVM
On Tue, 13 Oct 2009, Keir Fraser wrote:
> On 13/10/2009 19:13, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com>
> wrote:
>
> >> This seems to have been triggered just by that one domU being restarted.
> >> I've seen this before as well, when a 4G domU tried to restart and hosed a
> >> box that only had 3G of memory free (on top of the 4G it was using).
> >
> > It seems to me that the new domain is always created after the old one
> > has been destroyed with the exception that the old stubdom may still be
> > alive but one stubdom uses only 32MB of ram, therefore cannot be the one
> > preventing you from restarting the domain, especially in the case above
> > where you had 3G free.
>
> How much guest memory might the stubdom map? Could the stubdom be preventing
> the guest's memory from being freed, by holding references to the pages?

Yes, it could. This may also be the reason why he was able to reproduce this problem only a few times: it depends on how much memory the stubdom has mapped, and that can vary from case to case.
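The hypothesis in this exchange can be illustrated with a toy reference-counting model. This is a deliberately simplified sketch, not how Xen accounts for pages: the classes, method names and page counts below are all invented for illustration. It only shows why a guest's memory can stay unavailable after the guest is destroyed if another domain still maps some of its pages.

```python
class ToyPage:
    """One page of memory; starts with a single reference held by its owner."""

    def __init__(self):
        self.refs = 1


class ToyHost:
    """Hypothetical host that frees a page only when its last reference is dropped."""

    def __init__(self, total_pages):
        self.free_pages = total_pages
        self.owned = {}   # domain name -> pages it owns
        self.mapped = {}  # domain name -> foreign pages it has mapped

    def create(self, name, npages):
        if npages > self.free_pages:
            raise MemoryError("need %d pages, only %d free" % (npages, self.free_pages))
        self.free_pages -= npages
        self.owned[name] = [ToyPage() for _ in range(npages)]
        self.mapped[name] = []

    def map_foreign(self, mapper, owner, npages):
        """Let `mapper` take an extra reference on the first npages of `owner`."""
        for page in self.owned[owner][:npages]:
            page.refs += 1
            self.mapped[mapper].append(page)

    def destroy(self, name):
        """Drop this domain's mappings and its ownership references."""
        for page in self.mapped.pop(name, []):
            self._put(page)
        for page in self.owned.pop(name, []):
            self._put(page)

    def _put(self, page):
        page.refs -= 1
        if page.refs == 0:
            self.free_pages += 1


host = ToyHost(total_pages=100)
host.create("guest", 60)
host.create("stubdom", 2)
host.map_foreign("stubdom", "guest", 40)  # stubdom maps part of the guest

host.destroy("guest")                     # guest goes away on reboot
free_after_guest_destroy = host.free_pages

try:
    host.create("guest-rebooted", 60)     # too early: stubdom still holds pages
    early_restart_ok = True
except MemoryError:
    early_restart_ok = False

host.destroy("stubdom")                   # mappings dropped, pages really freed
free_after_stubdom_destroy = host.free_pages
host.create("guest-rebooted", 60)         # now there is room

print(free_after_guest_destroy, early_restart_ok, free_after_stubdom_destroy)
```

In this toy run the early restart fails even though the stubdom itself owns only two pages, because 40 of the old guest's pages are still pinned by its mappings. How many pages a real stubdom has mapped at any moment would vary, which would match the problem showing up only some of the time.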
Konrad Rzeszutek Wilk
2009-Oct-14 16:12 UTC
Re: [Xen-devel] Memory allocation with rebooted HVM
On Tue, Oct 13, 2009 at 07:30:32PM +0100, Stefano Stabellini wrote:
> On Tue, 13 Oct 2009, Keir Fraser wrote:
> > On 13/10/2009 19:13, "Stefano Stabellini" <stefano.stabellini@eu.citrix.com>
> > wrote:
> >
> > >> This seems to have been triggered just by that one domU being restarted.
> > >> I've seen this before as well, when a 4G domU tried to restart and hosed a
> > >> box that only had 3G of memory free (on top of the 4G it was using).
> > >
> > > It seems to me that the new domain is always created after the old one
> > > has been destroyed with the exception that the old stubdom may still be
> > > alive but one stubdom uses only 32MB of ram, therefore cannot be the one
> > > preventing you from restarting the domain, especially in the case above
> > > where you had 3G free.
> >
> > How much guest memory might the stubdom map? Could the stubdom be preventing
> > the guest's memory from being freed, by holding references to the pages?
>
> Yes it could, this can also be the reason why he was able to reproduce
> this problem only few times (it depends on how much memory the stubdom has
> mapped, and that can vary from case to case).

We encountered this problem in the past with blkback when the SCSI disks were iSCSI. The pages had a page reference that would never decrement, and the guest would stay in its zombie state. I posted a patch some time ago:

http://lists.xensource.com/archives/html/xen-devel/2009-09/msg00561.html

that can help troubleshoot whether this is indeed this type of failure.
Konrad,

> We had encountered this problem in the past with blkback and with the SCSI
> disks being iSCSI. The pages had a page-reference that would never
> decrement and the guest would stay in its zombie state. I've posted a
> patch some time ago:
>
> http://lists.xensource.com/archives/html/xen-devel/2009-09/msg00561.html
>
> that can help troubleshoot if this is indeed this type of failure.

Thanks for the patch. It looks like Stefano posted a patch this morning that hopefully will take care of the issue with a restart of xend, but if it doesn't work, I'll apply yours during a full-blown maintenance event (right now, unfortunately, maintenance events are rather disruptive because the domUs have to be shut down completely -- http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1521).

If it matters, I am using file: on RAID10 with standard SATA drives.

-John