Hi guys,

I have a problem with powering down my system under the XEN hypervisor. System details are as follows:

gentoo linux, X86_64
XEN version 4.2.2
linux hardened kernel 3.9.5 as dom0
Xeon E3 1260L processor (vt-d capable)
32GB ECC RAM which has been thoroughly tested - so should be o.k.

When I issue "shutdown -h now" from dom0 the system usually reboots instead of turning off power to the machine. There's the odd occasion (probably 1 in every 10 to 20 shutdown attempts) when the system power is actually turned off. There seems to be no rule to follow when this happens.

If I use the exact same kernel and start w/o the XEN hypervisor, powerdown *always* works as expected when I use "shutdown -h now". So on the face of it, this seems to point to the XEN hypervisor as the culprit.

Any idea/help on how to track down and solve the issue would be very much appreciated. If you require any more information / log data, I'm more than happy to provide that.

Unfortunately, however, there seems to be no log / dmesg data available during shutdown as syslog-ng is stopped. The only thing I can confirm is that there's no strange output on the console during either bootup or shutdown: all services / daemons start up o.k. and also during shutdown all services seem to come to a proper halt. The root filesystem is re-mounted r/o and the last message reads "Power down" - only to then reboot the system by going through a BIOS power-on sequence.

Thanks and regards,

Atom2
Ping ...

Nobody with any idea / solution or able to provide some assistance on how to get to the bottom of this irritating issue?

Thanks
Atom2

On Mon, 29.07.2013, 00:03, Atom2 wrote:
> Hi guys,
> I have a problem with powering down my system under the XEN hypervisor.
> System details are as follows:
>
> gentoo linux, X86_64
> XEN version 4.2.2
> linux hardened kernel 3.9.5 as dom0
> Xeon E3 1260L processor (vt-d capable)
> 32GB ECC RAM which has been thoroughly tested - so should be o.k.
>
> when I issue "shutdown -h now" from dom0 the system usually reboots
> instead of turning off power to the machine. There's the odd occasion
> (probably 1 in every 10 to 20 shutdown attempts) when the system power
> is actually turned off. There seems to be no rule to follow when this
> happens.
>
> If I use the exact same kernel and start w/o the XEN hypervisor
> powerdown *always* works as expected when I use "shutdown -h now". So on
> the face of it, this seems to point to the XEN hypervisor as the culprit.
>
> Any idea/help on how to track down and solve the issue would be very
> much appreciated. If you require any more information / log data, I'm
> more than happy to provide that.
>
> Unfortunately however, there seems to be no log / dmesg data available
> during shutdown as syslog-ng is stopped. The only thing I can confirm
> is that there's no strange output on the console during either bootup or
> shutdown: All services / daemons start up o.k. and also during shutdown
> all services seem to come to a proper halt. The root filesystem is
> re-mounted r/o and the last message reads "Power down" - only to then
> reboot the system by going through a BIOS power-on sequence.
>
> Thanks and regards,
>
> Atom2
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@lists.xen.org
> http://lists.xen.org/xen-users
>
On Mon, 2013-07-29 at 00:03 +0200, Atom2 wrote:
> Hi guys,
> I have a problem with powering down my system under the XEN hypervisor.
> System details are as follows:
>
> gentoo linux, X86_64
> XEN version 4.2.2
> linux hardened kernel 3.9.5 as dom0
> Xeon E3 1260L processor (vt-d capable)
> 32GB ECC RAM which has been thoroughly tested - so should be o.k.
>
> when I issue "shutdown -h now" from dom0 the system usually reboots
> instead of turning off power to the machine. There's the odd occasion
> (probably 1 in every 10 to 20 shutdown attempts) when the system power
> is actually turned off. There seems to be no rule to follow when this
> happens.
>
> If I use the exact same kernel and start w/o the XEN hypervisor
> powerdown *always* works as expected when I use "shutdown -h now". So on
> the face of it, this seems to point to the XEN hypervisor as the culprit.
>
> Any idea/help on how to track down and solve the issue would be very
> much appreciated. If you require any more information / log data, I'm
> more than happy to provide that.

You could try the Xen reboot= option, which controls which hardware mechanism Xen tries to use to reboot. See http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html for details.

> Unfortunately however, there seems to be no log / dmesg data available
> during shutdown as syslog-ng is stopped. The only thing I can confirm
> is that there's no strange output on the console during either bootup or
> shutdown: All services / daemons start up o.k. and also during shutdown
> all services seem to come to a proper halt. The root filesystem is
> re-mounted r/o and the last message reads "Power down" - only to then
> reboot the system by going through a BIOS power-on sequence.

I would expect you to see a message to the effect of "dom0 has shutdown, rebooting" from the hypervisor at the very end. You may need a serial console to see this though, I suppose.
http://wiki.xen.org/wiki/Xen_Serial_Console

In fact if the reboot= options don't help then setting up a serial console to get at Xen's logs during reboot is probably the next step.

Alternatively I think "console=vga vga=current,keep" will keep the VGA for Xen so you can see what is happening, but at the expense of no VGA for dom0 (I've never actually tried this myself, but I think it should work). This might be OK if you can use ssh to initiate the reboot and then take a photo of the resulting Xen logs.

Ian.
Hi Ian,

many thanks for your suggestions and for helping to get to the root of this issue. Sorry for my delay in responding - setting up a serial console took a bit of time, but please see my inline comments below.

Thanks.

On 07.08.13 10:39, Ian Campbell wrote:
> On Mon, 2013-07-29 at 00:03 +0200, Atom2 wrote:
>> Hi guys,
>> I have a problem with powering down my system under the XEN hypervisor.
>> System details are as follows:
>>
>> gentoo linux, X86_64
>> XEN version 4.2.2
>> linux hardened kernel 3.9.5 as dom0
>> Xeon E3 1260L processor (vt-d capable)
>> 32GB ECC RAM which has been thoroughly tested - so should be o.k.
>>
>> when I issue "shutdown -h now" from dom0 the system usually reboots
>> instead of turning off power to the machine. There's the odd occasion
>> (probably 1 in every 10 to 20 shutdown attempts) when the system power
>> is actually turned off. There seems to be no rule to follow when this
>> happens.
>>
>> If I use the exact same kernel and start w/o the XEN hypervisor
>> powerdown *always* works as expected when I use "shutdown -h now". So on
>> the face of it, this seems to point to the XEN hypervisor as the culprit.
>>
>> Any idea/help on how to track down and solve the issue would be very
>> much appreciated. If you require any more information / log data, I'm
>> more than happy to provide that.
>
> You could try the Xen reboot= option, which controls which hardware
> mechanism Xen tries to use to reboot. See
> http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html for
> details.
>

None of the options listed in your link (reboot=b|t|k|n|w|c) managed to turn off power to the system. All but reboot=n (and also noreboot=true) did a reboot after requesting a powerdown with shutdown -h now.
reboot=n or noreboot=true just left the machine's power on with the last message on the console (from the gentoo dom0 shutdown) reading

    [ <no secs since start> ] Power Down.

>> Unfortunately however, there seems to be no log / dmesg data available
>> during shutdown as syslog-ng is stopped. The only thing I can confirm
>> is that there's no strange output on the console during either bootup or
>> shutdown: All services / daemons start up o.k. and also during shutdown
>> all services seem to come to a proper halt. The root filesystem is
>> re-mounted r/o and the last message reads "Power down" - only to then
>> reboot the system by going through a BIOS power-on sequence.
>
> I would expect you to see a message to the effect of "dom0 has shutdown,
> rebooting" from the hypervisor at the very end. You may need a serial
> console to see this though I suppose.
> http://wiki.xen.org/wiki/Xen_Serial_Console
>

I have set up a serial console and the output is attached to this mail. It did not, however, contain the message you expected, but that might be due to the fact that I did not request a reboot but rather a powerdown. At least that's my guess.

> In fact if the reboot= options don't help then setting up a serial
> console to get at Xen's logs during reboot is probably the next step.
>
> Alternatively I think "console=vga vga=current,keep" will keep the VGA
> for Xen so you can see what is happening, but at the expense of no VGA
> for dom0 (I've never actually tried this myself, but I think it should
> work). This might be OK if you can use ssh to initiate the reboot and
> then take a photo of the resulting Xen logs.

This was my initial try, but I did not manage to get that working. I guess that having the serial console log available now makes this unnecessary for my case anyway ...

I hope the attached file helps in narrowing the powerdown problem down. If you require any more information, I'd be more than happy to provide that.

Many thanks again ...

>
> Ian.
On Wed, 2013-08-07 at 15:38 +0200, Atom2 wrote:

> None of the options listed in your link (reboot=b|t|k|n|w|c) managed to
> turn off power to the system. All but reboot=n (and also noreboot=true)
> did a reboot after requesting a powerdown with shutdown -h now.
> reboot=n or noreboot=true just left the machine's power on with the last
> message on the console (from the gentoo dom0 shutdown) reading
> [ <no secs since start> ] Power Down.
[...]
> I hope the attached file helps in narrowing the powerdown problem down.
> If you require any more information, I'd be more than happy to provide that.

Hrm, I don't have much in the way of bright ideas after this.

It would be worth trying at least Xen 4.3 if not -unstable, just in case this has been fixed already.

I suppose booting the same Linux kernel natively works and can reboot/shutdown as much as you like?

AFAIK xen/arch/x86/acpi/power.c is at least somewhat related to linux/drivers/acpi/power.c. I don't know to what extent they have diverged but it might be worth eye-balling the diff and/or looking at the Linux changelog for likely looking updates.

Ian.
Ian,

many thanks for your continued support. I have answers to your latest question and some more data that might help to pinpoint the problem further. Please see below:

On 13.08.13 22:11, Ian Campbell wrote:
> On Wed, 2013-08-07 at 15:38 +0200, Atom2 wrote:
>
>> None of the options listed in your link (reboot=b|t|k|n|w|c) managed to
>> turn off power to the system. All but reboot=n (and also noreboot=true)
>> did a reboot after requesting a powerdown with shutdown -h now.
>> reboot=n or noreboot=true just left the machine's power on with the last
>> message on the console (from the gentoo dom0 shutdown) reading
>> [ <no secs since start> ] Power Down.
> [...]
>> I hope the attached file helps in narrowing the powerdown problem down.
>> If you require any more information, I'd be more than happy to provide that.
>
> Hrm, I don't have much in the way of bright ideas after this.
>
> It would be worth trying at least Xen 4.3 if not -unstable, just in case
> this has been fixed already.
>

I have not gone down that route yet, but might do so if everything else fails - also in the hope that 4.3 is not that far away from gentoo ...

> I suppose booting the same Linux kernel natively works and can
> reboot/shutdown as much as you like?

Yes, that works absolutely reliably - only with XEN do I see those powerdown problems ...

>
> AFAIK xen/arch/x86/acpi/power.c is at least somewhat related to
> linux/drivers/acpi/power.c. I don't know to what extent they have
> diverged but it might be worth eye-balling the diff and/or looking at
> the Linux changelog for likely looking updates.

I did a bit of digging and compared the two files that you pointed me to - but unfortunately, they are *VERY* different - not only in size (the linux one is around 24k whereas the XEN file is less than 10k), but also in content. So that did not really get me any further. Nevertheless, I thought I'd modify the XEN power.c file to see where the problem may start.
I'm far from being a kernel or XEN programmer, but I am able to read and basically understand and modify C code. Guided by the messages I had seen on the serial console, I decided to add a few additional printk statements after the last message that was displayed on the console to see where the system probably crashes / the problem could possibly start.

The relevant code snippet now looks as follows (NOTE: The printk messages starting with "After" or "Before" stem from me; the first one and the one within the if-construct are both unchanged; the initial one was originally always displayed on the serial console as the final line):

    printk("Entering ACPI S%d state.\n", state);

    local_irq_save(flags);
    printk("After local_irq_save\n");
    spin_debug_disable();
    printk("After spin_debug_disable\n");

    if ( (error = device_power_down()) )
    {
        printk(XENLOG_ERR "Some devices failed to power down.");
        system_state = SYS_STATE_resume;
        goto done;
    }
    printk("Before ACPI_FLUSH_CPU_CACHE\n");
    ACPI_FLUSH_CPU_CACHE();
    printk("After ACPI_FLUSH_CPU_CACHE\n");

The final few messages of the *new* output *after my amateur mods* on the serial console now read as follows:

    (XEN) Entering ACPI S5 state.
    (XEN) After local_irq_save
    (XEN) After spin_debug_disable

There is neither a message reading

    (XEN) Some devices failed to power down.

(NOTE: this printk statement however has a XENLOG_ERR before the text - so I am not sure whether that would appear on the serial console at all) nor one reading

    (XEN) Before ACPI_FLUSH_CPU_CACHE

This to me seems to indicate that the problematic code is somewhere in between the following lines:

    if ( (error = device_power_down()) )
    {
        printk(XENLOG_ERR "Some devices failed to power down.");
        system_state = SYS_STATE_resume;
        goto done;
    }

I hope that might provide you with some more information which I could use to make a step forward.
On the other hand I might be completely on the wrong track as I have no clue where the requested power-down (or, as it is, reboot) actually happens. That was not obvious to me from the code in power.c without further knowledge ...

In any case, many thanks in advance.

>
> Ian.
>
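The bracketing approach described in this message (print a line after each step, then read off the last line that made it to the console) can be sketched in plain C. This is only an illustration under stated assumptions, not Xen code: `checkpoint`, `run_bracketed`, `stopped_after` and the two step functions are hypothetical names, and a step that "never returns" is simulated by a nonzero return value. The label "After device_power_down" is likewise made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not Xen code: remembers the most recent checkpoint
 * and prints it immediately so the line is not lost if the next step
 * never returns. */
static const char *last_checkpoint = "(none)";

static void checkpoint(const char *label)
{
    assert(label != NULL);
    last_checkpoint = label;
    fprintf(stderr, "(trace) %s\n", label);
    fflush(stderr);
}

/* A step returns 0 when it completes. A nonzero return stands in for
 * "control never came back" (hang, reset), which a real program could
 * not report. */
typedef int (*step_fn)(void);

static int step_completes(void)     { return 0; }
static int step_never_returns(void) { return 1; }

/* Runs the steps in order and records a checkpoint after each one that
 * completes. Returns the label of the last checkpoint reached. */
static const char *run_bracketed(const step_fn *steps,
                                 const char *const *labels, int n)
{
    checkpoint("Entering sequence");
    for (int i = 0; i < n; i++) {
        if (steps[i]() != 0)
            break;              /* simulated: execution stops inside step i */
        checkpoint(labels[i]);
    }
    return last_checkpoint;
}

/* True if the trace ended at the given label. */
static int stopped_after(const char *label)
{
    return strcmp(last_checkpoint, label) == 0;
}
```

If the third of four steps is the one that does not return, the last checkpoint is the label of the second step. That narrows the search to the third step, but it says nothing about what happens inside it.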
On Wed, 2013-08-14 at 01:36 +0200, Atom2 wrote:
> I hope that might provide you with some more information which I could
> use to make a step forward.

Perhaps the console device is one which was powered down so you don't get any further output?

> On the other hand I might be completely on the wrong track as I have no
> clue where the actual requested power-down (or as is: reboot) actually
> happens. That was not obvious for me from the code in power.c without
> further knowledge ...

TBH this ACPI stuff and reboot etc is not an area of Xen which I am all that familiar with either. I think it is worth taking what you have learned to xen-devel@. According to MAINTAINERS you could also CC jbeulich@suse.com.

Ian.
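The point raised here, that a missing message does not prove the code stopped running, can be illustrated with a small C model. It is a sketch only, resting on the assumption that the power-down step also disables the output device; `struct console`, `console_print`, `power_down_devices` and `shutdown_path` are hypothetical names and do not correspond to Xen functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical console model, not Xen code. */
struct console {
    bool enabled;        /* false once the device has been powered down */
    int  attempted;      /* messages the code tried to print */
    char visible[128];   /* what an observer on the wire actually sees */
};

static void console_print(struct console *c, const char *msg)
{
    assert(c != NULL && msg != NULL);
    c->attempted++;
    if (!c->enabled)
        return;          /* the code ran, but nothing reaches the observer */
    strncat(c->visible, msg, sizeof c->visible - strlen(c->visible) - 1);
}

/* Stand-in for a power-down step that also switches off the console. */
static void power_down_devices(struct console *c)
{
    c->enabled = false;
}

static void shutdown_path(struct console *c)
{
    console_print(c, "before;");
    power_down_devices(c);
    console_print(c, "after;");   /* executed, but invisible */
    console_print(c, "done;");    /* executed, but invisible */
}
```

In this model all three print attempts happen, yet only the first is visible. If the real console behaves this way, trace output after the power-down step would need a channel that stays alive, or the conclusion drawn from the missing lines would need to be treated with caution.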
Thanks for your reply. I'll do as suggested and take it to xen-devel.

I assume my best bet would be to start afresh there with a brief summary of the problem?

Thanks again

On 14.08.13 09:24, Ian Campbell wrote:
> On Wed, 2013-08-14 at 01:36 +0200, Atom2 wrote:
>> I hope that might provide you with some more information which I could
>> use to make a step forward.
>
> Perhaps the console device is one which was powered down so you don't
> get any further output?
>
>> On the other hand I might be completely on the wrong track as I have no
>> clue where the actual requested power-down (or as is: reboot) actually
>> happens. That was not obvious for me from the code in power.c without
>> further knowledge ...
>
> TBH this ACPI stuff and reboot etc is not an area of Xen which I am all
> that familiar with either. I think it is worth taking what you have
> learned to xen-devel@. According to MAINTAINERS you could also CC
> jbeulich@suse.com.
>
> Ian.
>
On Wed, 2013-08-14 at 09:29 +0200, Atom2 wrote:
> Thanks for your reply. I'll do as suggested and take it to xen-devel.
>
> I assume my best bet would be to start afresh there with a brief
> summary of the problem?

Yes, and maybe mention ACPI S5 in the subject.

If you have time then testing a newer Xen first would be useful, at least the latest 4.2.x but better 4.3 or unstable. NB you just need to build and boot the new xen.gz to test this; since tools aren't involved you don't need all the userspace gubbins. I think "make install-xen" in the newer tree will do the trick, or "make dist-xen"/"make xen" plus manually copying to /boot etc.

Ian.

>
> Thanks again
>
> On 14.08.13 09:24, Ian Campbell wrote:
> > On Wed, 2013-08-14 at 01:36 +0200, Atom2 wrote:
> >> I hope that might provide you with some more information which I could
> >> use to make a step forward.
> >
> > Perhaps the console device is one which was powered down so you don't
> > get any further output?
> >
> >> On the other hand I might be completely on the wrong track as I have no
> >> clue where the actual requested power-down (or as is: reboot) actually
> >> happens. That was not obvious for me from the code in power.c without
> >> further knowledge ...
> >
> > TBH this ACPI stuff and reboot etc is not an area of Xen which I am all
> > that familiar with either. I think it is worth taking what you have
> > learned to xen-devel@. According to MAINTAINERS you could also CC
> > jbeulich@suse.com.
> >
> > Ian.
> >