Tobias F. Leucht
2005-Sep-16 16:15 UTC
[Xen-users] Kernel panic recoverable by hitting <enter>?
Hi *,

I've been running several xenified systems [UP|SMP]/[P4,Athlon,Xeon] from multiple manufacturers [HP,Intel,IBM,Dell] in different CoLos 24/7 since early 2004, in test as well as production scenarios, without any noteworthy problems.

But, suddenly, a Dell PowerEdge 1750 equipped with SMP Xeon@2.80GHz and 4GB DDR ECC RAM (from crucial.com) running under xen-2.0.5 with 2.6.10-xen0 on Debian Sarge and carrying 22 domUs [Sarge|FC4|*BSD] under medium load disappeared this afternoon from the network - both the dom0 and all domUs.

The remote-hands service claimed that there was a kernel panic on the system's console showing loads of hexdumps and something related to 'memory stack' - unfortunately, the guy couldn't remember exactly what was shown on the screen [and, sadly, I have no serial console attached to that machine and no terminal server, of course ;-)].

Now the odd part, at least for me, as I have never heard of such a thing: the technician also claimed that, after pressing the enter key, the system returned to 'normal operation' and showed a console login.

When the machine reappeared on the net it looked as if nothing had happened: uptime of dom0 is still >90 days, and neither the logs of dom0 nor those of the domUs show any suspicious entries - except some of the domUs' services complaining about the lack of network connectivity (openvpn, etc...).

Have you guys heard of user-confirmable kernel panics? (I just grepped through the xen and linux sources but found nothing; Google wasn't of much help, either.)

Was the remote-hands guy telling me fairy tales merely to draw attention away from some colo-internal network problems? Or could that be a xen-specific experience? How to track that down?

Regards,
Tobias

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Mark Williamson
2005-Sep-16 16:22 UTC
Re: [Xen-users] Kernel panic recoverable by hitting <enter>?
> But, suddenly, a Dell PowerEdge 1750 equipped with SMP Xeon@2.80GHz and
> 4GB DDR ECC RAM (from crucial.com) running under xen-2.0.5 with
> 2.6.10-xen0 on Debian Sarge and carrying 22 domUs [Sarge|FC4|*BSD] under
> medium load disappeared this afternoon from the network - both the dom0
> and all domUs.

Nice setup!

> The remote-hands service claimed that there was a kernel panic on the
> system's console showing loads of hexdumps and something related to
> 'memory stack' - unfortunately, the guy couldn't remember exactly what
> was shown on the screen [and, sadly, I have no serial console attached to
> that machine and no terminal server, of course ;-)].

Shame they didn't just take a digicam photo and mail it to you.

> Now the odd part, at least for me, as I have never heard of such a thing: the
> technician also claimed that, after pressing the enter key, the system
> returned to 'normal operation' and showed a console login.
>
> When the machine reappeared on the net it looked as if nothing
> had happened: uptime of dom0 is still >90 days, and neither the logs of dom0
> nor those of the domUs show any suspicious entries - except some of the
> domUs' services complaining about the lack of network connectivity
> (openvpn, etc...).

But all the domUs were still up?

> Have you guys heard of user-confirmable kernel panics? (I just
> grepped through the xen and linux sources but found nothing; Google
> wasn't of much help, either.)

They don't exist... But that assumes it was a panic. The message he saw could have been some sort of non-fatal oops - have a look in dom0's dmesg output. I guess it might have somehow recovered from <whatever it was> and thus returned your connectivity. Don't ask me what it could be though ;-)

> Was the remote-hands guy telling me fairy tales merely to draw
> attention away from some colo-internal network problems? Or could that be a
> xen-specific experience? How to track that down?

As well as checking the dmesg in dom0, you might also like to check for suspicious warnings in xm dmesg - just in case Xen spotted a domain doing something weird and complained about it.

Other than that, it seems a bit mysterious...

Cheers,
Mark
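
For anyone wanting to automate the check suggested above, here is a rough Python sketch that filters the output of dmesg and xm dmesg for lines that look like an oops or similar complaint. The pattern list and the helper names (find_suspicious_lines, scan) are illustrative assumptions, not anything taken from the Xen or Linux sources, and the keywords will likely need tuning for a given kernel.

```python
import re
import subprocess

# Assumed keywords: a guess at what a non-fatal oops or related
# complaint tends to contain. Adjust for your kernel and Xen version.
SUSPICIOUS = re.compile(
    r"\b(oops|panic|bug|unable to handle|call trace)\b",
    re.IGNORECASE,
)


def find_suspicious_lines(log_text):
    """Return (line_number, line) pairs for lines matching SUSPICIOUS."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if SUSPICIOUS.search(line)
    ]


def scan(commands=(("dmesg",), ("xm", "dmesg"))):
    """Run each command and print its suspicious lines.

    Hypothetical helper: it assumes both commands are on PATH and that
    the caller has the privileges they need (typically root in dom0).
    """
    for command in commands:
        label = " ".join(command)
        try:
            result = subprocess.run(
                list(command), capture_output=True, text=True, check=False
            )
        except OSError as error:
            # Command missing or not executable on this host.
            print(f"{label}: could not run ({error})")
            continue
        for number, line in find_suspicious_lines(result.stdout):
            print(f"{label}:{number}: {line}")
```

Calling scan() in dom0 prints any matching lines prefixed with the command they came from. Note that the kernel ring buffer is finite, so on a box with >90 days of uptime an old message may already have been overwritten; an empty result does not prove that nothing happened.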