Hi,

I'm running a Debian Lenny (2.6.26-1-xen-amd64) dom0 that hosts 5 Debian Lenny 2.6.26-1-xen-amd64 domUs. The two busiest of those VMs (webservers) hang frequently (every few days); the other three VMs have not hung a single time in 122 days (nor has dom0, obviously). The syslogs (of the VMs and dom0) don't show anything useful. I haven't been able to correlate the hangs with any event in particular; they seem to happen quite randomly (and not necessarily when the VMs are busy). All domUs run without swap and with sufficient memory (monitoring statistics show a fair bit of memory allocated to the cache at all times). All VMs and dom0 share a single Internet connection and each runs its own specific firewall.

When a VM hangs, xentop actually thinks it's running (its state = r and it's using nearly 100% CPU). It's not possible to connect to the VM via the network (not even ping it), nor is it possible to connect to the console (the console just hangs).

I'd appreciate any hints or suggestions on how I can further diagnose this problem.

thanks,
Jan

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Hi Jan,

You should strongly consider adding swap. Running a webserver is like running on ice. I have been building custom servers for quite a long time, and I have never considered a web server to have "sufficient" memory. Maybe only if we are talking about some 200-400GB RAM host? Then you could assign "sufficient" memory to each of your two web servers and have some left for the other 3 VPSs running there.

Adding a 1GB swap file to each of your VPSs will show that I'm right. ;) I always am! :D

Good luck!

--
Deac Mihai-Adrian
W: www.mikesoftware.com
P: +40-745-256.364
Jan Bakuwel wrote:
> When a VM hangs, xentop actually thinks it's running (its state = r and
> it's using nearly 100% CPU). It's not possible to connect to the VM via
> the network (not even ping it), nor is it possible to connect to the
> console (the console just hangs).

Sounds like a common problem with xenconsoled.

This is the advice I had from Ferenc Wagner on 18 Jul 2009:

> > Also, if it were that, wouldn't I expect to see other guests on the
> > same host lock up?
>
> It isn't that simple. Other guests lock up when they fill their
> console buffers, apparently. But new guests seem to work fine for
> me. Just try it, when you next experience this: stop xend, kill
> xenconsoled, start xend. Don't touch xenstored! If everything
> returns to normal, then this was it. If not, then something else. :)

It's worked for me every time.

--
Simon Hobson
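For anyone wanting to script the sequence Ferenc describes (stop xend, kill xenconsoled, start xend, leave xenstored alone), here is a minimal Python sketch. The init script path `/etc/init.d/xend` and the use of `killall` are assumptions about a Debian-style dom0, not something stated in this thread, so check them against your own system. By default it only prints what it would run.

```python
import subprocess

# Assumption: Debian-style init script location; adjust for your distribution.
XEND_INIT = "/etc/init.d/xend"


def console_recovery_commands(xend_init=XEND_INIT):
    """Return the commands in the order given in the advice above:
    stop xend, kill xenconsoled, start xend. xenstored is never touched."""
    return [
        [xend_init, "stop"],
        ["killall", "xenconsoled"],
        [xend_init, "start"],
    ]


def run_recovery(dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in console_recovery_commands():
        print(" ".join(cmd))
        if not dry_run:
            # killall exits non-zero when no process matched, so a failing
            # step is not treated as fatal here.
            subprocess.call(cmd)


if __name__ == "__main__":
    run_recovery(dry_run=True)
```

Whether restarting xend under running guests is safe on a given setup is exactly the open question raised in the follow-up, so a dry run first seems prudent.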
Hi Simon,

> Sounds like a common problem with xenconsoled.
>
> This is the advice I had from Ferenc Wagner on 18 Jul 2009:
>
>> > Also, if it were that, wouldn't I expect to see other guests on the
>> > same host lock up?
>>
>> It isn't that simple. Other guests lock up when they fill their
>> console buffers, apparently. But new guests seem to work fine for
>> me. Just try it, when you next experience this: stop xend, kill
>> xenconsoled, start xend. Don't touch xenstored! If everything
>> returns to normal, then this was it. If not, then something else. :)
>
> It's worked for me every time.

Thanks for the suggestion; I'll try that next time a VM hangs. I've read some reports that restarting xend with running VMs might not be safe (others say it is?).

It won't fix my problem though: getting hanging production VMs going again still requires interaction (restart of xenconsoled versus restart of VM). Are there fixes available for xenconsoled?

Having said that, I did notice console buffers filling up a while ago and fixed that problem by making sure that log messages do not end up on the (xen) console. When I connect to the console, I can see that indeed no log messages have been sent there; it just sits at the login: prompt and plays dead.

regards,
Jan
Hi Ady,

In my case dom0 and domU are running the same kernel (para virt).

I'm running 2.6.26-1-xen-amd64 while 2.6.26-2-xen-amd64 is available though...

The reason I haven't upgraded this production server yet is that the Debian Lenny scripts for Xen combined with the new grub can seriously mess up the system boot sequence (requiring manual repair). The server is in a remote datacenter... so the risk of it not being able to reboot is a pain...

How are you using Xen: same kernel for dom0 and domU as well?

best,
Jan

Ady Deac wrote:
> Hi Jan,
>
> Hmmm, this sounds exactly like something that hit me a while ago. This
> happened right after I upgraded the dom0 kernel. I needed to
> rebuild the domU's kernel with the same sources as dom0's.
>
> Have phun!
>
> Jan Bakuwel wrote:
>> Hey Ady,
>>
>>> You should strongly consider adding swap. Running a webserver is like
>>> running on ice. I have been building custom servers for quite a long time,
>>> and I have never considered a web server to have "sufficient" memory.
>>> Maybe only if we are talking about some 200-400GB RAM host? Then you
>>> could assign "sufficient" memory to each of your two web servers and
>>> have some left for the other 3 VPSs running there.
>>> Adding a 1GB swap file to each of your VPSs will show that I'm right.
>>> ;) I always am! :D
>>>
>>> Good luck!
>>
>> To swap or not to swap... :-)
>>
>> I did add swap but it didn't have the desired result (i.e. still
>> hanging). Also I'd expect the kernel to start killing processes if it
>> runs out of memory... and not just hang?
>>
>> Jan
> -----Original Message-----
> From: xen-users-bounces@lists.xensource.com [mailto:xen-users-bounces@lists.xensource.com] On Behalf Of Jan Bakuwel
> Sent: Sunday, September 20, 2009 3:59 PM
> To: Ady Deac
> Cc: xen-users@lists.xensource.com
> Subject: Re: [Xen-users] Frequent para (Linux) domU hangs
>
> Hi Ady,
>
> In my case dom0 and domU are running the same kernel (para virt).
>
> I'm running 2.6.26-1-xen-amd64 while 2.6.26-2-xen-amd64 is available
> though...
>
> The reason I haven't upgraded this production server yet is that the
> Debian Lenny scripts for Xen combined with the new grub can seriously
> mess up the system boot sequence (requiring manual repair). The
> server is in a remote datacenter... so the risk of it not being able to
> reboot is a pain...
>
> How are you using Xen: same kernel for dom0 and domU as well?
>
> best,
> Jan

Just wanted to chime in and say that I am experiencing this exact problem, and am struggling to come up with a cause, let alone a solution. Like you, it's only one VM on a host with the problem, while the dozen other VMs on the same host have never frozen. The console logs are empty.

I'm at a loss!

Nathan
Whoops, I don't think my last post made it to the list for some reason. Apologies if it did and this is a double post...

Just wanted to chime in and say that I am experiencing this exact problem, and am struggling to come up with a cause, let alone a solution. Like you, it's only one VM on a host with the problem, while the dozen other VMs on the same host have never frozen. The console logs are empty.

Nathan
Hi Nathan,

> Whoops, I don't think my last post made it to the list for some reason. Apologies if it did and this is a double post...

It made it just fine :-)

> Just wanted to chime in and say that I am experiencing this exact problem, and am struggling to come up with a cause, let alone a solution. Like you, it's only one VM on a host with the problem, while the dozen other VMs on the same host have never frozen. The console logs are empty.

Let's see if we can find similarities. I'm running Debian Lenny 2.6.26-1-xen-amd64, so a 64bit dom0. I'm only running para Debian Lenny 64bit domUs (can't be more straightforward). All domUs run without swap (to prevent potential disk thrashing as they're all on the same RAID-6 volume). Xen is 3.2-1. I've got two out of five VMs that frequently hang (once every few days).

What's your situation?

Jan
> -----Original Message-----
> From: Jan Bakuwel [mailto:jan.bakuwel@gmail.com]
> Sent: Tuesday, September 22, 2009 3:05 PM
> To: Nathan Eisenberg
> Cc: xen-users@lists.xensource.com
> Subject: Re: [Xen-users] Frequent para (Linux) domU hangs
>
> Hi Nathan,
>
> Let's see if we can find similarities. I'm running Debian Lenny
> 2.6.26-1-xen-amd64, so a 64bit dom0. I'm only running para Debian Lenny
> 64bit domUs (can't be more straightforward). All domUs run without swap
> (to prevent potential disk thrashing as they're all on the same RAID-6
> volume). Xen is 3.2-1. I've got two out of five VMs that frequently hang
> (once every few days).
>
> What's your situation?
>
> Jan

Hey Jan,

Dom0 is Debian Etch amd64 (haven't upgraded this Dom0 yet) running 2.6.26-bpo.1-xen-amd64. Xen version is xen-3.2-1-amd64. Plenty of leftover memory and swap.

DomU is CentOS 5.3 Final amd64, running the dom0's kernel (2.6.26-bpo.1-xen-amd64). DomU has lots of memory and plenty of swap (added because I suspected it may have been a problem).

One possibility I am beginning to suspect is that domU may be filling up /tmp (using /usr/tmpDSK for some reason). Not sure on that, though. I may try removing that from the fstab and see if stability improves...

Nathan
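One way to test the /tmp suspicion is to record how full the filesystem is at regular intervals, so the last reading before a hang is still in a log afterwards. The following Python sketch is only an illustration; the function names and the idea of calling it from cron are assumptions, not something Nathan described.

```python
import shutil
import time


def usage_percent(path="/tmp"):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def log_line(path="/tmp"):
    """Build one timestamped line suitable for appending to a log file."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "%s %s %.1f%% used" % (stamp, path, usage_percent(path))


if __name__ == "__main__":
    print(log_line("/tmp"))
```

Appending the output to a file on a different filesystem than /tmp (or shipping it to a remote syslog) would keep the record readable even if /tmp itself fills up.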
Hi Nathan,

> Dom0 is Debian Etch amd64 (haven't upgraded this Dom0 yet) running 2.6.26-bpo.1-xen-amd64. Xen version is xen-3.2-1-amd64. Plenty of leftover memory and swap.
>
> DomU is CentOS 5.3 Final amd64, running the dom0's kernel (2.6.26-bpo.1-xen-amd64). DomU has lots of memory and plenty of swap (added because I suspected it may have been a problem).
>
> One possibility I am beginning to suspect is that domU may be filling up /tmp (using /usr/tmpDSK for some reason). Not sure on that, though. I may try removing that from the fstab and see if stability improves...

Seems like we're in a similar although not identical environment.

I think there's not much difference (or perhaps none at all) between the backported etch kernel and the lenny kernel?

I've planned an upgrade to 2.6.26-2-xen-amd64 in the next few weeks to see if that makes a difference and have just added a swapfile to one of the two troubled VMs.

Please keep me informed about your progress; I'll do the same.

Jan
> -----Original Message-----
> From: Jan Bakuwel [mailto:jan.bakuwel@gmail.com]
> Sent: Tuesday, September 22, 2009 6:29 PM
> To: Nathan Eisenberg
> Cc: xen-users@lists.xensource.com
> Subject: Re: [Xen-users] Frequent para (Linux) domU hangs
>
> Hi Nathan,
>
> Seems like we're in a similar although not identical environment.
>
> I think there's not much difference (or perhaps none at all) between
> the backported etch kernel and the lenny kernel?
>
> I've planned an upgrade to 2.6.26-2-xen-amd64 in the next few weeks to
> see if that makes a difference and have just added a swapfile to one of
> the two troubled VMs.
>
> Please keep me informed about your progress; I'll do the same.
>
> Jan

Hey Jan/List,

This one domU hung again. I'm kinda out of ideas. Why would only one domU hang now and again, while every other domU is totally stable?

Best Regards,
Nathan Eisenberg
Sr. Systems Administrator - Atlas Networks, LLC
Hi Nathan, all,

One thing I've just noticed: the two domUs that are frequently hanging are doing serious (database) I/O from time to time (not always, but in bursts). Both have a relatively high VBD_OO: the domU that just hung had a VBD_OO of over 800 while the other domUs have very low VBD_OO (4 or 32).

I found this on the Internet:

http://lists.xensource.com/archives/html/xen-devel/2006-06/msg00812.html

Basically Satoshi suggests in his email that the underlying hardware is not able to keep up with the I/O requests of these particular domUs (if I read him well).

I wouldn't mind if the domUs simply slowed down when the hardware can't keep up... but in my case once they get "there", they hang and do not seem to return to normal (well, we can't afford to wait for days).

Any suggestions to further diagnose (and fix? :-) ) this problem are very welcome.

kind regards,
Jan
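To watch VBD_OO over time rather than eyeballing xentop, one option is to capture a snapshot with `xentop -b -i 1` (batch mode, one iteration) and pull the column out by its header name. The Python sketch below does the parsing only. The sample text is made up for illustration and is not real output from any host in this thread; the domain names, the column layout and the threshold of 100 are likewise assumptions.

```python
# Illustrative sample only: not captured from a real system.
SAMPLE = """\
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR SSID
  Domain-0 -----r       1234    2.0    1048576   12.5   no limit       n/a     2    0        0        0    0        0        0        0    0
      web1 --b---       5678   10.0     524288    6.3     524288       6.3     1    1     1000     2000    2      812    40000    90000    0
       db1 --b---        100    0.5     262144    3.1     262144       3.1     1    1       10       20    1        4      100      200    0
"""


def parse_vbd_oo(batch_output):
    """Map domain name -> VBD_OO for one xentop batch snapshot."""
    result = {}
    columns = None
    for line in batch_output.splitlines():
        # "no limit" in the MAXMEM column contains a space; join it so that
        # whitespace splitting stays aligned with the header.
        fields = line.replace("no limit", "no-limit").split()
        if not fields:
            continue
        if fields[0] == "NAME":
            columns = fields
            continue
        if columns is None or "VBD_OO" not in columns:
            continue
        if len(fields) < len(columns):
            continue  # skip rows that do not match the header layout
        result[fields[0]] = int(fields[columns.index("VBD_OO")])
    return result


def high_vbd_oo(counts, threshold=100):
    """Return the domains whose VBD_OO exceeds a (hypothetical) threshold."""
    return sorted(name for name, value in counts.items() if value > threshold)


if __name__ == "__main__":
    print(high_vbd_oo(parse_vbd_oo(SAMPLE)))
```

Looking the column up by name rather than by position is deliberate, since the set of columns may differ between xentop versions.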
Hi Nathan,

> This one domU hung again. I'm kinda out of ideas. Why would only one domU hang now and again, while every other domU is totally stable?

Any luck with this problem?

We've got 30+ VMs running elsewhere without any issues on the exact same OS/Xen version. Just on this server we're having issues...

What's your hardware? I'm running an AMD64 SuperMicro with 3ware & 1TB SATA discs.

BTW I'm preparing a migration to fully virtualized domUs to see if that helps. I know there's a (performance) cost but wouldn't know

best regards,
Jan
On Tue, Oct 06, 2009 at 03:26:05PM +1300, Jan Bakuwel wrote:
> Hi Nathan, all,
>
> One thing I've just noticed: the two domUs that are frequently hanging
> are doing serious (database) I/O from time to time (not always, but in
> bursts). Both have a relatively high VBD_OO: the domU that just hung had
> a VBD_OO of over 800 while the other domUs have very low VBD_OO (4 or 32).
>
> I found this on the Internet:
>
> http://lists.xensource.com/archives/html/xen-devel/2006-06/msg00812.html
>
> Basically Satoshi suggests in his email that the underlying hardware is
> not able to keep up with the I/O requests of these particular domUs (if
> I read him well).
>
> I wouldn't mind if the domUs simply slowed down when the hardware can't
> keep up... but in my case once they get "there", they hang and do not
> seem to return to normal (well, we can't afford to wait for days).
>
> Any suggestions to further diagnose (and fix? :-) ) this problem are
> very welcome.

Just a thought: have you set up Xen domain weights so that dom0 will always have CPU time to process I/O requests properly? I.e. dom0 gets more CPU than the domUs.

-- Pasi
Hi Pasi,

> Just a thought: have you set up Xen domain weights so that dom0 will
> always have CPU time to process I/O requests properly? I.e. dom0 gets more
> CPU than the domUs.

Thanks for the tip - that makes sense. I did indeed not have domain weights set up - now I have.

There are several posts on the Internet saying that if dom0 has the same weight as the domUs (which is the default: 256), you may have serious trouble with domUs under high load. Wouldn't it make sense for dom0 to have a higher weight by default?

Jan
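For readers wondering what a higher dom0 weight buys, here is a rough Python sketch. Under the credit scheduler, weights are relative: on a contended host a domain with weight 512 gets about twice the CPU of one with weight 256. The weight of 512 and the domU names below are hypothetical examples, not the values Jan used, and the share calculation ignores caps, VCPU counts and idle domains. The command built at the end uses `xm sched-credit -d <domain> -w <weight>`.

```python
def cpu_shares(weights):
    """Approximate each domain's fraction of contended CPU time as
    weight / sum of weights (ignores caps and VCPU counts)."""
    total = float(sum(weights.values()))
    return {name: weight / total for name, weight in weights.items()}


def sched_credit_command(domain, weight):
    """Build the xm command line that sets a domain's scheduler weight."""
    return ["xm", "sched-credit", "-d", domain, "-w", str(weight)]


if __name__ == "__main__":
    # Hypothetical setup: dom0 raised to 512, five domUs left at the default 256.
    weights = {"Domain-0": 512}
    for i in range(1, 6):
        weights["domU%d" % i] = 256
    for name, share in sorted(cpu_shares(weights).items()):
        print("%-10s %.1f%%" % (name, 100 * share))
    print(" ".join(sched_credit_command("Domain-0", 512)))
```

With everything at 256, dom0 would be one of six equal claimants; doubling its weight raises its contended share from one sixth to two sevenths in this simplified model.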
> Just a thought: have you set up Xen domain weights so that dom0 will
> always have CPU time to process I/O requests properly? I.e. dom0 gets
> more CPU than the domUs.
>
> -- Pasi

I have not done this - however, I pinned a whole CPU to dom0, which I figured would be enough.

Best Regards,
Nathan Eisenberg
On Fri, Oct 16, 2009 at 10:01:40AM +1300, Jan Bakuwel wrote:
> Hi Pasi,
>
> > Just a thought: have you set up Xen domain weights so that dom0 will
> > always have CPU time to process I/O requests properly? I.e. dom0 gets more
> > CPU than the domUs.
>
> Thanks for the tip - that makes sense. I did indeed not have domain
> weights set up - now I have.

Ok.

> There are several posts on the Internet saying that if dom0 has the same
> weight as the domUs (which is the default: 256), you may have serious
> trouble with domUs under high load. Wouldn't it make sense for dom0 to
> have a higher weight by default?

Maybe.. then again some people do it differently; they dedicate a single core to dom0 only, and use the other cores for domUs.

-- Pasi
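The "dedicate a core to dom0" approach can be sketched as a pinning plan: dom0's VCPUs go on one physical CPU and every domU is restricted to the remaining ones. The Python below only builds `xm vcpu-pin <domain> <vcpu> <cpus>` command lines; the domU names, the CPU count and the choice of CPU 0 for dom0 are hypothetical, and nothing is executed.

```python
def pinning_plan(domus, n_cpus, dom0_cpu=0):
    """Return xm vcpu-pin commands that reserve one physical CPU for dom0
    and confine all listed domUs to the other CPUs."""
    others = [cpu for cpu in range(n_cpus) if cpu != dom0_cpu]
    if not others:
        raise ValueError("need at least two physical CPUs")
    cpu_list = ",".join(str(cpu) for cpu in others)
    commands = [["xm", "vcpu-pin", "Domain-0", "all", str(dom0_cpu)]]
    for name in domus:
        commands.append(["xm", "vcpu-pin", name, "all", cpu_list])
    return commands


if __name__ == "__main__":
    # Hypothetical host with 4 physical CPUs and two busy guests.
    for cmd in pinning_plan(["web1", "web2"], n_cpus=4):
        print(" ".join(cmd))
```

Pinning done this way applies to running domains; to make it survive a guest restart, the `cpus` setting in the domU configuration file is the usual place, though that is worth verifying for your Xen version.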
> Any luck with this problem?
>
> We've got 30+ VMs running elsewhere without any issues on the exact
> same OS/Xen version. Just on this server we're having issues...
>
> What's your hardware? I'm running an AMD64 SuperMicro with 3ware & 1TB
> SATA discs.
>
> BTW I'm preparing a migration to fully virtualized domUs to see if
> that helps. I know there's a (performance) cost but wouldn't know
>
> best regards,
> Jan

Jan,

I am beginning to strongly suspect this is indeed due to I/O starvation, at least for me. I offloaded the heavy MySQL traffic (it's a bunch of cPanel domUs), which was the vast majority of the I/O, and haven't had a crash yet. I'll let you know if I see another one.

Like you, I'm using 1TB SATA disks.

Best Regards,
Nathan Eisenberg
Hi Nathan,

> I am beginning to strongly suspect this is indeed due to I/O starvation, at least for me. I offloaded the heavy MySQL traffic (it's a bunch of cPanel domUs), which was the vast majority of the I/O, and haven't had a crash yet. I'll let you know if I see another one.
>
> Like you, I'm using 1TB SATA disks.

Same here. I've pinned the domU CPUs and given dom0 a higher weight. My VMs have been up since. Many thanks to Pasi for the suggestion to look into that!

best,
Jan
On Fri, Oct 30, 2009 at 08:41:29AM +1300, Jan Bakuwel wrote:
> Hi Nathan,
>
> > I am beginning to strongly suspect this is indeed due to I/O starvation, at least for me. I offloaded the heavy MySQL traffic (it's a bunch of cPanel domUs), which was the vast majority of the I/O, and haven't had a crash yet. I'll let you know if I see another one.
> >
> > Like you, I'm using 1TB SATA disks.
>
> Same here. I've pinned the domU CPUs and given dom0 a higher weight. My
> VMs have been up since. Many thanks to Pasi for the suggestion to look
> into that!

Good to hear it works :)

-- Pasi