All,

After much combing the web I've come to my wits' end regarding this issue.

I have 3 virtual environments all running on separate network cards, bridges, and virtual interfaces. Sporadically, they will stop responding to external connections and just sit there, then pick themselves back up. Users are complaining of speed problems, hangups, and performance problems.

I am using the stock Xen kernel, rpms, and modules I get from CentOS, so this is 3.0.3 with backported fixes.

Host system:

    CentOS 5.2
    2.6.18-92.1.22.el5xen #1 SMP Tue Dec 16 13:08:49 EST 2008 i686 i686 i386 GNU/Linux

Network cards:

    lspci | grep Ethernet
    01:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 02)
    01:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 02)
    02:02.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
    02:02.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)

Xen version:

    rpm -qa | grep -i xen
    xen-3.0.3-64.el5_2.9
    kmod-gfs-xen-0.1.23-5.el5_2.4
    kernel-xen-2.6.18-92.1.22.el5
    kmod-gfs-xen-0.1.19-7.el5
    xen-libs-3.0.3-64.el5_2.9
    kmod-gnbd-xen-0.1.4-12.el5
    kernel-xen-2.6.18-92.1.6.el5
    kmod-gfs-xen-0.1.23-5.el5
    kernel-xen-2.6.18-92.1.13.el5

Interface information:

    eth0      Link encap:Ethernet  HWaddr 00:09:6B:E6:5B:06
              inet addr:xxx.xxx.xxx  Bcast:xxx.xxx.xxx.255  Mask:255.255.255.0
              inet6 addr: fe80::209:6bff:fee6:5b06/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:28364 errors:0 dropped:0 overruns:0 frame:0
              TX packets:6768 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:3434040 (3.2 MiB)  TX bytes:855425 (835.3 KiB)

    eth1      Link encap:Ethernet  HWaddr 00:09:6B:16:5B:06
              inet addr:xxx.xxx.xxx.195  Bcast:xxx.xxx.xxx.255  Mask:255.255.255.0
              inet6 addr: fe80::209:6bff:fe16:5b06/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:21764 errors:0 dropped:0 overruns:0 frame:0
              TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:2677909 (2.5 MiB)  TX bytes:12055 (11.7 KiB)

    eth2      Link encap:Ethernet  HWaddr 00:10:18:0B:10:75
              inet addr:xxx.xxx.xxx.196  Bcast:xxx.xxx.xxx.255  Mask:255.255.255.0
              inet6 addr: fe80::210:18ff:fe0b:1075/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:22015 errors:0 dropped:0 overruns:0 frame:0
              TX packets:81 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:2875611 (2.7 MiB)  TX bytes:23706 (23.1 KiB)

    eth3      Link encap:Ethernet  HWaddr 00:10:18:0B:10:76
              inet addr:xxx.xxx.xxx.227  Bcast:xxx.xxx.xxx.255  Mask:255.255.255.0
              inet6 addr: fe80::210:18ff:fe0b:1076/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:22389 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1187 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:2608708 (2.4 MiB)  TX bytes:186558 (182.1 KiB)

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:6482 errors:0 dropped:0 overruns:0 frame:0
              TX packets:6482 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:9194576 (8.7 MiB)  TX bytes:9194576 (8.7 MiB)

    peth0     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:560490 errors:0 dropped:0 overruns:0 frame:0
              TX packets:285997 errors:12 dropped:0 overruns:0 carrier:0
              collisions:97795 txqueuelen:1000
              RX bytes:674918514 (643.6 MiB)  TX bytes:128105121 (122.1 MiB)
              Interrupt:23

    peth1     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:26646 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1749 errors:0 dropped:0 overruns:0 carrier:0
              collisions:167 txqueuelen:1000
              RX bytes:3348509 (3.1 MiB)  TX bytes:171945 (167.9 KiB)
              Interrupt:11

    peth2     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:27434 errors:0 dropped:0 overruns:0 frame:0
              TX packets:2870 errors:0 dropped:0 overruns:0 carrier:0
              collisions:390 txqueuelen:1000
              RX bytes:4054463 (3.8 MiB)  TX bytes:280913 (274.3 KiB)
              Interrupt:24

    peth3     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:25802 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1187 errors:0 dropped:0 overruns:0 carrier:0
              collisions:1 txqueuelen:1000
              RX bytes:2917674 (2.7 MiB)  TX bytes:193936 (189.3 KiB)
              Interrupt:25

    vif0.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:6768 errors:0 dropped:0 overruns:0 frame:0
              TX packets:28654 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:855425 (835.3 KiB)  TX bytes:3461741 (3.3 MiB)

    vif0.1    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:44 errors:0 dropped:0 overruns:0 frame:0
              TX packets:21764 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:12055 (11.7 KiB)  TX bytes:2677909 (2.5 MiB)

    vif0.2    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:81 errors:0 dropped:0 overruns:0 frame:0
              TX packets:22016 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:23706 (23.1 KiB)  TX bytes:2876137 (2.7 MiB)

    vif0.3    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:1187 errors:0 dropped:0 overruns:0 frame:0
              TX packets:22392 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:186558 (182.1 KiB)  TX bytes:2609854 (2.4 MiB)

    vif4.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:286714 errors:0 dropped:0 overruns:0 frame:0
              TX packets:544428 errors:0 dropped:10 overruns:0 carrier:0
              collisions:0 txqueuelen:32
              RX bytes:132587246 (126.4 MiB)  TX bytes:670620334 (639.5 MiB)

    vif6.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:1489 errors:0 dropped:0 overruns:0 frame:0
              TX packets:18033 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:32
              RX bytes:98329 (96.0 KiB)  TX bytes:2502693 (2.3 MiB)

    vif7.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:2422 errors:0 dropped:0 overruns:0 frame:0
              TX packets:18400 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:32
              RX bytes:165924 (162.0 KiB)  TX bytes:2985768 (2.8 MiB)

    virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
              inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
              inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:185 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 b)  TX bytes:28260 (27.5 KiB)

    xenbr0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:21473 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:2142514 (2.0 MiB)  TX bytes:0 (0.0 b)

    xenbr1    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:21081 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:2102012 (2.0 MiB)  TX bytes:0 (0.0 b)

    xenbr2    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:21050 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:2094001 (1.9 MiB)  TX bytes:0 (0.0 b)

    xenbr3    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
              UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
              RX packets:21004 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:2083399 (1.9 MiB)  TX bytes:0 (0.0 b)

    brctl show
    bridge name     bridge id               STP enabled     interfaces
    virbr0          8000.000000000000       yes
    xenbr0          8000.feffffffffff       no              vif4.0
                                                            peth0
                                                            vif0.0
    xenbr1          8000.feffffffffff       no              vif6.0
                                                            peth1
                                                            vif0.1
    xenbr2          8000.feffffffffff       no              vif7.0
                                                            peth2
                                                            vif0.2
    xenbr3          8000.feffffffffff       no              peth3
                                                            vif0.3
    [root@diatpa3 ~]#

Anyone, for the love of all that is holy, help me out with some guidance? I'm thinking it's the network driver, but it works like a champ on the host OS.

If I can fix this I'll owe someone a steak dinner. Scout's honor.

--
Jonathan Weismann

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users

On Tuesday 20 January 2009, Jonathan Weismann wrote:
> All,
> After much combing the web I've come to my wits end regarding
> this issue.
>
> I have 3 virtual environments all running on separate network cards,
> bridges, and virtual interfaces. Sporadically, they will stop responding
> to external connections and just sit there then pick themselves back up.
> Users are complaining of speed problems, hangups, and performance problems.
>
> I am using the stock xen kernel, rpms, modules I get from CentOS so this is
> a 3.03 with backported fixes.
>
> Host system: CentOS 5.2 2.6.18-92.1.22.el5xen #1 SMP Tue Dec 16 13:08:49
> EST 2008 i686 i686 i386 GNU/Linux

Hi Jonathan

The system you're running may be too new to be suffering from this MTU-related problem, but it might be worth reading this posting:

http://lists.xensource.com/archives/html/xen-devel/2005-12/msg00226.html

--
Phil Driscoll

Thanks for the response Phil,

It was my understanding that Red Hat backported all current fixes into its build and just kept 3.0.3 as the "current" version. My hardware is rather "old": IBM xSeries servers (8 Xeon CPUs, 16 GB of RAM, 2 dual-port Gigabit NICs).

I'll try that "fix", but it is only for Dom0, not the DomU. It was my belief that I am routing all my Xen traffic out the different network ports instead of one central port. Would this "fix" then have to be applied in each DomU?

Would upgrading to 3.2 work any differently? Are there any fixes or patches in there that would solve this problem? I'm still pegging it on the Xen driver for my network cards; I still can't understand how Xen deals with the drivers.

> Message: 3
> Date: Wed, 21 Jan 2009 07:42:32 +0000
> From: Phil Driscoll <phil@dialsolutions.co.uk>
> Subject: Re: [Xen-users] DomU hangs sporadically
> To: xen-users@lists.xensource.com
> Message-ID: <200901210742.32762.phil@dialsolutions.co.uk>
> Content-Type: text/plain; charset="iso-8859-15"
>
> Hi Jonathan
>
> The system you're running may be too new to be suffering from this MTU
> related problem, but it might be worth reading this posting:
>
> http://lists.xensource.com/archives/html/xen-devel/2005-12/msg00226.html
>
> --
> Phil Driscoll

On Wed, Jan 21, 2009 at 9:41 PM, Jonathan Weismann <jweismann@gmail.com> wrote:
> I'll try that "fix" but that is only for the Dom0 domain not the DomU. It
> was my belief that I am routing all my xen traffic out the different network
> ports instead of one central port. Would this "fix" then have to be applied
> across each DomU?

If it's an MTU problem then your domU's network should never work, rather than fail sporadically.

Just a hunch: did you assign static MACs to each domU?

Are the domUs always active (e.g. something like a web server that's accessed day and night) or do they have long idle periods where no packets come in or out? If they're mostly idle, try this:
- ssh to the domU from an outside host
- ping the router
- leave the ping/ssh window open for a day or two, and see if that domU still has problems.

Regards,

Fajar

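A long-running ping like the one suggested above produces a lot of output to read by eye. As a minimal sketch (not something anyone in the thread used), the Python below scans saved ping output for missing icmp_seq numbers. It assumes Linux iputils-style reply lines containing "bytes from" and "icmp_seq=N"; the function name find_gaps is made up for this example, and sequence-number wraparound is not handled.

```python
import re

# Only successful replies count; "Destination Host Unreachable" lines also
# carry an icmp_seq but do not contain "bytes from", so they are skipped.
REPLY_RE = re.compile(r"bytes from .*icmp_seq=(\d+)")


def find_gaps(lines):
    """Return a list of (first_missing, last_missing) icmp_seq ranges."""
    gaps = []
    previous = None
    for line in lines:
        match = REPLY_RE.search(line)
        if match is None:
            continue  # header, summary, or error line
        seq = int(match.group(1))
        # A jump of more than one means the replies in between never arrived.
        if previous is not None and seq > previous + 1:
            gaps.append((previous + 1, seq - 1))
        previous = seq
    return gaps
```

With the ping output redirected to a file (say ping.log, a hypothetical name), find_gaps(open("ping.log")) lists each run of lost replies, which makes it easier to see how long and how frequent the hangs are.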
Yes, I assigned a static MAC address in each DomU:

vif=['mac=00:16:3e:57:03:1b, bridge=xenbr1']

vif=['mac=00:16:3e:57:03:1b, bridge=xenbr2']

Hmmm, this might be a problem. I changed the 2nd one to read 00:17 and am restarting the domains.

If it's this simple I'm going to shoot the machine.

--
Jonathan Weismann
CCNA

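The two vif lines above carry the same MAC address. A check like this can be automated; the sketch below is only an illustration, assuming each domU's config text is available as a string. The function name duplicate_macs and the domU names in the usage note are hypothetical. Whether a duplicate on two different bridges actually explains the hangs depends on whether those bridges reach the same network segment, which the thread does not establish.

```python
import re
from collections import defaultdict

# Matches "mac=xx:xx:xx:xx:xx:xx" inside a vif=[...] entry.
MAC_RE = re.compile(r"mac\s*=\s*([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})")


def duplicate_macs(configs):
    """Find MAC addresses that appear more than once.

    configs maps a domU name to the text of its config file.
    Returns {mac: [domU names using it]} for every repeated MAC.
    """
    owners = defaultdict(list)
    for name, text in configs.items():
        for mac in MAC_RE.findall(text):
            # Compare case-insensitively: 1B and 1b are the same address.
            owners[mac.lower()].append(name)
    return {mac: sorted(names) for mac, names in owners.items() if len(names) > 1}
```

Called with the two configs quoted above under the made-up names "domu1" and "domu2", it reports 00:16:3e:57:03:1b as shared by both.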
As a test I'm scp'ing the CentOS DVD ISO from one machine to the domain, from within the DomU itself. I'm getting "stalled" errors every so often, which leads me to believe I'm still getting network hangups. I'll also try a wget from a website as another test, since it could be the machine.

I ran a ping test from last night to this morning and did see that it would hang every so often. There's no time pattern to when it stops: sometimes it's good for a minute, sometimes it's good for 20s. Sporadic hangups are still occurring.

--
Jonathan Weismann
CCNA

On Wed, Jan 21, 2009 at 10:43 PM, Jonathan Weismann <jweismann@gmail.com> wrote:
> I ran a ping test from last night to this morning and did see that it would
> hang every so often. There's no time/pattern that it stops running
> sometimes it's good for a minute sometimes it's good for 20s. Sporadic
> hangups are occurring.

One more test then. To rule out flaky switches, could you do the same ping test from the domU to another directly connected machine (i.e. over a crossover cable)? If that's not available, then pinging from the domU to another domU attached to the same bridge should also work.

Regards,

Fajar

> Message: 3
> Date: Thu, 22 Jan 2009 06:59:28 +0700
> From: "Fajar A. Nugraha" <fajar@fajar.net>
> Subject: Re: [Xen-users] DomU hangs sporadically
> To: Jonathan Weismann <jweismann@gmail.com>
> Cc: xen-users@lists.xensource.com
> Message-ID:
>     <7207d96f0901211559y1c59ea81s73be52355cbd85ab@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> One more test then. To isolate the effect of flaky switches, could you
> do the same ping test from domU to another direct-connect (i.e. as in
> cross-cable) machine? If it's not available then domU to another domU
> attached to the same bridge should also work.
>
> Regards,
>
> Fajar

I had to move to another machine with Xen 3.2 that I compiled from source. I deployed the same image, and when I ping one host from the other (2 DomUs on the same server sharing the same bridge) I get:

Destination Host Unreachable

On Thu, Jan 22, 2009 at 11:52 PM, Jonathan Weismann <jweismann@gmail.com> wrote:
> I had to move to another machine with Xen 3.2 that I compiled from source, I
> deployed the same image and when I ping one host to the other (2 DomU's on
> the same server sharing the same bridge) I get:
>
> Destination Host Unreachable

If it's a similar setup (same versions, similar hardware, same image, same config) then maybe you're just forgetting something. Perhaps iptables is still turned on on this test machine and preventing traffic from coming through? Try iptables -nL (on dom0), and sniff packets (with tcpdump) on dom0's bridge and on the domU's interfaces during the ping.

The point that I was trying to make is:
- if no traffic at all can go through (like your test setup), most likely you're forgetting something basic: iptables, IP address/subnet setup, your bridge is still down (e.g. you created it manually but forgot to run "ifconfig brtest up"), etc.
- if some of a domU's traffic can never go through (example: ICMP works, but TCP connections are broken) then it's most likely an MTU problem
- if domU <-> domU and domU <-> dom0 traffic on the same machine works, but domU <-> outside world is broken, then most likely something is wrong with the network card (driver or hardware)
- if domU <-> domU and domU <-> dom0 traffic on the same machine works, but domU <-> outside world is SOMETIMES broken, I tend to suspect a basic general networking problem: conflicting MACs, flaky switches, problematic routers, etc.

Regards,

Fajar

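The four cases in the checklist above can be written down as a small lookup. This is a rough sketch for reference only; the function name likely_cause, its arguments, and the four labels for the external-traffic state are invented here and are not part of any Xen tool.

```python
def likely_cause(internal_ok, external):
    """Map observed symptoms to the suspect named in the checklist.

    internal_ok: True if domU <-> domU and domU <-> dom0 traffic on the
                 same machine works.
    external:    state of domU <-> outside traffic, one of
                 "ok", "partial" (e.g. ICMP works but TCP never does),
                 "never", or "sometimes".
    """
    if external not in ("ok", "partial", "never", "sometimes"):
        raise ValueError("unknown external state: %r" % (external,))
    if external == "partial":
        return "MTU problem"
    if not internal_ok and external == "never":
        return "basic setup: iptables, IP/subnet, bridge down"
    if internal_ok and external == "never":
        return "network card (driver or hardware)"
    if internal_ok and external == "sometimes":
        return "general networking: conflicting MACs, flaky switches, routers"
    return "not covered by the checklist"
```

For the production host described at the start of the thread, where the domUs work but stall intermittently, this would be likely_cause(True, "sometimes"), assuming internal traffic there is healthy, which the thread has not yet confirmed.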