Hello,

I am running Xen 3.2.1 under a Gentoo 2.6.21-xen kernel. I have four domains running under a single dom0, two hvm: Windows XP and Windows 2k3, and two pv: a Gentoo 2.6.25 kernel, and a Ubuntu 2.6.24 kernel. I have also experienced the same behaviour under a 2.6.21-xen gentoo kernel.

All the domU networks are bridged to a single 1gb nic, and I have tried an alternative physical nic. There is very little load on this nic - this is a test environment.

At a certain point that I have not established exactly, the network load takes out the pv network. For example, if I initiate a bittorrent session in a pv domU, I get a slow build up of network load, and then connectivity is lost to both of the pv domU's. If I console into them, they cannot ping outside the network, but they can ping their own interfaces. A tcpdump on the dom0 physical shows no traffic. However, during all this, the hvm domains are able to use their network connections without issues.

A shutdown of the broken domU doesn't work, as they have nfs shares loaded and it hangs on the nfs unmount, but I suspect that without this they would shutdown cleanly. In any case, I have to destroy them. If I attempt to recreate, I get:

Error: Device 0 (vif) could not be connected. Hotplug scripts not working.

The xend log only shows:

...
[2008-08-31 19:14:11 5531] DEBUG (DevController:595) hotplugStatusCallback /local/domain/0/backend/vif/11/0/hotplug-status.
[2008-08-31 19:15:51 5531] DEBUG (XendDomainInfo:1897) XendDomainInfo.destroy: domid=11
...

Although the hvm domUs are still working, if I shut them down, they hang on start up, again with the vif problem.

I used the bittorrent example above to demonstrate it is at a certain traffic load, however if I do a large cp from the domU to an nfs share, it will fail almost instantly.

Restarting xend has no effect. The only thing I can fix the problem with is a reboot of the dom0.
Here is the dom0 kernel line:

module /xen-2.6.21-noreal root=/dev/sda2 max_loop=255

(the noreal just refers to the realtek drivers being removed from the kernel, as I tried to use Realtek's own drivers from their website to resolve the problem, but the behaviour is the same).

Here is the domU cfg:

=========================================
kernel = '/xen/kernels/xen-2.6.25-pae'
ramdisk = '/etc/xen/kernels/initramfs-genkernel-x86-2.6.25-gentoo-r7-ich10'
extra = 'console=hvc0'
memory = '768'
disk = [ 'phy:sda7,hda3,w', 'file:/xen/domains/zenayonswap.img,hdb,w' ]
name = 'zenayon'
vif = ['bridge=eth1, mac=00:16:3E:11:11:12']
root='/dev/xvda3'
cpu_cap = 100
#sdl=0
#acpi=0
#apic=0
localtime=1
===============================

The dmesg of the dom0 and domU don't have any clues that I can see, nor log/messages, nor the xen logs.

I am at a bit of a loss as to how to diagnose this. All the other networking related issues seem to have been resolved in earlier releases and/or are related to routed mode. It seems to be related to the dom0 kernel or xend, as these are the things that haven't changed in my testing. Perhaps I have a setting in my dom0 kernel that is not compatible?

Thanks for any help,

Paul

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Pepe Barbe
2008-Sep-02 23:56 UTC
Re: [Xen-users] domU network fails under load - vif breaks
On Sep 1, 2008, at 8:57 AM, Paul wrote:

> At a certain point that I have not established exactly, the network
> load takes out the pv network. For example, if I initiate a
> bittorrent session in a pv domU, I get a slow build up of network
> load, and then connectivity is lost to both of the pv domU's. If I
> console into them, they cannot ping outside the network, but they
> can ping their own interfaces. A tcpdump on the dom0 physical shows
> no traffic. However, during all this, the hvm domains are able to
> use their network connections without issues.

I have experienced the same issue with Ubuntu 8.04. The only solution I have found is to limit the rate of the nic so that it doesn't reach a transfer rate that breaks the networking.

Pepe
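Pepe does not say how the rate was limited. One possible way, sketched here under the assumption that the xm toolstack in use accepts the `rate=` parameter in a vif specification, is a per-vif cap in the domU config file (which is itself Python syntax). The bridge and MAC are taken from Paul's posted config; the `10Mb/s` figure is only a placeholder and does not come from this thread.

```python
# domU config fragment (xm config files are parsed as Python).
# Assumption: this xend version supports the vif "rate=" parameter.
# The 10Mb/s value is a placeholder, not a figure reported by anyone here.
vif = ['bridge=eth1, mac=00:16:3E:11:11:12, rate=10Mb/s']
```

A cap like this should only shape traffic sent by the domU, so it may not help if the trigger is traffic flowing towards the guest.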
> On Sep 1, 2008, at 8:57 AM, Paul wrote:
>
>> At a certain point that I have not established exactly, the
>> network load takes out the pv network. For example, if I initiate
>> a bittorrent session in a pv domU, I get a slow build up of
>> network load, and then connectivity is lost to both of the pv
>> domU's. If I console into them, they cannot ping outside the
>> network, but they can ping their own interfaces. A tcpdump on the
>> dom0 physical shows no traffic. However, during all this, the hvm
>> domains are able to use their network connections without issues.
>
> I have experienced the same issue with Ubuntu 8.04. The only solution
> I have found is to limit the rate of the nic so that it doesn't reach
> a transfer rate that breaks the networking.
>
> Pepe

Thanks Pepe. My last test showed the network breaking at 3.6Mbs - too slow to lower the nic rate, and having networking running that slow would make the vm unusable for its purpose.

I have upgraded Xen from 3.2.1 to 3.3, and the problem remains, which means I have changed:

1) domU kernel (2.6.21, 2.6.24, 2.6.25)
2) domU userland (gentoo, ubuntu)
3) dom0 physical nic + driver (Realtek 8169 -> 8168)
4) Xen version (3.2.1 -> 3.3)

The only thing I haven't changed is the dom0 kernel. I am using a stock 2.6.21-xen gentoo kernel, so it would be great if anyone watching this that has pv domUs working under a 2.6.21 kernel would post their .config so I can compare it to mine.

Thanks,

Paul
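A minimal sketch of how the break point could be measured more precisely, assuming a Linux dom0 where per-interface byte counters are exposed in /proc/net/dev. The function names (`parse_counters`, `rates_mbps`, `watch`) are made up for illustration, and the interface names that show up (for example eth1 or a vifN.M device) depend on the local setup.

```python
import time


def parse_counters(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        name, _, fields = line.partition(":")
        values = fields.split()
        if len(values) >= 9:
            # Field 0 is received bytes, field 8 is transmitted bytes.
            counters[name.strip()] = (int(values[0]), int(values[8]))
    return counters


def rates_mbps(before, after, interval):
    """Per-interface (rx, tx) rates in Mb/s between two counter snapshots."""
    rates = {}
    for name, (rx1, tx1) in after.items():
        if name in before:
            rx0, tx0 = before[name]
            rates[name] = ((rx1 - rx0) * 8 / interval / 1e6,
                           (tx1 - tx0) * 8 / interval / 1e6)
    return rates


def read_counters(path="/proc/net/dev"):
    with open(path) as f:
        return parse_counters(f.read())


def watch(interval=5.0, samples=12):
    """Print per-interface rates every `interval` seconds, `samples` times."""
    previous = read_counters()
    for _ in range(samples):
        time.sleep(interval)
        current = read_counters()
        stamp = time.strftime("%H:%M:%S")
        for name, (rx, tx) in sorted(rates_mbps(previous, current, interval).items()):
            print(f"{stamp} {name:>10} rx {rx:8.2f} tx {tx:8.2f} Mb/s")
        previous = current
```

Running `watch()` in dom0 while the transfer builds up would give a timestamped record of the last rates seen on the physical nic and on each vif before the counters stop moving, which may be easier to compare between setups than a single estimated figure.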
Nathan Eisenberg
2008-Sep-05 03:46 UTC
RE: [Xen-users] Re: domU network fails under load - vif breaks
Paul,

I'm wondering if #3 is a large enough change to rule out driver issues. Maybe try an entirely different chipset and manufacturer - perhaps an Intel, Broadcom, etc NIC.

Thank you,
Nathan Eisenberg
Sr. Systems Administrator
Atlas Networks, LLC

-----Original Message-----
From: xen-users-bounces@lists.xensource.com [mailto:xen-users-bounces@lists.xensource.com] On Behalf Of Paul
Sent: Thursday, September 04, 2008 3:43 PM
Cc: xen-users@lists.xensource.com
Subject: [Xen-users] Re: domU network fails under load - vif breaks

[snip]
Pepe Barbe
2008-Sep-05 14:16 UTC
Re: [Xen-users] Re: domU network fails under load - vif breaks
On Sep 4, 2008, at 10:46 PM, Nathan Eisenberg wrote:

> I'm wondering if #3 is a large enough change to rule out driver
> issues. Maybe try an entirely different chipset and manufacturer -
> perhaps an Intel, Broadcom, etc NIC.

Just to let people know; my HW NICs are NetXtreme BCM5722 Gigabit Ethernet PCI Express and I experience the same issue. Is it enough to rule out driver issues?

Pepe
Dustin Henning
2008-Sep-05 14:54 UTC
RE: [Xen-users] Re: domU network fails under load - vif breaks
I have a backup running on four Windows HVM domains (I know you asked for someone using PV domUs, but these are running on the PV network interfaces, as I am using James Harper's GPLPV drivers). All four DomUs are on the same Dom0, which has one gigabit connection. The backups are going toward a separate physical backup server. Here is a summary of said backups:

HVM1:
Start backup objects time 09/05/2008 1:11:04 AM
Elapsed backup objects time 2 hours,14 minutes,30 seconds
Completed 61743 objects, 9.33 GB (100%)

HVM2:
Start backup objects time 09/05/2008 2:18:36 AM
Elapsed backup objects time 57 minutes,30 seconds
Completed 38071 objects, 6.05 GB (100%)

HVM3:
Start backup objects time 09/05/2008 2:21:41 AM
Elapsed backup objects time 57 minutes,15 seconds
Completed 36175 objects, 5.94 GB (100%)

HVM4:
Start backup objects time 09/05/2008 1:17:47 AM
Elapsed backup objects time 1 hour,30 minutes,30 seconds
Completed 36101 objects, 5.92 GB (100%)

As you can see, these backups overlap, so the network usage across the bridge/NIC would be even higher than what it takes to back up 6 GiB in just under an hour (I suspect that is quite a bit more than 3.6Mbps, but I haven't calculated or benchmarked). Also, as you can see here, they have been running for some time:

[virtadmin@virt1 ~]$ sudo /usr/sbin/xm list
Name          ID   Mem  VCPUs  State    Time(s)
Domain-0       0  2858      4  r-----  160986.6
xm1           17  2048      1  ------  233816.8
xm2           18  1024      1  -b----  188676.6
xm3           19  1024      1  -b----  166046.2
xm4           20  1024      1  -b----  171760.6
[virtadmin@virt1 ~]$ uptime
10:50:40 up 65 days, 20:45, 1 user, load average: 0.00, 0.01, 0.00

The DomUs are currently IDs 17-20 because of reboots for software installs, updates, and whatnot. xm1's output from systeminfo shows it has been up for 17 Days, 18 Hours, 24 Minutes, 31 Seconds. This is with the backup mentioned above running 5 times a week.

Anyway, I am running Fedora 8 with a kernel from their repo. I have attached the .config for that kernel in case you want to compare. The version matches yours, but it's hard telling what additional patches the Fedora kernel might include (though technically said patches would be available with the source, which should be downloadable from multiple mirrors, so I guess it's not so hard telling).

Hope this is helpful,
Dustin

-----Original Message-----
From: xen-users-bounces@lists.xensource.com [mailto:xen-users-bounces@lists.xensource.com] On Behalf Of Paul
Sent: Thursday, September 04, 2008 18:43
Cc: xen-users@lists.xensource.com
Subject: [Xen-users] Re: domU network fails under load - vif breaks

[snip]
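Dustin's backup figures can be turned into average throughput with a few lines of arithmetic. This is a rough sketch that assumes the backup tool's "GB" means 10**9 bytes (binary units would raise each figure by about 7%) and that ignores protocol overhead and any compression; the helper name `average_mbps` is made up for illustration.

```python
def average_mbps(gigabytes, hours, minutes, seconds):
    """Average throughput in megabits per second, assuming 1 GB = 10**9 bytes."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    return gigabytes * 1e9 * 8 / elapsed / 1e6


# (size in GB, hours, minutes, seconds) as reported in the backup summary
backups = {
    "HVM1": (9.33, 2, 14, 30),
    "HVM2": (6.05, 0, 57, 30),
    "HVM3": (5.94, 0, 57, 15),
    "HVM4": (5.92, 1, 30, 30),
}

for name, job in backups.items():
    print(f"{name}: {average_mbps(*job):.1f} Mb/s")
```

Under those assumptions each job averages roughly 9 to 14 Mb/s on its own, so even a single backup runs well above the 3.6Mbs at which Paul saw the network break, and the overlapping jobs add up to more.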
Dustin Henning
2008-Sep-05 15:02 UTC
RE: [Xen-users] Re: domU network fails under load - vif breaks
I just posted a lengthy response to Paul. It does not include my NIC info, so here it is:

04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)

I am running an MSI P965 Platinum with the on board NIC ("Realtek RTL8111B PCI-Express Gb LAN Controller" according to the website). This is why I am running Fedora vs CentOS, actually, though who knows, CentOS might have this problem if it doesn't turn out to be related to your NIC and some patch Fedora included fixes it.

Dustin
Paul
2008-Sep-06 02:50 UTC
Re: [forums] RE: [Xen-users] Re: domU network fails under load - vif breaks
Hi Nathan,

Yeah, this was something I was wondering about too... The driver change was from the kernel driver to the manufacturer's own driver - but I imagine it leverages off code in the kernel in any case.

I am testing a tulip based 10/100 card at the moment, and pushing it pretty hard without any issues. This isn't a one for one test yet, as I have my other domains off the original nic, but it certainly looks promising.

Paul

Nathan Eisenberg wrote:
> Paul,
>
> I'm wondering if #3 is a large enough change to rule out driver issues. Maybe try an entirely different chipset and manufacturer - perhaps an Intel, Broadcom, etc NIC.
>
> Thank you,
> Nathan Eisenberg
> Sr. Systems Administrator
> Atlas Networks, LLC

--
email: paul@airbred.com | msn: paul@airbred.com | gtalk: paul@g.airbred.com | skype: paulkoan | y!: paulxkoan | blog: http://www.servwise.com/blog-paul
ServWise Advanced Hosting www.servwise.com
Standard and Reseller web hosting on Windows and Linux
Jon Simonds
2008-Sep-11 18:34 UTC
Re: [Xen-users] domU network fails under load - vif breaks
Nathan Eisenberg wrote:
> I'm wondering if #3 is a large enough change to rule out driver issues. Maybe try an entirely different chipset and manufacturer - perhaps an Intel, Broadcom, etc NIC.

I'm curious if anybody has made any further progress on this. I too have the exact same issues as Paul-372 (identical.) Only fix is to reboot the dom0. Now running Xen 3.3.0 with Xen-Sources 2.6.21 on a gentoo base system. Same issues on Xen 3.2.0.

I will add to the list that I am using a SuperMicro based server, with built-in Intel Pro/1000 Nics. I've had this issue from the beginning, and have tried to limit things that cause the network bridge to collapse, but this makes the machine worthless. This is supposed to be a production level machine for us, but at the moment it obviously isn't useable. I can cause an immediate drop of all the vifs by doing a copy to an nfs share on dom0 from any of the domU machines. I am running 4 PV domUs and 1 Win32 DomU on this machine. I will provide any information that may help in determining the root of the problem.

from lspci:
0d:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)
0e:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller

from syslog:
Sep 10 18:58:15 xen Intel(R) PRO/1000 Network Driver - version 7.3.20-k2
Sep 10 18:58:15 xen e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
Sep 10 18:58:15 xen e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection

I see that intel has a new driver, but it is only available as a module. I tend to compile the kernel for exactly what I need, and keep the machines as lean as possible. I would prefer not to go this route, but will if it offers a fix. I have not done this yet, as it seems to not matter what NIC is being used. I believe I may be the first to post about the problem with Intel Pro/1000 nics.

So far for me, this problem seems to be strictly related to heavy traffic between the dom0 and the domUs. If I place an nfs server on one of the other domUs I seem to be able to transfer files without causing the vif drop. I have a colleague that has the exact same servers, and he doesn't do any domU to dom0 traffic, just domU to the internet, and he is able to sustain from 15Mb/sec to 100Mb/sec data streams without dropping the vifs. Maybe it is possible that this level of throughput is not causing him problems, as I'm not completely aware of what type of throughput between my DomUs and Dom0 is actually dropping the vifs, but I see that it is as low as 3Mbit for some people.

-- I apologize if this hits the list incorrectly or misses the thread... I tried to reply through Nabble, but my post has been pending since yesterday, so I figured I would try posting directly to the list.
Paul
2008-Sep-11 22:27 UTC
Re: [forums] Re: [Xen-users] domU network fails under load - vif breaks
I have been running for a few days now with a tulip 10/100 card, and without issues. The original realtek was a gigabit card, so this isn't a one for one test - however, I was getting the network breaking at 3-4Mbs as far as I could tell.

I am using nfs to move the data, and this is directly from domU to dom0 - I haven't fully got my head around whether this traffic even traverses the nic - does xen have the smarts to keep the traffic on the vif? Anyway, the same action was causing a break, and no longer is.

Is the vif speed the same as the physical nic? ethtool only shows link up from the domU's perspective of the card.

I am going to source a non-realtek gigabit card and see if the problem is just a combination of gigabit and gentoo somehow.