Matthieu Patou
2010-Feb-08  20:55 UTC
[Xen-users] slow network with gplpv drivers in vlan setup
Hello James,

A long time ago I ran into a problem on some of my servers: a network interface using the GPLPV driver has horrible throughput (~6 minutes to copy ~15 MBytes).
I suppose it's partially linked to my setup, as I have a vlan interface of the dom0 in the bridge (because I want to put some PV/HV servers in different vlans).

The problem occurs only with servers/workstations that are not on the same host (i.e. domU/domU communication on the same host is OK, and so is dom0/domU).

* If I put the plain interface (i.e. eth0) in the bridge and do the vlan
* Without GPLPV the throughput is OK

Today I ran some more tests:

* I tried 0.11.0.188
* I tried deactivating checksum offload

As a last test I deactivated scatter/gather, and it turns out that this does the trick.

I have two captures:
http://www.matws.net/mat/misc/xen_slow.gz
http://www.matws.net/mat/misc/xen_notslow_part.gz

Any chance of an explanation? What is scatter/gather supposed to bring?

Matthieu.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
James Harper
2010-Feb-09  01:14 UTC
RE: [Xen-users] slow network with gplpv drivers in vlan setup
> As a last test I deactivated scatter/gather and it turns out that it
> make it.
>
> Any chance for an explanation ? what scatter/gather is supposed to bring ?

Large Send is probably what is causing you problems. Some network cards support large send offload only for untagged packets. Linux doesn't seem to know this, though, so the result is just that it doesn't work. Try turning off large send and see what happens.

Turning off scatter/gather will (almost completely) disable large send as well, because Windows is then limited to a total packet length of 4096 bytes, which can't be broken up unless the MSS is really small.

James
Fajar A. Nugraha
2010-Feb-09  02:44 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On Tue, Feb 9, 2010 at 3:55 AM, Matthieu Patou <mat+Informatique.xen@matws.net> wrote:
> I faced quite a long time ago a problem on some of my servers:
> network interface with gplpv driver has an horrible throughput (~6minute to
> copy ~15MBytes).

> Today I remade some test:
>
> * I tried the 0.11.0.188

This might not be directly related, but I used 0.11.0.188 yesterday for an XP domU on RHEL's Xen (3.1.2). While copying files, the performance was dreadfully slow. MUCH slower than QEMU. A bit puzzling, since on another system I have a similar setup, but with Gitco's Xen 3.4.1.

I uninstalled that GPLPV version and installed 0.10.0.142. What do you know, it worked great :D I haven't had time to find out what exactly is wrong, though. So if you still have problems, you might want to try that too.

--
Fajar
Matthieu Patou
2010-Feb-09  08:10 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
Hi Fajar,

> Uninstalled that GPLPV version, and install 0.10.0.142. What do you
> know, they worked great :D Haven't got time to find out what exactly
> is wrong though. So if you still have problems, you might want to try
> that too.

The thing is that I remember having this problem since 0.9.xx, so it's a constant problem.

Matthieu.
Fajar A. Nugraha
2010-Feb-09  08:19 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On Tue, Feb 9, 2010 at 3:10 PM, Matthieu Patou <mat+Informatique.xen@matws.net> wrote:
> The thing is that I remember having this problem since 0.9.xx so it's a
> constant problem.

You didn't mention what hardware/Xen version you use.
As James mentioned, the problem might be related to network cards. Turning off Large Send might solve your problem (at the cost of extra configuration effort and possibly higher CPU usage).

However, from my experience with RHEL and tg3, sometimes driver and firmware versions matter as well. So if you have time, I suggest:
- check the server manufacturer's website to see if they have newer firmware
- update your OS, or change to a known-good OS. On some systems I use RHEL5 with Gitco's updated Xen RPM, which works great with vlans and even bonding.

--
Fajar
Matthieu Patou
2010-Feb-09  08:37 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
James,

> Large Send is probably what is causing you problems. Some network cards
> support large send offload only for untagged packets.

Well, that's a bit strange, because on 2 identical dom0 servers configured the same way I had this behavior on only one dom0 (the really loaded server).

> Linux doesn't seem
> to know this though so the result is just that it doesn't work. Try
> turning off large send and see what happens.

OK, so you suggest changing the value from 61440 to something smaller (e.g. 8192)?

> Turning off scatter gather will (almost completely) disable large send
> also, because windows is then limited to a total packet length of 4096
> bytes which can't be broken up unless the MSS is really small.

The question that remains is: what is the drawback of not having large send?

Matthieu.
James Harper
2010-Feb-09  10:03 UTC
RE: [Xen-users] slow network with gplpv drivers in vlan setup
> Well that's a bit strange because on 2 identical dom0 servers configured
> the same way I had this behavior only on one dom0 (the really loaded
> server).

How identical is identical? Are you able to determine the exact chipset and firmware version of the network adapters? Even if you bought the two servers from the same supplier at exactly the same time, there is still a chance that there is some hardware difference, most likely firmware.

> Ok so you suggest to change the value of 61440 to something smaller (ie.
> 8192)

It may be worth a go, but I suspect that the difference would be in turning it off or on, not the size. Unless the underlying chipset has some limitation in size... I hadn't considered that.

> The question that stay is what is drawback of not having large send ?

Large send means that the network card will accept TCP packets well in excess of the actual MTU, up to about 60K. The network card computes checksum, seq, etc. for you. So if you want to send a lot of TCP data, it's the difference between Windows giving one 60K packet to the network card vs 40 packets.

It gets even better when you are talking about virtual machines, because a Linux Dom0 can keep the packet 'large' as long as all the things it has to pass through can handle it, whether that's from the DomU to Dom0, DomU through the bridge to another DomU, or DomU through the bridge to the physical network card.

In the testing I've done it's been the difference between 2 Gbit/s iperf throughput and 3-4 Gbit/s. That's probably not representative of real-world workloads though.

So in turning it off you do lose out on performance, but only if it worked in the first place, which it doesn't for you.

If you could figure out exactly what is different between your 2 Dom0s I'd be grateful. I keep getting these reports of LSO causing problems for some people and have never been able to properly figure out exactly why...

Thanks

James
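The "one 60K packet vs 40 packets" arithmetic can be checked with a rough sketch. It assumes the 61440-byte large send size mentioned earlier in the thread and an MSS of 1460 bytes (a 1500-byte MTU minus 40 bytes of IPv4 and TCP headers without options); the function name is purely illustrative.

```python
def segments_needed(payload_bytes: int, mss: int = 1460) -> int:
    """Return how many TCP segments are needed to carry payload_bytes.

    mss=1460 is an assumption: 1500-byte Ethernet MTU minus 20 bytes of
    IPv4 header and 20 bytes of TCP header (no options).
    """
    if payload_bytes < 0 or mss <= 0:
        raise ValueError("payload_bytes must be >= 0 and mss must be > 0")
    # Integer ceiling division, avoids floating point rounding.
    return -(-payload_bytes // mss)


if __name__ == "__main__":
    large_send = 61440  # large send size quoted in the thread
    count = segments_needed(large_send)
    print(f"{large_send} bytes at MSS 1460 -> {count} segments on the wire")
```

Without large send the operating system has to build and hand over each of those segments itself; with it, the segmentation work moves to the card (or stays undone while the packet travels between domains on the same host).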
Pasi Kärkkäinen
2010-Feb-09  10:22 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On Tue, Feb 09, 2010 at 09:03:49PM +1100, James Harper wrote:
> How identical is identical? Are you able to determine the exact chipset
> and firmware version of the network adapters? Even if you bought the two
> servers from the same supplier at exactly the same time, there is still
> a chance that there is some hardware difference, most likely firmware.

"ethtool -i <interface>" in dom0 might show the firmware version, like this:

# ethtool -i eth0
driver: e1000e
version: 1.0.2-k2
firmware-version: 5.11-8
bus-info: 0000:04:00.0

> If you could figure out exactly what is different between you're 2
> Dom0's I'd be grateful. I keep getting these reports of LSO causing
> problems for some people and have never been able to properly figure out
> exactly why...

Please paste the "ethtool ethX" and "ethtool -i ethX" output from both dom0s.
If you're using the xen network-bridge script, then the device is pethX instead of ethX.

-- Pasi
Pasi Kärkkäinen
2010-Feb-09  10:27 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On Tue, Feb 09, 2010 at 12:22:09PM +0200, Pasi Kärkkäinen wrote:
> Please paste "ethtool ethX" and "ethtool -i ethX" output from both dom0s.
> If you're using the xen network-bridge script then the device is pethX instead of ethX.

Oh, and also "ethtool -k ethX" to get the dom0 offloading settings.

-- Pasi
Matthieu Patou
2010-Feb-09  13:38 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On 09/02/2010 11:19, Fajar A. Nugraha wrote:
> You didn't mention what hardware/Xen version you use.
> As James mentioned, the problem might be related to network cards.
> Turning off Large Send might solve your problem (at the cost of extra
> settings effort, and possible higher CPU usage).

Right, I am on Debian lenny, which ships with Xen 3.2.1 (a bit old ... right).
Hardware is Intel Gigabit 80003ES2LAN.

But I also see this behavior on a test workstation with the same OS but the following NIC: BCM5784M Gigabit Ethernet

> However, from my experince with RHEL and tg3, sometimes driver and
> firmware version matters as well. So if you have time, I suggest:
> - check server manufacturer's website to see if they have a newer firmware
> - update your OS, or change to known-good OS. On some systems I use
> RHEL5 with Gitco's updated Xen RPM, which works great with vlans and
> even bonding.

I didn't find any update for the moment.
I'm not very familiar with this Gitco stuff; what is it exactly?

Matthieu.
Matthieu Patou
2010-Feb-09  14:45 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On 09/02/2010 13:03, James Harper wrote:
> How identical is identical? Are you able to determine the exact chipset
> and firmware version of the network adapters? Even if you bought the two
> servers from the same supplier at exactly the same time, there is still
> a chance that there is some hardware difference, most likely firmware.

Well, identical as in the same server bought at the same time. As for the firmware, I don't really know how to check it for the NIC (any tips?).

> If you could figure out exactly what is different between you're 2
> Dom0's I'd be grateful. I keep getting these reports of LSO causing
> problems for some people and have never been able to properly figure out
> exactly why...

As I said in some other mails, I also see the same behavior on a test server that is in fact a Dell workstation with a Broadcom gigabit NIC.
Once I know how to look up the firmware of the NIC I'll try to dig into it.

Matthieu.
Matthieu Patou
2010-Feb-09  15:11 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
Hello Pasi, James

> ethtool -i <interface> in dom0 might show the firmware version.

> Oh, and also "ethtool -k ethX" to get the dom0 offloading settings.

So on the Dell workstation with a Broadcom card, where I saw the problem yesterday, I have:

ethtool -i eth1
driver: tg3
version: 3.92.1
firmware-version:
bus-info: 0000:02:00.0

ethtool -k eth1
Offload parameters for eth1:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off

eth1 is the physical interface supporting the vlan99 device, which is in the bridge for the Windows servers.

On the other side, on the 2 servers where I saw the problem earlier with the same image (I copied the image to the second server to be sure when I discovered a difference in behavior between the two servers), I have this:

"Good" server

ethtool -i eth1
driver: e1000e
version: 0.3.3.3-k2
firmware-version: 2.1-12
bus-info: 0000:04:00.1

ethtool -k eth1
Offload parameters for eth1:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off

"Bad" server

ethtool -i eth1
driver: e1000e
version: 0.3.3.3-k2
firmware-version: 2.1-12
bus-info: 0000:04:00.1

ethtool -k eth1
Offload parameters for eth1:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off

Matthieu.
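Comparing two "ethtool -k" dumps by eye is error-prone, so a small sketch of an automated comparison may help. It assumes the old-style output format pasted above (one "name: on|off" pair per line); the function names are illustrative, and the sample strings are copied from the "Good" and "Bad" server output in this message.

```python
def parse_offload(text: str) -> dict:
    """Parse 'ethtool -k' output into {setting name: 'on' or 'off'}.

    Lines whose value does not start with on/off (the header line, the
    'Cannot get device flags' warning) are ignored.
    """
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        tokens = value.split()
        if sep and tokens and tokens[0] in ("on", "off"):
            settings[key.strip()] = tokens[0]
    return settings


def diff_offload(a: dict, b: dict) -> dict:
    """Return {setting: (value in a, value in b)} for settings that differ."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


GOOD = """Offload parameters for eth1:
Cannot get device flags: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off
"""
BAD = GOOD  # the "Bad" server output above is identical

if __name__ == "__main__":
    differences = diff_offload(parse_offload(GOOD), parse_offload(BAD))
    print(differences or "no differences in offload settings")
```

For these two dumps the comparison comes back empty, so the difference between the servers has to lie outside the reported offload flags.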
Pasi Kärkkäinen
2010-Feb-09  17:56 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On Tue, Feb 09, 2010 at 06:11:57PM +0300, Matthieu Patou wrote:
> "Good" server
> ethtool -i eth1
> driver: e1000e
> version: 0.3.3.3-k2
> firmware-version: 2.1-12
> bus-info: 0000:04:00.1
>
> "Bad server"
> ethtool -i eth1
> driver: e1000e
> version: 0.3.3.3-k2
> firmware-version: 2.1-12
> bus-info: 0000:04:00.1

Hmm.. so the driver version is the same, the firmware is the same, and the offloading settings are the same.

Is the bridge/vlan/bond configuration exactly the same on the good and bad servers?

-- Pasi
Pasi Kärkkäinen
2010-Feb-09  18:12 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On Tue, Feb 09, 2010 at 04:38:24PM +0300, Matthieu Patou wrote:
> Right, I am on a debian lenny which is shipped with xen 3.2.1 (a bit old
> ... right).
> Hardware is Intel Gigabit 80003ES2LAN.
>
> But I also have this behavior on a test workstation with the same OS but
> the following nic: BCM5784M Gigabit Ethernet

Hmm.. the bug could be in the dom0 kernel. Lenny's 2.6.26-2-xen is known to have many issues.

Can you try linux-2.6.18-xen as the dom0 kernel? Or the centos5 kernel-xen? Both of those are known to work as dom0 without problems.

http://wiki.xensource.com/xenwiki/XenDom0Kernels

> I didn't find any update for the moment.
> I'm not very familiar with this gitco stuff, what is it exactly ?

gitco.de/repo/ provides newer Xen hypervisor RPMs for rhel5/centos5, but it seems you're using Debian, so it doesn't help you.

-- Pasi
James Harper
2010-Feb-09  23:06 UTC
RE: [Xen-users] slow network with gplpv drivers in vlan setup
> "Good" server
> ethtool -i eth1
> driver: e1000e
> version: 0.3.3.3-k2
> firmware-version: 2.1-12
> bus-info: 0000:04:00.1

Can you do an 'lspci -v' and an 'lspci -vn' and post the output for the e1000 adapter? On mine it looks like:

lspci -v
08:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)
        Subsystem: Hewlett-Packard Company HP 110T PCIe Gigabit Server Adapter

lspci -vn
08:00.0 0200: 8086:10b9 (rev 06)
        Subsystem: 103c:704a

I'm pretty sure that some 'e1000' chipsets don't support LSO for tagged packets...

James
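The numeric "lspci -vn" line is what identifies the exact chipset, since the driver name alone covers many different parts. As a minimal sketch, assuming the line layout shown in the sample above (slot, class code, then vendor:device), the IDs can be pulled out like this; the function name and the returned field names are illustrative only.

```python
import re

# Matches e.g. "08:00.0 0200: 8086:10b9 (rev 06)"
LSPCI_VN = re.compile(
    r"^(?P<slot>\S+)\s+(?P<pci_class>[0-9a-fA-F]{4}):\s+"
    r"(?P<vendor>[0-9a-fA-F]{4}):(?P<device>[0-9a-fA-F]{4})"
)


def parse_lspci_vn(line: str) -> dict:
    """Extract slot, class code, vendor ID and device ID from one lspci -vn line."""
    match = LSPCI_VN.match(line.strip())
    if match is None:
        raise ValueError(f"not an lspci -vn device line: {line!r}")
    return match.groupdict()


if __name__ == "__main__":
    sample = "08:00.0 0200: 8086:10b9 (rev 06)"  # sample line from this message
    info = parse_lspci_vn(sample)
    print(f"slot {info['slot']}: vendor {info['vendor']}, device {info['device']}")
```

Vendor ID 8086 is Intel; the device ID is the part that distinguishes one chipset from another when two adapters use the same driver.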
Matthieu Patou
2010-Feb-10  08:52 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On 09/02/2010 21:12, Pasi Kärkkäinen wrote:
> Hmm.. the bug could be in the dom0 kernel. Lenny's 2.6.26-2-xen is known to
> have many issues.
>
> Can you try linux-2.6.18-xen as dom0 kernel? Or centos5 kernel-xen?
> Both of those are known to work as dom0 without problems.
>
> http://wiki.xensource.com/xenwiki/XenDom0Kernels

Unfortunately, for driver reasons, that's not a possibility.

> gitco.de/repo/ is newer xen hypervisor rpms for rhel5/centos5,
> but it seems you're using Debian so it doesn't help you

I'll try the testing version of Xen to see if it brings any improvement.

Matthieu.
Pasi Kärkkäinen
2010-Feb-10  09:59 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On Wed, Feb 10, 2010 at 11:52:17AM +0300, Matthieu Patou wrote:
> On 09/02/2010 21:12, Pasi Kärkkäinen wrote:
>> Hmm.. the bug could be in the dom0 kernel. Lenny's 2.6.26-2-xen is known to
>> have many issues.
>>
>> Can you try linux-2.6.18-xen as dom0 kernel? Or centos5 kernel-xen?
>> Both of those are known to work as dom0 without problems.
>>
>> http://wiki.xensource.com/xenwiki/XenDom0Kernels
>
> Unfortunately for driver reason it's not a possibility.

Too bad..

>>> I'm not very familiar with this gitco stuff, what is it exactly ?
>>>
>> gitco.de/repo/ is newer xen hypervisor rpms for rhel5/centos5,
>> but it seems you're using Debian so it doesn't help you
>
> I'll try to give a try to testing version of xen to see if it brings
> anything good.

I don't think it's related to the Xen hypervisor. It's more likely to be a problem in the dom0 Linux kernel.

-- Pasi
Matthieu Patou
2010-Feb-10  12:06 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
Hi James,

>> "Good" server
>> ethtool -i eth1
>> driver: e1000e
>> version: 0.3.3.3-k2
>> firmware-version: 2.1-12
>> bus-info: 0000:04:00.1
>
> Can you do a 'lspci -v' and 'lspci -vn' and post the output of the e1000 adapter. On mine it looks like:
>
> lspci -v
> 08:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)
>         Subsystem: Hewlett-Packard Company HP 110T PCIe Gigabit Server Adapter
>
> lspci -vn
> 08:00.0 0200: 8086:10b9 (rev 06)
>         Subsystem: 103c:704a
>
> I'm pretty sure that some 'e1000' chipsets don't support lso for tagged packets...

lspci -v
04:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)
        Subsystem: Intel Corporation Device 0000
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Memory at d8020000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 3020 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting <?>
        Capabilities: [140] Device Serial Number 87-95-76-ff-ff-81-e0-00
        Kernel driver in use: e1000e
        Kernel modules: e1000e

lspci -vn
04:00.1 0200: 8086:1096 (rev 01)
        Subsystem: 8086:0000
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Memory at d8020000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 3020 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting <?>
        Capabilities: [140] Device Serial Number 87-95-76-ff-ff-81-e0-00
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Matthieu.
James Harper
2010-Feb-10  12:29 UTC
RE: [Xen-users] slow network with gplpv drivers in vlan setup
> 04:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit
> Ethernet Controller (Copper) (rev 01)
>         Subsystem: Intel Corporation Device 0000
>         Kernel driver in use: e1000e
>
> 04:00.1 0200: 8086:1096 (rev 01)
>         Subsystem: 8086:0000

Are the adapters in both systems identical in terms of lspci output?

James
Matthieu Patou
2010-Feb-11  09:35 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
Hi James,

On 10/02/2010 15:29, James Harper wrote:
> Are the adapters in both systems identical in terms of lspci output?

Here is the result on the second server:

04:00.0 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)
        Subsystem: Intel Corporation Device 0000
        Flags: bus master, fast devsel, latency 0, IRQ 18
        Memory at d8000000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 3000 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting <?>
        Capabilities: [140] Device Serial Number 93-95-76-ff-ff-81-e0-00
        Kernel driver in use: e1000e
        Kernel modules: e1000e

04:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)
        Subsystem: Intel Corporation Device 0000
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Memory at d8020000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 3020 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting <?>
        Capabilities: [140] Device Serial Number 93-95-76-ff-ff-81-e0-00
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Looks pretty much the same.

I'm wondering whether you have an idea for fixing the problem on the Dell workstation, or what the ethtool -k/-i output should look like for offloading to have a chance of working.

Matthieu.
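For readers who want to compare `ethtool -k` output between a working and a non-working host, here is a minimal sketch in Python. It only parses text; the sample output is illustrative and was not posted in this thread, and the exact feature names printed by `ethtool -k` vary between ethtool versions. The names `parse_offloads` and `SAMPLE` are hypothetical.

```python
# Illustrative sample only: NOT output posted in this thread.
# Older ethtool releases print feature names with spaces; newer ones use hyphens.
SAMPLE = """\
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
"""


def parse_offloads(text):
    """Parse `ethtool -k <iface>` style output into {feature name: enabled}."""
    features = {}
    for line in text.splitlines():
        name, sep, state = line.partition(":")
        words = state.split()
        # Skip the header line and anything that is not "<name>: on|off".
        if sep and words and words[0] in ("on", "off"):
            features[name.strip()] = words[0] == "on"
    return features


if __name__ == "__main__":
    for feature, enabled in sorted(parse_offloads(SAMPLE).items()):
        print(f"{feature}: {'on' if enabled else 'off'}")
```

Running the same parser over the output from both hosts and comparing the two dictionaries would show which offloads differ, if any.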
Fajar A. Nugraha
2010-Feb-11  09:56 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On Thu, Feb 11, 2010 at 4:35 PM, Matthieu Patou
<mat+Informatique.xen@matws.net> wrote:
> I'm wondering if you have an idea to fix the problem on the dell
> workstation,

Have you tried looking for updated NIC firmware at Dell? I know that they're available for HP and IBM.

-- Fajar
James Harper
2010-Feb-11  12:29 UTC
RE: [Xen-users] slow network with gplpv drivers in vlan setup
> Looks pretty the same.

Yes, identical, just like you said they were :)

> I'm wondering if you have an idea to fix the problem on the dell
> workstation, or maybe how the ethtool -k/-i should looks like in order
> to have chance for offloading to work.

No VLANs on the Dell? Some offload implementations are just broken, either in the firmware or in the driver. Some have limitations that the driver doesn't understand (e.g. no offload when a VLAN is in use). Just turn off the offload.

If there are VLANs on the Dell then get back to me and I might be able to suggest a few things.

James
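The advice above does not say where the offload should be turned off. One possible reading, assumed here and not confirmed in the thread, is to disable the segmentation offloads on the dom0 physical interface with `ethtool -K`. The sketch below only builds and prints the commands rather than running them; the helper name `disable_offload_commands`, the interface name `eth1` (taken from the earlier `ethtool -i eth1` output) and the choice of features are assumptions.

```python
def disable_offload_commands(iface, features=("tso", "gso", "sg")):
    """Build `ethtool -K <iface> <feature> off` command lines (not executed).

    tso = TCP segmentation offload, gso = generic segmentation offload,
    sg = scatter-gather. Which of these actually matter for this problem
    is an assumption, not something established in the thread.
    """
    return [["ethtool", "-K", iface, feature, "off"] for feature in features]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run as root.
    for cmd in disable_offload_commands("eth1"):
        print(" ".join(cmd))
```

Settings changed with `ethtool -K` do not persist across reboots, so a permanent fix would also need to go into the distribution's network configuration.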
Matthieu Patou
2010-Feb-14  07:50 UTC
Re: [Xen-users] slow network with gplpv drivers in vlan setup
On 11/02/2010 15:29, James Harper wrote:
>> Looks pretty the same.
>
> Yes, identical, just like you said they were :)
>
>> I'm wondering if you have an idea to fix the problem on the dell
>> workstation, or maybe how the ethtool -k/-i should looks like in order
>> to have chance for offloading to work.
>
> No VLANs on the Dell? Some offload implementations are just broken, either in the firmware or in the driver. Some have limitations that the driver doesn't understand (e.g. no offload when a VLAN is in use). Just turn off the offload.

I also have VLANs on the Dell (that is how I ran into the problem again and found that turning off scatter/gather helps the transfer speed). The thing is that I get roughly 6-7 MB per second in this case, which is not too bad, but it is well below the capacity of a 1 Gbit adapter.

> If there are VLANs on the Dell then get back to me and I might be able to suggest a few things.

Thanks.

Matthieu.
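To put the figures from this thread in perspective, the following small Python sketch compares the two reported transfer rates with the nominal 1 Gbit/s line rate. It assumes decimal units (1 MB = 10^6 bytes), ignores protocol overhead, and takes 6.5 MB/s as the midpoint of the reported 6-7 MB/s; the variable names are made up for illustration.

```python
# Nominal 1 Gbit/s link expressed in MB/s (decimal units, no protocol overhead).
LINK_MB_PER_S = 1000 / 8


def mb_per_s(megabytes, seconds):
    """Average transfer rate in MB/s."""
    return megabytes / seconds


# ~15 MB copied in ~6 minutes, as reported at the start of the thread.
slow = mb_per_s(15, 6 * 60)
# Midpoint of the 6-7 MB/s reported for the Dell workstation (assumption).
improved = 6.5

if __name__ == "__main__":
    for label, rate in (("original report", slow), ("Dell, 6-7 MB/s case", improved)):
        share = 100 * rate / LINK_MB_PER_S
        print(f"{label}: {rate:.3f} MB/s, {share:.2f}% of nominal line rate")
```

Even the improved rate is only a small percentage of what the link can nominally carry, which matches the remark that it is still well below the adapter's capacity.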