Linus van Geuns
2012-Jun-24 14:34 UTC
Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
Hey,

On Mon, Feb 27, 2012 at 10:26 PM, Linus van Geuns <linus@vangeuns.name> wrote:
> Hey,
>
> I am currently considering a two machine Xen setup based on AMD Opterons,
> more precisely Dell R515s or R415s with dual Opteron 4274HE.
> Does anyone have some Xen setup(s) running on top of Opteron 4274HE, Opteron
> 4200 or Dell R515 machines and is willing to share some experience?

Meanwhile, I got myself a two machine test setup for evaluation.

2 machines, each with:
- 1x Opteron 4274HE (8C)
- 32 GByte RAM
- 2x 1GBit Eth on board
- 1x Intel X520-DA2 (dual port 10GBit Ethernet)

I installed Debian squeeze with Xen on both machines.

"Debian kernel":     2.6.32-5-amd64
"Debian Xen kernel": 2.6.32-5-xen-amd64
"Debian bpo kernel": 3.2.0-0.bpo.2-amd64
"Debian Xen":        4.0.1
"Vanilla Xen":       4.1.2

TCP/IP transfer test:

user@lx$ dd if=/dev/zero bs=1M count=40960 | nc -q 0 -v ly 16777
user@ly$ nc -l -v -p 16777 -q 0 | dd bs=1M of=/dev/null

First, I am testing dom0 10GBit Ethernet performance. This is pretty
important to me for storage replication between cluster nodes.

Between the two boxes with the Debian kernel or the Debian bpo kernel on
bare hardware, I can transfer about 520 up to 550 MByte/s. Between two
dom0 instances, I get only 200 up to 250 MByte/s. I also tried the same
between a dom0 and a plain hardware instance and the other way around.

I tried Debian Xen and vanilla Xen, and also used the Debian Xen kernel
and the Debian bpo kernel as dom0 kernels. I tried setting dom0 memory to
various sizes and iommu=1 as well. Finally, I retried most test scenarios
with irqbalance enabled, as those Intel 10GE cards expose between 8 and 16
queues (IRQs) for TX/RX, depending on the kernel version.

The result is always the same: I am limited to 250 MByte/s max. unless
both boxes run on bare hardware. Even the Debian Xen kernel on bare
hardware on both boxes does about 550 MByte/s. When testing dom0
performance, no other domains (domUs) are running.

So this performance issue seems to relate to the interaction between the
Xen hypervisor and the Linux kernel.

Any ideas on the issue or pointers for further testing?

Regards, Linus
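For reference, a hedged way to cross-check the dd|nc figures above: a single
nc stream can be bound to one CPU in dom0, so comparing it against iperf with
one and several parallel streams, and looking at how the X520's queues are
spread over CPUs, helps narrow things down. This assumes iperf is installed
and that the ixgbe driver exposes per-queue counters; eth2 stands in for the
10GbE interface.

  # How the NIC's RX/TX queue IRQs are spread over CPUs (one IRQ per queue)
  grep eth2 /proc/interrupts

  # Per-queue packet counters, to see whether traffic really uses all queues
  ethtool -S eth2 | grep -E '(rx|tx)_queue_[0-9]+_packets'

  # On the receiver:
  iperf -s

  # On the sender: one stream, then four parallel streams for comparison
  iperf -c ly -t 30
  iperf -c ly -t 30 -P 4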
Florian Heigl
2012-Jun-25 21:42 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
2012/6/24 Linus van Geuns <linus@vangeuns.name>:
> Between two dom0 instances, I get only 200 up to 250 MByte/s.
>
> I also tried the same between a dom0 and a plain hardware instance and

Steps you can try:

- do NOT configure a bridge in dom0, try normal eth0 <-> eth0 comms
  (The Linux bridge is a BRIDGE. That's the thing everyone stopped
  using in 1998.)
- dom0 vcpu pinning
  (because I wonder if the migrations between vcpus make things trip)

Things to keep in mind:
----------------------------------
There was a handful of "new network" implementations to speed up I/O
performance between domUs (e.g. XenLoop, Fido). None of them have gotten
anywhere, although they were, indeed, fast.

Stub I/O domains as a concept were invented to take I/O processing out
of dom0. I have NO IDEA why that would be faster, but someone thought it
makes a difference, otherwise it would not exist.

It is very probable that with a switch to an SR-IOV NIC the whole issue
is gone. Some day I'll afford a SolarFlare 61xx NIC and benchmark it on
my own. The key thing with "vNICs" assigned to the domUs is that you get
rid of the bridging idiocy and have more I/O queues; some NICs will
switch between multiple domUs on the same NIC, and even if they can't,
the 10gig switch next to your dom0 is definitely faster than the
software bridge code.

OpenVSwitch is a nice solution to generally replace the bridge, but I
haven't seen anyone say that it gets anywhere near hardware performance.

Last: I'm not sure you will see the problem solved. I think it has never
gotten a very high priority.

Greetings,
Florian
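For reference, a minimal sketch of the dom0 vcpu pinning suggested above,
assuming the xm toolstack of Xen 4.0/4.1; the number of dom0 vcpus and the
core mapping are placeholders, not a recommendation.

  # Pin at boot via the hypervisor line in GRUB (menu.lst / grub.cfg):
  #   kernel /boot/xen.gz dom0_max_vcpus=4 dom0_vcpus_pin

  # Or pin at runtime, one dom0 vcpu per physical core:
  xm vcpu-pin Domain-0 0 0
  xm vcpu-pin Domain-0 1 1
  xm vcpu-pin Domain-0 2 2
  xm vcpu-pin Domain-0 3 3

  # Verify the pinning took effect
  xm vcpu-list Domain-0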
Linus van Geuns
2012-Jun-29 14:11 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
Hey Florian,

thank you for your input.

On Mon, Jun 25, 2012 at 11:42 PM, Florian Heigl <florian.heigl@gmail.com> wrote:
> 2012/6/24 Linus van Geuns <linus@vangeuns.name>:
>
>> Between two dom0 instances, I get only 200 up to 250 MByte/s.
>>
>> I also tried the same between a dom0 and a plain hardware instance and
>
> Steps you can try:
>
> - do NOT configure a bridge in dom0, try normal eth0 <-> eth0 comms
>   (The Linux bridge is a BRIDGE. That's the thing everyone stopped
>   using in 1998.)

As I am still testing network performance in dom0, I have not yet set up
any virtual networking. All tests were done on the "direct" interfaces
(eth*) in dom0 and on bare hardware, with no bridging or virtual switches
involved. When I run the tests on bare hardware, I get about 550 MByte/s;
within dom0, speed drops to about 250 MByte/s.

> - dom0 vcpu pinning
>   (because I wonder if the migrations between vcpus make things trip)

Already tried that and it had no effect at all. :-/

Any ideas?

> Things to keep in mind:
> ----------------------------------
> There was a handful of "new network" implementations to speed up I/O
> performance between domUs (e.g. XenLoop, Fido). None of them have gotten
> anywhere, although they were, indeed, fast.
>
> Stub I/O domains as a concept were invented to take I/O processing out
> of dom0. I have NO IDEA why that would be faster, but someone thought it
> makes a difference, otherwise it would not exist.

Or that someone did not want to process network traffic within the
privileged domain. ;-)

> It is very probable that with a switch to an SR-IOV NIC the whole issue
> is gone. Some day I'll afford a SolarFlare 61xx NIC and benchmark it on
> my own.

As I am testing dom0 network performance on "direct" interfaces, SR-IOV
should not make a difference. Those X520-DA2 cards support SR-IOV and VMDq.

> The key thing with "vNICs" assigned to the domUs is that you get
> rid of the bridging idiocy and have more I/O queues; some NICs will
> switch between multiple domUs on the same NIC, and even if they can't,
> the 10gig switch next to your dom0 is definitely faster than the
> software bridge code.

Is it possible to live migrate domUs using SR-IOV "vNICs"? Basically, if
Xen migrated the state of that vNIC to the target host, it could work.
(Without knowing any details of SR-IOV at all :-D)

> OpenVSwitch is a nice solution to generally replace the bridge, but I
> haven't seen anyone say that it gets anywhere near hardware performance.
>
> Last: I'm not sure you will see the problem solved. I think it has never
> gotten a very high priority.

First, I would like to identify the problem(s) that limit dom0 10GE speed
on my systems. ;-)

Regards, Linus
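For reference, a hedged checklist of settings worth diffing between the
bare-metal and dom0 boots, since offload and IRQ layout can silently differ
between the two; eth2 is a placeholder for the 10GbE interface, and GRO is
only an example offload.

  ethtool -k eth2        # offloads (TSO/GSO/GRO/LRO) sometimes differ under Xen
  ethtool -g eth2        # RX/TX ring sizes
  ethtool -S eth2 | grep -iE 'drop|miss|err'   # drops/errors at the NIC level
  grep eth2 /proc/interrupts                   # queue-to-CPU IRQ layout

  # Example: if GRO turns out to be off in dom0, re-enable it and re-test
  ethtool -K eth2 gro on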
lists+xen@internecto.net
2012-Jul-01 08:22 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
> - do NOT configure a bridge in dom0, try normal eth0 <-> eth0 comms
>   (The Linux bridge is a BRIDGE. That's the thing everyone stopped
>   using in 1998.)

Can you back this up with evidence? Even if this were true, which is
absolutely not the case, you could try playing around with Open vSwitch.
But bridges not being used anymore? Bogus, IMHO. Bridges slowing traffic
down? Please back this up with facts, too.
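For anyone who does want to put Open vSwitch next to the Linux bridge for a
side-by-side comparison, a minimal sketch, assuming the openvswitch kernel
module and tools are installed; bridge and interface names are placeholders.

  ovs-vsctl add-br xenbr0
  ovs-vsctl add-port xenbr0 eth0
  ovs-vsctl show        # verify the bridge and its port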
Linus van Geuns
2012-Jul-01 09:15 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
Hey "internecto.net", On Sun, Jul 1, 2012 at 10:22 AM, <lists+xen@internecto.net> wrote:> >> - do NOT configure a bridge in dom0, try normal eth0 <-> eth0 comms >> (The linux bridge is a BRIDGE. that''s the things everyone stopped >> using in 1998) > > can you back this up with evidence?I think he meant multiport reapeaters aka hubs. Regards, Linus
Mark van Dijk
2012-Jul-01 10:29 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
> Hey "internecto.net",Urgh, sorry, I use tags for my list subscriptions (e.g. lists+xen, lists+zsh etc.) to ease delivery to imap folders. Unfortunately I can''t set a default From header with each list so have to do that manually.> >> - do NOT configure a bridge in dom0, try normal eth0 <-> eth0 comms > >> (The linux bridge is a BRIDGE. that''s the things everyone stopped > >> using in 1998) > > > > can you back this up with evidence? > > I think he meant multiport reapeaters aka hubs.Yes, now that is something I can agree with although I still don''t really understand why this comment would apply to virtual bridging. Mark
Eric Lindsey
2012-Jul-01 19:19 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
While we're on the subject, I'm still having a massive headache with using
my dom0 as a NAT router: traffic comes in internally on eth0 and goes out to
the outside world on eth1, with iptables in masquerade mode acting as the
firewall, AND eth0 (internal LAN) needs to be successfully bridged for my
VMs run by Xen. I've heard a lot of talk about Open vSwitch on the list, and
I'm sure I can find info about it on the net, but can someone give me some
hard facts about its performance, especially with regard to virtualization
and my particular use case? Thanks in advance!

On Jul 1, 2012, at 6:29 AM, Mark van Dijk <lists+xen@internecto.net> wrote:
>> Hey "internecto.net",
>
> Urgh, sorry, I use tags for my list subscriptions (e.g. lists+xen,
> lists+zsh etc.) to ease delivery to IMAP folders. Unfortunately I can't
> set a default From header for each list, so I have to do that manually.
>
>>>> - do NOT configure a bridge in dom0, try normal eth0 <-> eth0 comms
>>>>   (The Linux bridge is a BRIDGE. That's the thing everyone stopped
>>>>   using in 1998.)
>>>
>>> Can you back this up with evidence?
>>
>> I think he meant multiport repeaters, aka hubs.
>
> Yes, now that is something I can agree with, although I still don't
> really understand why this comment would apply to virtual bridging.
>
> Mark
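For reference, a hedged sketch of the layout described above: eth1 faces the
outside world, eth0 (internal LAN) is enslaved to a bridge that the Xen vifs
attach to, and iptables masquerades outbound traffic. The bridge name and
rule details are placeholders, not a tested configuration.

  # Bridge the internal LAN so domU vifs can be added to it
  # (the dom0 IP for the internal LAN should then live on xenbr0, not eth0)
  brctl addbr xenbr0
  brctl addif xenbr0 eth0
  ip link set xenbr0 up

  # Enable routing and masquerade everything leaving via eth1
  echo 1 > /proc/sys/net/ipv4/ip_forward
  iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
  iptables -A FORWARD -i xenbr0 -o eth1 -j ACCEPT
  iptables -A FORWARD -i eth1 -o xenbr0 -m state --state ESTABLISHED,RELATED -j ACCEPT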
On 01/07/2012 20:19, Eric Lindsey wrote:
> While we're on the subject, I'm still having a massive headache with
> using my dom0 as a NAT router [..] can someone give me some hard facts
> about its performance, especially with regard to virtualization and my
> particular use case? Thanks in advance!

You don't need to use Open vSwitch. You can use just the bridging
capabilities of the Linux kernel (brctl).

Cheers
On 01/07/2012 20:56, Eric Lindsey wrote:
> I've tried, and no matter how I configure it, the DHCP server on dom0
> (dnsmasq listening on br0) does not successfully assign IPs to my
> VMs--even though it does see the DHCPREQUEST and make a DHCPOFFER. And
> it works fine for all the physical machines on the network.

In line with most community mailing lists, it would be very much
appreciated if you didn't top-post.

The standard Linux bridge is a simple piece of software that performs a
simple task: layer 2 transport of network traffic. An "out of the box"
installation of Xen will have enabled Xen in "bridge mode" by default.
Start from there, use an external DHCP server (e.g. a home router) for
testing, and see if your DomUs can pick up IP addresses. If this works,
you've confirmed that bridging is working OK.

If you really want to run a DHCP server on the Xen box, then I highly
recommend installing dhcpd in a separate DomU. That should work with no
problems. Keep the Dom0 clean and free of such unnecessary services.

Cheers
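For the dnsmasq-on-br0 case above, a hedged debugging sketch: watch whether
the DHCPOFFER actually makes it onto the guest's backend vif, and whether
bridge-netfilter is passing bridged frames through iptables and dropping
them there. The vif name varies per guest and is only an example.

  tcpdump -ni br0 port 67 or port 68       # DHCP traffic as seen on the bridge
  tcpdump -ni vif1.0 port 67 or port 68    # and on the guest's backend vif

  # If bridged frames are run through iptables, the FORWARD chain can eat them
  sysctl net.bridge.bridge-nf-call-iptables
  iptables -L FORWARD -n -v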
Joseph Glanville
2012-Jul-02 18:42 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
On 1 July 2012 20:29, Mark van Dijk <lists+xen@internecto.net> wrote:
>> Hey "internecto.net",
>
> Urgh, sorry, I use tags for my list subscriptions (e.g. lists+xen,
> lists+zsh etc.) to ease delivery to IMAP folders. Unfortunately I can't
> set a default From header for each list, so I have to do that manually.
>
>>>> - do NOT configure a bridge in dom0, try normal eth0 <-> eth0 comms
>>>>   (The Linux bridge is a BRIDGE. That's the thing everyone stopped
>>>>   using in 1998.)
>>>
>>> Can you back this up with evidence?
>>
>> I think he meant multiport repeaters, aka hubs.
>
> Yes, now that is something I can agree with, although I still don't
> really understand why this comment would apply to virtual bridging.
>
> Mark

Yeah, I think he doesn't understand that the Linux bridge module is
actually a MAC-learning switch.

To drop in my piece on Open vSwitch:

Performance of Open vSwitch is on par with the Linux bridge module. The
main differences are the flow-based rules, the JSON-RPC API, etc. The
benefits of Open vSwitch are really about programmatically altering
traffic based on these flow rules. If you are a networking nut and have a
good reason to do crazy stuff with L7 inspection, then this is nice.

On the other hand, Open vSwitch doesn't interact with the rest of the
Linux kernel networking ecosystem. For instance, you can't do layer 2
filtering as easily as you can with ebtables (the bridge-netfilter
module). Nor can it integrate as nicely with Linux GRE/IP-IP tunnels etc.
This is changing slowly with Open vSwitch going upstream; we are likely
to see the GRE implementation from OVS merged with the mainline code.

IMO, unless you know what you want OVS for (say, easy VLAN tagging), you
probably don't need it and are better off with the Linux bridge.

In terms of Linux bridge module performance: I can easily do 20-odd
gigabits between VMs on the same host and can push around 13 gigabits
between hosts (40-gig InfiniBand). It's worth noting that with both OVS
and the bridge module you will be limited by PPS (packets per second)
rather than throughput. With very big MTUs (I am using 64k over IB) one
can do many gigabits, but with 1500-byte MTUs throughput drops below
3 gigabits. This is because the maximum throughput of both the Linux
bridge module and OVS on a single core tops out at a few hundred
thousand PPS.

Joseph.

--
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846
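To put numbers on the PPS point above: at a 1500-byte MTU, 3 gigabits/s
already corresponds to roughly 250,000 packets per second, right in the
range where a single core saturates. A rough receive-side sampler follows,
assuming the sysfs statistics files are available; eth2 is a placeholder
for the interface under test.

  #!/bin/sh
  # Sample RX bytes/packets over one second to see whether a transfer is
  # packet-rate-bound rather than byte-rate-bound.
  IF=${1:-eth2}
  B1=$(cat /sys/class/net/$IF/statistics/rx_bytes)
  P1=$(cat /sys/class/net/$IF/statistics/rx_packets)
  sleep 1
  B2=$(cat /sys/class/net/$IF/statistics/rx_bytes)
  P2=$(cat /sys/class/net/$IF/statistics/rx_packets)
  echo "$IF: $(( (B2 - B1) / 1048576 )) MiB/s, $(( P2 - P1 )) packets/s"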
Linus van Geuns
2012-Jul-09 09:50 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
Hey Joseph,

On Mon, Jul 2, 2012 at 8:42 PM, Joseph Glanville
<joseph.glanville@orionvm.com.au> wrote:
> [..]
>
> In terms of Linux bridge module performance: I can easily do 20-odd
> gigabits between VMs on the same host and can push around 13 gigabits
> between hosts (40-gig InfiniBand). It's worth noting that with both OVS
> and the bridge module you will be limited by PPS (packets per second)
> rather than throughput. With very big MTUs (I am using 64k over IB) one
> can do many gigabits, but with 1500-byte MTUs throughput drops below
> 3 gigabits. This is because the maximum throughput of both the Linux
> bridge module and OVS on a single core tops out at a few hundred
> thousand PPS.

What version of Xen and which distros are you using for dom0/domU?
And what hardware are you using?
Did you measure plain dom0 to dom0 performance over your InfiniBand
connection(s)?

I am still digging for the limiting bottleneck/issue with dom0 to dom0
performance over 10GBit Eth on my boxes.

Regards, Linus
Joseph Glanville
2012-Jul-15 19:18 UTC
Re: Xen 10GBit Ethernet network performance (was: Re: Experience with Xen & AMD Opteron 4200 series?)
On 9 July 2012 19:50, Linus van Geuns <linus@vangeuns.name> wrote:
> Hey Joseph,
>
> On Mon, Jul 2, 2012 at 8:42 PM, Joseph Glanville
> <joseph.glanville@orionvm.com.au> wrote:
>>
>> [..]
>>
>> In terms of Linux bridge module performance: I can easily do 20-odd
>> gigabits between VMs on the same host and can push around 13 gigabits
>> between hosts (40-gig InfiniBand). It's worth noting that with both OVS
>> and the bridge module you will be limited by PPS (packets per second)
>> rather than throughput. With very big MTUs (I am using 64k over IB) one
>> can do many gigabits, but with 1500-byte MTUs throughput drops below
>> 3 gigabits. This is because the maximum throughput of both the Linux
>> bridge module and OVS on a single core tops out at a few hundred
>> thousand PPS.
>
> What version of Xen and which distros are you using for dom0/domU?
> And what hardware are you using?
> Did you measure plain dom0 to dom0 performance over your InfiniBand
> connection(s)?

Gentoo running Linux 3.2 (pretty heavily patched, though, especially to
drive up IPoIB performance).

Yes, we do a ton of stuff between our dom0s. Performance is not
noticeably different in terms of throughput, but VMs show a very
measurable drop in PPS. It would not be uncommon for us to easily get
between 11 and 12 gigabits between dom0s using big MTUs, as noted above.
A 2044-byte MTU is a good balance: it still works with everything, does
not break multicast, and gives great throughput.

> I am still digging for the limiting bottleneck/issue with dom0 to dom0
> performance over 10GBit Eth on my boxes.
>
> Regards, Linus

--
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846