Hello,

Maybe someone could point us to the right solution for the problem described below. We have tried various tricks, but with no success so far.

We have a development cluster based on roughly 30 Dell R815 servers and a storage server, a NetApp FAS3240 running Data ONTAP 8.1P2. We run some 100 virtual servers, using Xen 4.1 for virtualisation. The base OS is Debian Squeeze with some backports packages (newer kernel, newer Xen, newer nfs-common, etc.). One of the virtual servers is a backup server which mounts a big NFS share from the NetApp. It copies changed data onto the NetApp share (we also use NetApp's excellent snapshot technology to keep older versions of the data). Our network is mixed 1GBit/10GBit Ethernet with a Juniper switch as the central point of the network. Internally we use the IPv6 protocol.

The problem is the following:

THE READ PERFORMANCE OF THE BACKUP SERVER IS DRAMATICALLY LOW: A TYPICAL SEQUENTIAL READ FROM THE NETAPP SHARE FALLS TO 5-10 MBytes/s !!!

On the other hand, WRITE PERFORMANCE IS OK, i.e. 80-100 MBytes/s.

During tests on our testbed system we could read about 250-280 MBytes/s from our NetApp storage (using the 10GBit network). The backup server is connected via a 1GBit/s network, so we expected around 100 MBytes/s.

We tried to find the reason for such slow performance, and it looks like it is somewhere between the Xen bridge and the Xen domU, and a KEY FACTOR IS IPv6. As soon as we set up an IPv4 connection between the NetApp and the client - regardless of bare metal, dom0 or domU - all performance (READ/WRITE) is fine.

We made the following tests:

1) Netapp_IPv6 - bare_metal_server_IPv6 - ALL FINE (full network speed)
2) Netapp_IPv6 - Xen_dom0_IPv6_no_bridge - ALL FINE
3) Netapp_IPv6 - Xen_dom0_IPv6_via_bridge - ALL FINE
4) Netapp_IPv6 - Xen_domU_IPv6 (via bridge) - WRITE OK, READ <10 MBytes/s !!!
5) Netapp_IPv6 - Xen_domU_IPv6 (via bridge, with the eth interface set into promiscuous mode) - WRITE OK, READ 20-25 MBytes/s !!!
6) Linux_server_IPv6 (regardless of bare metal, dom0 or domU) - Xen_domU_IPv6 (via bridge) - ALL FINE

All mounts use NFS version 3 and proto tcp6. We checked MTUs - 1500 everywhere. No errors on any side (switch, NetApp, Linux clients). IPv4 connections perform just fine. ssh/scp and other IPv6 services perform OK.

Any hints? We are thinking of throwing Xen away :-/

Kind regards,
Grzegorz
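PS. For completeness, a minimal sketch of the kind of mount the backup domU uses (the filer address, export path and mount point below are just placeholders, not our real ones):

    # NFSv3 over TCP on IPv6, as described above; rsize/wsize left at the defaults
    mount -t nfs -o vers=3,proto=tcp6 \
        '[2001:db8::10]:/vol/backup' /mnt/netapp

    # check what the kernel actually negotiated
    grep /mnt/netapp /proc/mounts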
On Mon, Nov 19, 2012 at 12:08 AM, <G.Bakalarski@icm.edu.pl> wrote:
> The problem is the following:
>
> THE READ PERFORMANCE OF THE BACKUP SERVER IS DRAMATICALLY LOW: A TYPICAL
> SEQUENTIAL READ FROM THE NETAPP SHARE FALLS TO 5-10 MBytes/s !!!
>
> On the other hand, WRITE PERFORMANCE IS OK, i.e. 80-100 MBytes/s.

Writing in all capitals won't get you a faster response, you know. It will only get you marked as rude or annoying:
http://en.wikipedia.org/wiki/All_caps#Computing

> We made the following tests:
>
> 1) Netapp_IPv6 - bare_metal_server_IPv6 - ALL FINE (full network speed)

Is this the same server? If not, please retest using the same server (or at least the same NIC).

> 6) Linux_server_IPv6 (regardless of bare metal, dom0 or domU) -
>    Xen_domU_IPv6 (via bridge) - ALL FINE

So mounting a normal Linux NFS share is fine, but mounting the NetApp share is slow?

> Any hints?

Since mounting a Linux NFS share works fine, you should check with NetApp in case they have some experience or a bug entry on this. Other than that, some network-related problems go away if you turn off checksum offload on ALL the interfaces involved (dom0's NIC, the bridge, and the virtual interfaces).

> We are thinking of throwing Xen away :-/

Perhaps you should. In the end, use whatever works. It works fine for other people, although perhaps that is because they don't use the same combination as you do.

Had you used a vendor-supported Xen solution, this might have been a good time to file a bug report. Since you don't, your generic choices are:
- be patient, in case someone has had a similar problem and solved it already. Shouting or writing in all caps won't help you here.
- try the latest versions of Xen and the kernel from git, in case the problem is already fixed upstream.
- try some generic workarounds (e.g. disabling checksum offload).
- throw away one of the things that causes the problem (in your case, either Xen or the NetApp) and stick with working solutions.

--
Fajar
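PS. Disabling the checksum/segmentation offloads usually looks roughly like this (the interface and vif names here are just examples; adjust them to your bridge setup, and note that some settings may not apply to every device - ethtool will say so):

    # in dom0: physical NIC, the bridge, and the domU's backend vif
    ethtool -K eth0   tx off rx off tso off gso off gro off
    ethtool -K xenbr0 tx off rx off tso off gso off gro off
    ethtool -K vif1.0 tx off rx off tso off gso off gro off

    # and inside the guest as well
    ethtool -K eth0 tx off rx off tso off gso off gro off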
On Sun, Nov 18, 2012 at 06:08:03PM +0100, G.Bakalarski@icm.edu.pl wrote:
> We run some 100 virtual servers, using Xen 4.1 for virtualisation.
> The base OS is Debian Squeeze with some backports packages (newer kernel,
> newer Xen, newer nfs-common, etc.).

This combination is not supported.

> During tests on our testbed system we could read about 250-280 MBytes/s
> from our NetApp storage (using the 10GBit network). The backup server is
> connected via a 1GBit/s network, so we expected around 100 MBytes/s.

So you have one working and one not really working setup? How do they differ?

> We tried to find the reason for such slow performance, and it looks like
> it is somewhere between the Xen bridge and the Xen domU, and a KEY FACTOR
> IS IPv6.

Below you also show that another key factor is NetApp vs. Linux.

> All mounts use NFS version 3 and proto tcp6.

Please show the mount options from /proc/mounts. Also, you want to use NFSv4.

> We checked MTUs - 1500 everywhere. No errors on any side (switch, NetApp,
> Linux clients).

1500 is too low. You want to use 9000 for anything with a lot of traffic.

> ssh/scp and other IPv6 services perform OK.

ssh between the NetApp and the server? Otherwise this is not a real test.

> Any hints?

You want to use Open vSwitch for stuff that produces a lot of traffic.

Bastian

--
Killing is wrong.
		-- Losira, "That Which Survives", stardate unknown
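PS. To be concrete (the address and export path are examples, and NFSv4 of course only works if the filer exports it that way):

    # show what the client actually negotiated
    grep nfs /proc/mounts

    # NFSv4 over IPv6 instead of the v3 mount
    mount -t nfs4 -o proto=tcp6 '[2001:db8::10]:/vol/backup' /mnt/netapp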
Dear Xen'ers,

Still no improvement with this issue. Thanks to all who tried to help, and also to all who sent trash rebukes.

After some testing, the problem has been redefined a little. It is currently not an NFS issue but a network issue (TCP/UDP).

The status is the following: when we have *all* three in action, i.e.

1) Xen domU
2) IPv6 protocol
3) NetApp file server

then we get very poor transfer rates, e.g.

NFS  - 5-8 MBytes/s
FTP  - 11 MBytes/s
HTTP - 3-4 MBytes/s

(looks like 10MBit speed :-( )

If any one of the three elements is missing, we get the full 1000Mbit/s speed.

I talked to NetApp support and they suggested playing with the following TCP options:

options ip.tcp.newreno.enable
options ip.tcp.rfc3390.enable
options ip.tcp.sack.enable

But setting them on or off did not help much (and sometimes made performance worse).

My question is whether anyone knows about network issues between a Xen domU and FreeBSD machines (the NetApp file server is FreeBSD based). Or what should I look for (options/settings) in Xen, the Xen network interfaces, or the TCP stack to see what's going on?

Kind regards,

GB
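PS. In case it helps anyone reproduce this: a capture along these lines (not exactly what we ran, just the general idea; interface names and the filer address are placeholders) should show the difference between what the dom0 bridge sees and what the domU sees during the same transfer:

    # on dom0, watching the bridge
    tcpdump -ni xenbr0 -s 128 -w dom0-side.pcap host 2001:db8::10

    # inside the domU, during the same transfer
    tcpdump -ni eth0 -s 128 -w domU-side.pcap host 2001:db8::10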
G.Bakalarski@icm.edu.pl
2012-Dec-13 12:43 UTC
follow UP: Xen + IPv6 + Netapp = NFS read problem
Dear Xen'ers,

Some follow-up on this topic. No success yet :/

It seems to be related to Xen virtual network performance. And (as of now) the main reason for the slow transfer is probably fragmented datagrams at the IPv6 level (not TCP6 but IPv6!!!). In our environment the NetApp filer sends many fragmented IPv6 frames. When such frames arrive at a bare metal or dom0 system, they are reassembled in time (at least at 1GBit/s speed) by such a "physical" machine. When the receiver is a domU with a virtual (bridged) Xen interface, it is too slow to reassemble the IPv6 frames in time, so the transfer slows down.

When the sender is a Linux machine, no IPv6 packets are fragmented ... IPv4 packets are NOT fragmented!!! TOE does not change anything ... We DID set a 1500 MTU on all network devices (server, NetApp filer, switches) ...

So maybe anyone knows how to force the NetApp filer not to fragment IPv6 packets? Or how to improve Xen network performance (but the first method would be most welcome)???

Best regards,

Grzegorz

> Still no improvement with this issue.
> Thanks to all who tried to help, and also to all who sent trash rebukes.
>
> After some testing, the problem has been redefined a little.
> It is currently not an NFS issue but a network issue (TCP/UDP).
>
> The status is the following: when we have *all* three in action, i.e.
>
> 1) Xen domU
> 2) IPv6 protocol
> 3) NetApp file server
>
> then we get very poor transfer rates, e.g.
>
> NFS  - 5-8 MBytes/s
> FTP  - 11 MBytes/s
> HTTP - 3-4 MBytes/s
>
> (looks like 10MBit speed :-( )
>
> If any one of the three elements is missing, we get the full 1000Mbit/s speed.
>
> I talked to NetApp support and they suggested playing with the following
> TCP options:
> options ip.tcp.newreno.enable
> options ip.tcp.rfc3390.enable
> options ip.tcp.sack.enable
>
> But setting them on or off did not help much (and sometimes made performance worse).
>
> My question is whether anyone knows about network issues between a Xen domU
> and FreeBSD machines (the NetApp file server is FreeBSD based).
> Or what should I look for (options/settings) in Xen, the Xen network interfaces,
> or the TCP stack to see what's going on?
>
> Kind regards,
>
> GB
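PS. For anyone checking for the same symptom: the IPv6 fragments can be spotted with a capture filter along these lines, and the domU-side reassembly knobs below are listed as things to look at, not as a confirmed fix (the interface name is a placeholder):

    # show only IPv6 packets that carry a fragment header (protocol 44)
    tcpdump -ni eth0 'ip6 protochain 44'

    # IPv6 reassembly limits on the receiving domU (defaults vary by kernel)
    sysctl net.ipv6.ip6frag_high_thresh
    sysctl net.ipv6.ip6frag_low_thresh
    sysctl net.ipv6.ip6frag_time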
On Thu, Dec 13, 2012 at 01:43:55PM +0100, G.Bakalarski@icm.edu.pl wrote:
> When the sender is a Linux machine, no IPv6 packets are fragmented ...

Only datagram protocols (UDP!) need fragmenting.

> We DID set a 1500 MTU on all network devices (server, NetApp filer, switches) ...

Don't do that. Use 9000 on networks with filers. The MTU only defines the maximum Ethernet packet size, _not_ the maximum UDP packet size.

> So maybe anyone knows how to force the NetApp filer not to fragment IPv6
> packets?

Use TCP?

Bastian

--
Spock: We suffered 23 casualties in that attack, Captain.
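PS. If you do go to jumbo frames, every hop on the path has to match - roughly like this on the Linux side (interface and vif names are examples), plus the matching setting on the filer interface and the switch ports:

    # dom0: physical NIC, bridge, and the domU's backend vif
    ip link set dev eth0   mtu 9000
    ip link set dev xenbr0 mtu 9000
    ip link set dev vif3.0 mtu 9000

    # inside the domU
    ip link set dev eth0 mtu 9000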