Hello,

Maybe someone could direct us to the right solution for the problem described below. We have tried various tricks, but no success stories.

We have a development cluster based on Dell R815 servers (about 30 of them) and a storage server, a NetApp FAS 3240 running Data ONTAP 8.1P2. We run some 100 virtual servers and use Xen 4.1 for virtualisation. The base OS is Debian Squeeze with some backports packages (newer kernel, newer Xen, newer nfs-common, etc.). One of the virtual servers is a backup server which mounts a big NFS share from the NetApp. It copies changed data onto the NetApp share (we also use NetApp's excellent snapshot technology to keep older versions of the data). Our network is mixed 1GBit/10GBit Ethernet with a Juniper switch as the central point of the network. Internally we use the IPv6 protocol.

The problem is the following:

THE READ PERFORMANCE OF THE BACKUP SERVER IS DRAMATICALLY LOW: A TYPICAL SEQUENTIAL READ FROM THE NETAPP SHARE FALLS TO 5-10 MBytes/s! On the other hand, WRITE PERFORMANCE IS OK, i.e. 80-100 MBytes/s.

During tests on our testbed system we could read about 250-280 MBytes/s from our NetApp storage (using the 10GBit network). The backup server is connected over a 1GBit/s network, so we expected roughly 100 MBytes/s.

We tried to find the reason for such slow performance, and it looks like it is somewhere between the Xen bridge and the Xen domU, and a KEY FACTOR IS IPv6. As soon as we set up an IPv4 connection between the NetApp and the client, regardless of whether the client is bare metal, dom0 or domU, all performance (READ/WRITE) is fine.

We made the following tests:

1) Netapp_IPv6 - bare_metal_server_IPv6 - ALL FINE (full network speed)
2) Netapp_IPv6 - Xen_dom0_IPv6_no_bridge - ALL FINE
3) Netapp_IPv6 - Xen_dom0_IPv6_via_bridge - ALL FINE
4) Netapp_IPv6 - Xen_domU_IPv6(via_bridge) - WRITE OK, READ <10 MBytes/s!
5) Netapp_IPv6 - Xen_domU_IPv6(via_bridge, eth interface set to promiscuous mode) - WRITE OK, READ 20-25 MBytes/s!
6) Linux_server_IPv6 (regardless of bare metal, dom0 or domU) - Xen_domU_IPv6(via_bridge) - ALL FINE

All mounts use NFS version 3 and proto tcp6. We checked MTUs - 1500 everywhere. No errors on any side (switch, NetApp, Linux clients). IPv4 connections perform just fine. ssh/scp and other IPv6 services perform OK.

Any hints? We are thinking of throwing Xen away :-/

Kind regards,
Grzegorz
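PS: For reference, the mount and the sequential read test look roughly like this (the share path, IPv6 address, mount point and rsize/wsize values are anonymised examples, not necessarily our real settings):

  mount -t nfs -o vers=3,proto=tcp6,rsize=65536,wsize=65536 \
      '[fd00:1::a]:/vol/backup' /mnt/backup
  # sequential read of one large file; dd reports the throughput
  dd if=/mnt/backup/some_big_file of=/dev/null bs=1M count=4096

Run on the domU this stays in the 5-10 MBytes/s range; the same commands on dom0 or on bare metal saturate the link.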
On Mon, Nov 19, 2012 at 12:08 AM, <G.Bakalarski@icm.edu.pl> wrote:
> The problem is the following:
>
> THE READ PERFORMANCE OF THE BACKUP SERVER IS DRAMATICALLY LOW: A TYPICAL
> SEQUENTIAL READ FROM THE NETAPP SHARE FALLS TO 5-10 MBytes/s!
> On the other hand, WRITE PERFORMANCE IS OK, i.e. 80-100 MBytes/s.

Writing in all capitals won't get you a faster response, you know. It'd only get you marked as rude or annoying: http://en.wikipedia.org/wiki/All_caps#Computing

> We made the following tests:
>
> 1) Netapp_IPv6 - bare_metal_server_IPv6 - ALL FINE (full network speed)

Is this the same server? If not, please retest using the same server (or at least the same NIC).

> 6) Linux_server_IPv6 (regardless of bare metal, dom0 or domU) -
>    Xen_domU_IPv6(via_bridge) - ALL FINE

So mounting a normal Linux NFS share is fine, but mounting the NetApp share sucks?

> Any hints?

Since mounting a Linux NFS share works fine, you should check with NetApp, just in case they have some experience or a bug entry on that. Other than that, some network-related problems usually go away if you turn off checksumming on ALL the interfaces involved (dom0's NIC, the bridge, and the virtual interfaces).

> We think of throwing Xen away :-/

Perhaps you should. In the end, use whatever works. Xen works fine for other people, although perhaps that's because they don't use the same combination as you. Had you used a vendor-supported Xen solution, this would be a good time to file a bug report. Since you don't, your generic choice is to pick one of:

- be patient, just in case someone has had a similar problem and solved it already. Shouting or writing in all caps won't help you here.
- try the latest versions of Xen and the kernel from git, just in case the problem is already fixed upstream.
- try some generic workarounds (e.g. disabling checksum offload).
- throw away one of the things that cause the problem (in your case, either Xen or NetApp) and stick with working solutions.
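For concreteness, "turn off checksumming on all the interfaces involved" could look something like this from dom0 (interface names are only examples; substitute your real NIC, bridge and vif names):

  # the dom0 physical NIC
  ethtool -K eth0 tx off rx off
  # the Xen bridge
  ethtool -K xenbr0 tx off
  # the guest's virtual interface(s), one per domU vif
  ethtool -K vif1.0 tx off

The same can be tried on eth0 inside the domU, if the guest kernel exposes those offloads.

--
Fajar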
On Sun, Nov 18, 2012 at 06:08:03PM +0100, G.Bakalarski@icm.edu.pl wrote:
> We have some 100 virtual servers - we use Xen 4.1 for virtualisation.
> The base OS is Debian Squeeze with some backports packages (newer kernel,
> newer Xen, newer nfs-common, etc.)

This combination is not supported.

> During tests on our testbed system we could read about 250-280 MBytes/s
> from our NetApp storage (using the 10GBit network). The backup server is
> connected over a 1GBit/s network, so we expected roughly 100 MBytes/s.

So you have one working and one not really working setup? How do they differ?

> We tried to find the reason for such slow performance, and it looks like
> it is somewhere between the Xen bridge and the Xen domU, and a KEY FACTOR IS IPv6.

Below you also show that another key factor is NetApp vs. Linux.

> All mounts use NFS version 3 and proto tcp6.

Please show the mount options from /proc/mounts. Also, you want to use NFSv4.

> We checked MTUs - 1500 everywhere. No errors on any side (switch, NetApp,
> Linux clients). IPv4 connections perform just fine.

1500 is too low. You want to use 9000 for anything with a lot of traffic.

> ssh/scp and other IPv6 services perform OK.

ssh between the NetApp and the server? Otherwise this is no real test.

> Any hints?

You want to use Open vSwitch for stuff that produces a lot of traffic.
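A minimal sketch of both checks, assuming the storage traffic goes through eth0 and xenbr0 on dom0 (names are examples), and noting that jumbo frames only help if the switch ports and the filer interface are raised to 9000 as well:

  # show the options each NFS mount actually uses
  grep nfs /proc/mounts
  # raise the MTU on the dom0 NIC carrying the storage traffic
  ip link set dev eth0 mtu 9000
  # the bridge and the domU's interface need the same MTU
  ip link set dev xenbr0 mtu 9000

Bastian

--
Killing is wrong.
		-- Losira, "That Which Survives", stardate unknown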
Dear Xen'ers,

Still no improvements with this issue. Thanks to all who tried to help, and to all who sent trash rebukes.

After some testing the problem has been redefined a little. Currently it is not an NFS issue but a network issue (tcp/udp).

So the status is the following: when we have *all* three in action, i.e.

1) Xen domU
2) IPv6 protocol
3) NetApp file server

then we have very poor transfer rates, e.g.:

NFS - 5-8 MBytes/s
FTP - 11 MBytes/s
HTTP - 3-4 MBytes/s

(looks like 10MBit speed :-( )

If any one of the three elements is missing, we get the full 1000Mbit/s speed.

I talked to NetApp support and they suggested playing with the following TCP options:

options ip.tcp.newreno.enable
options ip.tcp.rfc3390.enable
options ip.tcp.sack.enable

But setting them on/off did not help much (and sometimes made performance worse).

My question is whether anyone knows about network issues between a Xen domU and FreeBSD machines (the NetApp file server is FreeBSD based). Or what should I look for (options/settings) in Xen, the Xen network interfaces, or the TCP stack to see what's going on?

Kind regards,

GB
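PS: For completeness, this is roughly how we toggled those options on the filer console (7-mode "options" syntax; the on/off values below are just one of the combinations we tried, not a recommendation):

  options ip.tcp.sack.enable on
  options ip.tcp.newreno.enable off
  options ip.tcp.rfc3390.enable on

No combination made a noticeable difference for the domU case.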
G.Bakalarski@icm.edu.pl
2012-Dec-13 12:43 UTC
Follow-up: Xen + IPv6 + Netapp = NFS read problem
Dear Xen'ers,

Some follow-up on this topic. No success yet :/

It seems to be related to Xen virtual network performance. And (as of now) the main reason for the slow transfers is probably fragmented datagrams at the IPv6 level (not TCP6 but IPv6!).

In our environment the NetApp filer sends many fragmented IPv6 packets. When such packets arrive at a bare metal or dom0 system, they are reassembled in time (at least at 1GBit/s speed) by such a "physical" machine. When the receiver is a domU with a virtual (bridged) Xen interface, it is too slow to reassemble the IPv6 packets in time, so the transfer slows down.

When the sender is a Linux machine, no IPv6 packets are fragmented. IPv4 packets are NOT fragmented either! TOE does not change anything. We DID set 1500 MTU on all network devices (server, NetApp filer, switches).

So maybe someone knows how to force the NetApp filer not to fragment IPv6 packets? Or how to improve Xen network performance (but the first method would be most welcome)?

Best regards,

Grzegorz
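PS: This is how we are counting the fragments from dom0 (interface names are only examples; the filter matches IPv6 packets whose first extension header is a fragment header, which is the common case):

  # fragmented IPv6 packets arriving on the storage-facing dom0 NIC
  tcpdump -ni eth0 'ip6[6] == 44'
  # for comparison, the same traffic as seen on the guest's vif
  tcpdump -ni vif1.0 'ip6[6] == 44'

We can also toggle receive offloads (ethtool -K eth0 gro off lro off) on the dom0 NIC if anyone thinks GRO/LRO could be involved here.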
On Thu, Dec 13, 2012 at 01:43:55PM +0100, G.Bakalarski@icm.edu.pl wrote:
> When the sender is a Linux machine, no IPv6 packets are fragmented.

Only datagram protocols (UDP!) need fragmenting.

> We DID set 1500 MTU on all network devices (server, NetApp filer, switches).

Don't do that. Use 9000 on networks with filers. The MTU only defines the maximum Ethernet packet size, _not_ the maximum UDP packet size.

> So maybe someone knows how to force the NetApp filer not to fragment IPv6
> packets?

Use TCP?
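A quick way to check on the client which transport each NFS mount actually negotiated, in case any of them silently ended up on UDP (a generic sketch, not specific to this setup):

  # per-mount options as the kernel sees them
  nfsstat -m
  # or just the proto= fields
  grep ' nfs ' /proc/mounts | grep -o 'proto=[a-z0-9]*'

Bastian

--
Spock: We suffered 23 casualties in that attack, Captain.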