thr3ads.net - Xen users - [Xen-users] NFS problems in guest [Apr 2006]

Itai Tavor

2006-Apr-28 06:16 UTC

[Xen-users] NFS problems in guest

Hi,

I'm running Xen 3 (xen-unstable tarball from 4/27) with kernel 2.6.16
and Debian Sarge in all domains. Most things seem to work very well. I
had the soft lockups problem, but the Apr 6 change seems to have fixed it.

One problem I can't get over is freezes when accessing NFS shares
exported from a guest domain. The problem occurs inconsistently: in
one case I can access a share with no problem when mounting it in
domain 0 but not in any guest domains; in another case even domain 0
can't successfully access the share.

The symptom is that I can view the directory listing at the top level  
of the share, but if I try to enter any directory, or do any deep  
operation on the share (like du) I get:
     ls: reading directory /mnt/test: Input/output error
     nfs: server NAS not responding, still trying

followed by a very long freeze... the NFS server doesn't show any
problems.

Any ideas? TIA, Itai

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users

Itai Tavor

2006-Apr-29 07:27 UTC

Re: [Xen-users] NFS problems in guest

On 29/04/2006, at 2:05 AM, Birger Brunswiek wrote:
> Itai Tavor wrote:
>> The symptom is that I can view the directory listing at the top level of
>> the share, but if I try to enter any directory, or do any deep operation
>> on the share (like du) I get:
>>     ls: reading directory /mnt/test: Input/output error
>>     nfs: server NAS not responding, still trying
>>
>> followed by a very long freeze... the NFS server doesn't show any
>> problems.
>
> Can you "ping -s 1500 domU's-ip" from dom0? If not it's probably the same
> problem as I have, otherwise never mind.

Interesting... I can't ping any domain on this machine, including
dom0, with a 1500 packet size - any "-s 1500" ping originating from
or going to a Xen domain fails.

I can see from your 25/4 post that you analyzed the problem already
so I won't bother doing that... definitely something's wrong there.
But is it related to NFS?

Itai

Birger Brunswiek

2006-Apr-29 20:03 UTC

Re: [Xen-users] NFS problems in guest

Itai Tavor wrote:
> On 29/04/2006, at 2:05 AM, Birger Brunswiek wrote:
>
>> Itai Tavor wrote:
>>> The symptom is that I can view the directory listing at the top level of
>>> the share, but if I try to enter any directory, or do any deep operation
>>> on the share (like du) I get:
>>>     ls: reading directory /mnt/test: Input/output error
>>>     nfs: server NAS not responding, still trying
>>>
>>> followed by a very long freeze... the NFS server doesn't show any
>>> problems.
>>
>> Can you "ping -s 1500 domU's-ip" from dom0? If not it's probably the same
>> problem as I have, otherwise never mind.
>
> Interesting... I can't ping any domain on this machine, including dom0,
> with a 1500 packet size - any "-s 1500" ping originating from or going
> to a Xen domain fails.
>
> I can see from your 25/4 post that you analyzed the problem already so I
> won't bother doing that... definitely something's wrong there. But is it
> related to NFS?

Not at all. It is a fragmentation problem. Packets are defragmented by
ip_conntrack at the bridge but are then not refragmented when passed on. If I
remove that module on my machine everything is just fine.
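
To see whether the module is loaded in a given domain before and after removing it, one option is to look for its name in /proc/modules. The following is only a minimal sketch: the helper names are made up for illustration, and it assumes the usual /proc/modules layout in which the module name is the first whitespace-separated field on each line.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first field on each line is the module name.
            names.add(fields[0])
    return names


def conntrack_loaded(proc_modules_text, module="ip_conntrack"):
    """True if the named module appears in the given text."""
    return module in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        # Not a Linux box, or /proc is not mounted: treat as "nothing loaded".
        text = ""
    print("ip_conntrack loaded:", conntrack_loaded(text))
```

Running it in dom0 and in each guest shows which domains still have connection tracking active.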



Itai Tavor

2006-Apr-30 06:34 UTC

Re: [Xen-users] NFS problems in guest

On 30/04/2006, at 6:03 AM, Birger Brunswiek wrote:
> Itai Tavor wrote:
>> On 29/04/2006, at 2:05 AM, Birger Brunswiek wrote:
>>
>>> Itai Tavor wrote:
>>>> The symptom is that I can view the directory listing at the top level of
>>>> the share, but if I try to enter any directory, or do any deep operation
>>>> on the share (like du) I get:
>>>>     ls: reading directory /mnt/test: Input/output error
>>>>     nfs: server NAS not responding, still trying
>>>>
>>>> followed by a very long freeze... the NFS server doesn't show any
>>>> problems.
>>>
>>> Can you "ping -s 1500 domU's-ip" from dom0? If not it's probably the same
>>> problem as I have, otherwise never mind.
>>
>> Interesting... I can't ping any domain on this machine, including dom0,
>> with a 1500 packet size - any "-s 1500" ping originating from or going
>> to a Xen domain fails.
>>
>> I can see from your 25/4 post that you analyzed the problem already so I
>> won't bother doing that... definitely something's wrong there. But is it
>> related to NFS?
>
> Not at all. It is a fragmentation problem. Packets are defragmented by
> ip_conntrack at the bridge but are then not refragmented when passed on. If I
> remove that module on my machine everything is just fine.

Haha, hoho! Removing ip_conntrack eliminates the NFS problems, and
great happiness ensues.

Some people might wonder why they can't use ip_conntrack and have
reliable networking at the same time. Not me, though.

Thanks, Birger.

Javier Guerra

2006-Apr-30 12:40 UTC

Re: [Xen-users] NFS problems in guest

On Sunday 30 April 2006 1:34 am, Itai Tavor wrote:
> Some people might wonder why they can't use ip_conntrack and have
> reliable networking at the same time. Not me, though.

but i do wonder... is it that the original packets were bigger than the usual
1500 bytes? if so, why?? what is the MTU at both ends? (NFS server and client)

-- 
Javier


Birger Brunswiek

2006-Apr-30 18:18 UTC

Re: [Xen-users] NFS problems in guest

Javier Guerra wrote:
> On Sunday 30 April 2006 1:34 am, Itai Tavor wrote:
>> Some people might wonder why they can't use ip_conntrack and have
>> reliable networking at the same time. Not me, though.
>
> but i do wonder... is it that the original packets were bigger than the usual
> 1500 bytes? if so, why?? what is the MTU at both ends? (NFS server and client)

On my machine I have MTUs of 1500 on all ethernet devices. 'ping -s 1500
somewhere' creates a packet that's just larger than 1500 bytes and it is
therefore fragmented before it is sent. The fragments go from eth0 to
vif0.0 and then to xen-br0. _Without_ ip_conntrack I see the packet fragmented
on eth0, vif0.0 and xen-br0. _With_ ip_conntrack loaded I see the packet
fragmented at eth0 and vif0.0 but _not_ _on_ xen-br0. At xen-br0 the fragments
have been reassembled and the resulting packet is larger than 1500 bytes (1500
bytes from fragment 1 + a few bytes, but less than 1500, from fragment 2). Because
it is larger than the MTU of all participating devices in the bridge (1500
bytes) and the bridge is not supposed to do fragmentation, the packet is simply
dropped.
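
The sizes involved can be worked out with a little arithmetic. The sketch below is illustrative only: it assumes a 20-byte IPv4 header without options and an 8-byte ICMP echo header, and the function name is made up for the example.

```python
IP_HEADER = 20    # bytes; IPv4 header without options (assumption)
ICMP_HEADER = 8   # bytes; ICMP echo header


def fragment_sizes(ip_payload, mtu=1500):
    """Return the total size (header + data) of each IPv4 fragment."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    remaining = ip_payload
    while remaining > 0:
        data = min(remaining, max_data)
        sizes.append(data + IP_HEADER)
        remaining -= data
    return sizes


ping_payload = 1500                          # the value given to "ping -s"
ip_payload = ping_payload + ICMP_HEADER      # 1508 bytes of ICMP inside IP
fragments = fragment_sizes(ip_payload)       # sizes seen on eth0 and vif0.0
reassembled = ip_payload + IP_HEADER         # size after reassembly

print(fragments, reassembled)   # [1500, 48] 1528
```

The two fragments fit a 1500-byte MTU, but the reassembled 1528-byte packet does not, which matches the drop seen on xen-br0.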

I'm not sure why ip_conntrack defragments but does not refragment the packets it
receives. Maybe it's not even supposed to refragment them and assumes the
network device will ... but a bridge does not, as it works on the ethernet layer.

Birger Brunswiek

2006-Apr-30 18:26 UTC

Re: [Xen-users] NFS problems in guest

Itai Tavor wrote:
> I'm running Xen 3 (xen-unstable tarball from 4/27) with kernel 2.6.16
> and Debian Sarge in all domains. Most things seem to work very well, I
> had the soft lockups problem but Apr 6 change seems to have fixed it.

I was just wondering where you get your tarball from? Is it the "Pre-built
installations of Xen 3.0 for 32 bit, PAE and 64 bit." from
http://www.xensource.com/xen/downloads/ ?

If so I'm wondering why more people aren't complaining about this problem? It
should not be restricted to our setup?!? I doubt that Itai and I are both
affected just because we use Debian/Sarge. I think it's more likely to be a (Xen)
kernel issue.

Javier Guerra

2006-May-01 01:38 UTC

Re: [Xen-users] NFS problems in guest

On Sunday 30 April 2006 1:18 pm, Birger Brunswiek wrote:
> I'm not sure why ip_conntrack defragments but not refragments the packets
> it receives. Maybe it's not even supposed to refragment them and assumes
> the network device will ... but a bridge does not as it works on the
> ethernet layer.

ok, i think i got the spirit of the problem. by nature, ip_conntrack has to
work on defragmented packets. a bridge, OTOH, works at ethernet frame level;
it doesn't care if those frames contain fragments or complete packets, as
long as they're not longer than the MTU of any interface.

in a 'typical' router (IP level, no bridging), it's ok to defragment (even
convenient most times), and any packet would be refragmented (if needed) on
its way out of the router. for example, if you have a WAN interface with an
MTU of 512, you might get a very fragmented packet, and after reassembly it
might be 2000 bytes long (unlikely, but legal). when the routing code sends
it to an Ethernet LAN, it would have to split it into fragments of at most
1500 bytes.

when you do have a bridge, this is different; a 'real' bridge shouldn't see
into the ethernet frames, and shouldn't modify the packets in any way. all
interfaces should have the same MTU, and there shouldn't be any problem. but
since ip_conntrack reassembles IP fragments, the bridge suddenly finds
packets too big to handle, and has to drop them.

in most networks that's not a problem, since the protocol stack optimises its
packet length until they go all the way without fragmenting (end-to-end MTU
discovery). at the very least, this happens on TCP layers... but NFS uses
UDP by default.

the NFS manpage says the rsize and wsize parameters are 1024 by default, but
performance is improved using 8192. of course, when using NFS over UDP, that
means that almost all traffic would be on fragmented packets (since UDP
doesn't fragment, the IP layer would).
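
a rough way to see how much fragmentation a given rsize/wsize implies is to count the IP fragments needed for one UDP datagram of that size. this is only a sketch: it assumes a 1500-byte MTU, a 20-byte IPv4 header and an 8-byte UDP header, it ignores the RPC/NFS header overhead on top of the data, and the function name is made up for the example.

```python
import math

IP_HEADER = 20   # bytes; IPv4 header without options (assumption)
UDP_HEADER = 8   # bytes


def udp_fragment_count(udp_payload, mtu=1500):
    """Number of IPv4 fragments needed to carry one UDP datagram."""
    # Data bytes per fragment, rounded down to a multiple of 8.
    max_data = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((udp_payload + UDP_HEADER) / max_data)


for size in (1024, 8192):
    print(size, udp_fragment_count(size))
# 1024 -> 1 fragment, 8192 -> 6 fragments
```

with 1024-byte transfers each datagram fits in a single packet; with 8192 every full-size read or write is split into several fragments, which is exactly the traffic that gets reassembled and then dropped at the bridge.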

so, an interesting test would be:

1) using NFS over UDP with big rsize/wsize parameters over a linux bridge with
ip_conntrack and no Xen (i think i can do this test... maybe this wednesday)

2) trying NFS over TCP on a bridging Xen box

3) reducing rsize/wsize on the NFS client on a bridging Xen box.

if it's a linux networking issue, and not a Xen issue, case 1) should fail,
and 2) and 3) should work. it would be interesting to compare performance of
2) and 3)....
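
the three tests differ only in the NFS mount options, so the commands could be generated along these lines. this is a sketch under assumptions: the server name NAS and the mount point /mnt/test come from the error messages earlier in the thread, the export path /export is hypothetical, the helper name is made up, and the rsize/wsize values for test 2 are a guess since none were specified.

```python
def nfs_mount_command(server, export, mountpoint, proto, rsize, wsize):
    """Build a mount command line (as an argument list) for an NFS share."""
    # "udp"/"tcp", "rsize=" and "wsize=" are standard NFS mount options.
    options = f"{proto},rsize={rsize},wsize={wsize}"
    return ["mount", "-t", "nfs", "-o", options, f"{server}:{export}", mountpoint]


# (transport, rsize, wsize) for each of the three proposed tests.
TESTS = {
    "1) UDP, big rsize/wsize, plain linux bridge": ("udp", 8192, 8192),
    "2) TCP on a bridging Xen box": ("tcp", 8192, 8192),
    "3) UDP, reduced rsize/wsize on a bridging Xen box": ("udp", 1024, 1024),
}

for label, (proto, rsize, wsize) in TESTS.items():
    cmd = nfs_mount_command("NAS", "/export", "/mnt/test", proto, rsize, wsize)
    print(label, "->", " ".join(cmd))
```

the commands are only printed, not executed, so they can be reviewed and adjusted before being run as root on the client.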

--
Javier

Itai Tavor

2006-May-01 05:06 UTC

Re: [Xen-users] NFS problems in guest

On 01/05/2006, at 4:26 AM, Birger Brunswiek wrote:
> Itai Tavor wrote:
>> I'm running Xen 3 (xen-unstable tarball from 4/27) with kernel 2.6.16
>> and Debian Sarge in all domains. Most things seem to work very well, I
>> had the soft lockups problem but Apr 6 change seems to have fixed it.
>
> I was just wondering where you get your tarball from? Is it the "Pre-built
> installations of Xen 3.0 for 32 bit, PAE and 64 bit." from
> http://www.xensource.com/xen/downloads/ ?

I'm using source releases from
http://www.cl.cam.ac.uk/Research/SRG/netos/xen/downloads.html (end of the page).

If I can't get Debian packages, I always prefer to build from source :)

> If so I'm wondering why not more people are complaining about this problem? It
> should not be restricted to our setup?!? I doubt that Itai and I are both
> affected just because we use Debian/Sarge. I think its more likely to be a (Xen)
> kernel issue.

I agree it's not likely to be a Debian issue. We're using stock
kernels with Xen patches, and I think the conntrack module will do
the same thing no matter what distribution it's on.
