thr3ads.net - CentOS - [CentOS] losing NFS connection [Apr 2005]

Angelo Machils

2005-Apr-24 18:28 UTC

[CentOS] losing NFS connection

Hello there!

Perhaps this is a little off-topic, but I notice this only on the CentOS
box.
I'm running CentOS 4 on an AMD64 which has the following entries in the
fstab to connect to NFS shares on a Fedora Core 3 (FC3) box:
192.168.1.12:/home/angelo/ /home/angelo/NFS_share1 nfs rw,addr=192.168.1.12 0 0
192.168.1.12:/home/angelo/data /home/angelo/NFS_share2 nfs rw,addr=192.168.1.12 0 0
192.168.1.12:/home/angelo/data2 /home/angelo/NFS_share3 nfs rw,addr=192.168.1.12 0 0
I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
UDP) in iptables on the FC3 box and I can connect to the shares, but after a
while I seem to lose the connection to them.
When I try to cd into them in a console I get the error:
bash: cd: NFS_share1: Input/output error
In Nautilus I don't even see the directories anymore, and in
/var/log/messages I get this error message:
Apr 24 20:17:02 solaris kernel: RPC: garbage, exit EIO
There are no entries in /var/log/messages on the FC3 box.
If I manually umount them and then mount them again, I can use them
again for a while....
The exports file on the FC3 box looks like this:
[root@imhotep etc]# more exports
/home/angelo             192.168.1.*(rw,sync)
/home/angelo/data        192.168.1.*(rw,sync)
/home/angelo/data2       192.168.1.*(rw,sync)

Anyone any idea what is wrong here?

Thanks in advance,
Angelo

Collins Richey

2005-Apr-24 19:14 UTC

On 4/24/05, Angelo Machils <angelus at sangreal.demon.nl> wrote:
> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
> UDP) in iptables on the FC3 box and I can connect to them, but after a
> while I seem to lose the connection to the shares.
[snip]
> Anyone any idea what is wrong here?

Quote from an NFS thread on another list:

"I find on my NFS clients, that i need to allow connections to port 111
and also to higher level tcp ports (assuming you are doing NFS over tcp)
--destination-ports 32768:65535."

Maybe you need to open up your firewall?

-- 
 Collins
       When I saw the Iraqi people voting three weeks ago, 8 million of them, 
       it was the start of a new Arab world.... The Berlin Wall has fallen. 
               - Lebanese Druze leader Walid Jumblatt

Sean O'Connell

2005-Apr-24 19:23 UTC

On Sun, 2005-04-24 at 20:28 +0200, Angelo Machils wrote:
> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
> UDP) in iptables on the FC3 box and I can connect to them, but after a
> while I seem to lose the connection to the shares.
[snip]
> Anyone any idea what is wrong here?

Angelo-

I have found that you need to allow the higher-numbered TCP ports
(32768:65535) through on both the server and client to keep RPC
connections happy. I have also had to allow a range of UDP ports
(600:1024) on the server (though this was with older NFS
implementations). It's possible that you need to open up
more ports on the server. One thing to do would be to add a log rule to
your iptables rules on the client and server and see what is being
dropped when the client mount hangs.
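
As a rough sketch of such a log rule (not taken from the thread): the helper below inserts a LOG rule for one peer at a chosen position in a chain. The function name add_nfs_log_rule, the "NFS-DEBUG: " prefix, the IPTABLES override variable and the example client address are all made up for illustration. The right chain and position depend on your own ruleset, so check `iptables -L -n --line-numbers` first and place the rule just ahead of whatever rule drops or rejects traffic.

```shell
#!/bin/sh
# Illustrative helper, not from the thread. IPTABLES can be overridden
# (e.g. IPTABLES=echo) to preview the command instead of running it.
IPTABLES="${IPTABLES:-iptables}"

# add_nfs_log_rule CHAIN POSITION PEER
# Inserts a LOG rule at POSITION in CHAIN for packets coming from PEER.
# Packets that reach this rule were not accepted by any earlier rule.
add_nfs_log_rule() {
    chain="$1"; pos="$2"; peer="$3"
    "$IPTABLES" -I "$chain" "$pos" -s "$peer" -j LOG --log-prefix "NFS-DEBUG: "
}

# Example (as root, on the server; 192.168.1.10 is an assumed client address):
#   add_nfs_log_rule INPUT 5 192.168.1.10
```

Matching packets should then show up in the kernel log (usually /var/log/messages) with the chosen prefix, including the protocol and destination port that the firewall was about to refuse.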

Sean

Marc Powell

2005-Apr-24 19:53 UTC

> -----Original Message-----
> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
> Behalf Of Angelo Machils
> Sent: Sunday, April 24, 2005 1:28 PM
> To: centos at centos.org
> Subject: [CentOS] losing NFS connection
> 
> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
> UDP) in iptables on the FC3 box and I can connect to them, but after a
> while I seem to lose the connection to the shares.
[snip]
>
> Anyone any idea what is wrong here?

Just a thought but have you hard-coded speed and duplex all the way
through? Don't trust auto-negotiation.

--
Marc

Aleksandar Milivojevic

2005-Apr-25 14:46 UTC

Angelo Machils wrote:
> I'm running CentOS 4 on an AMD64 which has the following entries in the
> fstab to connect to NFS shares on a Fedora3 box:
[snip]
> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
> UDP) in iptables on the FC3 box and I can connect to them, but after a
> while I seem to lose the connection to the shares.

NFS uses RPC, and RPC can be a real bitch to get working over a 
firewall.  IMO, if anybody thinks of writing a service that uses RPC, 
he/she should think again.  And again, until he/she drops the idea, and 
decides not to use RPC.

Anyhow, since NFS does use RPC, and we are kind of stuck with it for 
now...  Try and make sure that in all of your configuration files all 
NFS RPC services are set up to use fixed ports, and make sure all of 
them are covered.  If you miss a single one, you get into trouble.  The 
other solution is to open all high ports from the client to the server, 
and see if that helps.  Try using the rpcinfo utility (rpcinfo -p) 
and see if the port mapper assigned any non-standard ports to any of 
the NFS-related RPC services.
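
To see which ports the port mapper has handed out, the output of `rpcinfo -p` can be boiled down to a unique list. A small sketch, assuming the usual five-column layout (program, version, protocol, port, service); the function name rpc_ports is made up here:

```shell
#!/bin/sh
# Summarise `rpcinfo -p` output (read from stdin) as unique
# "proto port service" lines. The header line is skipped because its
# fourth column ("port") is not numeric.
rpc_ports() {
    awk '$4 ~ /^[0-9]+$/ { print $3, $4, ($5 == "" ? "-" : $5) }' | LC_ALL=C sort -u
}

# Usage, run from the client against the server in this thread:
#   rpcinfo -p 192.168.1.12 | rpc_ports
```

Every line in the result is a protocol/port pair that the firewall has to let through. Ports that change between reboots are the ones not yet pinned to a fixed value.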

Also, put some logging rules into your firewall configuration.  That 
will help you troubleshoot the problems.  When you do it, you'll know 
exactly what kind of packets are being dropped by the firewall and why 
they are dropped.  Then you can either update your firewall 
configuration or make changes on NFS/RPC (for example, if you forgot to 
explicitly force some NFS-related RPC service to use a fixed port).
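
On Red Hat style systems the NFS init scripts usually read /etc/sysconfig/nfs, so pinning the helper services might look like the fragment below. This is a sketch rather than something from the thread. The variable names are the ones the Red Hat/Fedora init scripts commonly honour, but check your own /etc/init.d/nfs and /etc/init.d/nfslock to confirm they are read on your release. The port numbers are arbitrary picks.

```shell
# /etc/sysconfig/nfs on the server (sketch; port numbers are arbitrary)
MOUNTD_PORT=4002      # rpc.mountd
STATD_PORT=4000       # rpc.statd
LOCKD_TCPPORT=4001    # lockd over TCP
LOCKD_UDPPORT=4001    # lockd over UDP
```

After the NFS services are restarted, `rpcinfo -p` should report the services on these ports, and the same ports then need matching iptables rules.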

There's also an RPC helper module for Netfilter.  It is part of the iptables 
package, but not part of the kernel package (in other words, you can't 
use it unless you recompile the kernel, and then you need to know 
exactly what patch level of the module was in the iptables package to patch 
the kernel with the same patch level of the module, or you need to 
repatch/recompile both iptables and the kernel).  Adding Netfilter 
patches to your kernel can be a real bitch too for inexperienced users. 
I wish there was an easier way of doing it (as in: here's the userland 
module, here's the kernel module, just compile these two; but there 
isn't).  I tried it out once a long time ago, but it wasn't 
working all that great for me.  Hopefully it will mature one day and 
be included in the kernel.

-- 
Aleksandar Milivojevic <amilivojevic at pbl.ca>    Pollard Banknote Limited
Systems Administrator                           1499 Buffalo Place
Tel: (204) 474-2323 ext 276                     Winnipeg, MB  R3T 1L7

Marc Powell

2005-Apr-25 14:47 UTC

> -----Original Message-----
> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
> Behalf Of Joshua Baker-LePain
> Sent: Monday, April 25, 2005 7:59 AM
> To: CentOS mailing list
> Subject: RE: [CentOS] losing NFS connection
> 
> On Sun, 24 Apr 2005 at 2:53pm, Marc Powell wrote
> 
> > > Anyone any idea what is wrong here?
> >
> > Just a thought but have you hard-coded speed and duplex all the way
> > through? Don't trust auto-negotiation.
> 
> Just a dissenting opinion here on that last bit.  No less a source than
> Donald Becker often advises *strongly* against disabling auto-negotiation.
> Yes, some switches *cough*Cisco*cough* historically did it very badly.
> But that was then, not now.  And, AIUI, it's actually part of the gbe
> standard.

Sure, let me rephrase my original wording -- "I don't trust
auto-negotiation." =) I say that for two reasons --

I work with a very large network consisting of tens of thousands of
machines connecting to a hodge-podge of switches and routers from many
different vendors. It is my personal experience that auto-negotiation
does not result in optimum or even compatible speed/duplex settings
often enough to be trustworthy. I've experienced this as recently as
last week with brand new Cisco equipment and Dell computers running
CentOS 3.4 (100/Full on one end, 100/Half on the other). I've seen the
problem with Alteon and Foundry equipment recently as well. It may be
part of the standard but as anyone who's been around a while knows, each
vendor's interpretation of a standard can vary enough to be problematic.

Second, as an administrator I want to do everything in my power to make
sure the devices I manage are going to run smoothly. Why leave something
as simple but as problematic as speed/duplex settings to chance or trust
when it is a simple task to force it to be that which works best in the
networking environment that the device lives in? For example, if
auto-negotiation comes up with 100/half, 10/full or 10/half and every
other local device is 100/full or vice-versa, the network is operating
at less than peak efficiency and that can also result in odd problems.

--
Marc

Bowie Bailey

2005-Apr-25 16:26 UTC

From: Marc Powell [mailto:marc at ena.com]
> From: Joshua Baker-LePain
> > On Sun, 24 Apr 2005 at 2:53pm, Marc Powell wrote
> >
> > > > Anyone any idea what is wrong here?
> > >
> > > Just a thought but have you hard-coded speed and duplex all the way
> > > through? Don't trust auto-negotiation.
> >
> > Just a dissenting opinion here on that last bit.  No less a source than
> > Donald Becker often advises *strongly* against disabling auto-negotiation.
> > Yes, some switches *cough*Cisco*cough* historically did it very badly.
> > But that was then, not now.  And, AIUI, it's actually part of the gbe
> > standard.
>
> Sure, let me rephrase my original wording -- "I don't trust
> auto-negotiation." =) I say that for two reasons --

The problem I've found with hard-coding speed and duplex is that it MUST be
done on both sides of the link.  If one side is hard-coded and the other
side is trying to negotiate, the speed will be detected properly, but the
duplex will not.  This can result in one side of the link running at full
and the other side running at half.  Obviously, problems will occur.

If both sides support (and are configured for) negotiation, there shouldn't
be a problem.  Likewise, if both sides are hard-coded, there shouldn't be a
problem.
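
One way to spot such a mismatch is to compare what each end of the link reports. The sketch below is not from the thread: it assumes the ethtool utility is installed and that its output uses the common "Speed:", "Duplex:" and "Auto-negotiation:" labels. The function name link_summary is made up.

```shell
#!/bin/sh
# Reduce `ethtool <iface>` output (read from stdin) to one summary line.
# Only the exact labels Speed, Duplex and Auto-negotiation are picked up;
# lines such as "Advertised auto-negotiation" are ignored.
link_summary() {
    awk -F': *' '
        { key = $1; sub(/^[ \t]+/, "", key) }
        key == "Speed"            { speed = $2 }
        key == "Duplex"           { duplex = $2 }
        key == "Auto-negotiation" { autoneg = $2 }
        END { print "speed=" speed, "duplex=" duplex, "autoneg=" autoneg }
    '
}

# Usage (as root; eth0 is an assumed interface name):
#   ethtool eth0 | link_summary
```

If the host reports half duplex while the switch port is forced to full (or the other way around), that is the mismatch described above.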

Bowie

Angelo Machils

2005-Apr-25 19:42 UTC

>> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP
>> and UDP) in iptables on the FC3 box and I can connect to them, but
>> after a while I seem to lose the connection to the shares.
>
>[snip]
>
>> Anyone any idea what is wrong here?
>
>Just a thought but have you hard-coded speed and duplex all the way
>through? Don't trust auto-negotiation.
>
>--
>Marc
>
I have opened the firewall on the server all the way for the client (and 
also the other way around), but it makes no difference. After a while I 
see these messages again in /var/log/messages on the client:
Apr 25 21:38:02 solaris kernel: RPC: garbage, exit EIO
Apr 25 21:38:33 solaris last message repeated 70 times
Apr 25 21:39:34 solaris last message repeated 122 times
Apr 25 21:40:01 solaris last message repeated 55 times
Apr 25 21:40:01 solaris crond(pam_unix)[4701]: session opened for user root by (uid=0)
Apr 25 21:40:01 solaris crond(pam_unix)[4701]: session closed for user root
Apr 25 21:40:02 solaris kernel: RPC: garbage, exit EIO
And this keeps repeating...... No entries on the server though.....

Which file do I have to edit in order to set the NIC to a fixed speed and 
duplex? I know that I can use mii-tool to check and set it, but is this 
permanent, even after a reboot?

Thanks in advance, Angelo

Marc Powell

2005-Apr-25 20:39 UTC

> -----Original Message-----
> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
> Behalf Of Angelo Machils
> Sent: Monday, April 25, 2005 2:43 PM
> To: centos at centos.org
> Subject: RE: [CentOS] losing NFS connection
[snip]
> >Just a thought but have you hard-coded speed and duplex all the way
> >through? Don't trust auto-negotiation.
>
> I have opened the firewall on the server all the way for the client (and
> also the other way around) but it makes no difference. After a while I
> see these messages again in the /var/log/messages on the client:
> Apr 25 21:38:02 solaris kernel: RPC: garbage, exit EIO
> Apr 25 21:38:33 solaris last message repeated 70 times
> Apr 25 21:39:34 solaris last message repeated 122 times
> Apr 25 21:40:01 solaris last message repeated 55 times
> Apr 25 21:40:01 solaris crond(pam_unix)[4701]: session opened for user root by (uid=0)
> Apr 25 21:40:01 solaris crond(pam_unix)[4701]: session closed for user root
> Apr 25 21:40:02 solaris kernel: RPC: garbage, exit EIO
> And this keeps repeating...... No entries on the server though.....
>
> Which file do I have to edit in order to set the NIC into fixed state?
> I know that I can use mii-tool to check and set, but is this permanent,
> even after a reboot?

It varies, but mii-tool changes will be lost on reboot. There are several
ways of getting around that, from simplest to hardest --

- Add '/sbin/mii-tool -F 100baseTx-FD' to /etc/rc.local
- Add '/sbin/mii-tool -F 100baseTx-FD ${DEVICE}' to
  /etc/sysconfig/network-scripts/ifup-post
- Research your card driver and add the appropriate arguments in
  /etc/modules.conf (/etc/modprobe.conf on 2.6-kernel systems such as
  CentOS 4).

I almost always opt for the first option. As has been previously noted,
you must make sure that the switch, hub or router that you are connected
to is also forced to the same speed and duplex.

--
Marc
