Hello there!

Perhaps this is a little off-topic, but I notice this only on the CentOS box.
I'm running CentOS 4 on an AMD64, which has the following entries in the fstab to connect to NFS shares on a Fedora 3 box:

192.168.1.12:/home/angelo/ /home/angelo/NFS_share1 nfs rw,addr=192.168.1.12 0 0
192.168.1.12:/home/angelo/data /home/angelo/NFS_share2 nfs rw,addr=192.168.1.12 0 0
192.168.1.12:/home/angelo/data2 /home/angelo/NFS_share3 nfs rw,addr=192.168.1.12 0 0

I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and UDP) in iptables on the FC3 box and I can connect to the shares, but after a while I seem to lose the connection to them. When I try to cd into them in a console I get the error:

bash: cd: NFS_share1: Input/output error

In Nautilus I don't even see the directories anymore, and in /var/log/messages I get this error message:

Apr 24 20:17:02 solaris kernel: RPC: garbage, exit EIO

There are no entries in /var/log/messages on the FC3 box. If I manually umount the shares and then mount them again, I can use them again for a while...

The exports file on the FC3 box looks like this:

[root@imhotep etc]# more exports
/home/angelo 192.168.1.*(rw,sync)
/home/angelo/data 192.168.1.*(rw,sync)
/home/angelo/data2 192.168.1.*(rw,sync)

Does anyone have any idea what is wrong here?

Thanks in advance,
Angelo
On 4/24/05, Angelo Machils <angelus at sangreal.demon.nl> wrote:
> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
> UDP) in iptables on the FC3 box and I can connect to them, but after a
> while I seem to loose the connection to the shares.

[snip]

> Anyone any idea what is wrong here?

Quote from an NFS thread on another list:

"I find on my NFS clients, that i need to allow connections to port 111 and also to higher level tcp ports (assuming you are doing NFS over tcp) --destination-ports 32768:65535."

Maybe you need to open up your firewall?

--
Collins
When I saw the Iraqi people voting three weeks ago, 8 million of them, it was the start of a new Arab world.... The Berlin Wall has fallen. - Lebanese Druze leader Walid Jumblatt
On Sun, 2005-04-24 at 20:28 +0200, Angelo Machils wrote:
> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
> UDP) in iptables on the FC3 box and I can connect to them, but after a
> while I seem to loose the connection to the shares.

[snip]

> Anyone any idea what is wrong here?

Angelo-

I have found that you need to allow the higher-numbered TCP ports (32768:65535) through on both the server and the client to keep RPC connections happy. I have also had to allow a range of UDP ports between 600 and 1024 on the server (though this was with older NFS implementations). It's possible that you need to open up more ports on the server.

One thing to do would be to add a log rule to your iptables rules on the client and the server and see what is being dropped when the client mount hangs.

Sean
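
A minimal sketch of the kind of log rule Sean describes, not a tested firewall configuration. The function name `print_nfs_log_rules` and the "NFS-DROP: " prefix are made up for illustration, and the sketch only prints the commands so they can be reviewed before anything is run as root. It assumes the INPUT and OUTPUT chains drop by policy; if a chain ends in an explicit DROP or REJECT rule, the LOG rule would have to be inserted before that rule with `-I` and a rule number instead of appended.

```shell
#!/bin/sh
# Sketch only: prints iptables commands instead of running them.
# $1 is the address of the peer machine (the NFS server when used on the
# client, the client when used on the server).
print_nfs_log_rules() {
    peer="$1"
    # Log inbound packets from the peer that no earlier rule accepted.
    printf 'iptables -A INPUT -s %s -j LOG --log-prefix "NFS-DROP: "\n' "$peer"
    # Log outbound packets to the peer that no earlier rule accepted.
    printf 'iptables -A OUTPUT -d %s -j LOG --log-prefix "NFS-DROP: "\n' "$peer"
}
```

Running `print_nfs_log_rules 192.168.1.12` on the client prints two commands for the server address from the original post. Once such rules are in place, entries carrying the prefix should appear in the kernel log (typically /var/log/messages on these systems) when the mount hangs, showing the protocol and destination port that was blocked.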
> -----Original Message-----
> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
> Behalf Of Angelo Machils
> Sent: Sunday, April 24, 2005 1:28 PM
> To: centos at centos.org
> Subject: [CentOS] losing NFS connection
>
> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
> UDP) in iptables on the FC3 box and I can connect to them, but after a
> while I seem to loose the connection to the shares.

[snip]

> Anyone any idea what is wrong here?

Just a thought, but have you hard-coded speed and duplex all the way through? Don't trust auto-negotiation.

--
Marc
Angelo Machils wrote:
> Hello there!
>
> Perhaps this is a little off-topic, but I notice this only on the Centos
> box.
> I'm running Centos 4 on an AMD64 which has the following entries in the
> fstab to connect to NFS shares on a Fedora3 box:

> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
> UDP) in iptables on the FC3 box and I can connect to them, but after a
> while I seem to loose the connection to the shares.

NFS uses RPC, and RPC can be a real bitch to get working through a firewall. IMO, if anybody thinks of writing a service that uses RPC, he/she should think again. And again, until he/she drops the idea and decides not to use RPC. Anyhow, since NFS does use RPC, we are kind of stuck with it for now...

Try to make sure that in all of your configuration files all NFS RPC services are set up to use fixed ports, and make sure all of them are covered by your firewall rules. If you miss a single one, you get into trouble. The other solution is to open all high ports from the client to the server and see if that helps. Try using the rpcinfo utility (or whatever it is called) and see if the port mapper assigned any non-standard ports to any of the NFS-related RPC services.

Also, put some logging rules into your firewall configuration. That will help you troubleshoot the problem: you'll know exactly what kind of packets are being dropped by the firewall and why. Then you can either update your firewall configuration or make changes on the NFS/RPC side (for example, if you forgot to explicitly force some NFS-related RPC service to use a fixed port).

There's also an RPC helper module for Netfilter. It is part of the iptables package, but not part of the kernel package. In other words, you can't use it unless you recompile the kernel, and then you need to know exactly what patch level of the module was in the iptables package so you can patch the kernel with the same patch level, or you need to repatch and recompile both iptables and the kernel. Adding Netfilter patches to your kernel can be a real bitch too for inexperienced users. I wish there was an easier way of doing it (as in: here's the userland module, here's the kernel module, just compile these two), but there isn't. I tried it out once a long time ago, but it wasn't working all that great for me. Hopefully it will mature one day and be included in the kernel.

--
Aleksandar Milivojevic <amilivojevic at pbl.ca>
Pollard Banknote Limited
Systems Administrator
1499 Buffalo Place
Tel: (204) 474-2323 ext 276
Winnipeg, MB R3T 1L7
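
The rpcinfo check Aleksandar mentions can be sketched roughly as follows. This is an illustration, not something from the thread: the helper name `nfs_rpc_ports` is made up, and it assumes the usual five-column `rpcinfo -p` listing (program, version, protocol, port, service name), in which lockd registers as `nlockmgr` and statd as `status`.

```shell
#!/bin/sh
# Reads `rpcinfo -p` output on stdin and prints "service proto port" for the
# NFS-related RPC services, de-duplicated across protocol versions.
nfs_rpc_ports() {
    awk '$5 ~ /^(nfs|mountd|nlockmgr|status|rquotad)$/ { print $5, $3, $4 }' | sort -u
}
```

For example, `rpcinfo -p 192.168.1.12 | nfs_rpc_ports` run from the client would list the ports the server's port mapper currently advertises. Any port in that list that is not opened in iptables, or that changes after a service restart, is a candidate for the kind of trouble described above.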
> -----Original Message-----
> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
> Behalf Of Joshua Baker-LePain
> Sent: Monday, April 25, 2005 7:59 AM
> To: CentOS mailing list
> Subject: RE: [CentOS] losing NFS connection
>
> On Sun, 24 Apr 2005 at 2:53pm, Marc Powell wrote
>
> > > Anyone any idea what is wrong here?
> >
> > Just a thought but have you hard-coded speed and duplex all the way
> > through? Don't trust auto-negotiation.
>
> Just a dissenting opinion here on that last bit. No less a source than
> Donald Becker often advises *strongly* against disabling auto-negotiation.
> Yes, some switches *cough*Cisco*cough* historically did it very badly.
> But that was then, not now. And, AIUI, it's actually part of the gbe
> standard.

Sure, let me rephrase my original wording: "I don't trust auto-negotiation." =) I say that for two reasons.

First, I work with a very large network consisting of tens of thousands of machines connecting to a hodge-podge of switches and routers from many different vendors. It is my personal experience that auto-negotiation fails to produce optimum, or even compatible, speed/duplex settings often enough that it cannot be trusted. I've experienced this as recently as last week with brand new Cisco equipment and Dell computers running CentOS 3.4 (100/Full on one end, 100/Half on the other). I've seen the problem with Alteon and Foundry equipment recently as well. It may be part of the standard, but as anyone who's been around a while knows, each vendor's interpretation of a standard can vary enough to be problematic.

Second, as an administrator I want to do everything in my power to make sure the devices I manage are going to run smoothly. Why leave something as simple but as problematic as speed/duplex settings to chance or trust when it is a simple task to force them to whatever works best in the networking environment that the device lives in? For example, if auto-negotiation comes up with 100/half, 10/full or 10/half and every other local device is 100/full, or vice versa, the network is operating at less than peak efficiency, and that can also result in odd problems.

--
Marc
From: Marc Powell [mailto:marc at ena.com]
> From: Joshua Baker-LePain
> > On Sun, 24 Apr 2005 at 2:53pm, Marc Powell wrote
> >
> > > > Anyone any idea what is wrong here?
> > >
> > > Just a thought but have you hard-coded speed and duplex all the way
> > > through? Don't trust auto-negotiation.
> >
> > Just a dissenting opinion here on that last bit. No less a source than
> > Donald Becker often advises *strongly* against disabling auto-negotiation.
> > Yes, some switches *cough*Cisco*cough* historically did it very badly.
> > But that was then, not now. And, AIUI, it's actually part of the gbe
> > standard.
>
> Sure, let me rephrase my original wording -- "I don't trust
> auto-negotiation." =) I say that for two reasons --

The problem I've found with hard-coding speed and duplex is that it MUST be done on both sides of the link. If one side is hard-coded and the other side is trying to negotiate, the speed will be detected properly, but the duplex will not. This can result in one side of the link running at full and the other side running at half. Obviously, problems will occur.

If both sides support (and are configured for) negotiation, there shouldn't be a problem. Likewise, if both sides are hard-coded, there shouldn't be a problem.

Bowie
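
One way to spot the mismatch Bowie describes is to compare what each end of the link reports. The sketch below is an assumption-laden illustration: it uses `ethtool`, which the thread does not mention (the posters use mii-tool), and relies on its "Speed:", "Duplex:" and "Auto-negotiation:" report lines. The helper name `link_settings` is hypothetical.

```shell
#!/bin/sh
# Reads `ethtool <iface>` output on stdin and prints "speed duplex autoneg".
link_settings() {
    awk -F': *' '
        /^[ \t]*Speed:/            { speed = $2 }
        /^[ \t]*Duplex:/           { duplex = $2 }
        /^[ \t]*Auto-negotiation:/ { autoneg = $2 }
        END { print speed, duplex, autoneg }
    '
}
```

Usage would be along the lines of `ethtool eth0 | link_settings` on the host, with the result compared against the port settings shown by the switch. A host that reports half duplex with auto-negotiation on, while the switch port is forced to full duplex, matches the failure case above: the speed was detected, the duplex was not.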
>> -----Original Message-----
>> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
>> Behalf Of Angelo Machils
>> Sent: Sunday, April 24, 2005 1:28 PM
>> To: centos at centos.org
>> Subject: [CentOS] losing NFS connection
>>
>> I have opened ports 111 (TCP), 648 (TCP), 651 (TCP) and 2049 (TCP and
>> UDP) in iptables on the FC3 box and I can connect to them, but after a
>> while I seem to loose the connection to the shares.
>
> [snip]
>
>> Anyone any idea what is wrong here?
>
> Just a thought but have you hard-coded speed and duplex all the way
> through? Don't trust auto-negotiation.
>
> --
> Marc

I have opened the firewall on the server all the way for the client (and also the other way around), but it makes no difference. After a while I see these messages again in /var/log/messages on the client:

Apr 25 21:38:02 solaris kernel: RPC: garbage, exit EIO
Apr 25 21:38:33 solaris last message repeated 70 times
Apr 25 21:39:34 solaris last message repeated 122 times
Apr 25 21:40:01 solaris last message repeated 55 times
Apr 25 21:40:01 solaris crond(pam_unix)[4701]: session opened for user root by (uid=0)
Apr 25 21:40:01 solaris crond(pam_unix)[4701]: session closed for user root
Apr 25 21:40:02 solaris kernel: RPC: garbage, exit EIO

And this keeps repeating... There are no entries on the server, though.

Which file do I have to edit in order to set the NIC to a fixed state? I know that I can use mii-tool to check and set it, but is this permanent, even after a reboot?

Thanks in advance,
Angelo
> -----Original Message-----
> From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On
> Behalf Of Angelo Machils
> Sent: Monday, April 25, 2005 2:43 PM
> To: centos at centos.org
> Subject: RE: [CentOS] losing NFS connection

[snip]

> > Just a thought but have you hard-coded speed and duplex all the way
> > through? Don't trust auto-negotiation.
>
> I have opened the firewall on the server all the way for the client (and
> also the other way around) but it makes no difference. After a while I
> see these messages again in the /var/log/messages on the client:
> Apr 25 21:38:02 solaris kernel: RPC: garbage, exit EIO
> Apr 25 21:38:33 solaris last message repeated 70 times
> Apr 25 21:39:34 solaris last message repeated 122 times
> Apr 25 21:40:01 solaris last message repeated 55 times
> Apr 25 21:40:01 solaris crond(pam_unix)[4701]: session opened for user
> root by (uid=0)
> Apr 25 21:40:01 solaris crond(pam_unix)[4701]: session closed for user
> root
> Apr 25 21:40:02 solaris kernel: RPC: garbage, exit EIO
> And this keeps repeating...... No entries on the server though.....
>
> Which file do I have to edit in order to set the NIC into fixed state. I
> know that I can use mii-tool to check and set, but is this permanent,
> even after a reboot?

It varies, but mii-tool changes will be lost on reboot. There are several ways of getting around that, from simplest to hardest:

- Add '/sbin/mii-tool -F 100baseTx-FD' to /etc/rc.local
- Add '/sbin/mii-tool -F 100baseTx-FD ${DEVICE}' to /etc/sysconfig/network-scripts/ifup-post
- Research your card driver and add the appropriate arguments in /etc/modules.conf

I almost always opt for the first option. As has been previously noted, you must make sure that the switch, hub or router that you are connected to is also forced to the same speed and duplex.

--
Marc
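
Marc's first option can be sketched as a small helper that adds the line only once, so that re-running it does not stack duplicate entries in the file. The helper name `append_once` is made up for illustration; the mii-tool command and the /etc/rc.local path are the ones given above.

```shell
#!/bin/sh
# Appends LINE to FILE unless an identical line is already present.
append_once() {
    file="$1"
    line="$2"
    # -x: match the whole line, -F: fixed string, -e: pattern may start with "-"
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Run as root, `append_once /etc/rc.local '/sbin/mii-tool -F 100baseTx-FD'` would make the forced setting survive a reboot. The switch port still has to be forced to the same speed and duplex, as noted above.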