Hello,

I am experiencing system hangs when running NFSv4 over a tinc VPN. I don't know if the problem is with NFS or tinc and would appreciate any suggestions on how to narrow down the culprit. Unfortunately I cannot simply run NFS directly over TCP -- the participating systems are connected only over an open network.

The configuration is as follows: I have a master server ("spitzer") that exports the NFS shares and also acts as the "primary" tinc server. All other clients connect to the tinc instance on the main server to establish the VPN, and then mount the NFS shares.

The server has 4 relevant network interfaces:

eth0 - Internet
eth1 - Internal network
hbt - Tinc VPN
vnet0 - Virtual hosts
br0 - Bridge between eth1 and vnet0

The hbt, br0 (i.e., eth1 and vnet0) interfaces share a common 192.168.1.0/24 network. Proxy ARP is enabled for br0 and hbt.

I am using tinc 1.0.16 on 3.0.0, 64bit Ubuntu 10.04 LTS on both server and client.

The problem is that as soon as more than three tinc clients are accessing the NFS shares, any operations on the NFS mountpoints by the clients hang. On the clients, tinc then takes 100% CPU time. On the server, the tinc instance runs with about 20% load. With tinc debugging enabled, tinc seems to be busy forwarding packets. I also ran a packet sniffer, which showed me that 90% of the packets were NFS related, but I am not familiar enough with NFS to be able to tell anything from the packets themselves.

I could find no indication that the br0, eth1 and vnet0 interfaces and systems are related to the problem in any way.

Any suggestions on how I could debug this?

Best,

-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
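The "90% of the packets were NFS related" figure can be reproduced from a capture by counting packets on the NFS port. The following is only a minimal sketch: it assumes the capture has already been reduced to (source port, destination port) pairs, and the function name `nfs_share` and the sample data are hypothetical. It relies on NFSv4 using TCP port 2049; port 655 in the sample is tinc's default port.

```python
NFS_PORT = 2049  # well-known NFS port; NFSv4 needs no separate mount/lock ports


def nfs_share(packets):
    """Return the fraction of packets that have port 2049 on either side.

    `packets` is an iterable of (src_port, dst_port) tuples. How these are
    extracted from the sniffer output is left open here (assumption).
    """
    packets = list(packets)
    if not packets:
        return 0.0
    nfs = sum(1 for src, dst in packets if NFS_PORT in (src, dst))
    return nfs / len(packets)


# Hypothetical sample: nine NFS packets and one packet on tinc's port 655.
sample = [(40001, 2049)] * 5 + [(2049, 40001)] * 4 + [(655, 655)]
print(f"NFS share: {nfs_share(sample):.0%}")
```

A high share on its own only says that NFS is what is being carried; it does not show whether the packets are retransmissions or fresh requests, which is why the capture alone did not settle the question.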
On Sat, Mar 10, 2012 at 08:29:42PM -0500, Nikolaus Rath wrote:

> The server has 4 relevant network interfaces:
>
> eth0 - Internet
> eth1 - Internal network
> hbt - Tinc VPN
> vnet0 - Virtual hosts
> br0 - Bridge between eth1 and vnet0
>
> The hbt, br0 (i.e., eth1 and vnet0) interfaces share a common
> 192.168.1.0/24 network. Proxy ARP is enabled for br0 and hbt.

Just out of curiosity, why do you use proxy ARP in this case instead of adding hbt to the bridge? I assume you are using Mode = switch?

> I am using tinc 1.0.16 on 3.0.0, 64bit Ubuntu 10.04 LTS on both server
> and client.
>
> The problem is that as soon as more than three tinc clients are
> accessing the NFS shares, any operations on the NFS mountpoints by the
> clients hang. On the clients, tinc then takes 100% CPU time. On the
> server, the tinc instance runs with about 20% load. When activating tinc
> debugging, tinc seems to be busy forwarding packets. I also ran a packet
> sniffer which showed me that 90% of the packets were NFS related, but I
> am not familiar enough with NFS to be able to tell anything from the
> packets themselves.

Can you show me a few lines of the debug messages when it is busy forwarding? Also, when this happens and full debugging is not enabled, run "tincd -n <netname> -kUSR2" on the server and on one of the clients that is using 100% CPU, and send me the node, edge and subnet lists that tinc logged. This allows me to see whether there could be a routing loop within tinc or whether there is another cause for this problem.

Also, please try out tinc 1.0.17, which has just been released, and check whether it runs into the same problem.

--
Met vriendelijke groet / with kind regards,
Guus Sliepen <guus at tinc-vpn.org>
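The "tincd -n <netname> -kUSR2" step amounts to sending SIGUSR2 to the running daemon, which then writes its node, edge and subnet lists to its log. A rough Python equivalent is sketched below for cases where the dump should be triggered from a script. The helper names are hypothetical, and the pidfile path is an assumption based on the tinc 1.0 default (/var/run/tinc.NETNAME.pid); it will differ if tincd was started with --pidfile.

```python
import os
import signal


def default_pidfile(netname):
    # Assumption: tinc 1.0 default pidfile location for the given netname.
    return f"/var/run/tinc.{netname}.pid"


def read_pid(path):
    """Read the daemon's PID from a pidfile containing a single integer."""
    with open(path) as fh:
        return int(fh.read().strip())


def request_dump(netname, pidfile=None):
    """Send SIGUSR2 to the tincd instance for `netname` and return its PID.

    Unix only; usually needs root, since tincd normally runs as root.
    """
    pid = read_pid(pidfile or default_pidfile(netname))
    os.kill(pid, signal.SIGUSR2)
    return pid
```

Calling `request_dump("myvpn")` (netname hypothetical) on the server and on an affected client should make both daemons log their lists, which by default end up in syslog.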
Nikolaus Rath <Nikolaus-BTH8mxji4b0 at public.gmane.org> writes:

> Hello,
>
> I am experiencing system hangs when running NFSv4 over a tinc VPN. I
> don't know if the problem is with NFS or tinc and would appreciate any
> suggestions on how to narrow down the culprit. Unfortunately I cannot
> simply run NFS directly over TCP -- the participating systems are
> connected only over an open network.

[...]

This issue has turned out to be an NFS problem, see
http://news.gmane.org/find-root.php?message_id=%3c878vj7x6mj.fsf%40vostro.rath.org%3e
for details.

However, I would still like to use tinc's switch mode instead of proxy ARP, but I couldn't quite get it to work yet (details in my previous email).

Best,

-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
On Mon, Mar 12, 2012 at 06:04:01PM -0400, Nikolaus Rath wrote:

> > I am experiencing system hangs when running NFSv4 over a tinc VPN.

[...]

> This issue has turned out to be an NFS problem, see
> http://news.gmane.org/find-root.php?message_id=%3c878vj7x6mj.fsf%40vostro.rath.org%3e
> for details.

Oh, that is interesting. I also use NFS background mounting from /etc/fstab on a lot of servers that may not have an IP address right away at boot time. I hope they will release a fix soon.

--
Met vriendelijke groet / with kind regards,
Guus Sliepen <guus at tinc-vpn.org>