Hi,

I've recently moved from OpenVPN to Tinc for the mesh setup. Everything works nicely except for some looping packet issues.

I have 12 servers around the United States, including a couple of moving clients. Every now and then the network is flooded with what I assume is looping packets. 200-300MB of bandwidth per node over 10-45 minutes (normally < 20 KB/s, idle). Most if not all nodes participate in the spike. Some nodes just have a spike on output, some only on input.

I see a bunch of these lines in the logs on all the servers with the spikes, each about one or two other nodes:

```
Got late or replayed packet from <NODE_NAME> (<INTERNAL_IPv6_ADDRESS>%<NETWORK_NAME> port 655), seqno 235266, last received 235410
```

What causes this and is there a way to prevent these spikes? I estimate an extra 4-5GB of extra bandwidth usage per node because of this.

Regards,
Mark Lopez
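To see which peers the "Got late or replayed packet" messages point at, the log lines can be tallied per node. The following is a minimal sketch, not part of tinc: the regular expression is modelled only on the single log line quoted above (other tinc versions may word the message differently), and the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in the message above.
# Assumption: node names contain no whitespace.
LATE_PACKET = re.compile(
    r"Got late or replayed packet from (?P<node>\S+) "
    r"\(.*?\), seqno (?P<seqno>\d+), last received (?P<last>\d+)"
)


def summarize(lines):
    """Return {node: (message_count, largest_gap)} for late/replayed packets.

    The gap is "last received" minus "seqno", i.e. how far behind the
    offending packet was when it arrived.
    """
    counts = Counter()
    largest_gap = {}
    for line in lines:
        match = LATE_PACKET.search(line)
        if match is None:
            continue  # not a late/replayed packet message
        node = match.group("node")
        gap = int(match.group("last")) - int(match.group("seqno"))
        counts[node] += 1
        largest_gap[node] = max(largest_gap.get(node, 0), gap)
    return {node: (counts[node], largest_gap[node]) for node in counts}
```

Passing it the lines of a syslog file (for example `summarize(open(path))`, where `path` is wherever tinc logs on the node) gives a per-peer count that can be compared between quiet periods and spikes.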
On Sun, Aug 03, 2014 at 11:02:14PM -0600, Mark Lopez wrote:

> I've recently moved from OpenVPN to Tinc for the mesh setup. Everything works nicely except for some looping packet issues.
>
> I have 12 servers around the United States, including a couple of moving clients. Every now and then the network is flooded with what I assume is looping packets. 200-300MB of bandwidth per node over 10-45 minutes (normally < 20 KB/s, idle). Most if not all nodes participate in the spike. Some nodes just have a spike on output, some only on input.
[...]
> What causes this and is there a way to prevent these spikes? I estimate an extra 4-5GB of extra bandwidth usage per node because of this.

Oh, that's quite a lot. First, which version of tinc are you using? Which Mode (router or switch) are you using? Are you bridging the VPN interface with the LAN interface on any of the nodes?

-- 
Met vriendelijke groet / with kind regards,
Guus Sliepen <guus at tinc-vpn.org>
> Oh, that's quite a lot. First, which version of tinc are you using?
> Which Mode (router or switch) are you using? Are you bridging the VPN
> interface with the LAN interface on any of the nodes?

Thanks for taking time to reply. I appreciate the time and experience that developers put into projects like Tinc.

I'm using Tinc 1.0.24 from the Utopic Ubuntu (14.10?) repositories on all the nodes (automated deployment, exact configs). The nodes are running a mix of 12.04 and 14.04 LTS Ubuntu. Each node is running in switch mode (forcing TAP support and automatic routing). There is no bridging between interfaces on any node.

The connection between some servers may not be the best. There might be a correlation between the time of the packet floods and when my monitoring server loses connection with a node. Whether the floods are due to network issues or the floods cause the issues is another story.

Some nodes constantly send larger amounts of packets over time to another node, resulting in the "Got late or replayed packet" message in the logs. A restart of the Tinc daemon on those nodes calms the network.

Side question: Is Tinc 1.1 stable (or predictable) for daily use? I'm wondering about upgrading. The extra stats that Tinc 1.1 could generate would be largely beneficial.

Thank you,
Mark Lopez
Thanks for the help.

> It should if you set ExperimentalProtocol = no in tinc.conf, but if you
> really want a stable VPN I suggest you keep running 1.0.x for now.

I've upgraded to 1.1pre9. For some reason 1.1pre10 was crashing on Ubuntu 12.04 LTS systems (encryption issues?). The binary for Windows also had encryption issues on pre10. Pre9 seems stable, although every now and then I experience a cascading crash of all the daemons on the network. I had to create a watchdog script to restart tinc. The nodes couldn't route incoming packets unless I pinged all the nodes first, so the watchdog script also pings each node on the network after a crash. ExperimentalProtocol was disabled.

> It would be helpful if you could provide us with a dump of the nodes and
> edges of the VPN, both when it is working properly and when it is
> looping. You can tell tinc to dump this information to the syslog by
> running:

I've dumped the nodes to a pastebin using the following commands:

> tinc -n <netname> dump nodes
> tinc -n <netname> dump edges

http://pastebin.com/raw.php?i=XMVyQgp9

I've included both the dump at normal operations and the dump during the spike. It took some time to catch the spikes in the act. With the upgrade to 1.1 I've only seen two spikes in the past month (they lasted minutes) - improvement?

Regards,
Mark Lopez
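The watchdog described in the message above (restart tinc after a crash, then ping every node so routing recovers) could look roughly like the sketch below. It is not the script from the thread: the node addresses, the `pgrep` liveness check, the `service tinc restart` command and the `check_once` helper are all assumptions chosen for illustration, and the `ping` flags are those of Linux iputils.

```python
import subprocess
import time

# Everything below is a placeholder, not taken from the thread.
NODE_ADDRESSES = ["10.0.0.1", "10.0.0.2"]     # hypothetical VPN addresses
CHECK_CMD = ["pgrep", "-x", "tincd"]          # assumed liveness check
RESTART_CMD = ["service", "tinc", "restart"]  # assumed init-script name


def run(cmd):
    """Run a command quietly; return True if it exited with status 0."""
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def check_once(run=run, nodes=NODE_ADDRESSES, settle=5, sleep=time.sleep):
    """Restart tinc if no tincd process is found, then ping every node once.

    Returns None when tincd was already running, otherwise the list of
    nodes that did not answer the ping after the restart.
    """
    if run(CHECK_CMD):
        return None
    run(RESTART_CMD)
    sleep(settle)  # give the daemon a moment to reconnect
    # One echo request per node, two-second timeout (Linux iputils flags).
    return [n for n in nodes if not run(["ping", "-c", "1", "-W", "2", n])]
```

Calling `check_once()` from cron or a simple loop every minute or so would approximate the behaviour described; the `run` and `sleep` parameters exist only so the logic can be exercised without touching a real daemon.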