On Wed, Dec 28, 2016 at 05:34:59PM +0200, Tuomas Silen wrote:
> We have a tinc network of about 200 hosts and in the full mesh
> configuration we've had a lot of problems with edge propagation storms
> taking the entire network down. Recently we had a setup with a small
> number of "hubs" to which all the other nodes connected, which limited
> the number of meta connections, but that didn't help much with the edge
> propagation issues.
In principle you only need a small number of meta connections to form the
initial graph; tinc will automatically create a full mesh using UDP
connections on demand.

With 200 hosts, a full mesh of meta connections is indeed problematic.
However, if you only have 2 or 3 meta connections per host on average, it
should work fine. Of course, the network connection of each host should
still be able to handle the amount of meta-data being sent around,
otherwise their connections will time out and they will be disconnected.
Such an event in turn will generate some meta-data...
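
For example, a minimal sketch of such a sparse setup (the node names below
are placeholders, not your actual configuration):

  # tinc.conf on an ordinary node: only two meta connections
  Name = node42
  ConnectTo = hub1
  ConnectTo = hub2

  # No ConnectTo lines to the other ~197 nodes are needed; as long as the
  # meta graph stays connected, every node learns all edges and subnets
  # through it and sets up direct UDP tunnels to its peers on demand.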
> Now we moved to using the TunnelServer mode where we define all the
> necessary ConnectTos (on one side of the tunnel), which at least solves the
> propagation issues.
>
> There are a couple of servers that most of the other nodes still need to
> connect to, and with TunnelServer mode we noticed that the throughput on
> those servers dropped to less than half of what it used to be (from over
> 600 Mbps to ~250 Mbps), probably mainly caused by the CPU core running
> tinc being saturated much earlier.
The main problem with TunnelServer is that nodes no longer receive
meta-data about other nodes. This means they don't know how to send data
directly to each other; instead, everything has to go via the nodes that
have TunnelServer enabled. This might explain the reduction in bandwidth
you are seeing.
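
To illustrate the data path (with hypothetical nodes "spokeA" and "spokeB",
both connecting to a node "hub"):

  TunnelServer = yes on hub:  spokeA -> hub -> spokeB  (hub forwards all
                                                        payload traffic)
  TunnelServer = no on hub:   spokeA -> spokeB         (direct UDP tunnel,
                                                        once edges are known)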
> Any ideas why that is? The server in question has about 135 meta
> connections and when we reduced that to ~50 or so the throughput started to
> increase back to normal. Is the TunnelServer mode somehow very expensive or
> is it just the number of meta connections that's the problem?
I don't know. Tinc itself should handle 200 nodes just fine, even
running on underpowered hardware like simple broadband routers. I would
try to figure out which node(s) are the cause of the edge propagation
storms you are seeing.
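
If you are running the 1.1 branch, its control CLI makes this fairly easy
to inspect (the commands below assume tinc 1.1 and a netname of "myvpn";
adjust as needed):

  # temporarily raise the log level and watch for ADD_EDGE/DEL_EDGE churn
  tinc -n myvpn debug 4
  tinc -n myvpn log

  # dump the current edge list / graph to see which nodes keep flapping
  tinc -n myvpn dump edges
  tinc -n myvpn dump graph

On 1.0.x you can get similar information by increasing the debug level of
the running tincd and watching syslog for nodes whose edges are repeatedly
added and removed.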
> The common settings for every host in tinc.conf (just BindToAddress and
> Name are host specific):
>
> AddressFamily = ipv4
> Forwarding = internal
> DirectOnly = no
> Device = /dev/net/tun
> MinTimeout = 2
> MaxTimeout = 300
> PingTimeout = 90
> TunnelServer = yes
> Broadcast = no
>
> hosts configurations:
>
> Port = 655
> Compression = 0
> Cipher = aes-128-cbc
> IndirectData = no
That looks fine.
--
Met vriendelijke groet / with kind regards,
Guus Sliepen <guus at tinc-vpn.org>