thr3ads.net - tinc - Metadata flooding [Jun 2016]

Hendrik Schumacher

2016-Jun-21 11:04 UTC

Metadata flooding

Hi,

we use a tinc network of about 400 nodes, all of them Linux servers, partly
in different datacenters (but generally low latency). Usually this is
working very well (for weeks without a problem).
From time to time the whole network goes down though. This happened when we
restarted a larger number of servers or when there was a connectivity issue
between datacenters or some (short) maintenance on the network
infrastructure. The problem was already described in the mailing list (for
example here:
https://www.tinc-vpn.org/pipermail/tinc/2015-December/004325.html , we see
the same messages in our logs as described there). We try to avoid
situations where a large number of servers becomes unavailable but from
time to time it just happens. For us it would be important that tinc
continues working with the hosts that are still reachable and that it
recovers itself and we do not have to stop and start the whole network
manually.

We already tried to tweak the configuration to limit the amount of metadata
by only having 3 ConnectTo hosts (the same ones everywhere) and using

Broadcast = no
DirectOnly = yes
Cipher=aes-128-cbc

(Apart from Name, AddressFamily, BindToAddress, Interface and ConnectTo,
those are the only settings we use in tinc.conf.)

We are also going to increase PingTimeout to 30 and reduce the number of
ConnectTo hosts to 2.

Is there anything else we can do to limit the amount of metadata (as that
seems to be the reason why tinc just stops working and only produces log
messages about failed connection attempts)?

Ideally we would not need any metadata updates at all (apart from key
updates) since each host can connect to every other host and all the host
config files are available everywhere locally.

We also thought about using TunnelServer = yes, would this help? Does it
make sense to somehow group ConnectTo hosts (so use two ConnectTo servers
for one host group, another two for another host group and let the
ConnectTo servers connect to each other)?

Thank you for any help with this!

Hendrik

Guus Sliepen

2016-Jun-21 12:35 UTC

Metadata flooding

On Tue, Jun 21, 2016 at 01:04:31PM +0200, Hendrik Schumacher wrote:
> From time to time the whole network goes down though. This happened when we
> restarted a larger number of servers or when there was a connectivity issue
> between datacenters or some (short) maintenance on the network
> infrastructure. The problem was already described in the mailing list (for
> example here:
> https://www.tinc-vpn.org/pipermail/tinc/2015-December/004325.html , we see
> the same messages in our logs as described there).
[...]
> We already tried to tweak the configuration to limit the amount of metadata
> by only having 3 ConnectTo hosts (the same ones everywhere) and using
> 
> Broadcast = no
> DirectOnly = yes
> Cipher=aes-128-cbc
These options do not directly affect metadata. In particular,
"DirectOnly = yes" may actually cause nodes to be less reachable than
without that option.
> We are also going to increase PingTimeout to 30 and reduce the number of
> ConnectTo hosts to 2.
Increasing PingTimeout will probably help. As for the ConnectTo hosts:
reducing the number will also reduce the amount of metadata traffic
proportionally. However, in your case, with 400 nodes connecting to the
same 3 central nodes, you might have to look at the amount of metadata
that each node handles. It would be better to have more central nodes,
and have leaf nodes only connect to a few of them.
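
A minimal sketch of what a leaf node's tinc.conf could look like under that
layout. The node names (leaf_a017, hub_a1, hub_a2) are hypothetical and not
taken from this thread; PingTimeout = 30 is the value mentioned above. It
assumes each leaf is assigned two hubs out of a larger pool of central nodes.

```conf
# tinc.conf on one leaf node (all node names are hypothetical)
Name = leaf_a017

# Connect only to the two central nodes assigned to this leaf,
# not to every central node in the network.
ConnectTo = hub_a1
ConnectTo = hub_a2

# Raised timeout, as proposed earlier in the thread.
PingTimeout = 30
```

Other leaves would list a different pair of hubs, so that no single central
node has to carry the metaconnections of all 400 nodes.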
> Is there anything else we can do to limit the amount of metadata (as that
> seems to be the reason why tinc just stops working and only produces log
> messages about failed connection attempts)?
Not really.
> Ideally we would not need any metadata updates at all (apart from key
> updates) since each host can connect to every other host and all the host
> config files are available everywhere locally.
In that case, you might want to have a look at the tinc 1.1 prerelease,
remove the ConnectTo's and enable the AutoConnect feature. This will let
tinc make metaconnections automatically in a more distributed way. It
will also switch metaconnections to different nodes in case the ones it
is connecting to fail.
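
As a rough sketch of that suggestion, a tinc 1.1 node's tinc.conf might be
reduced to something like the following. The node name is hypothetical, and
this assumes the host config files of the other nodes (with their Address
entries) are present locally, as described in the question.

```conf
# tinc.conf on a tinc 1.1 node (node name is hypothetical)
Name = node042

# No ConnectTo lines: let tinc choose its metaconnections itself.
AutoConnect = yes
```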
> We also thought about using TunnelServer = yes, would this help?
That might help, but then you lose most of the peer-to-peer
connectivity. The reason is that tinc nodes actually only look at their
host config files when making metaconnections. With TunnelServer = yes,
nodes will only learn about those servers, but don't learn about all the
other nodes, and then they will act like they don't exist.
> Does it make sense to somehow group ConnectTo hosts (so use two
> ConnectTo servers for one host group, another two for another host
> group and let the ConnectTo servers connect to each other)?
That will probably help.
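
One way such a grouping could look on a central node, as a sketch only. The
names hub_a1, hub_a2, hub_b1 and hub_b2 are hypothetical; the assumption is
two host groups, A and B, each served by two ConnectTo servers.

```conf
# tinc.conf on hub_a1, one of the two ConnectTo servers for group A
# (all node names are hypothetical)
Name = hub_a1

# Link to the other server of the same group ...
ConnectTo = hub_a2

# ... and to the servers of group B, so the groups stay connected.
ConnectTo = hub_b1
ConnectTo = hub_b2
```

Hosts in group A would then list only hub_a1 and hub_a2 as ConnectTo, and
hosts in group B only hub_b1 and hub_b2.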

-- 
Met vriendelijke groet / with kind regards,
     Guus Sliepen <guus at tinc-vpn.org>

Hendrik Schumacher

2016-Jun-22 15:59 UTC

Metadata flooding

Thank you for the helpful advice. We will try to group the servers with
different ConnectTo servers first. If this does not help we will look at
the TunnelServer solution. Just to make sure we understand TunnelServer
correctly: do you need to specify every host as ConnectTo that the host
should be able to communicate with or is it sufficient to just provide the
hosts files?

Thanks, Hendrik

