Henning Verbeek
2014-Apr-15 09:09 UTC
tinc 1.1pre10 slower than tinc 1.0, experimentalProtocol even more
Hi there,

we're using tinc to mesh together hosts in a public datacenter (instead of using a private VLAN, sort of). All hosts are reasonably modern; connections are low latency with an available bandwidth of around 500Mbit/s or 1Gbit/s (depending on how close they are to each other). Iperf between two nodes directly reports around 940Mbit/s. The CPUs are Intel(R) Core(TM) i7-4770 @ 3.40GHz, the hosts are running Debian wheezy (kernel 3.2.54-2), and the aesni module is loaded. OpenSSL is at 1.0.1e.

On tinc 1.0, with cipher aes-128-cbc and digest sha1, I managed to get around 900Mbit/s throughput (iperf, simplex, both tinc daemons at 90% CPU utilisation). Switching to a null cipher gave only marginally better throughput. Note, this is from memory; I can't find my notes, unfortunately.

Reading about tinc 1.1, specifically the use of a GCM cipher suite to reduce the cost of HMAC, we decided to test it. The results are quite surprising:

* using the experimental protocol, throughput falls to around 380Mbit/s; both tincd are at or just above 100% CPU (the host has 8 cores)
* not specifying ECDSA keys, throughput is around 600Mbit/s; both tincd are at around 90-95% CPU

sptps_speed reports:

    Generating keys for 10 seconds: 5200.94 op/s
    ECDSA sign for 10 seconds: 3710.72 op/s
    ECDSA verify for 10 seconds: 1734.05 op/s
    ECDH for 10 seconds: 1449.40 op/s
    SPTPS/TCP authenticate for 10 seconds: 641.37 op/s
    SPTPS/TCP transmit for 10 seconds: 8.45 Gbit/s
    SPTPS/UDP authenticate for 10 seconds: 640.18 op/s
    SPTPS/UDP transmit for 10 seconds: 8.64 Gbit/s

Note: the two nodes are connected via two separate tinc mesh networks (since they participate in different network groups). Two tincds are running, sharing the same UDP port; one is configured with ECDSA keys, the other without. Not sure if that is causing a problem.

Are these findings expected? Or are we doing something wrong, missing something? Is there any way to get close to raw network throughput?

Thanks a lot!

Best regards,
Henning
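For reference, a minimal sketch of the kind of configuration used in these tests (the netname "mesh", node names, addresses and subnets are placeholders, not our actual values):

    # /etc/tinc/mesh/tinc.conf (tinc 1.0)
    Name = node_a
    ConnectTo = node_b

    # /etc/tinc/mesh/hosts/node_a -- this node's own host file; Cipher and
    # Digest select the symmetric cipher and the digest used for UDP packets
    Address = 192.0.2.1          # hypothetical public address
    Subnet = 10.0.0.1/32         # hypothetical VPN address
    Cipher = aes-128-cbc
    Digest = sha1

    # For the tinc 1.1 tests, the new protocol is enabled in tinc.conf:
    ExperimentalProtocol = yes

Throughput was then measured with iperf across the tunnel, roughly like this (10.0.0.2 being node_b's hypothetical VPN address):

    node_b$ iperf -s
    node_a$ iperf -c 10.0.0.2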
Guus Sliepen
2014-Apr-15 12:18 UTC
tinc 1.1pre10 slower than tinc 1.0, experimentalProtocol even more
On Tue, Apr 15, 2014 at 11:09:09AM +0200, Henning Verbeek wrote:

> On tinc 1.0, with cipher aes-128-cbc and digest sha1, I managed to get
> around 900Mbit/s throughput

[...]

> Reading about tinc 1.1, specifically the use of a GCM cipher suite to
> reduce the cost of HMAC, we decided to test it. The results are quite
> surprising:
> * using the experimental protocol, throughput falls to around 380Mbit/s;
> both tincd are at or just above 100% CPU (the host has 8 cores)

I'm not too surprised about that; although the new protocol itself should have a much higher throughput, it does not always use direct communication via UDP yet, so it could be that the traffic is going via TCP, which will limit the throughput considerably.

> * not specifying ECDSA keys, throughput is around 600Mbit/s; both tincd
> are at around 90-95% CPU

That is cause for concern, since it should then just use the old protocol with exactly the same behaviour as tinc 1.0. There are, however, some other changes in tinc 1.1 which might cause throughput to be lowered; I'll have a look at that.

> sptps_speed reports:
> Generating keys for 10 seconds: 5200.94 op/s
> ECDSA sign for 10 seconds: 3710.72 op/s
> ECDSA verify for 10 seconds: 1734.05 op/s
> ECDH for 10 seconds: 1449.40 op/s
> SPTPS/TCP authenticate for 10 seconds: 641.37 op/s
> SPTPS/TCP transmit for 10 seconds: 8.45 Gbit/s
> SPTPS/UDP authenticate for 10 seconds: 640.18 op/s
> SPTPS/UDP transmit for 10 seconds: 8.64 Gbit/s

That's the fastest I've seen so far, but not unexpected of one of the fastest Haswell processors available. Note that 1.1pre11 will not use AES-GCM anymore, but will instead use ChaCha20-Poly1305, which will lower the throughput on your processor to about 2 Gbit/s (you can try out the latest commit from the 1.1 branch of tinc's git repository and run sptps_speed again). However, it will be much faster on processors which do not have AES-NI.

The sptps_speed program tests the performance of the new protocol in a semi-realistic way: the last four values are measured by making a real connection between two "nodes" and transmitting data in packets of the same size as tinc itself would use. However, it does not include any overhead or latencies caused by a real network, nor does it simulate the overhead and latencies caused by having to read and write packets from/to the TUN/TAP device.

> Note: the two nodes are connected via two separate tinc mesh networks
> (since they participate in different network groups). Two tincds are
> running, sharing the same UDP port; one is configured with ECDSA keys, the
> other without. Not sure if that is causing a problem.

Uh, are you sure they are running on the same UDP port? Normally, only one tinc daemon can use a given port. If you did manage to run them on the same port, then it would indeed cause problems.

> Are these findings expected? Or are we doing something wrong, missing
> something? Is there any way to get close to raw network throughput?

I do hope that when all issues have been resolved and tinc 1.1.0 can be released, actual throughput will be much closer to the throughput measured by sptps_speed. Also, at the moment both tinc and sptps_speed are single-threaded, so on a multi-core machine the throughput could in principle be multiplied by the number of cores; however, that only makes sense if the encryption and authentication themselves are the bottleneck.

-- 
Met vriendelijke groet / with kind regards,

Guus Sliepen <guus at tinc-vpn.org>
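Regarding the shared UDP port: a minimal sketch of what giving each daemon its own listening port could look like, using the Port variable in each netname's own host file (the netnames "meshA"/"meshB", node name and port numbers are just examples):

    # /etc/tinc/meshA/hosts/node_a
    Address = 192.0.2.1          # hypothetical public address
    Port = 655

    # /etc/tinc/meshB/hosts/node_a
    Address = 192.0.2.1
    Port = 656

With distinct ports, the two daemons each listen on their own TCP and UDP sockets and no longer contend for the same port.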