John-Mark Gurney
2015-Mar-18 21:14 UTC
35-40% performance drop releng9 vs releng10 openvpn
Mike Tancsa wrote this message on Wed, Mar 18, 2015 at 15:49 -0400:
> On 3/16/2015 9:20 AM, John-Mark Gurney wrote:
> >
> > Since you have at test framework ready, you could generate some flame
> > graphs[1] using dtrace to help see where things might be having an
> > impact...
> >
> > These are very easy to generate, and posting them would be useful...
> >
> > [1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
>
>
> Hi,
> I went through the steps to generate one. What args should I use for
> dtrace to generate the information that is helpful / useful ? For my
> setup, I have
>
> server1---------apu-------------server2
>
> server1 has an openvpn tunnel to the apu
> I route server2's IP address across the VPN tunnel, so if I ping from
> server1 to server2's IP, it goes via the tunnel
>
> on the
>
> dtrace -x ustackframes=100 -n 'profile-99 /execname == "openvpn" &&
> arg1/ { @[ustack()] = count(); } tick-30s { exit(0); }' -o 10.stacks
>
> which generated
> http://tancsa.com/10.svg

So, I would first identify the machine w/ the cpu limited load.. I
assume that is apu... Then I would look at where most of the cpu time
is being spent, be it openvpn itself, or in the kernel... Most likely
it is the kernel, so getting stacks from the kernel would be more useful
than the one you generated... Use the command:
# dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' -o out.kern_stacks

Also, another thing you can do is to compare the two using differential
flame graphs:
http://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html

Which will highlight where the performances differ...

As I've never used OpenVPN before and their docs don't go into saying
what it's using.. Is OpenVPN a kernel or userland VPN? Do they use
IPSec in the kernel? or are they just using UDP or TCP for their
connections?

-- 
John-Mark Gurney                Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
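The dtrace commands above write an aggregation of stacks and sample counts, which has to be collapsed to one line per stack before a flame graph can be drawn. The following is only a rough Python sketch of that folding step, not the FlameGraph scripts themselves. It assumes the usual dtrace layout of one frame per line, leaf first, followed by a line holding just the count. The function name and the sample frames are made up for illustration.

```python
import re
from collections import Counter


def fold_dtrace_stacks(text):
    """Collapse dtrace stack()/ustack() aggregation text into folded stacks.

    Returns a Counter mapping "root;...;leaf" to its sample count.
    Simplification: any line made only of digits is treated as a count.
    """
    folded = Counter()
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.isdigit():
            if frames:
                # dtrace prints the leaf frame first; flame graphs want root first.
                folded[";".join(reversed(frames))] += int(line)
            frames = []
        else:
            # Drop the "+0x..." offset so samples in the same function merge.
            frames.append(re.sub(r"\+0x[0-9a-fA-F]+$", "", line))
    return folded


# Hypothetical sample in the assumed layout (not real profiling data).
sample = """
              kernel`leaf_a+0x10
              kernel`mid+0x20
              kernel`root+0x9a
               12

              kernel`leaf_b+0x16
              kernel`root+0x9a
                3

              kernel`leaf_a+0x44
              kernel`mid+0x20
              kernel`root+0x9a
                5
"""

if __name__ == "__main__":
    for stack, count in sorted(fold_dtrace_stacks(sample).items()):
        print(stack, count)
```

Each output line is a semicolon-joined stack followed by its count. The two leaf_a samples merge because their offsets are stripped.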
On Mar 18, 2015, at 2:14 PM, John-Mark Gurney <jmg at funkthat.com> wrote:
> As I've never used OpenVPN before and their docs don't go into saying
> what it's using.. Is OpenVPN a kernel or userland VPN? Do they use
> IPSec in the kernel? or are they just using UDP or TCP for their
> connections?

OpenVPN runs in userland; it uses OpenSSL to create either a layer-2 or
layer-3 tunnel via either UDP or TCP.

http://en.wikipedia.org/wiki/OpenVPN has more details.

Regards,
-- 
-Chuck
On 3/18/2015 5:14 PM, John-Mark Gurney wrote:
>
> So, I would first identify the machine w/ the cpu limited load.. I
> assume that is apu...

Yup, the APU. The machines on either side are significantly faster.

> Then I would look at where most of the cpu time
> is being spent, be it openvpn itself, or in the kernel... Most likely
> it is the kernel, so getting stacks from the kernel would be more useful
> than the one you generated... Use the command:
> # dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' -o out.kern_stacks
>
> Also, another thing you can do is to compare the two using differential
> flame graphs:
> http://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html
>
> Which will highlight where the performances differ...
>

Will do, I will work on those.

> As I've never used OpenVPN before and their docs don't go into saying
> what it's using.. Is OpenVPN a kernel or userland VPN? Do they use
> IPSec in the kernel? or are they just using UDP or TCP for their
> connections?

All in userland. I use UDP for the transport, and it uses OpenSSL in the
base for the crypto. In this case, AES-128-CBC. There is no hardware
assist on the APU either to offload the AES.

    ---Mike

-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike at sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
On 3/18/2015 5:14 PM, John-Mark Gurney wrote:
> # dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-60s { exit(0); }' -o out.kern_stacks
>
> Also, another thing you can do is to compare the two using differential
> flame graphs:
> http://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html
>
> Which will highlight where the performances differ...

OK, some more data points. It seems a performance regression happened in
RELENG_10 somewhere between r277684 (late January 2015) and now. Using
r277684 on RELENG_10, I can get about 75Mb/s of throughput on OpenVPN.
Still not as good as the 83-85Mb/s on RELENG_9, but much better than the
61Mb/s using RELENG_10 from the start of this week.

For the differential graph, see

http://tancsa.com/diffgraph.svg

and
http://tancsa.com/10-r277684.svg
http://tancsa.com/10-r277684-kern.svg

    ---Mike

-- 
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike at sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
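To put the throughput figures in this message side by side, here is a small Python sketch that turns them into relative drops. It uses only the numbers quoted above (83-85 Mb/s on RELENG_9, about 75 Mb/s at r277684, 61 Mb/s on current RELENG_10). The helper name is an illustrative choice, and the results are only as precise as those rounded figures.

```python
def percent_drop(baseline, measured):
    """Relative throughput loss of `measured` against `baseline`, in percent."""
    return (baseline - measured) / baseline * 100.0


# Throughput figures in Mb/s, taken from the message above.
RELENG9_LOW, RELENG9_HIGH = 83.0, 85.0
RELENG10_R277684 = 75.0
RELENG10_CURRENT = 61.0

if __name__ == "__main__":
    for base in (RELENG9_LOW, RELENG9_HIGH):
        print(f"RELENG_9 at {base:.0f} Mb/s:")
        print(f"  r277684 is {percent_drop(base, RELENG10_R277684):.1f}% slower")
        print(f"  current RELENG_10 is {percent_drop(base, RELENG10_CURRENT):.1f}% slower")
    print(f"r277684 -> current RELENG_10: "
          f"{percent_drop(RELENG10_R277684, RELENG10_CURRENT):.1f}% slower")
```

On these figures, current RELENG_10 comes out roughly 27 to 28 percent below RELENG_9, and most of that gap opens up after r277684.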