Jelle de Jong
2020-Apr-04 18:02 UTC
how to pick cipher for AES-NI enabled AMD GX-412TC SOC tincd at 100% CPU
Hello everybody,

First, a big thanks for tinc-vpn; I am still using it next to WireGuard and OpenVPN.

I have a setup where the tinc Debian appliance is at 100% CPU load while doing about 7.5 MB/s.

Compression = 9
PMTU = 1400
PMTUDiscovery = yes
Cipher = aes-128-cbc

How can I pick the cipher that is fastest for my CPU, so that it doesn't create a CPU bottleneck at 100%?

Kind regards,

Jelle de Jong

root@officelink01:~# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       40 bits physical, 48 bits virtual
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          22
Model:               48
Model name:          AMD GX-412TC SOC
Stepping:            1
CPU MHz:             775.729
CPU max MHz:         1000.0000
CPU min MHz:         600.0000
BogoMIPS:            1996.08
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            2048K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good acc_power nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt topoext perfctr_nb bpext ptsc perfctr_llc cpb hw_pstate ssbd vmmcall bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale flushbyasid decodeassists pausefilter pfthreshold overflow_recov

root@officelink01:~# openssl help
Standard commands
asn1parse         ca                ciphers           cms
crl               crl2pkcs7         dgst              dhparam
dsa               dsaparam          ec                ecparam
enc               engine            errstr            gendsa
genpkey           genrsa            help              list
nseq              ocsp              passwd            pkcs12
pkcs7             pkcs8             pkey              pkeyparam
pkeyutl           prime             rand              rehash
req               rsa               rsautl            s_client
s_server          s_time            sess_id           smime
speed             spkac             srp               storeutl
ts                verify            version           x509

Message Digest commands (see the `dgst' command for more details)
blake2b512        blake2s256        gost              md4
md5               rmd160            sha1              sha224
sha256            sha3-224          sha3-256          sha3-384
sha3-512          sha384            sha512            sha512-224
sha512-256        shake128          shake256          sm3

Cipher commands (see the `enc' command for more details)
aes-128-cbc       aes-128-ecb       aes-192-cbc       aes-192-ecb
aes-256-cbc       aes-256-ecb       aria-128-cbc      aria-128-cfb
aria-128-cfb1     aria-128-cfb8     aria-128-ctr      aria-128-ecb
aria-128-ofb      aria-192-cbc      aria-192-cfb      aria-192-cfb1
aria-192-cfb8     aria-192-ctr      aria-192-ecb      aria-192-ofb
aria-256-cbc      aria-256-cfb      aria-256-cfb1     aria-256-cfb8
aria-256-ctr      aria-256-ecb      aria-256-ofb      base64
bf                bf-cbc            bf-cfb            bf-ecb
bf-ofb            camellia-128-cbc  camellia-128-ecb  camellia-192-cbc
camellia-192-ecb  camellia-256-cbc  camellia-256-ecb  cast
cast-cbc          cast5-cbc         cast5-cfb         cast5-ecb
cast5-ofb         des               des-cbc           des-cfb
des-ecb           des-ede           des-ede-cbc       des-ede-cfb
des-ede-ofb       des-ede3          des-ede3-cbc      des-ede3-cfb
des-ede3-ofb      des-ofb           des3              desx
rc2               rc2-40-cbc        rc2-64-cbc        rc2-cbc
rc2-cfb           rc2-ecb           rc2-ofb           rc4
rc4-40            seed              seed-cbc          seed-cfb
seed-ecb          seed-ofb          sm4-cbc           sm4-cfb
sm4-ctr           sm4-ecb           sm4-ofb

root@officelink01:~# openssl speed -elapsed -evp aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 13905799 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 6572120 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 2254183 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 623111 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 80058 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 16384 size blocks: 40180 aes-128-cbc's in 3.00s
OpenSSL 1.1.1d  10 Sep 2019
built on: Sat Oct 12 19:56:43 2019 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-YwazYa/openssl-1.1.1d=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      74164.26k   140205.23k   192356.95k   212688.55k   218611.71k   219436.37k

root@officelink01:~# openssl speed -elapsed -evp aes-256-cbc
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 12322268 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 5283431 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 1686231 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 454425 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 58092 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 29035 aes-256-cbc's in 3.00s
OpenSSL 1.1.1d  10 Sep 2019
built on: Sat Oct 12 19:56:43 2019 UTC
options:bn(64,64) rc4(8x,int) des(int) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-YwazYa/openssl-1.1.1d=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc      65718.76k   112713.19k   143891.71k   155110.40k   158629.89k   158569.81k
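To put the benchmark figures next to the reported tunnel throughput, the two summary rows can be parsed and compared with the 7.5 MB/s figure. This is only a rough sketch: the helper names are made up for illustration, 1 MB is taken as 10**6 bytes, and it assumes that the single-threaded `openssl speed` rate at 1024-byte blocks is a fair stand-in for what tincd could achieve on packets of about 1400 bytes.

```python
# Sketch: relate the `openssl speed` summary rows to the observed tunnel rate.
# The row strings are copied from the output in this post; the helper names
# are illustrative and not part of tinc or OpenSSL.

BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]  # column order in the table


def parse_speed_row(row: str) -> dict:
    """Return {block_size_bytes: bytes_per_second} for one summary row."""
    fields = row.split()
    # fields[0] is the cipher name; the rest look like "212688.55k",
    # where "k" means 1000s of bytes per second.
    values = [float(f.rstrip("k")) * 1000 for f in fields[1:]]
    return dict(zip(BLOCK_SIZES, values))


def cipher_share(tunnel_bytes_per_s: float, cipher_bytes_per_s: float) -> float:
    """Fraction of one core the cipher alone would need at the tunnel rate."""
    return tunnel_bytes_per_s / cipher_bytes_per_s


aes128 = parse_speed_row(
    "aes-128-cbc 74164.26k 140205.23k 192356.95k 212688.55k 218611.71k 219436.37k"
)
aes256 = parse_speed_row(
    "aes-256-cbc 65718.76k 112713.19k 143891.71k 155110.40k 158629.89k 158569.81k"
)

TUNNEL_RATE = 7.5e6  # 7.5 MB/s as reported; 1 MB = 10**6 bytes is an assumption

for name, table in (("aes-128-cbc", aes128), ("aes-256-cbc", aes256)):
    share = cipher_share(TUNNEL_RATE, table[1024])
    print(f"{name}: {table[1024] / 1e6:.1f} MB/s at 1024-byte blocks, "
          f"about {share:.1%} of one core at 7.5 MB/s")
```

Under these assumptions the cipher alone would account for only a few percent of one core, which hints that something other than AES (for example Compression = 9 or per-packet overhead) may be using most of the CPU. The benchmark measures encryption in a tight loop only, so this is an indication rather than proof.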
Fufu Fang
2020-Apr-04 18:08 UTC
how to pick cipher for AES-NI enabled AMD GX-412TC SOC tincd at 100% CPU
I basically ended up using the same cipher suite as WireGuard. It works quite well on my Atom N2800, which does not have AES-NI; it is now 3 times as fast.

Cipher = chacha20-poly1305
Digest = blake2b512
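One way to check a suggestion like this on one's own hardware is to repeat the `openssl speed -elapsed -evp` benchmark from the original post for each candidate cipher. The sketch below only wraps that command; the function names are hypothetical, and it assumes the summary row starts with the cipher name, as it does in the aes-128-cbc output quoted in this thread. Whether the local OpenSSL build accepts a given name for `-evp`, and whether tinc accepts it for `Cipher`, are separate questions this sketch does not answer.

```python
# Sketch: run the same benchmark command used in the thread for any cipher.
# Function names are illustrative; nothing here is part of tinc.
import shutil
import subprocess


def speed_command(cipher: str) -> list:
    """Build the benchmark command for one cipher (same flags as in the thread)."""
    return ["openssl", "speed", "-elapsed", "-evp", cipher]


def summary_row(output: str, cipher: str) -> str:
    """Pick the final throughput row for `cipher` out of the command output.

    Assumes the row begins with the cipher name; returns "" if none is found.
    """
    rows = [line for line in output.splitlines() if line.startswith(cipher)]
    return rows[-1] if rows else ""


def run_speed(cipher: str) -> str:
    """Run the benchmark (this takes a while) and return its summary row."""
    if shutil.which("openssl") is None:
        raise RuntimeError("openssl binary not found on PATH")
    result = subprocess.run(
        speed_command(cipher), capture_output=True, text=True, check=True
    )
    return summary_row(result.stdout, cipher)
```

Calling `run_speed("aes-128-cbc")` and `run_speed("chacha20-poly1305")` on the same machine would give two rows that can be compared column by column.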
Jelle de Jong
2020-Apr-04 19:33 UTC
how to pick cipher for AES-NI enabled AMD GX-412TC SOC tincd at 100% CPU
Hello everybody,

Thank you, Fufu Fang, for your quick reply.

With tinc version 1.0.35 and the options below, I get about 10 MB/s at 100% CPU load:

PMTU = 1400
PMTUDiscovery = yes
#Cipher = none
Cipher = chacha20-poly1305
Digest = blake2b512

I tried Cipher = none as well and also got 10 MB/s, with 100% CPU on one thread while the other three available threads are idle.

With tinc_1.1~pre17-1.1_amd64.deb and libssl1.1:amd64 1.1.1d-0+deb10u2 I get the following error:

Apr 04 19:03:19 officelink01 tincd[522]: Error while decrypting: error:060A7094:digital envelope routines:EVP_EncryptUpdate:invalid operation

Installation steps:

wget http://ftp.nl.debian.org/debian/pool/main/t/tinc/tinc_1.1~pre17-1.1_amd64.deb
dpkg -i tinc_1.1~pre17-1.1_amd64.deb
apt-get -f install

Any ideas for improving the speed?

Kind regards,

Jelle
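Since Cipher = none gave the same 10 MB/s, it may help to look at the figures per packet rather than per byte. The following is a back-of-the-envelope sketch, not a measurement: it assumes 1 MB = 10**6 bytes, that every packet carries the full 1400 bytes set by PMTU, and that the busy core runs at the 1000 MHz maximum reported by lscpu earlier in the thread.

```python
# Sketch: per-packet CPU budget implied by the figures in this message.
# All three constants rest on the assumptions stated above.

THROUGHPUT_BYTES_PER_S = 10e6   # about 10 MB/s at 100% of one core
PACKET_BYTES = 1400             # PMTU = 1400
CORE_HZ = 1000e6                # "CPU max MHz: 1000.0000" from lscpu


def packets_per_second(throughput: float, packet_bytes: int) -> float:
    """Packets handled per second if every packet is full-sized."""
    return throughput / packet_bytes


def cycles_per_packet(core_hz: float, pps: float) -> float:
    """CPU cycles one saturated core spends per packet."""
    return core_hz / pps


pps = packets_per_second(THROUGHPUT_BYTES_PER_S, PACKET_BYTES)
budget = cycles_per_packet(CORE_HZ, pps)
print(f"{pps:.0f} packets/s, roughly {budget:.0f} cycles per packet")
# prints "7143 packets/s, roughly 140000 cycles per packet"
```

On these assumptions a single core spends on the order of 140,000 cycles on each packet even with encryption switched off, which would be consistent with the observation that changing the cipher makes no difference.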
Jelle de Jong
2020-May-09 10:25 UTC
how to pick cipher for AES-NI enabled AMD GX-412TC SOC tincd at 100% CPU
Hello everybody,

I would also love to know how I can optimize my tinc setup so that it goes faster, without using 100% CPU load for 10 MB/s.

Kind regards,

Jelle de Jong