Darren Tucker
2018-Jul-12 08:10 UTC
OpenSSH slow on OSX High Sierra (maybe due to libcrypto)?
Hi all. Is anyone else seeing issues with OpenSSH being slow on OSX High Sierra?

In the interests of better test coverage I set one up, however the OpenSSH tests take much longer on it than on much older machines with much slower CPUs. It seems to be due to the vendor-supplied libcrypto being about 20x slower at bignum operations than nominally the same version of LibreSSL compiled locally.

If anyone has such a machine handy, could you please run "sysctl machdep.cpu.brand_string; /usr/bin/openssl speed rsa" and post the results for comparison?

$ uname -a
Darwin osx-highsierra 17.6.0 Darwin Kernel Version 17.6.0: Tue May 8 15:22:16 PDT 2018; root:xnu-4570.61.1~1/RELEASE_X86_64 x86_64

$ sysctl machdep.cpu.brand_string
machdep.cpu.brand_string: Intel(R) Core(TM) i5-2415M CPU @ 2.30GHz

$ /usr/bin/openssl speed rsa
[...]
LibreSSL 2.2.7
built on: date not available
options:bn(64,64) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: information not available
rsa  512 bits 0.000964s 0.000059s   1037.3  16987.1
rsa 1024 bits 0.006052s 0.000271s    165.2   3687.3
rsa 2048 bits 0.040528s 0.001145s     24.7    873.6
rsa 4096 bits 0.278889s 0.004272s      3.6    234.1

$ libressl-2.2.7/apps/openssl speed rsa
[...]
options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: information not available
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000074s 0.000008s  13466.5 130066.4
rsa 1024 bits 0.000271s 0.000017s   3690.6  57557.5
rsa 2048 bits 0.001665s 0.000054s    600.6  18684.4
rsa 4096 bits 0.011938s 0.000195s     83.8   5121.7

-- 
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new)
    Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.
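As a rough check of the "about 20x" figure, the sign/s columns of the two runs in this message can be divided directly. The following is only a sketch: the numbers are copied from the output above, and the variable and function names are made up for illustration.

```python
# sign/s figures copied from the two "openssl speed rsa" runs in this message.
vendor_sign_per_s = {512: 1037.3, 1024: 165.2, 2048: 24.7, 4096: 3.6}
local_sign_per_s = {512: 13466.5, 1024: 3690.6, 2048: 600.6, 4096: 83.8}


def speedup(fast, slow):
    """Return {key_bits: fast/slow} for every key size present in both runs."""
    return {bits: fast[bits] / slow[bits] for bits in fast if bits in slow}


ratios = speedup(local_sign_per_s, vendor_sign_per_s)
for bits, ratio in sorted(ratios.items()):
    print(f"rsa {bits:4d} bits: local build signs {ratio:.1f}x faster")
```

On these figures the gap is roughly 13x for 512-bit keys and between 22x and 24x for the larger key sizes.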
Peter Moody
2018-Jul-12 20:30 UTC
OpenSSH slow on OSX High Sierra (maybe due to libcrypto)?
On Thu, Jul 12, 2018 at 1:10 AM, Darren Tucker <dtucker at dtucker.net> wrote:
> If anyone has such a machine handy, could you please run "sysctl
> machdep.cpu.brand_string; /usr/bin/openssl speed rsa" and post the
> results for comparison?

$ /usr/bin/openssl speed rsa
[...]
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000622s 0.000037s   1606.5  27333.3
rsa 1024 bits 0.003932s 0.000176s    254.3   5689.6
rsa 2048 bits 0.025932s 0.000751s     38.6   1331.3
rsa 4096 bits 0.176667s 0.002718s      5.7    367.9

$ sysctl machdep.cpu.brand_string
machdep.cpu.brand_string: Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz

$ uname -a
Darwin localhost 17.6.0 Darwin Kernel Version 17.6.0: Tue May 8 15:22:16 PDT 2018; root:xnu-4570.61.1~1/RELEASE_X86_64 x86_64

vs

$ ./apps/openssl/openssl speed rsa
[...]
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000049s 0.000006s  20467.2 167761.5
rsa 1024 bits 0.000166s 0.000015s   6032.5  65297.8
rsa 2048 bits 0.000948s 0.000051s   1054.8  19469.8
rsa 4096 bits 0.007410s 0.000194s    134.9   5147.4
Jan Schermer
2018-Jul-12 22:56 UTC
OpenSSH slow on OSX High Sierra (maybe due to libcrypto)?
This is on Mojave

$ sysctl machdep.cpu.brand_string; /usr/bin/openssl speed rsa
machdep.cpu.brand_string: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
Doing 512 bit private rsa's for 10s: 13043 512 bit private RSA's in 9.88s
Doing 512 bit public rsa's for 10s: 130182 512 bit public RSA's in 9.92s
Doing 1024 bit private rsa's for 10s: 2290 1024 bit private RSA's in 9.95s
Doing 1024 bit public rsa's for 10s: 27525 1024 bit public RSA's in 9.93s
Doing 2048 bit private rsa's for 10s: 339 2048 bit private RSA's in 9.94s
Doing 2048 bit public rsa's for 10s: 7251 2048 bit public RSA's in 9.96s
Doing 4096 bit private rsa's for 10s: 51 4096 bit private RSA's in 10.05s
Doing 4096 bit public rsa's for 10s: 1963 4096 bit public RSA's in 9.94s
LibreSSL 2.6.4
built on: date not available
options:bn(64,64) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) blowfish(idx)
compiler: information not available
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000757s 0.000076s   1320.1  13123.2
rsa 1024 bits 0.004345s 0.000361s    230.2   2771.9
rsa 2048 bits 0.029322s 0.001374s     34.1    728.0
rsa 4096 bits 0.197059s 0.005064s      5.1    197.5

Homebrew's real openssl version copes a "bit" better

$ /usr/local/Cellar/openssl/1.0.2o_2/bin/openssl speed rsa
Doing 512 bit private rsa's for 10s: 218442 512 bit private RSA's in 9.92s
Doing 512 bit public rsa's for 10s: 2249226 512 bit public RSA's in 9.87s
Doing 1024 bit private rsa's for 10s: 78250 1024 bit private RSA's in 9.92s
Doing 1024 bit public rsa's for 10s: 1107881 1024 bit public RSA's in 9.91s
Doing 2048 bit private rsa's for 10s: 11736 2048 bit private RSA's in 9.96s
Doing 2048 bit public rsa's for 10s: 381898 2048 bit public RSA's in 9.97s
Doing 4096 bit private rsa's for 10s: 1692 4096 bit private RSA's in 9.98s
Doing 4096 bit public rsa's for 10s: 107800 4096 bit public RSA's in 9.98s
OpenSSL 1.0.2o 27 Mar 2018
built on: reproducible build, date unspecified
options:bn(64,64) rc4(ptr,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: clang -I. -I.. -I../include -fPIC -fno-common -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -arch x86_64 -O3 -DL_ENDIAN -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000045s 0.000004s  22020.4 227885.1
rsa 1024 bits 0.000127s 0.000009s   7888.1 111794.2
rsa 2048 bits 0.000849s 0.000026s   1178.3  38304.7
rsa 4096 bits 0.005898s 0.000093s    169.5  10801.6

Jan

> On 12 Jul 2018, at 22:30, Peter Moody <mindrot at hda3.com> wrote:
> [...]
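With several people posting results, it can help to pull the summary rows out of the pasted output mechanically rather than by eye. Below is a minimal Python sketch of such a parser. It assumes the row layout seen in this thread ("rsa N bits <sign>s <verify>s <sign/s> <verify/s>"); the function name and regular expression are illustrative, not part of any OpenSSL or LibreSSL tooling.

```python
import re

# Matches summary rows such as:
#   rsa 2048 bits 0.029322s 0.001374s     34.1    728.0
# The "Doing 512 bit private rsa's ..." progress lines do not match.
ROW = re.compile(
    r"rsa\s+(\d+)\s+bits\s+([\d.]+)s\s+([\d.]+)s\s+([\d.]+)\s+([\d.]+)"
)


def parse_speed_rsa(text):
    """Return {key_bits: (sign_per_s, verify_per_s)} from 'openssl speed rsa' output."""
    results = {}
    for match in ROW.finditer(text):
        bits = int(match.group(1))
        results[bits] = (float(match.group(4)), float(match.group(5)))
    return results


# Two rows each from the Mojave system LibreSSL and Homebrew OpenSSL runs above.
system_out = """
Doing 2048 bit private rsa's for 10s: 339 2048 bit private RSA's in 9.94s
rsa 2048 bits 0.029322s 0.001374s 34.1 728.0
rsa 4096 bits 0.197059s 0.005064s 5.1 197.5
"""
homebrew_out = """
rsa 2048 bits 0.000849s 0.000026s 1178.3 38304.7
rsa 4096 bits 0.005898s 0.000093s 169.5 10801.6
"""

system = parse_speed_rsa(system_out)
homebrew = parse_speed_rsa(homebrew_out)
for bits in sorted(system):
    ratio = homebrew[bits][0] / system[bits][0]
    print(f"rsa {bits} bits: Homebrew signs {ratio:.1f}x faster")
```

Because the pattern is applied with `finditer`, it also works on output that has been flattened onto a single line, as often happens in list archives.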
Darren Tucker
2018-Jul-13 09:10 UTC
OpenSSH slow on OSX High Sierra (maybe due to libcrypto)?
On 12 July 2018 at 18:10, Darren Tucker <dtucker at dtucker.net> wrote:
[...]
> compiler: information not available
> rsa  512 bits 0.000964s 0.000059s   1037.3  16987.1
> rsa 1024 bits 0.006052s 0.000271s    165.2   3687.3
> rsa 2048 bits 0.040528s 0.001145s     24.7    873.6
> rsa 4096 bits 0.278889s 0.004272s      3.6    234.1
>
> $ libressl-2.2.7/apps/openssl speed rsa
> [...]
> options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial)
> idea(int) blowfish(idx)
> compiler: information not available
>                   sign    verify    sign/s verify/s
> rsa  512 bits 0.000074s 0.000008s  13466.5 130066.4
> rsa 1024 bits 0.000271s 0.000017s   3690.6  57557.5
> rsa 2048 bits 0.001665s 0.000054s    600.6  18684.4
> rsa 4096 bits 0.011938s 0.000195s     83.8   5121.7

Someone suggested that Apple might be shipping with assembler optimizations disabled. That might be part of it (about a 2.5x slowdown compared with having them enabled) but it would still leave an order of magnitude unaccounted for.

Local build with --disable-asm:

                  sign    verify    sign/s verify/s
rsa  512 bits 0.000191s 0.000016s   5236.4  62283.9
rsa 1024 bits 0.000863s 0.000046s   1158.4  21805.9
rsa 2048 bits 0.005286s 0.000148s    189.2   6765.1
rsa 4096 bits 0.035035s 0.000515s     28.5   1941.7

-- 
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new)
    Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.
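The split between "explained by missing assembler" and "still unexplained" can be worked out from the three sets of sign/s figures posted so far on the i5-2415M. This is only an arithmetic sketch in Python; the numbers are copied from the messages above and the names are illustrative.

```python
# sign/s on the same i5-2415M machine, copied from this thread.
vendor = {512: 1037.3, 1024: 165.2, 2048: 24.7, 4096: 3.6}           # /usr/bin/openssl
local_no_asm = {512: 5236.4, 1024: 1158.4, 2048: 189.2, 4096: 28.5}  # --disable-asm
local_asm = {512: 13466.5, 1024: 3690.6, 2048: 600.6, 4096: 83.8}    # default build

breakdown = {}
for bits in sorted(vendor):
    asm_factor = local_asm[bits] / local_no_asm[bits]  # cost of disabling asm
    residual = local_no_asm[bits] / vendor[bits]       # gap still unexplained
    breakdown[bits] = (asm_factor, residual)
    print(f"rsa {bits:4d} bits: asm {asm_factor:.1f}x, "
          f"residual {residual:.1f}x, total {asm_factor * residual:.1f}x")
```

On these figures, disabling the assembler costs roughly 2.6x to 3.2x, and the vendor binary is a further 5x to 8x slower than the local --disable-asm build.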
Darren Tucker
2018-Aug-07 10:31 UTC
OpenSSH slow on OSX High Sierra (maybe due to libcrypto)?
On 12 July 2018 at 18:10, Darren Tucker <dtucker at dtucker.net> wrote:
[...]
> vendor-supplied libcrypto being about 20x slower at bignum operations
> than nominally the same version of LibreSSL compiled locally.

I've discovered two data points that may or may not be clues.

1) the native libcrypto is a "fat" library with both i386 and x86_64 code:

$ file /usr/lib/libcrypto.dylib
/usr/lib/libcrypto.dylib: Mach-O universal binary with 2 architectures: [i386:Mach-O dynamically linked shared library i386] [x86_64]
/usr/lib/libcrypto.dylib (for architecture i386): Mach-O dynamically linked shared library i386
/usr/lib/libcrypto.dylib (for architecture x86_64): Mach-O 64-bit dynamically linked shared library x86_64

2) if I build libressl forcing 32 bit mode, the resulting speed is close to what I'm seeing from the native libcrypto + openssl tool (which is also fat).

$ CFLAGS=-m32 ./configure --disable-asm && make -j4 && apps/openssl speed rsa
[...]
compiler: information not available
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000950s 0.000072s   1052.2  13924.2
rsa 1024 bits 0.005181s 0.000227s    193.0   4406.0
rsa 2048 bits 0.031746s 0.000827s     31.5   1208.5
rsa 4096 bits 0.210208s 0.002916s      4.8    342.9

My ssh binaries are all x86_64 only so it's not as simple as just having built the wrong binary type. Does anyone know how to see what the linker/loader is actually doing?

-- 
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new)
    Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.
John Hawkinson
2018-Aug-07 12:25 UTC
OpenSSH slow on OSX High Sierra (maybe due to libcrypto)?
Darren Tucker <dtucker at dtucker.net> wrote on Tue, 7 Aug 2018 at 20:31:43 +1000 in <CALDDTe3D4SB2P=MMc5Cxg1nWTdyZhxR-Nvj0K3c3xCvjaQdGKg at mail.gmail.com>:
> Does anyone know how to see what the linker/loader is actually doing?

See dyld(1) for a variety of environment variables that control diagnostics. Perhaps you want DYLD_PRINT_BINDINGS=1.

--jhawk at mit.edu
  John Hawkinson
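One way to try the dyld(1) suggestion is to wrap the command under test so the diagnostic variables are set only for that process. The following is a hedged Python sketch, not a tested recipe. DYLD_PRINT_BINDINGS comes from the message above; DYLD_PRINT_LIBRARIES is not mentioned in the thread and is added here as an assumption, being another variable listed in dyld(1). The helper names are hypothetical. macOS may ignore DYLD_* variables for protected system binaries, so this is more likely to be useful against a locally built ssh or openssl.

```python
import os
import subprocess
import sys


def dyld_debug_env(base=None):
    """Return a copy of the environment with two dyld diagnostics switched on."""
    env = dict(os.environ if base is None else base)
    env["DYLD_PRINT_LIBRARIES"] = "1"  # assumption: report each library as it loads
    env["DYLD_PRINT_BINDINGS"] = "1"   # suggested in the thread: report symbol bindings
    return env


def run_with_dyld_diagnostics(cmd):
    """Run cmd (a list of arguments) with the diagnostics enabled.

    dyld writes its diagnostics to stderr, so stderr is captured and
    returned alongside the exit status.
    """
    proc = subprocess.run(
        cmd, env=dyld_debug_env(), stderr=subprocess.PIPE, text=True
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__" and sys.platform == "darwin" and len(sys.argv) > 1:
    # Example: python3 this_script.py ./apps/openssl/openssl version
    status, diagnostics = run_with_dyld_diagnostics(sys.argv[1:])
    sys.stderr.write(diagnostics)
    sys.exit(status)
```

Searching the captured diagnostics for "libcrypto" should show which copy of the library the process actually loaded, which is the question raised earlier in the thread.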