Chris Rapier
2022-Oct-24 19:23 UTC
[PATCH] Use EVP_MAC interface for Poly1305 if supported.
New version of the patch. This time the MAC ctx is part of the chachapoly_ctx. This improved performance even more. The testbed are 2 AMD Epyc 7502Ps connected via 10Gb through a local switch. The test is "./hpnssh remote "dd if=/dev/zero bs=1M count=20000" > /dev/null". The results are an average of 10 runs. Baseline is 604.9MB/s and the EVP version of poly1305 hit 733.5 MB/s. So that's just a bit more than 21% improvement in throughput. Please note that these results are for hpnssh and not mainline OpenSSH. The results for OpenSSH (9.0p1) are actually a lot more dramatic. Baseline on the same testbed is 297MB/s. The EVP version clocked in at 637MB/s. I tested compatibility against other versions of OpenSSH and it does work. I feel like I must be doing something wrong but if I am it's not obvious to me. Chris diff --git a/cipher-chachapoly-libcrypto.c b/cipher-chachapoly-libcrypto.c index 719f9c843..199f2974e 100644 --- a/cipher-chachapoly-libcrypto.c +++ b/cipher-chachapoly-libcrypto.c @@ -37,12 +37,16 @@ struct chachapoly_ctx { EVP_CIPHER_CTX *main_evp, *header_evp; +#if OPENSSL_VERSION_NUMBER >= 0x30000000UL + EVP_MAC_CTX *poly_ctx; +#endif }; struct chachapoly_ctx * chachapoly_new(const u_char *key, u_int keylen) { struct chachapoly_ctx *ctx; + EVP_MAC *mac = NULL; if (keylen != (32 + 32)) /* 2 x 256 bit keys */ return NULL; @@ -57,6 +61,12 @@ chachapoly_new(const u_char *key, u_int keylen) goto fail; if (EVP_CIPHER_CTX_iv_length(ctx->header_evp) != 16) goto fail; +#if OPENSSL_VERSION_NUMBER >= 0x30000000UL + if ((mac = EVP_MAC_fetch(NULL, "POLY1305", NULL)) == NULL) + goto fail; + if ((ctx->poly_ctx = EVP_MAC_CTX_new(mac)) == NULL) + goto fail; +#endif return ctx; fail: chachapoly_free(ctx); @@ -70,6 +80,9 @@ chachapoly_free(struct chachapoly_ctx *cpctx) return; EVP_CIPHER_CTX_free(cpctx->main_evp); EVP_CIPHER_CTX_free(cpctx->header_evp); +#if OPENSSL_VERSION_NUMBER >= 0x30000000UL + EVP_MAC_CTX_free(cpctx->poly_ctx); +#endif freezero(cpctx, sizeof(*cpctx)); } @@ -90,6 +103,13 @@ chachapoly_crypt(struct chachapoly_ctx *ctx, u_int seqnr, u_char *dest, int r = SSH_ERR_INTERNAL_ERROR; u_char expected_tag[POLY1305_TAGLEN], poly_key[POLY1305_KEYLEN]; + /* using the EVP_MAC interface for poly1305 is significantly + * faster than the version bundled with OpenSSH. However, + * this interface is only available in OpenSSL 3.0+ + * -cjr 10/21/2022 */ +#if OPENSSL_VERSION_NUMBER >= 0x30000000UL + size_t poly_out_len; +#endif /* * Run ChaCha20 once to generate the Poly1305 key. The IV is the * packet sequence number. @@ -104,11 +124,25 @@ chachapoly_crypt(struct chachapoly_ctx *ctx, u_int seqnr, u_char *dest, goto out; } +#if OPENSSL_VERSION_NUMBER >= 0x30000000UL + /* init the MAC each time to get the new key */ + if(!EVP_MAC_init(ctx->poly_ctx, (const u_char *)poly_key, POLY1305_KEYLEN, NULL)) { + r = SSH_ERR_LIBCRYPTO_ERROR; + goto out; + } +#endif + /* If decrypting, check tag before anything else */ if (!do_encrypt) { const u_char *tag = src + aadlen + len; - +#if OPENSSL_VERSION_NUMBER >= 0x30000000UL + /* EVP_MAC_update doesn't put the poly_mac into a buffer + * we need EVP_MAC_final for that */ + EVP_MAC_update(ctx->poly_ctx, src, aadlen + len); + EVP_MAC_final(ctx->poly_ctx, expected_tag, &poly_out_len, (size_t)POLY1305_TAGLEN); +#else poly1305_auth(expected_tag, src, aadlen + len, poly_key); +#endif if (timingsafe_bcmp(expected_tag, tag, POLY1305_TAGLEN) != 0) { r = SSH_ERR_MAC_INVALID; goto out; @@ -134,8 +168,13 @@ chachapoly_crypt(struct chachapoly_ctx *ctx, u_int seqnr, u_char *dest, /* If encrypting, calculate and append tag */ if (do_encrypt) { - poly1305_auth(dest + aadlen + len, dest, aadlen + len, - poly_key); +#if OPENSSL_VERSION_NUMBER >= 0x30000000UL + EVP_MAC_update(ctx->poly_ctx, dest, aadlen + len); + EVP_MAC_final(ctx->poly_ctx, dest + aadlen + len, &poly_out_len, (size_t)POLY1305_TAGLEN); +#else + poly1305_auth(dest + aadlen + len, dest, aadlen + len, + poly_key); +#endif } r = 0; out: On 10/24/22 11:53 AM, Chris Rapier wrote:> On 10/22/22 6:49 PM, Darren Tucker wrote: >> On Sat, 22 Oct 2022 at 07:53, Chris Rapier <rapier at psc.edu> wrote: >> [...] >>> I normally wouldn't clutter up the code with library version specific >>> ifdefs but it might be worth considering. >> >> Instead of ifdefs, you can check if the MAC init succeeded before >> calling the EVP functions, else fall back to the existing code path. > > As pointed out, this is only in OSSL3. That said, for hpnssh we've been > looking at extracting the necessary code/assembly from OSSL3 and > incorporating it into our code base to provide this functionality. > Maybe. Depends on the complexity of the task. >>> +?????? /* fetch the mac and create and initialize the context */ >>> +?????? if ((mac = EVP_MAC_fetch(NULL, "POLY1305", NULL)) == NULL || >>> +?????????? (poly_ctx = EVP_MAC_CTX_new(mac)) == NULL || >> >> You're initializing the MAC context on every call to this function. >> If you initialize the context once, cache it (say, as a static) and >> reuse it does it go any faster? > > That's a fine question and one I hope to explore today. I also noticed > that I'm neglecting to free the the EVP_MAC and the EVP_MAC_CTX. Kind of > jumped the gun on that. > > Chris
Darren Tucker
2022-Oct-24 19:55 UTC
[PATCH] Use EVP_MAC interface for Poly1305 if supported.
On Tue, 25 Oct 2022 at 06:23, Chris Rapier <rapier at psc.edu> wrote: [...]> I tested compatibility against other versions of OpenSSH and it > does work. I feel like I must be doing something wrong but if I am it's > not obvious to me.For changes where there are you have questions about interop, installing PuTTY and running the relevant regress tests ('make t-exec LTESTS="putty-transfer putty-ciphers putty-kex"') will give you some reassurance that the changes interop OK (as long as PuTTY implements the thing you have the question about, which in this case it does). -- Darren Tucker (dtucker at dtucker.net) GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new) Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.
Darren Tucker
2022-Oct-24 20:23 UTC
[PATCH] Use EVP_MAC interface for Poly1305 if supported.
On Tue, 25 Oct 2022 at 06:23, Chris Rapier <rapier at psc.edu> wrote:> +#if OPENSSL_VERSION_NUMBER >= 0x30000000ULAs mentioned by Dmitry Belyavskiy upthread, since this depends on EVP_MAC_fetch() this should probably be checked by configure instead and put inside an ifdef HAVE_EVP_MAC_FETCH. I'm also wondering if the additional OpenSSL specific code belongs in the poly1305_auth function in cipher-chachapoly-libcrypto.c.> + size_t poly_out_len; > +#endifSince poly_out_len is only ever used inside the "if (!do_encrypt)" block below, you could move this declaration inside the existing ifdef inside that block and reduce this diff by one hunk. -- Darren Tucker (dtucker at dtucker.net) GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new) Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.