Chris Rapier
2022-Oct-21 20:45 UTC
[PATCH] Use EVP_MAC interface for Poly1305 if supported.
We've been working on improving poly1305 performance and I came across the EVP_MAC interface in OpenSSL 3.0 today. I tried swapping it in place of the poly1305_auth routine and my performance increased notably. On a Ryzen 7 5800X test to localhost I went from about 1000MB/s to 1450MB/s. On a more realistic test between two AMD Epyc servers connected via 10G I went from an average of 604.9MB/s to 718.7MB/s. Another testbed also got me about a 16 to 17% improvement in throughput.

I normally wouldn't clutter up the code with library version specific ifdefs but it might be worth considering.

This is a first pass so if anyone sees any glaring problems please let me know.

diff --git a/cipher-chachapoly-libcrypto.c b/cipher-chachapoly-libcrypto.c
index 719f9c843..b2a148696 100644
--- a/cipher-chachapoly-libcrypto.c
+++ b/cipher-chachapoly-libcrypto.c
@@ -90,6 +90,15 @@ chachapoly_crypt(struct chachapoly_ctx *ctx, u_int seqnr, u_char *dest,
 	int r = SSH_ERR_INTERNAL_ERROR;
 	u_char expected_tag[POLY1305_TAGLEN], poly_key[POLY1305_KEYLEN];
 
+	/* using the EVP_MAC interface for poly1305 is significantly
+	 * faster than the version bundled with OpenSSH. However,
+	 * this interface is only available in OpenSSL 3.0+
+	 * -cjr 10/21/2022 */
+#if OPENSSL_VERSION_NUMBER >= 0x30000000UL
+	EVP_MAC_CTX *poly_ctx = NULL;
+	EVP_MAC *mac = NULL;
+	size_t poly_out_len;
+#endif
 	/*
 	 * Run ChaCha20 once to generate the Poly1305 key. The IV is the
 	 * packet sequence number.
@@ -104,11 +113,27 @@ chachapoly_crypt(struct chachapoly_ctx *ctx, u_int seqnr, u_char *dest,
 		goto out;
 	}
 
+#if OPENSSL_VERSION_NUMBER >= 0x30000000UL
+	/* fetch the mac and create and initialize the context */
+	if ((mac = EVP_MAC_fetch(NULL, "POLY1305", NULL)) == NULL ||
+	    (poly_ctx = EVP_MAC_CTX_new(mac)) == NULL ||
+	    !EVP_MAC_init(poly_ctx, (const u_char *)poly_key, POLY1305_KEYLEN, NULL)) {
+		r = SSH_ERR_LIBCRYPTO_ERROR;
+		goto out;
+	}
+#endif
+
 	/* If decrypting, check tag before anything else */
 	if (!do_encrypt) {
 		const u_char *tag = src + aadlen + len;
-
+#if OPENSSL_VERSION_NUMBER >= 0x30000000UL
+		/* EVP_MAC_update doesn't put the poly_mac into a buffer
+		 * we need EVP_MAC_final for that */
+		EVP_MAC_update(poly_ctx, src, aadlen + len);
+		EVP_MAC_final(poly_ctx, expected_tag, &poly_out_len, (size_t)POLY1305_TAGLEN);
+#else
 		poly1305_auth(expected_tag, src, aadlen + len, poly_key);
+#endif
 		if (timingsafe_bcmp(expected_tag, tag, POLY1305_TAGLEN) != 0) {
 			r = SSH_ERR_MAC_INVALID;
 			goto out;
@@ -134,8 +159,13 @@ chachapoly_crypt(struct chachapoly_ctx *ctx, u_int seqnr, u_char *dest,
 
 	/* If encrypting, calculate and append tag */
 	if (do_encrypt) {
-		poly1305_auth(dest + aadlen + len, dest, aadlen + len,
-		    poly_key);
+#if OPENSSL_VERSION_NUMBER >= 0x30000000UL
+		EVP_MAC_update(poly_ctx, dest, aadlen + len);
+		EVP_MAC_final(poly_ctx, dest + aadlen + len, &poly_out_len, (size_t)POLY1305_TAGLEN);
+#else
+		poly1305_auth(dest + aadlen + len, dest, aadlen + len,
+		    poly_key);
+#endif
 	}
 	r = 0;
  out:
Darren Tucker
2022-Oct-22 22:49 UTC
[PATCH] Use EVP_MAC interface for Poly1305 if supported.
On Sat, 22 Oct 2022 at 07:53, Chris Rapier <rapier at psc.edu> wrote:
[...]
> I normally wouldn't clutter up the code with library version specific
> ifdefs but it might be worth considering.

Instead of ifdefs, you can check if the MAC init succeeded before calling the EVP functions, else fall back to the existing code path.

> +	/* fetch the mac and create and initialize the context */
> +	if ((mac = EVP_MAC_fetch(NULL, "POLY1305", NULL)) == NULL ||
> +	    (poly_ctx = EVP_MAC_CTX_new(mac)) == NULL ||

You're initializing the MAC context on every call to this function. If you initialize the context once, cache it (say, as a static) and reuse it does it go any faster?

[...]
> +#if OPENSSL_VERSION_NUMBER >= 0x30000000UL
> +		/* EVP_MAC_update doesn't put the poly_mac into a buffer
> +		 * we need EVP_MAC_final for that */
> +		EVP_MAC_update(poly_ctx, src, aadlen + len);
> +		EVP_MAC_final(poly_ctx, expected_tag, &poly_out_len, (size_t)POLY1305_TAGLEN);
> +#else
>  		poly1305_auth(expected_tag, src, aadlen + len, poly_key);
> +#endif

You'd also want to only try to init the context once instead of every time in the case where libcrypto did not support it, so something like:

	if (ctx_inited && poly_ctx != NULL) {
		EVP_MAC_update(poly_ctx, src, aadlen + len);
		EVP_MAC_final(poly_ctx, expected_tag, &poly_out_len, (size_t)POLY1305_TAGLEN);
	} else {
		poly1305_auth(expected_tag, src, aadlen + len, poly_key);
	}

-- 
Darren Tucker (dtucker at dtucker.net)
GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new)
    Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.