Nir Soffer
2021-Nov-09 18:53 UTC
[Libguestfs] [PATCH nbdkit 2/2] common: Add checked-overflow macros and use for safe vector extension
On Tue, Nov 9, 2021 at 7:49 PM Richard W.M. Jones <rjones at redhat.com> wrote:> > --- > common/include/Makefile.am | 1 + > common/utils/Makefile.am | 3 +- > common/include/checked-overflow.h | 61 +++++++++++++++++++++++++++++++ > common/utils/vector.c | 26 +++++++++---- > 4 files changed, 82 insertions(+), 9 deletions(-) > > diff --git a/common/include/Makefile.am b/common/include/Makefile.am > index a7d0d026..52d97216 100644 > --- a/common/include/Makefile.am > +++ b/common/include/Makefile.am > @@ -37,6 +37,7 @@ EXTRA_DIST = \ > ascii-ctype.h \ > ascii-string.h \ > byte-swapping.h \ > + checked-overflow.h \ > exit-with-parent.h \ > isaligned.h \ > ispowerof2.h \ > diff --git a/common/utils/Makefile.am b/common/utils/Makefile.am > index 55415535..012a5c25 100644 > --- a/common/utils/Makefile.am > +++ b/common/utils/Makefile.am > @@ -52,6 +52,7 @@ libutils_la_SOURCES = \ > $(NULL) > libutils_la_CPPFLAGS = \ > -I$(top_srcdir)/include \ > + -I$(top_srcdir)/common/include \ > $(NULL) > libutils_la_CFLAGS = \ > $(WARNINGS_CFLAGS) \ > @@ -101,7 +102,7 @@ test_quotes_CPPFLAGS = -I$(srcdir) > test_quotes_CFLAGS = $(WARNINGS_CFLAGS) > > test_vector_SOURCES = test-vector.c vector.c vector.h bench.h > -test_vector_CPPFLAGS = -I$(srcdir) > +test_vector_CPPFLAGS = -I$(srcdir) -I$(top_srcdir)/common/include > test_vector_CFLAGS = $(WARNINGS_CFLAGS) > > bench: test-vector > diff --git a/common/include/checked-overflow.h b/common/include/checked-overflow.h > new file mode 100644 > index 00000000..b571e2c6 > --- /dev/null > +++ b/common/include/checked-overflow.h > @@ -0,0 +1,61 @@ > +/* nbdkit > + * Copyright (C) 2013-2021 Red Hat Inc. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions are > + * met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * > + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in the > + * documentation and/or other materials provided with the distribution. > + * > + * * Neither the name of Red Hat nor the names of its contributors may be > + * used to endorse or promote products derived from this software without > + * specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, > + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A > + * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR > + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF > + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND > + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, > + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT > + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > + * SUCH DAMAGE. > + */ > + > +/* This header file defines functions for checking overflow in common > + * integer arithmetic operations. > + * > + * It uses GCC/Clang built-ins: a possible future enhancement is to > + * provide fallbacks in plain C or for other compilers. The only > + * purpose of having a header file for this is to have a single place > + * where we would extend this in future. > + */ > + > +#ifndef NBDKIT_CHECKED_OVERFLOW_H > +#define NBDKIT_CHECKED_OVERFLOW_H > + > +#if !defined(__GNUC__) && !defined(__clang__) > +#error "this file may need to be ported to your compiler" > +#endif > + > +/* Add two uint64_t values. Returns true if overflow happened. */ > +#define ADD_UINT64_T_OVERFLOW(a, b, r) __builtin_add_overflow((a), (b), (r)) > + > +/* Multiply two uint64_t values. Returns true if overflow happened. */ > +#define MUL_UINT64_T_OVERFLOW(a, b, r) __builtin_mul_overflow((a), (b), (r)) > + > +/* Add two size_t values. Returns true if overflow happened. */ > +#define ADD_SIZE_T_OVERFLOW(a, b, r) __builtin_add_overflow((a), (b), (r)) > + > +/* Multiple two size_t values. Returns true if overflow happened. */ > +#define MUL_SIZE_T_OVERFLOW(a, b, r) __builtin_mul_overflow((a), (b), (r)) > + > +#endif /* NBDKIT_CHECKED_OVERFLOW_H */ > diff --git a/common/utils/vector.c b/common/utils/vector.c > index dff051e9..e4ea7f3f 100644 > --- a/common/utils/vector.c > +++ b/common/utils/vector.c > @@ -36,6 +36,7 @@ > #include <stdlib.h> > #include <errno.h> > > +#include "checked-overflow.h" > #include "vector.h" > > int > @@ -44,21 +45,30 @@ generic_vector_reserve (struct generic_vector *v, size_t n, size_t itemsize) > void *newptr; > size_t reqcap, reqbytes, newcap, newbytes; > > - /* New capacity requested. We must allocate this minimum (or fail). */ > - reqcap = v->cap + n; > - reqbytes = reqcap * itemsize; > - if (reqbytes < v->cap * itemsize) { > + /* New capacity requested. We must allocate this minimum (or fail). > + * reqcap = v->cap + n > + * reqbytes = reqcap * itemsize > + */This should explain the next lines? I'm not sure about it, it makes the code more complicated and the commented code can get out of sync with the actual code.> + if (ADD_SIZE_T_OVERFLOW (v->cap, n, &reqcap) || > + MUL_SIZE_T_OVERFLOW (reqcap, itemsize, &reqbytes)) {Is order guaranteed? I think it will be more clear as separate if blocks, even if we need to have 2 blocks for returning ENOMEM.> errno = ENOMEM; > - return -1; /* overflow */ > + return -1; > } > > /* However for the sake of optimization, scale buffer by 3/2 so that > * repeated reservations don't call realloc often. > + * newcap = v->cap + (v->cap + 1) / 2 > + * newbytes = newcap * itemsize > */ > - newcap = v->cap + (v->cap + 1) / 2; > - newbytes = newcap * itemsize; > - > + if (ADD_SIZE_T_OVERFLOW (v->cap, 1, &newcap)) > + goto fallback; > + newcap /= 2; > + if (ADD_SIZE_T_OVERFLOW (v->cap, newcap, &newcap)) > + goto fallback;This probably works but adding v->cap and newcap and storing back in newcap is pretty confusing. I would use a temporary: if (ADD_SIZE_T_OVERFLOW (v->cap, 1, &extracap)) goto fallback; extracap /= 2; if (ADD_SIZE_T_OVERFLOW (v->cap, extracap, &newcap)) goto fallback;> + if (MUL_SIZE_T_OVERFLOW (newcap, itemsize, &newbytes)) > + goto fallback; > if (newbytes < reqbytes) { > + fallback:Jumping inside an if block is evil. I would try to extract the code to compute the new capacity into a helper function: if (next_capacity(v-cap, n, itemsize, &newcap)) return -1; This function can return early instead of jumping around or fail if we cannot reserve n items. In the worst case this function will only hide the overflow macros. Nir
Richard W.M. Jones
2021-Nov-09 19:27 UTC
[Libguestfs] [PATCH nbdkit 2/2] common: Add checked-overflow macros and use for safe vector extension
On Tue, Nov 09, 2021 at 08:53:28PM +0200, Nir Soffer wrote:> On Tue, Nov 9, 2021 at 7:49 PM Richard W.M. Jones <rjones at redhat.com> wrote: > > > > --- > > common/include/Makefile.am | 1 + > > common/utils/Makefile.am | 3 +- > > common/include/checked-overflow.h | 61 +++++++++++++++++++++++++++++++ > > common/utils/vector.c | 26 +++++++++---- > > 4 files changed, 82 insertions(+), 9 deletions(-) > > > > diff --git a/common/include/Makefile.am b/common/include/Makefile.am > > index a7d0d026..52d97216 100644 > > --- a/common/include/Makefile.am > > +++ b/common/include/Makefile.am > > @@ -37,6 +37,7 @@ EXTRA_DIST = \ > > ascii-ctype.h \ > > ascii-string.h \ > > byte-swapping.h \ > > + checked-overflow.h \ > > exit-with-parent.h \ > > isaligned.h \ > > ispowerof2.h \ > > diff --git a/common/utils/Makefile.am b/common/utils/Makefile.am > > index 55415535..012a5c25 100644 > > --- a/common/utils/Makefile.am > > +++ b/common/utils/Makefile.am > > @@ -52,6 +52,7 @@ libutils_la_SOURCES = \ > > $(NULL) > > libutils_la_CPPFLAGS = \ > > -I$(top_srcdir)/include \ > > + -I$(top_srcdir)/common/include \ > > $(NULL) > > libutils_la_CFLAGS = \ > > $(WARNINGS_CFLAGS) \ > > @@ -101,7 +102,7 @@ test_quotes_CPPFLAGS = -I$(srcdir) > > test_quotes_CFLAGS = $(WARNINGS_CFLAGS) > > > > test_vector_SOURCES = test-vector.c vector.c vector.h bench.h > > -test_vector_CPPFLAGS = -I$(srcdir) > > +test_vector_CPPFLAGS = -I$(srcdir) -I$(top_srcdir)/common/include > > test_vector_CFLAGS = $(WARNINGS_CFLAGS) > > > > bench: test-vector > > diff --git a/common/include/checked-overflow.h b/common/include/checked-overflow.h > > new file mode 100644 > > index 00000000..b571e2c6 > > --- /dev/null > > +++ b/common/include/checked-overflow.h > > @@ -0,0 +1,61 @@ > > +/* nbdkit > > + * Copyright (C) 2013-2021 Red Hat Inc. > > + * > > + * Redistribution and use in source and binary forms, with or without > > + * modification, are permitted provided that the following conditions are > > + * met: > > + * > > + * * Redistributions of source code must retain the above copyright > > + * notice, this list of conditions and the following disclaimer. > > + * > > + * * Redistributions in binary form must reproduce the above copyright > > + * notice, this list of conditions and the following disclaimer in the > > + * documentation and/or other materials provided with the distribution. > > + * > > + * * Neither the name of Red Hat nor the names of its contributors may be > > + * used to endorse or promote products derived from this software without > > + * specific prior written permission. > > + * > > + * THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND > > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, > > + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A > > + * PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR > > + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF > > + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND > > + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, > > + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT > > + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > > + * SUCH DAMAGE. > > + */ > > + > > +/* This header file defines functions for checking overflow in common > > + * integer arithmetic operations. > > + * > > + * It uses GCC/Clang built-ins: a possible future enhancement is to > > + * provide fallbacks in plain C or for other compilers. The only > > + * purpose of having a header file for this is to have a single place > > + * where we would extend this in future. > > + */ > > + > > +#ifndef NBDKIT_CHECKED_OVERFLOW_H > > +#define NBDKIT_CHECKED_OVERFLOW_H > > + > > +#if !defined(__GNUC__) && !defined(__clang__) > > +#error "this file may need to be ported to your compiler" > > +#endif > > + > > +/* Add two uint64_t values. Returns true if overflow happened. */ > > +#define ADD_UINT64_T_OVERFLOW(a, b, r) __builtin_add_overflow((a), (b), (r)) > > + > > +/* Multiply two uint64_t values. Returns true if overflow happened. */ > > +#define MUL_UINT64_T_OVERFLOW(a, b, r) __builtin_mul_overflow((a), (b), (r)) > > + > > +/* Add two size_t values. Returns true if overflow happened. */ > > +#define ADD_SIZE_T_OVERFLOW(a, b, r) __builtin_add_overflow((a), (b), (r)) > > + > > +/* Multiple two size_t values. Returns true if overflow happened. */ > > +#define MUL_SIZE_T_OVERFLOW(a, b, r) __builtin_mul_overflow((a), (b), (r)) > > + > > +#endif /* NBDKIT_CHECKED_OVERFLOW_H */ > > diff --git a/common/utils/vector.c b/common/utils/vector.c > > index dff051e9..e4ea7f3f 100644 > > --- a/common/utils/vector.c > > +++ b/common/utils/vector.c > > @@ -36,6 +36,7 @@ > > #include <stdlib.h> > > #include <errno.h> > > > > +#include "checked-overflow.h" > > #include "vector.h" > > > > int > > @@ -44,21 +45,30 @@ generic_vector_reserve (struct generic_vector *v, size_t n, size_t itemsize) > > void *newptr; > > size_t reqcap, reqbytes, newcap, newbytes; > > > > - /* New capacity requested. We must allocate this minimum (or fail). */ > > - reqcap = v->cap + n; > > - reqbytes = reqcap * itemsize; > > - if (reqbytes < v->cap * itemsize) { > > + /* New capacity requested. We must allocate this minimum (or fail). > > + * reqcap = v->cap + n > > + * reqbytes = reqcap * itemsize > > + */ > > This should explain the next lines? I'm not sure about it, it makes > the code more complicated and the commented code can get out > of sync with the actual code.I really do think it makes it clearer. I agree that it makes it possible for the code to get out of step though.> > + if (ADD_SIZE_T_OVERFLOW (v->cap, n, &reqcap) || > > + MUL_SIZE_T_OVERFLOW (reqcap, itemsize, &reqbytes)) { > > Is order guaranteed? > > I think it will be more clear as separate if blocks, even if we need > to have 2 blocks for returning ENOMEM.Order is definitely guaranteed since || is a sequence point https://en.wikipedia.org/wiki/Sequence_point#Sequence_points_in_C_and_C++ (point 1). (It could also short-circuit, but we don't care). I could add an overflow: label, split the two statements, and jump here I suppose?> > errno = ENOMEM; > > - return -1; /* overflow */ > > + return -1; > > } > > > > /* However for the sake of optimization, scale buffer by 3/2 so that > > * repeated reservations don't call realloc often. > > + * newcap = v->cap + (v->cap + 1) / 2 > > + * newbytes = newcap * itemsize > > */ > > - newcap = v->cap + (v->cap + 1) / 2; > > - newbytes = newcap * itemsize; > > - > > + if (ADD_SIZE_T_OVERFLOW (v->cap, 1, &newcap)) > > + goto fallback; > > + newcap /= 2; > > + if (ADD_SIZE_T_OVERFLOW (v->cap, newcap, &newcap)) > > + goto fallback; > > This probably works but adding v->cap and newcap and storing > back in newcap is pretty confusing. I would use a temporary:Agreed.> if (ADD_SIZE_T_OVERFLOW (v->cap, 1, &extracap)) > goto fallback; > > extracap /= 2; > if (ADD_SIZE_T_OVERFLOW (v->cap, extracap, &newcap)) > goto fallback; > > > + if (MUL_SIZE_T_OVERFLOW (newcap, itemsize, &newbytes)) > > + goto fallback; > > if (newbytes < reqbytes) { > > + fallback: > > Jumping inside an if block is evil.?!> I would try to extract the code to compute the new capacity into a helper > function: > > if (next_capacity(v-cap, n, itemsize, &newcap)) > return -1; > > This function can return early instead of jumping around or fail > if we cannot reserve n items. In the worst case this function will > only hide the overflow macros.OK Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Eric Blake
2021-Nov-09 19:56 UTC
[Libguestfs] [PATCH nbdkit 2/2] common: Add checked-overflow macros and use for safe vector extension
On Tue, Nov 09, 2021 at 08:53:28PM +0200, Nir Soffer wrote:> On Tue, Nov 9, 2021 at 7:49 PM Richard W.M. Jones <rjones at redhat.com> wrote: > > +++ b/common/include/checked-overflow.h> > + > > +/* This header file defines functions for checking overflow in common > > + * integer arithmetic operations. > > + * > > + * It uses GCC/Clang built-ins: a possible future enhancement is to > > + * provide fallbacks in plain C or for other compilers. The only > > + * purpose of having a header file for this is to have a single place > > + * where we would extend this in future. > > + */gnulib has such checks ported to other compilers (in intprops.h), but with an incompatible license that we can't copy.> > + > > +#ifndef NBDKIT_CHECKED_OVERFLOW_H > > +#define NBDKIT_CHECKED_OVERFLOW_H > > + > > +#if !defined(__GNUC__) && !defined(__clang__) > > +#error "this file may need to be ported to your compiler" > > +#endif > > + > > +/* Add two uint64_t values. Returns true if overflow happened. */ > > +#define ADD_UINT64_T_OVERFLOW(a, b, r) __builtin_add_overflow((a), (b), (r))I was able to figure out that this performs '*r = a + b' by guessing from the macro parameter names; I'm not sure if adding that to the comment would help other readers less familiar with the compiler builtin.> > + > > +/* Multiply two uint64_t values. Returns true if overflow happened. */ > > +#define MUL_UINT64_T_OVERFLOW(a, b, r) __builtin_mul_overflow((a), (b), (r)) > > + > > +/* Add two size_t values. Returns true if overflow happened. */ > > +#define ADD_SIZE_T_OVERFLOW(a, b, r) __builtin_add_overflow((a), (b), (r)) > > + > > +/* Multiple two size_t values. Returns true if overflow happened. */ > > +#define MUL_SIZE_T_OVERFLOW(a, b, r) __builtin_mul_overflow((a), (b), (r))gcc also has forced-type builtins, but using __builtin_uaddll_overflow() would be wrong (it's harder to prove whether uint64_t is 'unsigned long' or 'unsigned long long'), so using the generic macro is what we want anyways. The fact that you have two different macros (for UINT64_T and SIZE_T) that expand to the same thing is a side-effect of gcc's implementation; if we later have to code up a C language fallback for other compilers, that fallback may indeed need different implementations between the two macros. So I'm fine with what looks like duplication here.> > +++ b/common/utils/vector.c > > @@ -36,6 +36,7 @@ > > #include <stdlib.h> > > #include <errno.h> > > > > +#include "checked-overflow.h" > > #include "vector.h" > > > > int > > @@ -44,21 +45,30 @@ generic_vector_reserve (struct generic_vector *v, size_t n, size_t itemsize) > > void *newptr; > > size_t reqcap, reqbytes, newcap, newbytes; > > > > - /* New capacity requested. We must allocate this minimum (or fail). */ > > - reqcap = v->cap + n; > > - reqbytes = reqcap * itemsize; > > - if (reqbytes < v->cap * itemsize) { > > + /* New capacity requested. We must allocate this minimum (or fail). > > + * reqcap = v->cap + n > > + * reqbytes = reqcap * itemsize > > + */ > > This should explain the next lines? I'm not sure about it, it makes > the code more complicated and the commented code can get out > of sync with the actual code. > > > + if (ADD_SIZE_T_OVERFLOW (v->cap, n, &reqcap) || > > + MUL_SIZE_T_OVERFLOW (reqcap, itemsize, &reqbytes)) {Yes, the code in the comment is pseudocode for the macro use; I agree that the duplication is a slight risk of losing sync, but it's a weak argument (the comment is quite close to the code, and this file is not likely to be frequently rewritten). I'm fine keeping it.> > Is order guaranteed?Yes, because of ||.> > I think it will be more clear as separate if blocks, even if we need > to have 2 blocks for returning ENOMEM.2 blocks is useful if you ever expect to be in a gdb session trying to figure out which of the two conditions failed. But for this one, I'm fine with one.> > > errno = ENOMEM; > > - return -1; /* overflow */ > > + return -1; > > } > > > > /* However for the sake of optimization, scale buffer by 3/2 so that > > * repeated reservations don't call realloc often. > > + * newcap = v->cap + (v->cap + 1) / 2 > > + * newbytes = newcap * itemsize > > */ > > - newcap = v->cap + (v->cap + 1) / 2; > > - newbytes = newcap * itemsize; > > - > > + if (ADD_SIZE_T_OVERFLOW (v->cap, 1, &newcap)) > > + goto fallback; > > + newcap /= 2; > > + if (ADD_SIZE_T_OVERFLOW (v->cap, newcap, &newcap)) > > + goto fallback; > > This probably works but adding v->cap and newcap and storing > back in newcap is pretty confusing. I would use a temporary: > > if (ADD_SIZE_T_OVERFLOW (v->cap, 1, &extracap)) > goto fallback; > > extracap /= 2; > if (ADD_SIZE_T_OVERFLOW (v->cap, extracap, &newcap)) > goto fallback; > > > + if (MUL_SIZE_T_OVERFLOW (newcap, itemsize, &newbytes)) > > + goto fallback; > > if (newbytes < reqbytes) { > > + fallback: > > Jumping inside an if block is evil.Not the first time our code base has done it. It's not always the cleanest, but is more compact that a number of other alternatives without too much hassle.> > I would try to extract the code to compute the new capacity into a helper > function: > > if (next_capacity(v-cap, n, itemsize, &newcap)) > return -1; > > This function can return early instead of jumping around or fail > if we cannot reserve n items. In the worst case this function will > only hide the overflow macros.That is indeed an option which may improve legibility for the next reader, even if it costs a few more lines of code now. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org