Richard W.M. Jones
2019-Jul-30 10:51 UTC
[Libguestfs] [PATCH libnbd] examples: Fix theoretical cookie race in example.
Previously discussed here: https://www.redhat.com/archives/libguestfs/2019-July/msg00213.html It turns out that deferring callbacks is a PITA. (It would be a bit easier if C has closures.) However by rewriting the example we can avoid the need to use the cookie at all and make it run a bit more efficiently, so let's do that instead. Rich.
Richard W.M. Jones
2019-Jul-30 10:51 UTC
[Libguestfs] [PATCH libnbd] examples: Fix theoretical cookie race in example.
There was a theoretic race in this example: If the server was very fast at handling commands then it's possible that in a call such as: cookie = nbd_aio_pread_callback (..., callback, ...); buffers[i].cookie = cookie; nbd_aio_pread_callback finished and calls the callback before returning. buffers[i].cookie would therefore not be set, but the callback() function was checking the list of buffers for the cookie. This would have caused an abort() in the existing code, although we have never observed that. The new code, instead of relying on the cookie, passes a pointer to &buffers[i], so the callback no longer needs to search the list of buffers (this is also quicker). This necessitated another change however: previously we were copying buffers around to ensure that the next free buffer was always at &buffers[nr_buffers]. Instead of that introduce a new state (BUFFER_UNUSED) and search the short list of buffers once when looking for a free buffer. Thanks: Eric Blake. --- examples/glib-main-loop.c | 96 +++++++++++++-------------------------- 1 file changed, 32 insertions(+), 64 deletions(-) diff --git a/examples/glib-main-loop.c b/examples/glib-main-loop.c index 9f98033..826651e 100644 --- a/examples/glib-main-loop.c +++ b/examples/glib-main-loop.c @@ -247,6 +247,7 @@ static const char *dest_args[] = { #define BUFFER_SIZE 65536 enum buffer_state { + BUFFER_UNUSED = 0, BUFFER_READING, BUFFER_READ_COMPLETED, BUFFER_WRITING, @@ -254,13 +255,6 @@ enum buffer_state { struct buffer { uint64_t offset; - /* Note that command cookies are only unique per libnbd handle. - * Since we have two handles but we must look up completed commands - * in the buffer table by cookie we must maintain separate read and - * write cookies. - */ - int64_t rcookie; - int64_t wcookie; enum buffer_state state; char *data; }; @@ -355,7 +349,7 @@ static gboolean read_data (gpointer user_data) { static uint64_t posn = 0; - const size_t i = nr_buffers; + size_t i; if (gssrc == NULL) return FALSE; @@ -367,16 +361,21 @@ read_data (gpointer user_data) return FALSE; } + /* Find a free buffer. */ + for (i = 0; i < MAX_BUFFERS; ++i) + if (buffers[i].state == BUFFER_UNUSED) + goto found; + /* If too many read requests are in flight, return FALSE so this * idle callback is unregistered. It will be registered by the * write callback when nr_buffers decreases. */ - if (nr_buffers >= MAX_BUFFERS) { - DEBUG (gssrc, "read_data: buffer full, pausing reads from source"); - reader_paused = true; - return FALSE; - } + assert (nr_buffers == MAX_BUFFERS); + DEBUG (gssrc, "read_data: buffer full, pausing reads from source"); + reader_paused = true; + return FALSE; + found: /* Begin reading into the new buffer. */ assert (buffers[i].data == NULL); buffers[i].data = g_new (char, BUFFER_SIZE); @@ -385,11 +384,9 @@ read_data (gpointer user_data) nr_buffers++; posn += BUFFER_SIZE; - buffers[i].rcookie - nbd_aio_pread_callback (gssrc->nbd, buffers[i].data, - BUFFER_SIZE, buffers[i].offset, - finished_read, NULL, 0); - if (buffers[i].rcookie == -1) { + if (nbd_aio_pread_callback (gssrc->nbd, buffers[i].data, + BUFFER_SIZE, buffers[i].offset, + finished_read, &buffers[i], 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); } @@ -399,9 +396,9 @@ read_data (gpointer user_data) /* This callback is called from libnbd when any read command finishes. */ static int -finished_read (unsigned valid_flag, void *vp, int64_t rcookie, int *error) +finished_read (unsigned valid_flag, void *vp, int64_t unused, int *error) { - size_t i; + struct buffer *buffer = vp; if (!(valid_flag & LIBNBD_CALLBACK_VALID)) return 0; @@ -411,19 +408,11 @@ finished_read (unsigned valid_flag, void *vp, int64_t rcookie, int *error) DEBUG (gssrc, "finished_read: read completed"); - /* Find the corresponding buffer and mark it as completed. */ - for (i = 0; i < nr_buffers; ++i) { - if (buffers[i].rcookie == rcookie) - goto found; - } - /* This should never happen. */ - abort (); - - found: - buffers[i].state = BUFFER_READ_COMPLETED; + assert (buffer->state == BUFFER_READING); + buffer->state = BUFFER_READ_COMPLETED; /* Create a writer idle handler. */ - g_idle_add (write_data, NULL); + g_idle_add (write_data, buffer); return 1; } @@ -432,31 +421,19 @@ finished_read (unsigned valid_flag, void *vp, int64_t rcookie, int *error) static gboolean write_data (gpointer user_data) { - size_t i; + struct buffer *buffer = user_data; if (gsdest == NULL) return FALSE; - /* Find the first read-completed buffer and schedule it to be - * written. - */ - for (i = 0; i < nr_buffers; ++i) { - if (buffers[i].state == BUFFER_READ_COMPLETED) - goto found; - } - /* This should never happen. */ - abort (); - - found: - buffers[i].wcookie - nbd_aio_pwrite_callback (gsdest->nbd, buffers[i].data, - BUFFER_SIZE, buffers[i].offset, - finished_write, NULL, 0); - if (buffers[i].wcookie == -1) { + assert (buffer->state == BUFFER_READ_COMPLETED); + buffer->state = BUFFER_WRITING; + if (nbd_aio_pwrite_callback (gsdest->nbd, buffer->data, + BUFFER_SIZE, buffer->offset, + finished_write, buffer, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); } - buffers[i].state = BUFFER_WRITING; /* We always unregister this idle handler because the read side * creates a new idle handler for every buffer that has to be @@ -467,9 +444,9 @@ write_data (gpointer user_data) /* This callback is called from libnbd when any write command finishes. */ static int -finished_write (unsigned valid_flag, void *vp, int64_t wcookie, int *error) +finished_write (unsigned valid_flag, void *vp, int64_t unused, int *error) { - size_t i; + struct buffer *buffer = vp; if (!(valid_flag & LIBNBD_CALLBACK_VALID)) return 0; @@ -479,20 +456,11 @@ finished_write (unsigned valid_flag, void *vp, int64_t wcookie, int *error) DEBUG (gsdest, "finished_write: write completed"); - /* Find the corresponding buffer and free it. */ - for (i = 0; i < nr_buffers; ++i) { - if (buffers[i].wcookie == wcookie) - goto found; - } - /* This should never happen. */ - abort (); - - found: - g_free (buffers[i].data); - memmove (&buffers[i], &buffers[i+1], - sizeof (struct buffer) * (nr_buffers-(i+1))); + assert (buffer->state == BUFFER_WRITING); + g_free (buffer->data); + buffer->data = NULL; + buffer->state = BUFFER_UNUSED; nr_buffers--; - buffers[nr_buffers].data = NULL; /* If the number of buffers was MAX_BUFFERS and has now gone down to * MAX_BUFFERS-1 then we need to restart the read handler. -- 2.22.0
Richard W.M. Jones
2019-Jul-30 10:54 UTC
Re: [Libguestfs] [PATCH libnbd] examples: Fix theoretical cookie race in example.
On Tue, Jul 30, 2019 at 11:51:40AM +0100, Richard W.M. Jones wrote:> Previously discussed here: > https://www.redhat.com/archives/libguestfs/2019-July/msg00213.html > > It turns out that deferring callbacks is a PITA. (It would be a bit > easier if C has closures.) However by rewriting the example we can > avoid the need to use the cookie at all and make it run a bit more > efficiently, so let's do that instead.I was going to say also: The callbacks pass the command cookie back to the caller, and are thus all potentially racy. There seem to be two possibilities here: (1) Drop the cookie parameter entirely. (2) Document the possible race so that users don't use the cookie inappropritely. I think I favour option (1). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Eric Blake
2019-Jul-30 14:57 UTC
Re: [Libguestfs] [PATCH libnbd] examples: Fix theoretical cookie race in example.
On 7/30/19 5:51 AM, Richard W.M. Jones wrote:> There was a theoretic race in this example: If the server was very > fast at handling commands then it's possible that in a call such as: > > cookie = nbd_aio_pread_callback (..., callback, ...); > buffers[i].cookie = cookie; > > nbd_aio_pread_callback finished and calls the callback before > returning. buffers[i].cookie would therefore not be set, but the > callback() function was checking the list of buffers for the cookie. > This would have caused an abort() in the existing code, although we > have never observed that. > > The new code, instead of relying on the cookie, passes a pointer to > &buffers[i], so the callback no longer needs to search the list of > buffers (this is also quicker). > > This necessitated another change however: previously we were copying > buffers around to ensure that the next free buffer was always at > &buffers[nr_buffers]. Instead of that introduce a new state > (BUFFER_UNUSED) and search the short list of buffers once when looking > for a free buffer. > > Thanks: Eric Blake. > --- > examples/glib-main-loop.c | 96 +++++++++++++-------------------------- > 1 file changed, 32 insertions(+), 64 deletions(-)Avoids a potential race, and uses fewer lines of code. What's not to like about it :) ACK. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Eric Blake
2019-Jul-30 15:01 UTC
Re: [Libguestfs] [PATCH libnbd] examples: Fix theoretical cookie race in example.
On 7/30/19 5:54 AM, Richard W.M. Jones wrote:> On Tue, Jul 30, 2019 at 11:51:40AM +0100, Richard W.M. Jones wrote: >> Previously discussed here: >> https://www.redhat.com/archives/libguestfs/2019-July/msg00213.html >> >> It turns out that deferring callbacks is a PITA. (It would be a bit >> easier if C has closures.) However by rewriting the example we can >> avoid the need to use the cookie at all and make it run a bit more >> efficiently, so let's do that instead. > > I was going to say also: > > The callbacks pass the command cookie back to the caller, and are thus > all potentially racy. There seem to be two possibilities here: (1) > Drop the cookie parameter entirely. (2) Document the possible race so > that users don't use the cookie inappropritely. > > I think I favour option (1).We already have option (2) to some extent; libnbd.pod states: The completion callback will be invoked with C<cookie> set to the same value returned by the original API such as C<nbd_aio_pread_callback> (in rare cases, it is possible that the completion callback may fire before the original API has returned). But I also favor option (1); my recent changes to nbdkit to use auto-retire also managed to get rid of the need to rely on the cookie argument. The cookie is still important for clients not using auto-retire, but by getting rid of the cookie parameter to the completion callback, we are forcing the client to do their own mapping of their void* parameter to determine which command is being completed - but that turns out to be the most efficient anyways (both nbdkit and your glib example got that that point without too much difficulty). It's another API/ABI break, but I don't see it as being detrimental (none of our existing completion callbacks are too difficult to keep working if they lose the cookie parameter). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Reasonably Related Threads
- [PATCH libnbd v2] examples: Include an example of integrating with glib main loop.
- [PATCH libnbd] examples: Include an example of integrating with the glibc main loop.
- [PATCH libnbd] examples: Fix theoretical cookie race in example.
- Re: [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop.
- [PATCH libnbd v3 0/2] lib: Implement closure lifetimes.