thr3ads.net - Libguestfs - Re: [Libguestfs] [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop. [Jul 2019]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2019-Jul-17 13:40 UTC

Re: [Libguestfs] [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop.

On 7/17/19 8:00 AM, Richard W.M. Jones wrote:> ---
>  .gitignore                |   1 +
>  README                    |   2 +
>  configure.ac              |   9 +
>  examples/Makefile.am      |  22 ++
>  examples/glib-main-loop.c | 511 ++++++++++++++++++++++++++++++++++++++
>  5 files changed, 545 insertions(+)
Looks good.
> 
> +  revents = g_source_query_unix_fd ((GSource *) source, source->tag);
> +
> +  DEBUG (source, "dispatch: revents = 0x%x%s%s",
> +         revents,
> +         revents & G_IO_IN ? " G_IO_IN" : "",
> +         revents & G_IO_OUT ? " G_IO_OUT" : "");
> +
> +  r = 0;
> +  if ((revents & G_IO_IN) != 0)
> +    r = nbd_aio_notify_read (source->nbd);
> +  else if ((revents & G_IO_OUT) != 0)
> +    r = nbd_aio_notify_write (source->nbd);
> +
If revents ever contains G_IO_ERR/G_IO_HUP, and if we ever add
nbd_aio_notify_error(), we should inform the state machine about that fd
problem.  I've mentioned that idea before; I guess it is more of an
optimization - we will probably notice the problem eventually if the
client ever tries to issue another command or detects EOF while reading
a response; but tying up the resource until the next client command vs.
identifying the problem immediately may matter to some clients.

Also, there may still be a possible race:
> +/* This idle callback reads data from the source nbdkit until the ring
> + * is full.
> + */
> +static gboolean
> +read_data (gpointer user_data)
> +{
> +  static uint64_t posn = 0;
> +  buffers[i].rcookie > +    nbd_aio_pread_callback (gssrc->nbd,
buffers[i].data,
> +                            BUFFER_SIZE, buffers[i].offset,
> +                            finished_read, NULL, 0);
It may be possible in rare situations that the libnbd state machine can
send() the command AND see data ready to recv() (perhaps from a previous
command still in flight), where it ends up read()ing until blocking and
happens to get the server's reply to this command and fire off the
callback, all prior to the nbd_aio_pread_callback() returning.  If that
ever happens, then buffers[i].rcookie will still be unset at the time
the callback fires...
> +/* This callback is called from libnbd when any read command finishes. */
> +static int
> +finished_read (void *vp, int64_t rcookie, int *error)
> +{
> +  size_t i;
> +
> +  if (gssrc == NULL)
> +    return 0;
> +
> +  DEBUG (gssrc, "finished_read: read completed");
> +
> +  /* Find the corresponding buffer and mark it as completed. */
> +  for (i = 0; i < nr_buffers; ++i) {
> +    if (buffers[i].rcookie == rcookie)
> +      goto found;
> +  }
> +  /* This should never happen. */
> +  abort ();
...such that we can hit this abort().  (Same for the write callback).
But I don't know how likely that scenario is in practice, and don't see
it as a reason to defer committing this patch as-is.  I don't know if
there is any way to rework the state machine to guarantee that a
completion callback will not fire during the same aio_FOO_callback that
registered the callback (that is, to guarantee that h->lock will be
dropped and must be re-obtained prior to firing off a callback), but
it's worth thinking about.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Eric Blake

2019-Jul-17 18:34 UTC

head link

Re: [Libguestfs] [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop.

On 7/17/19 8:40 AM, Eric Blake wrote:
> 
> It may be possible in rare situations that the libnbd state machine can
> send() the command AND see data ready to recv() (perhaps from a previous
> command still in flight), where it ends up read()ing until blocking and
> happens to get the server's reply to this command and fire off the
> callback, all prior to the nbd_aio_pread_callback() returning.  If that
> ever happens, then buffers[i].rcookie will still be unset at the time
> the callback fires...
> 
>> +/* This callback is called from libnbd when any read command finishes.
*/
>> +static int
>> +finished_read (void *vp, int64_t rcookie, int *error)
>> +{
>> +  size_t i;
>> +
>> +  if (gssrc == NULL)
>> +    return 0;
>> +
>> +  DEBUG (gssrc, "finished_read: read completed");
>> +
>> +  /* Find the corresponding buffer and mark it as completed. */
>> +  for (i = 0; i < nr_buffers; ++i) {
>> +    if (buffers[i].rcookie == rcookie)
>> +      goto found;
>> +  }
>> +  /* This should never happen. */
>> +  abort ();
> 
> ...such that we can hit this abort().  (Same for the write callback).
> But I don't know how likely that scenario is in practice, and don't
see
> it as a reason to defer committing this patch as-is.  I don't know if
> there is any way to rework the state machine to guarantee that a
> completion callback will not fire during the same aio_FOO_callback that
> registered the callback (that is, to guarantee that h->lock will be
> dropped and must be re-obtained prior to firing off a callback), but
> it's worth thinking about.
An idea we had on IRC: instead of having nbd_aio_pread_callback which
both creates the cookie and submits the command with callback, we could
instead have: nbd_aio_pread(...) creates a command and returns a cookie
but does NOT send it; then nbd_aio_submit(nbd, cookie, callback, opaque)
which can be used to submit ANY nbd_aio_FOO command based on its cookie.
That solves the race for the callback ever being fired before the user
has had a chance to track the cookie, and makes it so that only one
command has to register callback functions (rather than one counterpart
per aio_FOO command).

Or maybe even three parts:

 nbd_aio_pread_structured() => creates cookie, called once
 nbd_aio_add_callback(cookie, callback, user_data) - called 0 or more
times before submit; callbacks run in LIFO order at command completion
 nbd_aio_submit(cookie) => kicks off the command

Why do it this way? Because in the Python bindings, right now we have to
malloc() a C structure that tracks the Python Callable associated with
any Closure, and currently don't free that C struct until nbd_close.
Basically, we have:

nbd_internal_py_aio_pread_structured
  data = malloc()
  nbd_aio_pread_structured(, chunk_wrapper, data)
  nbd_add_close_callback(data_free, data)
nbd_internal_py_aio_pread_structured_callback
  data = malloc()
  nbd_aio_pread_structured_callback(, chunk_wrapper, callback_wrapper, data)
  nbd_add_close_callback(data_free, data)

where there is no convenient way for the generator to insert cleanup for
the data malloc'd in aio_pread_structured (for
aio_pread_structured_callback, it could use the callback_wrapper - but
handling the two APIs differently requires hairy generator refactoring).
 But with a three-part split, we could make the generator produce:

nbd_internal_py_aio_pread_structured
  data1 = malloc()
  cookie = nbd_aio_pread_structured(, chunk_wrapper, data1)
  nbd_aio_add_callback(cookie, data_free, data1)
nbd_internal_py_aio_add_callback
  data2 = malloc()
  nbd_aio_add_callback(cookie, data_free, data2)
  nbd_aio_add_callback(cookie, callback_wrapper, data2)
nbd_internal_py_aio_submit
  nbd_aio_submit(cookie)

where, during the course of the command being in flight, we call
chunk_wrapper(data1) as many times as needed, then when the command
completes, we call the callback wrappers in reverse order:
callback_wrapper(data2)
data_free(data2)
data_free(data1)

and thus avoid the need to use nbd_add_close_callback.  We'd have to
tweak the Closure element to take an enum of dispositions instead of its
current bool persistent (use stack storage, dispose via
add_close_callback, dispose via aio_add_callback after returned cookie
obtained from underlying API, dispose via aio_add_callback using
incoming cookie argument prior to underlying API), but that still seems
easier than cross-linking the code for Python aio_pread_structured to
call the C aio_pread_structured_callback under the hood, and offers
enough separation that the generator does not have to be aware of
cross-API ties.

It may also simplify the generator in that we'd be back to never having
more than one Closure per function (compared to our current two
functions in nbd_aio_pread_structured_callback and
nbd_aio_block_status_callback), where handling a Closure in the
generator doesn't need List.iter.

The other thing this example pointed out was that in a single-threaded
main loop, the callback function can't retire a command itself, but it
is hairy to wire up anything else to retire the command and nothing
further is to be learned by retiring it. One idea was to change the
callback semantics: if a callback returns 1, the command is
auto-retired; if it returns 0 or -1, you still have to use
nbd_aio_command_completed to retire the command (works with the existing
API when there is at most one callback, but gets trickier with my above
split of allowing more than one LIFO callback addition: if the first
callback returns 1, but a later callback returns -1, does the command
still need to be auto-retired?).  Or, instead of making the callback
return value special, we could add a flag to the proposed
nbd_aio_submit(cookie, bool) where the bool determines if the command
should be auto-retired.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Richard W.M. Jones

2019-Jul-18 11:19 UTC

head link

Re: [Libguestfs] [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop.

I had a long think about all this today and I'm not really any closer
to knowing what the solution should be.  However some observations,
mostly obvious:

* Storing the commands in linked lists probably won't work well when
we depend more and more on quickly looking them up by cookie number.
So we'll have to change the way they are stored, probably a simple
hash on the bottom bits of the cookie number is fine.

* If we break out the create command / submit command steps then we
will need a separate cmds_pending list (or hash) before commands are
added to cmds_to_issue.

* If we allow commands to be constructed in steps:

  cookie = nbd_aio_pread (nbd, ...);
  nbd_aio_add_callback (nbd, cookie, ...);
  nbd_aio_set_autoretire (nbd,cookie, ...);
  nbd_aio_submit (nbd, cookie);

then we end up having a whole lot more locking overhead which may
matter.  This is an argument for more fine-grained locking perhaps.

* Also the API becomes a whole lot more complex to use and therefore
error-prone.

* At the moment we allow multiple callbacks in the Closure parameter.
But I think we won't need that, and can probably simplify the
generator to allow a single callback per Closure.  Also that allows us
to have different scopes per callback (ie. persistent, one-shot, etc.)
in the same function.  [You pointed this out in your email already]

* Maybe a case for adding an Optional (Closure ...) type to allow a
callback to be nullable.  In Python & OCaml they are currently not
nullable.  Unclear if they are or are supposed to be in C.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

Apparently Analagous Threads

Search for more seemingly similar threads

Libguestfs - Jul 2019 - Re: [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop.

Re: [Libguestfs] [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop.

Re: [Libguestfs] [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop.

Re: [Libguestfs] [PATCH libnbd v2] examples: Include an example of integrating with the glib main loop.

Apparently Analagous Threads