thr3ads.net - Libguestfs - Re: [Libguestfs] [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region. [Jun 2019]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2019-Jun-05 15:23 UTC

Re: [Libguestfs] [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region.

On 6/5/19 6:15 AM, Richard W.M. Jones wrote:> Split h->state into:
> 
>  - h->state = the state on entry to the locked region
> 
>  - h->next_state = the current state and what the "publicly
visible"
>    state will become when we leave the locked region
The rest of this thread discusses potential other names, such as
h->public_state for the state visible to unlocked code and h->state for
internal while still locked; I trust that once you pick a naming scheme
you like, you can redo this patch accordingly. I'll go ahead and review
the changes as spelled here to at least see if I can spot any problems.
> 
> Some calls to get_state become calls to get_next_state depending on
> which of these they are trying to read.  Calls to set_state become
> set_next_state because that is what gets updated.
> 
> When we leave the locked region we update h->state.
> 
> The purpose of this patch is to make it easier to reason about the
> state in lockless code.
Lockless code (including nbd_aio_get_direction) can no longer see
intermediate states that did not block, which is a lot nicer for
deciding whether we will accidentally let the user's poll() loop block
forever if it didn't see that we were blocked on POLLIN or POLLOUT.

There's still always a TOCTTOU race (anywhere the user reads the public
state while another thread holds the lock, their future actions based on
the observed state may be unexpected by the time they actually grab the
lock because the other thread changed state), as well as the issue I
mentioned in on 3/4 about any consecutive user calls to nbd_aio_is_*
having a race where the two calls can be done on different public states.

I'm still thinking that for any functions that grab the lock but want to
do gating on state MUST save the public state check until after grabbing
the lock, not beforehand, to avoid the TOCTTOU race.
> ---
>  generator/generator | 23 +++++++++++++----------
>  lib/connect.c       | 10 +++++-----
>  lib/disconnect.c    |  8 ++++----
>  lib/handle.c        |  2 +-
>  lib/internal.h      | 15 +++++++++++++--
>  lib/rw.c            |  4 ++--
>  6 files changed, 38 insertions(+), 24 deletions(-)
> 
> @@ -2462,7 +2462,7 @@ let generate_lib_states_c () >    pr " 
}\n";
>    pr "\n";
>    pr "  set_error (EINVAL, \"external event %%d is invalid in
state %%s\",\n";
> -  pr "             ev, nbd_internal_state_short_string (get_state
(h)));\n";
> +  pr "             ev, nbd_internal_state_short_string
(get_next_state (h)));\n";
I'm also wondering if we should treat external events more like
interrupts, where calling nbd_aio_notify_read() in an incorrect state
succeeds by setting a flag that is not cleared until we later reach a
state that can actually care about the flag, rather than by rejecting
the call out-right for being in the wrong state.  (That is, states where
we are NOT expecting a notify_read() are the same as setting an IRQ mask
that we don't want to be interrupted by a a read notification yet, but
we don't want to lose the interrupt when we later clear the mask)

In fact, CmdIssue is somewhat already like this - the code determines if
the state machine is unmasked (in the READY state) to kick it off
immediately; or is masked (in a processing state) to queue things but
leave the interrupt set (the next time we enter READY, we unmask,
observe the queue is non-empty, and so fire off the next command).

Another thought - if we ever want to allow the user the ability to send
custom NBD_OPT_ commands during handshake, or even to emulate 'qemu-nbd
--list' where the client can make queries to see what the server
supports before finally settling on whether to run NBD_OPT_GO or
NBD_OPT_ABORT, we'll need to add an external event after nbd_aio_connect
but before nbd_aio_is_connected for doing those additional handshake
steps. It's easier to think about adding a mandatory nbd_aio_go() called
in between nbd_aio_connect*() and the first nbd_aio_pread now, before we
have to start worrying about back-compat issues to all existing AIO clients.

> @@ -2866,8 +2866,11 @@ let generate_lib_api_c () As said above, I think you're still missing a change at the beginning of
the wrapper that grabs the lock to do the state check after the lock.
>      let argnames = List.flatten (List.map name_of_arg args) in
>      List.iter (pr ", %s") argnames;
>      pr ");\n";
> -    if is_locked then
> -      pr "  pthread_mutex_unlock (&h->lock);\n";
> +    if is_locked then (
> +      pr "  if (h->state != h->next_state)\n";
> +      pr "    h->state = h->next_state;\n";
> +      pr "  pthread_mutex_unlock (&h->lock);\n"
> +    );
>      pr "  return ret;\n";
>      pr "}\n";
>      pr "\n";
> diff --git a/lib/connect.c b/lib/connect.c
> index b889f80..4e3141f 100644
> --- a/lib/connect.c
> +++ b/lib/internal.h
> @@ -80,7 +80,17 @@ struct nbd_handle {
>    /* Linked list of close callbacks. */
>    struct close_callback *close_callbacks;
>  
> -  _Atomic enum state state;     /* State machine. */
> +  /* State machine.
> +   *
> +   * The actual current state is ‘next_state’.  ‘state’ is updated
> +   * before we release the lock.
> +   *
> +   * Note don't access these fields directly, use the SET_NEXT_STATE
> +   * macro in generator/states* code, or the set_next_state,
> +   * get_next_state and get_state macros in regular code.
> +   */
> +  _Atomic enum state state;
> +  enum state next_state;
>  
Otherwise the patch looks like you caught the right places.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Richard W.M. Jones

2019-Jun-08 17:45 UTC

head link

Re: [Libguestfs] [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region.

I think in general we should have a good idea of the thread model that
we support with libnbd.  At the moment we have a handle lock which
only means that simultaneous calls into the handle won't corrupt it;
however it doesn't mean that you don't expect the handle state to
change between calls if the handle is accessible by two threads.  Is
that the model we trying to achieve or something else?  Does libnbd
need to support more complex thread models or can we rely on callers
to use their own mutex on top?

More comments below ...

On Wed, Jun 05, 2019 at 10:23:57AM -0500, Eric Blake
wrote:> I'm also wondering if we should treat external events more like
> interrupts, where calling nbd_aio_notify_read() in an incorrect state
> succeeds by setting a flag that is not cleared until we later reach a
> state that can actually care about the flag, rather than by rejecting
> the call out-right for being in the wrong state.  (That is, states where
> we are NOT expecting a notify_read() are the same as setting an IRQ mask
> that we don't want to be interrupted by a a read notification yet, but
> we don't want to lose the interrupt when we later clear the mask)
Yes, I think this is a good idea, and not difficult to implement.
> In fact, CmdIssue is somewhat already like this - the code determines if
> the state machine is unmasked (in the READY state) to kick it off
> immediately; or is masked (in a processing state) to queue things but
> leave the interrupt set (the next time we enter READY, we unmask,
> observe the queue is non-empty, and so fire off the next command).
CmdIssue is also an external event.  If it was a level-triggered
interrupt then we'd set a flag in the handle when it happens, and the
state machine could do all of the above in generated code.

The only potential problem I can think of at the moment is the usual
one with interrupts which is that second and subsequent interrupts are
lost if the first interrupt hasn't been processed.  However for the
external events we have at the moment:

  | NotifyRead                  (* fd becomes ready to read *)
  | NotifyWrite                 (* fd becomes ready to write *)
  | CmdCreate                   (* [nbd_create] function called *)
  | CmdConnectSockAddr          (* [nbd_aio_connect] function called *)
  | CmdConnectUnix              (* [nbd_aio_connect_unix] *)
  | CmdConnectTCP               (* [nbd_aio_connect_tcp] *)
  | CmdConnectCommand           (* [nbd_aio_connect_command] *)
  | CmdIssue                    (* issuing an NBD command *)

it seems as if this wouldn't be an issue for any of them.  My
reasoning is: NotifyRead/Write work like this in the kernel already.
CmdCreate is only sent once.  CmdConnect* are only sent once.
CmdIssue can be sent multiple times but commands are queued so as long
as the writer side always checks the issue queue at appropriate points
we're OK.
> Another thought - if we ever want to allow the user the ability to send
> custom NBD_OPT_ commands during handshake, or even to emulate 'qemu-nbd
> --list' where the client can make queries to see what the server
> supports before finally settling on whether to run NBD_OPT_GO or
> NBD_OPT_ABORT, we'll need to add an external event after
nbd_aio_connect
> but before nbd_aio_is_connected for doing those additional handshake
> steps. It's easier to think about adding a mandatory nbd_aio_go()
called
> in between nbd_aio_connect*() and the first nbd_aio_pread now, before we
> have to start worrying about back-compat issues to all existing AIO
clients.
I envisaged that we'd just set a flag in the handle to indicate that
we were in "list mode".
> > @@ -2866,8 +2866,11 @@ let generate_lib_api_c () > 
> As said above, I think you're still missing a change at the beginning
of
> the wrapper that grabs the lock to do the state check after the lock.
AFAIK the code is safe now?  I'm not sure what you mean by this.
> >      let argnames = List.flatten (List.map name_of_arg args) in
> >      List.iter (pr ", %s") argnames;
> >      pr ");\n";
> > -    if is_locked then
> > -      pr "  pthread_mutex_unlock (&h->lock);\n";
> > +    if is_locked then (
> > +      pr "  if (h->state != h->next_state)\n";
> > +      pr "    h->state = h->next_state;\n";
> > +      pr "  pthread_mutex_unlock (&h->lock);\n"
> > +    );
> >      pr "  return ret;\n";
> >      pr "}\n";
> >      pr "\n";
> > diff --git a/lib/connect.c b/lib/connect.c
> > index b889f80..4e3141f 100644
> > --- a/lib/connect.c
> 
> > +++ b/lib/internal.h
> > @@ -80,7 +80,17 @@ struct nbd_handle {
> >    /* Linked list of close callbacks. */
> >    struct close_callback *close_callbacks;
> >  
> > -  _Atomic enum state state;     /* State machine. */
> > +  /* State machine.
> > +   *
> > +   * The actual current state is ‘next_state’.  ‘state’ is updated
> > +   * before we release the lock.
> > +   *
> > +   * Note don't access these fields directly, use the
SET_NEXT_STATE
> > +   * macro in generator/states* code, or the set_next_state,
> > +   * get_next_state and get_state macros in regular code.
> > +   */
> > +  _Atomic enum state state;
> > +  enum state next_state;
> >  
> 
> Otherwise the patch looks like you caught the right places.
Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

Richard W.M. Jones

2019-Jun-08 17:50 UTC

head link

Re: [Libguestfs] [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region.

On Sat, Jun 08, 2019 at 06:45:53PM +0100, Richard W.M. Jones
wrote:> > > @@ -2866,8 +2866,11 @@ let generate_lib_api_c () > > 
> > As said above, I think you're still missing a change at the
beginning of
> > the wrapper that grabs the lock to do the state check after the lock.
> 
> AFAIK the code is safe now?  I'm not sure what you mean by this.
Ignore this, your review of the updated patch makes it clear what you
mean.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

Apparently Analagous Threads

Search for more possibly parallel threads

Libguestfs - Jun 2019 - Re: [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region.

Re: [Libguestfs] [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region.

Re: [Libguestfs] [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region.

Re: [Libguestfs] [PATCH libnbd 4/4] lib: Atomically update h->state when leaving the locked region.

Apparently Analagous Threads