Roger Pau Monne
2012-Jan-14 13:41 UTC
[PATCH v2] libxl: Atomicaly check backend state and set it to 5 at device_remove
# HG changeset patch
# User Roger Pau Monne <roger.pau@entel.upc.edu>
# Date 1326548456 -3600
# Node ID 1b95948228b84c986764488cbb7683d0d109be90
# Parent 5b2676ac13218951698c49fa0350f2ac48220f3d
libxl: Atomicaly check backend state and set it to 5 at device_remove
libxl__device_remove was setting backend state to 5, which could
create a race condition, since the previous check for state != 4 and
setting state to 5 was not done inside the same transaction, so the
kernel could change the state to 6 in the space between the check for
state != 4 and setting it to 5.
The state != 4 check and setting it to 5 should happen in the same
transaction, to assure that nobody is modifying it behind our back.
Changes since v1:
* Do the check and set in the same transaction, instead of removing
the set state to 5.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu>
diff -r 5b2676ac1321 -r 1b95948228b8 tools/libxl/libxl_device.c
--- a/tools/libxl/libxl_device.c Mon Jan 09 16:01:44 2012 +0100
+++ b/tools/libxl/libxl_device.c Sat Jan 14 14:40:56 2012 +0100
@@ -442,19 +442,23 @@ int libxl__device_remove(libxl__gc *gc,
xs_transaction_t t;
char *be_path = libxl__device_backend_path(gc, dev);
char *state_path = libxl__sprintf(gc, "%s/state", be_path);
- char *state = libxl__xs_read(gc, XBT_NULL, state_path);
+ char *state;
int rc = 0;
- if (!state)
+retry_transaction:
+ t = xs_transaction_start(ctx->xsh);
+ state = libxl__xs_read(gc, t, state_path);
+ if (!state) {
+ xs_transaction_end(ctx->xsh, t, 0);
goto out;
+ }
if (atoi(state) != 4) {
+ xs_transaction_end(ctx->xsh, t, 0);
libxl__device_destroy_tapdisk(gc, be_path);
xs_rm(ctx->xsh, XBT_NULL, be_path);
goto out;
}
-retry_transaction:
- t = xs_transaction_start(ctx->xsh);
xs_write(ctx->xsh, t, libxl__sprintf(gc, "%s/online",
be_path), "0", strlen("0"));
xs_write(ctx->xsh, t, state_path, "5", strlen("5"));
if (!xs_transaction_end(ctx->xsh, t, 0)) {
Ian Jackson
2012-Feb-20 16:20 UTC
Re: [PATCH v2] libxl: Atomicaly check backend state and set it to 5 at device_remove
Roger Pau Monne writes ("[Xen-devel] [PATCH v2] libxl: Atomicaly check
backend state and set it to 5 at device_remove"):> libxl: Atomicaly check backend state and set it to 5 at device_remove
>
> libxl__device_remove was setting backend state to 5, which could
> create a race condition, since the previous check for state != 4 and
> setting state to 5 was not done inside the same transaction, so the
> kernel could change the state to 6 in the space between the check for
> state != 4 and setting it to 5.
>
> The state != 4 check and setting it to 5 should happen in the same
> transaction, to assure that nobody is modifying it behind our back.
This looks like a correct fix to me. However, the code in libxl has
changed now: libxl__device_remove is now
libxl__initiate_device_remove.
Also can you please arrange to use "goto out" style error handling ?
Eg something like:
+ xs_transaction_t t = 0;
...
out:
+ if (t) xs_transaction_end(ctx->xsh, t, 0);
device_remove_cleanup(gc, aorm);
TBH the meat of this function is in general rather unreconstructed.
Lots of unchecked xs_write etc. I don''t know if you want to fix that
while you''re there. I wimped out on doing that in my recent event
series. Up to you.
Really we should have some kind of more cooked xenstore transaction
arrangements, probably in the gc, and certainly with a suitable loop
macro so we can say:
IN_XENSTORE_TRANSACTION {
read, write, etc.
}
But that''s a task for another day and it''s not on the critical
path
for 4.2 so I''m putting it off.
Thanks,
Ian.
Roger Pau Monné
2012-Feb-21 09:46 UTC
Re: [PATCH v2] libxl: Atomicaly check backend state and set it to 5 at device_remove
2012/2/20 Ian Jackson <Ian.Jackson@eu.citrix.com>:> Roger Pau Monne writes ("[Xen-devel] [PATCH v2] libxl: Atomicaly check backend state and set it to 5 at device_remove"): >> libxl: Atomicaly check backend state and set it to 5 at device_remove >> >> libxl__device_remove was setting backend state to 5, which could >> create a race condition, since the previous check for state != 4 and >> setting state to 5 was not done inside the same transaction, so the >> kernel could change the state to 6 in the space between the check for >> state != 4 and setting it to 5. >> >> The state != 4 check and setting it to 5 should happen in the same >> transaction, to assure that nobody is modifying it behind our back. > > This looks like a correct fix to me. However, the code in libxl has > changed now: libxl__device_remove is now > libxl__initiate_device_remove. > > Also can you please arrange to use "goto out" style error handling ? > > Eg something like: > > + xs_transaction_t t = 0; > ... > out: > + if (t) xs_transaction_end(ctx->xsh, t, 0); > device_remove_cleanup(gc, aorm);I will commit an updated version of this patch with my hotplug series, that matches the current state of libxl__initiate_device_remove.> > TBH the meat of this function is in general rather unreconstructed. > Lots of unchecked xs_write etc. I don't know if you want to fix that > while you're there. I wimped out on doing that in my recent event > series. Up to you.Will decide that later, depending on the time frame.> Really we should have some kind of more cooked xenstore transaction > arrangements, probably in the gc, and certainly with a suitable loop > macro so we can say: > IN_XENSTORE_TRANSACTION { > read, write, etc. > } > But that's a task for another day and it's not on the critical path > for 4.2 so I'm putting it off.Also I've found myself using a lot a construction like the following: - check xenstore path - If path value is different than x - Set up a watch - Wait for events. And I'm afraid it's not concurrent safe, since someone can perform a bunch of xenstore changes in the space between setting the watch, and wait for events. The watcher will only get notified of the last xenstore change, but not all changes that happened between.> Thanks, > Ian._______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xensource.com/xen-devel
Ian Jackson
2012-Feb-21 12:28 UTC
Re: [PATCH v2] libxl: Atomicaly check backend state and set it to 5 at device_remove
Roger Pau Monné writes ("Re: [Xen-devel] [PATCH v2] libxl: Atomicaly check
backend state and set it to 5 at device_remove"):> Also I''ve found myself using a lot a construction like the
following:
>
> - check xenstore path
> - If path value is different than x
> - Set up a watch
> - Wait for events.
>
> And I''m afraid it''s not concurrent safe, since someone
can perform a
> bunch of xenstore changes in the space between setting the watch, and
> wait for events. The watcher will only get notified of the last
> xenstore change, but not all changes that happened between.
xenstore is not a message-passing protocol in that way. There is no
way to be sure that you see all the intermediate values of a
particular xenstore node. The required approach is for the protocol
not to care about intermediate values, or for the "sender" to
coordinate with the "receiver" somehow.
If all you mean is that you worry about the path changing between the
first read and the setting up of the watch, then you''re fine: every
watch is guaranteed to fire once right after you set it up, so your
watch handler function will definitely read the path again and you
won''t sit waiting for an event that never comes.
Ian.