Roger Pau Monne
2012-Jan-13 11:57 UTC
[PATCH] libxl: don''t set backend state to 5 when trying to unplug a device
# HG changeset patch # User Roger Pau Monne <roger.pau@entel.upc.edu> # Date 1326454785 -3600 # Node ID 887a3229fd7a50c04981e29709bc7210dafef38f # Parent 5b2676ac13218951698c49fa0350f2ac48220f3d libxl: don''t set backend state to 5 when trying to unplug a device libxl__device_remove was setting backend state to 5, which could create a race condition, since the previous check for state != 4 and setting state to 5 was not done inside of the same transaction, so the kernel could change the state to 6 in the space between the check for state != 4 and setting state to 5. I might be wrong, but since I don''t think setting backend state to 5 helps in any way when disconnecting a device I''ve just removed the xs_write. If this is necessary, the state != 4 check and setting it to 5 should happen inside the same transaction, to avoid the kernel from changing the state behind our back. Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> diff -r 5b2676ac1321 -r 887a3229fd7a tools/libxl/libxl_device.c --- a/tools/libxl/libxl_device.c Mon Jan 09 16:01:44 2012 +0100 +++ b/tools/libxl/libxl_device.c Fri Jan 13 12:39:45 2012 +0100 @@ -456,7 +456,6 @@ int libxl__device_remove(libxl__gc *gc, retry_transaction: t = xs_transaction_start(ctx->xsh); xs_write(ctx->xsh, t, libxl__sprintf(gc, "%s/online", be_path), "0", strlen("0")); - xs_write(ctx->xsh, t, state_path, "5", strlen("5")); if (!xs_transaction_end(ctx->xsh, t, 0)) { if (errno == EAGAIN) goto retry_transaction;
Ian Campbell
2012-Jan-13 13:07 UTC
Re: [PATCH] libxl: don''t set backend state to 5 when trying to unplug a device
On Fri, 2012-01-13 at 11:57 +0000, Roger Pau Monne wrote:> # HG changeset patch > # User Roger Pau Monne <roger.pau@entel.upc.edu> > # Date 1326454785 -3600 > # Node ID 887a3229fd7a50c04981e29709bc7210dafef38f > # Parent 5b2676ac13218951698c49fa0350f2ac48220f3d > libxl: don''t set backend state to 5 when trying to unplug a device > > libxl__device_remove was setting backend state to 5, which could > create a race condition, since the previous check for state != 4 and > setting state to 5 was not done inside of the same transaction, so the > kernel could change the state to 6 in the space between the check for > state != 4 and setting state to 5. > > I might be wrong, but since I don''t think setting backend state to 5 > helps in any way when disconnecting a deviceIt''s the exact thing which makes the disconnect happen at all, isn''t it? Some backends (particularly the Linux ones) might also use the online node but I don''t think that behaviour is universal.> I''ve just removed the > xs_write. If this is necessary, the state != 4 check and setting it > to 5 should happen inside the same transaction, to avoid the kernel > from changing the state behind our back.I think that is the right solution. Ian.> Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> > > diff -r 5b2676ac1321 -r 887a3229fd7a tools/libxl/libxl_device.c > --- a/tools/libxl/libxl_device.c Mon Jan 09 16:01:44 2012 +0100 > +++ b/tools/libxl/libxl_device.c Fri Jan 13 12:39:45 2012 +0100 > @@ -456,7 +456,6 @@ int libxl__device_remove(libxl__gc *gc, > retry_transaction: > t = xs_transaction_start(ctx->xsh); > xs_write(ctx->xsh, t, libxl__sprintf(gc, "%s/online", be_path), "0", strlen("0")); > - xs_write(ctx->xsh, t, state_path, "5", strlen("5")); > if (!xs_transaction_end(ctx->xsh, t, 0)) { > if (errno == EAGAIN) > goto retry_transaction; > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel
Roger Pau Monné
2012-Jan-13 13:10 UTC
Re: [PATCH] libxl: don''t set backend state to 5 when trying to unplug a device
2012/1/13 Ian Campbell <Ian.Campbell@citrix.com>:> On Fri, 2012-01-13 at 11:57 +0000, Roger Pau Monne wrote: >> # HG changeset patch >> # User Roger Pau Monne <roger.pau@entel.upc.edu> >> # Date 1326454785 -3600 >> # Node ID 887a3229fd7a50c04981e29709bc7210dafef38f >> # Parent 5b2676ac13218951698c49fa0350f2ac48220f3d >> libxl: don't set backend state to 5 when trying to unplug a device >> >> libxl__device_remove was setting backend state to 5, which could >> create a race condition, since the previous check for state != 4 and >> setting state to 5 was not done inside of the same transaction, so the >> kernel could change the state to 6 in the space between the check for >> state != 4 and setting state to 5. >> >> I might be wrong, but since I don't think setting backend state to 5 >> helps in any way when disconnecting a device > > It's the exact thing which makes the disconnect happen at all, isn't it?What makes the disconnect happen on NetBSD al least is removing the frontend or setting the frontend state to 6, but this doesn't do anything at all (it might be different on Linux though). Can someone confirm that this is actually useful on Linux?> Some backends (particularly the Linux ones) might also use the online > node but I don't think that behaviour is universal. > >> I've just removed the >> xs_write. If this is necessary, the state != 4 check and setting it >> to 5 should happen inside the same transaction, to avoid the kernel >> from changing the state behind our back. > > I think that is the right solution. > > Ian. > >> Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> >> >> diff -r 5b2676ac1321 -r 887a3229fd7a tools/libxl/libxl_device.c >> --- a/tools/libxl/libxl_device.c Mon Jan 09 16:01:44 2012 +0100 >> +++ b/tools/libxl/libxl_device.c Fri Jan 13 12:39:45 2012 +0100 >> @@ -456,7 +456,6 @@ int libxl__device_remove(libxl__gc *gc, >> retry_transaction: >> t = xs_transaction_start(ctx->xsh); >> xs_write(ctx->xsh, t, libxl__sprintf(gc, "%s/online", be_path), "0", strlen("0")); >> - xs_write(ctx->xsh, t, state_path, "5", strlen("5")); >> if (!xs_transaction_end(ctx->xsh, t, 0)) { >> if (errno == EAGAIN) >> goto retry_transaction; >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2012-Jan-13 13:20 UTC
Re: [PATCH] libxl: don''t set backend state to 5 when trying to unplug a device
On Fri, 2012-01-13 at 13:10 +0000, Roger Pau Monné wrote:> 2012/1/13 Ian Campbell <Ian.Campbell@citrix.com>: > > On Fri, 2012-01-13 at 11:57 +0000, Roger Pau Monne wrote: > >> # HG changeset patch > >> # User Roger Pau Monne <roger.pau@entel.upc.edu> > >> # Date 1326454785 -3600 > >> # Node ID 887a3229fd7a50c04981e29709bc7210dafef38f > >> # Parent 5b2676ac13218951698c49fa0350f2ac48220f3d > >> libxl: don't set backend state to 5 when trying to unplug a device > >> > >> libxl__device_remove was setting backend state to 5, which could > >> create a race condition, since the previous check for state != 4 and > >> setting state to 5 was not done inside of the same transaction, so the > >> kernel could change the state to 6 in the space between the check for > >> state != 4 and setting state to 5. > >> > >> I might be wrong, but since I don't think setting backend state to 5 > >> helps in any way when disconnecting a device > > > > It's the exact thing which makes the disconnect happen at all, isn't it? > > What makes the disconnect happen on NetBSD al least is removing the > frontend or setting the frontend state to 6, but this doesn't do > anything at all (it might be different on Linux though).Linux certainly uses state 5 (AKA XenbusStateClosing) in the state machine in at least netfront+back, blkfront+back pcifront+back and fbfront (fbback is not an in kernel driver). e.g. when netfront sees netback go to closing then it will shut itself down, see drivers/net/xen-netfront.c:netback_changed> Can someone confirm that this is actually useful on Linux?I think really any changes in these areas should be backed up with empirical experiments on a variety of system types (both front and back), otherwise we are basing things on supposition and heresay about how things are supposed to/do work. No one really knows for sure (witness the number of times we've all gone round on this). Ian.> > > Some backends (particularly the Linux ones) might also use the online > > node but I don't think that behaviour is universal. > > > >> I've just removed the > >> xs_write. If this is necessary, the state != 4 check and setting it > >> to 5 should happen inside the same transaction, to avoid the kernel > >> from changing the state behind our back. > > > > I think that is the right solution. > > > > Ian. > > > >> Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> > >> > >> diff -r 5b2676ac1321 -r 887a3229fd7a tools/libxl/libxl_device.c > >> --- a/tools/libxl/libxl_device.c Mon Jan 09 16:01:44 2012 +0100 > >> +++ b/tools/libxl/libxl_device.c Fri Jan 13 12:39:45 2012 +0100 > >> @@ -456,7 +456,6 @@ int libxl__device_remove(libxl__gc *gc, > >> retry_transaction: > >> t = xs_transaction_start(ctx->xsh); > >> xs_write(ctx->xsh, t, libxl__sprintf(gc, "%s/online", be_path), "0", strlen("0")); > >> - xs_write(ctx->xsh, t, state_path, "5", strlen("5")); > >> if (!xs_transaction_end(ctx->xsh, t, 0)) { > >> if (errno == EAGAIN) > >> goto retry_transaction; > >> > >> _______________________________________________ > >> Xen-devel mailing list > >> Xen-devel@lists.xensource.com > >> http://lists.xensource.com/xen-devel > > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Roger Pau Monné
2012-Jan-13 15:24 UTC
Re: [PATCH] libxl: don''t set backend state to 5 when trying to unplug a device
2012/1/13 Ian Campbell <Ian.Campbell@citrix.com>:> On Fri, 2012-01-13 at 13:10 +0000, Roger Pau Monné wrote: >> 2012/1/13 Ian Campbell <Ian.Campbell@citrix.com>: >> > On Fri, 2012-01-13 at 11:57 +0000, Roger Pau Monne wrote: >> >> # HG changeset patch >> >> # User Roger Pau Monne <roger.pau@entel.upc.edu> >> >> # Date 1326454785 -3600 >> >> # Node ID 887a3229fd7a50c04981e29709bc7210dafef38f >> >> # Parent 5b2676ac13218951698c49fa0350f2ac48220f3d >> >> libxl: don't set backend state to 5 when trying to unplug a device >> >> >> >> libxl__device_remove was setting backend state to 5, which could >> >> create a race condition, since the previous check for state != 4 and >> >> setting state to 5 was not done inside of the same transaction, so the >> >> kernel could change the state to 6 in the space between the check for >> >> state != 4 and setting state to 5. >> >> >> >> I might be wrong, but since I don't think setting backend state to 5 >> >> helps in any way when disconnecting a device >> > >> > It's the exact thing which makes the disconnect happen at all, isn't it? >> >> What makes the disconnect happen on NetBSD al least is removing the >> frontend or setting the frontend state to 6, but this doesn't do >> anything at all (it might be different on Linux though). > > Linux certainly uses state 5 (AKA XenbusStateClosing) in the state > machine in at least netfront+back, blkfront+back pcifront+back and > fbfront (fbback is not an in kernel driver). > > e.g. when netfront sees netback go to closing then it will shut itself > down, see drivers/net/xen-netfront.c:netback_changed > >> Can someone confirm that this is actually useful on Linux? > > I think really any changes in these areas should be backed up with > empirical experiments on a variety of system types (both front and > back), otherwise we are basing things on supposition and heresay about > how things are supposed to/do work. No one really knows for sure > (witness the number of times we've all gone round on this).Ok, I'm doing a new patch that enclosures everything inside a transaction, and I will test it both on NetBSD and Linux. However, I have a doubt, can we assume that a transaction that only performs a read will always succeed (xs_transaction_end will never return < 0)? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2012-Jan-16 12:14 UTC
Re: [PATCH] libxl: don''t set backend state to 5 when trying to unplug a device
Roger Pau Monné writes ("Re: [Xen-devel] [PATCH] libxl: don''t set backend state to 5 when trying to unplug a device"):> 2012/1/13 Ian Campbell <Ian.Campbell@citrix.com>: > > I think really any changes in these areas should be backed up with > > empirical experiments on a variety of system types (both front and > > back), otherwise we are basing things on supposition and heresay about > > how things are supposed to/do work. No one really knows for sure > > (witness the number of times we''ve all gone round on this).I agree that we don''t seem to know what''s going on. One way to try to understand more would be to look at the xend code, since that''s what everyone currently has to work with.> Ok, I''m doing a new patch that enclosures everything inside a > transaction, and I will test it both on NetBSD and Linux. However, I > have a doubt, can we assume that a transaction that only performs a > read will always succeed (xs_transaction_end will never return < 0)?I think the answer is "yes" (or alternatively "you can ignore the error if there is one") but I can''t see any good reason for wanting to know the answer to that question. Ian.