Darren Shepherd
2013-Mar-02 22:23 UTC
libxl device_disk_add orphans blktap devices on transaction error
I''m using the CentOS 6 bundle of xen from http://dev.centos.org/centos/6/xen-c6/ and ran into an issue when creating domains with multiple VHD tap disks. Comparing unstable to the 4.2.1 code I''m using, it seems this issue still applies. I''m using a configuration line that looks something like disk = [ "tap:vhd:/var/norm/pools/agentTest/d894b704-b890-488d-b66e-1422b2b9a7a5.vhd,xvda,w", "tap:vhd:/var/norm/pools/agentTest/70bd3927-0e27-4830-9bf0-5b66ba290547.vhd,xvdb,w" ] What I noticed happening in the code is that in device_disk_add the call to libxl__xs_transaction_commit returns rc=1 so the "for (;;)" loop just tries again. The problem though is that the blktap device was already created and assigned to a tapdisk process. So it will just leave that process hanging out there and allocated a new device on the second pass through the for loop. The second run through succeeds. What I typically see is that if I have more than 1 disk on the domain, the first disk gets created fine and then each subsequent disk gets two tapdisk processes. Thanks, Darren
Roger Pau Monné
2013-Mar-04 10:59 UTC
Re: libxl device_disk_add orphans blktap devices on transaction error
On 02/03/13 23:23, Darren Shepherd wrote:> I''m using the CentOS 6 bundle of xen from > http://dev.centos.org/centos/6/xen-c6/ and ran into an issue when > creating domains with multiple VHD tap disks. Comparing unstable to > the 4.2.1 code I''m using, it seems this issue still applies. I''m > using a configuration line that looks something like > > disk = [ "tap:vhd:/var/norm/pools/agentTest/d894b704-b890-488d-b66e-1422b2b9a7a5.vhd,xvda,w", > "tap:vhd:/var/norm/pools/agentTest/70bd3927-0e27-4830-9bf0-5b66ba290547.vhd,xvdb,w" > ] > > What I noticed happening in the code is that in device_disk_add the > call to libxl__xs_transaction_commit returns rc=1 so the "for (;;)" > loop just tries again. The problem though is that the blktap device > was already created and assigned to a tapdisk process. So it will > just leave that process hanging out there and allocated a new device > on the second pass through the for loop. The second run through > succeeds. What I typically see is that if I have more than 1 disk on > the domain, the first disk gets created fine and then each subsequent > disk gets two tapdisk processes.Hello Darren Right now I don''t have a system with tapdisk, but I think the following patch should fix the problem, could you please try it and report back? Thanks, Roger. --- From e142569922ab810d960e05d26440fa1284e759e7 Mon Sep 17 00:00:00 2001 From: Roger Pau Monne <roger.pau@citrix.com> Date: Mon, 4 Mar 2013 11:36:41 +0100 Subject: [PATCH] libxl: remove tapdisk process if adding disk fails MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Destroy tapdisk process if transaction fails, and also destroy tapdisk process if disk adding fails. Reported-by: Darren Shepherd <darren.s.shepherd@gmail.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- tools/libxl/libxl.c | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 73e0dc3..30f2b04 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -2160,6 +2160,10 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid, rc = libxl__xs_transaction_commit(gc, &t); if (!rc) break; if (rc < 0) goto out; + if (disk->backend == LIBXL_DISK_BACKEND_TAP) + libxl__device_destroy_tapdisk(gc, GCSPRINTF("%s:%s", + libxl__device_disk_string_of_format(disk->format), + disk->pdev_path)); } aodev->dev = device; @@ -2171,6 +2175,10 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid, out: libxl__xs_transaction_abort(gc, &t); aodev->rc = rc; + if (disk->backend == LIBXL_DISK_BACKEND_TAP) + libxl__device_destroy_tapdisk(gc, GCSPRINTF("%s:%s", + libxl__device_disk_string_of_format(disk->format), + disk->pdev_path)); if (rc) aodev->callback(egc, aodev); return; } -- 1.7.7.5 (Apple Git-26)
Darren Shepherd
2013-Mar-04 21:42 UTC
Re: libxl device_disk_add orphans blktap devices on transaction error
That patch didn''t work. I think what''s happening is that in the successful flow of creating a tapdisk, it immediately tries to release it right after the "out" label. So after the first tapdisk is created it just seems to hang for awhile attempting to destroy the newly created tapdisk. Darren
Roger Pau Monne
2013-Mar-05 16:56 UTC
[PATCH] libxl: don''t launch more than one tapdisk process for each disk
When adding a disk don't launch multiple tapdisk instances for the same disk, if transaction fails in device_disk_add reuse the same tapdisk for further tries instead of creating a new instance each time a transaction fails. Reported-by: Darren Shepherd <darren.s.shepherd@gmail.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- tools/libxl/libxl.c | 17 ++++++++++------- 1 files changed, 10 insertions(+), 7 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 73e0dc3..345d88b 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -2032,7 +2032,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid, STATE_AO_GC(aodev->ao); flexarray_t *front = NULL; flexarray_t *back = NULL; - char *dev, *script; + char *dev = NULL, *script; libxl__device *device; int rc; libxl_ctx *ctx = gc->owner; @@ -2095,12 +2095,15 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid, break; case LIBXL_DISK_BACKEND_TAP: - dev = libxl__blktap_devpath(gc, disk->pdev_path, disk->format); - if (!dev) { - LOG(ERROR, "failed to get blktap devpath for %p\n", - disk->pdev_path); - rc = ERROR_FAIL; - goto out; + if (dev == NULL) { + dev = libxl__blktap_devpath(gc, disk->pdev_path, + disk->format); + if (!dev) { + LOG(ERROR, "failed to get blktap devpath for %p\n", + disk->pdev_path); + rc = ERROR_FAIL; + goto out; + } } flexarray_append(back, "tapdisk-params"); flexarray_append(back, libxl__sprintf(gc, "%s:%s", -- 1.7.7.5 (Apple Git-26) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Darren Shepherd
2013-Mar-10 16:06 UTC
Re: [PATCH] libxl: don''t launch more than one tapdisk process for each disk
Roger, I tried this patch out and it fixed my original issue of orphaning tapdisks. I actually tested it with the CentOS 6 Xen 4.2.1 build. I had to modify the patch slightly to apply (changed "out" to "out_free"). Below is the patch I used. Thanks --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -2032,7 +2032,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid, STATE_AO_GC(aodev->ao); flexarray_t *front = NULL; flexarray_t *back = NULL; - char *dev, *script; + char *dev = NULL, *script; libxl__device *device; int rc; libxl_ctx *ctx = gc->owner; @@ -2095,12 +2095,15 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid, break; case LIBXL_DISK_BACKEND_TAP: - dev = libxl__blktap_devpath(gc, disk->pdev_path, disk->format); - if (!dev) { - LOG(ERROR, "failed to get blktap devpath for %p\n", - disk->pdev_path); - rc = ERROR_FAIL; - goto out_free; + if (dev == NULL) { + dev = libxl__blktap_devpath(gc, disk->pdev_path, + disk->format); + if (!dev) { + LOG(ERROR, "failed to get blktap devpath for %p\n", + disk->pdev_path); + rc = ERROR_FAIL; + goto out; + } } flexarray_append(back, "tapdisk-params"); flexarray_append(back, libxl__sprintf(gc, "%s:%s",
Pasi Kärkkäinen
2013-Mar-12 08:16 UTC
Re: [PATCH] libxl: don''t launch more than one tapdisk process for each disk
On Tue, Mar 05, 2013 at 05:56:42PM +0100, Roger Pau Monne wrote:> When adding a disk don''t launch multiple tapdisk instances for the > same disk, if transaction fails in device_disk_add reuse the same > tapdisk for further tries instead of creating a new instance each > time a transaction fails. >So this patch should be backported to Xen 4.2 after it has passed xen-unstable tests. See Darren''s reply with a Xen 4.2.1 backport included. -- Pasi> Reported-by: Darren Shepherd <darren.s.shepherd@gmail.com> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > tools/libxl/libxl.c | 17 ++++++++++------- > 1 files changed, 10 insertions(+), 7 deletions(-) > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c > index 73e0dc3..345d88b 100644 > --- a/tools/libxl/libxl.c > +++ b/tools/libxl/libxl.c > @@ -2032,7 +2032,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid, > STATE_AO_GC(aodev->ao); > flexarray_t *front = NULL; > flexarray_t *back = NULL; > - char *dev, *script; > + char *dev = NULL, *script; > libxl__device *device; > int rc; > libxl_ctx *ctx = gc->owner; > @@ -2095,12 +2095,15 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid, > break; > > case LIBXL_DISK_BACKEND_TAP: > - dev = libxl__blktap_devpath(gc, disk->pdev_path, disk->format); > - if (!dev) { > - LOG(ERROR, "failed to get blktap devpath for %p\n", > - disk->pdev_path); > - rc = ERROR_FAIL; > - goto out; > + if (dev == NULL) { > + dev = libxl__blktap_devpath(gc, disk->pdev_path, > + disk->format); > + if (!dev) { > + LOG(ERROR, "failed to get blktap devpath for %p\n", > + disk->pdev_path); > + rc = ERROR_FAIL; > + goto out; > + } > } > flexarray_append(back, "tapdisk-params"); > flexarray_append(back, libxl__sprintf(gc, "%s:%s", > -- > 1.7.7.5 (Apple Git-26) > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Ian Jackson
2013-Mar-13 14:41 UTC
Re: [PATCH] libxl: don''t launch more than one tapdisk process for each disk [and 2 more messages]
Roger Pau Monne writes ("[Xen-devel] [PATCH] libxl: don''t launch more than one tapdisk process for each disk"):> When adding a disk don''t launch multiple tapdisk instances for the > same disk, if transaction fails in device_disk_add reuse the same > tapdisk for further tries instead of creating a new instance each > time a transaction fails.Applied, thanks. Pasi Kärkkäinen writes ("Re: [Xen-devel] [PATCH] libxl: don''t launch more than one tapdisk process for each disk"):> So this patch should be backported to Xen 4.2 after it has passed xen-unstable tests. > See Darren''s reply with a Xen 4.2.1 backport included.Right, thanks. Ian.
Pasi Kärkkäinen
2013-Mar-17 11:47 UTC
Re: [PATCH] libxl: don''t launch more than one tapdisk process for each disk [and 2 more messages]
On Wed, Mar 13, 2013 at 02:41:51PM +0000, Ian Jackson wrote:> Roger Pau Monne writes ("[Xen-devel] [PATCH] libxl: don''t launch more than one tapdisk process for each disk"): > > When adding a disk don''t launch multiple tapdisk instances for the > > same disk, if transaction fails in device_disk_add reuse the same > > tapdisk for further tries instead of creating a new instance each > > time a transaction fails. > > Applied, thanks. > > Pasi Kärkkäinen writes ("Re: [Xen-devel] [PATCH] libxl: don''t launch more than one tapdisk process for each disk"): > > So this patch should be backported to Xen 4.2 after it has passed xen-unstable tests. > > See Darren''s reply with a Xen 4.2.1 backport included. > > Right, thanks. >The patch in question seems to have passed xen-unstable automated tests, so it''d be nice to get this backported to 4.2 branch for 4.2.2. Thanks, -- Pasi