flight 13394 xen-unstable real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/13394/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379 test-i386-i386-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379 test-amd64-amd64-xl-qemuu-win7-amd64 9 guest-localmigrate fail REGR. vs. 13376 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-sedf-pin 10 guest-saverestore fail REGR. vs. 13379 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass test-amd64-i386-qemuu-rhel6hvm-amd 11 leak-check/check fail never pass test-amd64-i386-rhel6hvm-amd 11 leak-check/check fail never pass test-amd64-i386-qemuu-rhel6hvm-intel 11 leak-check/check fail never pass test-amd64-i386-rhel6hvm-intel 11 leak-check/check fail never pass test-amd64-i386-xend-winxpsp3 16 leak-check/check fail never pass test-amd64-amd64-win 16 leak-check/check fail never pass test-amd64-i386-win-vcpus1 16 leak-check/check fail never pass test-amd64-i386-win 16 leak-check/check fail never pass test-i386-i386-win 16 leak-check/check fail never pass test-amd64-amd64-xl-win7-amd64 13 guest-stop fail never pass test-amd64-amd64-xl-win 13 guest-stop fail never pass test-amd64-i386-xl-win7-amd64 13 guest-stop fail never pass test-i386-i386-xl-win 13 guest-stop fail never pass test-amd64-i386-xl-win-vcpus1 13 guest-stop fail never pass test-amd64-amd64-xl-winxpsp3 13 guest-stop fail never pass test-i386-i386-xl-winxpsp3 13 guest-stop fail never pass test-amd64-i386-xl-winxpsp3-vcpus1 13 guest-stop fail never pass version targeted for testing: xen 0455d8317631 baseline version: xen 4f92bdf3370c ------------------------------------------------------------ People who touched revisions under test: Andres Lagar-Cavilla <andres@lagarcavilla.org> Dario Faggioli <dario.faggioli@citrix.com> Dario Faggioli <raistlin@linux.it> George Dunlap <george.dunlap@eu.citrix.com> Ian Campbell <ian.campbell@citrix.com> Ian Jackson <ian.jackson@eu.citrix.com> Jan Beulich <jbeulich@suse.com> Keir Fraser <keir@xen.org> Roger Pau Monne <roger.pau@entel.upc.edu> Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tim Deegan <tim@xen.org> Yang Zhang <yang.z.zhang@Intel.com> ------------------------------------------------------------ jobs: build-amd64 pass build-i386 pass build-amd64-oldkern pass build-i386-oldkern pass build-amd64-pvops pass build-i386-pvops pass test-amd64-amd64-xl pass test-amd64-i386-xl pass test-i386-i386-xl pass test-amd64-i386-rhel6hvm-amd fail test-amd64-i386-qemuu-rhel6hvm-amd fail test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-amd64-xl-win7-amd64 fail test-amd64-i386-xl-win7-amd64 fail test-amd64-i386-xl-credit2 pass test-amd64-amd64-xl-pcipt-intel fail test-amd64-i386-rhel6hvm-intel fail test-amd64-i386-qemuu-rhel6hvm-intel fail test-amd64-i386-xl-multivcpu pass test-amd64-amd64-pair pass test-amd64-i386-pair pass test-i386-i386-pair pass test-amd64-amd64-xl-sedf-pin fail test-amd64-amd64-pv pass test-amd64-i386-pv pass test-i386-i386-pv pass test-amd64-amd64-xl-sedf pass test-amd64-i386-win-vcpus1 fail test-amd64-i386-xl-win-vcpus1 fail test-amd64-i386-xl-winxpsp3-vcpus1 fail test-amd64-amd64-win fail test-amd64-i386-win fail test-i386-i386-win fail test-amd64-amd64-xl-win fail test-i386-i386-xl-win fail test-amd64-amd64-xl-qemuu-winxpsp3 fail test-i386-i386-xl-qemuu-winxpsp3 fail test-amd64-i386-xend-winxpsp3 fail test-amd64-amd64-xl-winxpsp3 fail test-i386-i386-xl-winxpsp3 fail ------------------------------------------------------------ sg-report-flight on woking.cam.xci-test.com logs: /home/xc_osstest/logs images: /home/xc_osstest/images Logs, config files, etc. are available at http://www.chiark.greenend.org.uk/~xensrcts/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Not pushing. (No revision log; it would be 863 lines long.)
xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"):> Tests which did not succeed and are blocking, > including tests which could not be run: > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379The logs show this: libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out And in xenstore: /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0) And in the source code: $ grep -R logdirty qemu-upstream-unstable.git/* $ So the upstream qemu does not participate properly in the migration protocol. And anyway this protocol seems to involve xenstore and I would have expected it to do something with QMP. But there is no code in libxl to do this (and never has been) and no code in upstream qemu to do it either. That means we''ll get memory corruption in migrated guests with the new qemu: any time qemu writes to guest memory, it needs to trigger a logdirty update so that the write is properly transferred to the migration target domain. With the old libxl we didn''t notice this apart from random failures. With my new migration code, particularly 25542:1883e5c71a87 libxl: wait for qemu to acknowledge logdirty command this turns into a hard failure. I will add this as an allowable test failure pending a proper fix. Ian.
On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote:> xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"): > > Tests which did not succeed and are blocking, > > including tests which could not be run: > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379 > > The logs show this: > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out > > And in xenstore: > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0) > > And in the source code: > > $ grep -R logdirty qemu-upstream-unstable.git/* > $ > > So the upstream qemu does not participate properly in the migration > protocol. And anyway this protocol seems to involve xenstore and I > would have expected it to do something with QMP. But there is no code > in libxl to do this (and never has been) and no code in upstream qemu > to do it either. > > That means we''ll get memory corruption in migrated guests with the new > qemu: any time qemu writes to guest memory, it needs to trigger a > logdirty update so that the write is properly transferred to the > migration target domain. > > With the old libxl we didn''t notice this apart from random failures. > With my new migration code, particularly > 25542:1883e5c71a87 > libxl: wait for qemu to acknowledge logdirty command > this turns into a hard failure. > > I will add this as an allowable test failure pending a proper fix.Thanks for investigating. It does appear that this has always been broken. Do we think this is a blocker for 4.2? It certainly prevents us from suggesting that we support HVM migration with the (non-default) upstream qemu. If we can''t fix this for 4.2 (e.g. because we need to get patches into upstream qemu or because the libxl side is too involved) we should at a minimum make libxl reject attempts to migrate such domains with an appropriate error message. How does this impact the use of upstream qemu for PV guest backends vs migration? I *think* they don''t require log-dirty support, but I''m not sure. Ian.
Stefano Stabellini
2012-Jun-29 12:29 UTC
Re: [xen-unstable test] 13394: regressions - FAIL
On Fri, 29 Jun 2012, Ian Campbell wrote:> On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote: > > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"): > > > Tests which did not succeed and are blocking, > > > including tests which could not be run: > > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379 > > > > The logs show this: > > > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out > > > > And in xenstore: > > > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0) > > > > And in the source code: > > > > $ grep -R logdirty qemu-upstream-unstable.git/* > > $ > > > > So the upstream qemu does not participate properly in the migration > > protocol. And anyway this protocol seems to involve xenstore and I > > would have expected it to do something with QMP. But there is no code > > in libxl to do this (and never has been) and no code in upstream qemu > > to do it either. > > > > That means we''ll get memory corruption in migrated guests with the new > > qemu: any time qemu writes to guest memory, it needs to trigger a > > logdirty update so that the write is properly transferred to the > > migration target domain. > > > > With the old libxl we didn''t notice this apart from random failures. > > With my new migration code, particularly > > 25542:1883e5c71a87 > > libxl: wait for qemu to acknowledge logdirty command > > this turns into a hard failure. > > > > I will add this as an allowable test failure pending a proper fix. > > Thanks for investigating. It does appear that this has always been > broken. > > Do we think this is a blocker for 4.2?I wouldn''t consider it a blocker, given that upstream QEMU is not the default for HVM guests.> It certainly prevents us from suggesting that we support HVM migration > with the (non-default) upstream qemu. > > If we can''t fix this for 4.2 (e.g. because we need to get patches into > upstream qemu or because the libxl side is too involved) we should at a > minimum make libxl reject attempts to migrate such domains with an > appropriate error message.We do need to get patches in QEMU to fix this but we could backport them in qemu-upstream-unstable (and ask for backports to the stable trees).> How does this impact the use of upstream qemu for PV guest backends vs > migration? I *think* they don''t require log-dirty support, but I''m not > sure.It does not affect qemu for PV guests.
On Fri, 2012-06-29 at 13:29 +0100, Stefano Stabellini wrote:> On Fri, 29 Jun 2012, Ian Campbell wrote: > > On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote: > > > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"): > > > > Tests which did not succeed and are blocking, > > > > including tests which could not be run: > > > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379 > > > > > > The logs show this: > > > > > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out > > > > > > And in xenstore: > > > > > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0) > > > > > > And in the source code: > > > > > > $ grep -R logdirty qemu-upstream-unstable.git/* > > > $ > > > > > > So the upstream qemu does not participate properly in the migration > > > protocol. And anyway this protocol seems to involve xenstore and I > > > would have expected it to do something with QMP. But there is no code > > > in libxl to do this (and never has been) and no code in upstream qemu > > > to do it either. > > > > > > That means we''ll get memory corruption in migrated guests with the new > > > qemu: any time qemu writes to guest memory, it needs to trigger a > > > logdirty update so that the write is properly transferred to the > > > migration target domain. > > > > > > With the old libxl we didn''t notice this apart from random failures. > > > With my new migration code, particularly > > > 25542:1883e5c71a87 > > > libxl: wait for qemu to acknowledge logdirty command > > > this turns into a hard failure. > > > > > > I will add this as an allowable test failure pending a proper fix. > > > > Thanks for investigating. It does appear that this has always been > > broken. > > > > Do we think this is a blocker for 4.2? > > I wouldn''t consider it a blocker, given that upstream QEMU is not the > default for HVM guests. > > > > It certainly prevents us from suggesting that we support HVM migration > > with the (non-default) upstream qemu. > > > > If we can''t fix this for 4.2 (e.g. because we need to get patches into > > upstream qemu or because the libxl side is too involved) we should at a > > minimum make libxl reject attempts to migrate such domains with an > > appropriate error message. > > We do need to get patches in QEMU to fix this but we could backport them in > qemu-upstream-unstable (and ask for backports to the stable trees).Can we do that in time for 4.2? It''s pretty late in the day. I think we need to consider either achieving this or adding the appropriate error message as a blocker. Hopefully the former but falling back to the latter if it comes to it.> > How does this impact the use of upstream qemu for PV guest backends vs > > migration? I *think* they don''t require log-dirty support, but I''m not > > sure. > > It does not affect qemu for PV guests.Great. Ian.
Stefano Stabellini
2012-Jun-29 12:36 UTC
Re: [xen-unstable test] 13394: regressions - FAIL
On Fri, 29 Jun 2012, Ian Campbell wrote:> On Fri, 2012-06-29 at 13:29 +0100, Stefano Stabellini wrote: > > On Fri, 29 Jun 2012, Ian Campbell wrote: > > > On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote: > > > > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"): > > > > > Tests which did not succeed and are blocking, > > > > > including tests which could not be run: > > > > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379 > > > > > > > > The logs show this: > > > > > > > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out > > > > > > > > And in xenstore: > > > > > > > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0) > > > > > > > > And in the source code: > > > > > > > > $ grep -R logdirty qemu-upstream-unstable.git/* > > > > $ > > > > > > > > So the upstream qemu does not participate properly in the migration > > > > protocol. And anyway this protocol seems to involve xenstore and I > > > > would have expected it to do something with QMP. But there is no code > > > > in libxl to do this (and never has been) and no code in upstream qemu > > > > to do it either. > > > > > > > > That means we''ll get memory corruption in migrated guests with the new > > > > qemu: any time qemu writes to guest memory, it needs to trigger a > > > > logdirty update so that the write is properly transferred to the > > > > migration target domain. > > > > > > > > With the old libxl we didn''t notice this apart from random failures. > > > > With my new migration code, particularly > > > > 25542:1883e5c71a87 > > > > libxl: wait for qemu to acknowledge logdirty command > > > > this turns into a hard failure. > > > > > > > > I will add this as an allowable test failure pending a proper fix. > > > > > > Thanks for investigating. It does appear that this has always been > > > broken. > > > > > > Do we think this is a blocker for 4.2? > > > > I wouldn''t consider it a blocker, given that upstream QEMU is not the > > default for HVM guests. > > > > > > > It certainly prevents us from suggesting that we support HVM migration > > > with the (non-default) upstream qemu. > > > > > > If we can''t fix this for 4.2 (e.g. because we need to get patches into > > > upstream qemu or because the libxl side is too involved) we should at a > > > minimum make libxl reject attempts to migrate such domains with an > > > appropriate error message. > > > > We do need to get patches in QEMU to fix this but we could backport them in > > qemu-upstream-unstable (and ask for backports to the stable trees). > > Can we do that in time for 4.2? It''s pretty late in the day. > > I think we need to consider either achieving this or adding the > appropriate error message as a blocker. Hopefully the former but falling > back to the latter if it comes to it.I think we should add an appropriate error message as a blocker. We should also try to fix this on the QEMU side, but given that the QEMU 1.0 stable tree is pretty much unmaintaned, we won''t be able to backport the fix there, so we cannot be sure that a distro will end up with a QEMU with or without the fix.
On Fri, 2012-06-29 at 13:36 +0100, Stefano Stabellini wrote:> On Fri, 29 Jun 2012, Ian Campbell wrote: > > On Fri, 2012-06-29 at 13:29 +0100, Stefano Stabellini wrote: > > > On Fri, 29 Jun 2012, Ian Campbell wrote: > > > > On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote: > > > > > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"): > > > > > > Tests which did not succeed and are blocking, > > > > > > including tests which could not be run: > > > > > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379 > > > > > > > > > > The logs show this: > > > > > > > > > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out > > > > > > > > > > And in xenstore: > > > > > > > > > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0) > > > > > > > > > > And in the source code: > > > > > > > > > > $ grep -R logdirty qemu-upstream-unstable.git/* > > > > > $ > > > > > > > > > > So the upstream qemu does not participate properly in the migration > > > > > protocol. And anyway this protocol seems to involve xenstore and I > > > > > would have expected it to do something with QMP. But there is no code > > > > > in libxl to do this (and never has been) and no code in upstream qemu > > > > > to do it either. > > > > > > > > > > That means we''ll get memory corruption in migrated guests with the new > > > > > qemu: any time qemu writes to guest memory, it needs to trigger a > > > > > logdirty update so that the write is properly transferred to the > > > > > migration target domain. > > > > > > > > > > With the old libxl we didn''t notice this apart from random failures. > > > > > With my new migration code, particularly > > > > > 25542:1883e5c71a87 > > > > > libxl: wait for qemu to acknowledge logdirty command > > > > > this turns into a hard failure. > > > > > > > > > > I will add this as an allowable test failure pending a proper fix. > > > > > > > > Thanks for investigating. It does appear that this has always been > > > > broken. > > > > > > > > Do we think this is a blocker for 4.2? > > > > > > I wouldn''t consider it a blocker, given that upstream QEMU is not the > > > default for HVM guests. > > > > > > > > > > It certainly prevents us from suggesting that we support HVM migration > > > > with the (non-default) upstream qemu. > > > > > > > > If we can''t fix this for 4.2 (e.g. because we need to get patches into > > > > upstream qemu or because the libxl side is too involved) we should at a > > > > minimum make libxl reject attempts to migrate such domains with an > > > > appropriate error message. > > > > > > We do need to get patches in QEMU to fix this but we could backport them in > > > qemu-upstream-unstable (and ask for backports to the stable trees). > > > > Can we do that in time for 4.2? It''s pretty late in the day. > > > > I think we need to consider either achieving this or adding the > > appropriate error message as a blocker. Hopefully the former but falling > > back to the latter if it comes to it. > > I think we should add an appropriate error message as a blocker. We > should also try to fix this on the QEMU side, but given that the QEMU > 1.0 stable tree is pretty much unmaintaned, we won''t be able to backport > the fix there, so we cannot be sure that a distro will end up with a > QEMU with or without the fix.Agreed. I''ll add this to the 4.2 TODO list. If we can get support into the mainline qemu then as a stretch goal we can consider whether libxl can auto detect the availability of the feature and react accordingly. Ian.
Ian Campbell
2012-Jun-29 16:01 UTC
[PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
> I think we should add an appropriate error message as a blocker. We > should also try to fix this on the QEMU side, but given that the QEMU > 1.0 stable tree is pretty much unmaintaned, we won''t be able to backport > the fix there, so we cannot be sure that a distro will end up with a > QEMU with or without the fix.How about this for the time being, we can always revert or enhance as necessary before 4.2. 8<---------------------- # HG changeset patch # User Ian Campbell <ian.campbell@citrix.com> # Date 1340985595 -3600 # Node ID a0c1c8c585e03279f9ea149bc2951d0df900717e # Parent d849ca2ef197dbf0e731aa4726da0eb3e2801280 libxl: refuse to try and migrate an HVM guest using qemu-xen libxl/qemu-upstream currently do not collude together to enable log-dirty mode and therefore migrations are unsafe. Refuse to even try for now. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> diff -r d849ca2ef197 -r a0c1c8c585e0 tools/libxl/libxl.c --- a/tools/libxl/libxl.c Fri Jun 29 15:57:28 2012 +0100 +++ b/tools/libxl/libxl.c Fri Jun 29 16:59:55 2012 +0100 @@ -746,6 +746,17 @@ int libxl_domain_suspend(libxl_ctx *ctx, goto out_err; } + libxl_device_model_version dm + libxl__device_model_version_running(gc, domid); + if (type == LIBXL_DOMAIN_TYPE_HVM && + dm == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN && + flags & LIBXL_SUSPEND_LIVE) { + LOG(ERROR, "cannot live migrate HVM domains with qemu-xen device-model"); + rc = ERROR_FAIL; + goto out_err; + + } + libxl__domain_suspend_state *dss; GCNEW(dss);
Ian Jackson
2012-Jun-29 16:03 UTC
Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
Ian Campbell writes ("[PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):> > I think we should add an appropriate error message as a blocker. We > > should also try to fix this on the QEMU side, but given that the QEMU > > 1.0 stable tree is pretty much unmaintaned, we won''t be able to backport > > the fix there, so we cannot be sure that a distro will end up with a > > QEMU with or without the fix. > > How about this for the time being, we can always revert or enhance as > necessary before 4.2....> + libxl_device_model_version dm > + libxl__device_model_version_running(gc, domid);libxl__device_model_version_running can return -1. (Its error handling is pretty bad but that''s no excuse for making matters worse.) I suggest we retcon it as returning a libxl error code. Ian.
Ian Campbell
2012-Jul-02 11:19 UTC
Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
On Fri, 2012-06-29 at 17:03 +0100, Ian Jackson wrote:> Ian Campbell writes ("[PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"): > > > I think we should add an appropriate error message as a blocker. We > > > should also try to fix this on the QEMU side, but given that the QEMU > > > 1.0 stable tree is pretty much unmaintaned, we won''t be able to backport > > > the fix there, so we cannot be sure that a distro will end up with a > > > QEMU with or without the fix. > > > > How about this for the time being, we can always revert or enhance as > > necessary before 4.2. > ... > > + libxl_device_model_version dm > > + libxl__device_model_version_running(gc, domid); > > libxl__device_model_version_running can return -1. (Its error > handling is pretty bad but that''s no excuse for making matters worse.) > I suggest we retcon it as returning a libxl error code.the current callers all do something like switch (libxl__device_model_version_running(gc, domid)) { case LIBXL_DEVICE_MODEL... ... default: return ERROR_INVAL; } and would need some rejigging to support a switch to ERROR_*, and associated chance of triggering gcc''s switch(enum) warnings. I think fixing this properly can be left to 4.3 and for now I''ll just follow this same pattern. Ian. # HG changeset patch # User Ian Campbell <ian.campbell@citrix.com> # Date 1341227605 -3600 # Node ID c6a2abdee84d5ae0692728dc169327314991090a # Parent d29a88850da26424b52520a617ca449d49124143 libxl: refuse to try and migrate an HVM guest using qemu-xen libxl/qemu-upstream currently do not collude together to enable log-dirty mode and therefore migrations are unsafe. Refuse to even try for now. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> diff -r d29a88850da2 -r c6a2abdee84d tools/libxl/libxl.c --- a/tools/libxl/libxl.c Mon Jul 02 12:06:22 2012 +0100 +++ b/tools/libxl/libxl.c Mon Jul 02 12:13:25 2012 +0100 @@ -746,6 +746,22 @@ int libxl_domain_suspend(libxl_ctx *ctx, goto out_err; } + if (type == LIBXL_DOMAIN_TYPE_HVM && flags & LIBXL_SUSPEND_LIVE) { + switch (libxl__device_model_version_running(gc, domid)) { + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: + LOG(ERROR, + "cannot live migrate HVM domains with qemu-xen device-model"); + rc = ERROR_FAIL; + goto out_err; + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: + /* No problem */ + break; + default: + rc = ERROR_FAIL; + goto out_err; + } + } + libxl__domain_suspend_state *dss; GCNEW(dss);
Ian Jackson
2012-Jul-23 10:52 UTC
Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
Ian Campbell writes ("Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):> diff -r d29a88850da2 -r c6a2abdee84d tools/libxl/libxl.c > --- a/tools/libxl/libxl.c Mon Jul 02 12:06:22 2012 +0100 > +++ b/tools/libxl/libxl.c Mon Jul 02 12:13:25 2012 +0100 > @@ -746,6 +746,22 @@ int libxl_domain_suspend(libxl_ctx *ctx,...> + switch (libxl__device_model_version_running(gc, domid)) { > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: > + LOG(ERROR, > + "cannot live migrate HVM domains with qemu-xen device-model"); > + rc = ERROR_FAIL; > + goto out_err; > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: > + /* No problem */ > + break; > + default: > + rc = ERROR_FAIL; > + goto out_err; > + }This last case should abort() or log something or not exist. Ian.
Ian Campbell
2012-Jul-23 10:58 UTC
Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
On Mon, 2012-07-23 at 11:52 +0100, Ian Jackson wrote:> Ian Campbell writes ("Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"): > > diff -r d29a88850da2 -r c6a2abdee84d tools/libxl/libxl.c > > --- a/tools/libxl/libxl.c Mon Jul 02 12:06:22 2012 +0100 > > +++ b/tools/libxl/libxl.c Mon Jul 02 12:13:25 2012 +0100 > > @@ -746,6 +746,22 @@ int libxl_domain_suspend(libxl_ctx *ctx, > ... > > + switch (libxl__device_model_version_running(gc, domid)) { > > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: > > + LOG(ERROR, > > + "cannot live migrate HVM domains with qemu-xen device-model"); > > + rc = ERROR_FAIL; > > + goto out_err; > > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: > > + /* No problem */ > > + break; > > + default: > > + rc = ERROR_FAIL; > > + goto out_err; > > + } > > This last case should abort() or log something or not exist.I think case -1: rc = ERROR_FAIL goto out_err; default: abort() would be OK for 4.2. libxl__device_model_version_running either returns -1 (and logs) or a valid LIBXL_DEVICE_MODEL_VERSION_* so the only thing to catch is a new LIBXL_DEVICE_MODEL_VERSION_*, in which case an abort() seems appropriate since it would be a bug to not handle it here. Possibly -1 ought to become LIBXL_DEVICE_MODEL_VERSION_INVALID too Ian.
Ian Campbell
2012-Jul-23 11:18 UTC
Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
On Mon, 2012-07-23 at 11:58 +0100, Ian Campbell wrote:> On Mon, 2012-07-23 at 11:52 +0100, Ian Jackson wrote: > > Ian Campbell writes ("Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"): > > > diff -r d29a88850da2 -r c6a2abdee84d tools/libxl/libxl.c > > > --- a/tools/libxl/libxl.c Mon Jul 02 12:06:22 2012 +0100 > > > +++ b/tools/libxl/libxl.c Mon Jul 02 12:13:25 2012 +0100 > > > @@ -746,6 +746,22 @@ int libxl_domain_suspend(libxl_ctx *ctx, > > ... > > > + switch (libxl__device_model_version_running(gc, domid)) { > > > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: > > > + LOG(ERROR, > > > + "cannot live migrate HVM domains with qemu-xen device-model"); > > > + rc = ERROR_FAIL; > > > + goto out_err; > > > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: > > > + /* No problem */ > > > + break; > > > + default: > > > + rc = ERROR_FAIL; > > > + goto out_err; > > > + } > > > > This last case should abort() or log something or not exist. > > I think > case -1: > rc = ERROR_FAIL > goto out_err; > default: abort() > > would be OK for 4.2. libxl__device_model_version_running either returns > -1 (and logs) or a valid LIBXL_DEVICE_MODEL_VERSION_* so the only thing > to catch is a new LIBXL_DEVICE_MODEL_VERSION_*, in which case an abort() > seems appropriate since it would be a bug to not handle it here. > > Possibly -1 ought to become LIBXL_DEVICE_MODEL_VERSION_INVALID tooActually I think -1 == INVALID would be wrong, the -1 here really means something went wrong internally to libxl. Anyway, I remember now that the reason I did it this was was to avoid the need to rewrite the other callsites for 4.2. In 4.3 we should make libxl__device_model_version_running return ERROR_foo or something valid, and perhaps add 0 == UNKNOWN as a member of the enum. For now I think this will do: 8<------------------------------------------------ # HG changeset patch # User Ian Campbell <ian.campbell@citrix.com> # Date 1343042178 -3600 # Node ID 11b539fb8400f163a871145aa20d7b4b2967e5b4 # Parent 21b166ed1e8765b75a2e90caa9e0e156ea4010d9 libxl: refuse to try and migrate an HVM guest using qemu-xen libxl/qemu-upstream currently do not collude together to enable log-dirty mode and therefore migrations are unsafe. Refuse to even try for now. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> diff -r 21b166ed1e87 -r 11b539fb8400 tools/libxl/libxl.c --- a/tools/libxl/libxl.c Mon Jul 23 12:09:53 2012 +0100 +++ b/tools/libxl/libxl.c Mon Jul 23 12:16:18 2012 +0100 @@ -746,6 +746,23 @@ int libxl_domain_suspend(libxl_ctx *ctx, goto out_err; } + if (type == LIBXL_DOMAIN_TYPE_HVM && flags & LIBXL_SUSPEND_LIVE) { + switch (libxl__device_model_version_running(gc, domid)) { + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN: + LOG(ERROR, + "cannot live migrate HVM domains with qemu-xen device-model"); + rc = ERROR_FAIL; + goto out_err; + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: + /* No problem */ + break; + case -1: + rc = ERROR_FAIL; + goto out_err; + default: abort(); + } + } + libxl__domain_suspend_state *dss; GCNEW(dss);
Ian Jackson
2012-Jul-24 10:30 UTC
Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
Ian Campbell writes ("Re: [Xen-devel] [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):> Actually I think -1 == INVALID would be wrong, the -1 here really means > something went wrong internally to libxl. Anyway, I remember now that > the reason I did it this was was to avoid the need to rewrite the other > callsites for 4.2. In 4.3 we should make > libxl__device_model_version_running return ERROR_foo or something valid, > and perhaps add 0 == UNKNOWN as a member of the enum.Yes.> For now I think this will do:..> libxl: refuse to try and migrate an HVM guest using qemu-xen > > libxl/qemu-upstream currently do not collude together to enable log-dirty mode > and therefore migrations are unsafe. Refuse to even try for now.Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Thanks, Ian.
Ian Campbell
2012-Jul-25 16:45 UTC
Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
On Tue, 2012-07-24 at 11:30 +0100, Ian Jackson wrote:> Ian Campbell writes ("Re: [Xen-devel] [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"): > > Actually I think -1 == INVALID would be wrong, the -1 here really means > > something went wrong internally to libxl. Anyway, I remember now that > > the reason I did it this was was to avoid the need to rewrite the other > > callsites for 4.2. In 4.3 we should make > > libxl__device_model_version_running return ERROR_foo or something valid, > > and perhaps add 0 == UNKNOWN as a member of the enum. > > Yes. > > > For now I think this will do: > .. > > libxl: refuse to try and migrate an HVM guest using qemu-xen > > > > libxl/qemu-upstream currently do not collude together to enable log-dirty mode > > and therefore migrations are unsafe. Refuse to even try for now. > > Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>Applied, thanks.> > Thanks, > Ian.
Ian Campbell
2012-Jul-25 16:56 UTC
Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
On Wed, 2012-07-25 at 17:45 +0100, Ian Campbell wrote:> On Tue, 2012-07-24 at 11:30 +0100, Ian Jackson wrote: > > Ian Campbell writes ("Re: [Xen-devel] [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"): > > > Actually I think -1 == INVALID would be wrong, the -1 here really means > > > something went wrong internally to libxl. Anyway, I remember now that > > > the reason I did it this was was to avoid the need to rewrite the other > > > callsites for 4.2. In 4.3 we should make > > > libxl__device_model_version_running return ERROR_foo or something valid, > > > and perhaps add 0 == UNKNOWN as a member of the enum. > > > > Yes. > > > > > For now I think this will do: > > .. > > > libxl: refuse to try and migrate an HVM guest using qemu-xen > > > > > > libxl/qemu-upstream currently do not collude together to enable log-dirty mode > > > and therefore migrations are unsafe. Refuse to even try for now. > > > > Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> > > Applied, thanks.Right after I pushed I noticed that gcc 4.6.2 (which is not the one I typically use for commit testing) fails with: libxl.c: In function ‘libxl_domain_suspend’: libxl.c:778:9: error: case value ‘4294967295’ not in enumerated type ‘libxl_device_model_version’ [-Werror=switch] Whereas 4.4.5 (which is my typical commit test and day to day gcc) does not. I have pushed the following additional patch to fix this up, tested with 4.4.5 and 4.6.2. 8<--------------- From 365636cac708d4af0907503c07bec530b2dd3a03 Mon Sep 17 00:00:00 2001 From: Ian Campbell <ian.campbell@citrix.com> Date: Wed, 25 Jul 2012 17:52:12 +0100 Subject: [PATCH] libxl: libxl__device_model_version_running should return an int MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On error it returns -1 and therefore it needs to return int and not libxl_device_model_enum. Otherwise gcc 4.6.2 complains: libxl.c: In function ‘libxl_domain_suspend’: libxl.c:778:9: error: case value ‘4294967295’ not in enumerated type ‘libxl_device_model_version’ [-Werror=switch] Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- tools/libxl/libxl_internal.c | 3 +-- tools/libxl/libxl_internal.h | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c index fbff7d0..24099f5 100644 --- a/tools/libxl/libxl_internal.c +++ b/tools/libxl/libxl_internal.c @@ -330,8 +330,7 @@ out: return rc; } -libxl_device_model_version libxl__device_model_version_running(libxl__gc *gc, - uint32_t domid) +int libxl__device_model_version_running(libxl__gc *gc, uint32_t domid) { char *path = NULL; char *dm_version = NULL; diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 9ffb9d4..1b9b417 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1515,8 +1515,7 @@ _hidden libxl__json_object *libxl__json_parse(libxl__gc *gc, const char *s); /* Based on /local/domain/$domid/dm-version xenstore key * default is qemu xen traditional */ -_hidden libxl_device_model_version -libxl__device_model_version_running(libxl__gc *gc, uint32_t domid); +_hidden int libxl__device_model_version_running(libxl__gc *gc, uint32_t domid); /* -- 1.7.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel