flight 10481 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/10481/

Regressions :-(

Tests which did not succeed and are blocking:
 test-amd64-amd64-win                 7 windows-install    fail REGR. vs. 10474

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-pv                  9 guest-start        fail pass in 10480
 test-amd64-amd64-xl-sedf-pin        11 guest-localmigrate fail pass in 10480
 test-amd64-amd64-xl-sedf            11 guest-localmigrate fail in 10480 pass in 10481

Tests which did not succeed, but are not blocking,
including regressions (tests previously passed) regarded as allowable:
 test-amd64-amd64-xl-pcipt-intel      9 guest-start        fail never pass
 test-amd64-i386-rhel6hvm-intel       9 guest-start.2      fail never pass
 test-amd64-i386-rhel6hvm-amd         9 guest-start.2      fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1  13 guest-stop         fail never pass
 test-amd64-i386-xend-winxpsp3       16 leak-check/check   fail never pass
 test-amd64-i386-xl-win7-amd64       13 guest-stop         fail never pass
 test-amd64-amd64-xl-win7-amd64      13 guest-stop         fail never pass
 test-i386-i386-win                  16 leak-check/check   fail never pass
 test-amd64-i386-win-vcpus1          16 leak-check/check   fail never pass
 test-amd64-i386-win                 16 leak-check/check   fail never pass
 test-amd64-amd64-xl-win             13 guest-stop         fail never pass
 test-amd64-i386-xl-win-vcpus1       13 guest-stop         fail never pass
 test-amd64-amd64-xl-winxpsp3         7 windows-install    fail never pass
 test-i386-i386-xl-win               13 guest-stop         fail never pass
 test-i386-i386-xl-winxpsp3           7 windows-install    fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1   7 windows-install    fail in 10480 like 10474
 test-amd64-i386-xend-winxpsp3        7 windows-install    fail in 10480 like 10474
 test-amd64-i386-xl-win7-amd64        7 windows-install    fail in 10480 like 10474
 test-amd64-amd64-xl-win7-amd64       7 windows-install    fail in 10480 like 10474
 test-amd64-i386-win                  7 windows-install    fail in 10480 like 10473

version targeted for testing:
 xen                  7ca56cca09ad
baseline version:
 xen                  1c58bb664d8d

------------------------------------------------------------
People who touched revisions under test:
  Andre Przywara <andre.przywara@amd.com>
  Ian Campbell <ian.campbell@citrix.com>
  Ian Jackson <ian.jackson@eu.citrix.com>
  Jan Beulich <jbeulich@suse.com>
  Roger Pau Monne <roger.pau@entel.upc.edu>
------------------------------------------------------------

jobs:
 build-amd64                          pass
 build-i386                           pass
 build-amd64-oldkern                  pass
 build-i386-oldkern                   pass
 build-amd64-pvops                    pass
 build-i386-pvops                     pass
 test-amd64-amd64-xl                  pass
 test-amd64-i386-xl                   pass
 test-i386-i386-xl                    pass
 test-amd64-i386-rhel6hvm-amd         fail
 test-amd64-amd64-xl-win7-amd64       fail
 test-amd64-i386-xl-win7-amd64        fail
 test-amd64-i386-xl-credit2           pass
 test-amd64-amd64-xl-pcipt-intel      fail
 test-amd64-i386-rhel6hvm-intel       fail
 test-amd64-i386-xl-multivcpu         pass
 test-amd64-amd64-pair                pass
 test-amd64-i386-pair                 pass
 test-i386-i386-pair                  pass
 test-amd64-amd64-xl-sedf-pin         fail
 test-amd64-amd64-pv                  fail
 test-amd64-i386-pv                   pass
 test-i386-i386-pv                    pass
 test-amd64-amd64-xl-sedf             pass
 test-amd64-i386-win-vcpus1           fail
 test-amd64-i386-xl-win-vcpus1        fail
 test-amd64-i386-xl-winxpsp3-vcpus1   fail
 test-amd64-amd64-win                 fail
 test-amd64-i386-win                  fail
 test-i386-i386-win                   fail
 test-amd64-amd64-xl-win              fail
 test-i386-i386-xl-win                fail
 test-amd64-i386-xend-winxpsp3        fail
 test-amd64-amd64-xl-winxpsp3         fail
 test-i386-i386-xl-winxpsp3           fail

------------------------------------------------------------
sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Not pushing.

(No revision log; it would be 336 lines long.)
On Tue, 2011-12-13 at 05:22 +0000, xen.org wrote:
> flight 10481 xen-unstable real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/10481/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking:
> test-amd64-amd64-win 7 windows-install fail REGR. vs. 10474

Failed the network connection after install:

2011-12-13 01:38:35 Z guest win.guest.osstest 5a:36:0e:f1:00:0c 8936 link/ip/tcp: waiting 7000s...
2011-12-13 01:38:35 Z guest win.guest.osstest 5a:36:0e:f1:00:0c 8936 link/ip/tcp: no active lease (waiting) ...
2011-12-13 01:57:28 Z guest win.guest.osstest 5a:36:0e:f1:00:0c 8936 link/ip/tcp: nc: 256 (UNKNOWN) [10.80.251.132] 8936 (?) : Connection timed out | (waiting) ...

VNC screenshot suggests the guest has booted fine and is sitting at the
screensaver.

brctl and ifconfig show that the devices are set up and in the right
places. PV NIC is not initialised (as expected) so we would expect to be
using the tap interface which has seemingly passed some traffic:

tap5.0    Link encap:Ethernet  HWaddr fe:ff:ff:ff:ff:ff
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
          RX packets:442 errors:0 dropped:0 overruns:0 frame:0
          TX packets:378027 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:33714 (32.9 KiB)  TX bytes:24976607 (23.8 MiB)

This being a Windows guest there isn't much debug from within the guest.

Could this be an infrastructure thing? Other similar tests passed just
fine although two similar winxp sp3 tests also failed in a similar way
(this one was sp2, some other sp3 tests did pass though).
This was a xend test while others which failed similarly were xl based
so there appears to be no correlation there.

> Tests which are failing intermittently (not blocking):
> test-amd64-amd64-pv 9 guest-start fail pass in 10480

This PV guest appears to have failed for similar reasons to the above
Windows guests:

2011-12-13 00:58:20 Z FAILURE: guest debian.guest.osstest 5a:36:0e:f1:00:01 22 link/ip/tcp: wait timed out: no active lease.
failure: guest debian.guest.osstest 5a:36:0e:f1:00:01 22 link/ip/tcp: wait timed out: no active lease.

The guest log shows it did get an IP address at least:

Setting kernel variables ...done.
Configuring network interfaces...Internet Systems Consortium DHCP Client 4.1.1-P1
Copyright 2004-2010 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/eth0/5a:36:0e:f1:00:01
Sending on   LPF/eth0/5a:36:0e:f1:00:01
Sending on   Socket/fallback
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
DHCPOFFER from 10.80.2.2
DHCPREQUEST on eth0 to 255.255.255.255 port 67
DHCPACK from 10.80.2.3
bound to 10.80.3.117 -- renewal in 18880 seconds.

> test-amd64-amd64-xl-sedf-pin 11 guest-localmigrate fail pass in 10480
> test-amd64-amd64-xl-sedf 11 guest-localmigrate fail in 10480 pass in 10481
>
> Tests which did not succeed, but are not blocking,
> including regressions (tests previously passed) regarded as allowable:

(reordered to group similar)

> test-amd64-i386-xl-winxpsp3-vcpus1 13 guest-stop fail never pass
> test-amd64-i386-xl-win7-amd64 13 guest-stop fail never pass
> test-amd64-amd64-xl-win7-amd64 13 guest-stop fail never pass
> test-amd64-amd64-xl-win 13 guest-stop fail never pass
> test-amd64-i386-xl-win-vcpus1 13 guest-stop fail never pass
> test-i386-i386-xl-win 13 guest-stop fail never pass

These are all failing because "xl shutdown" cannot be used with a guest
without PV drivers (and prints a warning to that effect).

2011-12-13 01:33:56 Z executing ssh ...
root@10.80.249.57 xl shutdown -w win.guest.osstest
libxl: error: libxl.c:581:libxl_domain_shutdown: HVM domain without PV drivers: graceful shutdown not possible, use destroy
shutdown failed (rc=-3)

I think "xl trigger <dom> power" would be what is wanted here -- e.g.
send an ACPI power event. It could be argued that xl shutdown could do
this automatically?

> test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass

No active link message again but this time the guest says:

For info, please visit https://www.isc.org/software/dhcp/

SIOCSIFADDR: No such device
eth0: ERROR while getting interface flags: No such device
eth0: ERROR while getting interface flags: No such device
Bind socket to interface: No such device
Failed to bring up eth0.
done.
Cleaning up temporary files....

If we could preserve a guest in that state and login it might prove
informative. My guess would either be a missing/faulty VF driver or udev
renaming things.

> test-amd64-i386-rhel6hvm-intel 9 guest-start.2 fail never pass
> test-amd64-i386-rhel6hvm-amd 9 guest-start.2 fail never pass

Both of these appear to be similar harness issues:

2011-12-13 02:11:21 Z toolstack xl
Use of uninitialized value in concatenation (.) or string at ./ts-guest-start line 13.
2011-12-13 02:11:21 Z executing ssh ... root@10.80.249.148 xl create
Config file not specified and none in save file

> test-amd64-i386-xend-winxpsp3 16 leak-check/check fail never pass
> test-i386-i386-win 16 leak-check/check fail never pass
> test-amd64-i386-win-vcpus1 16 leak-check/check fail never pass
> test-amd64-i386-win 16 leak-check/check fail never pass

These all leaked a load of /var/lib/xen/qemu-resume.N.
This should be quick & easy to fix, I'll have a look.

> test-amd64-amd64-xl-winxpsp3 7 windows-install fail never pass
> test-i386-i386-xl-winxpsp3 7 windows-install fail never pass

These are failing in the same way as the one discussed right at the top.

> test-amd64-i386-xl-winxpsp3-vcpus1 7 windows-install fail in 10480 like 10474
> test-amd64-i386-xend-winxpsp3 7 windows-install fail in 10480 like 10474
> test-amd64-i386-xl-win7-amd64 7 windows-install fail in 10480 like 10474
> test-amd64-amd64-xl-win7-amd64 7 windows-install fail in 10480 like 10474
> test-amd64-i386-win 7 windows-install fail in 10480 like 10473

These don't appear to have failed per the grid at
http://www.chiark.greenend.org.uk/~xensrcts/logs/10481/ ?

e.g. test-amd64-i386-xl-winxpsp3-vcpus1 appears to have failed at
guest-stop instead (and indeed is also listed above in that capacity)

This appears to be reporting a failure in a previous run, part of the
heisenbug detector? It might be nice to put those in a separate section
or to include some indication as to the criteria being evaluated (e.g.
are we waiting for a 3rd test to tiebreak?)
Ian Campbell
2011-Dec-13 10:36 UTC
[PATCH] libxl: do not leak qemu saved state on restore (Was: Re: [xen-unstable test] 10481: regressions - FAIL)
On Tue, 2011-12-13 at 09:44 +0000, Ian Campbell wrote:
>
> > test-amd64-i386-xend-winxpsp3 16 leak-check/check fail never pass
> > test-i386-i386-win 16 leak-check/check fail never pass
> > test-amd64-i386-win-vcpus1 16 leak-check/check fail never pass
> > test-amd64-i386-win 16 leak-check/check fail never pass
>
> These all leaked a load of /var/lib/xen/qemu-resume.N. This should be
> quick & easy to fix, I'll have a look.

These happen to all be xend failures but the only reason xl doesn't have
this is that those tests all fail at the guest-stop stage and never get
as far as the migration test.

Here is the fix for the xl version, fixing the xend side seems less
trivial and I don't propose to dig into it.

Ian.

8<--------------------------------------------------

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1323772303 0
# Node ID 9ea12474c6dcd75bbbc7a5c62a2b96de902ccb83
# Parent  88df802b4905d7e34032eacf1942b70c76d150a4
libxl: do not leak qemu saved state on restore

In particular do not leak /var/lib/xen/qemu-resume.<domid>.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r 88df802b4905 -r 9ea12474c6dc tools/libxl/libxl_create.c
--- a/tools/libxl/libxl_create.c  Mon Dec 12 17:43:55 2011 +0000
+++ b/tools/libxl/libxl_create.c  Tue Dec 13 10:31:43 2011 +0000
@@ -622,7 +622,7 @@ static int do_domain_create(libxl__gc *g
             == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
             libxl__qmp_initializations(ctx, domid);
         }
-        ret = libxl__confirm_device_model_startup(gc, dm_starting);
+        ret = libxl__confirm_device_model_startup(gc, dm_info, dm_starting);
         if (ret < 0) {
             LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
                        "device model did not start: %d", ret);
diff -r 88df802b4905 -r 9ea12474c6dc tools/libxl/libxl_dm.c
--- a/tools/libxl/libxl_dm.c  Mon Dec 12 17:43:55 2011 +0000
+++ b/tools/libxl/libxl_dm.c  Tue Dec 13 10:31:43 2011 +0000
@@ -753,7 +753,8 @@ retry_transaction:
         ret = ERROR_FAIL;
         goto out_free;
     }
-    if (libxl__confirm_device_model_startup(gc, dm_starting) < 0) {
+    if (libxl__confirm_device_model_startup(gc, &xenpv_dm_info,
+                                            dm_starting) < 0) {
         ret = ERROR_FAIL;
         goto out_free;
     }
@@ -894,14 +895,26 @@ out:
 
 
 int libxl__confirm_device_model_startup(libxl__gc *gc,
-                                       libxl__spawner_starting *starting)
+                                        libxl_device_model_info *dm_info,
+                                        libxl__spawner_starting *starting)
 {
+    libxl_ctx *ctx = libxl__gc_owner(gc);
     char *path;
     int domid = starting->domid;
+    int ret, ret2;
     path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/state", domid);
-    return libxl__spawn_confirm_offspring_startup(gc,
+    ret = libxl__spawn_confirm_offspring_startup(gc,
             LIBXL_DEVICE_MODEL_START_TIMEOUT,
             "Device Model", path, "running", starting);
+    if (dm_info->saved_state) {
+        ret2 = unlink(dm_info->saved_state);
+        if (ret2) LIBXL__LOG_ERRNO(ctx, XTL_ERROR,
+                                   "failed to remove device-model state %s\n",
+                                   dm_info->saved_state);
+        /* Do not clobber spawn_confirm error code with unlink error code. */
+        if (!ret) ret = ret2;
+    }
+    return ret;
 }
 
 int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
diff -r 88df802b4905 -r 9ea12474c6dc tools/libxl/libxl_internal.h
--- a/tools/libxl/libxl_internal.h  Mon Dec 12 17:43:55 2011 +0000
+++ b/tools/libxl/libxl_internal.h  Tue Dec 13 10:31:43 2011 +0000
@@ -419,6 +419,7 @@ _hidden int libxl__need_xenpv_qemu(libxl
    * return pass *starting_r (which will be non-0) to
    * libxl__confirm_device_model_startup or libxl__detach_device_model. */
 _hidden int libxl__confirm_device_model_startup(libxl__gc *gc,
+                              libxl_device_model_info *dm_info,
                               libxl__spawner_starting *starting);
 _hidden int libxl__detach_device_model(libxl__gc *gc, libxl__spawner_starting *starting);
 _hidden int libxl__wait_for_device_model(libxl__gc *gc,
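The error handling in the patch above follows a common cleanup pattern: always attempt to remove the temporary file, but report the unlink() result only when the main operation succeeded. The following is a minimal standalone sketch of that pattern, not libxl code. The helper name finish_and_cleanup and its parameters are hypothetical, and it logs to stderr instead of using the libxl logging macros.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libxl): remove a saved-state file once
 * the primary operation has finished, and merge the two result codes.
 *
 * primary_rc  result of the main operation: 0 on success, negative on error
 * state_path  file to remove, or NULL if no saved state was in use
 *
 * The unlink() result is returned only when the primary operation
 * succeeded, so a cleanup failure never hides the original error.
 */
static int finish_and_cleanup(int primary_rc, const char *state_path)
{
    int rc = primary_rc;

    assert(primary_rc <= 0);   /* convention assumed by this sketch */

    if (state_path != NULL) {
        int unlink_rc = unlink(state_path);
        if (unlink_rc != 0)
            fprintf(stderr, "failed to remove saved state %s: %s\n",
                    state_path, strerror(errno));
        /* Do not clobber the primary error code with the unlink result. */
        if (rc == 0)
            rc = unlink_rc;
    }
    return rc;
}
```

Calling finish_and_cleanup(-3, path) removes the file and still returns -3, while finish_and_cleanup(0, path) returns -1 only if the removal itself failed.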
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> I think "xl trigger <dom> power" would be what is wanted here -- e.g.
> send an ACPI power event. It could be argued that xl shutdown could do
> this automatically?

libxl_domain_shutdown should do it automatically for HVM guests with
no PV drivers.

Ian.

> No active link message again but this time the guest says:
> For info, please visit https://www.isc.org/software/dhcp/
>
> SIOCSIFADDR: No such device
> eth0: ERROR while getting interface flags: No such device
> eth0: ERROR while getting interface flags: No such device
> Bind socket to interface: No such device
> Failed to bring up eth0.
> done.
> Cleaning up temporary files....
>
> If we could preserve a guest in that state and login it might prove
> informative. My guess would either be a missing/faulty VF driver or udev
> renaming things.

> > test-amd64-i386-xend-winxpsp3 16 leak-check/check fail never pass
> > test-i386-i386-win 16 leak-check/check fail never pass
> > test-amd64-i386-win-vcpus1 16 leak-check/check fail never pass
> > test-amd64-i386-win 16 leak-check/check fail never pass
>
> These all leaked a load of /var/lib/xen/qemu-resume.N. This should be
> quick & easy to fix, I'll have a look.

These are all xend, of course...

> > test-amd64-i386-xl-winxpsp3-vcpus1 7 windows-install fail in 10480 like 10474
> > test-amd64-i386-xend-winxpsp3 7 windows-install fail in 10480 like 10474
> > test-amd64-i386-xl-win7-amd64 7 windows-install fail in 10480 like 10474
> > test-amd64-amd64-xl-win7-amd64 7 windows-install fail in 10480 like 10474
> > test-amd64-i386-win 7 windows-install fail in 10480 like 10473
>
> These don't appear to have failed per the grid at
> http://www.chiark.greenend.org.uk/~xensrcts/logs/10481/ ?
>
> e.g. test-amd64-i386-xl-winxpsp3-vcpus1 appears to have failed at
> guest-stop instead (and indeed is also listed above in that capacity)

Perhaps the heading

> > Tests which did not succeed, but are not blocking,
> > including regressions (tests previously passed) regarded as allowable:

is slightly misleading, but it does say "fail in 10480". I.e. it passed
in 10481 but failed in 10480 which tested the same changeset.

> This appears to be reporting a failure in a previous run, part of the
> heisenbug detector? It might be nice to put those in a separate section
> or to include some indication as to the criteria being evaluated (e.g.
> are we waiting for a 3rd test to tiebreak?)

These are "not blocking" so they don't prevent a push. I see we got a
push in 10486...

Ian.
On Tue, 2011-12-13 at 11:33 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > I think "xl trigger <dom> power" would be what is wanted here -- e.g.
> > send an ACPI power event. It could be argued that xl shutdown could do
> > this automatically?
>
> libxl_domain_shutdown should do it automatically for HVM guests with
> no PV drivers.

I was about half way through implementing this when it occurred to me
that the reaction to a power button press is a guest OS option and can
result in a shutdown, a reboot or even a suspend.

Unless I'm mistaken?

Ian.
>>> On 13.12.11 at 15:07, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2011-12-13 at 11:33 +0000, Ian Jackson wrote:
>> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
>> > I think "xl trigger <dom> power" would be what is wanted here -- e.g.
>> > send an ACPI power event. It could be argued that xl shutdown could do
>> > this automatically?
>>
>> libxl_domain_shutdown should do it automatically for HVM guests with
>> no PV drivers.
>
> I was about half way through implementing this when it occurred to me
> that the reaction to a power button press is a guest OS option and can
> result in a shutdown, a reboot or even a suspend.

Correct.

Jan
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> I was about half way through implementing this when it occurred to me
> that the reaction to a power button press is a guest OS option and can
> result in a shutdown, a reboot or even a suspend.
>
> Unless I'm mistaken?

Firstly, aren't there several possible indications we could send, in
which case we pick one most likely to make the guest shut down?

Secondly, having the guest attempt to reboot is probably better than
simply shooting it in the head. (We could even intercept that and
have it be destroyed anyway.) At least this way if the user wants
graceful shutdown they can configure the guest to do a shutdown and
then xl shutdown will work.

At the moment it doesn't work at all without pv drivers.

Ian.
On Tue, 2011-12-13 at 14:20 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > I was about half way through implementing this when it occurred to me
> > that the reaction to a power button press is a guest OS option and can
> > result in a shutdown, a reboot or even a suspend.
> >
> > Unless I'm mistaken?
>
> Firstly, aren't there several possible indications we could send, in
> which case we pick one most likely to make the guest shut down?

The choices are (omitting the ia64 only ones):

xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_NMI 0
xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_POWER 3
xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_SLEEP 4

Of those I think POWER is the only appropriate one and we have no
control over what it will actually do.

> Secondly, having the guest attempt to reboot is probably better than
> simply shooting it in the head. (We could even intercept that and
> have it be destroyed anyway.) At least this way if the user wants
> graceful shutdown they can configure the guest to do a shutdown and
> then xl shutdown will work.
>
> At the moment it doesn't work at all without pv drivers.

You can explicitly ask for a power button event with "xl button-press".
At a minimum the error message should point to this in preference to
destroy.

You can also send the same event with "xl trigger". Quite why we need
two ways to do this I've no idea...

Ian.
>>> On 13.12.11 at 15:20, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
>> I was about half way through implementing this when it occurred to me
>> that the reaction to a power button press is a guest OS option and can
>> result in a shutdown, a reboot or even a suspend.
>>
>> Unless I'm mistaken?
>
> Firstly, aren't there several possible indications we could send, in
> which case we pick one most likely to make the guest shut down?
>
> Secondly, having the guest attempt to reboot is probably better than
> simply shooting it in the head. (We could even intercept that and
> have it be destroyed anyway.)

That would be acceptable when picking between shutdown and reboot,
but converting a guest attempt to suspend (to RAM) into shutdown
certainly isn't.

> At least this way if the user wants
> graceful shutdown they can configure the guest to do a shutdown and
> then xl shutdown will work.
>
> At the moment it doesn't work at all without pv drivers.

Isn't that the way it is intended to be?

Jan
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> The choices are (omitting the ia64 only ones):
> xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_NMI 0
> xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_POWER 3
> xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_SLEEP 4
>
> Of those I think POWER is the only appropriate one and we have no
> control over what it will actually do.

Right.

> > At the moment it doesn't work at all without pv drivers.

I think it would be better for "xl shutdown" to send a power button
event than to do nothing and print an error message.

> You can explicitly ask for a power button event with "xl button-press".
> At a minimum the error message should point to this in preference to
> destroy.

That would be a better improvement.

> You can also send the same event with "xl trigger". Quite why we need
> two ways to do this I've no idea...

Hmmm.

Ian.
On Tue, 2011-12-13 at 14:48 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > The choices are (omitting the ia64 only ones):
> > xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_NMI 0
> > xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_POWER 3
> > xen/include/public/domctl.h:#define XEN_DOMCTL_SENDTRIGGER_SLEEP 4
> >
> > Of those I think POWER is the only appropriate one and we have no
> > control over what it will actually do.
>
> Right.
>
> > > At the moment it doesn't work at all without pv drivers.
>
> I think it would be better for "xl shutdown" to send a power button
> event than to do nothing and print an error message.
>
> > You can explicitly ask for a power button event with "xl button-press".
> > At a minimum the error message should point to this in preference to
> > destroy.
>
> That would be a better improvement.

So which do you prefer? An error message pointing to "xl button-press"
or sending the button press?

Ian.

>
> > You can also send the same event with "xl trigger". Quite why we need
> > two ways to do this I've no idea...
>
> Hmmm.
>
> Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> On Tue, 2011-12-13 at 14:48 +0000, Ian Jackson wrote:
> > Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > > You can explicitly ask for a power button event with "xl button-press".
> > > At a minimum the error message should point to this in preference to
> > > destroy.
> >
> > That would be a better improvement.
>
> So which do you prefer? An error message pointing to "xl button-press"
> or sending the button press?

Sorry, I misphrased my email. I should have said that would be a
"minimal improvement". I would prefer "xl shutdown" to send the
button press.

Ian.
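The preference stated above (have "xl shutdown" fall back to a power button event for HVM guests without PV drivers) can be sketched as a small decision function. This is only an illustration of the proposal, not how xl or libxl behaves. The struct, enum and function names are hypothetical, and the trigger numbers are copied from the domctl.h lines quoted in this thread and redefined locally so the sketch builds without Xen headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Trigger numbers as quoted in this thread from
 * xen/include/public/domctl.h, under local (hypothetical) names. */
#define SKETCH_TRIGGER_NMI   0
#define SKETCH_TRIGGER_POWER 3
#define SKETCH_TRIGGER_SLEEP 4

enum shutdown_method {
    METHOD_PV_REQUEST,    /* ask the guest through its PV drivers */
    METHOD_POWER_BUTTON   /* fall back to an ACPI power button event */
};

/* Hypothetical description of a domain; not a libxl type. */
struct domain_desc {
    bool is_hvm;
    bool has_pv_drivers;
};

/* Pick how a graceful shutdown would be requested under the proposal. */
static enum shutdown_method pick_shutdown_method(const struct domain_desc *d)
{
    assert(d != NULL);
    if (d->is_hvm && !d->has_pv_drivers)
        return METHOD_POWER_BUTTON;
    return METHOD_PV_REQUEST;
}

/* Trigger number to send for a method, or -1 when no trigger is used. */
static int trigger_for_method(enum shutdown_method m)
{
    return m == METHOD_POWER_BUTTON ? SKETCH_TRIGGER_POWER : -1;
}
```

The sketch deliberately says nothing about what the guest does with the power button event, which is the objection raised elsewhere in the thread.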
On Tue, 2011-12-13 at 14:54 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > On Tue, 2011-12-13 at 14:48 +0000, Ian Jackson wrote:
> > > Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > > > You can explicitly ask for a power button event with "xl button-press".
> > > > At a minimum the error message should point to this in preference to
> > > > destroy.
> > >
> > > That would be a better improvement.
> >
> > So which do you prefer? An error message pointing to "xl button-press"
> > or sending the button press?
>
> Sorry, I misphrased my email. I should have said that would be a
> "minimal improvement". I would prefer "xl shutdown" to send the
> button press.

Just to clarify a bit further: do you think libxl_domain_shutdown should
implement this fallback or should it be left to xl to do?

Shall "xl reboot"/libxl_domain_reboot do the same?

NB: currently we have libxl_domain_shutdown which takes an integer
"request" type. I intend to split this into
libxl_domain_{shutdown,reboot}. There are some other request types
currently but they are not useful:

      * "suspend" is already provided by libxl_domain_suspend, which
        includes all the other required scaffolding which
        libxl_domain_shutdown does not.
      * "halt" which is a synonym for shutdown
      * "crash" which is unused and isn't supported at least by Linux,
        someone can add "xl crash" and libxl_domain_crash if they really
        want it.

Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> Just to clarify a bit further: do you think libxl_domain_shutdown should
> implement this fallback or should it be left to xl to do?

I'm not sure.

> Shall "xl reboot"/libxl_domain_reboot do the same?

Yes.

Putting those things together, it seems to me that something needs to
know that "reboot" or "shutdown" are being simulated with the power
button.

That's so that if you say "reboot" or "shutdown" and we press the
guest's power button, we know, when the guest does shut down,
whether we actually want to restart it.

Ian.
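The bookkeeping described above amounts to remembering which action the user asked for and combining it with what the guest actually did after the power button press. A minimal sketch of that decision follows, also taking in the earlier objection that a guest suspend must not be turned into a shutdown. All names are hypothetical; nothing like this exists in xl or libxl as far as this thread shows.

```c
#include <assert.h>

/* What the user asked the toolstack to do. */
enum request { REQ_SHUTDOWN, REQ_REBOOT };

/* How the guest reacted to the simulated power button press. */
enum guest_reaction { GUEST_POWEROFF, GUEST_REBOOT, GUEST_SUSPEND };

/* What the monitoring toolstack would do once the guest has reacted. */
enum followup { FOLLOWUP_DESTROY, FOLLOWUP_RESTART, FOLLOWUP_NONE };

/*
 * Hypothetical policy: the remembered request decides between destroying
 * and restarting the domain, whichever of poweroff or reboot the guest
 * chose. A guest suspend is left alone rather than converted.
 */
static enum followup resolve_followup(enum request req,
                                      enum guest_reaction seen)
{
    assert(req == REQ_SHUTDOWN || req == REQ_REBOOT);

    if (seen == GUEST_SUSPEND)
        return FOLLOWUP_NONE;
    return req == REQ_REBOOT ? FOLLOWUP_RESTART : FOLLOWUP_DESTROY;
}
```

For example, resolve_followup(REQ_REBOOT, GUEST_POWEROFF) yields FOLLOWUP_RESTART: the guest powered off, but the user asked for a reboot, so the domain would be started again.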
On Tue, 2011-12-13 at 15:06 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > Just to clarify a bit further: do you think libxl_domain_shutdown should
> > implement this fallback or should it be left to xl to do?
>
> I'm not sure.
>
> > Shall "xl reboot"/libxl_domain_reboot do the same?
>
> Yes.
>
> Putting those things together, it seems to me that something needs to
> know that "reboot" or "shutdown" are being simulated with the power
> button.

I think this requires deferring the fallback to xl. Which makes sense
since other toolstacks may have other policies in this regard.

> That's so that if you say "reboot" or "shutdown" and we press the
> guest's power button, we know, when the guest does shut down,
> whether we actually want to restart it.

That would involve potentially communicating with the daemonized version
of xl which is monitoring the domain (if it exists) and/or taking over
its function in the xl instance running the shutdown.

I'm tempted to just go with the minimal improvement at this stage.

Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> On Tue, 2011-12-13 at 15:06 +0000, Ian Jackson wrote:
> > That's so that if you say "reboot" or "shutdown" and we press the
> > guest's power button, we know, when the guest does shut down,
> > whether we actually want to restart it.
>
> That would involve potentially communicating with the daemonized version
> of xl which is monitoring the domain (if it exists) and/or taking over
> its function in the xl instance running the shutdown.

Yes. That could be done via xenstore of course.

> I'm tempted to just go with the minimal improvement at this stage.

Fair enough. It's not going to make the test pass though :-).

Ian.
On Tue, 2011-12-13 at 15:20 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > On Tue, 2011-12-13 at 15:06 +0000, Ian Jackson wrote:
> > > That's so that if you say "reboot" or "shutdown" and we press the
> > > guest's power button, we know, when the guest does shut down,
> > > whether we actually want to restart it.
> >
> > That would involve potentially communicating with the daemonized version
> > of xl which is monitoring the domain (if it exists) and/or taking over
> > its function in the xl instance running the shutdown.
>
> Yes. That could be done via xenstore of course.

Sure, but there's still a bunch of moving parts to arrange etc.

> > I'm tempted to just go with the minimal improvement at this stage.
>
> Fair enough. It's not going to make the test pass though :-).

Makes the test buggy though :-P

Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> On Tue, 2011-12-13 at 15:20 +0000, Ian Jackson wrote:
> > Fair enough. It's not going to make the test pass though :-).
>
> Makes the test buggy though :-P

As discussed, I think that "xl shutdown" should work and therefore the
test is not broken.

Ian.
Ian Jackson
2011-Dec-13 15:42 UTC
[PATCH] libxl: do not leak qemu saved state on restore (Was: Re: [xen-unstable test] 10481: regressions - FAIL)
Ian Campbell writes ("[Xen-devel] [PATCH] libxl: do not leak qemu saved state on restore (Was: Re: [xen-unstable test] 10481: regressions - FAIL)"):
> These happen to all be xend failures but the only reason xl doesn't have
> this is that those tests all fail at the guest-stop stage and never get
> as far as the migration test. Here is the fix for the xl version, fixing
> the xend side seems less trivial and I don't propose to dig into it.

Thanks.

> # HG changeset patch
> # User Ian Campbell <ian.campbell@citrix.com>
> # Date 1323772303 0
> # Node ID 9ea12474c6dcd75bbbc7a5c62a2b96de902ccb83
> # Parent 88df802b4905d7e34032eacf1942b70c76d150a4
> libxl: do not leak qemu saved state on restore
>
> In particular do not leak /var/lib/xen/qemu-resume.<domid>.
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
On Tue, 2011-12-13 at 15:30 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 10481: regressions - FAIL"):
> > On Tue, 2011-12-13 at 15:20 +0000, Ian Jackson wrote:
> > > Fair enough. It's not going to make the test pass though :-).
> >
> > Makes the test buggy though :-P
>
> As discussed, I think that "xl shutdown" should work and therefore the
> test is not broken.

"xl shutdown" is currently behaving precisely as documented.

Setting aside the complexity of implementing the behaviour you desire
(playing tricks with turning power button derived reboots into shutdowns
and vice versa), "xl shutdown" cannot work reliably because the guest can
perform effectively arbitrary actions on power button press. Doing
nothing is one option, and suspend/hibernation is the other common
reaction, which we certainly do not want to trigger when attempting to
shutdown or reboot.

In short I don't think the behaviour you desire is ever going to happen.
This test failure is preventing us from testing a much more interesting
scenario, specifically HVM guest migration.

Ian.