Jeremy Fitzhardinge
2011-Sep-16 00:33 UTC
[Xen-devel] xl create crash when using stub domains
When I create an HVM domain with stubdom enabled, it crashes at:

(gdb) run create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
Starting program: /usr/sbin/xl create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
[Thread debugging using libthread_db enabled]
Parsing config file /etc/xen/f14hv64
xc: info: VIRTUAL MEMORY ARRANGEMENT:
Loader: 0000000000100000->000000000017b9ec
TOTAL: 0000000000000000->000000003f800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000001fb
1GB PAGES: 0x0000000000000000
xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
Detaching after fork from child process 26888.
[New Thread 0x7ffff7342700 (LWP 26889)]
[Thread 0x7ffff7342700 (LWP 26889) exited]
[New Thread 0x7ffff7342700 (LWP 26921)]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bbbec5 in libxl__wait_for_device_model (gc=0x7fffffffdbb0,
    domid=22, state=0x7ffff7bc1b8c "running", starting=0x623760,
    check_callback=0, check_callback_userdata=0x0) at libxl_device.c:555
555     if (starting && starting->for_spawn->fd > xs_fileno(xsh))
(gdb) bt
#0  0x00007ffff7bbbec5 in libxl__wait_for_device_model (gc=0x7fffffffdbb0,
    domid=22, state=0x7ffff7bc1b8c "running", starting=0x623760,
    check_callback=0, check_callback_userdata=0x0) at libxl_device.c:555
#1  0x00007ffff7bb37b5 in libxl__confirm_device_model_startup (
    gc=0x7fffffffdbb0, starting=0x623760) at libxl_dm.c:922
#2  0x00007ffff7bb229b in do_domain_create (gc=0x7fffffffdbb0,
    d_config=0x7fffffffde30, cb=0x40a053 <autoconnect_console>,
    priv=0x7fffffffde14, domid_out=0x619ed8, restore_fd=-1)
    at libxl_create.c:576
#3  0x00007ffff7bb2481 in libxl_domain_create_new (ctx=<optimized out>,
    d_config=<optimized out>, cb=<optimized out>, priv=<optimized out>,
    domid=<optimized out>) at libxl_create.c:626
#4  0x0000000000409424 in create_domain (dom_info=0x7fffffffe0c0)
    at xl_cmdimpl.c:1520
#5  0x000000000040ceef in main_create (argc=6, argv=0x7fffffffe6b0)
    at xl_cmdimpl.c:3188
#6  0x000000000040501b in main (argc=6, argv=0x7fffffffe6b0) at xl.c:151

The stubdom seems fine, and when I unpause the main domain it seems to work fine.

J

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2011-Sep-21 01:34 UTC
Re: [Xen-devel] xl create crash when using stub domains
On 09/15/2011 05:33 PM, Jeremy Fitzhardinge wrote:
> When I create an HVM domain with stubdom enabled, it crashes at:
>
> (gdb) run create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
> Starting program: /usr/sbin/xl create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
> [Thread debugging using libthread_db enabled]
> Parsing config file /etc/xen/f14hv64
> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> Loader: 0000000000100000->000000000017b9ec
> TOTAL: 0000000000000000->000000003f800000
> ENTRY ADDRESS: 0000000000100000
> xc: info: PHYSICAL MEMORY ALLOCATION:
> 4KB PAGES: 0x0000000000000200
> 2MB PAGES: 0x00000000000001fb
> 1GB PAGES: 0x0000000000000000
> xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
> Detaching after fork from child process 26888.
> [New Thread 0x7ffff7342700 (LWP 26889)]
> [Thread 0x7ffff7342700 (LWP 26889) exited]
> [New Thread 0x7ffff7342700 (LWP 26921)]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff7bbbec5 in libxl__wait_for_device_model (gc=0x7fffffffdbb0,
>     domid=22, state=0x7ffff7bc1b8c "running", starting=0x623760,
>     check_callback=0, check_callback_userdata=0x0) at libxl_device.c:555
> 555     if (starting && starting->for_spawn->fd > xs_fileno(xsh))
> (gdb) bt

This patch seems to fix it, but I don't know if it is a real fix or just papering over something else.

J

diff -r 7779e12cc99e tools/libxl/libxl_device.c
--- a/tools/libxl/libxl_device.c	Tue Aug 16 17:05:18 2011 -0700
+++ b/tools/libxl/libxl_device.c	Tue Sep 20 18:23:03 2011 -0700
@@ -552,7 +552,7 @@
     tv.tv_sec = LIBXL_DEVICE_MODEL_START_TIMEOUT;
     tv.tv_usec = 0;
     nfds = xs_fileno(xsh) + 1;
-    if (starting && starting->for_spawn->fd > xs_fileno(xsh))
+    if (starting && starting->for_spawn && starting->for_spawn->fd > xs_fileno(xsh))
         nfds = starting->for_spawn->fd + 1;
 
     while (rc > 0 || (!rc && tv.tv_sec > 0)) {
@@ -586,7 +586,7 @@
         free(p);
         FD_ZERO(&rfds);
         FD_SET(xs_fileno(xsh), &rfds);
-        if (starting)
+        if (starting && starting->for_spawn)
             FD_SET(starting->for_spawn->fd, &rfds);
         rc = select(nfds, &rfds, NULL, NULL, &tv);
         if (rc > 0) {
@@ -597,7 +597,7 @@
                 else
                     goto again;
             }
-            if (starting && FD_ISSET(starting->for_spawn->fd, &rfds)) {
+            if (starting && starting->for_spawn && FD_ISSET(starting->for_spawn->fd, &rfds)) {
                 unsigned char dummy;
                 if (read(starting->for_spawn->fd, &dummy, sizeof(dummy)) != 1)
                     LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_DEBUG,
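For readers following along, here is a minimal standalone sketch of the guard this patch adds when computing the first argument to select(). The struct and function names (spawn_starting, dm_starting, compute_nfds) are hypothetical stand-ins that model only the fields involved; they are not the libxl definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libxl types; only the fields used here. */
struct spawn_starting {
    int fd;                           /* status pipe of a spawned process */
};

struct dm_starting {
    struct spawn_starting *for_spawn; /* assumed NULL when no process is spawned */
};

/*
 * Compute the nfds argument for select(): the highest descriptor of
 * interest plus one. Both starting and starting->for_spawn are checked
 * before fd is read, so neither NULL case is dereferenced.
 */
int compute_nfds(int xs_fd, const struct dm_starting *starting)
{
    int nfds;

    assert(xs_fd >= 0);
    nfds = xs_fd + 1;
    if (starting && starting->for_spawn && starting->for_spawn->fd > xs_fd)
        nfds = starting->for_spawn->fd + 1;
    return nfds;
}
```

Without the middle check, a non-NULL starting whose for_spawn is NULL would fault at the fd read, which matches the line the backtrace points at.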
Ian Campbell
2011-Sep-21 08:56 UTC
Re: [Xen-devel] xl create crash when using stub domains
On Wed, 2011-09-21 at 02:34 +0100, Jeremy Fitzhardinge wrote:
> On 09/15/2011 05:33 PM, Jeremy Fitzhardinge wrote:
> > When I create an HVM domain with stubdom enabled, it crashes at:
> >
> > (gdb) run create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
> > Starting program: /usr/sbin/xl create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
> > [Thread debugging using libthread_db enabled]
> > Parsing config file /etc/xen/f14hv64
> > xc: info: VIRTUAL MEMORY ARRANGEMENT:
> > Loader: 0000000000100000->000000000017b9ec
> > TOTAL: 0000000000000000->000000003f800000
> > ENTRY ADDRESS: 0000000000100000
> > xc: info: PHYSICAL MEMORY ALLOCATION:
> > 4KB PAGES: 0x0000000000000200
> > 2MB PAGES: 0x00000000000001fb
> > 1GB PAGES: 0x0000000000000000
> > xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel

FWIW I don't get this message. It seems unrelated to the issue here but makes me curious...

> > Detaching after fork from child process 26888.
> > [New Thread 0x7ffff7342700 (LWP 26889)]
> > [Thread 0x7ffff7342700 (LWP 26889) exited]
> > [New Thread 0x7ffff7342700 (LWP 26921)]
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x00007ffff7bbbec5 in libxl__wait_for_device_model (gc=0x7fffffffdbb0,
> >     domid=22, state=0x7ffff7bc1b8c "running", starting=0x623760,
> >     check_callback=0, check_callback_userdata=0x0) at libxl_device.c:555
> > 555     if (starting && starting->for_spawn->fd > xs_fileno(xsh))
> > (gdb) bt
>
> This patch seems to fix it, but I don't know if it is a real fix or just
> papering over something else.

I think this is correct because starting->for_spawn is only valid if the device model was launched with libxl__spawn_spawn, which is only the case for a process-based device model. libxl__create_device_model heads off into libxl__create_stubdom for the stubdom case and explicitly sets for_spawn == NULL.

Hmm, actually this function never uses starting except to get at for_spawn perhaps we should just pass in the for_spawn directly. Patch to that effect follows.

Ian.

ps: can you add this to your ~/.hgrc please:
[diff]
showfunc = True

8<-----------------------------------------------

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1316595312 -3600
# Node ID eb9330c89fd3843ff0b1348b0ef21cfeb22d4a76
# Parent  21db7a7dd18483aab5c651f2364c09e8e492d7b1
libxl: make libxl__wait_for_device_model use libxl__spawn_starting directly

Instead of indirecting via libxl_device_model_starting.

This fixes a segmentation fault using stubdomains where starting->for_spawn is
(validly) NULL because starting a stubdom doesn't need to spawn a process.

Most callers of libxl__wait_for_device_model already pass NULL for this
variable (because they are not on the starting path) so only
libxl__confirm_device_model_startup needs to change.

Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r 21db7a7dd184 -r eb9330c89fd3 tools/libxl/libxl_device.c
--- a/tools/libxl/libxl_device.c	Tue Sep 20 16:50:44 2011 +0100
+++ b/tools/libxl/libxl_device.c	Wed Sep 21 09:55:12 2011 +0100
@@ -528,7 +528,7 @@ out:
 
 int libxl__wait_for_device_model(libxl__gc *gc,
                                  uint32_t domid, char *state,
-                                 libxl__device_model_starting *starting,
+                                 libxl__spawn_starting *spawning,
                                  int (*check_callback)(libxl__gc *gc,
                                                        uint32_t domid,
                                                        const char *state,
@@ -558,12 +558,12 @@ int libxl__wait_for_device_model(libxl__
     tv.tv_sec = LIBXL_DEVICE_MODEL_START_TIMEOUT;
     tv.tv_usec = 0;
     nfds = xs_fileno(xsh) + 1;
-    if (starting && starting->for_spawn->fd > xs_fileno(xsh))
-        nfds = starting->for_spawn->fd + 1;
+    if (spawning && spawning->fd > xs_fileno(xsh))
+        nfds = spawning->fd + 1;
 
     while (rc > 0 || (!rc && tv.tv_sec > 0)) {
-        if ( starting ) {
-            rc = libxl__spawn_check(gc, starting->for_spawn);
+        if ( spawning ) {
+            rc = libxl__spawn_check(gc, spawning);
             if ( rc ) {
                 LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
                            "Device Model died during startup");
@@ -592,8 +592,8 @@ again:
         free(p);
         FD_ZERO(&rfds);
         FD_SET(xs_fileno(xsh), &rfds);
-        if (starting)
-            FD_SET(starting->for_spawn->fd, &rfds);
+        if (spawning)
+            FD_SET(spawning->fd, &rfds);
         rc = select(nfds, &rfds, NULL, NULL, &tv);
         if (rc > 0) {
             if (FD_ISSET(xs_fileno(xsh), &rfds)) {
@@ -603,9 +603,9 @@ again:
                 else
                     goto again;
             }
-            if (starting && FD_ISSET(starting->for_spawn->fd, &rfds)) {
+            if (spawning && FD_ISSET(spawning->fd, &rfds)) {
                 unsigned char dummy;
-                if (read(starting->for_spawn->fd, &dummy, sizeof(dummy)) != 1)
+                if (read(spawning->fd, &dummy, sizeof(dummy)) != 1)
                     LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_DEBUG,
                                      "failed to read spawn status pipe");
             }
diff -r 21db7a7dd184 -r eb9330c89fd3 tools/libxl/libxl_dm.c
--- a/tools/libxl/libxl_dm.c	Tue Sep 20 16:50:44 2011 +0100
+++ b/tools/libxl/libxl_dm.c	Wed Sep 21 09:55:12 2011 +0100
@@ -934,7 +934,7 @@ int libxl__confirm_device_model_startup(
 {
     int detach;
     int problem = libxl__wait_for_device_model(gc, starting->domid, "running",
-                                               starting, NULL, NULL);
+                                               starting->for_spawn, NULL, NULL);
     detach = detach_device_model(gc, starting);
     return problem ? problem : detach;
 }
diff -r 21db7a7dd184 -r eb9330c89fd3 tools/libxl/libxl_internal.h
--- a/tools/libxl/libxl_internal.h	Tue Sep 20 16:50:44 2011 +0100
+++ b/tools/libxl/libxl_internal.h	Wed Sep 21 09:55:12 2011 +0100
@@ -288,7 +288,7 @@ _hidden int libxl__confirm_device_model_
 _hidden int libxl__detach_device_model(libxl__gc *gc,
                                        libxl__device_model_starting *starting);
 _hidden int libxl__wait_for_device_model(libxl__gc *gc, uint32_t domid, char *state,
-                                libxl__device_model_starting *starting,
+                                libxl__spawn_starting *spawning,
                                 int (*check_callback)(libxl__gc *gc,
                                                       uint32_t domid,
                                                       const char *state,
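The shape of this change, taking the optional inner pointer directly so that a single NULL check covers every use, can be sketched outside libxl as follows. This is an illustration under assumptions: spawn_starting, wait_for_ready and the READY_* flags are hypothetical names, and only the select() handling is modelled, not the xenstore watch logic.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for the spawn state; only the status pipe fd. */
struct spawn_starting {
    int fd;
};

enum { READY_WATCH = 1, READY_SPAWN = 2 };

/*
 * Wait up to timeout_sec seconds for watch_fd and, if spawning is
 * non-NULL, for the spawn status pipe. Returns a bitmask of READY_*
 * flags, 0 on timeout, or -1 if select() fails.
 */
int wait_for_ready(int watch_fd, const struct spawn_starting *spawning,
                   int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    int nfds, rc, ready = 0;

    assert(watch_fd >= 0 && watch_fd < FD_SETSIZE);

    nfds = watch_fd + 1;
    if (spawning && spawning->fd > watch_fd)
        nfds = spawning->fd + 1;

    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    FD_ZERO(&rfds);
    FD_SET(watch_fd, &rfds);
    if (spawning)
        FD_SET(spawning->fd, &rfds);

    rc = select(nfds, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;

    if (FD_ISSET(watch_fd, &rfds))
        ready |= READY_WATCH;
    if (spawning && FD_ISSET(spawning->fd, &rfds)) {
        unsigned char dummy;
        /* Drain one status byte, as the original loop does. */
        if (read(spawning->fd, &dummy, sizeof(dummy)) == 1)
            ready |= READY_SPAWN;
    }
    return ready;
}
```

A caller that has no spawned process simply passes NULL for spawning, which is what the stubdom path ends up doing once starting->for_spawn is handed through unchanged.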
Jeremy Fitzhardinge
2011-Sep-21 23:06 UTC
Re: [Xen-devel] xl create crash when using stub domains
On 09/21/2011 01:56 AM, Ian Campbell wrote:
> On Wed, 2011-09-21 at 02:34 +0100, Jeremy Fitzhardinge wrote:
>> On 09/15/2011 05:33 PM, Jeremy Fitzhardinge wrote:
>>> When I create an HVM domain with stubdom enabled, it crashes at:
>>>
>>> (gdb) run create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
>>> Starting program: /usr/sbin/xl create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
>>> [Thread debugging using libthread_db enabled]
>>> Parsing config file /etc/xen/f14hv64
>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>>> Loader: 0000000000100000->000000000017b9ec
>>> TOTAL: 0000000000000000->000000003f800000
>>> ENTRY ADDRESS: 0000000000100000
>>> xc: info: PHYSICAL MEMORY ALLOCATION:
>>> 4KB PAGES: 0x0000000000000200
>>> 2MB PAGES: 0x00000000000001fb
>>> 1GB PAGES: 0x0000000000000000
>>> xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
> FWIW I don't get this message. It seems unrelated to the issue here but
> makes me curious...

It's generated by xc_dom_probe_bzimage_kernel() when starting a PV domain with pvgrub or an HVM domain with stubdoms, AFAICT. Do you not see it, or just not do those things?

Seems to me that a "probe" function shouldn't be making obnoxious noise.

J
Ian Campbell
2011-Sep-22 06:23 UTC
Re: [Xen-devel] xl create crash when using stub domains
On Thu, 2011-09-22 at 00:06 +0100, Jeremy Fitzhardinge wrote:
> On 09/21/2011 01:56 AM, Ian Campbell wrote:
> > On Wed, 2011-09-21 at 02:34 +0100, Jeremy Fitzhardinge wrote:
> >> On 09/15/2011 05:33 PM, Jeremy Fitzhardinge wrote:
> >>> When I create an HVM domain with stubdom enabled, it crashes at:
> >>>
> >>> (gdb) run create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
> >>> Starting program: /usr/sbin/xl create -c /etc/xen/f14hv64 vcpus=4 xen_platform_pci=0 'boot="d"'
> >>> [Thread debugging using libthread_db enabled]
> >>> Parsing config file /etc/xen/f14hv64
> >>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> >>> Loader: 0000000000100000->000000000017b9ec
> >>> TOTAL: 0000000000000000->000000003f800000
> >>> ENTRY ADDRESS: 0000000000100000
> >>> xc: info: PHYSICAL MEMORY ALLOCATION:
> >>> 4KB PAGES: 0x0000000000000200
> >>> 2MB PAGES: 0x00000000000001fb
> >>> 1GB PAGES: 0x0000000000000000
> >>> xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
> > FWIW I don't get this message. It seems unrelated to the issue here but
> > makes me curious...
>
> It's generated by xc_dom_probe_bzimage_kernel() when starting a PV
> domain with pvgrub or an HVM domain with stubdoms, AFAIKT. Do you not
> see it, or just not do those things?

I didn't think I'd seen it (booting w/ a stubdom) but looking at the code it must have been in there somewhere.

> Seems to me that a "probe"
> function shouldn't be making obnoxious noise.

Full ACK.

Ian.
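To illustrate the point about quiet probes, here is a small sketch in which each probe only returns match or no-match and reporting is left to the caller, once every loader has been tried. All names here (probe_bzimage, probe_elf, find_loader) are hypothetical and this is not the libxc loader code; the one external assumption is the Linux x86 boot protocol's "HdrS" signature at offset 0x202.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum probe_result { PROBE_MATCH = 0, PROBE_NO_MATCH = -1 };

/* Offset of the "HdrS" signature in the Linux x86 setup header. */
#define BZIMAGE_MAGIC_OFFSET 0x202

/* Probes stay silent: a mismatch is an answer, not an error. */
int probe_bzimage(const unsigned char *blob, size_t len)
{
    if (len < BZIMAGE_MAGIC_OFFSET + 4)
        return PROBE_NO_MATCH;
    return memcmp(blob + BZIMAGE_MAGIC_OFFSET, "HdrS", 4) == 0
        ? PROBE_MATCH : PROBE_NO_MATCH;
}

int probe_elf(const unsigned char *blob, size_t len)
{
    if (len < 4)
        return PROBE_NO_MATCH;
    return memcmp(blob, "\177ELF", 4) == 0 ? PROBE_MATCH : PROBE_NO_MATCH;
}

struct loader {
    const char *name;
    int (*probe)(const unsigned char *blob, size_t len);
};

/*
 * Try each loader in turn and return the name of the first match, or
 * NULL if none matched. Only the caller, seeing NULL, has enough
 * context to decide whether anything is worth logging.
 */
const char *find_loader(const unsigned char *blob, size_t len)
{
    static const struct loader loaders[] = {
        { "bzimage", probe_bzimage },
        { "elf",     probe_elf },
    };
    size_t i;

    assert(blob != NULL);
    for (i = 0; i < sizeof(loaders) / sizeof(loaders[0]); i++)
        if (loaders[i].probe(blob, len) == PROBE_MATCH)
            return loaders[i].name;
    return NULL;
}
```

With this split, an image that is not a bzImage but is picked up by a later loader produces no message at all.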
Ian Campbell writes ("Re: [Xen-devel] xl create crash when using stub domains"):
> Hmm, actually this function never uses starting except to get at
> for_spawn perhaps we should just pass in the for_spawn directly. Patch
> to that effect follows.
...
> libxl: make libxl__wait_for_device_model use libxl__spawn_starting directly

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>

The original reason for the for_spawn was that all this create code used to be outside libxl, where it shouldn't be looking into libxl's private data structures. That reason no longer applies.

Ian.