Hi!

'xl create' crashes due to stack corruption.

This is a backtrace where everything still seems to be fine.
Have a look at the 'fmt' and 'dir' arguments.

(gdb) bt
#0  0x00007f7ffc03f4e2 in vsnprintf () from /usr/lib/libc.so.12
#1  0x00007f7ffd416ab5 in libxl__sprintf (gc=0x7f7fffffd220, fmt=0x7f7ffd41c467 "%s/%s") at libxl_internal.c:112
#2  0x00007f7ffd415258 in libxl__xs_writev (gc=0x7f7fffffd220, t=9, dir=0x7f7ffdb010e0 "/vm/46ac5197-ecc6-df11-bcf6-00e081806fbe", kvs=0x7f7ffdb1f0b0) at libxl_xshelp.c:82
#3  0x00007f7ffd40f477 in libxl_create_device_model (ctx=0x6179a0, info=0x7f7fffffd470, disks=0x7f7ffdb014d0, num_disks=1, vifs=0x7f7ffdb16070, num_vifs=1, starting_r=0x7f7fffffd440) at libxl.c:1723
#4  0x000000000040f6c1 in create_domain (dom_info=0x7f7fffffd6f0) at xl_cmdimpl.c:1458
#5  0x00000000004104b3 in main_create (argc=2, argv=0x7f7fffffdbc8) at xl_cmdimpl.c:3214
#6  0x0000000000404ad2 in main (argc=3, argv=0x7f7fffffdbc8) at xl.c:79

This is a backtrace taken at the point where the stack corruption happened.
Have a look at the 'fmt' and 'dir' arguments.

(gdb) cont
Continuing.
Watchpoint 3: fmt

Old value = 0x7f7ffd41c467 "%s/%s"
New value = 0x7f7fffffce30 "\020"
(gdb) bt
#0  0x00007f7ffc03f553 in vsnprintf () from /usr/lib/libc.so.12
#1  0x00007f7ffd416ab5 in libxl__sprintf (gc=0x7f7fffffd220, fmt=0x7f7fffffce30 "\020") at libxl_internal.c:112
#2  0x00007f7ffd415258 in libxl__xs_writev (gc=0x7f7fffffd220, t=9, dir=0x7f7ffdb010e0 "/vm/46ac5197-ecc6-df11-bcf6-00e081806fbe", kvs=0x7f7ffdb1f0b0) at libxl_xshelp.c:82
#3  0x00007f7ffd40f477 in libxl_create_device_model (ctx=0x6179a0, info=0x7f7fffffd470, disks=0x7f7ffdb014d0, num_disks=1, vifs=0x7f7ffdb16070, num_vifs=1, starting_r=0x7f7fffffd440) at libxl.c:1723
#4  0x000000000040f6c1 in create_domain (dom_info=0x7f7fffffd6f0) at xl_cmdimpl.c:1458
#5  0x00000000004104b3 in main_create (argc=2, argv=0x7f7fffffdbc8) at xl_cmdimpl.c:3214
#6  0x0000000000404ad2 in main (argc=3, argv=0x7f7fffffdbc8) at xl.c:79

This is a backtrace taken after the stack corruption caused a segfault.
Have a look at the 'fmt' and 'dir' arguments.

(gdb) bt
#0  0x00007f7ffc0cac0e in __vfprintf_unlocked () from /usr/lib/libc.so.12
#1  0x00007f7ffc03f55b in vsnprintf () from /usr/lib/libc.so.12
#2  0x00007f7ffd416ab5 in libxl__sprintf (gc=0x7f7fffffd220, fmt=0x7265776f705f6e6f <Address 0x7265776f705f6e6f out of bounds>) at libxl_internal.c:112
#3  0x00007f7ffd415258 in libxl__xs_writev (gc=0x7f7fffffd220, t=13, dir=0x7f7ffdb010e0 "/vm/46ac5197-ecc6-df11-bcf6-00e081806fbe", kvs=0x7f7ffdb1f0b0) at libxl_xshelp.c:82
#4  0x00007f7ffd40f477 in libxl_create_device_model (ctx=0x6179a0, info=0x7f7fffffd470, disks=0x7f7ffdb014d0, num_disks=1, vifs=0x7f7ffdb16070, num_vifs=1, starting_r=0x7f7fffffd440) at libxl.c:1723
#5  0x000000000040f6c1 in create_domain (dom_info=0x7f7fffffd6f0) at xl_cmdimpl.c:1458
#6  0x00000000004104b3 in main_create (argc=2, argv=0x7f7fffffdbc8) at xl_cmdimpl.c:3214
#7  0x0000000000404ad2 in main (argc=3, argv=0x7f7fffffdbc8) at xl.c:79

The crash is reproducible.
The 'dir' argument always contains the UUID string when the stack corruption
happens, and 'dir' seems to be at its longest when it contains the UUID string.

-- 
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

On Thu, 2010-09-23 at 11:27 +0100, Christoph Egger wrote:
> Hi!
>
> 'xl create' crashes due to stack corruption.

---8<-------------------------------------------
xl: Fix stack corruption caused by non-terminated call to libxl__xs_writev

Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com>

diff -r 50c1cc209f8f tools/libxl/libxl.c
--- a/tools/libxl/libxl.c	Wed Sep 22 18:29:24 2010 +0100
+++ b/tools/libxl/libxl.c	Thu Sep 23 16:54:09 2010 +0100
@@ -1718,7 +1718,7 @@ retry_transaction:
     vm_path = libxl__xs_read(&gc,t,libxl__sprintf(&gc, "%s/vm", p->dom_path));
     if (vm_path) {
         /* Now write the vncpassword into it. */
-        pass_stuff = libxl__calloc(&gc, 2, sizeof(char *));
+        pass_stuff = libxl__calloc(&gc, 3, sizeof(char *));
         pass_stuff[0] = "vncpasswd";
         pass_stuff[1] = info->vncpasswd;
         libxl__xs_writev(&gc,t,vm_path,pass_stuff);
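
For readers following the thread: the one-character change in the patch grows
the pass_stuff array from two slots to three. The sketch below illustrates why
that can matter. It assumes, as the patch title ("non-terminated call")
suggests, that the consumer walks the array in key/value pairs and stops at the
first NULL key. count_pairs and make_vnc_pairs are hypothetical names made up
for this illustration; they are not libxl functions, and the real
libxl__xs_writev may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a consumer of a flat key/value array:
 * kvs[0] is a key, kvs[1] its value, kvs[2] the next key, and so on.
 * The walk stops at the first NULL key, so the caller must provide one.
 * Returns the number of pairs visited.
 */
static size_t count_pairs(char **kvs)
{
    size_t n = 0;

    assert(kvs != NULL);
    for (size_t i = 0; kvs[i] != NULL; i += 2)
        n++;
    return n;
}

/*
 * Build a one-pair array with room for the terminator: three slots,
 * not two. With only two slots, count_pairs() would read kvs[2] past
 * the end of the allocation and keep walking whatever memory follows.
 */
static char **make_vnc_pairs(char *passwd)
{
    char **kvs = calloc(3, sizeof(char *));

    if (kvs == NULL)
        return NULL;
    kvs[0] = "vncpasswd";
    kvs[1] = passwd;
    kvs[2] = NULL; /* explicit terminator for the pair walk */
    return kvs;
}
```

With the terminator in place, count_pairs(make_vnc_pairs(pw)) visits exactly
one pair. Without it, the loop treats the memory after the array as further
keys and values, which would be consistent with the garbage 'fmt' pointers in
the backtraces that opened this thread.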

On Thursday 23 September 2010 17:55:39 Gianni Tedesco wrote:
> On Thu, 2010-09-23 at 11:27 +0100, Christoph Egger wrote:
> > Hi!
> >
> > 'xl create' crashes due to stack corruption.
>
> ---8<-------------------------------------------
>
> xl: Fix stack corruption caused by non-terminated call to libxl__xs_writev

Confirmed. This fixes the crash.
'xl create' now hangs in a loop. I will have a look at this later.

> Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com>
>
> diff -r 50c1cc209f8f tools/libxl/libxl.c
> --- a/tools/libxl/libxl.c	Wed Sep 22 18:29:24 2010 +0100
> +++ b/tools/libxl/libxl.c	Thu Sep 23 16:54:09 2010 +0100
> @@ -1718,7 +1718,7 @@ retry_transaction:
>      vm_path = libxl__xs_read(&gc,t,libxl__sprintf(&gc, "%s/vm", p->dom_path));
>      if (vm_path) {
>          /* Now write the vncpassword into it. */
> -        pass_stuff = libxl__calloc(&gc, 2, sizeof(char *));
> +        pass_stuff = libxl__calloc(&gc, 3, sizeof(char *));
>          pass_stuff[0] = "vncpasswd";
>          pass_stuff[1] = info->vncpasswd;
>          libxl__xs_writev(&gc,t,vm_path,pass_stuff);

On Thu, 2010-09-23 at 17:18 +0100, Christoph Egger wrote:
> On Thursday 23 September 2010 17:55:39 Gianni Tedesco wrote:
> > On Thu, 2010-09-23 at 11:27 +0100, Christoph Egger wrote:
> > > Hi!
> > >
> > > 'xl create' crashes due to stack corruption.
> >
> > ---8<-------------------------------------------
> >
> > xl: Fix stack corruption caused by non-terminated call to libxl__xs_writev
>
> Confirmed. This fixes the crash.
> 'xl create' now hangs in a loop. I will have a look at this later.

Weird, did this start happening recently?

I blame Stefano ;)

On Thursday 23 September 2010 19:06:40 Gianni Tedesco wrote:
> On Thu, 2010-09-23 at 17:18 +0100, Christoph Egger wrote:
> > On Thursday 23 September 2010 17:55:39 Gianni Tedesco wrote:
> > > On Thu, 2010-09-23 at 11:27 +0100, Christoph Egger wrote:
> > > > Hi!
> > > >
> > > > 'xl create' crashes due to stack corruption.
> > >
> > > ---8<-------------------------------------------
> > >
> > > xl: Fix stack corruption caused by non-terminated call to
> > > libxl__xs_writev
> >
> > Confirmed. This fixes the crash.
> > 'xl create' now hangs in a loop. I will have a look at this later.
>
> Weird, did this start happening recently?

I don't know. I didn't get that far until now.

Attached patch fixes the hang. The issue is that 'xl create' tries to
start qemu-dm from a directory it isn't installed in.

Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>

Now I can start a guest with 'xl create', but xl is still not there yet.
I can spot two more bugs:

1) I can't see the full boot loader output on the guest's serial console.
   The output starts when the bootloader invokes a timer.
   The output coming before that is skipped.

   When I start the guest with 'xm', all is fine.

2) The guest crashes:

Booting "Xen-in-Xen" ends with:
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using new ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=0 pin2=0
(XEN) ..MP-BIOS bug: 8254 timer not connected to IO-APIC
(XEN) ...trying to set up timer (IRQ0) through the 8259A ...
(XEN) ..... (found pin 0) ... failed.
(XEN) ...trying to set up timer as Virtual Wire IRQ... failed.
(XEN) ...trying to set up timer as ExtINT IRQ... failed :(.
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option****************************************
(XEN)
(XEN) Reboot in five seconds...

Booting a Linux guest ends with:
[    0.004000] Setting APIC routing to physical flat
[    0.004000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=0 pin2=0
[    0.004000] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[    0.004000] ...trying to set up timer (IRQ0) through the 8259A ...
[    0.004000] ..... (found apic 0 pin 0) ...
[    0.004000] ....... failed.
[    0.004000] ...trying to set up timer as Virtual Wire IRQ...
[    0.004000] ..... failed.
[    0.004000] ...trying to set up timer as ExtINT IRQ...
[    0.004000] ..... failed :(.
[    0.004000] Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option.
[    0.004000]
[    0.004000] Pid: 1, comm: swapper Not tainted 2.6.34 #2
[    0.004000] Call Trace:
[    0.004000]  [<ffffffff816055ad>] panic+0xa3/0x11e
[    0.004000]  [<ffffffff81028db4>] ? default_spin_lock_flags+0x9/0xd
[    0.004000]  [<ffffffff81028db4>] ? default_spin_lock_flags+0x9/0xd
[    0.004000]  [<ffffffff812303d5>] ? __const_udelay+0x42/0x44
[    0.004000]  [<ffffffff81df9a14>] setup_IO_APIC+0x9e8/0xa3d
[    0.004000]  [<ffffffff81028d26>] ? native_patch+0x1b9/0x1cb
[    0.004000]  [<ffffffff81df52b0>] native_smp_prepare_cpus+0x2fd/0x38c
[    0.004000]  [<ffffffff81de9606>] kernel_init+0x71/0x1de
[    0.004000]  [<ffffffff8100ab24>] kernel_thread_helper+0x4/0x10
[    0.004000]  [<ffffffff81de9595>] ? kernel_init+0x0/0x1de
[    0.004000]  [<ffffffff8100ab20>] ? kernel_thread_helper+0x0/0x10

The guests boot fine when I start them with 'xm create'.

Christoph

On Tue, 28 Sep 2010, Christoph Egger wrote:
> Attached patch fixes the hang. The issue is that 'xl create' tries to
> start qemu-dm from a directory it isn't installed in.
>

Thanks for the patch, I have applied it.

>
> Now I can start a guest with 'xl create' but xl is still not there.
> I can spot yet another bugs:
>
> 1) I can't see the full boot loader output on the guest's serial console.
>    The output starts when the bootloader invokes a timer.
>    The output coming before that is sort of skipped.
>
>    When I start the guest with 'xm' all is fine.
>
>
> 2) The guest crashes:
>
> Booting "Xen-in-Xen" ends with:
> (XEN) ENABLING IO-APIC IRQs
> (XEN)  -> Using new ACK method
> (XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=0 pin2=0
> (XEN) ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> (XEN) ...trying to set up timer (IRQ0) through the 8259A ...
> (XEN) ..... (found pin 0) ... failed.
> (XEN) ...trying to set up timer as Virtual Wire IRQ... failed.
> (XEN) ...trying to set up timer as ExtINT IRQ... failed :(.
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> Booting a Linux guest ends with:
> [    0.004000] Setting APIC routing to physical flat
> [    0.004000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=0 pin2=0
> [    0.004000] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> [    0.004000] ...trying to set up timer (IRQ0) through the 8259A ...
> [    0.004000] ..... (found apic 0 pin 0) ...
> [    0.004000] ....... failed.
> [    0.004000] ...trying to set up timer as Virtual Wire IRQ...
> [    0.004000] ..... failed.
> [    0.004000] ...trying to set up timer as ExtINT IRQ...
> [    0.004000] ..... failed :(.
> [    0.004000] Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option.
> [    0.004000]
> [    0.004000] Pid: 1, comm: swapper Not tainted 2.6.34 #2
> [    0.004000] Call Trace:
> [    0.004000]  [<ffffffff816055ad>] panic+0xa3/0x11e
> [    0.004000]  [<ffffffff81028db4>] ? default_spin_lock_flags+0x9/0xd
> [    0.004000]  [<ffffffff81028db4>] ? default_spin_lock_flags+0x9/0xd
> [    0.004000]  [<ffffffff812303d5>] ? __const_udelay+0x42/0x44
> [    0.004000]  [<ffffffff81df9a14>] setup_IO_APIC+0x9e8/0xa3d
> [    0.004000]  [<ffffffff81028d26>] ? native_patch+0x1b9/0x1cb
> [    0.004000]  [<ffffffff81df52b0>] native_smp_prepare_cpus+0x2fd/0x38c
> [    0.004000]  [<ffffffff81de9606>] kernel_init+0x71/0x1de
> [    0.004000]  [<ffffffff8100ab24>] kernel_thread_helper+0x4/0x10
> [    0.004000]  [<ffffffff81de9595>] ? kernel_init+0x0/0x1de
> [    0.004000]  [<ffffffff8100ab20>] ? kernel_thread_helper+0x0/0x10
>
> The guests boot fine when I start them with 'xm create'.
>

Could you attach the VM config file here?
Does that config file work fine with xl in a normal Xen environment?
My guess is that it is due to a difference in default values between xl
and xend (for example timer_mode).