Kurt C. Hackel
2007-Dec-07 21:05 UTC
[Xen-devel] Crash in hypercall_xlat_continuation with 64bit xen 3.1.1 when migrating a 32bit pv guest
Hi,

I'm seeing a crash in 64bit xen when doing live migration of a 32bit pv guest under load. The crash does not occur without the load, it happens with both Red Hat/Oracle and Xensource kernels (el4u5 in this case), and it takes about 3-5 migrations to trigger the bug.

This is triggering the BUG_ON() in xen/arch/x86/domain.c here:

    if ( (mask & 1) && *reg == nval )
    {
        *reg = cval;
        ++rc;
    }
    else
        BUG_ON(*reg != (unsigned int)*reg);   <<<<

Below you'll see that the 32bit guest ebx register, stored here in the 64bit reg variable, is supposed to have the upper 32bits cleared, but it has 0x00000080 in the upper 32bits.

I'll take a look at the save/restore code in libxc, unless there are other suggestions.

Thanks
kurt

(XEN) *reg=0000008000000018 i=0 mask=1 nval=549755814584(0x00000080000002B8) cval=3217411164
(XEN) ebx=0000008000000018 ecx=00000000800000a9 edx=0000000000000000 esi=0000000000000004 edi=00000000006221d4 ebp=0000000000305000
(XEN) Xen BUG at domain.c:1522
(XEN) ----[ Xen-3.1.1 x86_64 debug=n Tainted: C ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff828c801243fe>] hypercall_xlat_continuation+0x34e/0x390
(XEN) RFLAGS: 0000000000210206 CONTEXT: hypervisor
(XEN) rax: 0000000000000018 rbx: ffff8300ceef7f50 rcx: 00000000000024d1
(XEN) rdx: 0000008000000018 rsi: 000000000000000a rdi: ffff828c80205d89
(XEN) rbp: 0000000000000001 rsp: ffff8300ceef7d58 r8: 0000000000000001
(XEN) r9: 0000000000000001 r10: 00000000ffffffff r11: ffff828c8011ee00
(XEN) r12: ffff8300ceef7f28 r13: 0000000000000000 r14: ffff8300ceef7ee4
(XEN) r15: 00000080000002b8 cr0: 000000008005003b cr4: 00000000000026b0
(XEN) cr3: 000000012d1f3000 cr2: 00000000b7d2a000
(XEN) ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff8300ceef7d58:
(XEN) 0000000000305000 5555555555555555 0000000000000000 bfc5cc5c00000000
(XEN) 0000000000000020 ffff8300ceef7e88 ffff8300ceef7d98 0000000060000000
(XEN) ffff828c801c44c0 000000000012a0cb 00000080000002b8 00000000bfc5cc5c
(XEN) 0000000000000000 0000000000000004 ffff8300cefca080 ffff828c80133e95
(XEN) 0000000402abf250 0000000000000000 000000010000001a ffff8300cfdee080
(XEN) ffff828c801c44c0 0000008000000018 0000008000000002 000000000012a0cb
(XEN) 0000008000000fd8 cefca0808012c510 00000000c071efd0 cefca08000000000
(XEN) 0000000000200246 000000002b6fa025 0000000000800025 00000000800000a9
(XEN) ffff828c801c7020 00000000000000c6 00000000bfc5d2f8 00000000000000aa
(XEN) ffff8300ceef7f28 ffff828c8017d8e5 ffff8300ceef7f28 ffff8300ceef7f28
(XEN) ffff8300ceef7ed8 ffff8300ceef7ee4 0000000400000000 0000000000000000
(XEN) 000000aa00000000 ffff828c801c4020 ffff8300ceef7f28 0000008000000000
(XEN) 00100d6b00000002 0000000100000000 000000010de4e067 ffff8300cfdee080
(XEN) 0000000000305000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 ffff828c80185097 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000305000 0000008000000018
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 000000000000001a 00000000800000a9 0000000000000000 0000000000000004
(XEN) 00000000006221d4 0000010000000000 00000000c0401345 0000000000000061
(XEN) 0000000000200282 00000000c583cee0 0000000000000069 5555555555555555
(XEN) Xen call trace:
(XEN) [<ffff828c801243fe>] hypercall_xlat_continuation+0x34e/0x390
(XEN) [<ffff828c80133e95>] do_mmuext_op+0xc95/0xd50
(XEN) [<ffff828c8017d8e5>] compat_mmuext_op+0x3d5/0x470
(XEN) [<ffff828c80185097>] compat_hypercall+0x57/0x60
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) Xen BUG at domain.c:1522
(XEN) ****************************************
(XEN)
(XEN) Manual reset required ('noreboot' specified)
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
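For readers following the report, the condition behind the BUG_ON() can be reproduced in isolation. The following is only a minimal sketch, not Xen code and not the fix from the referenced changeset: the helper names upper_half_clear, upper_half, low_half and check_values_from_report are hypothetical, and fixed-width types stand in for the `unsigned int` cast (which is 32 bits wide on x86_64). The constants are copied from the console log in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a Xen function): true when a 64-bit register
 * slot holds a value that fits in 32 bits, i.e. its upper half is zero.
 * The BUG_ON() in the report fires exactly when this returns false. */
static bool upper_half_clear(uint64_t reg)
{
    return reg == (uint32_t)reg;
}

/* Hypothetical helpers: split a 64-bit slot into its two 32-bit halves. */
static uint32_t upper_half(uint64_t reg)
{
    return (uint32_t)(reg >> 32);
}

static uint32_t low_half(uint64_t reg)
{
    return (uint32_t)reg;
}

/* Apply the helpers to the values printed in the report. */
static void check_values_from_report(void)
{
    const uint64_t ebx  = 0x0000008000000018ULL; /* "*reg" / ebx in the log */
    const uint64_t edi  = 0x00000000006221d4ULL; /* a well-formed register  */
    const uint64_t nval = 549755814584ULL;       /* decimal nval in the log */

    assert(!upper_half_clear(ebx));   /* this is what trips the BUG_ON()   */
    assert(upper_half(ebx) == 0x80);  /* the stray bits noted in the report */
    assert(low_half(ebx) == 0x18);
    assert(upper_half_clear(edi));    /* edi would pass the same check      */
    assert(nval == 0x00000080000002B8ULL); /* decimal and hex forms agree   */
}
```

Calling check_values_from_report() from a test harness confirms that the ebx value in the log fails the 32-bit check while edi passes it, and that nval carries the same 0x80 in its upper half.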
Ian Pratt
2007-Dec-07 21:47 UTC
RE: [Xen-devel] Crash in hypercall_xlat_continuation with 64bit xen 3.1.1 when migrating a 32bit pv guest
> I'm seeing a crash in 64bit xen when doing live migration of a 32bit pv
> guest under load. The crash does not occur without the load, it
> happens with both Red Hat/Oracle and Xensource kernels (el4u5 in this case),
> and it takes about 3-5 migrations to trigger the bug.

> Below you'll see that the 32bit guest ebx register, stored here in the
> 64bit reg variable, is supposed to have the upper 32bits cleared, but
> it has 0x00000080 in the upper 32bits.

Have you tried pulling through this patch?

http://xenbits.xensource.com/xen-unstable.hg?rev/46776e65e679

Ian
KURT.HACKEL@ORACLE.COM
2007-Dec-08 05:58 UTC
RE: [Xen-devel] Crash in hypercall_xlat_continuation with 64bit xen 3.1.1 when migrating a 32bit pv guest
> > I'm seeing a crash in 64bit xen when doing live migration of a 32bit pv
> > guest under load. The crash does not occur without the load, it
> > happens with both Red Hat/Oracle and Xensource kernels (el4u5 in this case),
> > and it takes about 3-5 migrations to trigger the bug.
>
> > Below you'll see that the 32bit guest ebx register, stored here in the
> > 64bit reg variable, is supposed to have the upper 32bits cleared, but
> > it has 0x00000080 in the upper 32bits.
>
> Have you tried pulling through this patch?
>
> http://xenbits.xensource.com/xen-unstable.hg?rev/46776e65e679
>
> Ian

Awesome. That resolved it. A few dozen migrations with no problems yet.
Thanks for the pointer Ian.

-kurt