That's quite a big CR3 value. How much memory does this guest have?

 -- Keir

On 21/9/06 10:56 pm, "Karl Rister" <kmr@us.ibm.com> wrote:
> (XEN) Invalid CR3 value=10f780000domain_crash_sync called from vmx.c:1679
> (XEN) Domain 5 (vcpu#1) crashed on cpu#4:
> [...]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
While running dbench inside an hvm domain I received the following on the
console:

(XEN) Invalid CR3 value=10f780000domain_crash_sync called from vmx.c:1679
(XEN) Domain 5 (vcpu#1) crashed on cpu#4:
(XEN) ----[ Xen-3.0-unstable x86_64 debug=n Not tainted ]----
(XEN) CPU: 4
(XEN) RIP: 0010:[<ffffffff8017680c>]
(XEN) RFLAGS: 0000000000000293 CONTEXT: hvm
(XEN) rax: 000000010f780000 rbx: 0000000000000001 rcx: 0000000000000000
(XEN) rdx: ffff81010f780000 rsi: 0000000000000000 rdi: ffff81010fc6db5c
(XEN) rbp: ffffffff803f3000 rsp: ffff81010fc6fb48 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: ffff81010fb39a80 r13: 0000000000000000 r14: ffff81010fc6d510
(XEN) r15: ffff81010fc66ac0 cr0: 000000008005003b cr4: 00000000000006e0
(XEN) cr3: 000000015f4c1000 cr2: 0000000000000000
(XEN) ds: 0018 es: 0018 fs: 0000 gs: 0000 ss: 0018 cs: 0010

The domain was running with 4 VCPUs and had previously completed the test on
a single VCPU and 2 VCPU configurations. The domain was running a baremetal
2.6.16.29 kernel.

Output from 'xm info' is:

host               : x460c
release            : 2.6.16.29-xen0-up
version            : #1 Tue Sep 19 16:20:47 CDT 2006
machine            : x86_64
nr_cpus            : 16
nr_nodes           : 1
sockets_per_node   : 4
cores_per_socket   : 2
threads_per_core   : 2
cpu_mhz            : 3002
hw_caps            : bfebfbff:20100800:00000000:00000180:000065bd:00000000:00000001
total_memory       : 14335
free_memory        : 13633
xen_major          : 3
xen_minor          : 0
xen_extra          : -unstable
xen_caps           : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_pagesize       : 4096
platform_params    : virt_start=0xffff800000000000
xen_changeset      : Tue Sep 19 08:26:47 2006 +0100 11536:041be3f6b38e
cc_compiler        : gcc version 4.1.0 (SUSE Linux)
cc_compile_by      : root
cc_compile_domain  : ltc.austin.ibm.com
cc_compile_date    : Tue Sep 19 16:15:52 CDT 2006
xend_config_format : 2

-- 
Karl Rister
IBM Linux Performance Team
kmr@us.ibm.com
On Thursday 21 September 2006 4:55 pm, Keir Fraser wrote:
> That's quite a big CR3 value. How much memory does this guest have?

The guest was configured with 2GB of memory. In these particular tests we
are scaling up the VCPU count, and as we do that we add memory as well. The
1way test used 512MB and the 2way used 1GB.

Once I was able to recover my environment from this crash I discovered that
the problem is not occurring during the test run but during launch of the
domain. I find this curious because before I launched the tests I
successfully booted and then cleanly shut down the hvm domain with 1, 2, 4,
8, and 16 VCPUs. However, I did not scale the memory in that small
functional test.

I am going to investigate and see whether I can establish more clearly that
the amount of memory is the issue, and what the breaking point is.

Karl

-- 
Karl Rister
IBM Linux Performance Team
kmr@us.ibm.com
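For readers following the numbers: the short Python sketch below converts the rejected CR3 value from the crash dump into MiB and compares it with the 2GB the guest was configured with. It assumes the CR3 value can be read as a guest-physical byte address and that guest RAM is one flat range starting at 0; both are simplifications for illustration, and the helper names are made up here rather than taken from Xen.

```python
# Sketch: compare the rejected CR3 value with the guest's configured memory.
# Assumptions (not from the thread): CR3 is treated as a guest-physical byte
# address, and guest RAM is one contiguous range starting at address 0.

MIB = 1 << 20


def addr_to_mib(addr: int) -> float:
    """Convert a byte address to MiB."""
    return addr / MIB


def beyond_guest_ram(addr: int, guest_mem_mib: int) -> bool:
    """True if addr is at or above the end of a flat guest_mem_mib-sized range."""
    return addr >= guest_mem_mib * MIB


cr3 = 0x10F780000      # from "Invalid CR3 value=10f780000"
guest_mem_mib = 2048   # "The guest was configured with 2GB of memory"

print(f"CR3 points at {addr_to_mib(cr3):.1f} MiB")
print("beyond configured guest RAM:", beyond_guest_ram(cr3, guest_mem_mib))
```

Under those assumptions the value sits at 4343.5 MiB, above the 4 GiB mark and well past 2048 MiB, which is consistent with Keir's remark that it is a big value for this guest. It does not by itself say why the guest loaded such a value.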
After doing quite a bit of testing I came upon something repeatable. When
using 4way with 2GB sometimes it would work and sometimes not. With 8way 4GB
it was much more consistent and I was able to narrow it down to a single
point. With 3840 MB I can boot without problems, if I increase the memory to
3841 it will not boot.

Something interesting is that very close (< 6MB over) I actually get
slightly different progress before crashing as opposed to always in the same
place. If I am way over (say I put in 4096) it always crashes after the
"Freeing unused kernel memory: 200k freed" message. If I am closer (like
3841) I can actually get to the point of seeing a few init scripts run
before it crashes. If I am at something like 3844 it tends to crash right
after udev which immediately follows the "Freeing..." message.

Karl

On Thursday 21 September 2006 4:55 pm, Keir Fraser wrote:
> That's quite a big CR3 value. How much memory does this guest have?

-- 
Karl Rister
IBM Linux Performance Team
kmr@us.ibm.com
(512) 838-1553 (t/l 678)
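The 3840 MB threshold reported above is easier to place when written as an address. The sketch below is plain arithmetic: it converts each memory size mentioned in the message to the byte address where a flat RAM range of that size would end, and shows how far that is from 4 GiB. Treating guest RAM as a single flat range is an assumption made here for illustration; whether the 256 MiB window below 4 GiB is actually what matters is not established in the thread.

```python
# Sketch: where would a flat RAM range of a given size end, relative to 4 GiB?
# Assumption (not from the thread): guest RAM is one contiguous range from 0.

MIB = 1 << 20
GIB = 1 << 30


def end_of_ram(mem_mib: int) -> int:
    """Byte address just past the end of a flat mem_mib-sized RAM range."""
    return mem_mib * MIB


def gap_below_4g_mib(mem_mib: int) -> int:
    """MiB left between the end of RAM and 4 GiB (negative if RAM goes past)."""
    return (4 * GIB - end_of_ram(mem_mib)) // MIB


# Sizes mentioned in the message: last working value, then failing ones.
for mem_mib in (3840, 3841, 3844, 4096):
    print(f"{mem_mib} MB -> end of RAM at {end_of_ram(mem_mib):#x}, "
          f"{gap_below_4g_mib(mem_mib)} MiB below 4 GiB")
```

The last working size, 3840 MB, ends exactly at 0xf0000000, which leaves 256 MiB below 4 GiB. That boundary also appears as the top of the low E820 region in the debug output posted later in the thread.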
> After doing quite a bit of testing I came upon something repeatable. When
> using 4way with 2GB sometimes it would work and sometimes not. With 8way
> 4GB it was much more consistent and I was able to narrow it down to a
> single point. With 3840 MB I can boot without problems, if I increase the
> memory to 3841 it will not boot. Something interesting is that very close
> (< 6MB over) I actually get slightly different progress before crashing as
> opposed to always in the same place. If I am way over (say I put in 4096)
> it always crashes after the "Freeing unused kernel memory: 200k freed"
> message. If I am closer (like 3841) I can actually get to the point of
> seeing a few init scripts run before it crashes. If I am at something like
> 3844 it tends to crash right after udev which immediately follows the
> "Freeing..." message.

Just to confirm, this is with a recent (e.g. 24h) xen-unstable (or
3.0.3-testing.hg), and the guest is an x86_64 linux 2.6.16?

Please can you try using a debug=y build of Xen to see if we get any extra
output.

Thanks,
Ian
> Just to confirm, this is with a recent (e.g. 24h) xen-unstable (or
> 3.0.3-testing.hg), and the guest is an x86_64 linux 2.6.16 ?
>
> Please can you try using a debug=y build of Xen to see if we get any
> extra output.
>
> Thanks,
> Ian

I originally encountered this when testing with xen-unstable changeset
11536:041be3f6b38e from 9/19. Both the dom0 and the guest are running
2.6.16.29. I originally encountered it when running with SMP hvm domains,
but the problem also occurs when giving the domain just a single VCPU.

I rebuilt on a changeset from today (11620:ef41783c664a) with debug=y and
now my domain crashes immediately after grub loads. It does this on both an
SMP and a UP setup and seems independent of the amount of memory allocated
to the domain. This is what gets dumped to the console:

x460c login: (XEN) sh_update_paging_modes: postponing determination of shadow mode
(XEN) (file=hvm.c, line=195) Allocated port 3 for hvm.
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5582 (pseudophys a0): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5581 (pseudophys a1): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5580 (pseudophys a2): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e557f (pseudophys a3): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e557e (pseudophys a4): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e557d (pseudophys a5): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e557c (pseudophys a6): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e557b (pseudophys a7): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e557a (pseudophys a8): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5579 (pseudophys a9): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5578 (pseudophys aa): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5577 (pseudophys ab): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5576 (pseudophys ac): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5575 (pseudophys ad): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5574 (pseudophys ae): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5573 (pseudophys af): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5572 (pseudophys b0): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5571 (pseudophys b1): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5570 (pseudophys b2): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e556f (pseudophys b3): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e556e (pseudophys b4): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e556d (pseudophys b5): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e556c (pseudophys b6): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e556b (pseudophys b7): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e556a (pseudophys b8): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5569 (pseudophys b9): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5568 (pseudophys ba): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5567 (pseudophys bb): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5566 (pseudophys bc): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5565 (pseudophys bd): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5564 (pseudophys be): count=2 type=0
(XEN) (file=memory.c, line=180) Dom2 freeing in-use page e5563 (pseudophys bf): count=2 type=0
(XEN) vmx_do_launch(): GUEST_CR3<=002ad7a0, HOST_CR3<=1ef36b000
(XEN) (GUEST: 2) HVM Loader
(XEN) (GUEST: 2) Detected Xen v3.0-unstable
(XEN) (GUEST: 2) Loading ROMBIOS ...
(XEN) (GUEST: 2) Loading Cirrus VGABIOS ...
(XEN) (GUEST: 2) Writing SMBIOS tables ...
(XEN) (GUEST: 2) Loading VMXAssist ...
(XEN) (GUEST: 2) VMX go ...
(XEN) (GUEST: 2) VMXAssist (Sep 26 2006)
(XEN) (GUEST: 2) Memory size 3840 MB
(XEN) (GUEST: 2) E820 map:
(XEN) (GUEST: 2) 0000000000000000 - 000000000009F000 (RAM)
(XEN) (GUEST: 2) 000000000009F000 - 00000000000A0000 (Reserved)
(XEN) (GUEST: 2) 00000000000A0000 - 00000000000C0000 (Type 16)
(XEN) (GUEST: 2) 00000000000F0000 - 0000000000100000 (Reserved)
(XEN) (GUEST: 2) 0000000000100000 - 00000000EFFF0000 (RAM)
(XEN) (GUEST: 2) 00000000EFFF0000 - 00000000EFFFA000 (ACPI Data)
(XEN) (GUEST: 2) 00000000EFFFA000 - 00000000EFFFD000 (ACPI NVS)
(XEN) (GUEST: 2) 00000000EFFFD000 - 00000000EFFFE000 (Type 19)
(XEN) (GUEST: 2) 00000000EFFFE000 - 00000000EFFFF000 (Type 18)
(XEN) (GUEST: 2) 00000000EFFFF000 - 00000000F0000000 (Type 17)
(XEN) (GUEST: 2) 00000000FEC00000 - 0000000100000000 (Type 16)
(XEN) (GUEST: 2)
(XEN) (GUEST: 2) Start BIOS ...
(XEN) (GUEST: 2) Starting emulated 16-bit real-mode: ip=F000:FFF0
(XEN) (GUEST: 2) rombios.c,v 1.138 2005/05/07 15:55:26 vruppert Exp $
(XEN) (GUEST: 2) Remapping master: ICW2 0x8 -> 0x20
(XEN) (GUEST: 2) Remapping slave: ICW2 0x70 -> 0x28
(XEN) (GUEST: 2) VGABios $Id: vgabios.c,v 1.61 2005/05/24 16:50:50 vruppert Exp $
(XEN) (GUEST: 2) HVMAssist BIOS, 1 cpu, $Revision: 1.138 $ $Date: 2005/05/07 15:55:26 $
(XEN) (GUEST: 2)
(XEN) (GUEST: 2) ata0-0: PCHS=4161/16/63 translation=lba LCHS=520/128/63
(XEN) (GUEST: 2) ata0 master: QEMU HARDDISK ATA-7 Hard-Disk (2048 MBytes)
(XEN) (GUEST: 2) ata0-1: PCHS=16383/16/63 translation=lba LCHS=1024/255/63
(XEN) (GUEST: 2) ata0 slave: QEMU HARDDISK ATA-7 Hard-Disk (9773 MBytes)
(XEN) (GUEST: 2)
(XEN) (GUEST: 2) Booting from Hard Disk...
(XEN) (GUEST: 2) int13_harddisk: function 41, unmapped device for ELDL=82
(XEN) (GUEST: 2) int13_harddisk: function 08, unmapped device for ELDL=82
(XEN) (GUEST: 2) *** int 15h function AX=00C0, BX=0000 not yet supported!
(XEN) (GUEST: 2) *** int 15h function AX=EC00, BX=0002 not yet supported!
(XEN) (GUEST: 2) KBD: unsupported int 16h function 03
(XEN) trying to set reserved bit in EFER
(XEN) domain_crash_sync called from vmx.c:2268
(XEN) Domain 2 (vcpu#0) crashed on cpu#2:
(XEN) ----[ Xen-3.0-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 2
(XEN) RIP: 0010:[<000000000010004f>]
(XEN) RFLAGS: 0000000000010046 CONTEXT: hvm
(XEN) rax: 00000000004dc100 rbx: 0000000000000000 rcx: 00000000c0000080
(XEN) rdx: 0000000020100800 rsi: 0000000000090000 rdi: 00000000004ea088
(XEN) rbp: 000000000008e000 rsp: 00000000001010c0 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000050031 cr4: 0000000000000020
(XEN) cr3: 00000000002ad7a0 cr2: 0000000000000000
(XEN) ds: 0018 es: 0018 fs: 0018 gs: 0018 ss: 0018 cs: 0010

-- 
Karl Rister
IBM Linux Performance Team
kmr@us.ibm.com
(512) 838-1553 (t/l 678)
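The E820 map in the console output above can be summarised mechanically. The sketch below parses the map lines exactly as they were printed for the 3840 MB guest, totals the regions marked RAM, and reports where the highest RAM region ends. The parsing helpers (`parse_e820`, `ram_bytes`, `top_of_ram`) are names invented for this illustration; they are not part of Xen or hvmloader, and the thread does not show what the map looks like for larger memory sizes.

```python
import re

# E820 entries copied from the console output above ("Memory size 3840 MB").
E820_TEXT = """\
0000000000000000 - 000000000009F000 (RAM)
000000000009F000 - 00000000000A0000 (Reserved)
00000000000A0000 - 00000000000C0000 (Type 16)
00000000000F0000 - 0000000000100000 (Reserved)
0000000000100000 - 00000000EFFF0000 (RAM)
00000000EFFF0000 - 00000000EFFFA000 (ACPI Data)
00000000EFFFA000 - 00000000EFFFD000 (ACPI NVS)
00000000EFFFD000 - 00000000EFFFE000 (Type 19)
00000000EFFFE000 - 00000000EFFFF000 (Type 18)
00000000EFFFF000 - 00000000F0000000 (Type 17)
00000000FEC00000 - 0000000100000000 (Type 16)
"""

# Hypothetical helper pattern: "<start> - <end> (<type>)", 16 hex digits each.
ENTRY = re.compile(r"([0-9A-Fa-f]{16}) - ([0-9A-Fa-f]{16}) \((.+)\)")


def parse_e820(text):
    """Return a list of (start, end, type) tuples from the printed map."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.search(line)
        if match:
            start, end, kind = match.groups()
            entries.append((int(start, 16), int(end, 16), kind))
    return entries


def ram_bytes(entries):
    """Total size of all regions marked RAM."""
    return sum(end - start for start, end, kind in entries if kind == "RAM")


def top_of_ram(entries):
    """End address of the highest region marked RAM."""
    return max(end for start, end, kind in entries if kind == "RAM")


entries = parse_e820(E820_TEXT)
print(f"RAM total : {ram_bytes(entries) / (1 << 20):.2f} MiB")
print(f"top of RAM: {top_of_ram(entries):#x}")
print(f"map ends  : {max(end for _, end, _ in entries):#x}")
```

For this map, RAM tops out at 0xefff0000, the ACPI and special regions fill the remainder up to 0xf0000000, and nothing is described above 4 GiB (the map ends at 0x100000000).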
> (XEN) trying to set reserved bit in EFER
> (XEN) domain_crash_sync called from vmx.c:2268
> (XEN) Domain 2 (vcpu#0) crashed on cpu#2:
> (XEN) ----[ Xen-3.0-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 2
> (XEN) RIP: 0010:[<000000000010004f>]
> (XEN) RFLAGS: 0000000000010046 CONTEXT: hvm
> (XEN) rax: 00000000004dc100 rbx: 0000000000000000 rcx: 00000000c0000080
> (XEN) rdx: 0000000020100800 rsi: 0000000000090000 rdi: 00000000004ea088
> (XEN) rbp: 000000000008e000 rsp: 00000000001010c0 r8: 0000000000000000
> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
> (XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
> (XEN) r15: 0000000000000000 cr0: 0000000000050031 cr4: 0000000000000020
> (XEN) cr3: 00000000002ad7a0 cr2: 0000000000000000
> (XEN) ds: 0018 es: 0018 fs: 0018 gs: 0018 ss: 0018 cs: 0010

This is pretty interesting. Can you make your guest config file available,
and to save us the effort, the guest kernel too.

Thanks,
Ian
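One detail in the quoted dump can be checked directly: rcx holds 0xc0000080, which is the MSR index of IA32_EFER, matching the "trying to set reserved bit in EFER" message. The sketch below decodes the rax value against the EFER bits defined for Intel 64 (SCE, LME, LMA, NXE). It assumes the faulting instruction was a WRMSR and that rax holds the low 32 bits of the value being written; the dump itself does not confirm either point, so treat the result as a reading aid rather than a diagnosis. `decode_efer` is a name made up for this example.

```python
# Sketch: decode a candidate EFER value from the crash dump registers.
# Assumptions (not confirmed by the thread): the guest was executing WRMSR,
# and rax carries the low 32 bits of the value written to the MSR in rcx.

EFER_MSR = 0xC0000080  # IA32_EFER MSR index

# Bits defined for IA32_EFER on Intel 64; everything else is treated as
# reserved in this sketch.
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE"}


def decode_efer(value: int):
    """Return (names of defined bits that are set, mask of other bits set)."""
    named = [name for bit, name in sorted(EFER_BITS.items())
             if (value >> bit) & 1]
    defined_mask = sum(1 << bit for bit in EFER_BITS)
    reserved = value & ~defined_mask
    return named, reserved


rcx = 0x00000000C0000080   # from the dump
rax = 0x00000000004DC100   # from the dump

named, reserved = decode_efer(rax & 0xFFFFFFFF)
print("rcx is the EFER index:", rcx == EFER_MSR)
print("defined bits set     :", named)
print(f"other bits set       : {reserved:#x}")
```

Read this way, rax sets LME plus several bits outside the defined set (0x4dc000), which would be consistent with the hypervisor's complaint. Where those extra bits come from is exactly the open question in the thread.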
On Tuesday 26 September 2006 3:48 pm, Ian Pratt wrote:
> This is pretty interesting. Can you make your guest config file
> available, and to save us the effort, the guest kernel too.

Here are the config files. Legal stuff prevents me from sending the binary
kernel...

-- 
Karl Rister
IBM Linux Performance Team
kmr@us.ibm.com
(512) 838-1553 (t/l 678)