Jon Swanson
2009-Jan-16 09:02 UTC
[Fedora-xen] f10 x86_64 xen VM guests fail to boot on f8 host
This is a cross post of the same subject on the Fedora Forums. If this is bad practice let me know and I'll never do it again.

Additional log info is available at
http://forums.fedoraforum.org/showthread.php?p=1149972&posted=1#post1149972

I have two machines running fresh installs of f8 with the xen. Kernel and all software versions are the same on both. Specifically:

[root@machineA boot]# uname -a
Linux machineA 2.6.21.7-5.fc8xen #1 SMP Thu Aug 7 12:44:22 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@machineA boot]# virsh version
Compiled against library: libvir 0.4.4
Using library: libvir 0.4.4
Using API: Xen 3.0.1
Running hypervisor: Xen 3.1.0

And:

[root@machineB ~]# uname -a
Linux machineB 2.6.21.7-5.fc8xen #1 SMP Thu Aug 7 12:44:22 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@machineB ~]# virsh version
Compiled against library: libvir 0.4.4
Using library: libvir 0.4.4
Using API: Xen 3.0.1
Running hypervisor: Xen 3.1.0

MachineA has two AMD Opteron 275s.
MachineB has four Intel(R) Xeon(TM) CPU 2.80GHz processors.

Both machines are as up to date as possible.

I can boot or create x86_64 f10 guests on MachineA with no trouble whatsoever.
MachineB will not boot/create x86_64 f10 guests.

The configuration files are created in the same manner, but as soon as Xen tries to unpause the newly created domain, it crashes pretty much instantly.

/var/log/xen/xend.log relevant output:

[2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices vtpm.
[2009-01-16 14:45:32 4120] INFO (XendDomain:1130) Domain f10testB (21) unpaused.
[2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1802) XendDomainInfo.destroy: domid=21
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1821) XendDomainInfo.destroyDomain(21)

I've also tried moving a functional guest from MachineA to MachineB to boot it there, with the same results. Guest will not boot on MachineB.

f8 64bit guests will boot on MachineB with no problems.
f10 32bit guests will boot on MachineB with no problems.

Only 64bit machines seem to be borked.

Any information / help / insight as to why this is happening would be very much appreciated. The machines are pretty similar, and since the guests are paravirtualized it does not really make sense for the processors to be the cause of the problem.

Thanks,
jon
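For anyone trying to spot the same failure in their own logs, here is a minimal Python sketch that pulls crashed-domain names and ids out of xend.log lines. It assumes the WARNING line always looks like the one in the excerpt above; the function name `find_crashes` and the regular expression are illustrative and not part of Xen.

```python
import re

# Pattern inferred from the single WARNING line in the xend.log excerpt.
# It is an assumption that other xend versions use the same wording.
CRASH_RE = re.compile(r"Domain has crashed: name=(?P<name>\S+) id=(?P<id>\d+)")


def find_crashes(lines):
    """Return a list of (domain_name, domain_id) for every crash line."""
    crashes = []
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            crashes.append((match.group("name"), int(match.group("id"))))
    return crashes


# Three lines copied from the excerpt in the post.
SAMPLE = """\
[2009-01-16 14:45:32 4120] INFO (XendDomain:1130) Domain f10testB (21) unpaused.
[2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.
[2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1802) XendDomainInfo.destroy: domid=21
""".splitlines()

if __name__ == "__main__":
    for name, domid in find_crashes(SAMPLE):
        print(name, domid)
```

In practice the lines would come from reading /var/log/xen/xend.log instead of the embedded sample.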
Mark McLoughlin
2009-Jan-16 10:33 UTC
Re: [Fedora-xen] f10 x86_64 xen VM guests fail to boot on f8 host
Hi Jon,

On Fri, 2009-01-16 at 18:02 +0900, Jon Swanson wrote:
> This is a cross post of the same subject on the Fedora Forums. If this
> is bad practice let me know and I'll never do it again.

Mailing lists can often be a better way to get help from developers, so posting here is no problem.

Also, fedora-virt@redhat.com might be a better place to post questions these days - it's not clear whether the fedora-xen list has a future.

> Additional log info is available at
> http://forums.fedoraforum.org/showthread.php?p=1149972&posted=1#post1149972
>
> I have two machines running fresh installs of f8 with the xen. Kernel
> and all software versions are the same on both.

You've seen this then, right?

  http://fedoraproject.org/wiki/Bugs/F10Common#Installing_Fedora_10_DomU_on_Fedora_8_Dom0_Fails

> /var/log/xen/xend.log relevant output:
>
> [2009-01-16 14:45:32 4120] DEBUG (DevController:150) Waiting for devices vtpm.
> [2009-01-16 14:45:32 4120] INFO (XendDomain:1130) Domain f10testB (21) unpaused.
> [2009-01-16 14:45:32 4120] WARNING (XendDomainInfo:1203) Domain has crashed: name=f10testB id=21.
> [2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1802) XendDomainInfo.destroy: domid=21
> [2009-01-16 14:45:32 4120] DEBUG (XendDomainInfo:1821) XendDomainInfo.destroyDomain(21)
>
> I've also tried moving a functional guest from MachineA to MachineB to
> boot it there, with the same results. Guest will not boot on MachineB.
>
> f8 64bit guests will boot on MachineB with no problems.
> f10 32bit guests will boot on MachineB with no problems.
>
> Only 64bit machines seem to be borked.

Okay, sounds like it might "just" be a F10 kernel bug.

Try doing this to get a stack trace:

  1) Set "on_crash=preserve" in your domain config

  2) Copy the guest kernel's System.map to the host

  3) Once the guest has crashed, run:

       /usr/lib/xen/bin/xenctx -s System.map <domid>

Cheers,
Mark.
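The reason for copying System.map to the host is that the `-s` option lets xenctx turn raw addresses into symbol+offset form. A minimal Python sketch of that kind of lookup follows. It is not xenctx's actual implementation. The two sample entries are back-computed from the symbol+offset pairs that appear in the traces posted later in this thread for the 2.6.27.5-117.fc10 kernel, and the type letters in the sample are made up.

```python
import bisect


def load_symbols(text):
    """Parse System.map-style lines ("address type name") into a sorted list."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _kind, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return "name+0xoffset" for the closest symbol at or below address."""
    addrs = [addr for addr, _name in symbols]
    idx = bisect.bisect_right(addrs, address) - 1
    if idx < 0:
        return None  # address lies below the first known symbol
    base, name = symbols[idx]
    return "%s+0x%x" % (name, address - base)


# Hypothetical two-line excerpt; a real System.map has many thousands of entries.
SAMPLE_MAP = """\
ffffffff8100b835 t set_page_prot
ffffffff815a3683 T xen_start_kernel
"""

if __name__ == "__main__":
    table = load_symbols(SAMPLE_MAP)
    print(resolve(table, 0xFFFFFFFF8100B8A2))
    print(resolve(table, 0xFFFFFFFF815A3C60))
```

The System.map must match the guest kernel exactly, since symbol addresses shift between kernel builds.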
Virtualization
2009-Jan-20 01:22 UTC
Re: [Fedora-xen] f10 x86_64 xen VM guests fail to boot on f8 host
Hi Mark,

I ran the adjusted commands on our system here with the Intel CPU (which crashes in the same way as Jon's machine):

[root@office64 boot]# /usr/lib64/xen/bin/xenctx -s System.map-2.6.27.9-159.fc10.x86_64 114
rip: ffffffff8100b8a2 set_page_prot+0x6d
rsp: ffffffff81575f08
rax: ffffffea   rbx: 000016e4   rcx: 00000055   rdx: 00000000
rsi: 800000010c2b3061   rdi: ffffffff816e4000   rbp: ffffffff81575f68
 r8: 0000000f    r9: ffffffff817ee350   r10: ffffffff817ee550   r11: 00000010
r12: ffffffff816e4000   r13: 800000010c2b3061   r14: 8000000000000161   r15: 00002c00
 cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
 0000000000010082 ffffffff81575f48 000000000000e02b ffffffff8100b89e
 0000000000000200 ffffffff816e7000 0000000000000800 0000000000000016
 ffffffff81575ff8 ffffffff815a5c60 0000000000002c00 0000000000000000

Code:
df 54 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e

Call Trace:
  [<ffffffff8100b8a2>] set_page_prot+0x6d <--
  [<ffffffff8100b8a2>] set_page_prot+0x6d
  [<ffffffff8100b89e>] set_page_prot+0x69
  [<ffffffff815a5c60>] xen_start_kernel+0x5dd

Cheers, hope this helps
Phill
Jon Swanson
2009-Jan-20 05:01 UTC
RE: [Fedora-xen] f10 x86_64 xen VM guests fail to boot on f8 host
Hi Mark, thank you very much for your help.

I took an f10host which boots on MachineA and copied it to MachineB, modified the config to include on_crash=preserve, and booted it with xm create.

------------------------------------------------------------------------
xenctx output:
/usr/lib64/xen/bin/xenctx -s System.map-2.6.27.5-117.fc10.x86_64 46
rip: ffffffff8100b8a2 set_page_prot+0x6d
rsp: ffffffff81573f08
rax: ffffffea   rbx: 000016e1   rcx: 00000055   rdx: 00000000
rsi: 800000014ffc6061   rdi: ffffffff816e1000   rbp: ffffffff81573f68
 r8: 0000000f    r9: ffffffff817eb450   r10: ffffffff817eb650   r11: 00000010
r12: ffffffff816e1000   r13: 800000014ffc6061   r14: 8000000000000161   r15: 00000016
 cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
 0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
 0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
 ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000

Code:
7b 4a 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e

Call Trace:
  [<ffffffff8100b8a2>] set_page_prot+0x6d <--
  [<ffffffff8100b8a2>] set_page_prot+0x6d
  [<ffffffff8100b89e>] set_page_prot+0x69
  [<ffffffff815a3c60>] xen_start_kernel+0x5dd

------------------------------------------------------------------------
Dmesg also has something which may make sense to someone wiser than myself. Specifically:
(XEN) traps.c:405:d46 Unhandled invalid opcode fault/trap [#6] in domain 46 on VCPU 0 [ec=0000]

xm dmesg
....
(XEN) ffffffff82035000 ffffffff82036000 ffffffff82037000 ffffffff82038000
(XEN) mm.c:1362:d46 Bad L1 flags 800000
(XEN) traps.c:405:d46 Unhandled invalid opcode fault/trap [#6] in domain 46 on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 46 (vcpu#0) crashed on cpu#2:
(XEN) ----[ Xen-3.1.4 x86_64 debug=n Not tainted ]----
(XEN) CPU: 2
(XEN) RIP: e033:[<ffffffff8100b8a2>]
(XEN) RFLAGS: 0000000000000282 CONTEXT: guest
(XEN) rax: 00000000ffffffea rbx: 00000000000016e1 rcx: 0000000000000055
(XEN) rdx: 0000000000000000 rsi: 800000014ffc6061 rdi: ffffffff816e1000
(XEN) rbp: ffffffff81573f68 rsp: ffffffff81573f08 r8: 000000000000000f
(XEN) r9: ffffffff817eb450 r10: ffffffff817eb650 r11: 0000000000000010
(XEN) r12: ffffffff816e1000 r13: 800000014ffc6061 r14: 8000000000000161
(XEN) r15: 0000000000000016 cr0: 000000008005003b cr4: 00000000000006f0
(XEN) cr3: 0000000144f18000 cr2: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=ffffffff81573f08:
(XEN) 0000000000000055 0000000000000010 ffffffff8100b8a2 000000010000e030
(XEN) 0000000000010082 ffffffff81573f48 000000000000e02b ffffffff8100b89e
(XEN) 0000000000000200 ffffffff816e4000 0000000000000800 0000000000002c00
(XEN) ffffffff81573ff8 ffffffff815a3c60 0000000000002c00 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff8208b000
(XEN) 0000000000010000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffffffff82008000
(XEN) ffffffff82009000 ffffffff8200a000 ffffffff8200b000 ffffffff8200c000
(XEN) ffffffff8200d000 ffffffff8200e000 ffffffff8200f000 ffffffff82010000
(XEN) ffffffff82011000 ffffffff82012000 ffffffff82013000 ffffffff82014000
(XEN) ffffffff82015000 ffffffff82016000 ffffffff82017000 ffffffff82018000
(XEN) ffffffff82019000 ffffffff8201a000 ffffffff8201b000 ffffffff8201c000
(XEN) ffffffff8201d000 ffffffff8201e000 ffffffff8201f000 ffffffff82020000
(XEN) ffffffff82021000 ffffffff82022000 ffffffff82023000 ffffffff82024000
(XEN) ffffffff82025000 ffffffff82026000 ffffffff82027000 ffffffff82028000
(XEN) ffffffff82029000 ffffffff8202a000 ffffffff8202b000 ffffffff8202c000
(XEN) ffffffff8202d000 ffffffff8202e000 ffffffff8202f000 ffffffff82030000
(XEN) ffffffff82031000 ffffffff82032000 ffffffff82033000 ffffffff82034000
(XEN) ffffffff82035000 ffffffff82036000 ffffffff82037000 ffffffff82038000
...
------------------------------------------------------------------------
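One reading aid for the register dump: kernel code conventionally returns negative errno values, so a value such as rax: ffffffea can be decoded as a signed 32-bit number. The Python sketch below does only that arithmetic. That rax actually holds a return code at the point of the crash is an assumption; nothing in the thread confirms it. The helper names are illustrative.

```python
import errno


def as_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def describe(value):
    """Return (signed_value, errno_name_or_None) for a register value."""
    signed = as_signed32(value)
    # errno.errorcode maps positive errno numbers to their symbolic names.
    name = errno.errorcode.get(-signed) if signed < 0 else None
    return signed, name


if __name__ == "__main__":
    # rax as printed by xenctx in the dump.
    print(describe(0xFFFFFFEA))
```

If the assumption holds, the value would correspond to an invalid-argument error from whatever call preceded the crash, but that is a guess to be checked against the kernel source, not a diagnosis.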
Virtualization
2009-Jan-20 06:18 UTC
RE: [Fedora-xen] f10 x86_64 xen VM guests fail to boot on f8 host
Hi list,

From the Intel® Virtualization Technology Specification for the IA-32 Intel® Architecture (2005):

"2.9.2 Information for VM Exits Due to Vectored Events
Event-specific information is provided for VM exits due to the following vectored events: exceptions (including those generated by the instructions INT3, INTO, BOUND, and UD2); external interrupts that occur while the “acknowledge interrupt on exit” VM-exit control is 1; and non-maskable interrupts (NMIs). This information is provided in the following fields:" ....

The <0f> 0b in the "Code:" section is the UD2 instruction.

Checking through the opcode map for the Xeon processor, this is an invalid opcode. In VT processors the software guide indicates that a program can communicate various events and state information to the underlying virtualization supervisor by executing a UD2 (and some other ops like it).

I think that in a non-VT CPU it's actually a "real" invalid opcode. The stuff (hardware) which flips over to the supervisor with all the needed info from the virtual machine isn't there.

KVM uses this, from the patches I've seen Googling around for UD2 (if I understand correctly).

So why a UD2 in the code? It's highly unlikely that it's just some random bytes that happen to be a UD2. Possibly the kernel thinks it's in fully virt mode at some point? The image notes do seem to indicate this.

Cheers
Phill.
Jon Swanson
2009-Jan-20 06:34 UTC
RE: [Fedora-xen] f10 x86_64 xen VM guests fail to boot on f8 host
Hi Phill,

Most of what you said went right over my head. It sounds like processor related information may be useful though:

MachineA works, MachineB does not.

[root@MachineA f10copy]# egrep 'cpu fam|model|flags' /proc/cpuinfo
cpu family : 15
model : 33
model name : Dual Core AMD Opteron(tm) Processor 275
flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy
...

[root@MachineB ~]# egrep 'cpu fam|model|flags' /proc/cpuinfo
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 2.80GHz
flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl est cid cx16 xtpr lahf_lm
...

I was unable to simulate the crash on MachineA, but I created the same vm with the following command:

xm create -p -c f10copy.config

And ran xenctx against the freshly created domain:

/usr/lib64/xen/bin/xenctx -s /tmp/System.map-2.6.27.5-117.fc10.x86_64 39
rip: ffffffff810093aa _stext+0x3aa
rsp: ffffffff81573ea0
rax: 00000000   rbx: ffffffff81572000   rcx: ffffffff810093aa   rdx: 00000000
rsi: 00000000   rdi: 00000001   rbp: ffffffff81573eb8
 r8: 00000000    r9: ffff8800020b2348   r10: 00000001   r11: 00000246
r12: 6db6db6db6db6db7   r13: ffffffff815d2660   r14: ffffffff815d4cc0   r15: 00000016
 cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 ffffffff81766a78 0000000000000003 ffffffff8100a709 ffffffff81573ed8
 ffffffff8100ba06 ffffffff81573ed8 ffffffff816d60e8 ffffffff81573ef8
 ffffffff8100f279 ffffffff815d4cc0 0000000000000000 ffffffff81573f08
 ffffffff8131ed7d ffffffff81573f48 ffffffff8159dd46 ffffffff81573f48

Code:
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc

Call Trace:
  [<ffffffff810093aa>] _stext+0x3aa <--
  [<ffffffff8100a709>] xen_safe_halt+0x10
  [<ffffffff8100ba06>] xen_idle+0x55
  [<ffffffff8100f279>] cpu_idle+0xb2
  [<ffffffff8131ed7d>] rest_init+0x61
  [<ffffffff8159dd46>] start_kernel+0x39f
  [<ffffffff8159d2ba>] x86_64_start_reservations+0xa5
  [<ffffffff815a3e64>] xen_start_kernel+0x7e1
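Since the two /proc/cpuinfo flag lines are long, a set difference makes them easier to compare than reading them side by side. The Python sketch below uses the flag strings exactly as posted for MachineA and MachineB. Both lists are followed by "..." in the post, so this only compares what was quoted, and the function name `flag_diff` is illustrative.

```python
# Flag lines copied from the /proc/cpuinfo output posted for each machine.
MACHINE_A = (
    "fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush mmx fxsr "
    "sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm "
    "cmp_legacy"
)
MACHINE_B = (
    "fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi "
    "mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl "
    "est cid cx16 xtpr lahf_lm"
)


def flag_diff(flags_a, flags_b):
    """Return (only_in_a, only_in_b) as sorted lists of flag names."""
    set_a, set_b = set(flags_a.split()), set(flags_b.split())
    return sorted(set_a - set_b), sorted(set_b - set_a)


if __name__ == "__main__":
    only_a, only_b = flag_diff(MACHINE_A, MACHINE_B)
    print("only on MachineA:", " ".join(only_a))
    print("only on MachineB:", " ".join(only_b))
```

Which of the differing flags, if any, matters to the F10 guest kernel is not something the comparison itself can answer.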
Avi Kivity
2009-Jan-21 07:30 UTC
Re: [Fedora-xen] f10 x86_64 xen VM guests fail to boot on f8 host
Virtualization wrote:
> Hi list,
>
> From the Intel® Virtualization Technology Specification
> for the IA-32 Intel® Architecture (2005):
>
> "2.9.2 Information for VM Exits Due to Vectored Events
> Event-specific information is provided for VM exits due to the following vectored events:
> exceptions (including those generated by the instructions INT3, INTO, BOUND, and UD2);
> external interrupts that occur while the “acknowledge interrupt on exit” VM-exit control is 1;
> and non-maskable interrupts (NMIs). This information is provided in the following fields:" ....
>
> The <0f> 0b in the "Code:" section are the UD2 instruction.
>
> Checking through the OpCode map for the Xeon processor, this is an
> invalid op code. In VT processors the software guide indicates that a
> program can communicate various events and state information to the
> underlying virtualization supervisor by executing a UD2 (and some others
> ops like it).
>
> I think that in a non-VT cpu it's actually a "real" invalid op code. The
> stuff (hardware) which flips over to the supervisor with all the needed
> info from the virtual machine isn't there.
>
> KVM uses this, from the patches I've seen Googling around for UD2 (if I
> understand correctly).
>
> So why a UD2 in the code? It's highly unlikely that it's just some
> random bytes that happen to be a UD2. Possibly the kernel thinks it's in
> fully virt mode at some point? The image notes do seem to indicate this.
>

The guest kernel uses UD2 as part of the implementation of an assertion. In this case, UD2 is used by the guest to communicate with itself. Every BUG: report you see will have the code containing 0f 0b.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.
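The 0f 0b observation can be checked mechanically against the "Code:" lines posted earlier in the thread. The Python sketch below assumes, as in kernel oops output, that the byte wrapped in <..> is the one at rip; the helper names are illustrative. It only tells you the guest stopped on a UD2, which is consistent with a BUG-style assertion but does not say which assertion fired.

```python
def marked_bytes(code_line, count=2):
    """Return `count` bytes starting at the byte wrapped in <..>."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            picked = [tok[1:-1]] + tokens[i + 1:i + count]
            return bytes(int(b, 16) for b in picked)
    return b""  # no marked byte found


def stopped_at_ud2(code_line):
    """True if the marked instruction bytes are 0f 0b (UD2)."""
    return marked_bytes(code_line) == b"\x0f\x0b"


# "Code:" lines copied from the xenctx output in this thread.
CRASHED_ON_B = ("7b 4a 1d 00 4c 89 e7 4c 89 ee 31 d2 e8 22 d9 ff ff 85 c0 "
                "74 04 <0f> 0b eb fe 5b 41 5c 41 5d 41 5e")
IDLE_ON_A = ("cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 "
             "0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc")

if __name__ == "__main__":
    print("MachineB guest at UD2:", stopped_at_ud2(CRASHED_ON_B))
    print("MachineA guest at UD2:", stopped_at_ud2(IDLE_ON_A))
```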