Tilman Schmidt
2012-Sep-11 19:06 UTC
[CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380
I run VMware vSphere 4 Essentials with three almost identically configured ESXi 4.1 hosts and a mix of 32 and 64 bit guests including Windows 2003 and 2008 as well as CentOS 5 and 6. Recently I updated one of the hosts to build 800380. The new build runs Windows and CentOS 5 VMs fine, but CentOS 6 guests won't come up.

I tried two different CentOS 6 VMs. Both have the latest standard kernel (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other VMware hosts still running ESXi 4.1.0 build 702113. On build 800380, both display the GRUB menu alright but freeze immediately afterwards, emitting the message

PANIC: early exception 0d rip 10:ffffffff81038879 error 0 cr2 0

on the bottom of the virtual console. Both run perfectly fine again once I move them back to the host with the older ESXi build.

From one of the failed boot attempts, I captured a VMware debug log which shows:

Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero): rip=0xffffffff810388db count=1
Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero): rip=0xffffffff810388db count=2
Sep 11 17:21:19.629: vcpu-0| X86Fault_Warning: vmcore/vmm64/cpu/interp.c:427: cs:eip=0x10:0xffffffff81038879 fault=13
Sep 11 17:21:19.632: vcpu-0| Vix: [1125838 vmxCommands.c:9609]: VMAutomation_HandleCLIHLTEvent. Do nothing.
Sep 11 17:21:19.632: vcpu-0| MsgHint: msg.monitorevent.halt (sent)
Sep 11 17:21:19.632: vcpu-0| The CPU has been disabled by the guest operating system. Power off or reset the virtual machine.

Ideas?

aTdHvAaNnKcSe,
Tilman
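A note for readers decoding the console message: the kernel prints the fields of an early exception in hexadecimal, so "0d" is vector 13, the x86 general protection fault (#GP), which matches the "fault=13" in the VMware log. The following is only a minimal sketch of how the line can be picked apart. The function name `decode_early_panic` and the small vector table are illustrative assumptions, not part of any kernel or VMware tool.

```python
import re

# A few architecturally defined x86 exception vectors (not a complete list).
X86_VECTORS = {
    0x00: "#DE divide error",
    0x06: "#UD invalid opcode",
    0x08: "#DF double fault",
    0x0D: "#GP general protection fault",
    0x0E: "#PF page fault",
}

# Matches the message as it appears on the virtual console; all fields are hex.
PANIC_RE = re.compile(
    r"PANIC: early exception (?P<vec>[0-9a-f]+) "
    r"rip (?P<cs>[0-9a-f]+):(?P<rip>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) cr2 (?P<cr2>[0-9a-f]+)"
)


def decode_early_panic(line):
    """Hypothetical helper: split an early-exception line into numeric fields.

    Returns None if the line does not contain such a message.
    """
    m = PANIC_RE.search(line)
    if m is None:
        return None
    vec = int(m.group("vec"), 16)
    return {
        "vector": vec,
        "name": X86_VECTORS.get(vec, "unknown"),
        "cs": int(m.group("cs"), 16),
        "rip": int(m.group("rip"), 16),
        "error_code": int(m.group("err"), 16),
        "cr2": int(m.group("cr2"), 16),
    }


if __name__ == "__main__":
    msg = "PANIC: early exception 0d rip 10:ffffffff81038879 error 0 cr2 0"
    info = decode_early_panic(msg)
    print(info["vector"], info["name"], hex(info["rip"]))
```

The decoded rip is the same address that the VMware log reports as cs:eip for the fault, so both messages describe the same event.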
John R Pierce
2012-Sep-11 19:14 UTC
[CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380
On 09/11/12 12:06 PM, Tilman Schmidt wrote:
> I run VMware vSphere 4 Essentials with three almost identically
> configured ESXi 4.1 hosts and a mix of 32 and 64 bit guests including
> Windows 2003 and 2008 as well as CentOS 5 and 6. Recently I updated one
> of the hosts to build 800380. The new build runs Windows and CentOS 5
> VMs fine, but CentOS 6 guests won't come up.
>
> I tried two different CentOS 6 VMs. Both have the latest standard kernel
> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
> both display the GRUB menu alright but freeze immediately afterwards,
> emitting the message
>
> PANIC: early exception 0d rip 10:ffffffff81038879 error 0 cr2 0
>
> on the bottom of the virtual console. Both run perfectly fine again once
> I move them back to the host with the older ESXi build.
>
> From one of the failed boot attempts, I captured a VMware debug log
> which shows:
>
> Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero):
> rip=0xffffffff810388db count=1
> Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero):
> rip=0xffffffff810388db count=2
> Sep 11 17:21:19.629: vcpu-0| X86Fault_Warning:
> vmcore/vmm64/cpu/interp.c:427: cs:eip=0x10:0xffffffff81038879 fault=13
> Sep 11 17:21:19.632: vcpu-0| Vix: [1125838 vmxCommands.c:9609]:
> VMAutomation_HandleCLIHLTEvent. Do nothing.
> Sep 11 17:21:19.632: vcpu-0| MsgHint: msg.monitorevent.halt (sent)
> Sep 11 17:21:19.632: vcpu-0| The CPU has been disabled by the guest
> operating system. Power off or reset the virtual machine.
>
> Ideas?

From here, it appears to be a hardware or VMware issue. NOTHING the guest OS does should crash the hypervisor. I'd file a bug report with VMware.

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast
On 2012-09-11 21:06, Tilman Schmidt wrote:
> I run VMware vSphere 4 Essentials with three almost identically
> configured ESXi 4.1 hosts and a mix of 32 and 64 bit guests including
> Windows 2003 and 2008 as well as CentOS 5 and 6. Recently I updated one
> of the hosts to build 800380. The new build runs Windows and CentOS 5
> VMs fine, but CentOS 6 guests won't come up.
>
> I tried two different CentOS 6 VMs. Both have the latest standard kernel
> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
> both display the GRUB menu alright but freeze immediately afterwards,
> emitting the message

I've found what is probably your post on VMware Communities.
http://communities.vmware.com/message/2112173?tstart=0

It seems there's a second 4.1 Update 3 build (811144):
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2020362

It fixes another panic, so trying this build may help.

--
PR722061: When a Linux kernel crashes, the linux kexec feature is used to enable booting into a special kdump kernel and gathering crash dump files. An SMP Linux guest configured with kexec might cause the virtual machine to fail with a monitor panic during this reboot. Error messages such as the following might be logged:

vcpu-0| CPU reset: soft (mode 2)
vcpu-0| MONITOR PANIC: vcpu-0:VMM fault 14: src=MONITOR rip=0xfffffffffc28c30d regs=0xfffffffffc008b50
--

--
Laurent.
Tilman Schmidt
2012-Sep-12 11:40 UTC
[CentOS] CentOS 6 early panic on ESXi 4.1.0 build 800380
On 11.09.2012 21:06, Tilman Schmidt wrote:
> I tried two different CentOS 6 VMs. Both have the latest standard kernel
> (2.6.32-279.5.2.el6.x86_64). Both run perfectly fine on one of the other
> VMware hosts still running ESXi 4.1.0 build 702113. On build 800380,
> both display the GRUB menu alright but freeze immediately afterwards,
> emitting the message
>
> PANIC: early exception 0d rip 10:ffffffff81038879 error 0 cr2 0
>
> on the bottom of the virtual console. Both run perfectly fine again once
> I move them back to the host with the older ESXi build.

Two and a half new data points:

- The problem host has a Xeon E3-1270V2 processor while the one which runs the CentOS 6 guests fine has an E3-1230. I'm not sufficiently up to date with Intel processor types to tell whether this would make a difference.

- Another CentOS 6 VM with older kernel 2.6.32-220.7.1.el6.x86_64 does come up on the problem host. It does a panic blink (Caps Lock and Scroll Lock blinking in unison while the VM has the keyboard) but I get a working login prompt (I don't get any further because I don't have a logon for the machine) and I can shut it down normally by sending Ctrl-Alt-Del.

- (the half point, no idea if it matters) The CentOS 6 VMs which die with "PANIC: early exception 0d" do *not* do a panic blink.

So it would seem that something related to the problem was changed in the CentOS kernel between releases 2.6.32-220.7.1 and 2.6.32-279.5.2.

--
Tilman Schmidt
Phoenix Software GmbH
Bonn, Germany
Tilman Schmidt
2012-Sep-13 09:30 UTC
[CentOS] CentOS 6.3 early panic on Xeon E3-1270V2 (was: CentOS 6 early panic on ESXi 4.1.0 build 800380)
Alright, it's the CPU. Subject adapted accordingly.

In a fit of recklessness, I updated VMware on one of the hosts on which the CentOS 6 machines were still able to run. Lo and behold, they still work fine there. So now I have:

ESXi Build                   582267   800380    800380
Processor                    E5620    E3-1230   E3-1270V2

Windows                      ok       ok        ok
(all versions)

CentOS 5.8                   ok       ok        ok
2.6.18-308.13.1.el5

CentOS 6.2                   ok       ok        ok(*)
2.6.32-220.7.1.el6.x86_64

CentOS 6.3                   ok       ok        Panic
2.6.32-279.2.1.el6.x86_64

(*) except for the irritating keyboard blink

More ideas?

Thx
T.

--
Tilman Schmidt
Phoenix Software GmbH
Bonn, Germany
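The reasoning behind "it's the CPU" can be checked mechanically: a factor explains the panic only if the guest fails on every host that has that value and boots on every host that does not. What follows is just an illustrative sketch with the results transcribed from the matrix above. The names `HOSTS`, `RESULTS`, and `explains_failure` are made up for this example.

```python
# One entry per host column of the matrix, left to right.
HOSTS = [
    {"build": "582267", "cpu": "E5620"},
    {"build": "800380", "cpu": "E3-1230"},
    {"build": "800380", "cpu": "E3-1270V2"},
]

# Boot result per guest, in the same host order as HOSTS.
RESULTS = {
    "Windows": ["ok", "ok", "ok"],
    "CentOS 5.8": ["ok", "ok", "ok"],
    "CentOS 6.2": ["ok", "ok", "ok"],  # boots, apart from the keyboard blink
    "CentOS 6.3": ["ok", "ok", "Panic"],
}


def explains_failure(factor, value, guest):
    """Hypothetical check: does `factor == value` coincide exactly with failure?

    True only if `guest` fails on every host where the factor has this value
    and boots on every host where it has a different value.
    """
    for host, result in zip(HOSTS, RESULTS[guest]):
        failed = result != "ok"
        if (host[factor] == value) != failed:
            return False
    return True


if __name__ == "__main__":
    print("build 800380:", explains_failure("build", "800380", "CentOS 6.3"))
    print("cpu E3-1270V2:", explains_failure("cpu", "E3-1270V2", "CentOS 6.3"))
```

With these three hosts, the ESXi build fails the check because the E3-1230 host runs the same build and boots CentOS 6.3, while the processor passes it. Three hosts are a small sample, so this narrows the search rather than proving a cause.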
On 13.09.2012 11:30, Tilman Schmidt wrote:
> In a fit of recklessness, I updated VMware on one of the hosts on which
> the CentOS 6 machines were still able to run. Lo and behold, they still
> work fine there. So now I have:
>
> ESXi Build                   582267   800380    800380
> Processor                    E5620    E3-1230   E3-1270V2
>
> Windows                      ok       ok        ok
> (all versions)
>
> CentOS 5.8                   ok       ok        ok
> 2.6.18-308.13.1.el5
>
> CentOS 6.2                   ok       ok        ok(*)
> 2.6.32-220.7.1.el6.x86_64
>
> CentOS 6.3                   ok       ok        Panic
> 2.6.32-279.2.1.el6.x86_64
>
> (*) except for the irritating keyboard blink

In the meantime, one other user with the same problem has turned up on the VMware forum. He reports that Windows 8 x64 doesn't work on the E3-1270V2 host either, but a 32 bit install of CentOS 6.3 does.

Also, I have updated the last host and it continues to run all VMs fine, so the ESXi version is definitely not the culprit.

What happened between kernel releases 2.6.32-220.7.1.el6.x86_64 and 2.6.32-279.2.1.el6.x86_64 that would cause a CPU dependent early exception?

--
Tilman Schmidt
Phoenix Software GmbH
Bonn, Germany
Tilman Schmidt
2013-Mar-12 11:04 UTC
[CentOS] CentOS 6.3 early panic on Xeon E3-1270V2 - CentOS 6.4 too
Having read that RHEL/CentOS 6.4 came with new VMware drivers I checked whether this problem might perchance be fixed. It isn't.

In the meantime I got a report that Windows 8 (64 bit) showed a similar problem. So, updated problem matrix (ESXi build omitted as it has no influence):

Processor                    E5620    E3-1230   E3-1270V2

Windows XP/2003/2008         ok       ok        ok

Windows 8                    ?        ?         Panic

CentOS 5.8                   ok       ok        ok
2.6.18-308.13.1.el5

CentOS 6.2                   ok       ok        ok(*)
2.6.32-220.7.1.el6.x86_64

CentOS 6.3                   ok       ok        Panic
2.6.32-279.2.1.el6.x86_64

CentOS 6.4                   ok       ok        Panic
2.6.32-358.0.1.el6.x86_64

(*) keyboard shows panic blink but system works fine otherwise

Reminder of the problem description: trying to boot a VM with CentOS 6.3 or later on a VMware ESXi 4 host with a Xeon E3-1270V2 processor fails immediately after GRUB, with the VM locking up, console message:

Sep 11 17:21:31.498: vmx| PANIC: early exception 0d rip 10:ffffffff81038879 error 0 cr2 0

and ESXi log messages:

Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero): rip=0xffffffff810388db count=1
Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero): rip=0xffffffff810388db count=2
Sep 11 17:21:19.629: vcpu-0| X86Fault_Warning: vmcore/vmm64/cpu/interp.c:427: cs:eip=0x10:0xffffffff81038879 fault=13
Sep 11 17:21:19.632: vcpu-0| Vix: [1125838 vmxCommands.c:9609]: VMAutomation_HandleCLIHLTEvent. Do nothing.
Sep 11 17:21:19.632: vcpu-0| MsgHint: msg.monitorevent.halt (sent)
Sep 11 17:21:19.632: vcpu-0| The CPU has been disabled by the guest operating system. Power off or reset the virtual machine.

--
Tilman Schmidt
Phoenix Software GmbH
Bonn, Germany
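For anyone comparing their own vmware.log against the excerpt above, the two interesting record types (the unknown-MSR reads and the fault warning) can be pulled out with a couple of regular expressions. This is only a rough sketch based on the exact line formats quoted in this thread. `summarize` and `MSR_NAMES` are hypothetical names, and other ESXi versions may log these events differently. MSR 0x1a0 is listed in Intel's documentation as IA32_MISC_ENABLE; whether that read is related to the fault is not established here.

```python
import re

# MSR address to name; 0x1A0 is IA32_MISC_ENABLE per Intel's MSR listing.
MSR_NAMES = {0x1A0: "IA32_MISC_ENABLE"}

# Patterns follow the log lines quoted in this thread (an assumption about format).
RDMSR_RE = re.compile(
    r"RDMSR: unknown MSR\[(0x[0-9a-fA-F]+)\] \(read as zero\):\s*rip=(0x[0-9a-fA-F]+)"
)
FAULT_RE = re.compile(
    r"cs:eip=(0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+)\s+fault=(\d+)"
)

SAMPLE = """\
Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero): rip=0xffffffff810388db count=1
Sep 11 17:21:19.628: vcpu-0| RDMSR: unknown MSR[0x1a0] (read as zero): rip=0xffffffff810388db count=2
Sep 11 17:21:19.629: vcpu-0| X86Fault_Warning: vmcore/vmm64/cpu/interp.c:427: cs:eip=0x10:0xffffffff81038879 fault=13
"""


def summarize(log_text):
    """Hypothetical helper: collect unknown-MSR reads and the first fault record."""
    reads = []
    for msr_hex, rip_hex in RDMSR_RE.findall(log_text):
        msr = int(msr_hex, 16)
        reads.append({
            "msr": msr,
            "name": MSR_NAMES.get(msr, "unknown"),
            "rip": int(rip_hex, 16),
        })
    fault = None
    m = FAULT_RE.search(log_text)
    if m is not None:
        fault = {
            "cs": int(m.group(1), 16),
            "rip": int(m.group(2), 16),
            "vector": int(m.group(3)),  # VMware logs the vector in decimal
        }
    return {"msr_reads": reads, "fault": fault}


if __name__ == "__main__":
    result = summarize(SAMPLE)
    for read in result["msr_reads"]:
        print(hex(read["msr"]), read["name"], hex(read["rip"]))
    print("fault:", result["fault"])
```

Note that the log gives the fault vector in decimal (13) while the guest console prints it in hex (0d); both refer to the same general protection fault.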