Leon Fauster

2018-Aug-30 09:40 UTC

[CentOS] Panic / EL6 / KVM / kernel-2.6.32-754.2.1.el6.x86_64

On 30.08.2018 at 10:54, isdtor <isdtor at gmail.com> wrote:
> 
> Leon Fauster via CentOS writes:
>> Since the update from kernel-2.6.32-754.2.1.el6.x86_64 
>> to kernel-2.6.32-754.3.5.el6.x86_64 I can not boot my 
>> KVM guests anymore!? The workstation panics immediately! 
>> 
>> I would not have expected this behavior now (last phase of OS). 
>> It was very robust until now (Optiplex Workstation). I see some KVM 
>> related lines in the changelog.diff. Before swimming upstream:
>> 
>> Does someone have problems related to KVM with
>> kernel-2.6.32-754.3.5.el6.x86_64?
> 
> Yes, the exact same thing happened here, and I suspect it is related to
> older CPUs that don't get any Spectre/Meltdown updates.

Thanks for the feedback. I was assuming that some kind of Spectre/Meltdown
fix is causing this.


> IBM x3250
> Intel(R) Xeon(R) CPU           E3110  @ 3.00GHz
> 
> This is a dual-core CPU of similar vintage to yours (can we have a model
> #?), pre-2010.

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU     E6850  @ 3.00GHz
stepping	: 11
microcode	: 186
cpu MHz		: 2000.000
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc
arch_perfmon pebs bts rep_good aperfmperf eagerfpu pni dtes64 monitor ds_cpl vmx
smx est tm2 ssse3 cx16 xtpr pdcm lahf_lm dtherm pti retpoline tpr_shadow vnmi
flexpriority
bogomips	: 5984.84
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:
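For anyone comparing machines in this thread, here is a minimal Python sketch that pulls the family, model, and a few flags out of /proc/cpuinfo-style text. The names `parse_cpuinfo`, `summarize`, and `SAMPLE` are made up for illustration and are not part of any existing tool. It only checks flags that appear in the dump above (vmx, pti, retpoline) and assumes x86-style field names.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into a list of per-processor dicts."""
    cpus, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates processor entries.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def summarize(cpu):
    """Reduce one processor entry to the fields discussed in this thread."""
    flags = set(cpu.get("flags", "").split())
    return {
        "family": int(cpu["cpu family"]),
        "model": int(cpu["model"]),
        # Collapse the padding Intel puts inside the model name.
        "name": " ".join(cpu["model name"].split()),
        "vmx": "vmx" in flags,
        "pti": "pti" in flags,
        "retpoline": "retpoline" in flags,
    }


# Shortened sample (illustrative); the flags line is cut down on purpose.
SAMPLE = """\
processor\t: 1
vendor_id\t: GenuineIntel
cpu family\t: 6
model\t\t: 15
model name\t: Intel(R) Core(TM)2 Duo CPU     E6850  @ 3.00GHz
flags\t\t: fpu vme vmx pti retpoline
"""

if __name__ == "__main__":
    for cpu in parse_cpuinfo(SAMPLE):
        print(summarize(cpu))
```

To run it against a real host, pass `open("/proc/cpuinfo").read()` to `parse_cpuinfo` instead of `SAMPLE`.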




> There goes a cheap and reliable VM dev machine :-/

No way. Should all IT departments trash a big percentage of their hardware now?

--
LF

Simon Matter

2018-Aug-30 10:05 UTC

>
> On 30.08.2018 at 10:54, isdtor <isdtor at gmail.com> wrote:
>
>>
>> Leon Fauster via CentOS writes:
>>> Since the update from kernel-2.6.32-754.2.1.el6.x86_64
>>> to kernel-2.6.32-754.3.5.el6.x86_64 I can not boot my
>>> KVM guests anymore!? The workstation panics immediately!
>>>
>>> I would not have expected this behavior now (last phase of OS).
>>> It was very robust until now (Optiplex Workstation). I see some KVM
>>> related lines in the changelog.diff. Before swimming upstream:
>>>
>>> Does someone have problems related to KVM with
>>> kernel-2.6.32-754.3.5.el6.x86_64?
>>
>> Yes, the exact same thing happened here, and I suspect it is related to
>> older CPUs that don't get any Spectre/Meltdown updates.
>
>
> Thanks for the feedback. I was assuming that some kind of
> Spectre/Meltdown fix is causing this.
>

Does downgrading qemu, as I proposed in the other mail, fix it in your case?

I'm interested because in my case I'm having the issue on two older AMD
CPUs, not Intel.
>
>
>> IBM x3250
>> Intel(R) Xeon(R) CPU           E3110  @ 3.00GHz
>>
>> This is a dual-core CPU of similar vintage to yours (can we have a
>> model #?), pre-2010.
>
>
> processor	: 1
> vendor_id	: GenuineIntel
> cpu family	: 6
> model		: 15
> model name	: Intel(R) Core(TM)2 Duo CPU     E6850  @ 3.00GHz
> stepping	: 11
> microcode	: 186
> cpu MHz		: 2000.000
> cache size	: 4096 KB
> physical id	: 0
> siblings	: 2
> core id		: 1
> cpu cores	: 2
> apicid		: 1
> initial apicid	: 1
> fpu		: yes
> fpu_exception	: yes
> cpuid level	: 10
> wp		: yes
> flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
> constant_tsc arch_perfmon pebs bts rep_good aperfmperf eagerfpu pni dtes64
> monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm lahf_lm dtherm pti
> retpoline tpr_shadow vnmi flexpriority
> bogomips	: 5984.84
> clflush size	: 64
> cache_alignment	: 64
> address sizes	: 36 bits physical, 48 bits virtual
> power management:
>
>
>
>
>
>> There goes a cheap and reliable VM dev machine :-/
>
>
> No way. Should all IT departments trash a big percentage of their hardware
> now?

I second that; I really hope this will be fixed.

Regards,
Simon

isdtor

2018-Aug-30 10:23 UTC

> >>> Does someone have problems related to KVM with
> >>> kernel-2.6.32-754.3.5.el6.x86_64?
> >>
> >> Yes, the exact same thing happened here, and I suspect it is related to
> >> older CPUs that don't get any Spectre/Meltdown updates.
> >
> > Thanks for the feedback. I was assuming that some kind of
> > Spectre/Meltdown fix is causing this.
>
> Does downgrading qemu, as I proposed in the other mail, fix it in your case?
>
> I'm interested because in my case I'm having the issue on two older AMD
> CPUs, not Intel.

Simon, downgrading does not fix the problem in my case. I upgraded to the
754.3.5 kernel, disabled libvirtd and libvirt-guests, and rebooted. Once the
machine was up, I started the libvirtd service and the machine crashed
immediately.

Before downgrade: qemu-img-0.12.1.2-2.506.el6_10.1, qemu-kvm-0.12.1.2-2.506.el6_10.1
                  (crash on 2.6.32-754.3.5)
After downgrade:  qemu-img-0.12.1.2-2.503.el6_9.6, qemu-kvm-0.12.1.2-2.503.el6_9.6
                  (crash on 2.6.32-754.3.5)

In Leon's and my case, the culprit is kernel 2.6.32-754.3.5. In CentOS bug
0015067, the bad kernel is 2.6.32-754.2.1, which works fine here. The difference
might be how different CPUs are handled.
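
To keep track of which kernel build is which, here is a minimal Python sketch that splits an EL6 kernel release string into comparable tuples and looks it up in a small table. The names `parse_release`, `kernel_status`, and `REPORTED` are hypothetical. The table holds only what has been reported in this thread and in CentOS bug 0015067, so a missing entry means "no report here", not "known good".

```python
import re


def parse_release(release):
    """Turn '2.6.32-754.3.5.el6.x86_64' into ((2, 6, 32), (754, 3, 5))."""
    m = re.match(r"^(?:kernel-)?(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    version = tuple(int(p) for p in m.group(1).split("."))
    build = tuple(int(p) for p in m.group(2).split("."))
    return version, build


# Reports collected from this thread only (an assumption-laden summary).
REPORTED = {
    (754, 3, 5): "panic when libvirtd/KVM guests start "
                 "(Core 2 Duo E6850, Xeon E3110)",
    (754, 2, 1): "bad kernel in CentOS bug 0015067; "
                 "works fine on the Xeon E3110 box",
}


def kernel_status(release):
    """Return the note recorded for this build, or None if nothing is recorded."""
    _, build = parse_release(release)
    return REPORTED.get(build)


if __name__ == "__main__":
    for rel in ("2.6.32-754.2.1.el6.x86_64", "2.6.32-754.3.5.el6.x86_64"):
        print(rel, "->", kernel_status(rel))
```

Feed it the output of `uname -r` on the host in question. The tuples compare in the expected order, so (754, 2, 1) sorts before (754, 3, 5).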

Stephen John Smoogen

2018-Aug-30 12:13 UTC

On Thu, 30 Aug 2018 at 05:41, Leon Fauster via CentOS <centos at centos.org>
wrote:
>
>
>
> > There goes a cheap and reliable VM dev machine :-/
>
>
> No way. Should all IT departments trash a big percentage of their hardware
> now?
>

I am going to say, from the chip and OEM manufacturers' point of view: yes. For
at least the last 15 years, they have priced their hardware to have a 4-6
year lifetime. Consumers of said hardware are supposed to plan around that
with magical budget money and replace their hardware regularly. Since
people rarely have said money, we have mostly gotten away with not having
to do so, because big things like this don't happen very often.
[The last big one was all the working hardware people had to get rid of for
Y2K.]

The fixes for the old hardware are going to be problematic for a lot of
different reasons (Intel isn't fixing its microcode, backporting deep
kernel rewrites to very old kernels tends to crash a lot, etc.). I would
recommend one of the following strategies:

1. Let your budget know that there will be a lot of replacements coming up.
Replace hardware as you can.
2. Make a decision about what your security risk is for this problem, stick
to an old kernel and put virtual systems which match your security risk on
the old hardware.
3. Test a newer kernel/release on the hardware and see whether the problem
occurs there too. If it does, then it is doubtful that the fix can be
backported until it is fixed in the newer version. If it doesn't, then it
might help figure out where the breakage is.

-- 
Stephen J Smoogen.

mark

2018-Aug-30 14:31 UTC

Stephen John Smoogen wrote:
> On Thu, 30 Aug 2018 at 05:41, Leon Fauster via CentOS <centos at centos.org>
> wrote:
>>
>>> There goes a cheap and reliable VM dev machine :-/
>>
>> No way. Should all IT departments trash a big percentage of their
>> hardware now?
> I am going to say, from the chip and OEM manufacturers' point of view: yes.
> For at least the last 15 years, they have priced their hardware to have a
> 4-6 year lifetime. Consumers of said hardware are supposed to plan
> around that
<snip>

Heh, heh. We're starting to replace the servers that we got when I was
first here... in '09 and '10 and '11. But then, I'm a contractor at a US
federal gov't agency in the civilian sector, and budgets, um, right, LOL.
Next time someone complains about "waste of tax dollars", why, just last
year, or was it earlier this year, we finally retired a few servers that
had actual SCSI drives....

The only problems we've had on the latest CentOS 7 kernels *seem* to be
related to a specific Intel chip or two. Otherwise, the older servers work
just fine.

     mark
