Johnny Hughes
2017-Apr-19 17:33 UTC
[CentOS-virt] Xen C6 kernel 4.9.13 and testing 4.9.15 only reboots.
On 04/19/2017 12:18 PM, PJ Welsh wrote:
>
> On Wed, Apr 19, 2017 at 5:40 AM, Johnny Hughes <johnny at centos.org> wrote:
>
> > On 04/18/2017 12:39 PM, PJ Welsh wrote:
> > > Here is something interesting... I went through the BIOS options and
> > > found that one R710 that *is* functioning only differed in that "Logical
> > > Processor"/Hyperthreading was *enabled* while the one that is *not*
> > > functioning had HT *disabled*. Enabled Logical Processor and the system
> > > starts without issue! I've rebooted 3 times now without issue.
> > > Dell R710 BIOS version 6.4.0
> > > 2x Intel(R) Xeon(R) CPU L5639 @ 2.13GHz
> > > 4.9.20-26.el7.x86_64 #1 SMP Tue Apr 4 11:19:26 CDT 2017 x86_64 x86_64
> > > x86_64 GNU/Linux
> >
> > Outstanding .. I have now released a 4.9.23-26.el6 and .el7 to the
> > system as normal updates. It should be available later today.
> >
> > <snip>
>
> I've verified with a second Dell R710 that disabling
> Hyperthreading/Logical Processor causes the primary xen booting kernel
> to fail and reboot. Consequently, enabling allows for the system to
> start as expected and without any issue:
> Current tested kernel was: 4.9.13-22.el7.x86_64 #1 SMP Sun Feb 26
> 22:15:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> I just attempted an update and the 4.9.23-26 is not yet up. Does this
> update address the Hyperthreading issue in any way?
>

I don't think so .. at least I did not specifically add anything to do so.

You can get it here for testing:

https://buildlogs.centos.org/centos/7/virt/x86_64/xen/

(or from /6/ as well for CentOS-6)

Not sure why it did not go out on the signing run .. will check that server.
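For anyone wanting to confirm which of the two BIOS states a host is in, one common check is to compare the "siblings" and "cpu cores" fields of /proc/cpuinfo. The following is only a minimal sketch and not something posted in this thread: the function name is hypothetical, the sample values are illustrative, and inside a Xen dom0 the reported topology may not match the physical hardware, so it is probably best run from a bare-metal boot.

```python
def hyperthreading_enabled(cpuinfo_text):
    """Guess HT state from /proc/cpuinfo-formatted text.

    Returns True if a package reports more logical siblings than cores,
    False if the two counts match, and None if either field is missing.
    Only the first processor entry that carries each field is examined.
    """
    siblings = None
    cores = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key = key.strip()
        if key == "siblings" and siblings is None:
            siblings = int(value)
        elif key == "cpu cores" and cores is None:
            cores = int(value)
    if siblings is None or cores is None:
        return None
    return siblings > cores


if __name__ == "__main__":
    # Reads the live file; the result reflects whatever the running
    # kernel reports, which may differ under a hypervisor.
    with open("/proc/cpuinfo") as handle:
        print(hyperthreading_enabled(handle.read()))
```

A result of None means the fields were not present at all, which is itself a hint that the kernel is not exposing topology information in the usual form.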
Anderson, Dave
2017-Apr-21 01:40 UTC
[CentOS-virt] Xen C6 kernel 4.9.13 and testing 4.9.15 only reboots.
Good news/bad news testing the new kernel on CentOS7 with my now notoriously finicky machines:

Good news: 4.9.23-26.el7 (grabbed today via yum update) isn't any worse than 4.9.13-22 was on my xen hosts (as far as I can tell so far at least).

Bad news: It isn't any better than 4.9.13 was for me either. If I don't set a vcpu limit in the grub/xen config, it still panics like so:

[ 6.716016] CPU: Physical Processor ID: 0
[ 6.720199] CPU: Processor Core ID: 0
[ 6.724046] mce: CPU supports 2 MCE banks
[ 6.728239] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8
[ 6.733884] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[ 6.740770] Freeing SMP alternatives memory: 32K (ffffffff821a8000 - ffffffff821b0000)
[ 6.750638] ftrace: allocating 34344 entries in 135 pages
[ 6.771888] smpboot: Max logical packages: 1
[ 6.776363] VPMU disabled by hypervisor.
[ 6.780479] Performance Events: SandyBridge events, PMU not available due to virtualization, using software events only.
[ 6.792237] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 6.798943] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 6.805949] installing Xen timer for CPU 1
[ 6.810659] installing Xen timer for CPU 2
[ 6.815317] installing Xen timer for CPU 3
[ 6.819947] installing Xen timer for CPU 4
[ 6.824618] installing Xen timer for CPU 5
[ 6.829282] installing Xen timer for CPU 6
[ 6.833935] installing Xen timer for CPU 7
[ 6.838565] installing Xen timer for CPU 8
[ 6.843110] smpboot: Package 1 of CPU 8 exceeds BIOS package data 1.
[ 6.849475] ------------[ cut here ]------------
[ 6.854091] kernel BUG at arch/x86/kernel/cpu/common.c:997!
[ 6.855864] random: fast init done
[ 6.863070] invalid opcode: 0000 [#1] SMP
[ 6.867088] Modules linked in:
[ 6.870168] CPU: 8 PID: 0 Comm: swapper/8 Not tainted 4.9.23-26.el7.x86_64 #1
[ 6.877298] Hardware name: Supermicro X9DRT/X9DRT, BIOS 3.2a 08/04/2015
[ 6.883920] task: ffff880058a6a5c0 task.stack: ffffc900400c0000
[ 6.889840] RIP: e030:[<ffffffff8103e7e7>] [<ffffffff8103e7e7>] identify_secondary_cpu+0x57/0x80
[ 6.898756] RSP: e02b:ffffc900400c3f08 EFLAGS: 00010086
[ 6.904069] RAX: 00000000ffffffe4 RBX: ffff88005d80a020 RCX: ffffffff81e5ffc8
[ 6.911201] RDX: 0000000000000001 RSI: 0000000000000005 RDI: 0000000000000005
[ 6.918335] RBP: ffffc900400c3f18 R08: 00000000000000ce R09: 0000000000000000
[ 6.925466] R10: 0000000000000005 R11: 0000000000000006 R12: 0000000000000008
[ 6.932599] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 6.939735] FS: 0000000000000000(0000) GS:ffff88005d800000(0000) knlGS:0000000000000000
[ 6.947819] CS: e033 DS: 002b ES: 002b CR0: 0000000080050033
[ 6.953565] CR2: 0000000000000000 CR3: 0000000001e07000 CR4: 0000000000042660
[ 6.960696] Stack:
[ 6.962731] 0000000000000008 0000000000000000 ffffc900400c3f28 ffffffff8104ebce
[ 6.970205] ffffc900400c3f40 ffffffff81029855 0000000000000000 ffffc900400c3f50
[ 6.977691] ffffffff810298d0 0000000000000000 0000000000000000 0000000000000000
[ 6.985164] Call Trace:
[ 6.987626] [<ffffffff8104ebce>] smp_store_cpu_info+0x3e/0x40
[ 6.993480] [<ffffffff81029855>] cpu_bringup+0x35/0x90
[ 6.998700] [<ffffffff810298d0>] cpu_bringup_and_idle+0x20/0x40
[ 7.004706] Code: 44 89 e7 ff 50 68 0f b7 93 d2 00 00 00 39 d0 75 1c 0f b7 bb da 00 00 00 44 89 e6 e8 e4 02 01 00 85 c0 75 07 5b 41 5c 5d c3 0f 0b <0f> 0b 0f b7 8b d4 00 00 00 89 c2 44 89 e6 48 c7 c7 90 d3 ca 81
[ 7.024976] RIP [<ffffffff8103e7e7>] identify_secondary_cpu+0x57/0x80
[ 7.031528] RSP <ffffc900400c3f08>
[ 7.035032] ---[ end trace f2a8d75941398d9f ]---
[ 7.039658] Kernel panic - not syncing: Attempted to kill the idle task!

So, other than my workaround, which still works, I'm not sure what else I can provide in the way of feedback/testing. But if you want anything else gathered, let me know.

Thanks,
-Dave

--
Dave Anderson
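The two lines in the trace above that matter most when comparing hosts are the smpboot package warning and the kernel BUG location. As a rough sketch only (the function and field names are made up for illustration, and the patterns are written against nothing more than the log text quoted in this message), they could be pulled out of a saved console log like this:

```python
import re

# Patterns modelled on the two log lines quoted in this thread; other
# kernel versions may word these messages differently.
PACKAGE_RE = re.compile(
    r"smpboot: Package (\d+) of CPU (\d+) exceeds BIOS package data (\d+)"
)
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")


def summarize_boot_log(log_text):
    """Extract the package mismatch warning and BUG location, if present."""
    summary = {"package_mismatch": None, "bug_location": None}

    match = PACKAGE_RE.search(log_text)
    if match:
        summary["package_mismatch"] = {
            "package": int(match.group(1)),
            "cpu": int(match.group(2)),
            "bios_packages": int(match.group(3)),
        }

    match = BUG_RE.search(log_text)
    if match:
        # (source file, line number) as reported by the kernel
        summary["bug_location"] = (match.group(1), int(match.group(2)))

    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarize_boot_log.py < console.log
    print(summarize_boot_log(sys.stdin.read()))
```

Running this over logs from a working and a failing host makes it easier to see whether both hit the same package warning before the BUG.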
Mark L Sung
2017-Apr-21 10:01 UTC
[CentOS-virt] Xen C6 kernel 4.9.13 and testing 4.9.15 only reboots.
Hmm, it seems there are still stability issues with "4.9.23-26.el7.x86_64"; I have recently heard of many issues related to Supermicro boards! :-(

Peace!!!