Thomas Steen Rasmussen
2014-Dec-28 13:49 UTC
"random" hangs during boot, ACPI related ? Workaround problem by disabling HTT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello list,

I've had a problem a couple of times now that I wanted to share with the list to see what you have to say. The problem is that FreeBSD sometimes hangs during boot: the last line of output from the kernel is "ACPI APIC Table: <INTEL DENLOW>" and then nothing more happens.

This seems to be related to something not getting initialized correctly after a soft reboot, but only sometimes. The first time I saw this, it was on a server that had been in production for months, and suddenly, after a reboot during an installworld run, it froze. Even a verbose boot doesn't reveal anything that jumps out at me: http://i.imgur.com/y04ACyP.png

Hard rebooting the server appears to solve it, until the next reboot, where it may or may not happen again.

The same problem seems to affect others. This is the thread where I found suggestions for workarounds: https://bugs.pcbsd.org/issues/4024

Workarounds include disabling C3 or disabling HTT in the BIOS. Disabling C3 did nothing for me, but disabling HTT made the server boot successfully again.

The FreeBSD versions involved are 9 and 10, various versions. I've attached /var/run/dmesg.boot from one of the servers that had the problem a few weeks ago. On this server I've disabled HTT in the BIOS, which worked around the problem.

I just bought a new server at Hetzner (the hosting provider where I am experiencing this problem) and the server is not in production yet. I wanted to hold off on disabling HTT to see if anyone wants me to check something so we can get this fixed.

Thoughts? More info available on request.

Best regards,

/Thomas Steen Rasmussen

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (MingW32)

iQIcBAEBAgAGBQJUoAp5AAoJEHcv938JcvpYGRYQAMdi9Qc/+VQ6d//quk2N9agM
VxnlL3njbPdROO/wSOP8zotdYWJZr7axNcgK4Ficp0ksCr9wc9tellvjnHeaE7yB
SVoB/bM9GPZXOb0YuCwAy7taqoGM925DNJFD/bVgvYXENC1y7HkfAa47/JjQycV7
2vjHeLNmGr6nX5cXiT5pPdWzcODjQWIiWhMVTuuZGbySn5M/i+9EUteyZC2oJaf7
sgEpmEqtOxV9DLek0mKJquzdC0nQBQloKXC4j7c3Z1HaqIcn62Cu15IByForTHse
oIMPrfynVZt4ltSGxCV0fzvnWIN1WTEMbwWwuJyTz1Xygxo4Q+XR9jMnmpVV+OeH
W+Ds3Oo6tyFE+D+8gAG4zyHSNuq/WD/D8/OKyOWRha2VvpWcypQ4RwaOkhQMecn6
MabVoJLlryI2+E3e0CP+nYloiQaAPL/RImw0f1Ms3BDjU75vEAs+/CpqiMDuqzLu
tXWL8sYlGYXUi6enXPWbqkzTIVjv7dLpah6twKG9opb5hwem+O01fqtbZJ4uwkzg
D0wiHY6dljKbpU0gevu/kok1cPqmDFMXTO9Zx87bf2GlyaCDW+vQsPS36jQHze7u
aESqeNRo+AHt7evP7+6sXZXeeW8ENWanFNkuelecnxKMobAiAIEWnoeb8sIOfEPk
OS3N99D96XTa1AmCSenL
=zbsy
-----END PGP SIGNATURE-----
-------------- next part --------------
Copyright (c) 1992-2014 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.3-STABLE #2 r275693: Thu Dec 11 00:01:27 UTC 2014
    root at kush.tyknet.dk:/usr/obj/usr/src/sys/TYKJAIL amd64
gcc version 4.2.1 20070831 patched [FreeBSD]
CPU: Intel(R) Xeon(R) CPU E3-1270 v3 @ 3.50GHz (3491.99-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x306c3  Family = 0x6  Model = 0x3c  Stepping = 3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7ffafbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,<b11>,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x21<LAHF,ABM>
  Standard Extended Features=0x2fbb<GSFSBASE,TSCADJ,SMEP,ENHMOVSB,INVPCID>
  TSC: P-state invariant, performance statistics
real memory  = 34359738368 (32768 MB)
avail memory = 33087004672 (31554 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <INTEL DENLOW>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
 cpu2 (AP): APIC ID:  4
 cpu3 (AP): APIC ID:  6
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <INTEL DENLOW> on motherboard
acpi0: Power Button (fixed)
acpi0: reservation of ff000000, 1000fff (3) failed
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
Event timer "HPET3" frequency 14318180 Hz quality 440
Event timer "HPET4" frequency 14318180 Hz quality 440
atrtc0: <AT realtime clock> port 0x70-0x77 irq 8 on acpi0
atrtc0: Warning: Couldn't map I/O.
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
xhci0: <Intel Lynx Point USB 3.0 controller> mem 0xc0120000-0xc012ffff irq 16 at device 20.0 on pci0
xhci0: 32 byte context size.
xhci0: Port routing mask set to 0xffffffff
usbus0 on xhci0
ehci0: <Intel Lynx Point USB 2.0 controller USB-B> mem 0xc1220000-0xc12203ff irq 16 at device 26.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
vgapci0: <VGA-compatible display> mem 0xc2000000-0xc2ffffff,0xc1010000-0xc1013fff,0xc0800000-0xc0ffffff irq 16 at device 0.0 on pci1
vgapci0: Boot video device
pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
pci2: <ACPI PCI bus> on pcib2
igb0: <Intel(R) PRO/1000 Network Connection version - 2.4.0> port 0x3000-0x301f mem 0xc1100000-0xc117ffff,0xc1180000-0xc1183fff irq 17 at device 0.0 on pci2
igb0: Using MSIX interrupts with 5 vectors
igb0: Ethernet address: 00:1e:67:99:6b:22
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
ehci1: <Intel Lynx Point USB 2.0 controller USB-A> mem 0xc1210000-0xc12103ff irq 23 at device 29.0 on pci0
usbus2: EHCI version 1.0
usbus2 on ehci1
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel Lynx Point AHCI SATA controller> port 0x4070-0x4077,0x4060-0x4063,0x4050-0x4057,0x4040-0x4043,0x4020-0x403f mem 0xc1200000-0xc12007ff irq 19 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich4: <AHCI channel> at channel 4 on ahci0
ahcich5: <AHCI channel> at channel 5 on ahci0
acpi_tz0: <Thermal Zone> on acpi0
acpi_tz1: <Thermal Zone> on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
battery0: <ACPI Control Method Battery> on acpi0
battery1: <ACPI Control Method Battery> on acpi0
battery2: <ACPI Control Method Battery> on acpi0
ppc1: cannot reserve I/O port range
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
usbus0: 5.0Gbps Super Speed USB v3.0
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
usbus1: 480Mbps High Speed USB v2.0
usbus2: 480Mbps High Speed USB v2.0
ugen0.1: <0x8086> at usbus0
uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ugen2.1: <Intel> at usbus2
uhub2: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <HGST HUS724040ALA640 MFAOAA70> ATA-8 SATA 3.x device
ada0: Serial Number PN1334PAKW763S
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 3815447MB (7814037168 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad4
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: <HGST HUS724040ALA640 MFAOAA70> ATA-8 SATA 3.x device
ada1: Serial Number PN1334PAKRBDHS
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 3815447MB (7814037168 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad6
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
Timecounter "TSC-low" frequency 1745993874 Hz quality 1000
Root mount waiting for: usbus2 usbus1 usbus0
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub0: 21 ports with 21 removable, self powered
Root mount waiting for: usbus2 usbus1 usbus0
ugen0.2: <GASIA> at usbus0
ukbd0: <GASIA PS2toUSB Adapter, class 0/0, rev 1.10/2.01, addr 1> on usbus0
kbd0 at ukbd0
ums0: <GASIA PS2toUSB Adapter, class 0/0, rev 1.10/2.01, addr 1> on usbus0
ums0: 5 buttons and [XYZ] coordinates ID=1
ugen1.2: <vendor 0x8087> at usbus1
uhub3: <vendor 0x8087 product 0x8008, class 9/0, rev 2.00/0.05, addr 2> on usbus1
ugen2.2: <vendor 0x8087> at usbus2
uhub4: <vendor 0x8087 product 0x8000, class 9/0, rev 2.00/0.05, addr 2> on usbus2
uhub3: 6 ports with 6 removable, self powered
uhub4: 8 ports with 8 removable, self powered
ugen0.3: <vendor 0x05e3> at usbus0
uhub5: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/77.63, addr 2> on usbus0
uhub5: 4 ports with 4 removable, self powered
Root mount waiting for: usbus0
ugen0.4: <Logitech> at usbus0
ukbd1: <USB Keyboard> on usbus0
kbd2 at ukbd1
uhid0: <USB Keyboard> on usbus0
Root mount waiting for: usbus0
ugen0.5: <Peppercon AG> at usbus0
ukbd2: <Peppercon AG Multidevice, class 0/0, rev 2.00/0.01, addr 4> on usbus0
kbd3 at ukbd2
uhid1: <Peppercon AG Multidevice, class 0/0, rev 2.00/0.01, addr 4> on usbus0
Root mount waiting for: usbus0
ugen0.6: <American Megatrends Inc.> at usbus0
ukbd3: <Keyboard Interface> on usbus0
kbd4 at ukbd3
ums1: <Mouse Interface> on usbus0
ums1: 3 buttons and [Z] coordinates ID=0
Trying to mount root from zfs:tank/root []...
cryptosoft0: <software crypto> on motherboard
GEOM_ELI: Device ada0p3.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: software
GEOM_ELI: Device ada1p3.eli created.
GEOM_ELI: Encryption: AES-XTS 128
GEOM_ELI:     Crypto: software
igb0: link state changed to UP
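
When comparing a boot with HTT enabled against one with it disabled, the "FreeBSD/SMP:" topology line and the per-CPU APIC IDs in dmesg.boot are the quickest things to look at. The following is a minimal Python sketch for pulling them out of a saved dmesg. The function name `summarize_topology` is made up for illustration, and the optional "x N SMT threads" suffix is an assumption about how kernels of this era report Hyper-Threading; it does not appear in the attached log, so verify it against your own output.

```python
import re

# Topology line as it appears in the attached dmesg.boot:
#   FreeBSD/SMP: 1 package(s) x 4 core(s)
# ASSUMPTION: with HTT enabled the kernel appends " x N SMT threads".
TOPOLOGY_RE = re.compile(
    r"FreeBSD/SMP: (\d+) package\(s\) x (\d+) core\(s\)(?: x (\d+) SMT threads)?"
)
# Per-CPU lines such as " cpu1 (AP): APIC ID:  2"
APIC_ID_RE = re.compile(r"cpu(\d+) \((?:BSP|AP)\): APIC ID:\s*(\d+)")


def summarize_topology(dmesg_text):
    """Return packages, cores, SMT threads per core and APIC IDs, or None."""
    match = TOPOLOGY_RE.search(dmesg_text)
    if match is None:
        return None
    packages, cores, threads = match.groups()
    apic_ids = [int(apic) for _, apic in APIC_ID_RE.findall(dmesg_text)]
    return {
        "packages": int(packages),
        "cores": int(cores),
        # No SMT suffix is treated as one thread per core.
        "smt_threads": int(threads) if threads else 1,
        "apic_ids": apic_ids,
    }


# Excerpt copied from the dmesg.boot attached above (HTT disabled in BIOS).
SAMPLE = """\
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
 cpu2 (AP): APIC ID:  4
 cpu3 (AP): APIC ID:  6
"""

if __name__ == "__main__":
    print(summarize_topology(SAMPLE))
```

In practice one would pass it the contents of /var/run/dmesg.boot from each boot and compare the two summaries.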
Kevin Oberman
2014-Dec-28 17:59 UTC
"random" hangs during boot, ACPI related ? Workaround problem by disabling HTT
On Sun, Dec 28, 2014 at 5:49 AM, Thomas Steen Rasmussen <thomas at gibfest.dk> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello list,
>
> I've had a problem a couple of times now that I wanted to share with
> the list to see what you have to say. The problem is that FreeBSD
> sometimes hangs during boot: the last line of output from the kernel
> is "ACPI APIC Table: <INTEL DENLOW>" and then nothing more happens.
>
> This seems to be related to something not getting initialized correctly
> after a soft reboot, but only sometimes. The first time I saw this, it
> was on a server that had been in production for months, and suddenly,
> after a reboot during an installworld run, it froze. Even a verbose boot
> doesn't reveal anything that jumps out at me: http://i.imgur.com/y04ACyP.png
>
> Hard rebooting the server appears to solve it, until the next reboot,
> where it may or may not happen again.
>
> The same problem seems to affect others. This is the thread where I
> found suggestions for workarounds: https://bugs.pcbsd.org/issues/4024
>
> Workarounds include disabling C3 or disabling HTT in the BIOS. Disabling
> C3 did nothing for me, but disabling HTT made the server boot
> successfully again.
>
> The FreeBSD versions involved are 9 and 10, various versions. I've
> attached /var/run/dmesg.boot from one of the servers that had the
> problem a few weeks ago. On this server I've disabled HTT in the BIOS,
> which worked around the problem.
>
> I just bought a new server at Hetzner (the hosting provider where I am
> experiencing this problem) and the server is not in production yet. I
> wanted to hold off on disabling HTT to see if anyone wants me to
> check something so we can get this fixed.
>
> Thoughts? More info available on request.
>
>
> Best regards,
>
> /Thomas Steen Rasmussen
>

I know that C3 (and higher) and TCC don't get along on some platforms.
I thought that throttling was now off by default, but I'm not sure, as I have it disabled in /boot/loader.conf:

# Disable CPU throttling
hint.p4tcc.0.disabled=1
hint.acpi_throttle.0.disabled=1

This will not show up until ACPI and EST come up, so it sort of fits. Whether it causes a lockup or not, disabling it is a good idea. It was intended for use as thermal management, not power management. (The 'T' in TCC (or P4TCC) is "Thermal", after all.)

--
R. Kevin Oberman, Network Engineer, Retired
E-mail: rkoberman at gmail.com
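
For anyone applying the two hints above across several machines, here is a minimal Python sketch that checks whether a loader.conf-style text sets both of them to 1. The helper names `parse_loader_conf` and `missing_throttle_hints` are hypothetical, and the parser is deliberately simplified: it assumes one `name=value` per line and treats everything after a `#` as a comment, which would mishandle a quoted value containing `#`.

```python
# The two hints named in the message above.
THROTTLE_HINTS = ("hint.p4tcc.0.disabled", "hint.acpi_throttle.0.disabled")


def parse_loader_conf(text):
    """Parse 'name=value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        # Simplification: anything after '#' is dropped as a comment.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # loader.conf values are often quoted, e.g. name="1".
        settings[name.strip()] = value.strip().strip('"')
    return settings


def missing_throttle_hints(text):
    """Return the throttling hints that are not set to 1 in the given text."""
    settings = parse_loader_conf(text)
    return [hint for hint in THROTTLE_HINTS if settings.get(hint) != "1"]


# The fragment quoted in the message above.
SAMPLE = """\
# Disable CPU throttling
hint.p4tcc.0.disabled=1
hint.acpi_throttle.0.disabled=1
"""

if __name__ == "__main__":
    missing = missing_throttle_hints(SAMPLE)
    print("missing hints:", missing)
```

To check a real system, read /boot/loader.conf and pass its contents to `missing_throttle_hints`; an empty list means both hints are present. This only inspects the file, not what the running kernel actually picked up.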