On Tue, Sep 13, 2016 at 05:54:26PM +0300, Andriy Gapon wrote:
> On 13/09/2016 17:21, Slawa Olhovchenkov wrote:
> > boot failed:
> >
> > set hw.x2apic_enable=0
> > loading required module 'krpc'
> > /boot/kernel.VSTREAM/krpc.ko size 0x2a210 at 0x134e000
> > loading required module 'opensolaris'
> > /boot/kernel.VSTREAM/opensolaris.ko size 0xadb8 at 0x1379000
> > /boot/kernel.VSTREAM/if_igb.ko size 0x69f10 at 0x1384000
> > can't find 'if_ixgbe'
> > /boot/kernel.VSTREAM/if_lagg.ko size 0x150c0 at 0x13ee000
> > /boot/kernel.VSTREAM/ukbd.ko size 0xe128 at 0x1404000
> > loading required module 'usb'
> > /boot/kernel.VSTREAM/usb.ko size 0x458b0 at 0x1413000
> > /boot/kernel.VSTREAM/umass.ko size 0xaa10 at 0x1459000
> > /boot/kernel.VSTREAM/accf_http.ko size 0x2710 at 0x1464000
> > /boot/kernel.VSTREAM/uhci.ko size 0xd508 at 0x1467000
> > /boot/kernel.VSTREAM/ohci.ko size 0xc9d0 at 0x1475000
> > /boot/kernel.VSTREAM/ehci.ko size 0xfc40 at 0x1482000
> > /boot/kernel.VSTREAM/xhci.ko size 0x11068 at 0x1492000
> > /boot/kernel.VSTREAM/cc_htcp.ko size 0x3a70 at 0x14a4000
> > Booting...
> > Copyright (c) 1992-2016 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > The Regents of the University of California. All rights reserved.
> > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > FreeBSD 11.0-RELEASE-p305117 #0: Mon Sep 12 20:38:53 MSK 2016
> > slw at edge21.int.integros.com:/usr/obj/usr/src/sys/VSTREAM amd64
> > FreeBSD clang version 3.8.0 (tags/RELEASE_380/final 262564) (based on LLVM 3.8.0)
> > VT(vga): text 80x25
> > CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz (2200.04-MHz K8-class CPU)
> > Origin="GenuineIntel" Id=0x406f1 Family=0x6 Model=0x4f Stepping=1
> > Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> > Features2=0x7ffefbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
> > AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
> > AMD Features2=0x121<LAHF,ABM,Prefetch>
> > Structured Extended Features=0x21cbfbb<FSGSBASE,TSCADJ,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,NFPUSG,PQE,RDSEED,ADX,SMAP,PROCTRACE>
> > XSAVE Features=0x1<XSAVEOPT>
> > VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
> > TSC: P-state invariant, performance statistics
> > real memory = 137438953472 (131072 MB)
> > avail memory = 133407973376 (127227 MB)
> > Event timer "LAPIC" quality 600
> > ACPI APIC Table: <ALASKA A M I >
> > boot_cpu_id = 255
> > kernel trap 12 with interrupts disabled
> >
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = ff
> > fault virtual address = 0x0
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0xffffffff80537e74
> > stack pointer = 0x28:0xffffffff814b3a60
> > frame pointer = 0x28:0xffffffff814b3a70
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = resume, IOPL = 0
> > current process = 0 ()
> > trap number = 12
> > panic: page fault
> > cpuid = 0
> > KDB: stack backtrace:
> > #0 0xffffffff805272e7 at kdb_backtrace+0x67
> > #1 0xffffffff804dd662 at vpanic+0x182
> > #2 0xffffffff804dd4d3 at panic+0x43
> > #3 0xffffffff807a37a1 at trap_fatal+0x351
> > #4 0xffffffff807a3993 at trap_pfault+0x1e3
> > #5 0xffffffff807a2f1c at trap+0x26c
> > #6 0xffffffff80787ca1 at calltrap+0x8
> > #7 0xffffffff8083b53a at topo_probe+0x61a
> > #8 0xffffffff8078fe93 at cpu_mp_start+0x1c3
> > #9 0xffffffff805382ca at mp_start+0x3a
> > #10 0xffffffff80465cd8 at mi_startup+0x118
> > #11 0xffffffff8028dfac at btext+0x2c
> > Uptime: 1s
>
> Thank you!
> It seems like exactly the same behavior that happens when you toggle that
> BIOS option.
>
> My theory is that in both cases, hw.x2apic_enable=0 and X2APIC_OPT_OUT is on,
> the BIOS turns on x2APIC mode and transitions to OS in that mode.
> In the case when X2APIC_OPT_OUT is on it's clearly a BIOS bug.
> But maybe we could do a little bit better in both cases. At the very least we
> could detect the situation and panic with a helpful message (e.g. "x2APIC mode
> is disabled but turned on by BIOS"). Perhaps we could even try to downgrade to
> xAPIC mode.
If hw.x2apic_enable=0 and the machine booted as far as the topology probe,
the BIOS definitely did not hand off with x2APIC enabled: any access to the
LAPIC register page in x2APIC mode faults.
X2APIC_OPT_OUT does not result in an x2APIC-mode hand-off either, for the
same reason. The system would have panicked much earlier, whereas the dmesg
indicates that LAPIC ICR calibration was successful.
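For reference, the mode the BIOS handed off in can be read from the
IA32_APIC_BASE MSR (0x1b): bit 11 (EN) is the global enable and bit 10 (EXTD)
selects x2APIC mode. A minimal sketch of that decoding follows. The enum and
the function name apic_mode_from_base() are made up for illustration, and the
bit macros are defined locally rather than taken from any kernel header.

```c
#include <stdint.h>

/* Bit layout of the IA32_APIC_BASE MSR (0x1b); defined locally for the sketch. */
#define APICBASE_BSP     (1ULL << 8)   /* this CPU is the bootstrap processor */
#define APICBASE_X2APIC  (1ULL << 10)  /* EXTD: x2APIC mode enabled */
#define APICBASE_ENABLED (1ULL << 11)  /* EN: local APIC globally enabled */

/* Hypothetical names, used only in this example. */
enum apic_mode {
	APIC_MODE_DISABLED,
	APIC_MODE_XAPIC,
	APIC_MODE_X2APIC,
	APIC_MODE_INVALID
};

/* Classify the APIC operating mode from a raw IA32_APIC_BASE value. */
static enum apic_mode
apic_mode_from_base(uint64_t apicbase)
{
	int en = (apicbase & APICBASE_ENABLED) != 0;
	int extd = (apicbase & APICBASE_X2APIC) != 0;

	if (en && extd)
		return (APIC_MODE_X2APIC);
	if (en)
		return (APIC_MODE_XAPIC);
	if (extd)
		return (APIC_MODE_INVALID);	/* EN=0 with EXTD=1 is not a valid state */
	return (APIC_MODE_DISABLED);
}
```

In a kernel the value would come from reading the MSR early in boot; here it
is just a function of the raw 64-bit value, so it can be checked on any host.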
It is either an invalid LAPIC ID, a bug in the topo code, or a combination
of both.
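To make the "apic id = ff" / "boot_cpu_id = 255" lines concrete: in xAPIC mode
the local APIC ID is the top byte (bits 31:24) of the LAPIC ID register, while
in x2APIC mode the register holds the full 32-bit ID. A small sketch of that
extraction is below. The helper name lapic_id_from_reg() is hypothetical, and
this only shows how the number is derived, not why it reads as 0xff here.

```c
#include <stdint.h>

#define XAPIC_ID_SHIFT 24	/* xAPIC ID occupies bits 31:24 of the ID register */

/*
 * Derive the local APIC ID from a raw ID register value.
 * Hypothetical helper for illustration only.
 */
static uint32_t
lapic_id_from_reg(uint32_t id_reg, int x2apic_mode)
{
	/* x2APIC exposes the whole 32-bit ID; xAPIC only the top byte. */
	return (x2apic_mode ? id_reg : id_reg >> XAPIC_ID_SHIFT);
}
```

With x2apic_mode = 0, a register value whose top byte is all ones yields 255,
which matches the boot_cpu_id printed just before the trap.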