On Sun, Sep 04, 2016 at 11:19:16AM +0300, Andriy Gapon wrote:
> On 01/09/2016 15:13, Slawa Olhovchenkov wrote:
> > DMAR: Found table at 0x79b32798
> > x2APIC available but disabled by DMAR table
>
> > Event timer "LAPIC" quality 600
> > LAPIC: ipi_wait() us multiplier 1 (r 116268019 tsc 2200043851)
> > ACPI APIC Table: <ALASKA A M I >
> > Package ID shift: 5
> > L3 cache ID shift: 5
> > L2 cache ID shift: 1
> > L1 cache ID shift: 1
> > Core ID shift: 1
> > kernel trap 12 with interrupts disabled
> >
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = ff
>
> > fault virtual address = 0x0
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0xffffffff80537e74
> > stack pointer = 0x28:0xffffffff814b4a60
> > frame pointer = 0x28:0xffffffff814b4a70
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = resume, IOPL = 0
> > current process = 0 ()
> > trap number = 12
> > panic: page fault
> > cpuid = 0
> > KDB: stack backtrace:
> > #0 0xffffffff805272e7 at kdb_backtrace+0x67
> > #1 0xffffffff804dd662 at vpanic+0x182
> > #2 0xffffffff804dd4d3 at panic+0x43
> > #3 0xffffffff807a3791 at trap_fatal+0x351
> > #4 0xffffffff807a3983 at trap_pfault+0x1e3
> > #5 0xffffffff807a2f0c at trap+0x26c
> > #6 0xffffffff80787ca1 at calltrap+0x8
> > #7 0xffffffff8083b52a at topo_probe+0x61a
>
> Interesting. Could you please do 'list *topo_probe+0x61a' in kgdb, so that I
> can see what code is being executed when the trap happens? Also, disassembly of
> the function could be useful as well.
>
> Wait...
> Kostik, I see one strange thing which is common to both successful and
> unsuccessful configurations. All "SMP: Added CPU..." lines have "AP" in them.
> It seems like the platform does not explicitly tell which CPU is the BSP,
> see cpu_add() function. This can break quite a few assumptions. And I am not
> even sure how the successful scenario works.
> Ah... I see that there is a backup code in cpu_mp_start() where boot_cpu_id is
> set based on the current CPU's Local APIC ID. I suspect then that this
> information is incorrect in the failing case.
>
Well, there is no easy way to read the LAPIC Id of the BSP before the LAPICs
are initialized. The BIOS might reprogram LAPIC Ids, so reading
CPUID[1].EBX[31:24] might return incorrect data. It might be even more
incorrect in x2APIC mode, since 8 bits are not enough to hold a 32-bit
x2APIC Id.

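
To make the width problem concrete, here is a minimal sketch in plain C.
It only does the bit extraction on a register value passed in by the
caller; it does not execute CPUID and it is not code from the tree. The
helper names cpuid1_initial_apic_id() and x2apic_id_fits_8bit() are made
up for illustration.

```c
#include <stdint.h>

/*
 * Initial APIC Id as reported by CPUID leaf 1: bits 31:24 of EBX.
 * The field is only 8 bits wide, so it cannot represent Ids above 0xff.
 * 'ebx' is whatever the caller obtained from CPUID[1]; this sketch does
 * not issue the instruction itself.
 */
static inline uint8_t
cpuid1_initial_apic_id(uint32_t ebx)
{
	return ((uint8_t)(ebx >> 24));
}

/*
 * Hypothetical check: returns nonzero if a 32-bit x2APIC Id can be
 * represented in the 8-bit CPUID[1].EBX[31:24] field without loss.
 */
static inline int
x2apic_id_fits_8bit(uint32_t x2apic_id)
{
	return (x2apic_id <= 0xff);
}
```

For example, an x2APIC Id of 0x100 does not fit in 8 bits, so the leaf 1
field alone cannot identify such a CPU, independent of whether the BIOS
has reprogrammed the Ids.
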
> Slawa,
> my guess can be checked by adding a printf to cpu_mp_start() right after
> boot_cpu_id assignment.
>
> > #8 0xffffffff8078fe81 at cpu_mp_start+0x1b1
> > #9 0xffffffff805382ca at mp_start+0x3a
> > #10 0xffffffff80465cd8 at mi_startup+0x118
> > #11 0xffffffff8028dfac at btext+0x2c
> > Uptime: 1s
>
>
> --
> Andriy Gapon