Update:
machdep.idle = hlt with machdep.idle_mwait = 0 failed as well. It lasted
no longer than machdep.idle = mwait, which would normally panic
after a few passes of building gcc. I tried hlt twice; neither run lasted
longer than half an hour.
Meanwhile, another round of building 4 gccs in parallel is about to finish
with machdep.idle = spin and machdep.idle_mwait = 0.
Can I say the Ryzen 2400G probably has issues with both mwait and hlt?
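For anyone who wants to repeat these runs, here is a minimal sketch of how
the three combinations can be generated. It is a dry run only: it prints
the sysctl commands and changes nothing. The function name idle_cmds is my
own invention, and the list of available methods is simply the
machdep.idle_available value from my machine, so treat both as assumptions.

```shell
#!/bin/sh
# Dry-run helper (hypothetical name): print the sysctl commands for one
# idle configuration. Nothing is changed on the system.
#
# $1 = value of machdep.idle_available, e.g. "spin, mwait, hlt, acpi"
# $2 = wanted machdep.idle method
# $3 = wanted machdep.idle_mwait value (0 or 1)
idle_cmds() {
    available=$1
    method=$2
    mwait=$3

    # Turn "spin, mwait, hlt, acpi" into " spin mwait hlt acpi " so that a
    # whole-word match works with a plain case pattern.
    words=" $(printf '%s' "$available" | tr ',' ' ' | tr -s ' ') "
    case "$words" in
        *" $method "*) ;;
        *) echo "idle method '$method' not in: $available" >&2; return 1 ;;
    esac

    case "$mwait" in
        0|1) ;;
        *) echo "machdep.idle_mwait must be 0 or 1" >&2; return 1 ;;
    esac

    echo "sysctl machdep.idle=$method"
    echo "sysctl machdep.idle_mwait=$mwait"
}

# The combination that has been stable here so far.
idle_cmds "spin, mwait, hlt, acpi" spin 0
```

Calling it with hlt or mwait instead of spin gives the two failing
configurations, so each test run starts from the same pair of commands.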
Regards,
meowthink
Fatal trap 12: page fault while in user mode
cpuid = 6; apic id = 06
fault virtual address = 0x819cd0000
fault code = user write data, reserved bits in PTE
instruction pointer = 0x43:0x80195de26
stack pointer = 0x3b:0x7fffffffb0b8
frame pointer = 0x3b:0x7fffffffb100
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 3, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 17888 (ld)
trap number = 12
panic: page fault
cpuid = 6
KDB: stack backtrace:
#0 0xffffffff80b414b7 at kdb_backtrace+0x67
#1 0xffffffff80afa9e7 at vpanic+0x177
#2 0xffffffff80afa863 at panic+0x43
#3 0xffffffff80f7c14f at trap_fatal+0x35f
#4 0xffffffff80f7c1a9 at trap_pfault+0x49
#5 0xffffffff80f7ba10 at trap+0x360
#6 0xffffffff80f5bccc at calltrap+0x8
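
In case someone wants to compare microcode levels, here is a small sketch
that pulls the patch level out of the cpucontrol output quoted further
down. It assumes the last field of the "MSR 0x8b:" line is the patch
level, which matches what I see here; patch_level is a made-up helper
name, not part of cpucontrol.

```shell
#!/bin/sh
# Extract the microcode patch level from one line of
#   cpucontrol -m 0x8b /dev/cpuctl0
# output, e.g. "MSR 0x8b: 0x00000000 0x0810100b".
# Assumption: the last field holds the patch level.
patch_level() {
    # Take the last whitespace-separated field of the line.
    last=$(printf '%s\n' "$1" | awk '{ print $NF }')

    # Refuse anything that does not look like a hex constant.
    case "$last" in
        0x*) ;;
        *) echo "unexpected cpucontrol output: $1" >&2; return 1 ;;
    esac

    # Re-print without leading zeros, e.g. 0x0810100b -> 0x810100b.
    printf '0x%x\n' "$last"
}

patch_level "MSR 0x8b: 0x00000000 0x0810100b"
```

On a live system the line would come from the cpucontrol command itself
instead of a literal string.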
On Tue, Aug 28, 2018 at 11:47 PM Meowthink <meowthink at gmail.com> wrote:
>
> Hi Peeter,
>
> On 8/28/18, karu.pruun <karu.pruun at gmail.com> wrote:
> > On Mon, Aug 27, 2018 at 6:07 PM Meowthink <meowthink at gmail.com> wrote:
> >
> >> >> Unfortunately, that's for Ryzens family 17h model 00h-0fh, whereas my
> >> >> Ryzen 5 2400G's model is 11h.
> >> >>
> >> >> On the microcode. It shall be updated through UEFI/BIOS updates. I
> >> >> think mine is now PinnaclePI-AM4_1.0.0.4 with microcode patchlevel
> >> >> 0x810100b.
> >> >>
> >> >> Seems like ... the only thing I can do is sit down and wait?
> >> >
> >> > The revision
> >> >
> >> > https://svnweb.freebsd.org/base/head/sys/x86/x86/cpu_machdep.c?r1=336763&r2=336762&pathrev=336763
> >> >
> >> > works around the mwait issue, i.e. it sets
> >> >
> >> > sysctl machdep.idle_mwait=0
> >> > sysctl machdep.idle=hlt
> >> >
> >>
> >> I think that shall not apply to 2400G, which is model 11h not 1h.
> >> Here's what I have now:
> >>
> >> machdep.idle: acpi
> >> machdep.idle_available: spin, mwait, hlt, acpi
> >> machdep.idle_apl31: 0
> >> machdep.idle_mwait: 1
> >>
> >> > Now it may or may not relate to your problem, but it appears that
> >> > Ryzen 2400G also has another issue with HLT, see the DragonFly bug
> >> > report
> >> >
> >> > https://bugs.dragonflybsd.org/issues/3131
> >> >
> >>
> >> Thanks a lot for that info.
> >> It's much easier to prove your problem, since it's reproducible. But
> >> mine is too random to catch...
> >> Anyway, it seems like the IRET issue [1] is still not fixed? I highly
> >> doubt that my issue is related to it, because my system became
> >> significantly more stable since I stopped that irq storm from the
> >> bluetooth module, though it still panics occasionally.
> >> So could anybody tell, what's the difference between FreeBSD
> >> workaround [2] and the DragonflyBSD one?
> >>
> >> > which AMD is aware of and is possibly working on, but it may not have
> >> > appeared in the errata yet. The bug report says that until this is
> >> > fixed, the workaround is to also disable HLT in cpu_idle. I am not
> >> > sure what is the correct value for the sysctl on FreeBSD, perhaps
> >> >
> >> > sysctl machdep.idle=0
> >> >
> >> > or some other value?
> >>
> >> In the meantime, I have this microcode
> >>
> >> # cpucontrol -m 0x8b /dev/cpuctl0
> >> MSR 0x8b: 0x00000000 0x0810100b
> >>
> >> Hence I should use mwait?
> >> I still don't know what I should set. Any idea?
> >
> >
> > If I was you, I'd play around with the sysctls mentioned above and see
> > if it helps. Start with disabling both mwait and hlt, perhaps
> >
> > machdep.idle=spin
> > machdep.idle_mwait=0
> >
> > (assuming that 'spin' means hlt will not be used) and then if that does
> > not lead to a panic, try enabling mwait. I can't test 2400G since I
> > don't have it any more. I booted FreeBSD a couple of times but did not
> > run it over long periods of time.
>
> It works!
> After hours and hours of various stress runs, I got 8 copies of gcc
> built without any problem.
>
> But it costs a lot of power and the fan becomes very annoying. So
> I don't think I'll test long-term stability in this state.
>
> machdep.idle: acpi -> spin
> - adds ~5W; maybe some deeper C-states are disabled?
> machdep.idle_mwait: 1 -> 0
> - adds another ~50W; the CPUs never go to sleep.
>
> I tried setting machdep.idle_mwait to 1, or machdep.idle to mwait. Both
> failed with panics once I started building gcc pass by pass.
>
> I'm pretty sure mwait causes problems, as I once experienced a
> panic immediately after issuing the sysctl command (see the 2nd dump
> below).
>
> So my next step will be hlt. Still need some time, though.
>
> >
> > Cheers
> >
> > Peeter
> >
> > --
> >
>
> Cheers,
> meowthink
>
> ------------------------------------------------------------------------
> machdep.idle=mwait
>
> panic: ffs_syncvnode: syncing truncated data.
> cpuid = 7
> KDB: stack backtrace:
> #0 0xffffffff80b414b7 at kdb_backtrace+0x67
> #1 0xffffffff80afa9e7 at vpanic+0x177
> #2 0xffffffff80afa863 at panic+0x43
> #3 0xffffffff80dcddc4 at ffs_syncvnode+0x5a4
> #4 0xffffffff80dcc915 at ffs_fsync+0x25
> #5 0xffffffff810ffcb2 at VOP_FSYNC_APV+0x82
> #6 0xffffffff80bc3a62 at sched_sync+0x412
> #7 0xffffffff80abd813 at fork_exit+0x83
> #8 0xffffffff80f5cc7e at fork_trampoline+0xe
>
> ------------------------------------------------------------------------
> machdep.idle_mwait=1
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 7; apic id = 07
> instruction pointer = 0x20:0xffffffff80e094fe
> stack pointer = 0x0:0xfffffe081e5df9e0
> frame pointer = 0x0:0xfffffe081e5dfa50
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 17 (dom0)
> trap number = 9
> panic: general protection fault
> cpuid = 7
> KDB: stack backtrace:
> #0 0xffffffff80b414b7 at kdb_backtrace+0x67
> #1 0xffffffff80afa9e7 at vpanic+0x177
> #2 0xffffffff80afa863 at panic+0x43
> #3 0xffffffff80f7c14f at trap_fatal+0x35f
> #4 0xffffffff80f7b70e at trap+0x5e
> #5 0xffffffff80f5bccc at calltrap+0x8
> #6 0xffffffff80e07a17 at vm_pageout+0x87
> #7 0xffffffff80abd813 at fork_exit+0x83
> #8 0xffffffff80f5cc7e at fork_trampoline+0xe