So, I got my first lockup in weeks, testing with the latest STABLE and the patch which sets the kernel bits. But I can't say if it's Ryzen related or not.

Meanwhile I also got access to an Epyc server in Azure. I am also running the latest STABLE on that to see how it goes. The interesting thing there is that there appears to be no access to the MSRs. They all appear as zero using cpucontrol. I am not entirely surprised by this, as they are very low level, but I did think they were saved and restored during context switches between virtual machines, so I was hoping to be able to set them. Is this normal?

-pete.
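As background to the question, a guest can normally detect that it is virtualized from the CPUID hypervisor-present bit (leaf 1, ECX bit 31, which FreeBSD calls CPUID2_HV), and under a hypervisor MSR accesses may be intercepted or filtered. The following is only a minimal userland sketch, not kernel code: it gates the chicken-bit writes discussed in this thread on that bit and models the read-modify-write as a plain OR. The MSR numbers, masks and erratum numbers come from the thread; the names `msr_fix`, `ryzen_fixes`, `should_apply_fixes`, `set_bits` and `combined_mask` are hypothetical helpers invented for illustration, and no real MSR is touched.

```c
#include <stddef.h>
#include <stdint.h>

/* CPUID leaf 1, ECX bit 31: set when running under a hypervisor. */
#define HV_PRESENT_BIT	0x80000000u

/* One chicken-bit fix: which MSR, which bits to set, which erratum. */
struct msr_fix {
	uint32_t msr;
	uint64_t mask;
	int erratum;
};

/* Values as quoted in this thread; the table layout itself is illustrative. */
static const struct msr_fix ryzen_fixes[] = {
	{ 0xc0011029, 0x2000ULL, 1021 },
	{ 0xc0011020, 0x10ULL, 1033 },
	{ 0xc0011028, 0x10ULL, 1049 },
	{ 0xc0011020, 0x200000000000000ULL, 1095 },
};

/* Apply only on bare-metal family 17h model 1h, never inside a guest. */
static int
should_apply_fixes(unsigned family, unsigned model, uint32_t cpuid1_ecx)
{
	return (family == 0x17 && model == 0x1 &&
	    (cpuid1_ecx & HV_PRESENT_BIT) == 0);
}

/* Read-modify-write step: keep the existing bits and set the mask bits. */
static uint64_t
set_bits(uint64_t old, uint64_t mask)
{
	return (old | mask);
}

/* OR together every mask in the table that targets the given MSR. */
static uint64_t
combined_mask(uint32_t msr)
{
	uint64_t m = 0;
	size_t i;

	for (i = 0; i < sizeof(ryzen_fixes) / sizeof(ryzen_fixes[0]); i++) {
		if (ryzen_fixes[i].msr == msr)
			m |= ryzen_fixes[i].mask;
	}
	return (m);
}
```

In this sketch, `should_apply_fixes(0x17, 0x1, ecx)` returns 0 whenever the hypervisor bit is set in `ecx`, which would be the expected case for a guest such as an Azure instance, so none of the masks would be applied there.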
Konstantin Belousov
2018-Jul-05 10:31 UTC
Ryzen issues on FreeBSD ? (with sort of workaround)
On Thu, Jul 05, 2018 at 11:13:10AM +0100, Pete French wrote:
> So, I got my first lockup in weeks, testing with the latest STABLE
> and the patch which sets the kernel bits. But I can't say if it's
> Ryzen related or not.
>
> Meanwhile I also got access to an Epyc server in Azure. I am also
> running the latest STABLE on that to see how it goes. The interesting
> thing there is that there appears to be no access to the MSRs.
> They all appear as zero using cpucontrol. I am not entirely surprised
> by this, as they are very low level, but I did think they were saved
> and restored during context switches between virtual machines, so I
> was hoping to be able to set them. Is this normal?

It does not make any sense to even try to access the chicken bits MSRs
when running under virtualization. It is the duty of the hypervisor to
configure the hardware.

I updated the patch.

diff --git a/sys/amd64/amd64/initcpu.c b/sys/amd64/amd64/initcpu.c
index ccc5e64d0c4..bb342f42dec 100644
--- a/sys/amd64/amd64/initcpu.c
+++ b/sys/amd64/amd64/initcpu.c
@@ -130,6 +130,30 @@ init_amd(void)
 		}
 	}
 
+	/* Ryzen erratas. */
+	if (CPUID_TO_FAMILY(cpu_id) == 0x17 && CPUID_TO_MODEL(cpu_id) == 0x1 &&
+	    (cpu_feature2 & CPUID2_HV) == 0) {
+		/* 1021 */
+		msr = rdmsr(0xc0011029);
+		msr |= 0x2000;
+		wrmsr(0xc0011029, msr);
+
+		/* 1033 */
+		msr = rdmsr(0xc0011020);
+		msr |= 0x10;
+		wrmsr(0xc0011020, msr);
+
+		/* 1049 */
+		msr = rdmsr(0xc0011028);
+		msr |= 0x10;
+		wrmsr(0xc0011028, msr);
+
+		/* 1095 */
+		msr = rdmsr(0xc0011020);
+		msr |= 0x200000000000000;
+		wrmsr(0xc0011020, msr);
+	}
+
 	/*
 	 * Work around a problem on Ryzen that is triggered by executing
 	 * code near the top of user memory, in our case the signal