Mike Tancsa
2015-Mar-21 18:13 UTC
RELENG_10 performance regression (was Re: 35-40% performance drop releng9 vs releng10 openvpn
On 3/21/2015 11:52 AM, John Baldwin wrote:
>> http://tancsa.com/time/
>
> Do you know why you are using the HPET instead of TSC for timestamping?

Hi,
I am not consciously making any timekeeping decisions.

kern.eventtimer.choice: HPET(550) HPET1(450) LAPIC(400) i8254(100) RTC(0)
kern.timecounter.choice: TSC(800) HPET(950) ACPI-fast(900) i8254(0) dummy(-1000000)

(The full hardware info is at the above url)

> Using the TSC can make a non-trivial performance difference since userland
> can calculate timestamps without using system calls when it is used.
> (That is not related to this case, but switching to the TSC in general is
> preferable.)
>
> There are a few generations of Intel CPUs where you can't mix deeper sleep
> states with the TSC as timecounter, but those CPUs are getting to be a bit
> older at this point.
>

This one is an AMD:

CPU: AMD G-T40E Processor (1000.02-MHz K8-class CPU)
  Origin="AuthenticAMD" Id=0x500f20 Family=0x14 Model=0x2 Stepping=0
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x802209<SSE3,MON,SSSE3,CX16,POPCNT>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x35ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,IBS,SKINIT,WDT>
  SVM: NP,NRIP,NAsids=8
  TSC: P-state invariant, performance statistics
real memory = 2115297280 (2017 MB)
avail memory = 2018639872 (1925 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <CORE COREBOOT>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
ioapic0 <Version 2.1> irqs 0-23 on motherboard
random: <Software, Yarrow> initialized
module_register_init: MOD_LOAD (vesa, 0xffffffff80d9ddf0, 0) error 19
kbd0 at kbdmux0
acpi0: <CORE COREBOOT> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0

--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike at sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
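The kern.timecounter.choice line above explains why the HPET ends up being used: the kernel prefers the timecounter with the highest quality value, and here HPET(950) outranks both ACPI-fast(900) and TSC(800). The following is a minimal sketch of that comparison in C. The function name best_timecounter and its signature are hypothetical and exist only for this illustration; it is not a FreeBSD API, and it merely parses a string shaped like the sysctl output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not a FreeBSD API. Scans a string shaped like the
 * kern.timecounter.choice sysctl, e.g. "TSC(800) HPET(950)", and copies the
 * name with the highest quality into best. Returns 1 if at least one
 * "name(quality)" entry was parsed, 0 otherwise.
 */
static int
best_timecounter(const char *choice, char *best, size_t bestlen,
    int *best_quality)
{
    char name[32];
    int quality, consumed, found = 0;

    assert(choice != NULL && best != NULL && best_quality != NULL);

    for (;;) {
        consumed = 0;
        /* %n is only stored if the closing ')' matched as well. */
        if (sscanf(choice, " %31[^( ](%d)%n", name, &quality,
            &consumed) != 2 || consumed == 0)
            break;
        if (!found || quality > *best_quality) {
            *best_quality = quality;
            snprintf(best, bestlen, "%s", name);
            found = 1;
        }
        choice += consumed;
    }
    return (found);
}
```

Called with the choice string from this message, the helper selects HPET with quality 950, which matches the behaviour John asked about.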
Konstantin Belousov
2015-Mar-21 18:42 UTC
RELENG_10 performance regression (was Re: 35-40% performance drop releng9 vs releng10 openvpn
On Sat, Mar 21, 2015 at 02:13:06PM -0400, Mike Tancsa wrote:
> On 3/21/2015 11:52 AM, John Baldwin wrote:
>
> >> http://tancsa.com/time/
> >
> > Do you know why you are using the HPET instead of TSC for timestamping?
>
> Hi,
> I am not consciously making any timekeeping decisions.
>
> kern.eventtimer.choice: HPET(550) HPET1(450) LAPIC(400) i8254(100) RTC(0)
> kern.timecounter.choice: TSC(800) HPET(950) ACPI-fast(900) i8254(0)
> dummy(-1000000)
>
> (The full hardware info is at the above url)
>
>
> > Using the TSC can make a non-trivial performance difference since userland
> > can calculate timestamps without using system calls when it is used.
> > (That is not related to this case, but switching to the TSC in general is
> > preferable.)
> >
> > There are a few generations of Intel CPUs where you can't mix deeper sleep
> > states with the TSC as timecounter, but those CPUs are getting to be a bit
> > older at this point.
> >
>
> This one is an AMD
> CPU: AMD G-T40E Processor (1000.02-MHz K8-class CPU)
> Origin="AuthenticAMD" Id=0x500f20 Family=0x14 Model=0x2 Stepping=0
>
> Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> Features2=0x802209<SSE3,MON,SSSE3,CX16,POPCNT>
> AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
> AMD Features2=0x35ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,IBS,SKINIT,WDT>
> SVM: NP,NRIP,NAsids=8
> TSC: P-state invariant, performance statistics

It seems to be a consequence of the code from r222869. test_tsc() does
not trust the P-state invariant report and explicitly checks the CPU
family. Your CPU family is 0x14, while the code only bumps the TSC
priority for family 0x15+.

Currently, tsc_is_invariant is set when the CPU reports
AMDPM_TSC_INVARIANT, or for some models. Should we bump the TSC
timecounter priority if the SMP test passed and AMDPM_TSC_INVARIANT is
set?

For now, you could just set TSC as the timecounter.
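To make the difference between the current behaviour and the proposal concrete, here is a simplified model in C. It is a sketch only and is not the kernel code from r222869: the struct, the function names, and the field names are hypothetical. The base quality of 800 is taken from the sysctl output quoted above; the bumped value of 1000 is an assumed illustrative number, chosen only because it exceeds HPET(950). The sketch also assumes that the existing family 0x15+ path would be kept alongside the proposed check.

```c
#include <assert.h>
#include <stdbool.h>

/* 800 matches TSC(800) in the thread; 1000 is an assumed bumped value. */
enum { TSC_QUALITY_BASE = 800, TSC_QUALITY_BUMPED = 1000 };

/* Hypothetical summary of the facts the decision depends on. */
struct cpu_model {
    unsigned family;            /* CPUID family, e.g. 0x14 */
    bool reports_invariant_tsc; /* stands in for AMDPM_TSC_INVARIANT */
    bool smp_test_passed;       /* stands in for the SMP TSC test result */
};

/* Behaviour as described in the message: only family 0x15+ is bumped. */
static int
tsc_quality_current(const struct cpu_model *cpu)
{
    return (cpu->family >= 0x15 ? TSC_QUALITY_BUMPED : TSC_QUALITY_BASE);
}

/* Proposal: also bump when the SMP test passed and the CPU reports an
 * invariant TSC, regardless of family. */
static int
tsc_quality_proposed(const struct cpu_model *cpu)
{
    if (cpu->family >= 0x15)
        return (TSC_QUALITY_BUMPED);
    if (cpu->smp_test_passed && cpu->reports_invariant_tsc)
        return (TSC_QUALITY_BUMPED);
    return (TSC_QUALITY_BASE);
}
```

For a family 0x14 CPU that reports an invariant TSC, and assuming the SMP test passes on it (the thread does not say whether it does), the current rule yields 800 and the proposed rule yields 1000, which would put the TSC ahead of the HPET. Until something like that lands, the manual workaround is to select TSC through the kern.timecounter.hardware sysctl.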