Matt T. Yourst
2006-Apr-08 22:29 UTC
[Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
Hi,

I previously posted a patch to make the dom0 cpufreq drivers properly use DOM0_MSR (instead of wrmsr) to adjust the processor frequency and voltage. This does indeed adjust the frequency, but Xen seems to have major latency problems whenever the frequency change takes effect, causing various problems like lost mouse events, uncontrolled keyboard repeats, and choppy audio and video playback for a few seconds after the shift.

This appears to happen because virtual timer interrupts do not get delivered on a regular basis for a few seconds following the frequency change.

Assuming we want to keep the cpufreq driver itself in dom0, what's the proper way to notify the hypervisor that the CPU frequency has just changed, so it can adjust its timers like the cpufreq driver on the native kernel does?

I'd really like to have cpufreq working properly under Xen (for both workstations and to a lesser extent servers), so what would be the best way to get this running?

- Matt Yourst

-------------------------------------------------------
Matt T. Yourst yourst@cs.binghamton.edu
Binghamton University, Department of Computer Science
-------------------------------------------------------

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2006-Apr-10 14:01 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On 8 Apr 2006, at 23:29, Matt T. Yourst wrote:

> This appears to happen because virtual timer interrupts do not get
> delivered on a regular basis for a few seconds following the
> frequency change.
>
> Assuming we want to keep the cpufreq driver itself in dom0, what's the
> proper way to notify the hypervisor that the CPU frequency has just
> changed, so it can adjust its timers like the cpufreq driver on the
> native kernel does?
>
> I'd really like to have cpufreq working properly under Xen (for both
> workstations and to a lesser extent servers), so what would be the
> best way to get this running?

The TSC needs recalibrating on the affected CPU. This will require adapting local_time_calibration() in arch/x86/time.c.

 -- Keir
Matt T. Yourst
2006-Apr-11 00:05 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On Monday 10 April 2006 10:01 am, Keir Fraser wrote:

> The TSC needs recalibrating on the affected CPU. This will require
> adapting local_time_calibration() in arch/x86/time.c.

Thanks a lot - I followed your suggestion and added this functionality in the two patches below (and attached).

The first patch (cpufreq-xen-dom0-func.diff) adds a dom0 "setcpufreq" hypercall that sets cpu_khz to the specified value and re-runs local_time_calibration() on any CPU(s) the setcpufreq call specifies.

In the second patch (cpufreq-xen-linux-2.6.16, to be applied to the linux-2.6.hg tree), the Linux cpufreq drivers running in dom0 use this call immediately after writing whichever MSR triggers the frequency/voltage shift, but before triggering a cpufreq callback to adjust the dom0 kernel's own timer and TSC parameters.

Note that I only implemented this for powernow-k8 right now, since that's the only hardware I could test it on. It's very obvious how to adapt it to the other cpufreq drivers, by just replacing rdmsr and wrmsr with the Xen wrapper versions I provided, and adding in the setcpufreq hypercall at the end.

The patches seem to work correctly - I've had virtually no latency issues or temporary freezes after frequency transitions now, although I occasionally still get a mouse freeze and "psmouse: lost synchronization, throwing X bytes away" messages in the syslog. This is on an Athlon 64 laptop stepping between 800/1600/2000 MHz.

I didn't test this on an SMP system yet, but I think it's SMP-safe, but only if local_time_calibration() is reentrant (since it's scheduled once a second and there may be a collision in rare cases). Is this an issue? I can test it on a dual-core Athlon 64 next week if needed (my SMP Xen test machine is unavailable until then).

Someone with a Core Duo should patch speedstep-centrino.c and test it there too.

Try this out and let me know if there are problems.
It doesn't seem to have the issues I was having before, but that's a matter of user experience, so I might be mistaken or just imagining it.

- Matt Yourst

diff -r 886594fa3aef xen/arch/x86/dom0_ops.c
--- a/xen/arch/x86/dom0_ops.c Sat Apr 8 12:10:04 2006 +0100
+++ b/xen/arch/x86/dom0_ops.c Mon Apr 10 19:44:19 2006 -0400
@@ -339,6 +339,13 @@ long arch_do_dom0_op(struct dom0_op *op,
     }
     break;
 
+    case DOM0_SETCPUFREQ:
+    {
+        extern int set_cpu_freq(cpumap_t cpumap, unsigned long khz);
+        ret = set_cpu_freq(op->u.setcpufreq.cpumap, op->u.setcpufreq.khz);
+        break;
+    }
+
     case DOM0_GETMEMLIST:
     {
         int i;
diff -r 886594fa3aef xen/arch/x86/time.c
--- a/xen/arch/x86/time.c Sat Apr 8 12:10:04 2006 +0100
+++ b/xen/arch/x86/time.c Mon Apr 10 19:44:19 2006 -0400
@@ -914,6 +914,60 @@ void __init early_time_init(void)
     setup_irq(0, &irq0);
 }
 
+/*
+ * This is called when the hypervisor is notified of a CPU core
+ * frequency change initiated by a driver in dom0.
+ *
+ * This should be called after the new frequency has stabilized.
+ *
+ * The CPUs specified may include any CPU or core (if cores have
+ * independent PLLs). In an SMP or multi-core system, it may
+ * take a while for the recalibration function to be scheduled
+ * on the intended target CPUs; there is no guarantee this will
+ * happen by the time this call returns.
+ *
+ */
+typedef struct percpu_freq_update {
+    cpumap_t cpumap;
+    unsigned long khz;
+} percpu_freq_update_t;
+
+void set_cpu_freq_percpu(void *p) {
+    percpu_freq_update_t *data = (percpu_freq_update_t*)p;
+    int affected;
+
+    if (!data) {
+        printk(" Adjust freq on cpu %d: no data!\n", smp_processor_id());
+        return;
+    }
+
+    affected = ((data->cpumap & (1 << smp_processor_id())) != 0);
+
+    printk(" Frequency change request on cpu %d: cpumap %08llx, khz %ld (%s)\n",
+           smp_processor_id(), (unsigned long long)data->cpumap, data->khz,
+           (affected ? "adjusting" : "skipping"));
+
+    if (affected) {
+        cpu_khz = data->khz;
+        local_time_calibration(NULL);
+
+        printk(" Recalibrated timers on cpu %d to %ld khz\n",
+               smp_processor_id(), data->khz);
+    }
+}
+
+int set_cpu_freq(cpumap_t cpumap, unsigned long khz) {
+    percpu_freq_update_t freq_update;
+
+    printk("CPU frequency change request: cpumap %08llx, khz %ld",
+           (unsigned long long)cpumap, khz);
+
+    freq_update.cpumap = cpumap;
+    freq_update.khz = khz;
+
+    return on_each_cpu(set_cpu_freq_percpu, &freq_update, 1, 1);
+}
+
 void send_timer_event(struct vcpu *v)
 {
     send_guest_vcpu_virq(v, VIRQ_TIMER);
diff -r 886594fa3aef xen/include/public/dom0_ops.h
--- a/xen/include/public/dom0_ops.h Sat Apr 8 12:10:04 2006 +0100
+++ b/xen/include/public/dom0_ops.h Mon Apr 10 19:44:19 2006 -0400
@@ -283,6 +283,28 @@ typedef struct dom0_getpageframeinfo2 {
     GUEST_HANDLE(ulong) array;
 } dom0_getpageframeinfo2_t;
 DEFINE_GUEST_HANDLE(dom0_getpageframeinfo2_t);
+
+/*
+ * Notify hypervisor of a CPU core frequency change completed
+ * by cpufreq driver in dom0, triggering an internal timer
+ * recalibration.
+ *
+ * This should be called after the new frequency has stabilized.
+ *
+ * The CPUs specified may include any CPU or core (if cores have
+ * independent PLLs). In an SMP or multi-core system, it may
+ * take a while for the recalibration function to be scheduled
+ * on the intended target CPUs; there is no guarantee this will
+ * happen by the time this call returns.
+ *
+ */
+#define DOM0_SETCPUFREQ 30
+typedef struct dom0_setcpufreq {
+    /* IN variables */
+    cpumap_t cpumap;
+    unsigned long khz;
+} dom0_setcpufreq_t;
+DEFINE_GUEST_HANDLE(dom0_setcpufreq_t);
 
 /*
  * Request memory range (@mfn, @mfn+@nr_mfns-1) to have type @type.
@@ -496,6 +518,7 @@ typedef struct dom0_op {
         struct dom0_shadow_control shadow_control;
         struct dom0_setdomainmaxmem setdomainmaxmem;
         struct dom0_getpageframeinfo2 getpageframeinfo2;
+        struct dom0_setcpufreq setcpufreq;
         struct dom0_add_memtype add_memtype;
         struct dom0_del_memtype del_memtype;
         struct dom0_read_memtype read_memtype;

---------------------------------------------------------------

diff -r 640f8b15b9dd arch/i386/kernel/cpu/cpufreq/powernow-k8.c
--- a/arch/i386/kernel/cpu/cpufreq/powernow-k8.c Fri Apr 7 01:32:54 2006 +0100
+++ b/arch/i386/kernel/cpu/cpufreq/powernow-k8.c Mon Apr 10 19:33:49 2006 -0400
@@ -48,6 +48,37 @@
 #define VERSION "version 1.60.0"
 #include "powernow-k8.h"
 
+/* Xen support */
+
+#ifdef CONFIG_XEN_PRIVILEGED_GUEST
+int xen_access_msr(u32 msr, int write, u32* out1, u32* out2, u32 in1, u32 in2) {
+	dom0_op_t op;
+	op.cmd = DOM0_MSR;
+	op.u.msr.write = write;
+	op.u.msr.cpu_mask = 1; /* only first CPU: not clear how to read multiple CPUs */
+	op.u.msr.msr = msr;
+	op.u.msr.in1 = in1;
+	op.u.msr.in2 = in2;
+	BUG_ON(HYPERVISOR_dom0_op(&op));
+
+	if (!write) {
+		*out1 = op.u.msr.out1; /* low 32 bits */
+		*out2 = op.u.msr.out2; /* high 32 bits */
+	}
+
+	return 0;
+}
+
+#define cpu_rdmsr(msr, val1, val2) xen_access_msr((msr), 0, &(val1), &(val2), 0, 0)
+#define cpu_wrmsr(msr, val1, val2) xen_access_msr((msr), 1, NULL, NULL, (val1), (val2))
+
+#else
+
+#define cpu_rdmsr(msr, val1, val2) rdmsr(msr, val1, val2)
+#define cpu_wrmsr(msr, val1, val2) wrmsr(msr, val1, val2)
+
+#endif
+
 /* serialize freq changes */
 static DECLARE_MUTEX(fidvid_sem);
 
@@ -98,7 +129,7 @@ static int pending_bit_stuck(void)
 {
 	u32 lo, hi;
 
-	rdmsr(MSR_FIDVID_STATUS, lo, hi);
+	cpu_rdmsr(MSR_FIDVID_STATUS, lo, hi);
 	return lo & MSR_S_LO_CHANGE_PENDING ? 1 : 0;
 }
 
@@ -116,7 +147,7 @@ static int query_current_values_with_pen
 			dprintk("detected change pending stuck\n");
 			return 1;
 		}
-		rdmsr(MSR_FIDVID_STATUS, lo, hi);
+		cpu_rdmsr(MSR_FIDVID_STATUS, lo, hi);
 	} while (lo & MSR_S_LO_CHANGE_PENDING);
 
 	data->currvid = hi & MSR_S_HI_CURRENT_VID;
@@ -145,13 +176,13 @@ static void fidvid_msr_init(void)
 	u32 lo, hi;
 	u8 fid, vid;
 
-	rdmsr(MSR_FIDVID_STATUS, lo, hi);
+	cpu_rdmsr(MSR_FIDVID_STATUS, lo, hi);
 	vid = hi & MSR_S_HI_CURRENT_VID;
 	fid = lo & MSR_S_LO_CURRENT_FID;
 	lo = fid | (vid << MSR_C_LO_VID_SHIFT);
 	hi = MSR_C_HI_STP_GNT_BENIGN;
 	dprintk("cpu%d, init lo 0x%x, hi 0x%x\n", smp_processor_id(), lo, hi);
-	wrmsr(MSR_FIDVID_CTL, lo, hi);
+	cpu_wrmsr(MSR_FIDVID_CTL, lo, hi);
 }
 
 
@@ -173,7 +204,7 @@ static int write_new_fid(struct powernow
 		fid, lo, data->plllock * PLL_LOCK_CONVERSION);
 
 	do {
-		wrmsr(MSR_FIDVID_CTL, lo, data->plllock * PLL_LOCK_CONVERSION);
+		cpu_wrmsr(MSR_FIDVID_CTL, lo, data->plllock * PLL_LOCK_CONVERSION);
 		if (i++ > 100) {
 			printk(KERN_ERR PFX "internal error - pending bit very stuck - no further pstate changes possible\n");
 			return 1;
@@ -215,7 +246,7 @@ static int write_new_vid(struct powernow
 		vid, lo, STOP_GRANT_5NS);
 
 	do {
-		wrmsr(MSR_FIDVID_CTL, lo, STOP_GRANT_5NS);
+		cpu_wrmsr(MSR_FIDVID_CTL, lo, STOP_GRANT_5NS);
 		if (i++ > 100) {
 			printk(KERN_ERR PFX "internal error - pending bit very stuck - no further pstate changes possible\n");
 			return 1;
@@ -294,7 +325,7 @@ static int core_voltage_pre_transition(s
 		smp_processor_id(),
 		data->currfid, data->currvid, reqvid, data->rvo);
 
-	rdmsr(MSR_FIDVID_STATUS, lo, maxvid);
+	cpu_rdmsr(MSR_FIDVID_STATUS, lo, maxvid);
 	maxvid = 0x1f & (maxvid >> 16);
 	dprintk("ph1 maxvid=0x%x\n", maxvid);
 	if (reqvid < maxvid) /* lower numbers are higher voltages */
@@ -892,6 +923,19 @@ static int transition_frequency(struct p
 
 	res = transition_fid_vid(data, fid, vid);
 
+#ifdef CONFIG_XEN_PRIVILEGED_GUEST
+	{
+		dom0_op_t op;
+		int rc;
+		// printk("powernow-k8: notifying Xen of transition to %d khz on cpu %d\n", freqs.new, freqs.cpu);
+		op.cmd = DOM0_SETCPUFREQ;
+		op.u.setcpufreq.cpumap = (1 << freqs.cpu);
+		op.u.setcpufreq.khz = freqs.new;
+		rc = HYPERVISOR_dom0_op(&op);
+		// printk("powernow-k8: notified Xen of transition to %d khz on cpu %d (rc %d)\n", freqs.new, freqs.cpu, rc);
+	}
+#endif
+
 	freqs.new = find_khz_freq_from_fid(data->currfid);
 	for_each_cpu_mask(i, cpu_core_map[data->cpu]) {
 		freqs.cpu = i;
diff -r 640f8b15b9dd include/xen/interface/dom0_ops.h
--- a/include/xen/interface/dom0_ops.h Fri Apr 7 01:32:54 2006 +0100
+++ b/include/xen/interface/dom0_ops.h Mon Apr 10 19:33:49 2006 -0400
@@ -283,6 +283,28 @@ typedef struct dom0_getpageframeinfo2 {
     GUEST_HANDLE(ulong) array;
 } dom0_getpageframeinfo2_t;
 DEFINE_GUEST_HANDLE(dom0_getpageframeinfo2_t);
+
+/*
+ * Notify hypervisor of a CPU core frequency change completed
+ * by cpufreq driver in dom0, triggering an internal timer
+ * recalibration.
+ *
+ * This should be called after the new frequency has stabilized.
+ *
+ * The CPUs specified may include any CPU or core (if cores have
+ * independent PLLs). In an SMP or multi-core system, it may
+ * take a while for the recalibration function to be scheduled
+ * on the intended target CPUs; there is no guarantee this will
+ * happen by the time this call returns.
+ *
+ */
+#define DOM0_SETCPUFREQ 30
+typedef struct dom0_setcpufreq {
+    /* IN variables */
+    cpumap_t cpumap;
+    unsigned long khz;
+} dom0_setcpufreq_t;
+DEFINE_GUEST_HANDLE(dom0_setcpufreq_t);
 
 /*
  * Request memory range (@mfn, @mfn+@nr_mfns-1) to have type @type.
@@ -496,6 +518,7 @@ typedef struct dom0_op {
         struct dom0_shadow_control shadow_control;
         struct dom0_setdomainmaxmem setdomainmaxmem;
         struct dom0_getpageframeinfo2 getpageframeinfo2;
+        struct dom0_setcpufreq setcpufreq;
         struct dom0_add_memtype add_memtype;
         struct dom0_del_memtype del_memtype;
         struct dom0_read_memtype read_memtype;
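A side note on the CPU mask handling in the first patch: both set_cpu_freq_percpu() and the powernow-k8 hunk build the mask as `1 << cpu`, which is an `int` shift. The patch prints cpumap with `%08llx`, which suggests the map may be 64 bits wide; if so, CPU numbers of 31 and above would not be represented correctly. The following is only a sketch of a width-safe test under that assumption. The names `cpumap64_t`, `cpu_bit()` and `cpu_in_map()` are illustrative and are not part of Xen's interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the CPU map is a 64-bit bitmask, one bit per CPU. */
typedef uint64_t cpumap64_t;

/* Build the single-bit mask in the map's own width, so the shift is
 * never carried out in (possibly 32-bit) int arithmetic. CPU numbers
 * that do not fit in the map yield an empty mask. */
static inline cpumap64_t cpu_bit(unsigned int cpu)
{
    return (cpu < 64) ? ((cpumap64_t)1 << cpu) : 0;
}

/* True if `cpu` is selected in `map`. */
static inline bool cpu_in_map(cpumap64_t map, unsigned int cpu)
{
    return (map & cpu_bit(cpu)) != 0;
}
```

With helpers like these, the per-CPU callback would test `cpu_in_map(data->cpumap, smp_processor_id())` and the driver would pass `cpu_bit(freqs.cpu)`.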
Keir Fraser
2006-Apr-11 08:25 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On 11 Apr 2006, at 01:05, Matt T. Yourst wrote:

> Note that I only implemented this for powernow-k8 right now, since
> that's the only hardware I could test it on. It's very obvious how to
> adapt it to the other cpufreq drivers, by just replacing rdmsr and
> wrmsr with the Xen wrapper versions I provided, and adding in the
> setcpufreq hypercall at the end.

All this stuff should be done by emulating the MSR writes in emulate_privileged_op() in arch/x86/traps.c. This will avoid any modification of Linux at all. Currently there's only simple filtering of MSR write attempts, but picking up on cpu-freq MSR accesses on e.g., AMD systems and also resync'ing the local clock would not be difficult.

 -- Keir
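The interception idea described above can be sketched as a small classification step: given the MSR index of a trapped write and whether the writing domain is privileged, decide what the emulator should do. This is an illustration only, not the code in traps.c; `classify_msr_write()` and the `msr_write_action` values are hypothetical names, and the MSR index is the K8 FID/VID control register used by the powernow-k8 driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* K8 FID/VID control MSR index, as used by the powernow-k8 driver. */
#define MSR_FIDVID_CTL 0xc0010041u

/* Possible outcomes of intercepting a guest WRMSR (hypothetical names). */
enum msr_write_action {
    MSR_WRITE_IGNORE,        /* drop the write without touching hardware */
    MSR_WRITE_PASS_AND_SYNC, /* perform the write, then resync the clock */
    MSR_WRITE_DEFAULT        /* fall through to the generic write filter */
};

static enum msr_write_action classify_msr_write(uint32_t msr, bool privileged)
{
    switch (msr) {
    case MSR_FIDVID_CTL:
        /* Only the privileged domain may change frequency or voltage. */
        return privileged ? MSR_WRITE_PASS_AND_SYNC : MSR_WRITE_IGNORE;
    default:
        return MSR_WRITE_DEFAULT;
    }
}
```

Keeping the decision separate from the side effects makes it easy to add further frequency-control MSRs (for example for Intel parts) as extra `case` labels later.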
Matt T. Yourst
2006-Apr-11 20:11 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On Tuesday 11 April 2006 04:25 am, Keir Fraser wrote:

> All this stuff should be done by emulating the MSR writes in
> emulate_privileged_op() in arch/x86/traps.c. This will avoid any
> modification of Linux at all. Currently there's only simple filtering
> of MSR write attempts, but picking up on cpu-freq MSR accesses on e.g.,
> AMD systems and also resync'ing the local clock would not be difficult.

I'll give this a try - it should be much cleaner than my current method and may avoid some race issues (see below).

We may need to duplicate some work done by the dom0 kernel and read out the frequency/voltage tables, since the MSR write on AMD chips (and probably Intel) does not specify an actual "XXX MHz" number - it's just an index into the FID/VID table. I'll investigate this, but I need someone with an Intel chip to do the same, since I have no way of testing those MSRs.

I'm still getting some issues with the timer not properly re-syncing even though the cpufreq driver makes the new setcpufreq hypercall, which Xen properly receives.

Right now it just sets cpu_khz and calls local_time_calibration() on the target CPU(s). Is there something else we need to do, like calling the equivalent of early_time_init() or init_xen_time() all over again?

The problem is much less severe now but it still sometimes happens.

- Matt
Matt T. Yourst
2006-Apr-12 02:49 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On Tuesday 11 April 2006 04:25 am, Keir Fraser wrote:

> On 11 Apr 2006, at 01:05, Matt T. Yourst wrote:
> > Note that I only implemented this for powernow-k8 right now, since
> > that's the only hardware I could test it on. It's very obvious how to
> > adapt it to the other cpufreq drivers, by just replacing rdmsr and
> > wrmsr with the Xen wrapper versions I provided, and adding in the
> > setcpufreq hypercall at the end.
>
> All this stuff should be done by emulating the MSR writes in
> emulate_privileged_op() in arch/x86/traps.c. This will avoid any
> modification of Linux at all. Currently there's only simple filtering
> of MSR write attempts, but picking up on cpu-freq MSR accesses on e.g.,
> AMD systems and also resync'ing the local clock would not be difficult.

Here's a patch that does just that, without modifying the guest kernel. It seems to work correctly on my machine, and doesn't freeze up or lose interrupts like before (at least no log messages or visible latency issues).

There was one case where the keyboard response got extremely slow, but everything else seemed to continue working (i.e. video played in the background without stalls). Restarting powersaved (the SuSE daemon that drives cpufreq) seemed to restore normal performance, so maybe it just needed to be kicked between frequencies to resync the keyboard timing.

Could you please look over the code and make sure there's nothing I missed that could cause it to be unstable in corner cases? Locking may need to be added since I haven't tested this on an SMP system and I don't know how Xen would cope with frequency changes here.

I can add support for the Intel cpufreq drivers (speedstep and centrino), but someone else with the appropriate hardware will have to test it if I do.
- Matt

diff -r 886594fa3aef xen/arch/x86/time.c
--- a/xen/arch/x86/time.c Sat Apr 8 12:10:04 2006 +0100
+++ b/xen/arch/x86/time.c Tue Apr 11 22:38:41 2006 -0400
@@ -914,6 +915,144 @@ void __init early_time_init(void)
     setup_irq(0, &irq0);
 }
 
+/*
+ * Frequency Scaling Support
+ *
+ * These functions are called from emulate_privileged_op
+ * in response to the MSR writes that control core frequency
+ * and voltage on various CPU types.
+ *
+ * We identify only those writes that alter the frequency
+ * itself (i.e. between raising or lowering the voltage
+ * appropriately) and make sure that the requested frequency
+ * is different from the current frequency. In this case
+ * we read the appropriate status MSR until the frequency
+ * stabilizes, then recalibrate all hypervisor timing
+ * variables to the new frequency as indicated in the MSR.
+ *
+ * The frequency change is effective on the CPU this code
+ * is called on: it's the responsibility of the guest OS
+ * to only write the virtual MSR on the target CPU context.
+ *
+ * No modifications to the guest OS cpufreq drivers are
+ * needed as long as support is provided below for the
+ * corresponding CPU type.
+ */
+
+/*
+ * AMD Athlon 64 / Opteron Support (from powernow-k8 driver):
+ */
+
+/*
+ * According to the AMD manuals, the following formula
+ * always converts an FID to the actual frequency,
+ * based on increments of 100 MHz (200 MHz steps):
+ *
+ *   mhz = 800 + 100*fid
+ *
+ * Technically the BIOS is supposed to provide this
+ * table (so matching voltages can be found), but
+ * the frequency part is fixed for all K8 cores,
+ * so we just hard code the following formula:
+ */
+static inline int k8_fid_to_mhz(int fid) {
+    return 800 + 100*fid;
+}
+
+int handle_k8_fidvid_status_msr_read(u32* lo, u32* hi) {
+    /* This will return -1 if the processor isn't a K8: */
+    return rdmsr_safe(MSR_FIDVID_STATUS, *lo, *hi);
+}
+
+static int k8_fidvid_wait(void) {
+    u32 lo, hi;
+    u32 i = 0;
+
+    DPRINTK("k8_fidvid_wait: waiting for frequency and voltage to stabilize...");
+
+    do {
+        if (i++ > 10000) {
+            printk("k8_vidfid_wait: Excessive wait time for vid/fid to stabilize\n");
+            return -1;
+        }
+        rdmsr_safe(MSR_FIDVID_STATUS, lo, hi);
+    } while (lo & MSR_S_LO_CHANGE_PENDING);
+
+    DPRINTK("OK: new fid %d\n", lo & MSR_S_LO_CHANGE_PENDING);
+
+    return lo & MSR_S_LO_CURRENT_FID;
+}
+
+#if 0
+#undef DPRINTK
+#define DPRINTK printk
+#endif
+
+int handle_k8_fidvid_ctl_msr_write(u32 lo, u32 hi) {
+    int rc;
+    u32 oldlo, oldhi;
+    int oldfid, newfid;
+    int mhz;
+    unsigned int cpu = smp_processor_id();
+    // unsigned long flags;
+    s_time_t now;
+
+    DPRINTK("fidvid_ctl: requested msr write 0x%08x:0x%08x\n", hi, lo);
+
+    rc = rdmsr_safe(MSR_FIDVID_STATUS, oldlo, oldhi);
+    /* This will return -1 if the processor isn't a K8: */
+    if (rc) return rc;
+
+    oldfid = (oldlo & MSR_S_LO_CURRENT_FID);
+    newfid = (lo & MSR_C_LO_NEW_FID);
+
+    if (oldfid != newfid) {
+        DPRINTK("fidvid_ctl: moving from old fid %d to new fid %d\n", oldfid, newfid);
+    } else {
+        DPRINTK("fidvid_ctl: same fid %d\n", oldfid);
+    }
+
+    DPRINTK("fidvid_ctl: writing MSR 0x%08x with 0x%08x:0x%08x...\n", MSR_FIDVID_CTL, hi, lo);
+
+    rc = wrmsr_safe(MSR_FIDVID_CTL, lo, hi);
+    if (rc) return rc;
+
+    if (oldfid == newfid) return 0;
+
+    /* Only do the stabilization wait if we're changing the frequency */
+    /* For voltage changes, the OS will do this itself */
+
+    newfid = k8_fidvid_wait();
+    /* excessive wait? abort the change and let guest kernel figure it out */
+    if (newfid < 0) return 0;
+
+    DPRINTK("fidvid_ctl: recalibrating TSC...");
+
+    mhz = k8_fid_to_mhz(newfid);
+    DPRINTK("%d MHz\n", mhz);
+
+    cpu_khz = mhz * 1000;
+    set_time_scale(&cpu_time[smp_processor_id()].tsc_scale, mhz * 1000000);
+
+    DPRINTK("fidvid_ctl: resetting timestamps...");
+
+    rdtscll(cpu_time[cpu].local_tsc_stamp);
+    now = read_platform_stime();
+
+    cpu_time[cpu].stime_master_stamp = now;
+    cpu_time[cpu].stime_local_stamp = now;
+
+    DPRINTK("OK\n");
+
+    DPRINTK("fidvid_ctl: recalibrating timers...");
+
+    local_time_calibration(NULL);
+    __update_vcpu_system_time(current);
+    DPRINTK("OK\n");
+
+    return 0;
+}
+
 void send_timer_event(struct vcpu *v)
 {
     send_guest_vcpu_virq(v, VIRQ_TIMER);
diff -r 886594fa3aef xen/arch/x86/traps.c
--- a/xen/arch/x86/traps.c Sat Apr 8 12:10:04 2006 +0100
+++ b/xen/arch/x86/traps.c Tue Apr 11 22:38:41 2006 -0400
@@ -1131,6 +1131,16 @@ static int emulate_privileged_op(struct
                 ((u64)regs->edx << 32) | regs->eax;
             break;
 #endif
+        case MSR_FIDVID_CTL: {
+            extern int handle_k8_fidvid_ctl_msr_write(u32 lo, u32 hi);
+            /* domU is never allowed to mess with core frequencies and voltages */
+            if (!IS_PRIV(current->domain))
+                break;
+            if (handle_k8_fidvid_ctl_msr_write(regs->eax, regs->edx))
+                goto fail;
+            break;
+        }
+
         default:
             if ( (rdmsr_safe(regs->ecx, l, h) != 0) ||
                  (regs->eax != l) || (regs->edx != h) )
@@ -1162,6 +1172,14 @@ static int emulate_privileged_op(struct
             if ( rdmsr_safe(regs->ecx, regs->eax, regs->edx) )
                 goto fail;
             break;
+
+        case MSR_FIDVID_STATUS: {
+            extern int handle_k8_fidvid_status_msr_read(u32* lo, u32* hi);
+            if (handle_k8_fidvid_status_msr_read((u32*)&regs->eax, (u32*)&regs->edx))
+                goto fail;
+            break;
+        }
+
         default:
             /* Everyone can read the MSR space. */
             /*DPRINTK("Domain attempted RDMSR %p.\n", _p(regs->ecx));*/
diff -r 886594fa3aef xen/include/asm-x86/msr.h
--- a/xen/include/asm-x86/msr.h Sat Apr 8 12:10:04 2006 +0100
+++ b/xen/include/asm-x86/msr.h Tue Apr 11 22:38:41 2006 -0400
@@ -137,6 +137,37 @@ static inline void wrmsrl(unsigned int m
 #define EFER_LMA (1<<_EFER_LMA)
 #define EFER_NX (1<<_EFER_NX)
 #define EFER_SVME (1<<_EFER_SVME)
+
+/* Model Specific Registers for K8 p-state transitions. MSRs are 64-bit. For */
+/* writes (wrmsr - opcode 0f 30), the register number is placed in ecx, and */
+/* the value to write is placed in edx:eax. For reads (rdmsr - opcode 0f 32), */
+/* the register number is placed in ecx, and the data is returned in edx:eax. */
+
+#define MSR_FIDVID_CTL 0xc0010041
+#define MSR_FIDVID_STATUS 0xc0010042
+
+/* Field definitions within the FID VID Low Control MSR : */
+#define MSR_C_LO_INIT_FID_VID 0x00010000
+#define MSR_C_LO_NEW_VID 0x00003f00
+#define MSR_C_LO_NEW_FID 0x0000003f
+#define MSR_C_LO_VID_SHIFT 8
+
+/* Field definitions within the FID VID High Control MSR : */
+#define MSR_C_HI_STP_GNT_TO 0x000fffff
+
+/* Field definitions within the FID VID Low Status MSR : */
+#define MSR_S_LO_CHANGE_PENDING 0x80000000 /* cleared when completed */
+#define MSR_S_LO_MAX_RAMP_VID 0x3f000000
+#define MSR_S_LO_MAX_FID 0x003f0000
+#define MSR_S_LO_START_FID 0x00003f00
+#define MSR_S_LO_CURRENT_FID 0x0000003f
+
+/* Field definitions within the FID VID High Status MSR : */
+#define MSR_S_HI_MIN_WORKING_VID 0x3f000000
+#define MSR_S_HI_MAX_WORKING_VID 0x003f0000
+#define MSR_S_HI_START_VID 0x00003f00
+#define MSR_S_HI_CURRENT_VID 0x0000003f
+#define MSR_C_HI_STP_GNT_BENIGN 0x00000001
 
 /* Intel MSRs. Some also available on other CPUs */
 #define MSR_IA32_PLATFORM_ID 0x17
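To make the FID handling in the patch above easier to follow, here is a small standalone sketch of decoding the low word of the FID/VID status MSR into a frequency. It reuses the mask values and the `800 + 100*fid` formula from the patch; `k8_status_lo_to_mhz()` is an illustrative helper that does not appear in the patch, and returning -1 while a change is pending is a choice made for this sketch.

```c
#include <stdint.h>

/* Field masks for the low word of the FID/VID status MSR (from the patch). */
#define MSR_S_LO_CHANGE_PENDING 0x80000000u
#define MSR_S_LO_CURRENT_FID    0x0000003fu

/* Formula hard-coded in the patch: mhz = 800 + 100*fid. */
static inline int k8_fid_to_mhz(int fid)
{
    return 800 + 100 * fid;
}

/* Decode the current frequency from the status MSR low word.
 * Returns -1 while a transition is still pending, since the FID
 * field is not yet meaningful in that case. */
static int k8_status_lo_to_mhz(uint32_t status_lo)
{
    if (status_lo & MSR_S_LO_CHANGE_PENDING)
        return -1;
    return k8_fid_to_mhz((int)(status_lo & MSR_S_LO_CURRENT_FID));
}
```

Under this formula, the 800/1600/2000 MHz steps mentioned earlier in the thread would correspond to FID values 0, 8 and 12.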
Keir Fraser
2006-Apr-12 05:47 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On 11 Apr 2006, at 21:11, Matt T. Yourst wrote:

> I'm still getting some issues with the timer not properly re-syncing
> even though the cpufreq driver makes the new setcpufreq hypercall,
> which Xen properly receives.
>
> Right now it just sets cpu_khz and calls local_time_calibration() on
> the target CPU(s). Is there something else we need to do, like calling
> the equivalent of early_time_init() or init_xen_time() all over again?
>
> The problem is much less severe now but it still sometimes happens.

local_time_calibration() won't really do the right thing. It will calibrate for the observed TSC rate over the last few seconds, most of which will have passed at the old TSC rate. The simplest fix would simply be to multiply the calculated TSC scale factor by new_mhz/old_mhz.

 -- Keir
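As a worked illustration of correcting a calibrated scale factor by the frequency ratio, the sketch below rescales a fixed-point factor when the clock changes. Which way round the ratio goes depends on how the factor is stored: for a factor that converts ticks to nanoseconds (nanoseconds per tick), a higher frequency means a smaller factor, so the multiplier is old_mhz/new_mhz, while a ticks-per-nanosecond factor would use the inverse. The 32.32 fixed-point representation and the function names here are assumptions made for the example, not Xen's data structures.

```c
#include <stdint.h>

/* Nanoseconds per tick at `mhz`, as a 32.32 fixed-point value.
 * (Assumed representation, chosen only for this illustration.) */
static uint64_t ns_per_tick_fp(uint32_t mhz)
{
    return mhz ? ((uint64_t)1000 << 32) / mhz : 0;
}

/* Rescale a ticks-to-nanoseconds factor after a frequency change.
 * A faster clock means fewer nanoseconds per tick, so the factor is
 * multiplied by old_mhz/new_mhz. The multiply is done first to keep
 * precision; it fits in 64 bits for realistic MHz values. */
static uint64_t rescale_ns_per_tick(uint64_t scale_fp,
                                    uint32_t old_mhz, uint32_t new_mhz)
{
    if (new_mhz == 0)
        return scale_fp; /* nothing sensible to do; leave unchanged */
    return scale_fp * old_mhz / new_mhz;
}
```

For example, going from 800 MHz to 2000 MHz takes the factor from 1.25 ns per tick to 0.5 ns per tick, the same value that computing it from scratch at 2000 MHz would give.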
Matt T. Yourst
2006-Apr-13 02:31 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On Wednesday 12 April 2006 01:47 am, you wrote:

> local_time_calibration() won't really do the right thing. It will
> calibrate for the observed TSC rate over the last few seconds, most of
> which will have passed at the old TSC rate. The simplest fix would
> simply be to multiply the calculated TSC scale factor by
> new_mhz/old_mhz.

Doesn't set_time_scale() do exactly this by directly using the supplied ticks_per_sec value? Why does it need to be scaled based on the old frequency when the values can just be recalculated?

I removed the call to local_time_calibration() and it still works just fine, so it looks like set_time_scale() overrides the other settings anyway.

I'm still having a strange issue (only in X sessions it appears) where the key repeat rate and response time gets very slow after a frequency shift. Everything else responds normally except for the keyboard, and that problem goes away after a minute or two. Any idea what's going on?

- Matt
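The point about recomputing the scale directly from the new tick rate can be sketched as follows. This is a standalone illustration of the general technique (a power-of-two shift plus a 32-bit fractional multiplier), not a copy of Xen's set_time_scale(); the struct and function names are made up for the example.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ull

/* Illustrative scale: ns = ((ticks shifted by `shift`) * mul_frac) >> 32. */
struct time_scale_sketch {
    int shift;         /* > 0: shift ticks left, < 0: shift ticks right */
    uint32_t mul_frac; /* 0.32 fixed-point multiplier, in [0.5, 1) */
};

/* Derive the scale purely from the tick rate; no history is needed. */
static void time_scale_from_hz(struct time_scale_sketch *ts,
                               uint64_t ticks_per_sec)
{
    uint64_t tps = ticks_per_sec;
    int shift = 0;

    if (tps == 0) { /* guard: avoid looping forever and dividing by 0 */
        ts->shift = 0;
        ts->mul_frac = 0;
        return;
    }

    /* Normalise tps into (1e9, 2e9] so that 1e9/tps lies in [0.5, 1). */
    while (tps > 2 * NS_PER_SEC) { tps >>= 1; shift--; }
    while (tps <= NS_PER_SEC)    { tps <<= 1; shift++; }

    ts->shift = shift;
    ts->mul_frac = (uint32_t)((NS_PER_SEC << 32) / tps);
}

/* Convert a tick delta to nanoseconds using the scale. The 64x32-bit
 * multiply is split into halves so it cannot overflow 64 bits. */
static uint64_t scale_ticks(const struct time_scale_sketch *ts, uint64_t ticks)
{
    if (ts->shift < 0)
        ticks >>= -ts->shift;
    else
        ticks <<= ts->shift;

    return (ticks >> 32) * ts->mul_frac
         + (((ticks & 0xffffffffu) * ts->mul_frac) >> 32);
}
```

Because the result depends only on the new tick rate, calling such a function after a frequency change replaces any previously calibrated value outright, which is the behaviour described in the message above.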
Keir Fraser
2006-Apr-13 10:44 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On 13 Apr 2006, at 03:31, Matt T. Yourst wrote:

> Doesn't set_time_scale() do exactly this by directly using the supplied
> ticks_per_sec value? Why does it need to be scaled based on the old
> frequency when the values can just be recalculated?
>
> I removed the call to local_time_calibration() and it still works just
> fine, so it looks like set_time_scale() overrides the other settings
> anyway.

Ah yes, looking closer at your code I think you are doing the right thing and the local_time_calibration() call should be removed.

> I'm still having a strange issue (only in X sessions it appears) where
> the key repeat rate and response time gets very slow after a frequency
> shift. Everything else responds normally except for the keyboard, and
> that problem goes away after a minute or two. Any idea what's going on?

Any idea if fiddling with CPU frequency affects the local APIC bus frequency? I would guess probably not...

Do you really mean the problem continues for a few *minutes* before 'fixing itself'?

 -- Keir
Matt T. Yourst
2006-Apr-13 18:28 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On Thursday 13 April 2006 06:44 am, you wrote:

> Ah yes, looking closer at your code I think you are doing the right
> thing and the local_time_calibration() call should be removed.

OK, so do you think my patch (minus local_time_calibration()) would have a chance of getting merged? I can add basic support for non-AMD MSRs if you want, but someone else with a machine actually using the speedstep and/or centrino cpufreq drivers will need to test it.

> Any idea if fiddling with CPU frequency affects the local APIC bus
> frequency? I would guess probably not...

I think it's just the core frequency - everything else is on separate PLLs, at least on K8 chips. Maybe Intel is different.

It will be interesting doing this on dual core, since I've had problems with cpufreq on a dual-core Athlon 64 X2 box. Each core can technically be adjusted independently, but the userspace cpufreq programs (cpufreqd, powersaved, etc) apparently do not correctly measure the system load, i.e. 50% total CPU load runs both cores at half their maximum frequency, while ideally it should run one at full speed while leaving the other core idle. This may be a problem with the load measurement algorithm rather than a kernel issue - as long as the MSRs are written correctly, Xen will follow along with whatever the dom0 kernel decides to do.

> Do you really mean the problem continues for a few *minutes* before
> 'fixing itself'?

That's correct - it can sometimes be minutes before the keyboard returns to normal. The frequency shift causes this, but I don't understand why it affects only the keyboard, only in X, and why it goes back to normal.

Maybe there is something in the X server that instantaneously reads a bogus time just at the frequency transition point, and locks on to that erroneously slow timer until the same corner case occurs a second time? I have no idea. If someone else could reproduce this with the patch, it might help solve it. It's extremely rare, so it might be difficult to reproduce.

- Matt
Keir Fraser
2006-Apr-14 06:44 UTC
Re: [Xen-devel] Xen cpufreq support status: how to notify hypervisor of frequency change?
On 13 Apr 2006, at 19:28, Matt T. Yourst wrote:

>> Ah yes, looking closer at your code I think you are doing the right
>> thing and the local_time_calibration() call should be removed.
>
> OK, so do you think my patch (minus local_time_calibration()) would
> have a chance of getting merged? I can add basic support for non-AMD
> MSRs if you want, but someone else with a machine actually using the
> speedstep and/or centrino cpufreq drivers will need to test it.

Yes. There are cleanups I'd like to do but it probably makes sense to apply your patch as is and then clean up incrementally after that (move to an AMD-specific file; more generic interface for registering interest in MSRs; etc). Right now you should at least check you are running on an AMD processor before processing the MSR accesses.

 -- Keir
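The vendor check requested above could look roughly like the sketch below, which compares the three registers returned by CPUID leaf 0 against the "AuthenticAMD" vendor string. The helper name is hypothetical, and hypervisor code would more likely consult vendor information it has already cached at boot than re-run CPUID; the sketch only shows the comparison itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 0 returns the 12-byte vendor string in EBX, EDX, ECX
 * (in that order), four ASCII characters per register, low byte first. */
#define VENDOR_AMD_EBX 0x68747541u /* "Auth" */
#define VENDOR_AMD_EDX 0x69746e65u /* "enti" */
#define VENDOR_AMD_ECX 0x444d4163u /* "cAMD" */

/* True if the leaf-0 register values spell "AuthenticAMD". */
static bool cpuid_vendor_is_amd(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    return ebx == VENDOR_AMD_EBX &&
           edx == VENDOR_AMD_EDX &&
           ecx == VENDOR_AMD_ECX;
}
```

A handler for the K8 FID/VID MSRs would return early (leaving the access to the generic path) whenever such a check fails.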