Jeremy Fitzhardinge
2009-Oct-08 22:58 UTC
[Xen-devel] [PATCH v2] Allow guests to register secondary vcpu_time_info
Allow a guest to register a second location for the VCPU time info structure for each vcpu. This is intended to allow the guest kernel to map this information into a usermode accessible page, so that usermode can efficiently calculate system time from the TSC without having to make a syscall. The second vcpu_time_info structure is updated by copy, rather than being a shared page between the guest and Xen. It is not entirely updated by copy; instead, Xen preserves and increments the existing version number in place. This allows the guest to also update the version number (useful to indicate vcpu context switches to usermode). This assumes that the guest will only ever update the structure for a given vcpu on that vcpu (as Xen does, so there are never any cross-cpu accesses). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> diff -r 10cfcbef68ee xen/arch/x86/domain.c --- a/xen/arch/x86/domain.c Tue Oct 06 10:11:53 2009 +0100 +++ b/xen/arch/x86/domain.c Tue Oct 06 19:23:52 2009 -0700 @@ -964,6 +964,24 @@ break; } + case VCPUOP_register_vcpu_time_memory_area: + { + struct vcpu_register_time_memory_area area; + rc = -EFAULT; + if ( copy_from_guest(&area, arg, 1) ) + break; + + if ( !guest_handle_okay(area.addr.h, 1) ) + break; + + rc = 0; + v->time_info_guest = area.addr.h; + + __update_vcpu_system_time(v, 1); + + break; + } + case VCPUOP_get_physid: { struct vcpu_get_physid cpu_id; diff -r 10cfcbef68ee xen/arch/x86/time.c --- a/xen/arch/x86/time.c Tue Oct 06 10:11:53 2009 +0100 +++ b/xen/arch/x86/time.c Tue Oct 06 19:23:52 2009 -0700 @@ -22,6 +22,7 @@ #include <xen/irq.h> #include <xen/softirq.h> #include <xen/keyhandler.h> +#include <xen/guest_access.h> #include <asm/io.h> #include <asm/msr.h> #include <asm/mpspec.h> @@ -826,7 +827,30 @@ (*version)++; } -void update_vcpu_system_time(struct vcpu *v) +static void update_guest_time_info(struct vcpu *v, struct vcpu_time_info *u) +{ + struct vcpu_time_info info; + + if ( guest_handle_is_null(v->time_info_guest) ) + return; + + info = *u; + + /* + * Update the guest copy of the time info. We need to make sure + * we update the guest''s version of the version number rather than + * use a verbtim copy of the master one, because the guest may + * update the version for its own purposes. + */ + if ( __copy_field_from_guest(&info, v->time_info_guest, version) ) + return; + + info.version = (info.version + 2) & ~1; + + __copy_to_guest(v->time_info_guest, &info, 1); +} + +void __update_vcpu_system_time(struct vcpu *v, int force) { struct cpu_time *t; struct vcpu_time_info *u; @@ -839,7 +863,7 @@ if ( v->domain->arch.vtsc ) { - if ( u->tsc_timestamp == t->stime_local_stamp ) + if ( !force && u->tsc_timestamp == t->stime_local_stamp ) return; version_update_begin(&u->version); u->tsc_timestamp = t->stime_local_stamp; @@ -848,7 +872,7 @@ u->tsc_shift = 1; version_update_end(&u->version); } - else if ( u->tsc_timestamp != t->local_tsc_stamp ) + else if ( force || u->tsc_timestamp != t->local_tsc_stamp ) { version_update_begin(&u->version); u->tsc_timestamp = t->local_tsc_stamp; @@ -857,6 +881,13 @@ u->tsc_shift = (s8)t->tsc_scale.shift; version_update_end(&u->version); } + + update_guest_time_info(v, u); +} + +void update_vcpu_system_time(struct vcpu *v) +{ + __update_vcpu_system_time(v, 0); } void update_domain_wallclock_time(struct domain *d) diff -r 10cfcbef68ee xen/include/public/vcpu.h --- a/xen/include/public/vcpu.h Tue Oct 06 10:11:53 2009 +0100 +++ b/xen/include/public/vcpu.h Tue Oct 06 19:23:52 2009 -0700 @@ -202,6 +202,49 @@ #define xen_vcpu_physid_to_x86_acpiid(physid) \ ((((uint32_t)((physid)>>32)) >= 0xff) ? 0xff : ((uint8_t)((physid)>>32))) +/* + * Register a memory location to get a secondary copy of the vcpu time + * parameters. The master copy still exists as part of the vcpu + * shared memory area, and this secondary copy is updated whenever the + * master copy is updated. + * + * The intent is that this copy may be mapped (RO) into userspace so + * that usermode can compute system time using the time info and the + * tsc. Usermode will see an array of vcpu_time_info structures, one + * for each vcpu, and choose the right one by an existing mechanism + * which allows it to get the current vcpu number (such as via a + * segment limit). It can then apply the normal algorithm to compute + * system time from the tsc. + * + * However, because usermode threads are subject to two levels of + * scheduling (kernel scheduling of threads to vcpus, and Xen + * scheduling of vcpus to pcpus), we must make sure that the thread + * knows it has had a race with either (or both) of these two events. + * To allow this, the guest kernel updates the time_info version + * number when the vcpu does a context switch, so that usermode will + * always see a version number change when the parameters need to be + * revalidated. Xen makes sure that it always updates the guest''s + * version rather than overwriting it. (It assumes that a vcpu will + * always update its own version number, so there are no cross-cpu + * synchronization issues; the only concern is that if the guest + * kernel gets preempted by Xen it doesn''t revert the version number + * to an older value.) + * + * @extra_arg == pointer to vcpu_register_time_info_memory_area structure. + */ +#define VCPUOP_register_vcpu_time_memory_area 13 + +DEFINE_XEN_GUEST_HANDLE(vcpu_time_info_t); +struct vcpu_register_time_memory_area { + union { + XEN_GUEST_HANDLE(vcpu_time_info_t) h; + struct vcpu_time_info *v; + uint64_t p; + } addr; +}; +typedef struct vcpu_register_time_memory_area vcpu_register_time_memory_area_t; +DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t); + #endif /* __XEN_PUBLIC_VCPU_H__ */ /* diff -r 10cfcbef68ee xen/include/xen/sched.h --- a/xen/include/xen/sched.h Tue Oct 06 10:11:53 2009 +0100 +++ b/xen/include/xen/sched.h Tue Oct 06 19:23:52 2009 -0700 @@ -102,6 +102,9 @@ } runstate_guest; /* guest address */ #endif + /* A secondary copy of the vcpu time info */ + XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest; + /* last time when vCPU is scheduled out */ uint64_t last_run_time; diff -r 10cfcbef68ee xen/include/xen/time.h --- a/xen/include/xen/time.h Tue Oct 06 10:11:53 2009 +0100 +++ b/xen/include/xen/time.h Tue Oct 06 19:23:52 2009 -0700 @@ -54,6 +54,7 @@ #define STIME_MAX ((s_time_t)((uint64_t)~0ull>>1)) extern void update_vcpu_system_time(struct vcpu *v); +extern void __update_vcpu_system_time(struct vcpu *v, int force); extern void update_domain_wallclock_time(struct domain *d); extern void do_settime( _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Oct-16 14:34 UTC
[Xen-devel] Re: [PATCH v2] Allow guests to register secondary vcpu_time_info
On 10/12/09 14:08, Keir Fraser wrote:> You''ll send me a simplified version of this that doesn''t need to support > guest update of the version field, right? >Here''s the simplified version. But I''m wondering if it should do the proper version update with write barriers to make it the same as the normal vcpu_time_info update protocol. J Allow guests to register secondary vcpu_time_info Allow a guest to register a second location for the VCPU time info structure for each vcpu. This is intended to allow the guest kernel to map this information into a usermode accessible page, so that usermode can efficiently calculate system time from the TSC without having to make a syscall. The second vcpu_time_info structure is updated by copy with no enforced write ordering between the fields. It is always updated on the same pcpu as the guest''s vcpu, so it will always preempt the guest when updating the field. However, this means that the guest can only reliably access a vcpu''s time_info on that vcpu. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> diff -r 5d779377a9ae xen/arch/x86/domain.c --- a/xen/arch/x86/domain.c Mon Oct 12 12:56:00 2009 +0100 +++ b/xen/arch/x86/domain.c Wed Oct 14 14:36:30 2009 -0700 @@ -960,6 +960,24 @@ break; } + case VCPUOP_register_vcpu_time_memory_area: + { + struct vcpu_register_time_memory_area area; + rc = -EFAULT; + if ( copy_from_guest(&area, arg, 1) ) + break; + + if ( !guest_handle_okay(area.addr.h, 1) ) + break; + + rc = 0; + v->time_info_guest = area.addr.h; + + __update_vcpu_system_time(v, 1); + + break; + } + case VCPUOP_get_physid: { struct vcpu_get_physid cpu_id; diff -r 5d779377a9ae xen/arch/x86/time.c --- a/xen/arch/x86/time.c Mon Oct 12 12:56:00 2009 +0100 +++ b/xen/arch/x86/time.c Wed Oct 14 14:36:30 2009 -0700 @@ -22,6 +22,7 @@ #include <xen/irq.h> #include <xen/softirq.h> #include <xen/keyhandler.h> +#include <xen/guest_access.h> #include <asm/io.h> #include <asm/msr.h> #include <asm/mpspec.h> @@ -820,7 +821,7 @@ (*version)++; } -void update_vcpu_system_time(struct vcpu *v) +void __update_vcpu_system_time(struct vcpu *v, int force) { struct cpu_time *t; struct vcpu_time_info *u; @@ -833,7 +834,7 @@ if ( v->domain->arch.vtsc ) { - if ( u->tsc_timestamp == t->stime_local_stamp ) + if ( !force && u->tsc_timestamp == t->stime_local_stamp ) return; version_update_begin(&u->version); u->tsc_timestamp = t->stime_local_stamp; @@ -842,7 +843,7 @@ u->tsc_shift = 1; version_update_end(&u->version); } - else if ( u->tsc_timestamp != t->local_tsc_stamp ) + else if ( force || u->tsc_timestamp != t->local_tsc_stamp ) { version_update_begin(&u->version); u->tsc_timestamp = t->local_tsc_stamp; @@ -851,6 +852,14 @@ u->tsc_shift = (s8)t->tsc_scale.shift; version_update_end(&u->version); } + + if ( !guest_handle_is_null(v->time_info_guest) ) + __copy_to_guest(v->time_info_guest, u, 1); +} + +void update_vcpu_system_time(struct vcpu *v) +{ + __update_vcpu_system_time(v, 0); } void update_domain_wallclock_time(struct domain *d) diff -r 5d779377a9ae xen/include/public/vcpu.h --- a/xen/include/public/vcpu.h Mon Oct 12 12:56:00 2009 +0100 +++ b/xen/include/public/vcpu.h Wed Oct 14 14:36:30 2009 -0700 @@ -202,6 +202,35 @@ #define xen_vcpu_physid_to_x86_acpiid(physid) \ ((((uint32_t)((physid)>>32)) >= 0xff) ? 0xff : ((uint8_t)((physid)>>32))) +/* + * Register a memory location to get a secondary copy of the vcpu time + * parameters. The master copy still exists as part of the vcpu + * shared memory area, and this secondary copy is updated whenever the + * master copy is updated. + * + * The intent is that this copy may be mapped (RO) into userspace so + * that usermode can compute system time using the time info and the + * tsc. Usermode will see an array of vcpu_time_info structures, one + * for each vcpu, and choose the right one by an existing mechanism + * which allows it to get the current vcpu number (such as via a + * segment limit). It can then apply the normal algorithm to compute + * system time from the tsc. + * + * @extra_arg == pointer to vcpu_register_time_info_memory_area structure. + */ +#define VCPUOP_register_vcpu_time_memory_area 13 + +DEFINE_XEN_GUEST_HANDLE(vcpu_time_info_t); +struct vcpu_register_time_memory_area { + union { + XEN_GUEST_HANDLE(vcpu_time_info_t) h; + struct vcpu_time_info *v; + uint64_t p; + } addr; +}; +typedef struct vcpu_register_time_memory_area vcpu_register_time_memory_area_t; +DEFINE_XEN_GUEST_HANDLE(vcpu_register_time_memory_area_t); + #endif /* __XEN_PUBLIC_VCPU_H__ */ /* diff -r 5d779377a9ae xen/include/xen/sched.h --- a/xen/include/xen/sched.h Mon Oct 12 12:56:00 2009 +0100 +++ b/xen/include/xen/sched.h Wed Oct 14 14:36:30 2009 -0700 @@ -102,6 +102,9 @@ } runstate_guest; /* guest address */ #endif + /* A secondary copy of the vcpu time info */ + XEN_GUEST_HANDLE(vcpu_time_info_t) time_info_guest; + /* last time when vCPU is scheduled out */ uint64_t last_run_time; diff -r 5d779377a9ae xen/include/xen/time.h --- a/xen/include/xen/time.h Mon Oct 12 12:56:00 2009 +0100 +++ b/xen/include/xen/time.h Wed Oct 14 14:36:30 2009 -0700 @@ -54,6 +54,7 @@ #define STIME_MAX ((s_time_t)((uint64_t)~0ull>>1)) extern void update_vcpu_system_time(struct vcpu *v); +extern void __update_vcpu_system_time(struct vcpu *v, int force); extern void update_domain_wallclock_time(struct domain *d); extern void do_settime( _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-19 11:02 UTC
[Xen-devel] Re: [PATCH v2] Allow guests to register secondary vcpu_time_info
On 16/10/2009 15:34, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:> On 10/12/09 14:08, Keir Fraser wrote: >> You''ll send me a simplified version of this that doesn''t need to support >> guest update of the version field, right? >> > > Here''s the simplified version. But I''m wondering if it should do the > proper version update with write barriers to make it the same as the > normal vcpu_time_info update protocol.I fixed this and applied as c/s 20339. Take a look. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Oct-19 22:16 UTC
[Xen-devel] Re: [PATCH v2] Allow guests to register secondary vcpu_time_info
On 10/19/09 20:02, Keir Fraser wrote:> On 16/10/2009 15:34, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote: > > >> On 10/12/09 14:08, Keir Fraser wrote: >> >>> You''ll send me a simplified version of this that doesn''t need to support >>> guest update of the version field, right? >>> >>> >> Here''s the simplified version. But I''m wondering if it should do the >> proper version update with write barriers to make it the same as the >> normal vcpu_time_info update protocol. >> > I fixed this and applied as c/s 20339. Take a look. >Looks OK. One thing: if it''s using __copy_to_guest, should it first test the handle with guest_handle_okay()? J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Oct-19 23:49 UTC
[Xen-devel] Re: [PATCH v2] Allow guests to register secondary vcpu_time_info
On 10/20/09 07:16, Jeremy Fitzhardinge wrote:> Looks OK. One thing: if it''s using __copy_to_guest, should it first > test the handle with guest_handle_okay()? >Oh, I see it happens at the hypercall. I guess the results of the test can''t be invalidated. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-20 07:31 UTC
[Xen-devel] Re: [PATCH v2] Allow guests to register secondary vcpu_time_info
On 20/10/2009 00:49, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:> On 10/20/09 07:16, Jeremy Fitzhardinge wrote: >> Looks OK. One thing: if it''s using __copy_to_guest, should it first >> test the handle with guest_handle_okay()? >> > > Oh, I see it happens at the hypercall. I guess the results of the test > can''t be invalidated.Indeed. Also your original patch used __copy_to_guest. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel