Matt Wilson
2012-Dec-05 06:02 UTC
[PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot, but still have the flexibility to change the configuration later. There''s no logic that keys off of domain->is_pinned outside of sched_init_vcpu() and vcpu_set_affinity(). By adjusting the is_pinned_vcpu() macro to only check for a single CPU set in the cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system boots. Signed-off-by: Matt Wilson <msw@amazon.com> diff -r 29247e44df47 -r 2614dd8be3a0 docs/misc/xen-command-line.markdown --- a/docs/misc/xen-command-line.markdown Fri Nov 30 21:51:17 2012 +0000 +++ b/docs/misc/xen-command-line.markdown Wed Dec 05 05:48:23 2012 +0000 @@ -453,7 +453,7 @@ Practices](http://wiki.xen.org/wiki/Xen_ > Default: `false` -Pin dom0 vcpus to their respective pcpus +Initially pin dom0 vcpus to their respective pcpus ### e820-mtrr-clip > `= <boolean>` diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/domain.c --- a/xen/common/domain.c Fri Nov 30 21:51:17 2012 +0000 +++ b/xen/common/domain.c Wed Dec 05 05:48:23 2012 +0000 @@ -45,10 +45,6 @@ /* xen_processor_pmbits: xen control Cx, Px, ... */ unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX; -/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned. */ -bool_t opt_dom0_vcpus_pin; -boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); - /* Protect updates/reads (resp.) of domain_list and domain_hash. */ DEFINE_SPINLOCK(domlist_update_lock); DEFINE_RCU_READ_LOCK(domlist_read_lock); @@ -235,7 +231,6 @@ struct domain *domain_create( if ( domid == 0 ) { - d->is_pinned = opt_dom0_vcpus_pin; d->disable_migrate = 1; } diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/schedule.c --- a/xen/common/schedule.c Fri Nov 30 21:51:17 2012 +0000 +++ b/xen/common/schedule.c Wed Dec 05 05:48:23 2012 +0000 @@ -52,6 +52,11 @@ boolean_param("sched_smt_power_savings", * */ int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US; integer_param("sched_ratelimit_us", sched_ratelimit_us); + +/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned at boot. */ +bool_t opt_dom0_vcpus_pin; +boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); + /* Various timer handlers. */ static void s_timer_fn(void *unused); static void vcpu_periodic_timer_fn(void *data); @@ -194,7 +199,8 @@ int sched_init_vcpu(struct vcpu *v, unsi * domain-0 VCPUs, are pinned onto their respective physical CPUs. */ v->processor = processor; - if ( is_idle_domain(d) || d->is_pinned ) + + if ( is_idle_domain(d) || (d->domain_id == 0 && opt_dom0_vcpus_pin) ) cpumask_copy(v->cpu_affinity, cpumask_of(processor)); else cpumask_setall(v->cpu_affinity); @@ -595,8 +601,6 @@ int vcpu_set_affinity(struct vcpu *v, co cpumask_t online_affinity; cpumask_t *online; - if ( v->domain->is_pinned ) - return -EINVAL; online = VCPU2ONLINE(v); cpumask_and(&online_affinity, affinity, online); if ( cpumask_empty(&online_affinity) ) diff -r 29247e44df47 -r 2614dd8be3a0 xen/include/xen/sched.h --- a/xen/include/xen/sched.h Fri Nov 30 21:51:17 2012 +0000 +++ b/xen/include/xen/sched.h Wed Dec 05 05:48:23 2012 +0000 @@ -292,8 +292,6 @@ struct domain enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying; /* Domain is paused by controller software? */ bool_t is_paused_by_controller; - /* Domain''s VCPUs are pinned 1:1 to physical CPUs? */ - bool_t is_pinned; /* Are any VCPUs polling event channels (SCHEDOP_poll)? */ #if MAX_VIRT_CPUS <= BITS_PER_LONG @@ -713,8 +711,7 @@ void watchdog_domain_destroy(struct doma #define is_hvm_domain(d) ((d)->is_hvm) #define is_hvm_vcpu(v) (is_hvm_domain(v->domain)) -#define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ - cpumask_weight((v)->cpu_affinity) == 1) +#define is_pinned_vcpu(v) (cpumask_weight((v)->cpu_affinity) == 1) #ifdef HAS_PASSTHROUGH #define need_iommu(d) ((d)->need_iommu) #else
Andrew Cooper
2012-Dec-05 10:44 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
On 05/12/12 06:02, Matt Wilson wrote:> An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot, > but still have the flexibility to change the configuration later. > There''s no logic that keys off of domain->is_pinned outside of > sched_init_vcpu() and vcpu_set_affinity(). By adjusting the > is_pinned_vcpu() macro to only check for a single CPU set in the > cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system > boots.Sadly this patch will break things. There are certain callers of is_pinned_vcpu() which rely on the value to allow access to certain power related MSRs, which is where the requirement for never permitting an update of the affinity mask comes from. When I encountered this problem before, I considered implementing dom0_vcpu_pin=dynamic (or name to suit) which sets up an identity pin at create time, but leaves is_pinned as false. ~Andrew> > Signed-off-by: Matt Wilson <msw@amazon.com> > > diff -r 29247e44df47 -r 2614dd8be3a0 docs/misc/xen-command-line.markdown > --- a/docs/misc/xen-command-line.markdown Fri Nov 30 21:51:17 2012 +0000 > +++ b/docs/misc/xen-command-line.markdown Wed Dec 05 05:48:23 2012 +0000 > @@ -453,7 +453,7 @@ Practices](http://wiki.xen.org/wiki/Xen_ > > > Default: `false` > > -Pin dom0 vcpus to their respective pcpus > +Initially pin dom0 vcpus to their respective pcpus > > ### e820-mtrr-clip > > `= <boolean>` > diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/domain.c > --- a/xen/common/domain.c Fri Nov 30 21:51:17 2012 +0000 > +++ b/xen/common/domain.c Wed Dec 05 05:48:23 2012 +0000 > @@ -45,10 +45,6 @@ > /* xen_processor_pmbits: xen control Cx, Px, ... */ > unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX; > > -/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned. */ > -bool_t opt_dom0_vcpus_pin; > -boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); > - > /* Protect updates/reads (resp.) of domain_list and domain_hash. */ > DEFINE_SPINLOCK(domlist_update_lock); > DEFINE_RCU_READ_LOCK(domlist_read_lock); > @@ -235,7 +231,6 @@ struct domain *domain_create( > > if ( domid == 0 ) > { > - d->is_pinned = opt_dom0_vcpus_pin; > d->disable_migrate = 1; > } > > diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/schedule.c > --- a/xen/common/schedule.c Fri Nov 30 21:51:17 2012 +0000 > +++ b/xen/common/schedule.c Wed Dec 05 05:48:23 2012 +0000 > @@ -52,6 +52,11 @@ boolean_param("sched_smt_power_savings", > * */ > int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US; > integer_param("sched_ratelimit_us", sched_ratelimit_us); > + > +/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned at boot. */ > +bool_t opt_dom0_vcpus_pin; > +boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); > + > /* Various timer handlers. */ > static void s_timer_fn(void *unused); > static void vcpu_periodic_timer_fn(void *data); > @@ -194,7 +199,8 @@ int sched_init_vcpu(struct vcpu *v, unsi > * domain-0 VCPUs, are pinned onto their respective physical CPUs. > */ > v->processor = processor; > - if ( is_idle_domain(d) || d->is_pinned ) > + > + if ( is_idle_domain(d) || (d->domain_id == 0 && opt_dom0_vcpus_pin) ) > cpumask_copy(v->cpu_affinity, cpumask_of(processor)); > else > cpumask_setall(v->cpu_affinity); > @@ -595,8 +601,6 @@ int vcpu_set_affinity(struct vcpu *v, co > cpumask_t online_affinity; > cpumask_t *online; > > - if ( v->domain->is_pinned ) > - return -EINVAL; > online = VCPU2ONLINE(v); > cpumask_and(&online_affinity, affinity, online); > if ( cpumask_empty(&online_affinity) ) > diff -r 29247e44df47 -r 2614dd8be3a0 xen/include/xen/sched.h > --- a/xen/include/xen/sched.h Fri Nov 30 21:51:17 2012 +0000 > +++ b/xen/include/xen/sched.h Wed Dec 05 05:48:23 2012 +0000 > @@ -292,8 +292,6 @@ struct domain > enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying; > /* Domain is paused by controller software? */ > bool_t is_paused_by_controller; > - /* Domain''s VCPUs are pinned 1:1 to physical CPUs? */ > - bool_t is_pinned; > > /* Are any VCPUs polling event channels (SCHEDOP_poll)? */ > #if MAX_VIRT_CPUS <= BITS_PER_LONG > @@ -713,8 +711,7 @@ void watchdog_domain_destroy(struct doma > > #define is_hvm_domain(d) ((d)->is_hvm) > #define is_hvm_vcpu(v) (is_hvm_domain(v->domain)) > -#define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ > - cpumask_weight((v)->cpu_affinity) == 1) > +#define is_pinned_vcpu(v) (cpumask_weight((v)->cpu_affinity) == 1) > #ifdef HAS_PASSTHROUGH > #define need_iommu(d) ((d)->need_iommu) > #else > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
Matt Wilson
2012-Dec-05 15:59 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
On Wed, Dec 05, 2012 at 10:44:04AM +0000, Andrew Cooper wrote:> On 05/12/12 06:02, Matt Wilson wrote: > > An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot, > > but still have the flexibility to change the configuration later. > > There''s no logic that keys off of domain->is_pinned outside of > > sched_init_vcpu() and vcpu_set_affinity(). By adjusting the > > is_pinned_vcpu() macro to only check for a single CPU set in the > > cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system > > boots. > > Sadly this patch will break things. There are certain callers of > is_pinned_vcpu() which rely on the value to allow access to certain > power related MSRs, which is where the requirement for never permitting > an update of the affinity mask comes from.If this is true, the existing is_pinned_vcpu() test is broken: #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ cpumask_weight((v)->cpu_affinity) == 1) It''s && not ||. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot, the MSR traps will suddenly start working. See commit: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854dd> When I encountered this problem before, I considered implementing > dom0_vcpu_pin=dynamic (or name to suit) which sets up an identity pin at > create time, but leaves is_pinned as false.I could implement that, but I want to make sure we''re fixing a real problem. It sounds like Keir thinks this can be relaxed. Matt> > Signed-off-by: Matt Wilson <msw@amazon.com> > > > > diff -r 29247e44df47 -r 2614dd8be3a0 docs/misc/xen-command-line.markdown > > --- a/docs/misc/xen-command-line.markdown Fri Nov 30 21:51:17 2012 +0000 > > +++ b/docs/misc/xen-command-line.markdown Wed Dec 05 05:48:23 2012 +0000 > > @@ -453,7 +453,7 @@ Practices](http://wiki.xen.org/wiki/Xen_ > > > > > Default: `false` > > > > -Pin dom0 vcpus to their respective pcpus > > +Initially pin dom0 vcpus to their respective pcpus > > > > ### e820-mtrr-clip > > > `= <boolean>` > > diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/domain.c > > --- a/xen/common/domain.c Fri Nov 30 21:51:17 2012 +0000 > > +++ b/xen/common/domain.c Wed Dec 05 05:48:23 2012 +0000 > > @@ -45,10 +45,6 @@ > > /* xen_processor_pmbits: xen control Cx, Px, ... */ > > unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX; > > > > -/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned. */ > > -bool_t opt_dom0_vcpus_pin; > > -boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); > > - > > /* Protect updates/reads (resp.) of domain_list and domain_hash. */ > > DEFINE_SPINLOCK(domlist_update_lock); > > DEFINE_RCU_READ_LOCK(domlist_read_lock); > > @@ -235,7 +231,6 @@ struct domain *domain_create( > > > > if ( domid == 0 ) > > { > > - d->is_pinned = opt_dom0_vcpus_pin; > > d->disable_migrate = 1; > > } > > > > diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/schedule.c > > --- a/xen/common/schedule.c Fri Nov 30 21:51:17 2012 +0000 > > +++ b/xen/common/schedule.c Wed Dec 05 05:48:23 2012 +0000 > > @@ -52,6 +52,11 @@ boolean_param("sched_smt_power_savings", > > * */ > > int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US; > > integer_param("sched_ratelimit_us", sched_ratelimit_us); > > + > > +/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned at boot. */ > > +bool_t opt_dom0_vcpus_pin; > > +boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); > > + > > /* Various timer handlers. */ > > static void s_timer_fn(void *unused); > > static void vcpu_periodic_timer_fn(void *data); > > @@ -194,7 +199,8 @@ int sched_init_vcpu(struct vcpu *v, unsi > > * domain-0 VCPUs, are pinned onto their respective physical CPUs. > > */ > > v->processor = processor; > > - if ( is_idle_domain(d) || d->is_pinned ) > > + > > + if ( is_idle_domain(d) || (d->domain_id == 0 && opt_dom0_vcpus_pin) ) > > cpumask_copy(v->cpu_affinity, cpumask_of(processor)); > > else > > cpumask_setall(v->cpu_affinity); > > @@ -595,8 +601,6 @@ int vcpu_set_affinity(struct vcpu *v, co > > cpumask_t online_affinity; > > cpumask_t *online; > > > > - if ( v->domain->is_pinned ) > > - return -EINVAL; > > online = VCPU2ONLINE(v); > > cpumask_and(&online_affinity, affinity, online); > > if ( cpumask_empty(&online_affinity) ) > > diff -r 29247e44df47 -r 2614dd8be3a0 xen/include/xen/sched.h > > --- a/xen/include/xen/sched.h Fri Nov 30 21:51:17 2012 +0000 > > +++ b/xen/include/xen/sched.h Wed Dec 05 05:48:23 2012 +0000 > > @@ -292,8 +292,6 @@ struct domain > > enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying; > > /* Domain is paused by controller software? */ > > bool_t is_paused_by_controller; > > - /* Domain''s VCPUs are pinned 1:1 to physical CPUs? */ > > - bool_t is_pinned; > > > > /* Are any VCPUs polling event channels (SCHEDOP_poll)? */ > > #if MAX_VIRT_CPUS <= BITS_PER_LONG > > @@ -713,8 +711,7 @@ void watchdog_domain_destroy(struct doma > > > > #define is_hvm_domain(d) ((d)->is_hvm) > > #define is_hvm_vcpu(v) (is_hvm_domain(v->domain)) > > -#define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ > > - cpumask_weight((v)->cpu_affinity) == 1) > > +#define is_pinned_vcpu(v) (cpumask_weight((v)->cpu_affinity) == 1) > > #ifdef HAS_PASSTHROUGH > > #define need_iommu(d) ((d)->need_iommu) > > #else > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xen.org > > http://lists.xen.org/xen-devel > > -- > Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer > T: +44 (0)1223 225 900, http://www.citrix.com
Matt Wilson
2012-Dec-05 16:04 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
On Wed, Dec 05, 2012 at 07:59:08AM -0800, Matt Wilson wrote:> On Wed, Dec 05, 2012 at 10:44:04AM +0000, Andrew Cooper wrote: > > On 05/12/12 06:02, Matt Wilson wrote: > > > An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot, > > > but still have the flexibility to change the configuration later. > > > There''s no logic that keys off of domain->is_pinned outside of > > > sched_init_vcpu() and vcpu_set_affinity(). By adjusting the > > > is_pinned_vcpu() macro to only check for a single CPU set in the > > > cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system > > > boots. > > > > Sadly this patch will break things. There are certain callers of > > is_pinned_vcpu() which rely on the value to allow access to certain > > power related MSRs, which is where the requirement for never permitting > > an update of the affinity mask comes from. > > If this is true, the existing is_pinned_vcpu() test is broken: > > #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ > cpumask_weight((v)->cpu_affinity) == 1) > > It''s && not ||. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot, > the MSR traps will suddenly start working. > > See commit: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854dd > > > When I encountered this problem before, I considered implementing > > dom0_vcpu_pin=dynamic (or name to suit) which sets up an identity pin at > > create time, but leaves is_pinned as false. > > I could implement that, but I want to make sure we''re fixing a real > problem. It sounds like Keir thinks this can be relaxed.Sorry, misread the commit. Jan made that change.> Matt > > > > Signed-off-by: Matt Wilson <msw@amazon.com> > > > > > > diff -r 29247e44df47 -r 2614dd8be3a0 docs/misc/xen-command-line.markdown > > > --- a/docs/misc/xen-command-line.markdown Fri Nov 30 21:51:17 2012 +0000 > > > +++ b/docs/misc/xen-command-line.markdown Wed Dec 05 05:48:23 2012 +0000 > > > @@ -453,7 +453,7 @@ Practices](http://wiki.xen.org/wiki/Xen_ > > > > > > > Default: `false` > > > > > > -Pin dom0 vcpus to their respective pcpus > > > +Initially pin dom0 vcpus to their respective pcpus > > > > > > ### e820-mtrr-clip > > > > `= <boolean>` > > > diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/domain.c > > > --- a/xen/common/domain.c Fri Nov 30 21:51:17 2012 +0000 > > > +++ b/xen/common/domain.c Wed Dec 05 05:48:23 2012 +0000 > > > @@ -45,10 +45,6 @@ > > > /* xen_processor_pmbits: xen control Cx, Px, ... */ > > > unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX; > > > > > > -/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned. */ > > > -bool_t opt_dom0_vcpus_pin; > > > -boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); > > > - > > > /* Protect updates/reads (resp.) of domain_list and domain_hash. */ > > > DEFINE_SPINLOCK(domlist_update_lock); > > > DEFINE_RCU_READ_LOCK(domlist_read_lock); > > > @@ -235,7 +231,6 @@ struct domain *domain_create( > > > > > > if ( domid == 0 ) > > > { > > > - d->is_pinned = opt_dom0_vcpus_pin; > > > d->disable_migrate = 1; > > > } > > > > > > diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/schedule.c > > > --- a/xen/common/schedule.c Fri Nov 30 21:51:17 2012 +0000 > > > +++ b/xen/common/schedule.c Wed Dec 05 05:48:23 2012 +0000 > > > @@ -52,6 +52,11 @@ boolean_param("sched_smt_power_savings", > > > * */ > > > int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US; > > > integer_param("sched_ratelimit_us", sched_ratelimit_us); > > > + > > > +/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned at boot. */ > > > +bool_t opt_dom0_vcpus_pin; > > > +boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); > > > + > > > /* Various timer handlers. */ > > > static void s_timer_fn(void *unused); > > > static void vcpu_periodic_timer_fn(void *data); > > > @@ -194,7 +199,8 @@ int sched_init_vcpu(struct vcpu *v, unsi > > > * domain-0 VCPUs, are pinned onto their respective physical CPUs. > > > */ > > > v->processor = processor; > > > - if ( is_idle_domain(d) || d->is_pinned ) > > > + > > > + if ( is_idle_domain(d) || (d->domain_id == 0 && opt_dom0_vcpus_pin) ) > > > cpumask_copy(v->cpu_affinity, cpumask_of(processor)); > > > else > > > cpumask_setall(v->cpu_affinity); > > > @@ -595,8 +601,6 @@ int vcpu_set_affinity(struct vcpu *v, co > > > cpumask_t online_affinity; > > > cpumask_t *online; > > > > > > - if ( v->domain->is_pinned ) > > > - return -EINVAL; > > > online = VCPU2ONLINE(v); > > > cpumask_and(&online_affinity, affinity, online); > > > if ( cpumask_empty(&online_affinity) ) > > > diff -r 29247e44df47 -r 2614dd8be3a0 xen/include/xen/sched.h > > > --- a/xen/include/xen/sched.h Fri Nov 30 21:51:17 2012 +0000 > > > +++ b/xen/include/xen/sched.h Wed Dec 05 05:48:23 2012 +0000 > > > @@ -292,8 +292,6 @@ struct domain > > > enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying; > > > /* Domain is paused by controller software? */ > > > bool_t is_paused_by_controller; > > > - /* Domain''s VCPUs are pinned 1:1 to physical CPUs? */ > > > - bool_t is_pinned; > > > > > > /* Are any VCPUs polling event channels (SCHEDOP_poll)? */ > > > #if MAX_VIRT_CPUS <= BITS_PER_LONG > > > @@ -713,8 +711,7 @@ void watchdog_domain_destroy(struct doma > > > > > > #define is_hvm_domain(d) ((d)->is_hvm) > > > #define is_hvm_vcpu(v) (is_hvm_domain(v->domain)) > > > -#define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ > > > - cpumask_weight((v)->cpu_affinity) == 1) > > > +#define is_pinned_vcpu(v) (cpumask_weight((v)->cpu_affinity) == 1) > > > #ifdef HAS_PASSTHROUGH > > > #define need_iommu(d) ((d)->need_iommu) > > > #else > > > > > > _______________________________________________ > > > Xen-devel mailing list > > > Xen-devel@lists.xen.org > > > http://lists.xen.org/xen-devel > > > > -- > > Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer > > T: +44 (0)1223 225 900, http://www.citrix.com > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Andrew Cooper
2012-Dec-05 16:06 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
On 05/12/12 15:59, Matt Wilson wrote:> On Wed, Dec 05, 2012 at 10:44:04AM +0000, Andrew Cooper wrote: >> On 05/12/12 06:02, Matt Wilson wrote: >>> An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot, >>> but still have the flexibility to change the configuration later. >>> There''s no logic that keys off of domain->is_pinned outside of >>> sched_init_vcpu() and vcpu_set_affinity(). By adjusting the >>> is_pinned_vcpu() macro to only check for a single CPU set in the >>> cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system >>> boots. >> Sadly this patch will break things. There are certain callers of >> is_pinned_vcpu() which rely on the value to allow access to certain >> power related MSRs, which is where the requirement for never permitting >> an update of the affinity mask comes from. > If this is true, the existing is_pinned_vcpu() test is broken: > > #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ > cpumask_weight((v)->cpu_affinity) == 1) > > It''s && not ||. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot, > the MSR traps will suddenly start working. > > See commit: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854dd > >> When I encountered this problem before, I considered implementing >> dom0_vcpu_pin=dynamic (or name to suit) which sets up an identity pin at >> create time, but leaves is_pinned as false. > I could implement that, but I want to make sure we''re fixing a real > problem. It sounds like Keir thinks this can be relaxed. > > MattHmm - the logic does indeed look broken. When I was working on this problem, it was an older tree without this changeset. With the current logic, the vcpu will transparently move to a different set of MSRs whenever the affinity is changed, which does not sound safe for an unaware vcpu. I did not pay much attention to the function of the MSRs. ~Andrew> >>> Signed-off-by: Matt Wilson <msw@amazon.com> >>> >>> diff -r 29247e44df47 -r 2614dd8be3a0 docs/misc/xen-command-line.markdown >>> --- a/docs/misc/xen-command-line.markdown Fri Nov 30 21:51:17 2012 +0000 >>> +++ b/docs/misc/xen-command-line.markdown Wed Dec 05 05:48:23 2012 +0000 >>> @@ -453,7 +453,7 @@ Practices](http://wiki.xen.org/wiki/Xen_ >>> >>> > Default: `false` >>> >>> -Pin dom0 vcpus to their respective pcpus >>> +Initially pin dom0 vcpus to their respective pcpus >>> >>> ### e820-mtrr-clip >>> > `= <boolean>` >>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/domain.c >>> --- a/xen/common/domain.c Fri Nov 30 21:51:17 2012 +0000 >>> +++ b/xen/common/domain.c Wed Dec 05 05:48:23 2012 +0000 >>> @@ -45,10 +45,6 @@ >>> /* xen_processor_pmbits: xen control Cx, Px, ... */ >>> unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX; >>> >>> -/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned. */ >>> -bool_t opt_dom0_vcpus_pin; >>> -boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); >>> - >>> /* Protect updates/reads (resp.) of domain_list and domain_hash. */ >>> DEFINE_SPINLOCK(domlist_update_lock); >>> DEFINE_RCU_READ_LOCK(domlist_read_lock); >>> @@ -235,7 +231,6 @@ struct domain *domain_create( >>> >>> if ( domid == 0 ) >>> { >>> - d->is_pinned = opt_dom0_vcpus_pin; >>> d->disable_migrate = 1; >>> } >>> >>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/common/schedule.c >>> --- a/xen/common/schedule.c Fri Nov 30 21:51:17 2012 +0000 >>> +++ b/xen/common/schedule.c Wed Dec 05 05:48:23 2012 +0000 >>> @@ -52,6 +52,11 @@ boolean_param("sched_smt_power_savings", >>> * */ >>> int sched_ratelimit_us = SCHED_DEFAULT_RATELIMIT_US; >>> integer_param("sched_ratelimit_us", sched_ratelimit_us); >>> + >>> +/* opt_dom0_vcpus_pin: If true, dom0 VCPUs are pinned at boot. */ >>> +bool_t opt_dom0_vcpus_pin; >>> +boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin); >>> + >>> /* Various timer handlers. */ >>> static void s_timer_fn(void *unused); >>> static void vcpu_periodic_timer_fn(void *data); >>> @@ -194,7 +199,8 @@ int sched_init_vcpu(struct vcpu *v, unsi >>> * domain-0 VCPUs, are pinned onto their respective physical CPUs. >>> */ >>> v->processor = processor; >>> - if ( is_idle_domain(d) || d->is_pinned ) >>> + >>> + if ( is_idle_domain(d) || (d->domain_id == 0 && opt_dom0_vcpus_pin) ) >>> cpumask_copy(v->cpu_affinity, cpumask_of(processor)); >>> else >>> cpumask_setall(v->cpu_affinity); >>> @@ -595,8 +601,6 @@ int vcpu_set_affinity(struct vcpu *v, co >>> cpumask_t online_affinity; >>> cpumask_t *online; >>> >>> - if ( v->domain->is_pinned ) >>> - return -EINVAL; >>> online = VCPU2ONLINE(v); >>> cpumask_and(&online_affinity, affinity, online); >>> if ( cpumask_empty(&online_affinity) ) >>> diff -r 29247e44df47 -r 2614dd8be3a0 xen/include/xen/sched.h >>> --- a/xen/include/xen/sched.h Fri Nov 30 21:51:17 2012 +0000 >>> +++ b/xen/include/xen/sched.h Wed Dec 05 05:48:23 2012 +0000 >>> @@ -292,8 +292,6 @@ struct domain >>> enum { DOMDYING_alive, DOMDYING_dying, DOMDYING_dead } is_dying; >>> /* Domain is paused by controller software? */ >>> bool_t is_paused_by_controller; >>> - /* Domain''s VCPUs are pinned 1:1 to physical CPUs? */ >>> - bool_t is_pinned; >>> >>> /* Are any VCPUs polling event channels (SCHEDOP_poll)? */ >>> #if MAX_VIRT_CPUS <= BITS_PER_LONG >>> @@ -713,8 +711,7 @@ void watchdog_domain_destroy(struct doma >>> >>> #define is_hvm_domain(d) ((d)->is_hvm) >>> #define is_hvm_vcpu(v) (is_hvm_domain(v->domain)) >>> -#define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ >>> - cpumask_weight((v)->cpu_affinity) == 1) >>> +#define is_pinned_vcpu(v) (cpumask_weight((v)->cpu_affinity) == 1) >>> #ifdef HAS_PASSTHROUGH >>> #define need_iommu(d) ((d)->need_iommu) >>> #else >>> >>> _______________________________________________ >>> Xen-devel mailing list >>> Xen-devel@lists.xen.org >>> http://lists.xen.org/xen-devel >> -- >> Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer >> T: +44 (0)1223 225 900, http://www.citrix.com-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com
Jan Beulich
2012-Dec-05 16:25 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
>>> On 05.12.12 at 16:59, Matt Wilson <msw@amazon.com> wrote: > On Wed, Dec 05, 2012 at 10:44:04AM +0000, Andrew Cooper wrote: >> On 05/12/12 06:02, Matt Wilson wrote: >> > An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot, >> > but still have the flexibility to change the configuration later. >> > There''s no logic that keys off of domain->is_pinned outside of >> > sched_init_vcpu() and vcpu_set_affinity(). By adjusting the >> > is_pinned_vcpu() macro to only check for a single CPU set in the >> > cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system >> > boots. >> >> Sadly this patch will break things. There are certain callers of >> is_pinned_vcpu() which rely on the value to allow access to certain >> power related MSRs, which is where the requirement for never permitting >> an update of the affinity mask comes from. > > If this is true, the existing is_pinned_vcpu() test is broken: > > #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ > cpumask_weight((v)->cpu_affinity) == 1) > > It''s && not ||. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot, > the MSR traps will suddenly start working. > > See commit: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854ddI don''t see what''s wrong here. Certain things merely require the pCPU that a vCPU runs on to be stable, which is what the test above is for. Jan
Matt Wilson
2012-Dec-05 17:06 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
On Wed, Dec 05, 2012 at 04:25:36PM +0000, Jan Beulich wrote:> >>> On 05.12.12 at 16:59, Matt Wilson <msw@amazon.com> wrote: > > On Wed, Dec 05, 2012 at 10:44:04AM +0000, Andrew Cooper wrote: > >> On 05/12/12 06:02, Matt Wilson wrote: > >> > An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot, > >> > but still have the flexibility to change the configuration later. > >> > There''s no logic that keys off of domain->is_pinned outside of > >> > sched_init_vcpu() and vcpu_set_affinity(). By adjusting the > >> > is_pinned_vcpu() macro to only check for a single CPU set in the > >> > cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system > >> > boots. > >> > >> Sadly this patch will break things. There are certain callers of > >> is_pinned_vcpu() which rely on the value to allow access to certain > >> power related MSRs, which is where the requirement for never permitting > >> an update of the affinity mask comes from. > > > > If this is true, the existing is_pinned_vcpu() test is broken: > > > > #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ > > cpumask_weight((v)->cpu_affinity) == 1) > > > > It''s && not ||. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot, > > the MSR traps will suddenly start working. > > > > See commit: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854dd > > I don''t see what''s wrong here. Certain things merely require the > pCPU that a vCPU runs on to be stable, which is what the test > above is for.Me either. That said, are you willing to Ack and commit my patch that started this thread? Thanks, Matt
George Dunlap
2012-Dec-05 17:16 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
On Wed, Dec 5, 2012 at 5:06 PM, Matt Wilson <msw@amazon.com> wrote:> Me either. That said, are you willing to Ack and commit my patch that > started this thread? >FWIW, I''m OK with the scheduler side of things; can''t comment on the MSR issue: Acked-by: George Dunlap <george.dunlap@eu.citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Jan Beulich
2012-Dec-06 10:44 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
>>> On 05.12.12 at 18:06, Matt Wilson <msw@amazon.com> wrote: > On Wed, Dec 05, 2012 at 04:25:36PM +0000, Jan Beulich wrote: >> >>> On 05.12.12 at 16:59, Matt Wilson <msw@amazon.com> wrote: >> > On Wed, Dec 05, 2012 at 10:44:04AM +0000, Andrew Cooper wrote: >> >> On 05/12/12 06:02, Matt Wilson wrote: >> >> > An administrator may want Xen to pin dom0 vCPUs to pCPUs 1:1 at boot, >> >> > but still have the flexibility to change the configuration later. >> >> > There''s no logic that keys off of domain->is_pinned outside of >> >> > sched_init_vcpu() and vcpu_set_affinity(). By adjusting the >> >> > is_pinned_vcpu() macro to only check for a single CPU set in the >> >> > cpu_affinity mask, dom0 vCPUs can safely be re-pinned after the system >> >> > boots. >> >> >> >> Sadly this patch will break things. There are certain callers of >> >> is_pinned_vcpu() which rely on the value to allow access to certain >> >> power related MSRs, which is where the requirement for never permitting >> >> an update of the affinity mask comes from. >> > >> > If this is true, the existing is_pinned_vcpu() test is broken: >> > >> > #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ >> > cpumask_weight((v)->cpu_affinity) == 1) >> > >> > It''s && not ||. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot, >> > the MSR traps will suddenly start working. >> > >> > See commit: > http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854dd >> >> I don''t see what''s wrong here. Certain things merely require the >> pCPU that a vCPU runs on to be stable, which is what the test >> above is for. > > Me either. That said, are you willing to Ack and commit my patch that > started this thread?In no case without Andrew''s concerns addressed. Beyond that, I''d be hesitant to ack it as I''m myself suspecting side effects that we don''t want and/or aren''t aware of, and in no case could I commit it without Keir''s ack. Jan
Matt Wilson
2012-Dec-10 22:01 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
On Thu, 6 Dec 2012 10:44:04 +0000, Jan Beulich <JBeulich@suse.com> wrote:> On Wed, 5 Dec 2012 09:06:20 -0800, Matt Wilson <msw@amazon.com> wrote: > > On Wed, 5 Dec 2012 16:25:36 +0000, Jan Beulich <JBeulich@suse.com> wrote: > > > On Wed, 5 Dec 2012 07:59:08 -0800, Matt Wilson <msw@amazon.com> wrote: > > > > > > > > If this is true, the existing is_pinned_vcpu() test is broken: > > > > > > > > #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \ > > > > cpumask_weight((v)->cpu_affinity) == 1) > > > > > > > > It''s && not ||. So if someone pins dom0 vCPUs to pCPUs 1:1 after boot, > > > > the MSR traps will suddenly start working. > > > > > > > > See commit: > > > > http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=cc0854dd > > > > > > I don''t see what''s wrong here. Certain things merely require the > > > pCPU that a vCPU runs on to be stable, which is what the test > > > above is for. > > > > Me either. That said, are you willing to Ack and commit my patch that > > started this thread? > > In no case without Andrew''s concerns addressed. Beyond that, > I''d be hesitant to ack it as I''m myself suspecting side effects that > we don''t want and/or aren''t aware of, and in no case could I > commit it without Keir''s ack.Jan, So today if I boot Xen without dom0_vcpus_pin set, dom0''s vCPUs will be allowed to run on any pCPU. Xen will block attempts to write certain MSRs (MSR_AMD64_NB_CFG, MSR_FAM10H_MMIO_CONF_BASE and MSR_IA32_ENERGY_PERF_BIAS). The VCPUOP_get_physid subop of the vcpu_op hypercall will not return the initial APIC ID or ACPI ID for dom0. Also today, if I run "xl vcpu-pin 0 0", suddenly those MSR writes and the VCPUOP_get_physid hypercall will start working for vCPU 0. For what it''s worth, only legacy XenoLinux-derived kernels appear to use this hypercall during SMP boot. Upstream Linux does not. I think that the real risk is in the XenoLinux SMP booting code on AMD processors where sometimes initial APIC ID != ACPI ID. If the CPU pinning changes, the ACPI ID to APIC ID mapping will be wrong. This broke PowerNow! when it ran in dom0. But PowerNow! is handled by the hypervisor now. So what''s the real danger here? Andrew, your thoughts? Thanks, Matt
Jan Beulich
2012-Dec-11 10:43 UTC
Re: [PATCH] xen: schedule: allow dom0 vCPUs to be re-pinned when dom0_vcpus_pin is set
>>> On 10.12.12 at 23:01, Matt Wilson <msw@amazon.com> wrote: > So today if I boot Xen without dom0_vcpus_pin set, dom0''s vCPUs will > be allowed to run on any pCPU. Xen will block attempts to write > certain MSRs (MSR_AMD64_NB_CFG, MSR_FAM10H_MMIO_CONF_BASE and > MSR_IA32_ENERGY_PERF_BIAS). The VCPUOP_get_physid subop of the vcpu_op > hypercall will not return the initial APIC ID or ACPI ID for dom0. > > Also today, if I run "xl vcpu-pin 0 0", suddenly those MSR writes and > the VCPUOP_get_physid hypercall will start working for vCPU 0. ForExactly as intended.> what it''s worth, only legacy XenoLinux-derived kernels appear to use > this hypercall during SMP boot. Upstream Linux does not.Because that''s being used for the "cpufreq=dom0" case, which I don''t think the pv-ops kernel really ever tried to support.> I think that the real risk is in the XenoLinux SMP booting code on AMD > processors where sometimes initial APIC ID != ACPI ID. If the CPU > pinning changes, the ACPI ID to APIC ID mapping will be wrong. This > broke PowerNow! when it ran in dom0. > > But PowerNow! is handled by the hypervisor now. So what''s the real > danger here?I never liked the idea of "cpufreq=dom0", and our kernels intentionally make it impossible to be used (i.e. much like pv-ops). But for there potentially being people wanting this, we can''t simply rip it out. The real danger is with exactly what you describe - a careless caller using the result of any of the then possible operations in a context where they''re not valid anymore. And there are use cases outside of Dom0-based power management: If carefully done, this allows implementation of a kernel driver (extension) similar to x86''s msr.ko, allowing access to the physical CPU''s MSRs. And properly implementing the dcdbas, coretemp, and via-cputemp drivers. Jan