Wei, Gang
2010-Mar-19 08:22 UTC
[Xen-devel] Optimize credit scheduler __runq_tickle to reduce IPIs
I have found that, in the case of multiple idle VMs, there are a lot of break events coming from IPIs which are used to raise SCHEDULE_SOFTIRQ to wake up idle cpus to do load balancing -- csched_vcpu_wake -> __runq_tickle -> cpumask_raise_softirq. In __runq_tickle(), if there are at least two runnable vcpus, it will try to tickle all idle cpus which have affinity with the waking vcpu, to let them pull this vcpu away.

I am thinking about an optimization: limiting the number of idle cpus tickled for vcpu migration purposes to ONLY ONE, to get rid of a lot of IPI events which may impact the average cpu idle residency time.

There are two concerns about this optimization:
1. If the single target cpu fails to pull this vcpu (for example because it has just been scheduled for another vcpu), this vcpu may stay on the original cpu for a long period, until it suspends and wakes up again, leaving the system's cpus unbalanced.
2. If first_cpu() were used to choose the target among all possible idle cpus, would it cause overall unbalanced cpu utilization, i.e. cpu 0 > cpu 1 > ... > cpu N?

Do my concerns make sense? Any comments, suggestions, ...?

Jimmy
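For reference, the tickle-all-idlers path under discussion looks roughly like the following. This is a simplified sketch paraphrasing __runq_tickle() in xen/common/sched_credit.c, not the verbatim source; 'cpu' is the pcpu where vcpu 'new' was just woken, 'cur' the vcpu currently running there, and 'mask' a local cpumask_t:

    /* Sketch: when cpu already has runnable work (cur is not the idle
     * vcpu), EVERY idle pcpu in new's affinity gets SCHEDULE_SOFTIRQ
     * raised on it, i.e. one IPI per idle pcpu. */
    cpus_clear(mask);
    if ( new->pri > cur->pri )
        cpu_set(cpu, mask);               /* preempt the running vcpu */
    if ( cur->pri > CSCHED_PRI_IDLE )     /* >= 2 runnable vcpus here */
    {
        CSCHED_STAT_CRANK(tickle_idlers_some);
        cpus_or(mask, mask, csched_priv.idlers);       /* all idlers */
        cpus_and(mask, mask, new->vcpu->cpu_affinity); /* in affinity */
    }
    if ( !cpus_empty(mask) )
        cpumask_raise_softirq(mask, SCHEDULE_SOFTIRQ); /* one IPI each */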
Jan Beulich
2010-Mar-19 09:49 UTC
Re: [Xen-devel] Optimize credit scheduler __runq_tickle to reduce IPIs
>>> "Wei, Gang" <gang.wei@intel.com> 19.03.10 09:22 >>> >2. if first_cpu() was used as the way to choose the target among all possible idle cpus, will it cause overall unbalanced cpu utilization? i.e. cpu 0 > cpu 1 > ... > cpu NThis can be easily avoided by using cycle_cpu() and tracking the last used CPU on e.g. a per-pCPU basis (similar to idle_bias used in _csched_cpu_pick()). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Wei, Gang
2010-Mar-19 09:55 UTC
RE: [Xen-devel] Optimize credit scheduler __runq_tickle to reduce IPIs
Jan Beulich wrote:
>>>> "Wei, Gang" <gang.wei@intel.com> 19.03.10 09:22 >>>
>> 2. if first_cpu() was used as the way to choose the target among all
>> possible idle cpus, will it cause overall unbalanced cpu
>> utilization? i.e. cpu 0 > cpu 1 > ... > cpu N
>
> This can be easily avoided by using cycle_cpu() and tracking the last
> used CPU on e.g. a per-pCPU basis (similar to idle_bias used in
> _csched_cpu_pick()).

Good suggestion, thanks.

Jimmy
Wei, Gang
2010-Apr-02 07:03 UTC
[Xen-devel] [PATCH] CSCHED: Optimize __runq_tickle to reduce IPIs
Since there have been no concerns or objections yet, here is my implementation of this optimization. Thanks to Jan for recommending the cycle_cpu() approach.

Jimmy

CSCHED: Optimize __runq_tickle to reduce IPIs

Limit the number of idle cpus tickled for vcpu migration purposes to ONLY ONE, to get rid of a lot of IPI events which may impact the average cpu idle residency time. The optimization is on by default; the boot option "tickle_one_idle_cpu=0" can be used to disable it if needed.

Signed-off-by: Wei Gang <gang.wei@intel.com>

diff -r dbb473bba30b xen/common/sched_credit.c
--- a/xen/common/sched_credit.c Fri Apr 02 10:45:39 2010 +0800
+++ b/xen/common/sched_credit.c Fri Apr 02 11:43:56 2010 +0800
@@ -228,6 +228,11 @@ static void burn_credits(struct csched_v
     svc->start_time += (credits * MILLISECS(1)) / CSCHED_CREDITS_PER_MSEC;
 }
 
+static int opt_tickle_one_idle __read_mostly = 1;
+boolean_param("tickle_one_idle_cpu", opt_tickle_one_idle);
+
+DEFINE_PER_CPU(unsigned int, last_tickle_cpu) = 0;
+
 static inline void
 __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
 {
@@ -265,8 +270,21 @@ __runq_tickle(unsigned int cpu, struct c
     }
     else
     {
-        CSCHED_STAT_CRANK(tickle_idlers_some);
-        cpus_or(mask, mask, csched_priv.idlers);
+        cpumask_t idle_mask;
+
+        cpus_and(idle_mask, csched_priv.idlers, new->vcpu->cpu_affinity);
+        if ( !cpus_empty(idle_mask) )
+        {
+            CSCHED_STAT_CRANK(tickle_idlers_some);
+            if ( opt_tickle_one_idle )
+            {
+                this_cpu(last_tickle_cpu) =
+                    cycle_cpu(this_cpu(last_tickle_cpu), idle_mask);
+                cpu_set(this_cpu(last_tickle_cpu), mask);
+            }
+            else
+                cpus_or(mask, mask, idle_mask);
+        }
         cpus_and(mask, mask, new->vcpu->cpu_affinity);
     }
 }
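For anyone who wants to compare the two behaviours: boolean_param() makes tickle_one_idle_cpu a hypervisor boot parameter, so the old tickle-all-idlers behaviour can be restored by booting Xen with the option disabled, e.g. in a GRUB legacy entry along these lines (paths illustrative, not from this patch):

    kernel /boot/xen.gz tickle_one_idle_cpu=0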