Atsushi SAKAI
2007-Jul-10 08:47 UTC
[Xen-devel] [PATCH] vcpu pin weight consideration TAKE3
Hi, Keir and Emmanuel

This patch intends to correct the weight treatment for vcpu-pin.
Would you give me a comment on this?

 sched_credit.c |  115 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 114 insertions(+), 1 deletion(-)

The current algorithm is:

1) Calculate pcpu_weight based on all active vcpus:
     vcpu_weight = sdom->weight / sdom->online_vcpu_count (newly added variable),
   accumulated on v->processor.
   (This routine runs when the vcpu count changes, not every 30msec.)

2) Calculate vcpupin_factor based on the average of
   vcpu-pinned pcpu_weight / (csched_priv.weight / num_online_cpus()).

3) Apply the above factor when credit is added in the vcpu-pin case,
   at atomic_add(credit_fair, &svc->credit);

Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com>

Differences from the previous patch (+365 lines) are:

1) The credit_balance consideration factor is omitted (about -150 lines).
2) The detailed pcpu_weight calculation is changed (about -100 lines);
   it currently uses v->processor instead of vcpu->cpu_affinity.

Thanks,
Atsushi SAKAI
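As a rough illustration of the three steps described above, here is a small standalone C model. The WEIGHT_SCALE fixed-point factor, the struct layout, and the example numbers are assumptions made for illustration; they are not taken from vcpupin0710.patch.

/*
 * Standalone model of the algorithm sketched above.  The fixed-point
 * scale, struct layout and example numbers are illustrative guesses,
 * not the contents of the actual patch.
 */
#include <stdio.h>

#define NR_PCPUS     4
#define WEIGHT_SCALE 1000                 /* hypothetical fixed-point scale */

struct dom {
    int weight;                           /* models sdom->weight            */
    int online_vcpus;                     /* models sdom->online_vcpu_count */
    int vcpu_pcpu[8];                     /* models v->processor per vcpu   */
};

/* Step 1: spread the domain weight evenly over its online vcpus and
 * accumulate each vcpu's share on the pcpu it last ran on.             */
static void sum_pcpu_weight(const struct dom *d, int pcpu_weight[NR_PCPUS])
{
    int per_vcpu = d->weight / d->online_vcpus;
    for (int i = 0; i < d->online_vcpus; i++)
        pcpu_weight[d->vcpu_pcpu[i]] += per_vcpu;
}

/* Step 2: vcpupin_factor = average pinned pcpu_weight divided by the
 * system-wide average weight per pcpu, in fixed point.                 */
static int vcpupin_factor(int pinned_pcpu_weight_avg, int total_weight)
{
    int system_avg = total_weight / NR_PCPUS;
    return pinned_pcpu_weight_avg * WEIGHT_SCALE / system_avg;
}

int main(void)
{
    /* Example: one 2-vcpu domain of weight 256, pinned to pcpu0-1 on a
     * 4-pcpu machine, with a total system weight of 512.               */
    struct dom d = { .weight = 256, .online_vcpus = 2, .vcpu_pcpu = { 0, 1 } };
    int pcpu_weight[NR_PCPUS] = { 0 };

    sum_pcpu_weight(&d, pcpu_weight);

    int factor = vcpupin_factor((pcpu_weight[0] + pcpu_weight[1]) / 2, 512);

    /* Step 3: scale the credit that would otherwise be handed out at
     * atomic_add(credit_fair, &svc->credit).                           */
    int credit_fair = 300;
    credit_fair = credit_fair * factor / WEIGHT_SCALE;
    printf("factor = %d/%d, scaled credit_fair = %d\n",
           factor, WEIGHT_SCALE, credit_fair);
    return 0;
}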
Emmanuel Ackaouy
2007-Jul-10 11:46 UTC
Re: [Xen-devel] [PATCH] vcpu pin weight consideration TAKE3
Hi.

Before I discuss the approach again, let me first comment on your patch:

I don't understand why you need to keep track of the subset of active
VCPUs which are online. An active VCPU is one which has recently
consumed credits. I don't understand the point of ignoring those which
are currently blocked when you look. Can you explain?

Now to step back a little and look at the approach itself:

Weights are useful to proportionally allocate credits to domains which
are competing for the _same_(!!!) CPU resources. When you restrict
where a VCPU can run, you break this assumption.

You are attempting to divide the total weight among different "zones"
by assigning a restricted VCPU's share of the total weight to the
physical CPU where it last ran. This will work as long as your distinct
"zones" do not overlap. When they do overlap, this will not work.

To be specific, in summing_vcpupin_weight(), you assign a restricted
VCPU's share of the total weight to the CPU it last ran on. This
assumes that all other competing VCPUs will also sum over that physical
CPU. This isn't true when distinct CPU affinity masks ("zones") can
overlap.

When all distinct CPU affinity masks do not overlap with each other,
the host system has essentially been partitioned.

Partitioning is an interesting subset of the general problem because:
- it is easily defined, and
- it has a simple and elegant solution.

Your solution only works when the system has been properly partitioned.
This is fine: the problem of handling overlapping CPU affinity masks is
not a tractable one anyway.

What I argue is that there is a cleaner approach to deal with the
partitioning of the host: for the most part, you only have to allow
multiple csched_private structures to co-exist. Each such group will
have a master CPU doing its accounting, just like today. The accounting
code will be left mostly untouched: you probably just need to AND your
group's assigned CPUs with the online mask here and there.

Keeping the accounting work the same and distributing it across CPUs is
more desirable than adding complexity while still keeping it on a
single CPU.

Personally, I think Xen would also benefit from having a clean
interface for managing CPU partitions, but this isn't strictly
necessary.

Unfortunately, having left XenSource some months ago, I have virtually
no time to spend on Xen. I would love to help review patches that add
partitioning support the way I have described: by allowing the
co-existence of independent scheduling "zones", each represented by its
own instance of the current csched_private structure. I appreciate the
work you are doing trying to tackle this problem, but I think your
patches are the wrong way forward. Given that, I'm not going to be able
to spend time looking at further re-spins of this patch.

Cheers,
Emmanuel.
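Below is a rough sketch of the partitioned-accounting idea Emmanuel outlines: one instance of the csched_private-style bookkeeping per non-overlapping CPU zone, each with its own master CPU, with the existing accounting pass restricted to the zone's CPUs ANDed with the online mask. The zone structure, the per-CPU zone table, and the helper names are assumptions for illustration; only csched_private itself is an existing Xen structure.

/*
 * Sketch of "one csched_private per partition".  The zone structure,
 * the per-cpu zone table and the helper names are illustrative only;
 * they are not existing Xen code.
 */
#include <stdint.h>

#define NR_CPUS 64

struct csched_zone {
    uint64_t cpus;          /* physical CPUs assigned to this zone          */
    int      master_cpu;    /* CPU that runs the 30ms accounting for it     */
    int      total_weight;  /* sum of weights of the domains in this zone   */
    /* ... remaining fields as in today's csched_private ...                */
};

/* Each physical CPU belongs to exactly one zone: zones do not overlap.     */
static struct csched_zone *cpu_zone[NR_CPUS];

/* The accounting pass stays essentially as it is today; it just restricts
 * itself to its own zone by ANDing the zone mask with the online mask and
 * uses the zone's total weight instead of the global csched_priv.weight.   */
static void csched_acct_zone(struct csched_zone *z, uint64_t online_mask)
{
    uint64_t cpus = z->cpus & online_mask;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!(cpus & (1ULL << cpu)))
            continue;
        /* ... per-cpu credit accounting as in csched_acct(), scoped to z ... */
    }
}

A partition created through whatever management interface is chosen would then just allocate one such zone, point the per-CPU table at it for its CPUs, and elect a master CPU, leaving the per-CPU accounting logic untouched.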
Atsushi SAKAI
2007-Jul-10 13:18 UTC
Re: [Xen-devel] [PATCH] vcpu pin weight consideration TAKE3
Hi, Emmanuel

Thank you for your comments on and suggestions for my patch. I reply
inline below. If a pcpu group (or partitioning) API is acceptable, that
is a simple way to solve this.

Emmanuel Ackaouy <ackaouy@gmail.com> wrote:
> Hi.
>
> Before I discuss the approach again, let me first comment on your
> patch:
>
> I don't understand why you need to keep track of the subset of
> active VCPUs which are online. An active VCPU is one which
> has recently consumed credits. I don't understand the point of
> ignoring those which are currently blocked when you look. Can
> you explain?

These are not the blocked vcpus but the _VPF_down ones. Blocked vcpus
are still counted in this code (in online_vcpu_count), so this accounts
for the online vcpus being reduced by xm vcpu-set. (A small standalone
sketch of this counting follows this message.)

> Now to step back a little and look at the approach itself:
>
> Weights are useful to proportionally allocate credits to domains
> which are competing for the _same_(!!!) CPU resources. When
> you restrict where a VCPU can run, you break this assumption.

My case is for pcpu.credit (the sum of vcpu.credit) > 30msec. To make
the weighting appropriate, I reduce each vcpu credit by the same ratio.

> You are attempting to divide the total weight among different
> "zones" by assigning a restricted VCPU's share of the total
> weight to the physical CPU where it last ran. This will work
> as long as your distinct "zones" do not overlap. When they
> do overlap, this will not work.

Yes, you are right. I was considering the view of the vcpu-pinned
domains. (For reference, my configuration pins domU1 and domU2 to
pcpu0-1 only, on a 4-pcpu machine.) From the non-pinned domains' view,
the skipped weight should be given to the non-pinned vcpus. The current
patch gives the non-pinned vcpus the same CPU resources once their
credit is consumed (since the credit is shortened by the vcpu-pinned
pcpu.credit > 30msec). Instead, vcpu credit should be given based on
vcpu weight. I will fix it. Thank you for your suggestion.

> To be specific, in summing_vcpupin_weight(), you assign a
> restricted VCPU's share of the total weight to the CPU it last
> ran on. This assumes that all other competing VCPUs will
> also sum over that physical CPU. This isn't true when distinct
> CPU affinity masks ("zones") can overlap.

Yes. For correct handling of this, the previous TAKE2 patch calculated
each pcpu weight based on the vcpu-pin information. In this TAKE3
patch, some vcpu-pin overlap cases are problematic (though not all of
them). In that sense, this patch did not aim at partitioning in the
first place.

> When all distinct CPU affinity masks do not overlap with each
> other, the host system has essentially been partitioned.
>
> Partitioning is an interesting subset of the general problem
> because:
> - it is easily defined, and
> - it has a simple and elegant solution.
>
> Your solution only works when the system has been properly partitioned.
> This is fine: the problem of handling overlapping CPU affinity masks is
> not a tractable one anyway.

For proper handling of this, the TAKE2 patch had the vcpu-pin weight
handling. Anyway, CPU grouping is another way to solve it. If it is
acceptable, I am willing to re-write the patch that way.
Thanks,
Atsushi SAKAI
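To make the online-vcpu distinction described above concrete, here is a tiny standalone model: a vcpu taken offline with xm vcpu-set (the _VPF_down case) is excluded from online_vcpu_count, while a merely blocked vcpu still counts. The flag bits and struct below are made up for illustration and do not reproduce Xen's vcpu structure or the patch's actual code.

/*
 * Standalone model of the online_vcpu_count distinction: a vcpu offlined
 * with "xm vcpu-set" is excluded, a blocked vcpu still counts.  The flag
 * bits and struct layout are illustrative only.
 */
#include <stdio.h>

#define VPF_DOWN     (1u << 0)   /* offlined via vcpu-set (models _VPF_down) */
#define VPF_BLOCKED  (1u << 1)   /* waiting on an event, still "online"      */

struct model_vcpu {
    unsigned int flags;
};

static int count_online_vcpus(const struct model_vcpu *vcpus, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (!(vcpus[i].flags & VPF_DOWN))   /* blocked vcpus still count */
            count++;
    return count;
}

int main(void)
{
    struct model_vcpu vcpus[4] = {
        { 0 }, { VPF_BLOCKED }, { VPF_DOWN }, { 0 },
    };
    /* Prints 3: the blocked vcpu counts, the offlined one does not. */
    printf("online_vcpu_count = %d\n", count_online_vcpus(vcpus, 4));
    return 0;
}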