Atsushi SAKAI
2007-Jul-10 08:47 UTC
[Xen-devel] [PATCH] vcpu pin weight consideration TAKE3
Hi, Keir and Emmanuel

This patch intends to correct the weight treatment for vcpu-pin.
Would you give me a comment on this?

 sched_credit.c |  115 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 114 insertions(+), 1 deletion(-)

The current algorithm is:

1) Calculate pcpu_weight based on all active vcpus:
     vcpu_weight = sdom->weight / sdom->online_vcpu_count (newly added variable),
   accumulated on v->processor.
   (This routine runs when the vcpu count changes, not every 30msec.)

2) Calculate vcpupin_factor based on the average of
   vcpu-pinned pcpu_weight / (csched_priv.weight / num_online_cpus()).

3) Apply the above factor when credit is added in the vcpu-pin case,
   at atomic_add(credit_fair, &svc->credit);

Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com>

Differences from the previous patch (+365 lines) are:

1) The credit_balance consideration factor is omitted (about -150 lines).
2) The detailed pcpu_weight calculation is changed (about -100 lines);
   it currently uses v->processor instead of vcpu->cpu_affinity.

Thanks,
Atsushi SAKAI
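As a rough illustration of the three steps described above, here is a small standalone C model. The WEIGHT_SCALE fixed-point factor, the struct layout, and the example numbers are assumptions made for illustration; they are not taken from vcpupin0710.patch.

/*
 * Standalone model of the algorithm sketched above.  The fixed-point
 * scale, struct layout and example numbers are illustrative guesses,
 * not the contents of the actual patch.
 */
#include <stdio.h>

#define NR_PCPUS     4
#define WEIGHT_SCALE 1000                 /* hypothetical fixed-point scale */

struct dom {
    int weight;                           /* models sdom->weight            */
    int online_vcpus;                     /* models sdom->online_vcpu_count */
    int vcpu_pcpu[8];                     /* models v->processor per vcpu   */
};

/* Step 1: spread the domain weight evenly over its online vcpus and
 * accumulate each vcpu's share on the pcpu it last ran on.             */
static void sum_pcpu_weight(const struct dom *d, int pcpu_weight[NR_PCPUS])
{
    int per_vcpu = d->weight / d->online_vcpus;
    for (int i = 0; i < d->online_vcpus; i++)
        pcpu_weight[d->vcpu_pcpu[i]] += per_vcpu;
}

/* Step 2: vcpupin_factor = average pinned pcpu_weight divided by the
 * system-wide average weight per pcpu, in fixed point.                 */
static int vcpupin_factor(int pinned_pcpu_weight_avg, int total_weight)
{
    int system_avg = total_weight / NR_PCPUS;
    return pinned_pcpu_weight_avg * WEIGHT_SCALE / system_avg;
}

int main(void)
{
    /* Example: one 2-vcpu domain of weight 256, pinned to pcpu0-1 on a
     * 4-pcpu machine, with a total system weight of 512.               */
    struct dom d = { .weight = 256, .online_vcpus = 2, .vcpu_pcpu = { 0, 1 } };
    int pcpu_weight[NR_PCPUS] = { 0 };

    sum_pcpu_weight(&d, pcpu_weight);

    int factor = vcpupin_factor((pcpu_weight[0] + pcpu_weight[1]) / 2, 512);

    /* Step 3: scale the credit that would otherwise be handed out at
     * atomic_add(credit_fair, &svc->credit).                           */
    int credit_fair = 300;
    credit_fair = credit_fair * factor / WEIGHT_SCALE;
    printf("factor = %d/%d, scaled credit_fair = %d\n",
           factor, WEIGHT_SCALE, credit_fair);
    return 0;
}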
Emmanuel Ackaouy
2007-Jul-10 11:46 UTC
Re: [Xen-devel] [PATCH] vcpu pin weight consideration TAKE3
Hi.

Before I discuss the approach again, let me first comment on your patch:

I don't understand why you need to keep track of the subset of active
VCPUs which are online. An active VCPU is one which has recently
consumed credits. I don't understand the point of ignoring those which
are currently blocked when you look. Can you explain?

Now to step back a little and look at the approach itself:

Weights are useful to proportionally allocate credits to domains which
are competing for the _same_(!!!) CPU resources. When you restrict
where a VCPU can run, you break this assumption.

You are attempting to divide the total weight among different "zones"
by assigning a restricted VCPU's share of the total weight to the
physical CPU where it last ran. This will work as long as your distinct
"zones" do not overlap. When they do overlap, this will not work.

To be specific, in summing_vcpupin_weight(), you assign a restricted
VCPU's share of the total weight to the CPU it last ran on. This
assumes that all other competing VCPUs will also sum over that physical
CPU. This isn't true when distinct CPU affinity masks ("zones") can
overlap.

When all distinct CPU affinity masks do not overlap with each other,
the host system has essentially been partitioned.

Partitioning is an interesting subset of the general problem because:
- it is easily defined, and
- it has a simple and elegant solution.

Your solution only works when the system has been properly partitioned.
This is fine: the problem of handling overlapping CPU affinity masks is
not a tractable one anyway.

What I argue is that there is a cleaner approach to deal with the
partitioning of the host: for the most part, you only have to allow
multiple csched_private structures to co-exist. Each such group will
have a master CPU doing its accounting, just like today. The accounting
code will be left mostly untouched: you probably just need to AND your
group's assigned CPUs with the online mask here and there.

Keeping the accounting work the same and distributing it across CPUs is
more desirable than adding complexity while still keeping it on a
single CPU.

Personally, I think Xen would also benefit from having a clean
interface for managing CPU partitions, but this isn't strictly
necessary.

Unfortunately, having left XenSource some months ago, I have virtually
no time to spend on Xen. I would love to help review patches that add
partitioning support the way I have described: by allowing the
co-existence of independent scheduling "zones", each represented by its
own instance of the current csched_private structure. I appreciate the
work you are doing trying to tackle this problem, but I think your
patches are the wrong way forward. Given that, I'm not going to be able
to spend time looking at further re-spins of this patch.

Cheers,
Emmanuel.
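Below is a rough sketch of the partitioned-accounting idea Emmanuel outlines: one instance of the csched_private-style bookkeeping per non-overlapping CPU zone, each with its own master CPU, with the existing accounting pass restricted to the zone's CPUs ANDed with the online mask. The zone structure, the per-CPU zone table, and the helper names are assumptions for illustration; only csched_private itself is an existing Xen structure.

/*
 * Sketch of "one csched_private per partition".  The zone structure,
 * the per-cpu zone table and the helper names are illustrative only;
 * they are not existing Xen code.
 */
#include <stdint.h>

#define NR_CPUS 64

struct csched_zone {
    uint64_t cpus;          /* physical CPUs assigned to this zone          */
    int      master_cpu;    /* CPU that runs the 30ms accounting for it     */
    int      total_weight;  /* sum of weights of the domains in this zone   */
    /* ... remaining fields as in today's csched_private ...                */
};

/* Each physical CPU belongs to exactly one zone: zones do not overlap.     */
static struct csched_zone *cpu_zone[NR_CPUS];

/* The accounting pass stays essentially as it is today; it just restricts
 * itself to its own zone by ANDing the zone mask with the online mask and
 * uses the zone's total weight instead of the global csched_priv.weight.   */
static void csched_acct_zone(struct csched_zone *z, uint64_t online_mask)
{
    uint64_t cpus = z->cpus & online_mask;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!(cpus & (1ULL << cpu)))
            continue;
        /* ... per-cpu credit accounting as in csched_acct(), scoped to z ... */
    }
}

A partition created through whatever management interface is chosen would then just allocate one such zone, point the per-CPU table at it for its CPUs, and elect a master CPU, leaving the per-CPU accounting logic untouched.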
Atsushi SAKAI
2007-Jul-10 13:18 UTC
Re: [Xen-devel] [PATCH] vcpu pin weight consideration TAKE3
Hi, Emmanuel

Thank you for your comments on and suggestions for my patch. I reply
inline below. If a pcpu group (or partitioning) API is acceptable, that
is a simple way to solve this.

Emmanuel Ackaouy <ackaouy@gmail.com> wrote:
> Hi.
>
> Before I discuss the approach again, let me first comment on your
> patch:
>
> I don't understand why you need to keep track of the subset of
> active VCPUs which are online. An active VCPU is one which
> has recently consumed credits. I don't understand the point of
> ignoring those which are currently blocked when you look. Can
> you explain?

These are not the blocked vcpus but the _VPF_down ones. Blocked vcpus
are still counted in this code (in online_vcpu_count), so this accounts
for the online vcpus being reduced by xm vcpu-set. (A small standalone
sketch of this counting follows this message.)

> Now to step back a little and look at the approach itself:
>
> Weights are useful to proportionally allocate credits to domains
> which are competing for the _same_(!!!) CPU resources. When
> you restrict where a VCPU can run, you break this assumption.

My case is for pcpu.credit (the sum of vcpu.credit) > 30msec. To make
the weighting appropriate, I reduce each vcpu credit by the same ratio.

> You are attempting to divide the total weight among different
> "zones" by assigning a restricted VCPU's share of the total
> weight to the physical CPU where it last ran. This will work
> as long as your distinct "zones" do not overlap. When they
> do overlap, this will not work.

Yes, you are right. I was considering the view of the vcpu-pinned
domains. (For reference, my configuration pins domU1 and domU2 to
pcpu0-1 only, on a 4-pcpu machine.) From the non-pinned domains' view,
the skipped weight should be given to the non-pinned vcpus. The current
patch gives the non-pinned vcpus the same CPU resources once their
credit is consumed (since the credit is shortened by the vcpu-pinned
pcpu.credit > 30msec). Instead, vcpu credit should be given based on
vcpu weight. I will fix it. Thank you for your suggestion.

> To be specific, in summing_vcpupin_weight(), you assign a
> restricted VCPU's share of the total weight to the CPU it last
> ran on. This assumes that all other competing VCPUs will
> also sum over that physical CPU. This isn't true when distinct
> CPU affinity masks ("zones") can overlap.

Yes. For correct handling of this, the previous TAKE2 patch calculated
each pcpu weight based on the vcpu-pin information. In this TAKE3
patch, some vcpu-pin overlap cases are problematic (though not all of
them). In that sense, this patch did not aim at partitioning in the
first place.

> When all distinct CPU affinity masks do not overlap with each
> other, the host system has essentially been partitioned.
>
> Partitioning is an interesting subset of the general problem
> because:
> - it is easily defined, and
> - it has a simple and elegant solution.
>
> Your solution only works when the system has been properly partitioned.
> This is fine: the problem of handling overlapping CPU affinity masks is
> not a tractable one anyway.

For proper handling of this, the TAKE2 patch had the vcpu-pin weight
handling. Anyway, CPU grouping is another way to solve it. If it is
acceptable, I am willing to re-write the patch that way.
Thanks,
Atsushi SAKAI
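To make the online-vcpu distinction described above concrete, here is a tiny standalone model: a vcpu taken offline with xm vcpu-set (the _VPF_down case) is excluded from online_vcpu_count, while a merely blocked vcpu still counts. The flag bits and struct below are made up for illustration and do not reproduce Xen's vcpu structure or the patch's actual code.

/*
 * Standalone model of the online_vcpu_count distinction: a vcpu offlined
 * with "xm vcpu-set" is excluded, a blocked vcpu still counts.  The flag
 * bits and struct layout are illustrative only.
 */
#include <stdio.h>

#define VPF_DOWN     (1u << 0)   /* offlined via vcpu-set (models _VPF_down) */
#define VPF_BLOCKED  (1u << 1)   /* waiting on an event, still "online"      */

struct model_vcpu {
    unsigned int flags;
};

static int count_online_vcpus(const struct model_vcpu *vcpus, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (!(vcpus[i].flags & VPF_DOWN))   /* blocked vcpus still count */
            count++;
    return count;
}

int main(void)
{
    struct model_vcpu vcpus[4] = {
        { 0 }, { VPF_BLOCKED }, { VPF_DOWN }, { 0 },
    };
    /* Prints 3: the blocked vcpu counts, the offlined one does not. */
    printf("online_vcpu_count = %d\n", count_online_vcpus(vcpus, 4));
    return 0;
}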