Hi Emmanuel,

I am trying to debug the credit scheduler to solve the many HVM domain
instability issues we have found with it. While debugging I noticed an odd
behavior: when running on a 2-CPU system, dom0 gets 2 vcpus by default, and
even if there are no other domains running in the system, the dom0 vcpus get
migrated to different pcpus by the load balancer. I think it is due to the
preemption happening in the credit scheduler; it is unnecessary, and actually
wasteful, to move vcpus when the number of vcpus in the system is equal to
the number of pcpus.

I would like to know your thinking about this behavior. Is it intended in
the design?

I added this small fix to the scheduler to address the behavior, and with it
I see the stability of Xen improve. Win2003 boot was previously crashing with
an unhandled MMIO error on xen64 with the credit scheduler; I am not seeing
that crash with this small fix anymore. It is quite possible that there are
more bugs I need to catch for HVM domains in the credit scheduler. I would
like to know your thoughts on this change.

    csched_runq_steal(struct csched_pcpu *spc, int cpu, int pri)
    {
        struct list_head *iter;
        struct csched_vcpu *speer;
        struct vcpu *vc;

        /* If there is only 1 vcpu in the queue then stealing it from the
         * queue is not going to help in load balancing. */
        if (spc->runq.next->next == &spc->runq)
            return NULL;

Thanks & Regards,
Nitin
-----------------------------------------------------------------------------------
Open Source Technology Center, Intel Corp
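For reference, the check above relies on the usual circular-list property
that a list whose first element points straight back to the head holds at
most one entry. A minimal, self-contained sketch of that test (the list type
below is a stand-in for illustration, not Xen's own list_head code):

    #include <stdio.h>

    /* Stand-in for a circular doubly-linked list head, as used by runqueues. */
    struct list_head {
        struct list_head *next, *prev;
    };

    /* True when the list holds zero or one entries: with at most one entry,
     * head->next->next wraps straight back to the head. This is the condition
     * the proposed csched_runq_steal check tests. */
    static int runq_has_at_most_one(const struct list_head *head)
    {
        return head->next->next == head;
    }

    int main(void)
    {
        struct list_head head, a, b;

        head.next = head.prev = &head;                   /* empty list */
        printf("%d\n", runq_has_at_most_one(&head));     /* prints 1 */

        head.next = &a; a.prev = &head; a.next = &head; head.prev = &a;
        printf("%d\n", runq_has_at_most_one(&head));     /* prints 1: one entry */

        a.next = &b; b.prev = &a; b.next = &head; head.prev = &b;
        printf("%d\n", runq_has_at_most_one(&head));     /* prints 0: two entries */

        return 0;
    }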
On 30 Jun 2006, at 02:13, Kamble, Nitin A wrote:

> While debugging I noticed an odd behavior: when running on a 2-CPU
> system, dom0 gets 2 vcpus by default, and even if there are no other
> domains running in the system, the dom0 vcpus get migrated to different
> pcpus by the load balancer. I think it is due to the preemption
> happening in the credit scheduler; it is unnecessary, and actually
> wasteful, to move vcpus when the number of vcpus in the system is equal
> to the number of pcpus.
>
> I would like to know your thinking about this behavior. Is it intended
> in the design?
>
> I added this small fix to the scheduler to address the behavior, and
> with it I see the stability of Xen improve. Win2003 boot was previously
> crashing with an unhandled MMIO error on xen64 with the credit
> scheduler; I am not seeing that crash with this small fix anymore. It is
> quite possible that there are more bugs I need to catch for HVM domains
> in the credit scheduler. I would like to know your thoughts on this
> change.

Although you may have spotted a performance bug that should be looked
into (if this migration happens significantly frequently), it *should
not* be a correctness bug! I don't think the fixes for the
HVM-and-credit-sched bugs lie in the credit scheduler itself. :-)

 -- Keir
Hi Nitin,

On Thu, Jun 29, 2006 at 06:13:51PM -0700, Kamble, Nitin A wrote:
> I am trying to debug the credit scheduler to solve the many HVM domain
> instability issues we have found with it.

Great. As Keir pointed out, though, the problems you are seeing may not
actually be in the credit scheduler itself.

> While debugging I noticed an odd behavior: when running on a 2-CPU
> system, dom0 gets 2 vcpus by default, and even if there are no other
> domains running in the system, the dom0 vcpus get migrated to different
> pcpus by the load balancer. I think it is due to the preemption
> happening in the credit scheduler; it is unnecessary, and actually
> wasteful, to move vcpus when the number of vcpus in the system is equal
> to the number of pcpus.
>
> I would like to know your thinking about this behavior. Is it intended
> in the design?

This should be very rare. If a VCPU were woken up and put on the runq of
an idle CPU, a peer physical CPU that is in the scheduler code at that
exact time could potentially pick up the just-woken VCPU.

We can do things to shorten this window, like not picking up a VCPU from
a remote CPU that is currently idle and therefore probably racing with us
to run said newly woken VCPU from its runq. But I'm not sure this happens
frequently enough to warrant the added complexity. On top of that, it
seems to me this is more likely to happen to VCPUs that aren't doing very
much work and therefore would not suffer a performance loss from
migrating physical CPU on occasion.

Are you seeing a lot of these migrations?

> I added this small fix to the scheduler to address the behavior, and
> with it I see the stability of Xen improve. Win2003 boot was previously
> crashing with an unhandled MMIO error on xen64 with the credit
> scheduler; I am not seeing that crash with this small fix anymore. It is
> quite possible that there are more bugs I need to catch for HVM domains
> in the credit scheduler. I would like to know your thoughts on this
> change.

I don't agree with this change.

When a VCPU is the only member of a CPU's runq, it's still waiting for a
_running_ VCPU to yield or block. We should absolutely be picking up such
a VCPU to run elsewhere on an idle CPU. Else, you'd end up with two VCPUs
time-slicing on a processor while other processors in the system are
idle.

Your change effectively turns off migration on systems where the number
of active VCPUs is less than 2 multiplied by the number of physical CPUs.
I can see why that would hide any bugs in the context-migration paths,
but that doesn't make it right. :-)
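As an illustration of the window-shortening idea mentioned above (skipping
peers that are currently idle when looking for a VCPU to steal), a minimal
sketch might look like the following; the names peer_state, CPU_IDLE and
worth_stealing_from are invented for the sketch and are not the actual
credit scheduler data structures:

    /* Sketch only: skip stealing from a peer CPU that is itself idle, on
     * the assumption that it is about to run the VCPU it just woke anyway. */
    enum peer_state { CPU_IDLE, CPU_BUSY };

    static int worth_stealing_from(const enum peer_state *state, int peer_cpu)
    {
        if ( state[peer_cpu] == CPU_IDLE )
            return 0;  /* likely racing with the idle peer; leave its runq alone */
        return 1;      /* busy peer: anything queued there really is waiting */
    }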
Keir, Emmanuel,

Thanks for the detailed answers and your views. I agree that my small
change should not affect correctness. I didn't see the migration often; I
saw the dom0 vcpu migrations happen 4 or 5 times from boot to the start of
xend. I think we should avoid these migrations; why waste the cache
hotness?

How solid is the credit scheduler now for DomUs on an SMP box, on 32-bit,
PAE and 64-bit? It would be a useful data point for me in debugging the
HVM guest issues with the credit scheduler.

Thanks & Regards,
Nitin
-----------------------------------------------------------------------------------
Open Source Technology Center, Intel Corp
> Keir, Emmanuel,
> Thanks for the detailed answers and your views. I agree that my small
> change should not affect correctness. I didn't see the migration often;
> I saw the dom0 vcpu migrations happen 4 or 5 times from boot to the
> start of xend. I think we should avoid these migrations; why waste the
> cache hotness?

5 migrations in the space of a couple of minutes is going to add
absolutely no measurable overhead. It's certainly not worth adding code to
fix this. However, it's well worth trying to construct other scenarios to
see if it is possible to provoke thrashing (I would define thrashing as
events occurring more than e.g. 5-10 times a second).

> How solid is the credit scheduler now for DomUs on an SMP box, on
> 32-bit, PAE and 64-bit? It would be a useful data point for me in
> debugging the HVM guest issues with the credit scheduler.

I'm not aware of any issues, at least with the default parameters.

Ian
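To check whether a scenario actually provokes thrashing at the 5-10 events
per second level suggested above, one could keep a simple per-vcpu
migration-rate counter. A self-contained sketch follows; the threshold, the
struct, and the nanosecond timestamp parameter are assumptions for the
sketch, not existing Xen counters:

    #include <stdint.h>

    /* Sketch of a migration-rate check: count migrations in a one-second
     * window and flag the VCPU as thrashing above a chosen threshold. */
    #define THRASH_THRESHOLD  5              /* migrations/second, per the rough bound above */
    #define WINDOW_NS         1000000000ULL  /* one second in nanoseconds */

    struct migration_stats {
        uint64_t window_start_ns;
        unsigned int migrations_in_window;
    };

    /* Call on every migration; returns 1 if the rate looks like thrashing. */
    static int note_migration(struct migration_stats *s, uint64_t now_ns)
    {
        if ( now_ns - s->window_start_ns >= WINDOW_NS )
        {
            s->window_start_ns = now_ns;
            s->migrations_in_window = 0;
        }
        return ++s->migrations_in_window > THRASH_THRESHOLD;
    }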
On Fri, Jun 30, 2006 at 12:23:47PM -0700, Kamble, Nitin A wrote:
> How solid is the credit scheduler now for DomUs on an SMP box, on
> 32-bit, PAE and 64-bit? It would be a useful data point for me in
> debugging the HVM guest issues with the credit scheduler.

It's been pretty solid on 32, pae, and 64bit for a while.
> Although you may have spotted a performance bug that should be looked
> into (if this migration happens significantly frequently), it *should
> not* be a correctness bug! I don't think the fixes for the
> HVM-and-credit-sched bugs lie in the credit scheduler itself. :-)

Hi Keir,

We just found the root cause of IA32 HVM guests not being able to run on a
64-bit host with the default credit scheduler. The credit scheduler
migrates an HVM vcpu between logical processors from time to time, but in
__context_switch on the x86_64 hypervisor we have:

    if ( !is_idle_vcpu(p) )
    {
        memcpy(&p->arch.guest_context.user_regs,
               stack_regs,
               CTXT_SWITCH_STACK_BYTES);
        unlazy_fpu(p);
        p->arch.ctxt_switch_from(p);
    }

    if ( !is_idle_vcpu(n) )
    {
        memcpy(stack_regs,
               &n->arch.guest_context.user_regs,
               CTXT_SWITCH_STACK_BYTES);

and CTXT_SWITCH_STACK_BYTES is defined as (offsetof(struct cpu_user_regs,
es)).

This definition of CTXT_SWITCH_STACK_BYTES on x86_64 is OK for 64-bit
guests, whether para or HVM, but for 32-bit HVM guests it is buggy, since
the segment registers are skipped. I'd prefer to define
CTXT_SWITCH_STACK_BYTES as sizeof(struct cpu_user_regs). What is your
opinion?

Thanks
-Xin
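For illustration, the proposed change amounts to the following; the struct
layout shown is a simplified stand-in for Xen's struct cpu_user_regs, kept
only to show why the offsetof() bound stops short of the selector fields:

    #include <stddef.h>

    /* Simplified stand-in: general-purpose and frame state first, segment
     * selectors last, mirroring the x86_64 cpu_user_regs ordering. */
    struct cpu_user_regs_sketch {
        unsigned long rax, rbx, rcx, rdx, rsi, rdi, rbp;
        unsigned long rip, rflags, rsp;
        unsigned short es, ds, fs, gs;   /* left out by the offsetof() bound */
    };

    /* Current definition: the memcpy in __context_switch stops just before
     * 'es', so selector state is not carried across a CPU migration. */
    #define CTXT_SWITCH_STACK_BYTES_CURRENT \
        (offsetof(struct cpu_user_regs_sketch, es))

    /* Proposed definition: copy the whole frame, selectors included, which
     * is what 32-bit HVM guests need. */
    #define CTXT_SWITCH_STACK_BYTES_PROPOSED \
        (sizeof(struct cpu_user_regs_sketch))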