Emmanuel, I found that unnecessary VCPU migration happens again.

My environment is IPF, two sockets, two cores per socket, 1 thread per core, so 4 cores in total.

There are 3 domains, all UP, so there are 3 VCPUs in total. One is domain0; the other two are VTI domains.

I found there are lots of migrations. They are caused by the code segment below in function csched_cpu_pick(). When I comment out this code segment, there are no migrations in the above environment.

A little analysis of this code: it handles multi-core and multi-thread, which is very good. If two VCPUs run on LPs (logical processors) belonging to the same core, performance is bad, so if there are free LPs we should let those two VCPUs run on different cores.

This code may work well with para-domains. A para-domain is seldom blocked; it may block when the guest executes a "halt" instruction. This means that if an idle VCPU is running on an LP, there is no non-idle VCPU tied to that LP, and in that environment I think the code below works well.

But in an HVM environment, the HVM VCPU is blocked on IO operations. That is to say, even while an idle VCPU is running on an LP, an HVM VCPU may be blocked there and will run on that LP again when it is woken up. In that environment the code below causes unnecessary migrations, so I don't think it reaches its goal.

On the IPF side, migration is time-consuming, so it causes some performance degradation.

I have a proposal, and it may not be good: we can change the meaning of idle-LP. An idle LP would mean that an idle VCPU is running on this LP and there is no VCPU blocked on this LP (i.e. no VCPU that will run on this LP when it is woken up).

--Anthony

        /*
         * In multi-core and multi-threaded CPUs, not all idle execution
         * vehicles are equal!
         *
         * We give preference to the idle execution vehicle with the most
         * idling neighbours in its grouping. This distributes work across
         * distinct cores first and guarantees we don't do something stupid
         * like run two VCPUs on co-hyperthreads while there are idle cores
         * or sockets.
         */
        while ( !cpus_empty(cpus) )
        {
            nxt = first_cpu(cpus);

            if ( csched_idler_compare(cpu, nxt) < 0 )
            {
                cpu = nxt;
                cpu_clear(nxt, cpus);
            }
            else if ( cpu_isset(cpu, cpu_core_map[nxt]) )
            {
                cpus_andnot(cpus, cpus, cpu_sibling_map[nxt]);
            }
            else
            {
                cpus_andnot(cpus, cpus, cpu_core_map[nxt]);
            }

            ASSERT( !cpu_isset(nxt, cpus) );
        }
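A minimal sketch of what the proposed idle-LP test could look like, assuming a hypothetical helper name and a hypothetical per-CPU count of VCPUs blocked on each LP (neither exists in the scheduler as posted; the counter would have to be maintained in the sleep and wake paths):

        /*
         * Sketch only: treat an LP as idle for placement purposes when the
         * idle VCPU is running there AND no blocked VCPU would land back on
         * it when woken.  nr_blocked_here is a hypothetical per-CPU counter,
         * incremented when a VCPU blocks with v->processor == cpu and
         * decremented when it is woken.
         */
        static inline int csched_lp_truly_idle(unsigned int cpu)
        {
            return is_idle_vcpu(per_cpu(schedule_data, cpu).curr) &&
                   (per_cpu(nr_blocked_here, cpu) == 0);
        }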
Emmanuel Ackaouy
2006-Dec-06 14:01 UTC
Re: [Xen-devel] unnecessary VCPU migration happens again
Hi Anthony.

Could you send xentrace output for scheduling operations in your setup?

Perhaps we're being a little too aggressive spreading work across sockets. We do this on vcpu_wake right now.

I'm not sure I understand why HVM VCPUs would block and wake more often than PV VCPUs though. Can you explain?

If you could gather some scheduler traces and send results, it will give us a good idea of what's going on and why. The multi-core support is new and not widely tested, so it's possible that it is being overly aggressive or perhaps even buggy.

Emmanuel.
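For reference, collecting such traces from dom0 would look roughly like the commands below. The event-mask value for the scheduler trace class is quoted from memory and may differ between Xen versions, so check TRC_SCHED in xen/include/public/trace.h before relying on it.

        # capture scheduler-class trace records (mask assumed; verify TRC_SCHED)
        xentrace -e 0x0002f000 /tmp/sched.bin
        # post-process with the formats file shipped in tools/xentrace
        xentrace_format formats < /tmp/sched.bin > /tmp/sched.txt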
Petersson, Mats
2006-Dec-06 14:13 UTC
RE: [Xen-devel] unnecessary VCPU migration happens again
> -----Original Message-----
> From: Emmanuel Ackaouy
> Sent: 06 December 2006 14:02
> Subject: Re: [Xen-devel] unnecessary VCPU migration happens again
>
> I'm not sure I understand why HVM VCPUs would block
> and wake more often than PV VCPUs though. Can you
> explain?

Whilst I don't know any of the facts of the original poster, I can tell you why HVM and PV guests have differing numbers of scheduling operations...

Every time you get an IOIO/MMIO vmexit that leads to a qemu-dm interaction, you'll get a context switch. So for an average IDE block read/write (for example) on x86, you get 4-5 IOIO intercepts to send the command to qemu, then an interrupt is sent to the guest to indicate that the operation is finished, followed by a 256 x 16-bit IO read/write of the sector content (which is normally just one IOIO intercept unless the driver is "stupid"). This means around a dozen or so schedule operations to do one disk IO operation.

The same operation in PV (or using a PV driver in an HVM guest, of course) would require a single transaction from DomU to Dom0 and back, so only two schedule operations.

The same "problem" occurs of course for other hardware devices such as network, keyboard and mouse, where a transaction consists of more than a single read or write to a single register.

-- Mats
Xu, Anthony
2006-Dec-07 03:37 UTC
RE: [Xen-devel] unnecessary VCPU migration happens again
Hi, thanks for your reply. Please see embedded comments.

Petersson, Mats wrote on December 6, 2006 22:14:
>> From: Emmanuel Ackaouy
>> Sent: 06 December 2006 14:02
>>
>> Could you send xentrace output for scheduling operations
>> in your setup?

I'm not sure xentrace works on the IPF side. I'm trying.

>> Perhaps we're being a little too aggressive spreading
>> work across sockets. We do this on vcpu_wake right now.

I think the logic below also does the spreading.

1. In csched_load_balance(), the code segment below sets the _VCPUF_migrating flag in peer_vcpu, as the comment says:

        /*
         * If we failed to find any remotely queued VCPUs to move here,
         * see if it would be more efficient to move any of the running
         * remote VCPUs over here.
         */

        /* Signal the first candidate only. */
        if ( !is_idle_vcpu(peer_vcpu) &&
             is_idle_vcpu(__runq_elem(spc->runq.next)->vcpu) &&
             __csched_running_vcpu_is_stealable(cpu, peer_vcpu) )
        {
            /* Flag the running remote VCPU for migration and kick its CPU. */
            set_bit(_VCPUF_migrating, &peer_vcpu->vcpu_flags);
            spin_unlock(&per_cpu(schedule_data, peer_cpu).schedule_lock);

            CSCHED_STAT_CRANK(steal_loner_signal);
            cpu_raise_softirq(peer_cpu, SCHEDULE_SOFTIRQ);
            break;
        }

2. When that peer_vcpu is scheduled out, the migration happens:

        void context_saved(struct vcpu *prev)
        {
            clear_bit(_VCPUF_running, &prev->vcpu_flags);

            /* A VCPU flagged for migration is moved as soon as it has been
             * context-switched out. */
            if ( unlikely(test_bit(_VCPUF_migrating, &prev->vcpu_flags)) )
                vcpu_migrate(prev);
        }

From this logic, migrations happen frequently when the number of VCPUs is less than the number of logical CPUs.

Anthony.

>> I'm not sure I understand why HVM VCPUs would block
>> and wake more often than PV VCPUs though. Can you
>> explain?
>
> Whilst I don't know any of the facts of the original poster, I can
> tell you why HVM and PV guests have differing numbers of scheduling
> operations...
>
> Every time you get an IOIO/MMIO vmexit that leads to a qemu-dm
> interaction, you'll get a context switch. [...] This means around a
> dozen or so schedule operations to do one disk IO operation.
>
> The same operation in PV (or using a PV driver in an HVM guest)
> would require a single transaction from DomU to Dom0 and back, so
> only two schedule operations.

What I want to highlight is: when an HVM VCPU is executing an IO operation, that HVM VCPU is blocked by the HV until the IO operation has been emulated by qemu; then the HV wakes the HVM VCPU up again. A PV VCPU is not blocked like this by the PV drivers.

I can give the scenario below.

There are two sockets, two cores per socket. Assume dom0 is running on socket1 core1, vti1 is running on socket1 core2, vti2 is running on socket2 core1, and socket2 core2 is idle.
If vti2 blocks on an IO operation, then socket2 core1 goes idle. That means both cores in socket2 are idle while dom0 and vti1 are running on the two cores of socket1.

The scheduler will then try to spread dom0 and vti1 across these two sockets, so a migration happens. This is not necessary.
Emmanuel Ackaouy
2006-Dec-07 10:37 UTC
Re: [Xen-devel] unnecessary VCPU migration happens again
On Thu, Dec 07, 2006 at 11:37:54AM +0800, Xu, Anthony wrote:
> From this logic, migrations happen frequently when the number of
> VCPUs is less than the number of logical CPUs.

This logic is designed to make better use of a partially idle system by spreading work across sockets and cores before co-scheduling them. It won't come into play if there are no idle execution units.

Note that __csched_running_vcpu_is_stealable() will trigger a migration only when the end result would be strictly better than the current situation. Once the system is balanced, it will not bounce VCPUs around.

> What I want to highlight is: when an HVM VCPU is executing an IO
> operation, that HVM VCPU is blocked by the HV until the IO operation
> has been emulated by qemu; then the HV wakes the HVM VCPU up again.
> A PV VCPU is not blocked like this by the PV drivers.
>
> I can give the scenario below.
>
> There are two sockets, two cores per socket. Assume dom0 is running
> on socket1 core1, vti1 is running on socket1 core2, vti2 is running
> on socket2 core1, and socket2 core2 is idle.
>
> If vti2 blocks on an IO operation, then socket2 core1 goes idle.
> That means both cores in socket2 are idle while dom0 and vti1 are
> running on the two cores of socket1.
>
> The scheduler will then try to spread dom0 and vti1 across these two
> sockets, so a migration happens. This is not necessary.

Arguably, if 2 unrelated VCPUs are runnable on a dual socket host, it is useful to spread them across both sockets. This will give each VCPU more achievable bandwidth to memory.

What I think you may be arguing here is that the scheduler is too aggressive in this action, because the VCPU that blocked on socket 2 will wake up very shortly, negating the host-wide benefits of the migration when it does while still incurring its costs.

There is a tradeoff here. We could try being less aggressive in spreading stuff over idle sockets. It would be nice to do this with a greater understanding of the tradeoff though. Can you share more information, such as benchmark perf results, migration statistics, or scheduler traces?

Emmanuel.
Petersson, Mats
2006-Dec-07 10:52 UTC
RE: [Xen-devel] unnecessary VCPU migration happens again
> -----Original Message-----
> From: Emmanuel Ackaouy [mailto:ack@xensource.com]
> Sent: 07 December 2006 10:38
> Subject: Re: [Xen-devel] unnecessary VCPU migration happens again
>
> There is a tradeoff here. We could try being less aggressive
> in spreading stuff over idle sockets. It would be nice to do
> this with a greater understanding of the tradeoff though. Can
> you share more information, such as benchmark perf results,
> migration statistics, or scheduler traces?

I don't know if I've understood this right or not, but I believe the penalty for switching from one core (or socket) to another is higher on IA64 than on x86. I'm not an expert on IA64, but I remember someone at the Xen Summit saying something to that effect - I think it was something like executing a bunch of code to flush the TLBs or some such...

-- Mats
Xu, Anthony
2006-Dec-08 08:43 UTC
RE: [Xen-devel] unnecessary VCPU migration happens again
Petersson, Mats wrote on December 7, 2006 18:52:
>> From: Emmanuel Ackaouy [mailto:ack@xensource.com]
>> Sent: 07 December 2006 10:38
>>
>> Arguably, if 2 unrelated VCPUs are runnable on a dual socket
>> host, it is useful to spread them across both sockets. This
>> will give each VCPU more achievable bandwidth to memory.
>>
>> What I think you may be arguing here is that the scheduler
>> is too aggressive in this action, because the VCPU that blocked
>> on socket 2 will wake up very shortly, negating the host-wide
>> benefits of the migration when it does while still incurring its
>> costs.

Yes, you are right: the VCPU that blocked will wake up very shortly, and as Mats mentioned, migration is expensive on the IPF platform.

1. TLB penalty. Assume a VCPU is migrated from CPU0 to CPU1.

   (1) TLB purge penalty.
       The HV must purge all of CPU0's TLB, in case this VCPU is migrated
       back while CPU0 still contains stale TLB entries.
       IA32 doesn't have this penalty, because it purges the whole TLB on
       every VCPU switch anyway, so there the purge is not caused by the
       migration.

   (2) TLB warm-up penalty.
       When the VCPU is migrated to CPU1, it has to warm up CPU1's TLB.
       Both IPF and IA32 have this penalty.

2. Cache penalty.
   When the VCPU is migrated to CPU1, it has to warm up CPU1's cache.
   Both IPF and IA32 have this penalty.

>> There is a tradeoff here. We could try being less aggressive
>> in spreading stuff over idle sockets. It would be nice to do
>> this with a greater understanding of the tradeoff though. Can
>> you share more information, such as benchmark perf results,
>> migration statistics, or scheduler traces?

I got the following baseline data with the LTP benchmark.

Environment: IPF platform, two sockets, two cores per socket, two threads per core, so 8 logical CPUs in total. Dom0 is UP; the VTI domain has 4 VCPUs.

It takes 66 minutes to run LTP. When I comment out the following code, there are no unnecessary migrations and it takes 48 minutes to run LTP.

The degradation is (66-48)/66 = 27%. That's a "big" degradation!

/*
        while ( !cpus_empty(cpus) )
        {
            nxt = first_cpu(cpus);

            if ( csched_idler_compare(cpu, nxt) < 0 )
            {
                cpu = nxt;
                cpu_clear(nxt, cpus);
            }
            else if ( cpu_isset(cpu, cpu_core_map[nxt]) )
            {
                cpus_andnot(cpus, cpus, cpu_sibling_map[nxt]);
            }
            else
            {
                cpus_andnot(cpus, cpus, cpu_core_map[nxt]);
            }

            ASSERT( !cpu_isset(nxt, cpus) );
        }
*/

Thanks,
Anthony

> I don't know if I've understood this right or not, but I believe the
> penalty for switching from one core (or socket) to another is higher
> on IA64 than on x86. I'm not an expert on IA64, but I remember
> someone at the Xen Summit saying something to that effect - I think
> it was something like executing a bunch of code to flush the TLBs or
> some such...
Emmanuel Ackaouy
2006-Dec-08 13:05 UTC
Re: [Xen-devel] unnecessary VCPU migration happens again
On Fri, Dec 08, 2006 at 04:43:12PM +0800, Xu, Anthony wrote:
> The degradation is (66-48)/66 = 27%. That's a "big" degradation!

Yeah. I'll throw some code together to fix this.
Emmanuel Ackaouy
2006-Dec-13 22:05 UTC
Re: [Xen-devel] unnecessary VCPU migration happens again
Anthony,

I checked in a change to the scheduler multi-core/thread mechanisms in xen-unstable which should address the over-aggressive migrations you were seeing.

Can you pull that change, try your experiments again, and let me know how it works for you?

Cheers,
Emmanuel.
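Pulling and rebuilding from xen-unstable at the time would have looked roughly like the commands below; the repository URL and build targets are the commonly used ones from that period rather than anything specified in this thread, so treat them as assumptions.

        # fetch the xen-unstable tree (URL assumed) and rebuild the hypervisor
        hg clone http://xenbits.xensource.com/xen-unstable.hg
        cd xen-unstable.hg
        make xen && make install-xen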
Xu, Anthony
2006-Dec-19 07:02 UTC
RE: [Xen-devel] unnecessary VCPU migration happens again
Emmanuel Ackaouy wrote on December 14, 2006 6:05:
> I checked in a change to the scheduler multi-core/thread
> mechanisms in xen-unstable which should address the over-aggressive
> migrations you were seeing.
>
> Can you pull that change, try your experiments again, and
> let me know how it works for you?

Hi Emmanuel,

Sorry for the late response. I did some performance tests based on your patch: SMP VTI kernel build and SMP VTI LTP.

Your patch is good and removes the majority of the unnecessary migrations, but unnecessary migrations still exist. I can still see about 5% performance degradation on the above benchmarks (KB and LTP). In fact this patch has helped a lot (from 27% down to 5%).

I can understand that it is impossible to spread VCPUs over all sockets/cores and eliminate all unnecessary migrations at the same time.

Is it possible to add an argument to scheduler_init to enable/disable the VCPU-spreading feature? It would then be the caller's responsibility to enable or disable it.

BTW, I used the attached patch to disable the VCPU-spreading feature.

Thanks,
Anthony
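The attached patch itself is not reproduced in the archive. As a rough sketch of the kind of switch being asked for, the spreading pass in csched_cpu_pick() could be gated by a boot parameter; the parameter name and the opt_csched_spread variable below are made up for illustration and are not from any posted patch.

        /*
         * Sketch: let the administrator disable the multi-core spreading
         * pass.  "sched_credit_spread" and opt_csched_spread are
         * illustrative names only.
         */
        static int opt_csched_spread = 1;
        boolean_param("sched_credit_spread", opt_csched_spread);

        /* ... in csched_cpu_pick(), run the placement loop only if enabled: */
        if ( opt_csched_spread )
        {
            while ( !cpus_empty(cpus) )
            {
                nxt = first_cpu(cpus);

                if ( csched_idler_compare(cpu, nxt) < 0 )
                {
                    cpu = nxt;
                    cpu_clear(nxt, cpus);
                }
                else if ( cpu_isset(cpu, cpu_core_map[nxt]) )
                    cpus_andnot(cpus, cpus, cpu_sibling_map[nxt]);
                else
                    cpus_andnot(cpus, cpus, cpu_core_map[nxt]);

                ASSERT( !cpu_isset(nxt, cpus) );
            }
        }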
Emmanuel Ackaouy
2006-Dec-19 08:59 UTC
Re: [Xen-devel] unnecessary VCPU migration happens again
On Dec 19, 2006, at 8:02, Xu, Anthony wrote:
> Your patch is good and removes the majority of the unnecessary
> migrations, but unnecessary migrations still exist. I can still see
> about 5% performance degradation on the above benchmarks (KB and
> LTP). In fact this patch has helped a lot (from 27% down to 5%).
>
> I can understand that it is impossible to spread VCPUs over all
> sockets/cores and eliminate all unnecessary migrations at the same
> time.
>
> Is it possible to add an argument to scheduler_init to enable/disable
> the VCPU-spreading feature?

I don't think this is a good idea. If you want to disable migration, you can always pin your VCPUs in place yourself using the CPU affinity masks.

If the attempt to balance work across sockets hurts performance of reasonable benchmarks, then perhaps it's still being too aggressive. Right now, such a migration can happen on 10ms boundaries. I can try to smooth this further.

Can you dump the credit scheduler stat counters before and after you run the benchmark? (^A^A^A on the dom0/hypervisor console to switch to the hypervisor, then type the "r" key to dump scheduler info.) That, along with an idea of the elapsed time between the two stat samples, would be handy.

Cheers,
Emmanuel.
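For anyone following along, pinning from dom0 with the xm tool looks like the commands below; the domain name and CPU numbers are examples only.

        # pin VCPU 0 of domain "vti1" to physical CPU 2, then check the result
        xm vcpu-pin vti1 0 2
        xm vcpu-list vti1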
Xu, Anthony
2006-Dec-20 03:26 UTC
RE: [Xen-devel] unnecessary VCPU migration happens again
Emmanuel Ackaouy wrote on December 19, 2006 17:00:
> Can you dump the credit scheduler stat counters before and after you
> run the benchmark? (^A^A^A on the dom0/hypervisor console to switch
> to the hypervisor, then type the "r" key to dump scheduler info.)
> That, along with an idea of the elapsed time between the two stat
> samples, would be handy.

Hi Emmanuel,

I got the scheduler info dumps.

The environment is: two sockets, two cores per socket. Dom0 is UP and DomVTI is 2-VCPU SMP, so there are 4 physical CPUs and 3 VCPUs in total. We run a kernel build ("make -j3") in the VTI domain.

Before running KB:

(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
(XEN) *** Serial input -> Xen (type 'CTRL-a' three times to switch input to DOM0).
(XEN) Scheduler: SMP Credit Scheduler (credit)
(XEN) info:
(XEN) ncpus = 4
(XEN) master = 0
(XEN) credit = 1200
(XEN) credit balance = 0
(XEN) weight = 0
(XEN) runq_sort = 12233
(XEN) default-weight = 256
(XEN) msecs per tick = 10ms
(XEN) credits per tick = 100
(XEN) ticks per tslice = 3
(XEN) ticks per acct = 3
(XEN) idlers: 0xf
(XEN) stats:
(XEN) schedule = 4521191
(XEN) acct_run = 12233
(XEN) acct_no_work = 26827
(XEN) acct_balance = 7
(XEN) acct_reorder = 0
(XEN) acct_min_credit = 0
(XEN) acct_vcpu_active = 4197
(XEN) acct_vcpu_idle = 4197
(XEN) vcpu_sleep = 16
(XEN) vcpu_wake_running = 140923
(XEN) vcpu_wake_onrunq = 0
(XEN) vcpu_wake_runnable = 2016395
(XEN) vcpu_wake_not_runnable = 0
(XEN) vcpu_park = 0
(XEN) vcpu_unpark = 0
(XEN) tickle_local_idler = 2016206
(XEN) tickle_local_over = 2
(XEN) tickle_local_under = 39
(XEN) tickle_local_other = 0
(XEN) tickle_idlers_none = 0
(XEN) tickle_idlers_some = 204
(XEN) load_balance_idle = 2432213
(XEN) load_balance_over = 132
(XEN) load_balance_other = 0
(XEN) steal_trylock_failed = 12730
(XEN) steal_peer_idle = 1197122
(XEN) migrate_queued = 169
(XEN) migrate_running = 213
(XEN) dom_init = 4
(XEN) dom_destroy = 1
(XEN) vcpu_init = 9
(XEN) vcpu_destroy = 2
(XEN) active vcpus:
(XEN) NOW=0x000001114A1FE784
(XEN) CPU[00] tick=117181, sort=12233, sibling=0x1, core=0x5
(XEN) run: [32767.0] pri=-64 flags=0 cpu=0
(XEN) CPU[01] tick=117189, sort=12233, sibling=0x2, core=0xa
(XEN) run: [32767.1] pri=-64 flags=0 cpu=1
(XEN) CPU[02] tick=117196, sort=12233, sibling=0x4, core=0x5
(XEN) run: [32767.2] pri=-64 flags=0 cpu=2
(XEN) CPU[03] tick=117176, sort=12233, sibling=0x8, core=0xa
(XEN) run: [32767.3] pri=-64 flags=0 cpu=3

After running KB:

(XEN) Scheduler: SMP Credit Scheduler (credit)
(XEN) info:
(XEN) ncpus = 4
(XEN) master = 0
(XEN) credit = 1200
(XEN) credit balance = 0
(XEN) weight = 0
(XEN) runq_sort = 42999
(XEN) default-weight = 256
(XEN) msecs per tick = 10ms
(XEN) credits per tick = 100
(XEN) ticks per tslice = 3
(XEN) ticks per acct = 3
(XEN) idlers: 0xf
(XEN) stats:
(XEN) schedule = 8233816
(XEN) acct_run = 42999
(XEN) acct_no_work = 28931
(XEN) acct_balance = 47
(XEN) acct_reorder = 0
(XEN) acct_min_credit = 0
(XEN) acct_vcpu_active = 9963
(XEN) acct_vcpu_idle = 9963
(XEN) vcpu_sleep = 16
(XEN) vcpu_wake_running = 240904
(XEN) vcpu_wake_onrunq = 0
(XEN) vcpu_wake_runnable = 3697995
(XEN) vcpu_wake_not_runnable = 0
(XEN) vcpu_park = 0
(XEN) vcpu_unpark = 0
(XEN) tickle_local_idler = 3697336
(XEN) tickle_local_over = 18
(XEN) tickle_local_under = 148
(XEN) tickle_local_other = 0
(XEN) tickle_idlers_none = 0
(XEN) tickle_idlers_some = 669
(XEN) load_balance_idle = 4394421
(XEN) load_balance_over = 503
(XEN) load_balance_other = 0
(XEN) steal_trylock_failed = 23268
(XEN) steal_peer_idle = 3014315
(XEN) migrate_queued = 533
(XEN) migrate_running = 743
(XEN) dom_init = 4
(XEN) dom_destroy = 1
(XEN) vcpu_init = 9
(XEN) vcpu_destroy = 2
(XEN) active vcpus:
(XEN) NOW=0x000001F7232EB867
(XEN) CPU[00] tick=215790, sort=42999, sibling=0x1, core=0x5
(XEN) run: [32767.0] pri=-64 flags=0 cpu=0
(XEN) CPU[01] tick=215843, sort=42999, sibling=0x2, core=0xa
(XEN) run: [32767.1] pri=-64 flags=0 cpu=1
(XEN) CPU[02] tick=215843, sort=42999, sibling=0x4, core=0x5
(XEN) run: [32767.2] pri=-64 flags=0 cpu=2
(XEN) CPU[03] tick=215775, sort=42999, sibling=0x8, core=0xa
(XEN) run: [32767.3] pri=-64 flags=0 cpu=3

--Anthony
Emmanuel Ackaouy
2006-Dec-21 16:22 UTC
Re: [Xen-devel] unnecessary VCPU migration happens again
Hi Anthony.

Based on the number of "ticks" on CPU0 that occurred between the two stat dumps, over 16 minutes elapsed during that time.

During that time, 364 regular migrations occurred. These are migrations that happen when an idle CPU finds a runnable VCPU queued elsewhere on the system.

Also during that time, 530 multi-core load balancing migrations happened. That's about one such migration every 1.86 seconds. I'm somewhat surprised that this costs 5% in performance of your benchmark. That said, the point of this code is to balance a partially idle system and not to shuffle things around too much, so I'm happy to smooth the algorithm further to reduce the number of these migrations.

I'll send another patch shortly.

On Dec 20, 2006, at 4:26, Xu, Anthony wrote:
> Before running KB
>
> (XEN) migrate_queued = 169
> (XEN) migrate_running = 213
>
> (XEN) CPU[00] tick=117181, sort=12233, sibling=0x1, core=0x5
>
>
> After running KB
>
> (XEN) migrate_queued = 533
> (XEN) migrate_running = 743
>
> (XEN) CPU[00] tick=215790, sort=42999, sibling=0x1, core=0x5
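As a quick check of the arithmetic, the figures above follow directly from the deltas in the quoted counters (each tick is 10ms, per the "msecs per tick" line of the dumps):

        elapsed time:    215790 - 117181 = 98609 ticks x 10ms ~= 986s ~= 16.4 min
        migrate_queued:  533 - 169 = 364  (idle CPU stealing a queued VCPU)
        migrate_running: 743 - 213 = 530  (multi-core balancing of a running VCPU)
        interval:        986s / 530 ~= 1.86s per balancing migration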
Xu, Anthony
2006-Dec-22 02:26 UTC
RE: [Xen-devel] unnecessary VCPU migration happens again
Hi Emmanuel,

Thanks for your quick response. I'm not familiar with the scheduler; I'll study it. :-)

I put comments below; maybe they're not right. :-)

Emmanuel Ackaouy wrote on December 22, 2006 0:23:
> Hi Anthony.
>
> Based on the number of "ticks" on CPU0 that occurred between the two
> stat dumps, over 16 minutes elapsed during that time.
>
> During that time, 364 regular migrations occurred. These are
> migrations that happen when an idle CPU finds a runnable VCPU queued
> elsewhere on the system.

From the point of view of a user of the credit scheduler, these may not be regular migrations. Because there are 4 CPUs and only 3 VCPUs, it should be unlikely that an idle CPU finds a runnable VCPU queued elsewhere on the system.

> Also during that time, 530 multi-core load balancing migrations
> happened.
>
> That's about one such migration every 1.86 seconds. I'm somewhat
> surprised that this costs 5% in performance of your benchmark. That
> said, the point of this code is to balance a partially idle system
> and not to shuffle things around too much, so I'm happy to smooth the
> algorithm further to reduce the number of these migrations.

I'm interested in this. I'll investigate it.

> I'll send another patch shortly.

Thanks again.