As an experiment trying to reduce the latency when scheduling dom0
vcpus, I applied the following patch to __runq_insert() in xen 4.2:
diff -r 8643ca19d356 -r 91b13479c1a2 xen/common/sched_credit.c
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -205,6 +205,15 @@
     BUG_ON( __vcpu_on_runq(svc) );
     BUG_ON( cpu != svc->vcpu->processor );
 
+    /* if svc is a dom0 vcpu, put it always before all the other vcpus
+     * in the runq, so that dom0 vcpus always have priority
+     */
+    if (svc->vcpu->domain->domain_id == 0) {
+        svc->pri = CSCHED_PRI_TS_BOOST; /* make sure no vcpu goes in front of this one until this vcpu is scheduled */
+        list_add(&svc->runq_elem, (struct list_head *)runq);
+        return;
+    }
+
     list_for_each( iter, runq )
     {
         const struct csched_vcpu * const iter_svc = __runq_elem(iter);
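(For reference, the remainder of the unmodified insertion loop is roughly the
snippet below, with the yield handling omitted, so it may not match the 4.2
tree exactly; the hunk above skips this priority-ordered insert entirely for
dom0 vcpus.)

    list_for_each( iter, runq )
    {
        const struct csched_vcpu * const iter_svc = __runq_elem(iter);
        /* stop at the first vcpu in the runq with lower priority */
        if ( svc->pri > iter_svc->pri )
            break;
    }

    /* insert just before that lower-priority vcpu (or at the tail) */
    list_add_tail(&svc->runq_elem, iter);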
However, this patch seems to have had the opposite effect, and I would
like to understand why. A win7 guest now takes hours to start up, and I
believe this is due to dom0 taking on the order of 10ms to serve each
vm i/o request, even though the dom0 vcpus and the guest vcpu are on
different pcpus.
xenalyze-a.out: http://pastelink.me/getfile.php?key=390a25
xentrace-D-T5.out: http://pastelink.me/getfile.php?key=b3d584
Any ideas why this is the case?
thanks,
Marcus
--
xenalyze-a.out head:
--
0.006977926 ------ x d32768v23 runstate_change d4v0 blocked->runnable
Creating domain 4
Creating vcpu 0 for dom 4
] 0.006979023 ------ x d32768v23 28004(2:8:4) 2 [ 4 0 ]
] 0.006980999 ------ x d32768v23 2800e(2:8:e) 2 [ 7fff edd9df ]
] 0.006981126 ------ x d32768v23 2800f(2:8:f) 3 [ 4 e82 1c9c380 ]
] 0.006981403 ------ x d32768v23 2800a(2:8:a) 4 [ 7fff 17 4 0 ]
0.006981687 ------ x d32768v23 runstate_change d32767v23 running->runnable
Creating vcpu 23 for dom 32767
Using first_tsc for d32767v23 (9024 cycles)
0.006982783 ------ x d?v? runstate_change d4v0 runnable->running
] 0.006996466 ------ x d4v0 28006(2:8:6) 2 [ 4 0 ]
] 0.006997600 ------ x d4v0 2800e(2:8:e) 2 [ 4 4d19 ]
] 0.006997726 ------ x d4v0 2800f(2:8:f) 3 [ 7fff 4d19 ffffffff ]
] 0.006997881 ------ x d4v0 2800a(2:8:a) 4 [ 4 0 7fff 17 ]
0.006998070 ------ x d4v0 runstate_change d4v0 running->blocked
0.006998242 ------ x d?v? runstate_change d32767v23 runnable->running
0.014874949 ----x- - d32767v4 runstate_change d0v4 blocked->runnable
] 0.014879473 ----x- - d32767v4 28004(2:8:4) 2 [ 0 4 ]
0.014880331 -x---- - d32767v1 runstate_change d0v1 blocked->runnable
] 0.014884417 ----x- - d32767v4 2800e(2:8:e) 2 [ 7fff 97fc06 ]
] 0.014884544 ----x- - d32767v4 2800f(2:8:f) 3 [ 0 1978 1c9c380 ]
] 0.014884916 ----x- - d32767v4 2800a(2:8:a) 4 [ 7fff 4 0 4 ]
] 0.014885022 -x---- - d32767v1 28004(2:8:4) 2 [ 0 1 ]
0.014885134 ----x- - d32767v4 runstate_change d32767v4 running->runnable
0.014885251 --x- - - d32767v2 runstate_change d0v2 blocked->runnable
] 0.014889526 -x-- - - d32767v1 2800e(2:8:e) 2 [ 7fff 97cdd8 ]
] 0.014889731 -x-- - - d32767v1 2800f(2:8:f) 3 [ 0 1b68 1c9c380 ]
] 0.014889949 -x-- - - d32767v1 2800a(2:8:a) 4 [ 7fff 1 0 1 ]
0.014890084 ----x- - d?v? runstate_change d0v4 runnable->running
0.014890176 -x--|- - d32767v1 runstate_change d32767v1 running->runnable
] 0.014890291 - x-|- - d32767v2 28004(2:8:4) 2 [ 0 2 ]
0.014890374 - -x|- - d32767v3 runstate_change d0v3 blocked->runnable
0.014891134 -x--|- - d?v? runstate_change d0v1 runnable->running
] 0.014891811 -|x-|- - d32767v2 2800e(2:8:e) 2 [ 7fff 96f8a4 ]
] 0.014891905 -|-x|- - d32767v3 28004(2:8:4) 2 [ 0 3 ]
] 0.014891936 -|x-|- - d32767v2 2800f(2:8:f) 3 [ 0 1c23 1c9c380 ]
] 0.014892155 -|x-|- - d32767v2 2800a(2:8:a) 4 [ 7fff 2 0 2 ]
0.014892362 -|--|x - d32767v5 runstate_change d0v5 blocked->runnable
0.014892395 -|x-|- - d32767v2 runstate_change d32767v2 running->runnable
] 0.014893226 -| x|- - d32767v3 2800e(2:8:e) 2 [ 7fff 982ddb ]
] 0.014893343 -| x|- - d32767v3 2800f(2:8:f) 3 [ 0 c64 1c9c380 ]
0.014893386 -|x-|- - d?v? runstate_change d0v2 runnable->running
] 0.014893556 -||x|- - d32767v3 2800a(2:8:a) 4 [ 7fff 3 0 3 ]
] 0.014893778 -||-|x - d32767v5 28004(2:8:4) 2 [ 0 5 ]
0.014893867 -||x|- - d32767v3 runstate_change d32767v3 running->runnable
0.014894811 -||x|- - d?v? runstate_change d0v3 runnable->running
] 0.014895067 -||||x - d32767v5 2800e(2:8:e) 2 [ 7fff 982654 ]
] 0.014895192 -||||x - d32767v5 2800f(2:8:f) 3 [ 0 c3c 1c9c380 ]
] 0.014895439 -||||x - d32767v5 2800a(2:8:a) 4 [ 7fff 5 0 5 ]
0.014895815 -||||x - d32767v5 runstate_change d32767v5 running->runnable
0.014896751 -||||x - d?v? runstate_change d0v5 runnable->running
] 0.014908155 -|||x| - d0v4 28006(2:8:6) 2 [ 0 4 ]
] 0.014908228 -||||x - d0v5 28006(2:8:6) 2 [ 0 5 ]
] 0.014908405 -x|||| - d0v1 28006(2:8:6) 2 [ 0 1 ]
] 0.014909231 -||x|| - d0v3 28006(2:8:6) 2 [ 0 3 ]
] 0.014910265 -|||x| - d0v4 2800e(2:8:e) 2 [ 0 7f14 ]
] 0.014910384 -|||x| - d0v4 2800f(2:8:f) 3 [ 7fff 7f14 ffffffff ]
] 0.014910550 -|||x| - d0v4 2800a(2:8:a) 4 [ 0 4 7fff 4 ]
] 0.014910566 -x|||| - d0v1 2800e(2:8:e) 2 [ 0 6743 ]
] 0.014910679 -x|||| - d0v1 2800f(2:8:f) 3 [ 7fff 6743 ffffffff ]
] 0.014910707 -||||x - d0v5 2800e(2:8:e) 2 [ 0 3f80 ]
0.014910783 -|||x| - d0v4 runstate_change d0v4 running->blocked
] 0.014910803 -x|| | - d0v1 2800a(2:8:a) 4 [ 0 1 7fff 1 ]
] 0.014910819 -||| x - d0v5 2800f(2:8:f) 3 [ 7fff 3f80 ffffffff ]
] 0.014910944 -||| x - d0v5 2800a(2:8:a) 4 [ 0 5 7fff 5 ]
0.014911030 -x|| | - d0v1 runstate_change d0v1 running->blocked
0.014911109 - || x - d0v5 runstate_change d0v5 running->blocked
] 0.014911307 - |x - d0v3 2800e(2:8:e) 2 [ 0 4c74 ]
0.014911367 - || x - d?v? runstate_change d32767v5 runnable->running
] 0.014911417 - |x - - d0v3 2800f(2:8:f) 3 [ 7fff 4c74 ffffffff ]
0.014911471 - ||x- - d?v? runstate_change d32767v4 runnable->running
0.014911512 -x||-- - d?v? runstate_change d32767v1 runnable->running
] 0.014911530 --|x-- - d0v3 2800a(2:8:a) 4 [ 0 3 7fff 3 ]
0.014911687 --|x-- - d0v3 runstate_change d0v3 running->blocked
0.014912276 --|x-- - d?v? runstate_change d32767v3 runnable->running
] 0.015036914 --x--- - d0v2 28006(2:8:6) 2 [ 0 2 ]
] 0.015038191 --x--- - d0v2 2800e(2:8:e) 2 [ 0 28d83 ]
] 0.015038313 --x--- - d0v2 2800f(2:8:f) 3 [ 7fff 28d83 ffffffff ]
] 0.015038445 --x--- - d0v2 2800a(2:8:a) 4 [ 0 2 7fff 2 ]
0.015038617 --x--- - d0v2 runstate_change d0v2 running->blocked
0.015039232 --x--- - d?v? runstate_change d32767v2 runnable->running
0.020630385 ------ x d32767v23 runstate_change d4v0 blocked->runnable
] 0.020631491 ------ x d32767v23 28004(2:8:4) 2 [ 4 0 ]
] 0.020633401 ------ x d32767v23 2800e(2:8:e) 2 [ 7fff edb796 ]
] 0.020633555 ------ x d32767v23 2800f(2:8:f) 3 [ 4 d97 1c9c380 ]
] 0.020633813 ------ x d32767v23 2800a(2:8:a) 4 [ 7fff 17 4 0 ]
0.020634086 ------ x d32767v23 runstate_change d32767v23 running->runnable
0.020635147 ------ x d?v? runstate_change d4v0 runnable->running
] 0.020650487 ------ x d4v0 28006(2:8:6) 2 [ 4 0 ]
] 0.020651616 ------ x d4v0 2800e(2:8:e) 2 [ 4 5400 ]
] 0.020651739 ------ x d4v0 2800f(2:8:f) 3 [ 7fff 5400 ffffffff ]
] 0.020651876 ------ x d4v0 2800a(2:8:a) 4 [ 4 0 7fff 17 ]
0.020652054 ------ x d4v0 runstate_change d4v0 running->blocked

On Fri, 2013-05-31 at 18:18 +0100, Marcus Granado wrote:
> As an experiment trying to reduce the latency when scheduling dom0
> vcpus, I applied the following patch to __runq_insert() in xen 4.2:
>
> diff -r 8643ca19d356 -r 91b13479c1a2 xen/common/sched_credit.c
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -205,6 +205,15 @@
>      BUG_ON( __vcpu_on_runq(svc) );
>      BUG_ON( cpu != svc->vcpu->processor );
>
> +    /* if svc is a dom0 vcpu, put it always before all the other vcpus
> +     * in the runq, so that dom0 vcpus always have priority
> +     */
> +    if (svc->vcpu->domain->domain_id == 0) {
> +        svc->pri = CSCHED_PRI_TS_BOOST; /* make sure no vcpu goes in front of this one until this vcpu is scheduled */
> +        list_add(&svc->runq_elem, (struct list_head *)runq);
> +        return;
> +    }
> +
>      list_for_each( iter, runq )
>      {
>          const struct csched_vcpu * const iter_svc = __runq_elem(iter);
>
Mmm... are we talking about wakeup latency (which, BTW, is what TS_BOOST
is all about, AFAIUI)? In that case, isn't a waking vcpu, whether or not
it belongs to Dom0, already boosted in csched_vcpu_wake()?
__runq_insert() is called right after that, so I think it already sees
the boosting, without the need for the above.

If it's not only wakeup latency that you're trying to address, then I'm
not sure, but still, __runq_insert() does not look like the right place
for such logic, at least to my personal taste. :-)

> However, this patch seems to have had the opposite effect, and I would
> like to understand why. A win7 guest now takes hours to start up, and I
> believe this is due to dom0 taking on the order of 10ms to serve each
> vm i/o request, even though the dom0 vcpus and the guest vcpu are on
> different pcpus.
>
Well, just shooting in the dark, but __runq_insert() is also called in
csched_schedule(). Perhaps your modification interacts badly with the
current scheduling logic?

Another way of trying to achieve what you seem to be after could be to
put an "is it dom0?" check in csched_vcpu_acct() and, if true, not clear
the boosting. Beware, I'm not saying that it makes sense, or that I like
it; it just seems cleaner (at least to me) than hijacking
__runq_insert(). What do you think?

> xenalyze-a.out: http://pastelink.me/getfile.php?key=390a25
> xentrace-D-T5.out: http://pastelink.me/getfile.php?key=b3d584
>
Sorry, I can't look at the traces right now... If I find 5 minutes for
them and spot something weird, I'll let you know.

Regards,
Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
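
For concreteness, the csched_vcpu_acct() idea suggested above might look
something like the sketch below. It is a rough, untested sketch against the
4.2 credit scheduler: it assumes the de-boost from CSCHED_PRI_TS_BOOST to
CSCHED_PRI_TS_UNDER happens in csched_vcpu_acct(), and the dom0 check is
purely illustrative.

    /* In csched_vcpu_acct(): reset a boosted vcpu to UNDER as usual,
     * but (hypothetically) skip the de-boost for dom0 vcpus so that
     * they keep their elevated priority. */
    if ( svc->pri == CSCHED_PRI_TS_BOOST &&
         svc->vcpu->domain->domain_id != 0 )
        svc->pri = CSCHED_PRI_TS_UNDER;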