I'm working on a new hypercall, do_confer, which allows the directed
yielding of a vcpu to another vcpu. It is mainly used when a vcpu fails
to acquire a spinlock, yielding to the lock holder instead of spinning.
I ported the ppc64 spinlock implementation to the i386 Linux portion.
In implementing the hypercall, I've been trying to figure out how to get
the scheduler (I've only played with bvt) to run the vcpu passed in the
hypercall (after some validation), but I've run into various bad-state
situations (a do_softirq pending != 0 assert, '!active_ac_timer(timer)'
failed, and __task_on_runqueue(prev) failed), which tells me I don't
fully understand all of the book-keeping that is needed. Has anyone
thought about how to do this with either BVT or the new EDF scheduler?

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com
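For context, the guest-side slow path in the ppc64 style looks roughly like
the sketch below. The names used here (the lock word's holder_vcpu field,
the xen_yield_count[] per-vcpu counters, __raw_trylock(), and the confer()
hypercall wrapper) are placeholders for illustration, not the actual i386
port:

/* Guest-side sketch of a confer-aware spin loop.  The holder writes its
 * vcpu id into the lock word when it takes the lock; a waiter that sees
 * the holder preempted confers its remaining slice to it instead of
 * burning cycles spinning.  All names here are assumed for illustration. */
static inline void spin_lock_confer(spinlock_t *lock)
{
    while (!__raw_trylock(lock)) {
        unsigned int holder = lock->holder_vcpu;       /* who owns it      */
        unsigned int count  = xen_yield_count[holder];  /* snapshot count   */

        if (count & 1)               /* odd => holder is currently preempted */
            confer(holder, count);   /* directed yield; the hypervisor
                                      * re-checks the count to avoid races   */
    }
}

The parity convention matches the hypercall's validation: an even count means
the holder is running, an odd count means it has been preempted, so conferring
is only worthwhile in the odd case.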
Stephan Diestelhorst
2005-May-18 12:10 UTC
Re: [Xen-devel] scheduler independent forced vcpu selection
That is a good idea; there are quite a number of other spinlock
optimisations on the way...

> I'm working on a new hypercall, do_confer, which allows the directed
> yielding of a vcpu to another vcpu. It is mainly used when a vcpu fails
> to acquire a spinlock, yielding to the lock holder instead of spinning.
> I ported the ppc64 spinlock implementation to the i386 Linux portion.
> In implementing the hypercall, I've been trying to figure out how to get
> the scheduler (I've only played with bvt) to run the vcpu passed in the
> hypercall (after some validation), but I've run into various bad-state
> situations (a do_softirq pending != 0 assert, '!active_ac_timer(timer)'
> failed, and __task_on_runqueue(prev) failed), which tells me I don't
> fully understand all of the book-keeping that is needed. Has anyone
> thought about how to do this with either BVT or the new EDF scheduler?

Building code similar to do_block and __enter_scheduler in
xen/common/schedule.c should work fine, except of course not running the
original scheduler but switching directly to the hinted domain.

Are you calling do_softirq directly? If not, then it is quite strange
that this assertion fails.

The timer assertion might be the old scheduling timer, which probably
gets reset but not deleted beforehand... And the on-runqueue assertion
suggests that you are 'stealing' the domain from the scheduler's queues
without giving it a chance to notice.

I'd guess cloning do_block and appending code from __enter_scheduler,
with some checks (is the 'receiver' domain runnable? if not, run the
proper sched.do_schedule), should give you a solid base to start from.

Cheers,
Stephan
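As a rough illustration of that suggestion (not working code: confer_to()
and the yield_hint field are invented names, and the blocked-flag
manipulation is only assumed to mirror what do_block() does in this tree):

/* Sketch: directed yield built from do_block-style blocking plus a hint
 * that __enter_scheduler() can honour.  confer_to(), yield_hint and the
 * flag name below are assumptions for illustration only. */
long confer_to(struct exec_domain *receiver)
{
    /* Block the caller through the normal path, as do_block() does, so the
     * scheduler's runqueue and timer bookkeeping stay consistent. */
    set_bit(EDF_BLOCKED, &current->ed_flags);       /* assumed flag name */

    /* Record the hint; __enter_scheduler() switches straight to it when it
     * is runnable, otherwise it falls back to ops.do_schedule(). */
    if (domain_runnable(receiver))
        current->yield_hint = receiver;             /* assumed field */

    raise_softirq(SCHEDULE_SOFTIRQ);
    return 0;
}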
Ryan Harper
2005-May-18 14:55 UTC
Re: [Xen-devel] scheduler independent forced vcpu selection
* Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-18 09:04]:
> Are you calling do_softirq directly? If not, then it is quite strange
> that this assertion fails.

No, I just call raise_softirq(SCHEDULE_SOFTIRQ); without a subsequent
do_softirq().

> The timer assertion might be the old scheduling timer, which probably
> gets reset but not deleted beforehand... And the on-runqueue assertion
> suggests that you are 'stealing' the domain from the scheduler's queues
> without giving it a chance to notice.

Could you explain what 'giving it a chance to notice' means?

> I'd guess cloning do_block and appending code from __enter_scheduler,
> with some checks (is the 'receiver' domain runnable? if not, run the
> proper sched.do_schedule), should give you a solid base to start from.

Let me add in a check for domain_runnable and see if that helps. Thanks
for the feedback. Let me know if you want me to post the patch of where
I'm at right now.

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com
Ryan Harper
2005-May-18 18:03 UTC
Re: [Xen-devel] scheduler independent forced vcpu selection
* Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-18 09:04]:
> The timer assertion might be the old scheduling timer, which probably
> gets reset but not deleted beforehand... And the on-runqueue assertion
> suggests that you are 'stealing' the domain from the scheduler's queues
> without giving it a chance to notice.

Looking at both bvt and sedf, the runqueue is ordered by some metric or
another (evt and deadline, respectively). What I think we need is a way
to swap positions in the runqueues. That is, if the lock holder is
runnable, I want the holder to run instead of current. Is there some way
to do this in a scheduler-independent manner with the current set of
scheduler ops defined in sched-if.h?

I noticed that neither bvt nor sedf implements the rem_task function,
which I thought could be used to help out with the 'stealing' by
notifying the schedulers that prev was going away (removing it from the
runqueue), but just removing the exec_domain from the runqueue didn't
help.

I'm including a patch that I'm currently using so you can get a better
idea of the modifications to schedule.c I'm making.

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com

---
--- b/xen/common/schedule.c 2005-05-17 22:16:55.000000000 -0500
+++ c/xen/common/schedule.c 2005-05-18 12:42:44.765691872 -0500
@@ -273,6 +273,49 @@
     return 0;
 }
 
+/* Confer control to another vcpu */
+long do_confer(unsigned int vcpu, unsigned int yield_count)
+{
+    struct domain *d = current->domain;
+
+    /* count hcalls */
+    current->confercnt++;
+
+    /* Validate CONFER prereqs:
+     * - vcpu is within bounds
+     * - vcpu is valid in this domain
+     * - current has not already conferred its slice to vcpu
+     * - vcpu is not already running
+     * - designated vcpu's yield_count matches value from call
+     *
+     * If 1-4 are ok, then set conferred value and enter scheduler
+     */
+
+    if (vcpu >= MAX_VIRT_CPUS)
+        return 0;
+
+    if (d->exec_domain[vcpu] == NULL)
+        return 0;
+
+    if (current->conferred != VCPU_CANCONFER)
+        return 0;
+
+    /* even counts indicate a running vcpu, odd is preempted/conferred */
+    if ((d->exec_domain[vcpu]->vcpu_info->yield_count & 1) == 0)
+        return 0;
+
+    if (d->exec_domain[vcpu]->vcpu_info->yield_count != yield_count)
+        return 0;
+
+    /*
+     * set which vcpu should run in conferred state, request scheduling
+     */
+    current->conferred = (VCPU_CONFERRING|vcpu);
+    raise_softirq(SCHEDULE_SOFTIRQ);
+
+    return 0;
+}
+
 /*
  * Demultiplex scheduler-related hypercalls.
  */
@@ -412,8 +455,9 @@
  */
 static void __enter_scheduler(void)
 {
-    struct exec_domain *prev = current, *next = NULL;
+    struct exec_domain *prev = current, *next = NULL, *holder = NULL;
     int                 cpu = prev->processor;
+    unsigned int        holder_vcpu;
     s_time_t            now;
     struct task_slice   next_slice;
     s32                 r_time;     /* time for new dom to run */
@@ -436,12 +480,39 @@
 
     prev->cpu_time += now - prev->lastschd;
 
-    /* get policy-specific decision on scheduling... */
-    next_slice = ops.do_schedule(now);
+    /* get ed pointer to holder vcpu */
+    holder_vcpu = 0xffff & prev->conferred;
+    holder = prev->domain->exec_domain[holder_vcpu];
+
+    if (unlikely(prev->conferred & VCPU_CONFERRING) &&
+        domain_runnable(holder))
+    {
+        /* run holder next */
+        next = holder;
+
+        /* run for the remainder of prev's slice */
+        r_time = schedule_data[cpu].s_timer.expires - now;
+
+        /* increment confer counters */
+        prev->confer_out++;
+        next->confer_in++;
+
+        /* change prev's confer state to prevent re-entrance */
+        prev->conferred = VCPU_CONFERRED;
+
+    } else {
+        /* get policy-specific decision on scheduling... */
+        next_slice = ops.do_schedule(now);
+
+        r_time = next_slice.time;
+        next = next_slice.task;
+    }
+
+    /*
+     * always clear conferred state so this vcpu can confer during its slice
+     */
+    next->conferred = 0;
 
-    r_time = next_slice.time;
-    next = next_slice.task;
-
     schedule_data[cpu].curr = next;
 
     next->lastschd = now;
@@ -455,6 +526,12 @@
 
     spin_unlock_irq(&schedule_data[cpu].schedule_lock);
 
+    /* bump vcpu yield_count when controlling domain is not-idle */
+    if ( !is_idle_task(prev->domain) )
+        prev->vcpu_info->yield_count++;
+    if ( !is_idle_task(next->domain) )
+        next->vcpu_info->yield_count++;
+
     if ( unlikely(prev == next) )
     {
 #ifdef ADV_SCHED_HISTO
         adv_sched_hist_to_stop(cpu);
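For reference, the conferred field above acts as a small state word; its
constants are not defined in the hunks shown, so the values below are only an
assumed encoding that is consistent with the 0xffff vcpu mask and with
next->conferred being reset to 0:

/* Assumed encoding of exec_domain->conferred (illustration only; the real
 * constant values are not part of the posted patch). */
#define VCPU_CANCONFER   0x00000000  /* default: this vcpu may issue do_confer */
#define VCPU_CONFERRING  0x00010000  /* confer requested; bits 0-15 hold the
                                      * target vcpu id                         */
#define VCPU_CONFERRED   0x00020000  /* slice already handed over; cleared back
                                      * to VCPU_CANCONFER at the next reschedule */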
Ryan Harper
2005-May-18 22:37 UTC
Re: [Xen-devel] scheduler independent forced vcpu selection
* Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-18 09:04]:
> > I'm working on a new hypercall, do_confer, which allows the directed
> > yielding of a vcpu to another vcpu. It is mainly used when a vcpu fails
> > to acquire a spinlock, yielding to the lock holder instead of spinning.
> > I ported the ppc64 spinlock implementation to the i386 Linux portion.
> > In implementing the hypercall, I've been trying to figure out how to get
> > the scheduler (I've only played with bvt) to run the vcpu passed in the
> > hypercall (after some validation), but I've run into various bad-state
> > situations (a do_softirq pending != 0 assert, '!active_ac_timer(timer)'
> > failed, and __task_on_runqueue(prev) failed), which tells me I don't
> > fully understand all of the book-keeping that is needed. Has anyone
> > thought about how to do this with either BVT or the new EDF scheduler?

After some thought, domain_wake(), followed by
raise_softirq(SCHEDULE_SOFTIRQ), does what I want and removes the huge
mess I was making in __enter_scheduler().

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com
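In other words, the hypercall body can shrink to roughly the following sketch
(validation abbreviated; only domain_wake() and raise_softirq() carry the real
work, and the lookup mirrors the earlier patch rather than final code):

/* Sketch of the simplified do_confer: wake the lock holder and let the
 * scheduler make its own decision on the next SCHEDULE_SOFTIRQ.
 * The yield_count validation from the earlier patch is elided here. */
long do_confer(unsigned int vcpu, unsigned int yield_count)
{
    struct domain *d = current->domain;
    struct exec_domain *holder;

    if (vcpu >= MAX_VIRT_CPUS || d->exec_domain[vcpu] == NULL)
        return 0;
    holder = d->exec_domain[vcpu];

    /* ... yield_count checks as before ... */

    domain_wake(holder);               /* make the lock holder runnable */
    raise_softirq(SCHEDULE_SOFTIRQ);   /* and request a reschedule      */
    return 0;
}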
Stephan Diestelhorst
2005-May-19 13:22 UTC
Re: [Xen-devel] scheduler independent forced vcpu selection
Ryan Harper wrote:
> * Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-18 09:04]:
>
>>The timer assertion might be the old scheduling timer, which probably
>>gets reset but not deleted beforehand... And the on-runqueue assertion
>>suggests that you are 'stealing' the domain from the scheduler's queues
>>without giving it a chance to notice.
>
> Looking at both bvt and sedf, the runqueue is ordered by some metric or
> another (evt and deadline, respectively). What I think we need is a way
> to swap positions in the runqueues. That is, if the lock holder is
> runnable, I want the holder to run instead of current. Is there some
> way to do this in a scheduler-independent manner with the current set
> of scheduler ops defined in sched-if.h?

How about blocking/pausing the currently running domain? I can't think
of another way of doing this in a scheduler-independent fashion...

> I noticed that neither bvt nor sedf implements the rem_task function,
> which I thought could be used to help out with the 'stealing' by
> notifying the schedulers that prev was going away (removing it from the
> runqueue), but just removing the exec_domain from the runqueue didn't
> help.

That is really nasty, and is exactly what I meant by "stealing" a domain
from the scheduler! :-)

> I'm including a patch that I'm currently using so you can get a better
> idea of the modifications to schedule.c I'm making.

Thanks,
Stephan
Stephan Diestelhorst
2005-May-19 13:25 UTC
Re: [Xen-devel] scheduler independent forced vcpu selection
Ryan Harper wrote:
> * Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-18 09:04]:
>
>>>I'm working on a new hypercall, do_confer, which allows the directed
>>>yielding of a vcpu to another vcpu. It is mainly used when a vcpu fails
>>>to acquire a spinlock, yielding to the lock holder instead of spinning.
>>>I ported the ppc64 spinlock implementation to the i386 Linux portion.
>>>In implementing the hypercall, I've been trying to figure out how to get
>>>the scheduler (I've only played with bvt) to run the vcpu passed in the
>>>hypercall (after some validation), but I've run into various bad-state
>>>situations (a do_softirq pending != 0 assert, '!active_ac_timer(timer)'
>>>failed, and __task_on_runqueue(prev) failed), which tells me I don't
>>>fully understand all of the book-keeping that is needed. Has anyone
>>>thought about how to do this with either BVT or the new EDF scheduler?
>
> After some thought, domain_wake(), followed by
> raise_softirq(SCHEDULE_SOFTIRQ), does what I want and removes the huge
> mess I was making in __enter_scheduler().

Are you waking up the domain that holds the lock? Then you would rely on
the scheduler to give the woken domain a high "priority" (whatever this
means for the current scheduler), and it should start that domain
immediately, right?

Best,
Stephan
Ryan Harper
2005-May-19 14:55 UTC
Re: [Xen-devel] scheduler independent forced vcpu selection
* Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-19 09:04]:
> Ryan Harper wrote:
> > * Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-18 09:04]:
> >
> >>>I'm working on a new hypercall, do_confer, which allows the directed
> >>>yielding of a vcpu to another vcpu. It is mainly used when a vcpu fails
> >>>to acquire a spinlock, yielding to the lock holder instead of spinning.
> >>>I ported the ppc64 spinlock implementation to the i386 Linux portion.
> >>>In implementing the hypercall, I've been trying to figure out how to get
> >>>the scheduler (I've only played with bvt) to run the vcpu passed in the
> >>>hypercall (after some validation), but I've run into various bad-state
> >>>situations (a do_softirq pending != 0 assert, '!active_ac_timer(timer)'
> >>>failed, and __task_on_runqueue(prev) failed), which tells me I don't
> >>>fully understand all of the book-keeping that is needed. Has anyone
> >>>thought about how to do this with either BVT or the new EDF scheduler?
> >
> > After some thought, domain_wake(), followed by
> > raise_softirq(SCHEDULE_SOFTIRQ), does what I want and removes the huge
> > mess I was making in __enter_scheduler().
>
> Are you waking up the domain that holds the lock?

Yes, that is the idea.

> Then you would rely on the scheduler to give the woken domain a high
> "priority" (whatever this means for the current scheduler), and it
> should start that domain immediately, right?

Yes, that is part of what is required. I need to do two things after
validation of do_confer:

  1) Wake the lock-holder vcpu.
  2) Schedule the lock-holder to run only for the remaining time-slice of
     the currently running vcpu.

Using domain_wake() and the softirq, I'm only getting (1), but I have no
guarantee when the lock-holder is actually woken up. Any thoughts on how
to get (2)?

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com
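For reference, the earlier __enter_scheduler() patch approximated (2) by
reusing the already-armed slice timer:

/* From the earlier patch: bound the conferred run by whatever is left of
 * prev's slice (schedule_data[cpu].s_timer was armed for the end of the
 * current slice, so expires - now is the remainder). */
r_time = schedule_data[cpu].s_timer.expires - now;
next   = holder;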
Ryan Harper
2005-May-19 15:05 UTC
Re: [Xen-devel] scheduler independent forced vcpu selection
* Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-19 09:04]:
> Ryan Harper wrote:
> > * Stephan Diestelhorst <sd386@cl.cam.ac.uk> [2005-05-18 09:04]:
> >
> >>>I'm working on a new hypercall, do_confer, which allows the directed
> >>>yielding of a vcpu to another vcpu. It is mainly used when a vcpu fails
> >>>to acquire a spinlock, yielding to the lock holder instead of spinning.
> >>>I ported the ppc64 spinlock implementation to the i386 Linux portion.
> >>>In implementing the hypercall, I've been trying to figure out how to get
> >>>the scheduler (I've only played with bvt) to run the vcpu passed in the
> >>>hypercall (after some validation), but I've run into various bad-state
> >>>situations (a do_softirq pending != 0 assert, '!active_ac_timer(timer)'
> >>>failed, and __task_on_runqueue(prev) failed), which tells me I don't
> >>>fully understand all of the book-keeping that is needed. Has anyone
> >>>thought about how to do this with either BVT or the new EDF scheduler?
> >
> > After some thought, domain_wake(), followed by
> > raise_softirq(SCHEDULE_SOFTIRQ), does what I want and removes the huge
> > mess I was making in __enter_scheduler().
>
> Are you waking up the domain that holds the lock? Then you would rely on
> the scheduler to give the woken domain a high "priority" (whatever this
> means for the current scheduler), and it should start that domain
> immediately, right?

I noticed your comments in sched_sedf.c about domain waking:

 * 3. Unconservative (i.e. incorrect)
 *    -to boost the performance of I/O dependent domains it would be possible
 *     to put the domain into the runnable queue immediately, and let it run
 *     for the remainder of the slice of the current period
 *     (or even worse: allocate a new full slice for the domain)
 *    -either behaviour can lead to missed deadlines in other domains as
 *     opposed to approaches 1,2a,2b

Giving the remainder of the current slice to the domain we are waking
*sounds* like what I wanted, but you are concerned that it causes missed
deadlines. Could you elaborate on when we would have such a case? If we
are only running in the remaining timeslice (which would expire before
the next deadline), then why would such behaviour lead to missing
deadlines?

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com