This patch adds a facility to block on a particular event channel, for when a domain needs to block, but cannot enable interrupts. A domain uses the call something like this: while (1) { clear_evtchn_pending(evtchn); if (check_for_data(evtchn)) break; HYPERVISOR_block_on(evtchn); } The clear of the pending is needed to ensure that any subsequent calls to block_on() don''t return immediately. regards, john # HG changeset patch # User john.levon@sun.com # Node ID 15aea7d020cd13b1f13692518f10051e401962df # Parent fbeb0a5b7219630839986cf4cdb1b813618cbdce Add SCHEDOP_block_on. Signed-off-by: John Levon <john.levon@sun.com> diff -r fbeb0a5b7219 -r 15aea7d020cd docs/src/interface.tex --- a/docs/src/interface.tex Thu Mar 9 16:24:57 2006 +0000 +++ b/docs/src/interface.tex Thu Mar 9 21:55:15 2006 -0800 @@ -1721,13 +1721,17 @@ \hypercall{sched\_op(unsigned long op)} Request scheduling operation from hypervisor. The options are: {\it -SCHEDOP\_yield}, {\it SCHEDOP\_block}, and {\it SCHEDOP\_shutdown}. -{\it yield} keeps the calling domain runnable but may cause a -reschedule if other domains are runnable. {\it block} removes the -calling domain from the run queue and cause is to sleeps until an -event is delivered to it. {\it shutdown} is used to end the domain''s -execution; the caller can additionally specify whether the domain -should reboot, halt or suspend. +SCHEDOP\_yield}, {\it SCHEDOP\_block}, {\it SCHEDOP\_block\_on} and +{\it SCHEDOP\_shutdown}. {\it yield} keeps the calling vcpu runnable +but may cause a reschedule if other vcpus are runnable. {\it block} +enables event delivery, removes the calling vcpu from the run queue, +and causes it to sleep until an event is delivered to it. +{\it block\_on} causes the calling vcpu to sleep until any event is +delivered. It takes an event channel argument, which is checked to +see if it''s pending; if it is, then the vcpu does not sleep. +{\it shutdown} is used to end the domain''s execution; the caller +can additionally specify whether the domain should reboot, halt or +suspend. \end{quote} To aid the implementation of a process scheduler within a guest OS, diff -r fbeb0a5b7219 -r 15aea7d020cd xen/common/schedule.c --- a/xen/common/schedule.c Thu Mar 9 16:24:57 2006 +0000 +++ b/xen/common/schedule.c Thu Mar 9 21:55:15 2006 -0800 @@ -249,7 +249,7 @@ } /* Block the currently-executing domain until a pertinent event occurs. */ -static long do_block(void) +static void do_block(void) { struct vcpu *v = current; @@ -258,6 +258,32 @@ /* Check for events /after/ blocking: avoids wakeup waiting race. */ if ( event_pending(v) ) + { + clear_bit(_VCPUF_blocked, &v->vcpu_flags); + } + else + { + TRACE_2D(TRC_SCHED_BLOCK, v->domain->domain_id, v->vcpu_id); + __enter_scheduler(); + } +} + +/* + * Block the currently-executing domain without enabling interrupts, for a + * particular evtchn. Whilst we check for a particular evtchn, /any/ event + * will wake us up again. + */ +static long do_block_on(unsigned long evtchn) +{ + struct vcpu *v = current; + + if ( evtchn >= MAX_EVTCHNS ) + return -EINVAL; + + set_bit(_VCPUF_blocked, &v->vcpu_flags); + + /* Check for event /after/ blocking: avoids wakeup waiting race. */ + if ( evtchn_pending(v->domain->shared_info, evtchn) ) { clear_bit(_VCPUF_blocked, &v->vcpu_flags); } @@ -292,7 +318,13 @@ case SCHEDOP_block: { - ret = do_block(); + do_block(); + break; + } + + case SCHEDOP_block_on: + { + ret = do_block_on(arg); break; } diff -r fbeb0a5b7219 -r 15aea7d020cd xen/include/public/sched.h --- a/xen/include/public/sched.h Thu Mar 9 16:24:57 2006 +0000 +++ b/xen/include/public/sched.h Thu Mar 9 21:55:15 2006 -0800 @@ -38,6 +38,16 @@ #define SCHEDOP_shutdown 2 /* + * Block execution of this VCPU until an event is received for processing, + * without clearing the upcall mask; a subsequent pending event will not be + * delivered, but will still wake up the domain. This is intended for use in + * polling loops when "interrupts" are disabled. It takes one argument; that of + * a channel to check for pending events before sleeping. + * @arg == evtchn. + */ +#define SCHEDOP_block_on 3 + +/* * Reason codes for SCHEDOP_shutdown. These may be interpreted by controller * software to determine the appropriate action. For the most part, Xen does * not care about the shutdown code. diff -r fbeb0a5b7219 -r 15aea7d020cd xen/include/xen/event.h --- a/xen/include/xen/event.h Thu Mar 9 16:24:57 2006 +0000 +++ b/xen/include/xen/event.h Thu Mar 9 21:55:15 2006 -0800 @@ -14,6 +14,16 @@ #include <xen/smp.h> #include <asm/bitops.h> #include <asm/event.h> + +/* + * If the vcpu has events masked, we should unblock it even if the event + * channel is already pending. + */ +static inline void evtchn_check_unblock(struct vcpu *v) +{ + if ( v->vcpu_info->evtchn_upcall_mask && (v->vcpu_flags & VCPUF_blocked) ) + vcpu_unblock(v); +} /* * EVENT-CHANNEL NOTIFICATIONS @@ -35,6 +45,10 @@ !test_and_set_bit(0, &v->vcpu_info->evtchn_upcall_pending) ) { evtchn_notify(v); + } + else + { + evtchn_check_unblock(v); } } @@ -69,4 +83,7 @@ /* Bind a local event-channel port to the specified VCPU. */ extern long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id); +#define evtchn_pending(s, e) \ + (test_bit((e), &(s)->evtchn_pending[0])) + #endif /* __XEN_EVENT_H__ */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 10 Mar 2006, at 15:34, John Levon wrote:> This patch adds a facility to block on a particular event channel, for > when a domain needs to block, but cannot enable interrupts.I really don''t much like the change this adds to evtchn_set_pending(), particularly since this patch probably doesn''t have general applicability beyond debuggers. I wonder if you could get the functionality you need with the existing SCHEDOP_block interface if you do the following: HYPERVISOR_set_callbacks(<dummy irq handler>); clear_evtchn_mask(evtchn); while (1) { /* Important to clear the master flag /then/ selector /then/ evtchn pending. */ vcpu_info->evtchn_upcall_pending = 0; clear_bit(evtchn, &vcpu_info->evtchn_pending_sel); clear_evtchn_pending(evtchn); if (check_for_data(evtchn)) break; HYPERVISOR_block(); } Where dummy_irq_handler is a no-op return (which should probably return leaving event delivery disabled, to prevent event-delivery storm, thus it probably really is just IRET). Certainly the above should work fine for a post-mortem debugger. Even if you need to be able to exit the debugger and continue executing normal kernel code, we can come up with some suitable code to restore normal event delivery without Xen interface changes. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Fri, Mar 10, 2006 at 04:49:47PM +0000, Keir Fraser wrote:> I really don''t much like the change this adds to evtchn_set_pending(),Could you explain why? Would you prefer an explicit vcpu_flag or something? Whilst I cannot offhand think of a case other than kernel debuggers that need this, it''s also presumably true that it will apply across all domU kernels (I do not know if kdb gets serious use these days, but ...)> HYPERVISOR_set_callbacks(<dummy irq handler>);This would work (I think), but it''s rather unpleasant to say the least.> come up with some suitable code to restore normal event delivery > without Xen interface changes.We do need this. It''s quite tricky having to carefully synthesise the pending_sel bit etc. regards, john _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 10 Mar 2006, at 17:50, John Levon wrote:>> HYPERVISOR_set_callbacks(<dummy irq handler>); > > This would work (I think), but it''s rather unpleasant to say the least.I''d rather have an unpleasant hack in the bowels of your debugger than an unpleasant additional interface in Xen that noone else will use. :-) -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Fri, Mar 10, 2006 at 06:32:28PM +0000, Keir Fraser wrote:> >> HYPERVISOR_set_callbacks(<dummy irq handler>); > > > >This would work (I think), but it''s rather unpleasant to say the least. > > I''d rather have an unpleasant hack in the bowels of your debugger than > an unpleasant additional interface in Xen that noone else will use. :-)Then we have no choice but to misuse the interfaces you''re prepared to provide. Oh well. john _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 11 Mar 2006, at 14:17, John Levon wrote:>>> This would work (I think), but it''s rather unpleasant to say the >>> least. >> >> I''d rather have an unpleasant hack in the bowels of your debugger than >> an unpleasant additional interface in Xen that noone else will use. >> :-) > > Then we have no choice but to misuse the interfaces you''re prepared to > provide. Oh well.I had some more of a think about this and actually there is a case for using a polling interface of this type in the pcifront driver. That sends a PCI-config-space request to the backend driver while it has interrupts disabled and currently has to spin on a software flag. At least the spin is usually short, but it''s not a nice thing to have to do. I''ll have another look at your patch next week and reconsider. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 12 Mar 2006, at 08:40, Keir Fraser wrote:>> Then we have no choice but to misuse the interfaces you''re prepared to >> provide. Oh well. > > I had some more of a think about this and actually there is a case for > using a polling interface of this type in the pcifront driver. That > sends a PCI-config-space request to the backend driver while it has > interrupts disabled and currently has to spin on a software flag. At > least the spin is usually short, but it''s not a nice thing to have to > do. > > I''ll have another look at your patch next week and reconsider.I just applied an extended version of your patch as changeset 9238 (c445d4...) to the xen-unstable repository. The changeset comment there says it all, really. It should end up in the public repo in the next couple of hours if I haven''t screwed anything up and the automatic regression tests pass. Until 3.0.2 is called (which could be within a week, depending on when 2.6.16 is released) the implementation and interfaces can be modified -- please let me know asap if anything needs changing! You can see a usage example of the new interface in the pcifront driver. The code for that is in the same changeset. Oh yes, and documentation isn''t yet updated. The patch for that will be a bit bigger than your original one. I''ll look into that tomorrow. :-) -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, Mar 14, 2006 at 06:40:11PM +0000, Keir Fraser wrote:> I just applied an extended version of your patch as changeset 9238 > (c445d4...) to the xen-unstable repository. The changeset comment there > says it all, really. It should end up in the public repo in the next > couple of hours if I haven''t screwed anything up and the automatic > regression tests pass.BTW, is there a particular reason why the staging repository isn''t available to pull? thanks john _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, Mar 14, 2006 at 06:40:11PM +0000, Keir Fraser wrote:> I just applied an extended version of your patch as changeset 9238 > (c445d4...) to the xen-unstable repository. The changeset comment there > says it all, really. It should end up in the public repo in the next > couple of hours if I haven''t screwed anything up and the automatic > regression tests pass. > > Until 3.0.2 is called (which could be within a week, depending on when > 2.6.16 is released) the implementation and interfaces can be modified > -- please let me know asap if anything needs changing! You can see a > usage example of the new interface in the pcifront driver. The code for > that is in the same changeset.This works for us, thanks! regards, john _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 14 Mar 2006, at 19:56, John Levon wrote:>> I just applied an extended version of your patch as changeset 9238 >> (c445d4...) to the xen-unstable repository. The changeset comment >> there >> says it all, really. It should end up in the public repo in the next >> couple of hours if I haven''t screwed anything up and the automatic >> regression tests pass. > > BTW, is there a particular reason why the staging repository isn''t > available to pull?The delay is usually only a couple of hours, if all goes well. And if it doesn''t we get a clear automated bug report without troubling other users of the tree with broken bits. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, 2006-03-14 at 18:40 +0000, Keir Fraser wrote:> On 12 Mar 2006, at 08:40, Keir Fraser wrote: > > I had some more of a think about this and actually there is a case for > > using a polling interface of this type in the pcifront driver. That > > sends a PCI-config-space request to the backend driver while it has > > interrupts disabled and currently has to spin on a software flag. At > > least the spin is usually short, but it''s not a nice thing to have to > > do. > > > > I''ll have another look at your patch next week and reconsider. > > I just applied an extended version of your patch as changeset 9238 > (c445d4...) to the xen-unstable repository. The changeset comment there > says it all, really. It should end up in the public repo in the next > couple of hours if I haven''t screwed anything up and the automatic > regression tests pass. > > Until 3.0.2 is called (which could be within a week, depending on when > 2.6.16 is released) the implementation and interfaces can be modified > -- please let me know asap if anything needs changing! You can see a > usage example of the new interface in the pcifront driver. The code for > that is in the same changeset. > > Oh yes, and documentation isn''t yet updated. The patch for that will be > a bit bigger than your original one. I''ll look into that tomorrow. :-)Keir, I''m not 100% sure, but I believe that jiffies does not get updated while interrupts are disabled (the timer is an interrupt in a domU, right?). I could never verify from the source if this is the case, but if the pci backend disappears or stops responding while the pci frontend is waiting for an operation to complete, this could be a problem for the new code in pcifront. I''ve been preparing my patch for late binding in the pci backend and if I unload the pci backend module (yes, unloading will work with my patch!), the pci frontend domain appears to hang (inside of the while loop in do_pci_op()) whereas prior to the HYPERVISOR_poll call and jiffies check, the ttl would count down to 0 and error out of the spin and clean-up gracefully. I chose my ttl method over jiffies in the first place because of this problem with jiffies not getting updated. What would be the best way to fix this problem? The ttl method I had is not that elegant... Ryan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 15 Mar 2006, at 23:58, Ryan wrote:> I''m not 100% sure, but I believe that jiffies does not get updated > while > interrupts are disabled (the timer is an interrupt in a domU, right?). > I > could never verify from the source if this is the case, but if the pci > backend disappears or stops responding while the pci frontend is > waiting > for an operation to complete, this could be a problem for the new code > in pcifront. I''ve been preparing my patch for late binding in the pci > backend and if I unload the pci backend module (yes, unloading will > work > with my patch!), the pci frontend domain appears to hang (inside of the > while loop in do_pci_op()) whereas prior to the HYPERVISOR_poll call > and > jiffies check, the ttl would count down to 0 and error out of the spin > and clean-up gracefully. I chose my ttl method over jiffies in the > first > place because of this problem with jiffies not getting updated. What > would be the best way to fix this problem? The ttl method I had is not > that elegant...Yes, you make a very good point about using jiffies. I''ve checked in a fix to use gettimeofday instead, which does continue to work even with interrupts disabled. Also, the timeout of 5s is arbitrary. We could make it rather less if you think that makes sense. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Thu, 2006-03-16 at 11:34 +0000, Keir Fraser wrote:> Yes, you make a very good point about using jiffies. I''ve checked in a > fix to use gettimeofday instead, which does continue to work even with > interrupts disabled. > > Also, the timeout of 5s is arbitrary. We could make it rather less if > you think that makes sense.I think gettimeofday will work fine. I think the timeout of 5s should probably be less, but I''ve not done any kind of study to determine the best time. Having the interrupts disabled for any large amount of time is probably a bad idea... I would think that 1 or 2 seconds maximum would be better. There should be no reason that the pci backend should take longer than that to complete a request (unless there''s some Xen scheduling concern I''m not aware of). Ryan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Maybe Matching Threads
- [PATCH,RFC 6/17] 32-on-64 shared info handling
- PV driver domains and S3 sleep
- [PATCH 0001/001] xen: multi page ring support for block devices
- [PATCH 0001/001] xen: multi page ring support for block devices
- [PATCH 0001/001] xen: multi page ring support for block devices