We have been seeing cases where events are being lost or delayed: the
event is pending but it is never handled. This series fixes two bugs
and some documentation that I found when reviewing the code.

We do not yet have a good repro of the lost events bug, so I'm not sure
whether these patches fix the problem; I would appreciate some careful
review of the reasoning behind these patches.

Thanks.

David
David Vrabel
2013-Aug-13 14:31 UTC
[PATCH 1/3] x86/xen: use memory barriers when enabling local irqs
From: David Vrabel <david.vrabel@citrix.com>

Because vcpu->evtchn_upcall_mask and vcpu->evtchn_upcall_pending are
be written by Xen as well as the guest, using barrier() (a
compiler-only barrier) in xen_enable_irq() and xen_restore_fl() is not
sufficient.

Use mb() (a full memory barrier) instead.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/irq.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index 01a4dc0..1a8d0d4 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -60,7 +60,7 @@ static void xen_restore_fl(unsigned long flags)
 
 	if (flags == 0) {
 		preempt_check_resched();
-		barrier(); /* unmask then check (avoid races) */
+		mb(); /* unmask then check (avoid races) */
 		if (unlikely(vcpu->evtchn_upcall_pending))
 			xen_force_evtchn_callback();
 	}
@@ -93,7 +93,7 @@ static void xen_irq_enable(void)
 	/* Doesn't matter if we get preempted here, because any
 	   pending event will get dealt with anyway. */
 
-	barrier(); /* unmask then check (avoid races) */
+	mb(); /* unmask then check (avoid races) */
 	if (unlikely(vcpu->evtchn_upcall_pending))
 		xen_force_evtchn_callback();
 }
-- 
1.7.2.5
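The race being fixed can be modelled in user space. The sketch below is
illustrative only, assuming GCC/Clang: struct vcpu_info_sketch and
force_evtchn_callback_sketch() are made-up stand-ins for the kernel's
vcpu_info and the hypercall, and __sync_synchronize() (a full fence,
mfence on x86) plays the role of mb():

/* User-space model of the unmask-then-check pattern; all names are
 * illustrative stand-ins, not the kernel's definitions. */
#include <stdio.h>

struct vcpu_info_sketch {
	volatile unsigned char evtchn_upcall_pending; /* written by "Xen" */
	volatile unsigned char evtchn_upcall_mask;    /* written by guest */
};

static void force_evtchn_callback_sketch(void)
{
	puts("upcall forced"); /* stands in for the hypercall */
}

static void unmask_then_check(struct vcpu_info_sketch *v)
{
	v->evtchn_upcall_mask = 0; /* unmask */

	/* barrier() would only stop the compiler reordering the store
	 * above and the load below; the CPU's store buffer can still
	 * let the load read a stale "not pending" before the store is
	 * globally visible.  __sync_synchronize() emits a full memory
	 * barrier, playing the role of the kernel's mb(). */
	__sync_synchronize();

	if (v->evtchn_upcall_pending) /* check */
		force_evtchn_callback_sketch();
}

int main(void)
{
	struct vcpu_info_sketch v = { .evtchn_upcall_pending = 1,
				      .evtchn_upcall_mask = 1 };
	unmask_then_check(&v);
	return 0;
}

With only a compiler barrier in place of the fence, the store to the
mask and the load of the pending flag may be reordered by the CPU,
which is exactly the window the patch closes.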
David Vrabel
2013-Aug-13 14:31 UTC
[PATCH 2/3] x86/xen: disable preemption when enabling local irqs
From: David Vrabel <david.vrabel@citrix.com>

If CONFIG_PREEMPT is enabled then xen_irq_enable() (and
xen_restore_fl()) could be preempted and rescheduled on a different
VCPU in between the clear of the mask and the check for pending
events.  This may result in events being lost as the upcall will
check for pending events on the wrong VCPU.

Fix this by disabling preemption around the unmask and check for
events.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/xen/irq.c |   25 ++++++++++++-------------
 1 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index 1a8d0d4..7a7a27d 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -47,23 +47,18 @@ static void xen_restore_fl(unsigned long flags)
 	/* convert from IF type flag */
 	flags = !(flags & X86_EFLAGS_IF);
 
-	/* There's a one instruction preempt window here.  We need to
-	   make sure we're don't switch CPUs between getting the vcpu
-	   pointer and updating the mask. */
+	/* See xen_irq_enable() for why preemption must be disabled. */
 	preempt_disable();
 	vcpu = this_cpu_read(xen_vcpu);
 	vcpu->evtchn_upcall_mask = flags;
-	preempt_enable_no_resched();
-
-	/* Doesn't matter if we get preempted here, because any
-	   pending event will get dealt with anyway. */
 
 	if (flags == 0) {
-		preempt_check_resched();
 		mb(); /* unmask then check (avoid races) */
 		if (unlikely(vcpu->evtchn_upcall_pending))
 			xen_force_evtchn_callback();
-	}
+		preempt_enable();
+	} else
+		preempt_enable_no_resched();
 }
 PV_CALLEE_SAVE_REGS_THUNK(xen_restore_fl);
 
@@ -82,10 +77,12 @@ static void xen_irq_enable(void)
 {
 	struct vcpu_info *vcpu;
 
-	/* We don't need to worry about being preempted here, since
-	   either a) interrupts are disabled, so no preemption, or b)
-	   the caller is confused and is trying to re-enable interrupts
-	   on an indeterminate processor. */
+	/*
+	 * We may be preempted as soon as vcpu->evtchn_upcall_mask is
+	 * cleared, so disable preemption to ensure we check for
+	 * events on the VCPU we are still running on.
+	 */
+	preempt_disable();
 
 	vcpu = this_cpu_read(xen_vcpu);
 	vcpu->evtchn_upcall_mask = 0;
@@ -96,6 +93,8 @@ static void xen_irq_enable(void)
 	mb(); /* unmask then check (avoid races) */
 	if (unlikely(vcpu->evtchn_upcall_pending))
 		xen_force_evtchn_callback();
+
+	preempt_enable();
 }
 PV_CALLEE_SAVE_REGS_THUNK(xen_irq_enable);
-- 
1.7.2.5
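To make the reasoning easier to review, here is the patched
xen_irq_enable() condensed into one annotated sketch.  It uses the same
kernel primitives as the diff above and adds comments marking the
preemption window; it is a reading aid, not a drop-in replacement:

static void xen_irq_enable_sketch(void)
{
	struct vcpu_info *vcpu;

	preempt_disable();	/* from here we cannot migrate VCPUs */

	vcpu = this_cpu_read(xen_vcpu);	/* vcpu_info of this VCPU */
	vcpu->evtchn_upcall_mask = 0;	/* unmask: upcalls may arrive */

	mb();	/* unmask must be visible before the pending check */

	/*
	 * Without preempt_disable(), a reschedule here could move this
	 * task to another VCPU: "vcpu" would still point at the old
	 * VCPU's vcpu_info, the check below would test the wrong VCPU,
	 * and a pending event on the old VCPU could go unhandled.
	 */
	if (unlikely(vcpu->evtchn_upcall_pending))
		xen_force_evtchn_callback();

	preempt_enable();	/* now a reschedule is harmless */
}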
David Vrabel
2013-Aug-13 14:31 UTC
[PATCH 3/3] xen/events: document behaviour when scanning the start word for events
From: David Vrabel <david.vrabel@citrix.com>

The original comment on the scanning of the start word on the 2nd
pass did not reflect the actual behaviour (the code was incorrectly
masking bit_idx instead of the pending word itself).

The documented behaviour is not actually required, since if events
were pending in the MSBs they would be immediately scanned anyway as
we go through the loop again.  Update the documentation to reflect
this (instead of trying to change the behaviour).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/events.c |   17 ++++++++++++-----
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index a58ac43..f866f50 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -1379,14 +1379,21 @@ static void __xen_evtchn_do_upcall(void)
 			pending_bits = active_evtchns(cpu, s, word_idx);
 			bit_idx = 0; /* usually scan entire word from start */
+			/*
+			 * We scan the starting word in two parts.
+			 *
+			 * 1st time: start in the middle, scanning the
+			 * MSBs.
+			 *
+			 * 2nd time: scan the whole word (not just the
+			 * parts skipped in the first pass) -- if an
+			 * event in the previously scanned bits is
+			 * pending again it would just be scanned on
+			 * the next loop anyway.
+			 */
 			if (word_idx == start_word_idx) {
-				/* We scan the starting word in two parts */
 				if (i == 0)
-					/* 1st time: start in the middle */
 					bit_idx = start_bit_idx;
-				else
-					/* 2nd time: mask bits done already */
-					bit_idx &= (1UL << start_bit_idx) - 1;
 			}
 
 			do {
-- 
1.7.2.5
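The fairness scheme the new comment describes can be demonstrated with
a small stand-alone model.  Everything below is hypothetical
(next_pending_bit() is invented for the example; the real code walks
per-CPU sets of event words), but the two scanning passes mirror the
logic above:

#include <stdio.h>

/* Lowest set bit at or above 'from', or -1; models the word scan. */
static int next_pending_bit(unsigned long pending, unsigned int from)
{
	pending &= ~0UL << from;	/* ignore bits below 'from' */
	return pending ? __builtin_ctzl(pending) : -1;
}

int main(void)
{
	unsigned long pending = (1UL << 2) | (1UL << 40); /* two events */
	unsigned int start_bit_idx = 10; /* where the last scan stopped */

	/* 1st pass over the start word: begin in the middle (MSBs),
	 * so bit 40 is found but bit 2 is skipped for fairness. */
	printf("1st pass: bit %d\n",
	       next_pending_bit(pending, start_bit_idx));

	/* 2nd pass: scan the whole word from bit 0; re-checking the
	 * already-scanned MSBs is harmless, since an event that became
	 * pending again would be picked up on the next loop anyway. */
	printf("2nd pass: bit %d\n", next_pending_bit(pending, 0));

	return 0;
}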
On 13/08/13 15:31, David Vrabel wrote:
> We have been seeing cases where events are being lost or delayed:
> the event is pending but it is never handled. This series fixes two
> bugs and some documentation that I found when reviewing the code.
>
> We do not yet have a good repro of the lost events bug, so I'm not
> sure whether these patches fix the problem; I would appreciate some
> careful review of the reasoning behind these patches.

Oops. Got Boris's email wrong.

David
Jan Beulich
2013-Aug-13 15:16 UTC
Re: [PATCH 1/3] x86/xen: use memory barriers when enabling local irqs
>>> On 13.08.13 at 16:31, David Vrabel <david.vrabel@citrix.com> wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> Because vcpu->evtchn_upcall_mask and vcpu->evtchn_upcall_pending are
> be written by Xen as well as the guest, using barrier() (a
> compiler-only barrier) in xen_enable_irq() and xen_restore_fl() is not
> sufficient.
>
> Use mb() (a full memory barrier) instead.

If this was generic code, I'd agree. But in x86 specific code I don't
see the need.

Jan

> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  arch/x86/xen/irq.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> index 01a4dc0..1a8d0d4 100644
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -60,7 +60,7 @@ static void xen_restore_fl(unsigned long flags)
>
>  	if (flags == 0) {
>  		preempt_check_resched();
> -		barrier(); /* unmask then check (avoid races) */
> +		mb(); /* unmask then check (avoid races) */
>  		if (unlikely(vcpu->evtchn_upcall_pending))
>  			xen_force_evtchn_callback();
>  	}
> @@ -93,7 +93,7 @@ static void xen_irq_enable(void)
>  	/* Doesn't matter if we get preempted here, because any
>  	   pending event will get dealt with anyway. */
>
> -	barrier(); /* unmask then check (avoid races) */
> +	mb(); /* unmask then check (avoid races) */
>  	if (unlikely(vcpu->evtchn_upcall_pending))
>  		xen_force_evtchn_callback();
>  }
> --
> 1.7.2.5
Boris Ostrovsky
2013-Aug-13 15:18 UTC
Re: [PATCH 1/3] x86/xen: use memory barriers when enabling local irqs
On 08/13/2013 10:31 AM, David Vrabel wrote:
> From: David Vrabel <david.vrabel@citrix.com>
>
> Because vcpu->evtchn_upcall_mask and vcpu->evtchn_upcall_pending are
> be written by Xen as well as the guest, using barrier() (a
> compiler-only barrier) in xen_enable_irq() and xen_restore_fl() is not
> sufficient.

Unneeded 'be', and xen_enable_irq -> xen_irq_enable.

> Use mb() (a full memory barrier) instead.

Are evtchn_upcall_mask and evtchn_upcall_pending written from the same
(physical) processor during the potential race? If yes then I am not
sure this will make any difference, since I think sysret/iret, syscall
and interrupts have an implicit mfence.

It won't hurt to have mb(); all I am trying to say is that this may
not be the cause of the lost events.

-boris

> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  arch/x86/xen/irq.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> index 01a4dc0..1a8d0d4 100644
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -60,7 +60,7 @@ static void xen_restore_fl(unsigned long flags)
>
>  	if (flags == 0) {
>  		preempt_check_resched();
> -		barrier(); /* unmask then check (avoid races) */
> +		mb(); /* unmask then check (avoid races) */
>  		if (unlikely(vcpu->evtchn_upcall_pending))
>  			xen_force_evtchn_callback();
>  	}
> @@ -93,7 +93,7 @@ static void xen_irq_enable(void)
>  	/* Doesn't matter if we get preempted here, because any
>  	   pending event will get dealt with anyway. */
>
> -	barrier(); /* unmask then check (avoid races) */
> +	mb(); /* unmask then check (avoid races) */
>  	if (unlikely(vcpu->evtchn_upcall_pending))
>  		xen_force_evtchn_callback();
>  }