This series contains two independent but related fixes to do with irqsave/restore semantics. The first patch prevents local_irq_restore() from corrupting system flags. The second was constructed using the compile errors from the third patch, separated for clarity and reordered to prevent bisection problems. The third patch uses a BUILD_BUG_ON() to ensure that the flags parameter to spin_lock_irqsave is wide enough to not trucate the result from local_irq_save(). I have not compile tested on arm, so request that one of the maintainers explicitly acks the final patch. Any compile failures will almost certainly be fixed with an int->unsigned long conversion on the affected variable(s). Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> CC: Ian Campbell <ian.campbell@citrix.com> CC: Stefano Stabellini <stefano.stabellini@citrix.com> CC: Tim Deegan <tim@xen.org> CC: George Dunlap <george.dunlap@eu.citrix.com> -- 1.7.10.4
Andrew Cooper
2013-Oct-21 13:41 UTC
[PATCH 1/3] x86/irq: local_irq_restore() should not blindly popf
local_irq_restore() should only be concerned with possibly changing the interrupt flag. A blind popf could corrupt other system flags. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> --- xen/include/asm-x86/system.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h index 6ab7d56..cbf0f6a 100644 --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -159,8 +159,10 @@ static always_inline unsigned long __cmpxchg( #define local_irq_restore(x) \ ({ \ BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ - asm volatile ( "push" __OS " %0 ; popf" __OS \ - : : "g" (x) : "memory", "cc" ); \ + if ( x & X86_EFLAGS_IF ) \ + local_irq_enable(); \ + else \ + local_irq_disable(); \ }) static inline int local_irq_is_enabled(void) -- 1.7.10.4
Andrew Cooper
2013-Oct-21 13:41 UTC
[PATCH 2/3] xen: Widen flags parameter for spinlock_irqsave() and friends
These issues were detected using the subsequent patch which forces a compilation error if the result from local_irq_save() would be truncated. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> CC: George Dunlap <george.dunlap@eu.citrix.com> --- xen/common/sched_credit2.c | 7 ++++--- xen/common/trace.c | 2 +- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index 84e547b..4e68375 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -1480,7 +1480,7 @@ static void * csched_alloc_domdata(const struct scheduler *ops, struct domain *dom) { struct csched_dom *sdom; - int flags; + unsigned long flags; sdom = xzalloc(struct csched_dom); if ( sdom == NULL ) @@ -1524,7 +1524,7 @@ csched_dom_init(const struct scheduler *ops, struct domain *dom) static void csched_free_domdata(const struct scheduler *ops, void *data) { - int flags; + unsigned long flags; struct csched_dom *sdom = data; spin_lock_irqsave(&CSCHED_PRIV(ops)->lock, flags); @@ -1944,7 +1944,8 @@ static void deactivate_runqueue(struct csched_private *prv, int rqi) static void init_pcpu(const struct scheduler *ops, int cpu) { - int rqi, flags; + int rqi; + unsigned long flags; struct csched_private *prv = CSCHED_PRIV(ops); struct csched_runqueue_data *rqd; spinlock_t *old_lock; diff --git a/xen/common/trace.c b/xen/common/trace.c index 63ea0b7..41ddc33 100644 --- a/xen/common/trace.c +++ b/xen/common/trace.c @@ -420,7 +420,7 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc) * hypercall returns, no more records should be placed into the buffers. */ for_each_online_cpu(i) { - int flags; + unsigned long flags; spin_lock_irqsave(&per_cpu(t_lock, i), flags); per_cpu(lost_records, i)=0; spin_unlock_irqrestore(&per_cpu(t_lock, i), flags); -- 1.7.10.4
Andrew Cooper
2013-Oct-21 13:41 UTC
[PATCH 3/3] common/spinlock: Ensure the flags parameter is wide enough
Because of the construction of spin_lock_irq() (and varients), the flags parameter could be trucated. Use a BUILD_BUG_ON() to verify the width of the parameter. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> CC: Ian Campbell <ian.campbell@citrix.com> CC: Stefano Stabellini <stefano.stabellini@citrix.com> CC: Tim Deegan <tim@xen.org> --- The previous patch in the series fixes the compilation issues I found as a result of this patch, but I have not tested on arm. I therefore request an explicit ack from an arm maintainer before this is committed. --- xen/include/xen/spinlock.h | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/xen/include/xen/spinlock.h b/xen/include/xen/spinlock.h index 76581c5..12b0a89 100644 --- a/xen/include/xen/spinlock.h +++ b/xen/include/xen/spinlock.h @@ -188,7 +188,11 @@ int _rw_is_write_locked(rwlock_t *lock); #define spin_lock(l) _spin_lock(l) #define spin_lock_irq(l) _spin_lock_irq(l) -#define spin_lock_irqsave(l, f) ((f) = _spin_lock_irqsave(l)) +#define spin_lock_irqsave(l, f) \ + ({ \ + BUILD_BUG_ON(sizeof(f) != sizeof(unsigned long)); \ + ((f) = _spin_lock_irqsave(l)); \ + }) #define spin_unlock(l) _spin_unlock(l) #define spin_unlock_irq(l) _spin_unlock_irq(l) @@ -220,7 +224,11 @@ int _rw_is_write_locked(rwlock_t *lock); #define read_lock(l) _read_lock(l) #define read_lock_irq(l) _read_lock_irq(l) -#define read_lock_irqsave(l, f) ((f) = _read_lock_irqsave(l)) +#define read_lock_irqsave(l, f) \ + ({ \ + BUILD_BUG_ON(sizeof(f) != sizeof(unsigned long)); \ + ((f) = _read_lock_irqsave(l)); \ + }) #define read_unlock(l) _read_unlock(l) #define read_unlock_irq(l) _read_unlock_irq(l) @@ -229,7 +237,11 @@ int _rw_is_write_locked(rwlock_t *lock); #define write_lock(l) _write_lock(l) #define write_lock_irq(l) _write_lock_irq(l) -#define write_lock_irqsave(l, f) ((f) = _write_lock_irqsave(l)) +#define write_lock_irqsave(l, f) \ + ({ \ + BUILD_BUG_ON(sizeof(f) != sizeof(unsigned long)); \ + ((f) = _write_lock_irqsave(l)); \ + }) #define write_trylock(l) _write_trylock(l) #define write_unlock(l) _write_unlock(l) -- 1.7.10.4
David Vrabel
2013-Oct-21 13:58 UTC
Re: [PATCH 1/3] x86/irq: local_irq_restore() should not blindly popf
On 21/10/13 14:41, Andrew Cooper wrote:> local_irq_restore() should only be concerned with possibly changing the > interrupt flag. A blind popf could corrupt other system flags. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > CC: Keir Fraser <keir@xen.org> > CC: Jan Beulich <JBeulich@suse.com> > --- > xen/include/asm-x86/system.h | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h > index 6ab7d56..cbf0f6a 100644 > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -159,8 +159,10 @@ static always_inline unsigned long __cmpxchg( > #define local_irq_restore(x) \ > ({ \ > BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ > - asm volatile ( "push" __OS " %0 ; popf" __OS \ > - : : "g" (x) : "memory", "cc" ); \ > + if ( x & X86_EFLAGS_IF ) \ > + local_irq_enable(); \ > + else \ > + local_irq_disable(); \ > })This adds a branch in a potentially hot path. Is the local_irq_disable() needed? Interrupts should already be disabled on entry. David
On 21/10/2013 14:41, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> This series contains two independent but related fixes to do with > irqsave/restore semantics. > > The first patch prevents local_irq_restore() from corrupting system flags. > > The second was constructed using the compile errors from the third patch, > separated for clarity and reordered to prevent bisection problems. > > The third patch uses a BUILD_BUG_ON() to ensure that the flags parameter to > spin_lock_irqsave is wide enough to not trucate the result from > local_irq_save(). > > I have not compile tested on arm, so request that one of the maintainers > explicitly acks the final patch. Any compile failures will almost certainly > be fixed with an int->unsigned long conversion on the affected variable(s). > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > CC: Keir Fraser <keir@xen.org> > CC: Jan Beulich <JBeulich@suse.com> > CC: Ian Campbell <ian.campbell@citrix.com> > CC: Stefano Stabellini <stefano.stabellini@citrix.com> > CC: Tim Deegan <tim@xen.org> > CC: George Dunlap <george.dunlap@eu.citrix.com>Acked-by: Keir Fraser <keir@xen.org>> -- > 1.7.10.4 >
Keir Fraser
2013-Oct-21 14:09 UTC
Re: [PATCH 1/3] x86/irq: local_irq_restore() should not blindly popf
On 21/10/2013 14:58, "David Vrabel" <david.vrabel@citrix.com> wrote:> On 21/10/13 14:41, Andrew Cooper wrote: >> local_irq_restore() should only be concerned with possibly changing the >> interrupt flag. A blind popf could corrupt other system flags. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> CC: Keir Fraser <keir@xen.org> >> CC: Jan Beulich <JBeulich@suse.com> >> --- >> xen/include/asm-x86/system.h | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h >> index 6ab7d56..cbf0f6a 100644 >> --- a/xen/include/asm-x86/system.h >> +++ b/xen/include/asm-x86/system.h >> @@ -159,8 +159,10 @@ static always_inline unsigned long __cmpxchg( >> #define local_irq_restore(x) \ >> ({ \ >> BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >> - asm volatile ( "push" __OS " %0 ; popf" __OS \ >> - : : "g" (x) : "memory", "cc" ); \ >> + if ( x & X86_EFLAGS_IF ) \ >> + local_irq_enable(); \ >> + else \ >> + local_irq_disable(); \ >> }) > > This adds a branch in a potentially hot path. > > Is the local_irq_disable() needed? Interrupts should already be disabled > on entry.If that is always true, and so we get rid of the else, then local_irq_restore() should ASSERT it on entry. -- Keir> David
Jan Beulich
2013-Oct-21 14:32 UTC
Re: [PATCH 1/3] x86/irq: local_irq_restore() should not blindly popf
>>> On 21.10.13 at 16:09, Keir Fraser <keir.xen@gmail.com> wrote: > On 21/10/2013 14:58, "David Vrabel" <david.vrabel@citrix.com> wrote: > >> On 21/10/13 14:41, Andrew Cooper wrote: >>> local_irq_restore() should only be concerned with possibly changing the >>> interrupt flag. A blind popf could corrupt other system flags. >>> >>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>> CC: Keir Fraser <keir@xen.org> >>> CC: Jan Beulich <JBeulich@suse.com> >>> --- >>> xen/include/asm-x86/system.h | 6 ++++-- >>> 1 file changed, 4 insertions(+), 2 deletions(-) >>> >>> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h >>> index 6ab7d56..cbf0f6a 100644 >>> --- a/xen/include/asm-x86/system.h >>> +++ b/xen/include/asm-x86/system.h >>> @@ -159,8 +159,10 @@ static always_inline unsigned long __cmpxchg( >>> #define local_irq_restore(x) \ >>> ({ \ >>> BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >>> - asm volatile ( "push" __OS " %0 ; popf" __OS \ >>> - : : "g" (x) : "memory", "cc" ); \ >>> + if ( x & X86_EFLAGS_IF ) \ >>> + local_irq_enable(); \ >>> + else \ >>> + local_irq_disable(); \ >>> }) >> >> This adds a branch in a potentially hot path. >> >> Is the local_irq_disable() needed? Interrupts should already be disabled >> on entry. > > If that is always true, and so we get rid of the else, then > local_irq_restore() should ASSERT it on entry.Let''s not go that route - the macro is supposed to do what it says - restore the prior state of the interrupt flag. To deal with the two branches the code adds I''d rather recommend merging current and to-be-restored flags - the necessary "and" and "or" are quite likely faster than any mis-predicted branch (and perhaps as fast as a correctly predicted one). Jan
Jan Beulich
2013-Oct-21 14:42 UTC
Re: [PATCH 1/3] x86/irq: local_irq_restore() should not blindly popf
>>> On 21.10.13 at 15:41, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -159,8 +159,10 @@ static always_inline unsigned long __cmpxchg( > #define local_irq_restore(x) \ > ({ \ > BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ > - asm volatile ( "push" __OS " %0 ; popf" __OS \ > - : : "g" (x) : "memory", "cc" ); \ > + if ( x & X86_EFLAGS_IF ) \ > + local_irq_enable(); \ > + else \ > + local_irq_disable(); \_If_ you''re going to re-do this using the mask-and-merge approach I suggested in the other reply, may I ask that you also replace the open coded (1<<9) a few lines down with X86_EFLAGS_IF (at once making the comment there superfluous)? Jan
Ian Campbell
2013-Oct-21 15:15 UTC
Re: [PATCH 3/3] common/spinlock: Ensure the flags parameter is wide enough
On Mon, 2013-10-21 at 14:41 +0100, Andrew Cooper wrote:> The previous patch in the series fixes the compilation issues I found as a > result of this patch, but I have not tested on arm. I therefore request an > explicit ack from an arm maintainer before this is committed.I build tested just this patch on current staging and it failed, because I hadn''t spotted that the previous patch was generic and not x86 specific. With patch 2/3 added both arm32 and arm64 hypervisors build for me, so: Acked-by: Ian Campbell <ian.campbell@citrix.com> BTW cross-compiling the hypervisor is more or less trivial, compilers can be found at https://launchpad.net/linaro-toolchain-binaries/+download and the runes are: make dist-xen XEN_TARGET_ARCH=arm32 CROSS_COMPILER=arm-linux-gnueabihf- make dist-xen XEN_TARGET_ARCH=arm64 CROSS_COMPILER=aarch64-linux-gnu- Ian.
Jan Beulich
2013-Oct-21 15:24 UTC
Re: [PATCH 1/3] x86/irq: local_irq_restore() should not blindly popf
>>> On 21.10.13 at 16:32, "Jan Beulich" <JBeulich@suse.com> wrote: >>>> On 21.10.13 at 16:09, Keir Fraser <keir.xen@gmail.com> wrote: >> On 21/10/2013 14:58, "David Vrabel" <david.vrabel@citrix.com> wrote: >> >>> On 21/10/13 14:41, Andrew Cooper wrote: >>>> local_irq_restore() should only be concerned with possibly changing the >>>> interrupt flag. A blind popf could corrupt other system flags. >>>> >>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>>> CC: Keir Fraser <keir@xen.org> >>>> CC: Jan Beulich <JBeulich@suse.com> >>>> --- >>>> xen/include/asm-x86/system.h | 6 ++++-- >>>> 1 file changed, 4 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h >>>> index 6ab7d56..cbf0f6a 100644 >>>> --- a/xen/include/asm-x86/system.h >>>> +++ b/xen/include/asm-x86/system.h >>>> @@ -159,8 +159,10 @@ static always_inline unsigned long __cmpxchg( >>>> #define local_irq_restore(x) \ >>>> ({ \ >>>> BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >>>> - asm volatile ( "push" __OS " %0 ; popf" __OS \ >>>> - : : "g" (x) : "memory", "cc" ); \ >>>> + if ( x & X86_EFLAGS_IF ) \ >>>> + local_irq_enable(); \ >>>> + else \ >>>> + local_irq_disable(); \ >>>> }) >>> >>> This adds a branch in a potentially hot path. >>> >>> Is the local_irq_disable() needed? Interrupts should already be disabled >>> on entry. >> >> If that is always true, and so we get rid of the else, then >> local_irq_restore() should ASSERT it on entry. > > Let''s not go that route - the macro is supposed to do what it > says - restore the prior state of the interrupt flag.And there is actually an example in the tree that might get broken if we went that suggested route: xen/arch/x86/io_apic.c:timer_irq_works(). Albeit I''m not sure - it looks like the function gets called only in contexts where interrupts are already enabled, but considering its use of local_irq_enable() that may not have been the case earlier. In any event - I don''t think we should take the risk of some rarely executed (perhaps error handling only) path doing something like that. Jan
Andrew Cooper
2013-Oct-21 16:33 UTC
[Patch 1/3 v2] x86/irq: local_irq_restore() should not blindly popf
local_irq_restore() should only be concerned with possibly changing the interrupt flag. A blind popf could corrupt other system flags. While playing in this area, fixup an opencoded use of X86_EFLAGS_IF. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> --- This is rather more RFC. It boots and runs VMs, so I am fairly sure it is functionally correct, but I cant help feeling there might be a neater way to do the inline assembly. Suggestions welcome. --- xen/include/asm-x86/system.h | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h index 6ab7d56..ff52671 100644 --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -3,6 +3,7 @@ #include <xen/lib.h> #include <xen/bitops.h> +#include <asm/processor.h> #define read_segment_register(name) \ ({ u16 __sel; \ @@ -159,15 +160,19 @@ static always_inline unsigned long __cmpxchg( #define local_irq_restore(x) \ ({ \ BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ - asm volatile ( "push" __OS " %0 ; popf" __OS \ - : : "g" (x) : "memory", "cc" ); \ + asm volatile ( \ + "pushf" __OS "\n\t" \ + "and" __OS " %0, (%%" __OP "sp)\n\t" \ + "orw %1, (%%" __OP "sp)\n\t" \ + "popf" __OS "\n\t" : : "g" ( ~X86_EFLAGS_IF ), \ + "g" ( x & X86_EFLAGS_IF ) ); \ }) static inline int local_irq_is_enabled(void) { unsigned long flags; local_save_flags(flags); - return !!(flags & (1<<9)); /* EFLAGS_IF */ + return !!(flags & X86_EFLAGS_IF); } #define BROKEN_ACPI_Sx 0x0001 -- 1.7.10.4
Keir Fraser
2013-Oct-21 18:18 UTC
Re: [Patch 1/3 v2] x86/irq: local_irq_restore() should not blindly popf
On 21/10/2013 17:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> local_irq_restore() should only be concerned with possibly changing the > interrupt flag. A blind popf could corrupt other system flags. > > While playing in this area, fixup an opencoded use of X86_EFLAGS_IF. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > CC: Keir Fraser <keir@xen.org> > CC: Jan Beulich <JBeulich@suse.com> > > --- > > This is rather more RFC. It boots and runs VMs, so I am fairly sure it is > functionally correct, but I cant help feeling there might be a neater way to > do the inline assembly. Suggestions welcome. > --- > xen/include/asm-x86/system.h | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h > index 6ab7d56..ff52671 100644 > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -3,6 +3,7 @@ > > #include <xen/lib.h> > #include <xen/bitops.h> > +#include <asm/processor.h> > > #define read_segment_register(name) \ > ({ u16 __sel; \ > @@ -159,15 +160,19 @@ static always_inline unsigned long __cmpxchg( > #define local_irq_restore(x) \ > ({ \ > BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ > - asm volatile ( "push" __OS " %0 ; popf" __OS \ > - : : "g" (x) : "memory", "cc" ); \ > + asm volatile ( \ > + "pushf" __OS "\n\t" \ > + "and" __OS " %0, (%%" __OP "sp)\n\t" \ > + "orw %1, (%%" __OP "sp)\n\t" \ > + "popf" __OS "\n\t" : : "g" ( ~X86_EFLAGS_IF ), \Would this be better as a constant constraint ("i")?> + "g" ( x & X86_EFLAGS_IF ) ); \ > }) > > static inline int local_irq_is_enabled(void) > { > unsigned long flags; > local_save_flags(flags); > - return !!(flags & (1<<9)); /* EFLAGS_IF */ > + return !!(flags & X86_EFLAGS_IF); > } > > #define BROKEN_ACPI_Sx 0x0001
Andrew Cooper
2013-Oct-21 18:30 UTC
Re: [Patch 1/3 v2] x86/irq: local_irq_restore() should not blindly popf
On 21/10/13 19:18, Keir Fraser wrote:> On 21/10/2013 17:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: > >> local_irq_restore() should only be concerned with possibly changing the >> interrupt flag. A blind popf could corrupt other system flags. >> >> While playing in this area, fixup an opencoded use of X86_EFLAGS_IF. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> CC: Keir Fraser <keir@xen.org> >> CC: Jan Beulich <JBeulich@suse.com> >> >> --- >> >> This is rather more RFC. It boots and runs VMs, so I am fairly sure it is >> functionally correct, but I cant help feeling there might be a neater way to >> do the inline assembly. Suggestions welcome. >> --- >> xen/include/asm-x86/system.h | 11 ++++++++--- >> 1 file changed, 8 insertions(+), 3 deletions(-) >> >> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h >> index 6ab7d56..ff52671 100644 >> --- a/xen/include/asm-x86/system.h >> +++ b/xen/include/asm-x86/system.h >> @@ -3,6 +3,7 @@ >> >> #include <xen/lib.h> >> #include <xen/bitops.h> >> +#include <asm/processor.h> >> >> #define read_segment_register(name) \ >> ({ u16 __sel; \ >> @@ -159,15 +160,19 @@ static always_inline unsigned long __cmpxchg( >> #define local_irq_restore(x) \ >> ({ \ >> BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >> - asm volatile ( "push" __OS " %0 ; popf" __OS \ >> - : : "g" (x) : "memory", "cc" ); \ >> + asm volatile ( \ >> + "pushf" __OS "\n\t" \ >> + "and" __OS " %0, (%%" __OP "sp)\n\t" \ >> + "orw %1, (%%" __OP "sp)\n\t" \ >> + "popf" __OS "\n\t" : : "g" ( ~X86_EFLAGS_IF ), \ > Would this be better as a constant constraint ("i")?I was wondering what the best practice for this would be. In most cases, I would imagine that an immediate would be used. However, as this is a define and therefore forcibly inlined everywhere it is used, it is just possible that the compiler could find a ~X86_EFLAGS_IF already in context, and optimise down to an "and r64,r/m64". ~Andrew> >> + "g" ( x & X86_EFLAGS_IF ) ); \ >> }) >> >> static inline int local_irq_is_enabled(void) >> { >> unsigned long flags; >> local_save_flags(flags); >> - return !!(flags & (1<<9)); /* EFLAGS_IF */ >> + return !!(flags & X86_EFLAGS_IF); >> } >> >> #define BROKEN_ACPI_Sx 0x0001 >
Keir Fraser
2013-Oct-21 18:37 UTC
Re: [Patch 1/3 v2] x86/irq: local_irq_restore() should not blindly popf
On 21/10/2013 19:30, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:>>> #define read_segment_register(name) \ >>> ({ u16 __sel; \ >>> @@ -159,15 +160,19 @@ static always_inline unsigned long __cmpxchg( >>> #define local_irq_restore(x) \ >>> ({ \ >>> BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >>> - asm volatile ( "push" __OS " %0 ; popf" __OS \ >>> - : : "g" (x) : "memory", "cc" ); \ >>> + asm volatile ( \ >>> + "pushf" __OS "\n\t" \ >>> + "and" __OS " %0, (%%" __OP "sp)\n\t" \ >>> + "orw %1, (%%" __OP "sp)\n\t" \ >>> + "popf" __OS "\n\t" : : "g" ( ~X86_EFLAGS_IF ), \ >> Would this be better as a constant constraint ("i")? > > I was wondering what the best practice for this would be. > > In most cases, I would imagine that an immediate would be used. > However, as this is a define and therefore forcibly inlined everywhere > it is used, it is just possible that the compiler could find a > ~X86_EFLAGS_IF already in context, and optimise down to an "and r64,r/m64".Oh, g includes i, I forgot that. Well your choice is best then. Acked-by: Keir Fraser <keir@xen.org>> ~Andrew
Jan Beulich
2013-Oct-22 08:35 UTC
Re: [Patch 1/3 v2] x86/irq: local_irq_restore() should not blindly popf
>>> On 21.10.13 at 20:37, Keir Fraser <keir.xen@gmail.com> wrote: > On 21/10/2013 19:30, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: > >>>> #define read_segment_register(name) \ >>>> ({ u16 __sel; \ >>>> @@ -159,15 +160,19 @@ static always_inline unsigned long __cmpxchg( >>>> #define local_irq_restore(x) \ >>>> ({ \ >>>> BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >>>> - asm volatile ( "push" __OS " %0 ; popf" __OS \ >>>> - : : "g" (x) : "memory", "cc" ); \ >>>> + asm volatile ( \ >>>> + "pushf" __OS "\n\t" \ >>>> + "and" __OS " %0, (%%" __OP "sp)\n\t" \ >>>> + "orw %1, (%%" __OP "sp)\n\t" \ >>>> + "popf" __OS "\n\t" : : "g" ( ~X86_EFLAGS_IF ), \ >>> Would this be better as a constant constraint ("i")? >> >> I was wondering what the best practice for this would be. >> >> In most cases, I would imagine that an immediate would be used. >> However, as this is a define and therefore forcibly inlined everywhere >> it is used, it is just possible that the compiler could find a >> ~X86_EFLAGS_IF already in context, and optimise down to an "and r64,r/m64". > > Oh, g includes i, I forgot that. Well your choice is best then.Sorry, but no. "g" also includes "m", and - the other operand of both operations is a memory operand already, so this one can''t also be a memory one, - on a non-debug build (without frame pointers) an eventual %rsp-relative memory location would be broken due to the shifted stack offsets resulting from the PUSHF. Hence both constraints can at best be "ri". Further I have a hard time seeing how the "orw" used above can even have built successfully: If a register gets picked (which ought to be the common case), opcode suffix and register name ought to collide. And "orw" is a bad choice here anyway, in that this is a 2-byte write following an 8-byte one. And finally - what''s the point of using __OS in new assembly constructs? I was actually considering cleaning up all this hard to read cruft, since we no longer care about the 32-bit case. Jan
Andrew Cooper
2013-Oct-22 08:56 UTC
Re: [Patch 1/3 v2] x86/irq: local_irq_restore() should not blindly popf
On 22/10/13 09:35, Jan Beulich wrote:>>>> On 21.10.13 at 20:37, Keir Fraser <keir.xen@gmail.com> wrote: >> On 21/10/2013 19:30, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: >> >>>>> #define read_segment_register(name) \ >>>>> ({ u16 __sel; \ >>>>> @@ -159,15 +160,19 @@ static always_inline unsigned long __cmpxchg( >>>>> #define local_irq_restore(x) \ >>>>> ({ \ >>>>> BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >>>>> - asm volatile ( "push" __OS " %0 ; popf" __OS \ >>>>> - : : "g" (x) : "memory", "cc" ); \ >>>>> + asm volatile ( \ >>>>> + "pushf" __OS "\n\t" \ >>>>> + "and" __OS " %0, (%%" __OP "sp)\n\t" \ >>>>> + "orw %1, (%%" __OP "sp)\n\t" \ >>>>> + "popf" __OS "\n\t" : : "g" ( ~X86_EFLAGS_IF ), \ >>>> Would this be better as a constant constraint ("i")? >>> I was wondering what the best practice for this would be. >>> >>> In most cases, I would imagine that an immediate would be used. >>> However, as this is a define and therefore forcibly inlined everywhere >>> it is used, it is just possible that the compiler could find a >>> ~X86_EFLAGS_IF already in context, and optimise down to an "and r64,r/m64". >> Oh, g includes i, I forgot that. Well your choice is best then. > Sorry, but no. "g" also includes "m", and > - the other operand of both operations is a memory operand > already, so this one can''t also be a memory one, > - on a non-debug build (without frame pointers) an eventual > %rsp-relative memory location would be broken due to the > shifted stack offsets resulting from the PUSHF. > Hence both constraints can at best be "ri".Ok - I can change this.> > Further I have a hard time seeing how the "orw" used above > can even have built successfully: If a register gets picked > (which ought to be the common case), opcode suffix and > register name ought to collide. And "orw" is a bad choice here > anyway, in that this is a 2-byte write following an 8-byte one.GCC correctly picks a 2-byte register given the orw. Looking at the disassembly, it usually chooses %r12w Why is symmetry of writes important here? We are possibly setting bit 9 alone.> > And finally - what''s the point of using __OS in new assembly > constructs? I was actually considering cleaning up all this hard > to read cruft, since we no longer care about the 32-bit case.I am happy to remove __OS/__OP if that is considered a good thing moving forward - I was merely using the prevailing style. ~Andrew
Jan Beulich
2013-Oct-22 09:28 UTC
Re: [Patch 1/3 v2] x86/irq: local_irq_restore() should not blindly popf
>>> On 22.10.13 at 10:56, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > On 22/10/13 09:35, Jan Beulich wrote: >> Further I have a hard time seeing how the "orw" used above >> can even have built successfully: If a register gets picked >> (which ought to be the common case), opcode suffix and >> register name ought to collide. And "orw" is a bad choice here >> anyway, in that this is a 2-byte write following an 8-byte one. > > GCC correctly picks a 2-byte register given the orw. Looking at the > disassembly, it usually chooses %r12wTry compiling this void test(unsigned long x) { asm volatile("orw %0, (%0)" :: "ri" (x)); } with a 32-bit gcc (or with -m32). In fact I''m surprised the assembler doesn''t generate a warning (or even error) in the 64-bit case - this very much smells like a bug (and I looked at that code the other day and got the impression that the 64-bit one would be _less_ forgiving here). In any event - if you want to stay with "orw", you ought to use "%w<number>" for the operand.> Why is symmetry of writes important here? We are possibly setting bit 9 > alone.Store-to-load-forwarding works only when the store size is larger than the load one. So I shouldn''t have drawn your attention to the preceding "andq" but to the following "pushfq" (though I wouldn''t be surprised if mixed size writes to overlapping memory locations would also suffer from penalties). Jan
Andrew Cooper
2013-Oct-22 10:14 UTC
[PATCH 1/3 v3] x86/irq: local_irq_restore() should not blindly popf
local_irq_restore() should only be concerned with possibly changing the interrupt flag. A blind popf could corrupt other system flags. While playing in this area, fixup an opencoded use of X86_EFLAGS_IF. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <JBeulich@suse.com> --- Change in v3: * asm tweaks --- xen/include/asm-x86/system.h | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h index 6ab7d56..7d9f31b 100644 --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -3,6 +3,7 @@ #include <xen/lib.h> #include <xen/bitops.h> +#include <asm/processor.h> #define read_segment_register(name) \ ({ u16 __sel; \ @@ -159,15 +160,19 @@ static always_inline unsigned long __cmpxchg( #define local_irq_restore(x) \ ({ \ BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ - asm volatile ( "push" __OS " %0 ; popf" __OS \ - : : "g" (x) : "memory", "cc" ); \ + asm volatile ( \ + "pushfq\n\t" \ + "andq %0, (%%rsp)\n\t" \ + "orq %1, (%%rsp)\n\t" \ + "popfq\n\t" : : "ri" ( ~X86_EFLAGS_IF ), \ + "ri" ( x & X86_EFLAGS_IF ) ); \ }) static inline int local_irq_is_enabled(void) { unsigned long flags; local_save_flags(flags); - return !!(flags & (1<<9)); /* EFLAGS_IF */ + return !!(flags & X86_EFLAGS_IF); } #define BROKEN_ACPI_Sx 0x0001 -- 1.7.10.4
Keir Fraser
2013-Oct-22 13:27 UTC
Re: [PATCH 1/3 v3] x86/irq: local_irq_restore() should not blindly popf
On 22/10/2013 11:14, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> local_irq_restore() should only be concerned with possibly changing the > interrupt flag. A blind popf could corrupt other system flags. > > While playing in this area, fixup an opencoded use of X86_EFLAGS_IF. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > CC: Keir Fraser <keir@xen.org> > CC: Jan Beulich <JBeulich@suse.com>Acked-by: Keir Fraser <keir@xen.org>> --- > > Change in v3: > * asm tweaks > --- > xen/include/asm-x86/system.h | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h > index 6ab7d56..7d9f31b 100644 > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -3,6 +3,7 @@ > > #include <xen/lib.h> > #include <xen/bitops.h> > +#include <asm/processor.h> > > #define read_segment_register(name) \ > ({ u16 __sel; \ > @@ -159,15 +160,19 @@ static always_inline unsigned long __cmpxchg( > #define local_irq_restore(x) \ > ({ \ > BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ > - asm volatile ( "push" __OS " %0 ; popf" __OS \ > - : : "g" (x) : "memory", "cc" ); \ > + asm volatile ( \ > + "pushfq\n\t" \ > + "andq %0, (%%rsp)\n\t" \ > + "orq %1, (%%rsp)\n\t" \ > + "popfq\n\t" : : "ri" ( ~X86_EFLAGS_IF ), \ > + "ri" ( x & X86_EFLAGS_IF ) ); \ > }) > > static inline int local_irq_is_enabled(void) > { > unsigned long flags; > local_save_flags(flags); > - return !!(flags & (1<<9)); /* EFLAGS_IF */ > + return !!(flags & X86_EFLAGS_IF); > } > > #define BROKEN_ACPI_Sx 0x0001
Jan Beulich
2013-Oct-29 14:53 UTC
Re: [Patch 1/3 v2] x86/irq: local_irq_restore() should not blindly popf
>>> On 22.10.13 at 11:28, "Jan Beulich" <JBeulich@suse.com> wrote: >>>> On 22.10.13 at 10:56, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> On 22/10/13 09:35, Jan Beulich wrote: >>> Further I have a hard time seeing how the "orw" used above >>> can even have built successfully: If a register gets picked >>> (which ought to be the common case), opcode suffix and >>> register name ought to collide. And "orw" is a bad choice here >>> anyway, in that this is a 2-byte write following an 8-byte one. >> >> GCC correctly picks a 2-byte register given the orw. Looking at the >> disassembly, it usually chooses %r12w > > Try compiling this > > void test(unsigned long x) { > asm volatile("orw %0, (%0)" :: "ri" (x)); > } > > with a 32-bit gcc (or with -m32). In fact I''m surprised the assembler > doesn''t generate a warning (or even error) in the 64-bit case - this > very much smells like a bug (and I looked at that code the other day > and got the impression that the 64-bit one would be _less_ forgiving > here).See http://www.sourceware.org/ml/binutils/2013-10/msg00358.html. Jan