Jan Beulich
2008-Mar-14 11:23 UTC
[Xen-devel] [PATCH] x86: fix variable_test_bit() asm constraints
I just sent a (much bigger, see below) patch to the same effect to the x86 Linux maintainers - in Xen, all the operations modifying bits do have "memory" clobbers, so it''s just the test_bit() constraint that''s wrong. However, I wonder whether the non-atomic ops aren''t limiting things too much by having "memory" clobbers, they would much better be restricted to indicate just the changing memory location. This, however, would probably require some additional consideration given that Xen (other than Linux) isn''t using -fno-strict-aliasing. Furthermore, these non-atomic operations, according to their comments, can be re-ordered, which contradicts the use of __asm__ __volatile__ (but note that removing this would probably require extra precautions to meet strict aliasing rules). Further, using ''void *'' for the ''addr'' parameter appears dangerous, since bt{,c,r,s} access the full 32 bits (if ''unsigned long'' was used properly here, 64 bits for x86-64) pointed at, so invalid uses like referencing a ''char'' array cannot currently be caught. Finally I find the leading ''d'' constraints in the ''nr'' assembly operands quite odd - what is the purpose of that? Linux is using just "Ir" here... Signed-off-by: Jan Beulich <jbeulich@novell.com> Index: 2008-03-05/xen/include/asm-x86/bitops.h ==================================================================--- 2008-03-05.orig/xen/include/asm-x86/bitops.h 2007-09-14 11:03:32.000000000 +0200 +++ 2008-03-05/xen/include/asm-x86/bitops.h 2008-03-13 10:20:34.000000000 +0100 @@ -254,7 +254,8 @@ static __inline__ int variable_test_bit( __asm__ __volatile__( "btl %2,%1\n\tsbbl %0,%0" :"=r" (oldbit) - :"m" (CONST_ADDR),"dIr" (nr)); + :"m" (CONST_ADDR), "dIr" (nr), + "m" (((const volatile int *)addr)[nr >> 5])); return oldbit; } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Mar-14 11:55 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asm constraints
On 14/3/08 11:23, "Jan Beulich" <jbeulich@novell.com> wrote:> I just sent a (much bigger, see below) patch to the same effect to the > x86 Linux maintainers - in Xen, all the operations modifying bits do > have "memory" clobbers, so it''s just the test_bit() constraint that''s > wrong.Yes, I agree that fix is needed, although it would be consistent with other asm statements in that file to use a memory clobbers instead.> However, I wonder whether the non-atomic ops aren''t limiting things too > much by having "memory" clobbers, they would much better be restricted > to indicate just the changing memory location. This, however, would > probably require some additional consideration given that Xen (other > than Linux) isn''t using -fno-strict-aliasing. Furthermore, these > non-atomic operations, according to their comments, can be re-ordered, > which contradicts the use of __asm__ __volatile__ (but note that > removing this would probably require extra precautions to meet strict > aliasing rules).Xen *does* use -fno-strict-aliasing. I''m afraid I don''t understand why memory-clobber would be needed for the atomic ops, but a dummy memory operand would suffice for non-atomic ops. I used memory clobber everywhere because at least I''m certain that works. Same for ''asm volatile''. We''ve had various problems with the bitops before, and I just want the darn things to work!> Further, using ''void *'' for the ''addr'' parameter appears dangerous, > since bt{,c,r,s} access the full 32 bits (if ''unsigned long'' was used > properly here, 64 bits for x86-64) pointed at, so invalid uses like > referencing a ''char'' array cannot currently be caught.Sure, but those invalid uses do exist, in x86-specific Xen code we inherited from Linux (perhaps older versions of Linux though). I don''t want a huge patch that casts a large number of callers!> Finally I find the leading ''d'' constraints in the ''nr'' assembly > operands quite odd - what is the purpose of that? Linux is using just > "Ir" here...Inherited from asm-x86_64/bitops.h. The ''d'' can indeed go. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Mar-14 11:59 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asm constraints
On 14/3/08 11:55, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:>> Further, using ''void *'' for the ''addr'' parameter appears dangerous, >> since bt{,c,r,s} access the full 32 bits (if ''unsigned long'' was used >> properly here, 64 bits for x86-64) pointed at, so invalid uses like >> referencing a ''char'' array cannot currently be caught. > > Sure, but those invalid uses do exist, in x86-specific Xen code we inherited > from Linux (perhaps older versions of Linux though). I don''t want a huge patch > that casts a large number of callers!I see what you mean though: what if one of these bogus users'' fields is adjacent to a legitimate byte-sized atomic variable (e.g., a bool_t)? Perhaps we do need to fix this, and properly without casts. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Mar-14 12:06 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asm constraints
On 14/3/08 11:55, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:>> Further, using ''void *'' for the ''addr'' parameter appears dangerous, >> since bt{,c,r,s} access the full 32 bits (if ''unsigned long'' was used >> properly here, 64 bits for x86-64) pointed at, so invalid uses like >> referencing a ''char'' array cannot currently be caught. > > Sure, but those invalid uses do exist, in x86-specific Xen code we inherited > from Linux (perhaps older versions of Linux though). I don''t want a huge patch > that casts a large number of callers!I see what you mean though: what if one of these bogus users'' fields is adjacent to a legitimate byte-sized atomic variable (e.g., a bool_t)? Perhaps we do need to fix this, and properly without casts. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2008-Mar-14 13:46 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asmconstraints
>>> Keir Fraser <keir.fraser@eu.citrix.com> 14.03.08 12:59 >>> >On 14/3/08 11:55, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote: > >>> Further, using ''void *'' for the ''addr'' parameter appears dangerous, >>> since bt{,c,r,s} access the full 32 bits (if ''unsigned long'' was used >>> properly here, 64 bits for x86-64) pointed at, so invalid uses like >>> referencing a ''char'' array cannot currently be caught. >> >> Sure, but those invalid uses do exist, in x86-specific Xen code we inherited >> from Linux (perhaps older versions of Linux though). I don''t want a huge patch >> that casts a large number of callers! > >I see what you mean though: what if one of these bogus users'' fields is >adjacent to a legitimate byte-sized atomic variable (e.g., a bool_t)? >Perhaps we do need to fix this, and properly without casts.Correct. Or consider mis-aligned fields close to a page boundary. The intention I''m having here (and likewise on Linux, if I get signs back that this is going to be accepted) is to convert these to macros that this way allow knowing the original type. Based on the alignment of the type, different strategies will then need to be used (e.g. 1-byte aligned fields cannot be accessed with bt?, and 2-byte aligned fields can only be if the bit index is small enough, so and/or/xor will be preferable for the cases where no old bit status needs to be returned, and tweaking of the base address will need to be used in the other cases to make using bt? safe). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2008-Mar-14 13:51 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asmconstraints
>>> Keir Fraser <keir.fraser@eu.citrix.com> 14.03.08 12:55 >>> >On 14/3/08 11:23, "Jan Beulich" <jbeulich@novell.com> wrote: > >> I just sent a (much bigger, see below) patch to the same effect to the >> x86 Linux maintainers - in Xen, all the operations modifying bits do >> have "memory" clobbers, so it''s just the test_bit() constraint that''s >> wrong. > >Yes, I agree that fix is needed, although it would be consistent with other >asm statements in that file to use a memory clobbers instead.But memory clobber aren''t nice when you want the compiler to optimize well.>> However, I wonder whether the non-atomic ops aren''t limiting things too >> much by having "memory" clobbers, they would much better be restricted >> to indicate just the changing memory location. This, however, would >> probably require some additional consideration given that Xen (other >> than Linux) isn''t using -fno-strict-aliasing. Furthermore, these >> non-atomic operations, according to their comments, can be re-ordered, >> which contradicts the use of __asm__ __volatile__ (but note that >> removing this would probably require extra precautions to meet strict >> aliasing rules). > >Xen *does* use -fno-strict-aliasing. I''m afraid I don''t understand whyOh, okay, I just grepped xen/ for it, but it lives in Config.mk...>memory-clobber would be needed for the atomic ops, but a dummy memory >operand would suffice for non-atomic ops. I used memory clobber everywhereAtomic ops imply a barrier (otherwise the compiler can defeat the purpose of the atomic operation). The non-atomic ones don''t need a dummy operand, but one that precisely describes the place in memory that changes.>because at least I''m certain that works. Same for ''asm volatile''. We''ve had >various problems with the bitops before, and I just want the darn things to >work!Okay, if you feel that way then I guess I won''t propose changing it, at the expense of slightly worse code generation. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Mar-14 13:57 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asmconstraints
On 14/3/08 13:51, "Jan Beulich" <jbeulich@novell.com> wrote:>> because at least I''m certain that works. Same for ''asm volatile''. We''ve had >> various problems with the bitops before, and I just want the darn things to >> work! > > Okay, if you feel that way then I guess I won''t propose changing it, > at the expense of slightly worse code generation.Frankly I''m scared to change it for the minuscule benefit we would derive from it. Regarding your other reply: I would actually be happy to change the bitops to work with longs only. I suspect, and would need to have demonstrated otherwise, that supporting bitops on arbitrary-width fields down to the instruction level is not really worthwhile. Either way, I accept that what we do now is dubious at best. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Samuel Thibault
2008-Mar-14 13:59 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asmconstraints
Jan Beulich, le Fri 14 Mar 2008 13:51:26 +0000, a écrit :> >memory-clobber would be needed for the atomic ops, but a dummy memory > >operand would suffice for non-atomic ops. I used memory clobber everywhere > > Atomic ops imply a barrier (otherwise the compiler can defeat the > purpose of the atomic operation). The non-atomic ones don''t need a > dummy operand, but one that precisely describes the place in memory > that changes.Mmm, won''t that unnecessarily make the compiler put code to compute that place? Samuel _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Mar-14 14:04 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asmconstraints
On 14/3/08 13:59, "Samuel Thibault" <samuel.thibault@eu.citrix.com> wrote:>> Atomic ops imply a barrier (otherwise the compiler can defeat the >> purpose of the atomic operation). The non-atomic ones don''t need a >> dummy operand, but one that precisely describes the place in memory >> that changes. > > Mmm, won''t that unnecessarily make the compiler put code to compute that > place?x86 has enough fancy addressing modes that usually the compiler can emit the effective address without pre-computing any part of it. It''s not guaranteed to do this, though, it''s true. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2008-Mar-14 14:11 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit()asmconstraints
>Regarding your other reply: I would actually be happy to change the bitops >to work with longs only. I suspect, and would need to have demonstrated >otherwise, that supporting bitops on arbitrary-width fields down to the >instruction level is not really worthwhile. Either way, I accept that what >we do now is dubious at best.Hmm, change it to work with longs only but also make it work without casts? You mean then change all places where bitops are applied to exclusively use ''unsigned long'' as the fundamental type? I didn''t look at the number of places that would require changing, but I''m afraid it''d be quite a few (and it might get you further away from Linux originals in some cases). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2008-Mar-14 14:18 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asmconstraints
>>> Samuel Thibault <samuel.thibault@eu.citrix.com> 14.03.08 14:59 >>> >Jan Beulich, le Fri 14 Mar 2008 13:51:26 +0000, a écrit : >> >memory-clobber would be needed for the atomic ops, but a dummy memory >> >operand would suffice for non-atomic ops. I used memory clobber everywhere >> >> Atomic ops imply a barrier (otherwise the compiler can defeat the >> purpose of the atomic operation). The non-atomic ones don''t need a >> dummy operand, but one that precisely describes the place in memory >> that changes. > >Mmm, won''t that unnecessarily make the compiler put code to compute that >place?Not generally - if both the ''used'' and the ''fake'' operands are apart just by a constant offset, the same base/index expression will be used by the compiler, just with different offsets. If the bit position is not a constant that might well be true, though (and I ought to check) - since in that case the compiler can''t predict which part of the array changes*, splitting the two cases with a __builtin_constant_p() should be possible, though. The constant section then would need two operands (one of them fake), while the variable section would need just one. * In theory, the compiler could even then detect certain cases of accesses to the same field. If we''re afraid of that, the we''d need to use precise operand descriptions here or, as Keir suggested, just live with using a memory barrier instead of precise operand constraints. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2008-Mar-14 15:17 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit() asmconstraints
>>> Samuel Thibault <samuel.thibault@eu.citrix.com> 14.03.08 14:59 >>> >Jan Beulich, le Fri 14 Mar 2008 13:51:26 +0000, a écrit : >> >memory-clobber would be needed for the atomic ops, but a dummy memory >> >operand would suffice for non-atomic ops. I used memory clobber everywhere >> >> Atomic ops imply a barrier (otherwise the compiler can defeat the >> purpose of the atomic operation). The non-atomic ones don''t need a >> dummy operand, but one that precisely describes the place in memory >> that changes. > >Mmm, won''t that unnecessarily make the compiler put code to compute that >place?Indeed, trying it out makes it obvious that the fake operand has too much of a side effect when the bit offset isn''t constant. So I guess Keir''s intention of just using a memory clobber here is the best we can get without severely complicating the source. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Mar-14 15:37 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit()asmconstraints
On 14/3/08 14:11, "Jan Beulich" <jbeulich@novell.com> wrote:>> Regarding your other reply: I would actually be happy to change the bitops >> to work with longs only. I suspect, and would need to have demonstrated >> otherwise, that supporting bitops on arbitrary-width fields down to the >> instruction level is not really worthwhile. Either way, I accept that what >> we do now is dubious at best. > > Hmm, change it to work with longs only but also make it work without > casts? You mean then change all places where bitops are applied to > exclusively use ''unsigned long'' as the fundamental type? I didn''t look > at the number of places that would require changing, but I''m afraid it''d > be quite a few (and it might get you further away from Linux originals > in some cases).Yes, I wouldn''t have expected that too be too hard, really. Are there lots of places with arrays of 32-bit integers? I doubt 16-bit or 8-bit fields are at all common. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2008-Mar-14 16:42 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit()asmconstraints
>>> Keir Fraser <keir.fraser@eu.citrix.com> 14.03.08 16:37 >>> >On 14/3/08 14:11, "Jan Beulich" <jbeulich@novell.com> wrote: > >>> Regarding your other reply: I would actually be happy to change the bitops >>> to work with longs only. I suspect, and would need to have demonstrated >>> otherwise, that supporting bitops on arbitrary-width fields down to the >>> instruction level is not really worthwhile. Either way, I accept that what >>> we do now is dubious at best. >> >> Hmm, change it to work with longs only but also make it work without >> casts? You mean then change all places where bitops are applied to >> exclusively use ''unsigned long'' as the fundamental type? I didn''t look >> at the number of places that would require changing, but I''m afraid it''d >> be quite a few (and it might get you further away from Linux originals >> in some cases). > >Yes, I wouldn''t have expected that too be too hard, really. Are there lots >of places with arrays of 32-bit integers? I doubt 16-bit or 8-bit fields are >at all common.Actually, just trying it out with set_bit() results in a number of cases where the field used is neither 32- nor 64-bit. The very first one I looked at even has only a byte-aligned (leaving out internal knowledge of the allocator) allocation that it accesses (domid_bitmap in xen/drivers/passthrough/vtd/iommu.c). Also, I''m somewhat reluctant to go with longs only - the REX prefix needed to operate on them on x86-64 could be saved generally, so I''d rather go with a slightly more complicated implementation like static __inline__ void __set_bit32(int nr, volatile void * addr) { __asm__ __volatile__( LOCK_PREFIX "btsl %1,%0" : : "m" (ADDR(int)), "Ir" (nr) : "memory"); } static __inline__ void __set_bit64(long nr, volatile void * addr) { __asm__ __volatile__( LOCK_PREFIX "btsq %1,%0" : : "m" (ADDR(long)), "Jr" (nr) : "memory"); } extern void __set_bit_unknown(int, volatile void *); #define set_bit(nr, addr) ({ \ switch ( min(sizeof(*(addr)), __alignof__(*(addr))) ) \ { \ case 8: __set_bit64(nr, addr); break; \ case 4: __set_bit32(nr, addr); break; \ default: __set_bit_unknown(nr, addr); break; \ } \ }) #undef ADDR #define ADDR (*(volatile long *) addr) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Mar-14 17:08 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit()asmconstraints
On 14/3/08 16:42, "Jan Beulich" <jbeulich@novell.com> wrote:> Actually, just trying it out with set_bit() results in a number of cases > where the field used is neither 32- nor 64-bit. The very first one I > looked at even has only a byte-aligned (leaving out internal knowledge > of the allocator) allocation that it accesses (domid_bitmap in > xen/drivers/passthrough/vtd/iommu.c).How did you find that one? It''s void* so I would have thought you''d miss that one as the compiler will happily cast void*. I hope there aren''t too many lurkers like that! Perhaps you were trying to do your automatic field-width detection approach. I think that''s not needed, but it would conveniently find these void* callers. Perhaps we should wrap the bitops in a macro that will fail on void*? I''m happy to do this change (void* -> long*) myself, by the way, as it''s the kind of thing that''s as much work to review as it is to do in the first place.> Also, I''m somewhat reluctant to go with longs only - the REX prefix > needed to operate on them on x86-64 could be saved generally, so > I''d rather go with a slightly more complicated implementation likeThere''s no need to use 64-bit instruction forms even if we do take ''unsigned long''. After all, the existing bitops implementations only act on 32-bit words -- we should continue with this. Where bitops are concerned the actual operand size doesn''t really matter (except that it shouldn''t be so big as to overlap with adjacent fields which may be updated in parallel). The only reason for not using even smaller-width instructions is that byte-sized bit-twiddling instructions do not exist, and the 16-bit ones are restricted in the size of index (by comparison 32 bits is sufficient, as all ''nr'' arguments to our bitops are ''int'' type). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Mar-16 14:08 UTC
Re: [Xen-devel] [PATCH] x86: fix variable_test_bit()asmconstraints
On 14/3/08 17:08, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:>> Actually, just trying it out with set_bit() results in a number of cases >> where the field used is neither 32- nor 64-bit. The very first one I >> looked at even has only a byte-aligned (leaving out internal knowledge >> of the allocator) allocation that it accesses (domid_bitmap in >> xen/drivers/passthrough/vtd/iommu.c). > > How did you find that one? It''s void* so I would have thought you''d miss that > one as the compiler will happily cast void*. I hope there aren''t too many > lurkers like that! Perhaps you were trying to do your automatic field-width > detection approach. I think that''s not needed, but it would conveniently find > these void* callers. Perhaps we should wrap the bitops in a macro that will > fail on void*? > > I''m happy to do this change (void* -> long*) myself, by the way, as it''s the > kind of thing that''s as much work to review as it is to do in the first place.Done, as changeset 17194. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel