On Feb 15, 2008, at 2:29 PM, Andrew Lenharth wrote:
> On 2/15/08, Andrew Lenharth <andrewl at lenharth.org> wrote:
>> I'll take a hack at the front end support for
>> __sync_synchronize after this goes in.
>
> This is the gcc side of the patch.

GCC 4.2 compiles this to a no-op on x86:

void foo() {
  __sync_synchronize();
}

Are you seeing different behavior? What am I missing here?

-Chris

>
>
> Index: gcc/llvm-convert.cpp
> ==================================================================
> --- gcc/llvm-convert.cpp (revision 46956)
> +++ gcc/llvm-convert.cpp (working copy)
> @@ -4260,6 +4260,15 @@
>      EmitBlock(new BasicBlock(""));
>      return true;
>
> +  case BUILT_IN_SYNCHRONIZE: {
> +    Value* C[4];
> +    C[0] = C[1] = C[2] = C[3] = ConstantInt::get(Type::Int1Ty, 1);
> +
> +    Builder.CreateCall(Intrinsic::getDeclaration(TheModule,
> Intrinsic::atomic_membarrier),
> +                       C, C + 4);
> +    return true;
> +  }
> +
>  #if 1 // FIXME: Should handle these GCC extensions eventually.
>    case BUILT_IN_APPLY_ARGS:
>    case BUILT_IN_APPLY:
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

On 2/15/08, Chris Lattner <sabre at nondot.org> wrote:
> GCC 4.2 compiles this to a no-op on x86:
>
> void foo() {
>   __sync_synchronize();
> }
>
> Are you seeing different behavior? What am I missing here?

I see the same. I don't know why gcc doesn't generate a barrier on
x86; __sync_synchronize() is supposed to be a full memory barrier.
It does generate one on alpha. X86 will do load-load reordering, so I
would expect a "full memory barrier" primitive in gcc to actually
generate a barrier (and at least the Linux kernel implements barriers
on x86 for hardware I/O).

In any event, generating the intrinsic all the time should keep the
optimizations from reordering when the programmer doesn't want them
to, and the codegen for x86 can always remove unnecessary barriers.

Andrew

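The distinction drawn above, between stopping the optimizers from
reordering and emitting an actual barrier instruction, can be sketched
in C. This is only an illustration under stated assumptions: the
compiler_barrier() macro, the variables a and b, and both function
names are hypothetical and do not come from the patch. The empty asm
with a "memory" clobber is the usual GCC idiom for a compiler-only
barrier; whether __sync_synchronize() additionally emits a fence
instruction depends on the compiler version and target.

```c
/* Hypothetical helper: a compiler-only barrier using GCC extended asm.
 * It emits no instruction; it only tells the compiler that memory may
 * have changed, so it may not move memory accesses across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical shared variables used only for this illustration. */
static int a;
static int b;

/* The compiler may not reorder the two stores, but the hardware is not
 * given any ordering instruction. */
void store_compiler_only(void) {
    a = 1;
    compiler_barrier();
    b = 2;
}

/* Requests a full memory barrier through the GCC builtin discussed in
 * this thread; the generated code is target dependent. */
void store_full_barrier(void) {
    a = 3;
    __sync_synchronize();
    b = 4;
}
```

Compiling both functions with "gcc -O2 -S" and comparing the assembly
is one way to see what a given compiler emits for each form.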
> GCC 4.2 compiles this to a no-op on x86:
>
> void foo() {
>   __sync_synchronize();
> }
>
> Are you seeing different behavior? What am I missing here?

Maybe the processor does a memory barrier when it executes
a call instruction.

Ciao,

Duncan.

On 2/16/08, Duncan Sands <baldrick at free.fr> wrote:
> > GCC 4.2 compiles this to a no-op on x86:
> >
> > void foo() {
> >   __sync_synchronize();
> > }
> >
> > Are you seeing different behavior? What am I missing here?
>
> Maybe the processor does a memory barrier when it executes
> a call instruction.

I had tried several variants of that with loads and stores around the
barrier. GCC never generated a barrier, but it's not needed if you
are accessing cached memory (on x86, at least post-ppro, from what
I've read), only other stuff, so I am assuming gcc is making that
assumption about loads and stores.

Andrew
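
For readers who want to repeat the experiment, a variant with loads and
stores around the barrier might look like the sketch below. It is not
the code that was actually tried; the names data, flag, publish and
consume are hypothetical and chosen only for illustration.

```c
/* Hypothetical test variables, not taken from the thread. */
static int data;
static int flag;

/* Stores on both sides of the barrier. */
void publish(int value) {
    data = value;            /* store before the barrier */
    __sync_synchronize();    /* barrier under test */
    flag = 1;                /* store after the barrier */
}

/* Loads on both sides of the barrier. Returns -1 if nothing has been
 * published yet. */
int consume(void) {
    int ready = flag;        /* load before the barrier */
    __sync_synchronize();    /* barrier under test */
    return ready ? data : -1;  /* load after the barrier */
}
```

Inspecting the assembly from "gcc -O2 -S" for publish and consume
shows whether the compiler in use places any instruction between the
accesses.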