thr3ads.net - llvm dev - [LLVMdev] llvm.atomic.barrier implementation [Feb 2008]

Chris Lattner

2008-Feb-15 23:02 UTC

[LLVMdev] llvm.atomic.barrier implementation

On Feb 15, 2008, at 2:29 PM, Andrew Lenharth wrote:
> On 2/15/08, Andrew Lenharth <andrewl at lenharth.org> wrote:
>> I'll take a hack at the front end support for
>>  __sync_synchronize after this goes in.
>
> This is the gcc side of the patch.
GCC 4.2 compiles this to a no-op on x86:

void foo() {
   __sync_synchronize();
}

Are you seeing different behavior?  What am I missing here?

-Chris

>
>
> Index: gcc/llvm-convert.cpp
> ===================================================================
> --- gcc/llvm-convert.cpp        (revision 46956)
> +++ gcc/llvm-convert.cpp        (working copy)
> @@ -4260,6 +4260,15 @@
>     EmitBlock(new BasicBlock(""));
>     return true;
>
> +  case BUILT_IN_SYNCHRONIZE: {
> +    Value* C[4];
> +    C[0] = C[1] = C[2] = C[3] = ConstantInt::get(Type::Int1Ty, 1);
> +
> +    Builder.CreateCall(Intrinsic::getDeclaration(TheModule, Intrinsic::atomic_membarrier),
> +                       C, C + 4);
> +    return true;
> +  }
> +
> #if 1  // FIXME: Should handle these GCC extensions eventually.
>     case BUILT_IN_APPLY_ARGS:
>     case BUILT_IN_APPLY:
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Andrew Lenharth

2008-Feb-15 23:23 UTC

On 2/15/08, Chris Lattner <sabre at nondot.org> wrote:
> GCC 4.2 compiles this to a no-op on x86:
>
>  void foo() {
>    __sync_synchronize();
>  }
>
>  Are you seeing different behavior?  What am I missing here?
I see the same, and I don't know why: __sync_synchronize() is supposed to
be a full memory barrier, yet gcc doesn't generate one on x86.  It does
on alpha.  X86 will do load-load reordering, so I would expect a "full
memory barrier" primitive in gcc to actually generate a barrier (and at
least the Linux kernel implements barriers on x86 for hardware I/O).  In
any event, generating the intrinsic all the time should keep the
optimizations from reordering when the programmer doesn't want them to,
and the codegen for x86 can always remove unnecessary barriers.

Andrew
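
To make the distinction in the message above concrete, here is a minimal sketch (not part of the patch; the variable and function names are hypothetical) contrasting a compiler-only barrier with __sync_synchronize(). The empty asm with a "memory" clobber only stops GCC from moving memory accesses across it and emits no instruction. __sync_synchronize() is documented as a full barrier, and which instruction, if any, it turns into depends on the compiler version and target.

```c
/* Hypothetical shared variables, used only for illustration. */
int a;
int b;

/* Compiler-only barrier: emits no instruction, but GCC may not move
   memory accesses across it. */
static inline void compiler_barrier(void) {
    __asm__ __volatile__("" ::: "memory");
}

/* Full barrier as requested through the GCC builtin discussed here.
   Whether any instruction is emitted is up to the compiler and target. */
static inline void full_barrier(void) {
    __sync_synchronize();
}

/* Store to a, then load from b, with only the optimizer constrained. */
int store_then_load_compiler_only(int value) {
    a = value;
    compiler_barrier();
    return b;
}

/* Same access pattern, but asking for a full memory barrier. */
int store_then_load_full(int value) {
    a = value;
    full_barrier();
    return b;
}
```

Comparing the assembly of the two functions (for example with gcc -O2 -S) shows whether a given compiler treats the builtin as more than an optimization barrier on a given target.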

Duncan Sands

2008-Feb-16 08:16 UTC

> GCC 4.2 compiles this to a no-op on x86:
> 
> void foo() {
>    __sync_synchronize();
> }
> 
> Are you seeing different behavior?  What am I missing here?
Maybe the processor does a memory barrier when it executes
a call instruction.

Ciao,

Duncan.

Andrew Lenharth

2008-Feb-16 13:24 UTC

On 2/16/08, Duncan Sands <baldrick at free.fr> wrote:
> > GCC 4.2 compiles this to a no-op on x86:
> >
> > void foo() {
> >    __sync_synchronize();
> > }
> >
> > Are you seeing different behavior?  What am I missing here?
>
> Maybe the processor does a memory barrier when it executes
> a call instruction.
I had tried several variants of that with loads and stores around the
barrier, and GCC never generated a barrier.  From what I've read, a
barrier isn't needed on x86 (at least post-PPro) when you are accessing
cached memory, only for other kinds of memory, so I am assuming gcc is
making that assumption about loads and stores.

Andrew
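
For reference, a variant with loads and stores around the barrier might look like the following sketch. It is not the code that was actually tested in this thread, and payload, ready, publish and try_consume are hypothetical names. The writer stores data and then a flag; the reader checks the flag and then loads the data. The barrier sits between the two accesses on each side.

```c
/* Hypothetical shared state, used only for illustration. */
static int payload;
static volatile int ready;

/* Writer: store the data, then the flag. The barrier asks that the
   payload store not be moved below the flag store. */
void publish(int value) {
    payload = value;
    __sync_synchronize();
    ready = 1;
}

/* Reader: returns 1 and fills *out if data has been published,
   otherwise returns 0 and leaves *out untouched. */
int try_consume(int *out) {
    if (!ready)
        return 0;
    __sync_synchronize();
    *out = payload;
    return 1;
}
```

Inspecting the generated code for publish and try_consume on x86 and on a weakly ordered target such as alpha is one way to see the difference described above.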
