Intrinsic llvm.memory.barrier does not work as expected. Is it a bug,
or has it not been implemented yet?

(1) False arguments do not work.

    // pseudo code
    void foo(int *x) {
      x[2] = 10;
      llvm.memory.barrier(0, 0, 0, 0, 0);
      x[2] = 20;
      return void
    }

The barrier is actually a no-op, but it prevents "x[2] = 10" from being deleted.

(2) True arguments do not work.

    // pseudo code
    void foo(int * restrict x) {
      x[2] = 10;
      llvm.memory.barrier(1, 1, 1, 1, 1);
      x[2] = 20;
      return void
    }

"x[2] = 10" should not be deleted because the barrier is present. But it
is deleted anyway.

Here is the LLVM IR for the first case (the commented-out lines are for the second case).

    declare void @llvm.memory.barrier(i1 , i1 , i1 , i1 , i1)

    define void @foo(i32* %x) nounwind {
    ; define void @foo(i32* noalias %x) nounwind {
    entry:
      %x.addr = alloca i32*, align 4
      store i32* %x, i32** %x.addr, align 4
      %tmp = load i32** %x.addr, align 4
      %arrayidx = getelementptr i32* %tmp, i32 2
      store i32 10, i32* %arrayidx, align 4
      call void @llvm.memory.barrier(i1 0, i1 0, i1 0, i1 0, i1 0) nounwind
      ; call void @llvm.memory.barrier(i1 1, i1 1, i1 1, i1 1, i1 1) nounwind
      %tmp1 = load i32** %x.addr, align 4
      %arrayidx1 = getelementptr i32* %tmp, i32 2
      store i32 20, i32* %arrayidx, align 4
      ret void
    }

Running "opt -O3" on it gives:

    declare void @llvm.memory.barrier(i1, i1, i1, i1, i1) nounwind

    define void @foo(i32* nocapture %x) nounwind {
    entry:
      %arrayidx = getelementptr i32* %x, i32 2
      store i32 10, i32* %arrayidx, align 4
      tail call void @llvm.memory.barrier(i1 false, i1 false, i1 false, i1 false, i1 false) nounwind
      store i32 20, i32* %arrayidx, align 4
      ret void
    }
On Wed, Sep 28, 2011 at 3:27 PM, Junjie Gu <jgu222 at gmail.com> wrote:
> Intrinsic llvm.memory.barrier does not work as expected. Is it a bug
> or it has not been implemented yet ?

It's going away in favor of the new fence instruction (and I'll remove
it as soon as dragonegg catches up). It should still work at the
moment, though.

> (1) false arguments do not work
>
> // pseudo code
> void foo(int *x) {
>   x[2] = 10;
>   llvm.memory.barrier(0, 0, 0, 0, 0);
>   x[2] = 20;
>   return void
> }
>
>
> The barrier is actually noop, but it prevents "x[2] = 10" from being deleted.

Don't do that. :) Really, why are you using a noop barrier?

> (2) True arguments do not work.
>
> // pseudo code
> void foo(int * restrict x) {
>   x[2] = 10;
>   llvm.memory.barrier(1, 1, 1, 1, 1);
>   x[2] = 20;
>   return void
> }
>
> "x[2] = 10" should not be deleted because barrier is present. But it
> is deleted anyway.

The pointer is "restrict", therefore the compiler assumes nothing else
can touch it while the function runs.

Actually, the transformation in question is probably valid even if the
pointer isn't restrict (although LLVM won't actually do that); your
use of a barrier here doesn't really make sense.

-Eli
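One possible reading of the point above, sketched in C11 atomics rather than in LLVM IR: ordering matters when another thread can observe accesses to two different locations, and nothing can observe an intermediate value of a location that is simply overwritten. The names publish, consume and ready are hypothetical and appear nowhere in this thread, and the sketch assumes a C11 compiler that provides <stdatomic.h>.

```c
#include <stdatomic.h>

/* Hypothetical writer: fill in the data, then raise a flag.
 * The release store keeps the write to data[2] ordered before
 * the write to *ready, as seen by a thread that acquires *ready. */
void publish(int *data, atomic_int *ready) {
    data[2] = 10;
    atomic_store_explicit(ready, 1, memory_order_release);
}

/* Hypothetical reader: spin until the flag is raised, then read.
 * The acquire load pairs with the release store in publish(), so
 * the read of data[2] sees the value written before the flag. */
int consume(const int *data, atomic_int *ready) {
    while (!atomic_load_explicit(ready, memory_order_acquire)) {
        /* busy-wait; a real program would yield or block here */
    }
    return data[2];
}
```

Here the ordering constraint has something to protect, because the reader can tell whether data[2] was written before the flag. In foo() above, no other location takes part, so a reader has no way to tell whether "x[2] = 10" ever happened.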
On Wed, Sep 28, 2011 at 5:47 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Wed, Sep 28, 2011 at 3:27 PM, Junjie Gu <jgu222 at gmail.com> wrote:
>> Intrinsic llvm.memory.barrier does not work as expected. Is it a bug
>> or it has not been implemented yet ?
>
> It's going away in favor of the new fence instruction (and I'll remove
> it as soon as dragonegg catches up). It should still work at the
> moment, though.
>
>> (1) false arguments do not work
>>
>> // pseudo code
>> void foo(int *x) {
>>   x[2] = 10;
>>   llvm.memory.barrier(0, 0, 0, 0, 0);
>>   x[2] = 20;
>>   return void
>> }
>>
>>
>> The barrier is actually noop, but it prevents "x[2] = 10" from being deleted.
>
> Don't do that. :) Really, why are you using a noop barrier?

Just to show that it affects optimization when it shouldn't.

>
>> (2) True arguments do not work.
>>
>> // pseudo code
>> void foo(int * restrict x) {
>>   x[2] = 10;
>>   llvm.memory.barrier(1, 1, 1, 1, 1);
>>   x[2] = 20;
>>   return void
>> }
>>
>> "x[2] = 10" should not be deleted because barrier is present. But it
>> is deleted anyway.
>
> The pointer is "restrict", therefore the compiler assumes nothing else
> can touch it while the function runs.
>
> Actually, the transformation in question is probably valid even if the
> pointer isn't restrict (although LLVM won't actually do that); your
> use of a barrier here doesn't really make sense.

If you think of multi-threaded code, it will make sense. Again, this
is simplified code, used just to show the point.

The memory barrier requires that all memory operations prior to the
barrier point complete before any memory operations after the barrier
start. I think that it also requires that compilers do not optimize
across the barrier point (compilers also generate memory barrier
instructions if needed; I think LLVM does only that). Do you agree
on this?

For the following gcc code, the asm basically behaves like a barrier
that prevents cross-barrier optimization. You can see that both writes
to p[2] are in the gcc output.

    void foo (int * __restrict__ p)
    {
      p[2] = 10;
      __asm__ __volatile__ ("":::"memory");
      p[2] = 20;
    }

Junjie

>
> -Eli
>
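For comparison, a minimal C sketch that puts the GCC idiom from this message next to the C11 fence function. It assumes a GCC- or Clang-compatible compiler (the empty asm with a "memory" clobber is an extension) and a C11 <stdatomic.h>. The function names are hypothetical. Whether both stores survive optimization is up to the compiler, so the only behaviour the sketch relies on is the final value of p[2].

```c
#include <stdatomic.h>

/* Compiler-only barrier (GCC/Clang extension): the "memory" clobber
 * tells the compiler that memory may be read or written here.
 * It emits no hardware fence instruction. */
static inline void compiler_barrier(void) {
    __asm__ __volatile__("" ::: "memory");
}

/* Hypothetical variant of foo() using the asm idiom from the message. */
void store_with_asm_barrier(int *p) {
    p[2] = 10;
    compiler_barrier();
    p[2] = 20;
}

/* Hypothetical variant using the C11 fence. It orders this thread's
 * memory operations relative to atomic operations in other threads;
 * it does not by itself promise that a plain store which is later
 * overwritten will be kept. */
void store_with_c11_fence(int *p) {
    p[2] = 10;
    atomic_thread_fence(memory_order_seq_cst);
    p[2] = 20;
}
```

Comparing the assembly of the two functions (for example with "gcc -O2 -S") shows what a given compiler actually does with the first store in each case.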