thr3ads.net - llvm dev - [LLVMdev] Lowering Atomic Load to Acquire and Load [Aug 2013]

Sam Cristall

2013-Aug-01 19:36 UTC

I'm working on an experimental backend for an MCU that has heavy
multithreading capabilities but lacks proper acquire/release semantics.
This is okay, as the programmer can customize __cxa_guard_acquire and
__cxa_guard_release to lower/raise the appropriate semaphores.  The issue
I'm having is that I can't figure out when to lower an atomic load into
an acquire/load pair early enough that the __cxa_guard_acquire call is
still visible to the optimizer (most importantly, the inliner).  First,
is this even the proper way to do it?  And if so, is there a "best time"
to run a pass that catches these loads?

Thanks!

-Sam
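
To make the idea of customized guard functions concrete, here is a minimal host-side sketch, not the poster's actual code. The names demo_guard_acquire, demo_guard_release, sem_lower and sem_raise are hypothetical; the MCU semaphore is emulated with a std::mutex so the example can run on a desktop, and a real port would provide __cxa_guard_acquire and __cxa_guard_release themselves. It assumes, as in the Itanium C++ ABI, that the first byte of the guard object is the "initialized" flag.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the MCU's hardware semaphore. On a host we
// emulate it with a std::mutex so the sketch is runnable.
static std::mutex g_init_sem;
static void sem_lower() { g_init_sem.lock(); }    // take the semaphore
static void sem_raise() { g_init_sem.unlock(); }  // give it back

// Same shape as __cxa_guard_acquire: returns 1 if the caller must run the
// initializer, 0 if initialization has already completed.
extern "C" int demo_guard_acquire(std::uint64_t *guard) {
    sem_lower();
    // Assumption: the first byte of the guard is the "initialized" flag.
    if (*reinterpret_cast<unsigned char *>(guard) != 0) {
        sem_raise();
        return 0;
    }
    return 1;  // semaphore stays held until demo_guard_release
}

// Same shape as __cxa_guard_release: mark the object initialized and
// let other threads proceed.
extern "C" void demo_guard_release(std::uint64_t *guard) {
    *reinterpret_cast<unsigned char *>(guard) = 1;
    sem_raise();
}
```

With a zeroed guard, the first demo_guard_acquire call returns 1 and every call after demo_guard_release returns 0. A complete port would also need a counterpart to __cxa_guard_abort for initializers that throw, which this sketch omits.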

Eli Friedman

2013-Aug-02 23:15 UTC

On Thu, Aug 1, 2013 at 12:36 PM, Sam Cristall <cristall at eleveneng.com>
wrote:
> I'm working with an experimental backend for an MCU with heavy
> multithreading capabilities but lacks proper acquire/release semantics.
> This is okay, as the programmer can customize __cxa_guard_acquire and
> __cxa_guard_release to lower/raise appropriate semaphores.  The issue I'm
> having is that I can't seem to figure out when to lower atomic load into an
> acquire/load pair early enough that the __cxa_guard_acquire is evaluated for
> optimization (most importantly inlining.)  First, is this even the proper
> way to do this and further am I going about this the wrong way and is there
> a "best time" to do a pass to catch these guys?

The code clang generates for a guarded initialization looks like this normally:

entry:
  %0 = load atomic i8* bitcast (i64* @_ZGVZ3barvE1x to i8*) acquire, align 8
  %guard.uninitialized = icmp eq i8 %0, 0
  br i1 %guard.uninitialized, label %init.check, label %init.end

init.check:                                       ; preds = %entry
  %1 = tail call i32 @__cxa_guard_acquire(i64* @_ZGVZ3barvE1x) #1
  %tobool = icmp eq i32 %1, 0
  br i1 %tobool, label %init.end, label %init

init:                                             ; preds = %init.check
  %call = tail call i32 @_Z3foov() #1
  store i32 %call, i32* @_ZZ3barvE1x, align 4, !tbaa !0
  tail call void @__cxa_guard_release(i64* @_ZGVZ3barvE1x) #1
  br label %init.end

Given this, the call to __cxa_guard_acquire sits on the slow path, which
is reached only while the guard byte is still zero.  There is no reason
to inline that call; it would bloat code size for no performance
benefit.
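
For readers less used to IR, the control flow above can be sketched in C++. This is an illustrative hand expansion, not clang's output: the mangled names show that the source has a function bar() with a function-local static int x initialized by foo(), but bar's return value, the foo body, and the helpers slow_guard_acquire and slow_guard_release (mutex-based stand-ins for the __cxa_guard_* calls) are assumptions made for the example.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical initializer; the counter lets us observe how often it runs.
static int foo_calls = 0;
int foo() { ++foo_calls; return 42; }

// Source-level form that leads to IR of this shape:
//     int bar() { static int x = foo(); return x; }

static std::atomic<unsigned char> guard_byte{0};  // first byte of the guard
static std::mutex guard_lock;                     // stand-in for the runtime lock
static int x_storage = 0;                         // storage for the static x

// Stand-in for __cxa_guard_acquire: 1 means "caller must initialize".
static int slow_guard_acquire() {
    guard_lock.lock();
    if (guard_byte.load(std::memory_order_relaxed) != 0) {
        guard_lock.unlock();
        return 0;
    }
    return 1;
}

// Stand-in for __cxa_guard_release: publish the flag, then unlock.
static void slow_guard_release() {
    guard_byte.store(1, std::memory_order_release);
    guard_lock.unlock();
}

int bar_expanded() {
    // entry: inline fast path, a single acquire load of the guard byte
    if (guard_byte.load(std::memory_order_acquire) == 0) {
        // init.check: out-of-line slow path, taken only before init completes
        if (slow_guard_acquire() != 0) {
            // init: run the initializer once, then release the guard
            x_storage = foo();
            slow_guard_release();
        }
    }
    // init.end
    return x_storage;
}
```

After the first call, every later call takes only the inline byte check and never reaches slow_guard_acquire, which is why inlining that function would not speed up the common case.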

What does the IR you are working with look like?

-Eli
