Hal Finkel via llvm-dev
2016-Jan-31 08:42 UTC
[llvm-dev] Adding sanity to the Atomics implementation
----- Original Message -----
> From: "Ben Craig via llvm-dev" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Thursday, January 28, 2016 9:41:06 AM
> Subject: Re: [llvm-dev] Adding sanity to the Atomics implementation
>
> I don't have much of substance to add, but I will say that I like the
> proposal (I too prefer Alternative A).

I also prefer alternative A.

 -Hal

> When adding C11 atomic support for Hexagon, I found it surprising that
> support for the __sync_* builtins was implemented completely differently
> from the C11 atomics.
>
> On 1/27/2016 1:55 PM, James Knight via llvm-dev wrote:
>> Right now, the atomics implementation in clang is a bit of a mess. It has
>> three basically completely distinct code paths:
>>
>> 1. There's the legacy __sync_* builtins, which clang lowers directly to
>>    atomic IR instructions. The LLVM atomic IR instructions themselves can
>>    then sometimes emit libcalls to __sync_* library functions (which are
>>    basically undocumented, and which users are often responsible for
>>    implementing themselves if they need them).
>> 2. There's the new __atomic_* builtins, which clang will, depending on
>>    size, alignment, and target, lower either to a libcall to a
>>    standardized-by-GCC __atomic_* library function (implemented by
>>    libatomic) or to the atomic IR instructions. (The same code also
>>    covers the __c11_atomic_* builtins, which are almost-identical
>>    clang-specific aliases of the __atomic_* builtins.)
>> 3. There's the C11 "_Atomic" type modifier and some OpenMP support,
>>    which clang lowers into atomic IR or libcalls to __atomic_* library
>>    functions via almost completely separate code from #2. (Note: this
>>    doesn't apply to C++ std::atomic<>, which is implemented with the
>>    builtins instead.) Beyond a lot of duplicated logic in the code, the
>>    _Atomic implementation is somewhat broken: sometimes it emits LLVM IR
>>    instructions directly (letting things fall back to __sync_* calls),
>>    and sometimes it checks first whether it should emit a libcall
>>    (falling back to __atomic_* calls). That's pretty bad behavior.
>>
>> I'd like to make a proposal for cleaning this mess up.
>>
>> BTW, as a sidenote, one thing that's important to remember: at least at
>> the C/C++ level, if you support lock-free atomic ops for a given
>> size/alignment, ALL the atomic ops for that size/alignment must be
>> lock-free. This property is usually quite easy to satisfy, because
>> you'll have either LL/SC or CAS instructions, which can be used to
>> implement any other operation. However, many older CPU models support
>> atomic load, store, and exchange operations but are missing any way to
>> do an atomic compare-and-swap. This is true for at least ARMv5, SPARCv8,
>> and Intel 80386. When your minimum CPU is set to such a model, all
>> atomic operations must be done via libcall -- it's not acceptable for
>> atomic_store to emit an atomic "store" instruction while
>> atomic_fetch_add requires a libcall that is implemented with a lock. If
>> that were to happen, atomic_fetch_add could not actually be made atomic
>> with respect to a simultaneous atomic_store.
>>
>> So anyhow, there are basically two paths I think we could take for
>> cleanup. I like "Alternative A" below better, but I'd be interested to
>> hear whether others have reasons to think the other would be preferable.
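To make the all-or-none hazard described in the sidenote concrete, here is a
minimal C++ sketch; the lock-based fallback shown is hypothetical, standing in
for an atomic libcall implemented with a lock on a CPU without
compare-and-swap:

    #include <cstdint>
    #include <mutex>

    // Hypothetical lock guarding the out-of-line atomic routines on a target
    // that has atomic load/store/exchange but no compare-and-swap.
    static std::mutex g_atomic_lock;

    // Hypothetical libcall-style fallback for fetch_add on such a target.
    uint32_t fetch_add_fallback(uint32_t *p, uint32_t v) {
      std::lock_guard<std::mutex> guard(g_atomic_lock);
      uint32_t old = *p;   // plain load...
      *p = old + v;        // ...and plain store, protected only by the lock
      return old;
    }

    // If atomic_store() on the same object were lowered to a plain inline
    // store instruction, it would never take g_atomic_lock, so it could land
    // between the load and the store above and be silently overwritten: the
    // store and the fetch_add would not be atomic with respect to each other.
    // Hence lock-free support must be all-or-none per size/alignment class.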
>> Both alternatives I've suggested will have the effect that the __sync_*
>> builtins in clang will now lower to __atomic_* function calls on
>> platforms without inline atomics (or for unsupported sizes), and C11
>> atomics will stop being schizophrenic. Clang will cease to ever emit a
>> call to a __sync_* function from any of its builtins or language
>> support. This could theoretically cause an incompatibility on some
>> target.
>>
>> However, I think it ought to be pretty safe: I can't imagine anyone who
>> will use an upgraded compiler has __sync_* functions implemented for
>> their platform but doesn't have the __atomic_* library functions
>> implemented. The __atomic_* builtins (which already use those library
>> calls) are in widespread use, notably in both libstdc++ and libc++, as
>> well as in a lot of third-party code. IMO it's worth having that
>> potential incompatibility in order to simplify the situation for users
>> (only one set of atomic library calls to worry about) and to have a
>> single code path for atomics in the compiler.
>>
>> Alternative A: Move all atomic libcall emission into LLVM
>>
>> In this alternative, LLVM will be responsible for lowering all atomic
>> operations (to inline code or libcalls, as appropriate) for all three
>> code paths listed at the beginning. Clang will emit no libcalls for
>> atomic operations itself.
>>
>> A1) In LLVM: when lowering "store atomic", "load atomic", "atomicrmw",
>> and "cmpxchg" instructions that aren't supported by the target, emit
>> libcalls to the new __atomic_* functions (rather than the current
>> behavior of calling the legacy __sync_* functions).
>>
>> Additionally, I'd like to move the decision to emit a libcall into
>> AtomicExpandPass, and remove the ability to use Expand in ISel to create
>> a libcall for ISD::ATOMIC_*. Putting the decision there allows querying
>> the target, up front, for its supported sizes, before any other lowering
>> decisions (whether in other parts of AtomicExpandPass or in ISel). When
>> choosing whether to inline or libcall, only the size and alignment of
>> the operation should be considered, not the operation itself or the
>> type. This will ensure that the "all or none" property is adhered to.
>>
>> (Q: what about the implementation of
>> __atomic_is_lock_free/__atomic_always_lock_free in clang? The clang
>> frontend can't query target information from the LLVM backend, can it?
>> Is there some other way to expose which sizes a backend supports, so
>> that the clang builtins -- the latter of which must be a C++ constant
>> expression -- can also use that information? Or... just continue
>> duplicating the info in both places, as is done today?)
>>
>> A2) In clang, start removing the code that knows how to do optimized
>> library call lowering for atomics -- always emit LLVM atomic ops,
>> except for one case: clang will still need to emit library calls itself
>> for data not aligned naturally for its size. The LLVM atomic
>> instructions currently will not handle unaligned data, but unaligned
>> data is allowed for the four "slab of memory" builtins (__atomic_load,
>> __atomic_store, __atomic_compare_exchange, and __atomic_exchange).
>>
>> A3) In LLVM, add "align" attributes to cmpxchg and atomicrmw, and allow
>> specifying "align" values for "load atomic" and "store atomic" (where
>> the attribute currently exists but cannot be used). LLVM will then be
>> able to lower misaligned accesses to library calls. In clang, get rid of
>> the special case for directly emitting libcalls for misaligned atomics,
>> and lower those to LLVM instructions as well.
>>
>> A4) Hopefully, now, the three code paths in clang will be sufficiently
>> similar (as they're all now just creating LLVM atomic instructions)
>> that they can be trivially merged, or else each will be so trivial that
>> the remaining duplication is not worth merging.
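A minimal sketch of the A1 decision rule described above -- the names
(classifyAtomic, MaxLockFreeSizeInBytes) are illustrative only, not existing
LLVM API; the point is that the inline-vs-libcall choice keys off size and
alignment alone, never the operation kind or the value type:

    enum class AtomicLowering { Inline, AtomicLibcall };

    // Sketch only: on this scheme, AtomicExpandPass would ask the target once
    // for the maximum lock-free size and then classify every atomic operation
    // the same way, guaranteeing the all-or-none property per size class.
    AtomicLowering classifyAtomic(unsigned SizeInBytes, unsigned AlignInBytes,
                                  unsigned MaxLockFreeSizeInBytes) {
      // Over-sized or under-aligned accesses take the __atomic_* libcall path;
      // everything else is expanded inline (native instructions, LL/SC or CAS
      // loops, etc.) by the later stages of lowering.
      if (SizeInBytes > MaxLockFreeSizeInBytes || AlignInBytes < SizeInBytes)
        return AtomicLowering::AtomicLibcall;
      return AtomicLowering::Inline;
    }

MaxLockFreeSizeInBytes stands in for whatever per-target query ends up
answering the (Q) above; if that query were exposed to clang as well, both
halves could share one source of truth for __atomic_always_lock_free.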
>> Alternative B: Move all libcall emission for atomics into Clang
>>
>> In this alternative, LLVM will never emit atomic libcalls from the
>> atomic IR instructions. If the operation requested is not possible on a
>> given target, that is an error. So the cleanups here are to get clang to
>> stop emitting LLVM IR atomics that cannot be lowered without libcalls.
>>
>> B1) In Clang: have the legacy __sync_* builtins become essentially
>> aliases for the __atomic_* builtins (so they will emit calls to
>> __atomic_* library functions when applicable).
>>
>> B2) In Clang: have the C11 _Atomic type support and the OpenMP code call
>> into the __atomic builtins' implementation to do their dirty work,
>> instead of having their own separate implementation.
>>
>> B3) After those two changes, I believe clang will only ever emit atomic
>> IR instructions when they can be lowered. So then, in LLVM: get rid of
>> the fallback to __sync_* libcalls from the atomic IR instructions. If an
>> atomic IR operation is requested and is not implementable lock-free, it
>> is an error, with no fallback.

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
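For reference, the correspondence B1 relies on (and that the migration from
__sync_* to __atomic_* assumes in general): the __sync_* builtins are
documented as full-barrier operations, so a sequentially consistent mapping is
the natural one. A minimal sketch using the GCC/Clang builtins:

    #include <cstdint>

    // Legacy builtin: implicit full barrier; today this can fall back to a
    // __sync_* library call on targets without the needed instructions.
    uint32_t bump_legacy(uint32_t *p) {
      return __sync_fetch_and_add(p, 1);
    }

    // Newer builtin family: explicit memory order; falls back to the
    // __atomic_* library (libatomic) when the access is not lock-free.
    uint32_t bump_new(uint32_t *p) {
      return __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST);
    }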
JF Bastien via llvm-dev
2016-Feb-01 14:39 UTC
[llvm-dev] Adding sanity to the Atomics implementation
On Sun, Jan 31, 2016 at 12:42 AM, Hal Finkel via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>> I don't have much of substance to add, but I will say that I like the
>> proposal (I too prefer Alternative A).
>
> I also prefer alternative A.

Same here: +1 to A.

It makes it easier to support virtual ISAs. For targets such as
WebAssembly, we don't know ahead of time what's atomic, but we make
certain assumptions (e.g. atomic_flag has to be sufficiently sized to be
lock-free). As David pointed out, this means that _Atomic and std::atomic
have to be "sufficiently sized", i.e. sizeof(T) == sizeof(std::atomic<T>)
isn't guaranteed (the guarantee is also absent for other, more academic
reasons, but I don't think they matter in this case).

Another consideration: we want to get the atomic optimizations, which
option B would forbid because the middle-end would see libcalls instead
of atomics.

Also, I'd like LLVM to eventually be able to patch code at load time and
transform maybe-lock-free operations into the proper instruction or
libcall.

Related to "always lock-free": wg21.link/n4509
Note that C has macros that report never / sometimes / always lock-free.
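For reference, the query facilities mentioned above, in a minimal
self-contained C++ sketch (the printed values are target-dependent; 0/1/2
encode never / sometimes / always lock-free):

    #include <atomic>
    #include <cstdio>

    int main() {
      // Compile-time macros: 0 = never lock-free, 1 = sometimes (depends on
      // the particular object), 2 = always lock-free.
      std::printf("ATOMIC_INT_LOCK_FREE   = %d\n", ATOMIC_INT_LOCK_FREE);
      std::printf("ATOMIC_LLONG_LOCK_FREE = %d\n", ATOMIC_LLONG_LOCK_FREE);

      // Per-object run-time query; the "sometimes" case can depend on the
      // object's actual alignment.
      std::atomic<long long> x{0};
      std::printf("x.is_lock_free()       = %d\n", (int)x.is_lock_free());

      // std::atomic<T> may be larger or more aligned than T if that is what
      // it takes to make the object lock-free.
      std::printf("sizeof(long long) = %zu, sizeof(std::atomic<long long>) = %zu\n",
                  sizeof(long long), sizeof(std::atomic<long long>));
    }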
Philip Reames via llvm-dev
2016-Feb-01 20:27 UTC
[llvm-dev] Adding sanity to the Atomics implementation
On 02/01/2016 06:39 AM, JF Bastien via llvm-dev wrote:
> On Sun, Jan 31, 2016 at 12:42 AM, Hal Finkel via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>> I don't have much of substance to add, but I will say that I like the
>>> proposal (I too prefer Alternative A).
>>
>> I also prefer alternative A.
>
> Same here: +1 to A.

Also, +1.

> It makes it easier to support virtual ISAs. For targets such as
> WebAssembly, we don't know ahead of time what's atomic, but we make
> certain assumptions (e.g. atomic_flag has to be sufficiently sized to be
> lock-free). As David pointed out, this means that _Atomic and std::atomic
> have to be "sufficiently sized", i.e. sizeof(T) == sizeof(std::atomic<T>)
> isn't guaranteed (the guarantee is also absent for other, more academic
> reasons, but I don't think they matter in this case).
>
> Another consideration: we want to get the atomic optimizations, which
> option B would forbid because the middle-end would see libcalls instead
> of atomics.

Strongly agreed with this part.

> Also, I'd like LLVM to eventually be able to patch code at load time and
> transform maybe-lock-free operations into the proper instruction or
> libcall.
>
> Related to "always lock-free": wg21.link/n4509
> Note that C has macros that report never / sometimes / always lock-free.