Hal Finkel via llvm-dev
2016-Jan-31  08:42 UTC
[llvm-dev] Adding sanity to the Atomics implementation
----- Original Message -----> From: "Ben via llvm-dev Craig" <llvm-dev at lists.llvm.org> > To: llvm-dev at lists.llvm.org > Sent: Thursday, January 28, 2016 9:41:06 AM > Subject: Re: [llvm-dev] Adding sanity to the Atomics implementation > > > I don't have much of substance to add, but I will say that I like the > proposal (I too prefer Alternative A).I also prefer alternative A. -Hal> When adding C11 atomic > support for hexagon, I found it surprising that support for the > __sync_* was implemented completely differently than the C11 > atomics. > > > On 1/27/2016 1:55 PM, James Knight via llvm-dev wrote: > > > > Right now, the atomics implementation in clang is a bit of a mess. It > has three basically completely distinct code-paths: > > 1. There's the legacy __sync_* builtins, which clang lowers > directly to atomic IR instructions. Then, the llvm atomic IR > instructions themselves can sometimes emit libcalls to __sync_* > library functions (which are basically undocumented, and users > are often responsible for implementing themselves if they need > it). > 2. There's the new __atomic_* builtins, which clang will, > depending on size and alignment and target, lower either to a > libcall to a " standardized-by-GCC " __atomic_* library function > (implemented by libatomic), or, to the atomic IR instructions. > (Also covered by the same code is the __c11_atomic_* builtins, > which are almost-identical clang-specific aliases of the > __atomic_* builtins.) > 3. There's the C11 "_Atomic" type modifier and some OpenMP stuff, > which clang lowers into atomic IR or libcalls to __atomic_* > library functions, via almost completely separate code from #2. > (Note: doesn't apply to C++ std::atomic<>, which gets > implemented with the builtins instead). Beyond just a lot of > duplicated logic in the code, the _Atomic impl is kinda broken: > sometimes it emits llvm IR instructions directly (letting things > fall back to __sync_* calls), and sometimes it checks if it > should emit a libcall first (falling back to __atomic_* calls). > That's pretty bad behavior. > > > > I'd like to make a proposal for cleaning this mess up. > > > BTW, as a sidenote... > One thing that's important to remember: at least on the C/C++ level, > if you support lock-free atomic-ops for a given size/alignment, ALL > the atomic ops for that size/alignment must be lock-free. This > property is usually quite easy, because you'll have either LL/SC or > CAS instructions, which can be used to implement any other > operation. However, many older CPU models support atomic load, > store, and exchange operations, but are missing any way to do an > atomic compare-and-swap. This is true for at least ARMv5, SPARCv8, > and Intel 80386. When your minimum CPU is set to such a cpu model, > all atomic operations must be done via libcall -- it's not > acceptable for atomic_store to emit an atomic "store" instruction, > but atomic_fetch_add to require a libcall which gets implemented > with a lock. If that were to happen, then atomic_fetch_add could not > actually be made atomic versus a simultaneous atomic_store. > > > > > So anyhow, there's basically two paths I think we could take for > cleanup. I like "Alternative A" below better, but would be > interested to hear if others have reasons to think the other would > be preferable for some reason. > > > Both alternatives I've suggested will have the effect that the > __sync_* builtins in clang will now lower to __atomic_* function > calls on platforms without inline atomics (or for unsupported > sizes), and C11 atomics will stop being schizophrenic. Clang will > cease to ever emit a call to a __sync_* function from any of its > builtins or language support. This could theoretically cause an > incompatibility on some target. > > > However, I think it ought to be pretty safe: I can't imagine anyone > who will use an upgraded compiler has __sync_* functions implemented > for their platform, but doesn't have the __atomic_* library > functions implemented. The __atomic_* builtins (which already use > those library calls) are in widespread use, notably, both in > libstdc++ and libc++, as well as in a lot of third-party code. IMO > it's worth having that potential incompatibility, in order to > simplify the situation for users (only one set of atomic library > calls to worry about), and to have a single code-path for atomics in > the compiler. > > > > > > > Alternative A : Move all atomic libcall emission into LLVM > > > In this alternative, LLVM will be responsible for lowering all atomic > operations (to inline code or libcalls as appropriate) for all three > code-paths listed at the beginning. Clang will emit no libcalls for > atomic operations itself. > > > > > A1) In LLVM: when lowering "store atomic", "load atomic", > "atomicrmw", and "cmpxchg" instructions that aren't supported by the > target, emit libcalls to the new __atomic_* functions, (rather than > the current behavior of calling the legacy __sync_* functions.) > > > Additionally, I'd like to move the decision to emit a libcall into > AtomicExpandPass, and remove the ability to use Expand in ISel to > create a libcall for ISD::ATOMIC_*. Putting the decision there > allows querying the target, up front, for its supported sizes, > before any other lowering decisions (either other parts of > AtomicExpandPass and in ISel). When choosing whether to inline or > libcall, only the size and alignment of the operation should be > considered, and not the operation itself, or the type. This will > ensure that the "all or none" property is adhered to. > > > (Q: what about the implementation of > __atomic_is_lock_free/__atomic_always_lock_free in clang? The clang > frontend can't query target information from the llvm backend, can > it? Is there some other way to expose the information of what sizes > are supported by a backend so that the clang builtins -- the latter > of which must be a C++ constant expression -- can also use it? > Or...just continue duplicating the info in both places, as is done > today...) > > > A2) In clang, start removing the code that knows how to do optimized > library call lowering for atomics -- always emit llvm atomic ops. > Except for one case: clang will still need to emit library calls > itself for data not aligned naturally for its size. The LLVM atomic > instructions currently will not handle unaligned data, but unaligned > data is allowed for the four "slab of memory" builtins > (__atomic_load, __atomic_store, __atomic_compare_exchange, and > __atomic_exchange). > > > A3) In LLVM, add "align" attributes to cmpxchg and atomicrmw, and > allow specifying "align" values for "load atomic" and "store atomic" > (where the attribute currently exists but cannot be used). LLVM will > then be able to lower misaligned accesses to library calls. In > clang, get rid of special case for directly emitting libcalls for > misaligned atomics, and lower to llvm instructions there, as well. > > > A4) Hopefully, now, the three codepaths in clang will be sufficiently > the same (as they're all now just creating llvm atomic instructions) > that they can be trivially merged, or else they are so trivial that > the remaining duplication is not worthwhile to merge. > > > > > > > > > > > > > Alternative B : Move all libcall emission for atomics into Clang. > > > In this alternative, LLVM will never emit atomic libcalls from the > atomic IR instructions. If the operation requested is not possible > on a given target, that is an error. So, the cleanups here are to > get clang to stop emitting LLVM IR atomics that cannot be lowered > without libcalls. > > > B1) In Clang: have the legacy __sync_* builtins become essentially > aliases for the __atomic_* builtins (thus they will emit calls to > __atomic_* library functions when applicable). > > > B2) In Clang: Have the C11 _Atomic type support and openmp stuff call > into the __atomic builtins' code to do their dirty work, instead of > having its own separate implementation. > > > B3) After those two changes, I believe clang will only ever emit > atomic IR instructions when they can be lowered. So then, in LLVM: > get rid of the fallback to libcalls to __sync_* from the atomic IR > instructions. If an atomic IR operation is requested and not > implementable lock-free, it will be an error, with no fallback. > > > > _______________________________________________ > LLVM Developers mailing list llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a > Linux Foundation Collaborative Project > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
JF Bastien via llvm-dev
2016-Feb-01  14:39 UTC
[llvm-dev] Adding sanity to the Atomics implementation
On Sun, Jan 31, 2016 at 12:42 AM, Hal Finkel via llvm-dev < llvm-dev at lists.llvm.org> wrote:> ----- Original Message ----- > > From: "Ben via llvm-dev Craig" <llvm-dev at lists.llvm.org> > > To: llvm-dev at lists.llvm.org > > Sent: Thursday, January 28, 2016 9:41:06 AM > > Subject: Re: [llvm-dev] Adding sanity to the Atomics implementation > > > > > > I don't have much of substance to add, but I will say that I like the > > proposal (I too prefer Alternative A). > > I also prefer alternative A. >Same here: +1 to A. it makes it easier to support virtual ISAs. For targets such as WebAssembly we don't know ahead-of time what's atomic but we make certain assumptions (e.g. atomic_flag has to be sufficiently sized to be lock-free). As David pointed out this means that _Atomic and std::atomic have to be "sufficiently sized" e.g. sizeof(T) == sizeof(std::atomic<T>) isn't guaranteed (the guarantee is also there for other academic reasons, but I don't think they matter in this case). Another consideration: we want to get the atomic optimizations, which option B would forbid because the middle-end would see libcalls instead of atomics. Also, I'd like LLVM to eventually be able to patch code at load-time and transform maybe-lock-free operations into the proper instruction or libcall. Related to always lock-free: wg21.link/n4509 Note that C has macros that return never / sometimes / always lock-free. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160201/564bb00e/attachment.html>
Philip Reames via llvm-dev
2016-Feb-01  20:27 UTC
[llvm-dev] Adding sanity to the Atomics implementation
On 02/01/2016 06:39 AM, JF Bastien via llvm-dev wrote:> On Sun, Jan 31, 2016 at 12:42 AM, Hal Finkel via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > ----- Original Message ----- > > From: "Ben via llvm-dev Craig" <llvm-dev at lists.llvm.org > <mailto:llvm-dev at lists.llvm.org>> > > To: llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > > Sent: Thursday, January 28, 2016 9:41:06 AM > > Subject: Re: [llvm-dev] Adding sanity to the Atomics implementation > > > > > > I don't have much of substance to add, but I will say that I > like the > > proposal (I too prefer Alternative A). > > I also prefer alternative A. > > > Same here: +1 to A.Also, +1.> > it makes it easier to support virtual ISAs. For targets such as > WebAssembly we don't know ahead-of time what's atomic but we make > certain assumptions (e.g. atomic_flag has to be sufficiently sized to > be lock-free). As David pointed out this means that _Atomic and > std::atomic have to be "sufficiently sized" e.g. sizeof(T) == > sizeof(std::atomic<T>) isn't guaranteed (the guarantee is also there > for other academic reasons, but I don't think they matter in this case). > > Another consideration: we want to get the atomic optimizations, which > option B would forbid because the middle-end would see libcalls > instead of atomics.Strongly agreed with this part.> > Also, I'd like LLVM to eventually be able to patch code at load-time > and transform maybe-lock-free operations into the proper instruction > or libcall. > > Related to always lock-free: wg21.link/n4509 <http://wg21.link/n4509> > Note that C has macros that return never / sometimes / always lock-free. > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160201/5ae65890/attachment.html>