Hi,

TL;DR: should we add a new memory ordering to fences?

``fence seq_cst`` is currently used to represent two things:
- GCC-style builtin ``__sync_synchronize()`` [0][1].
- C11/C++11's sequentially-consistent thread fence
  ``std::atomic_thread_fence(std::memory_order_seq_cst)`` [2].

As far as I understand:
- The former orders all memory and emits an actual fence instruction.
- The latter only provides a total order with other
  sequentially-consistent loads and stores, which means that it's
  possible to move non-sequentially-consistent loads and stores around
  it.

The GCC-style builtin effectively does the same as the C11/C++11
sequentially-consistent thread fence, surrounded by compiler barriers
(``call void asm sideeffect "", "~{memory}"``).

The LLVM language reference [3] describes ``fence seq_cst`` in terms
of the C11/C++11 primitive, but it looks like LLVM's codebase treats
it like the GCC-style builtin. That's strictly correct, but it seems
desirable to represent the GCC-style builtin with a ninth
LLVM-internal memory ordering that's stricter than
``llvm::SequentiallyConsistent``. ``fence seq_cst`` could then fully
utilize C11/C++11's semantics, without breaking the GCC-style builtin.

From C11/C++11's point of view this other memory ordering isn't useful
because the primitives offered are sufficient to express valid and
performant code, but I believe that LLVM needs this new memory
ordering to accurately represent the GCC-style builtin while fully
taking advantage of the C11/C++11 memory model.

Am I correct?

I don't think it's worth implementing just yet since C11/C++11 are
still relatively new, but I'd like to be a bit forward looking for
PNaCl's sake.

Thanks,

JF

[0] http://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/_005f_005fsync-Builtins.html
[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36793
[2] C++11 Standard section 29.8 - Fences
[3] http://llvm.org/docs/LangRef.html#fence-instruction
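The message above describes the GCC-style builtin as a C11/C++11 sequentially-consistent fence surrounded by compiler barriers. A minimal C sketch of that reading follows. The helper names ``compiler_barrier``, ``full_barrier_sketch`` and ``publish_and_read`` are hypothetical and only illustrate the description; this is not how GCC or LLVM implement ``__sync_synchronize()``. The empty inline asm with a "memory" clobber is the GCC/Clang source-level spelling of the ``~{memory}`` barrier mentioned above.

```c
#include <stdatomic.h>

/* Compiler-only barrier (GCC/Clang extension): emits no instruction, but the
   "memory" clobber tells the compiler that any memory may have been read or
   written here, so it cannot move or cache memory accesses across it. */
static inline void compiler_barrier(void) {
    __asm__ __volatile__("" ::: "memory");
}

/* Hypothetical model of the semantics described in the message: a C11
   seq_cst thread fence with a compiler barrier on each side. */
static inline void full_barrier_sketch(void) {
    compiler_barrier();
    atomic_thread_fence(memory_order_seq_cst);
    compiler_barrier();
}

/* Legacy-style use with plain (non-atomic) data and a volatile flag.
   The barrier keeps the store to *data ahead of the store to *flag. */
int publish_and_read(int *data, volatile int *flag) {
    *data = 42;
    full_barrier_sketch();
    *flag = 1;
    return *data;
}
```

Calling ``publish_and_read`` from a single thread only shows that the sketch compiles and runs; whether the extra compiler barriers are needed is exactly the question the thread is debating.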
2013/7/31 JF Bastien <jfb at google.com>:
> Hi,
>
> TL;DR: should we add a new memory ordering to fences?
>
>
> ``fence seq_cst`` is currently used to represent two things:
> - GCC-style builtin ``__sync_synchronize()`` [0][1].
> - C11/C++11's sequentially-consistent thread fence
> ``std::atomic_thread_fence(std::memory_order_seq_cst)`` [2].
>
> As far as I understand:
> - The former orders all memory and emits an actual fence instruction.
>
> - The latter only provides a total order with other
> sequentially-consistent loads and stores, which means that it's
> possible to move non-sequentially-consistent loads and stores around
> it.

It still acts as an acquire/release fence for any other atomic
instruction. For non-atomic instructions, if you have a race, the
behavior is undefined anyway, so you can't get a stronger guarantee
than what "fence seq_cst" provides.

I think "fence seq_cst" is completely equivalent to
__sync_synchronize(), but you could convince me otherwise by providing
a sample program for which there's a difference.

> The GCC-style builtin effectively does the same as the C11/C++11
> sequentially-consistent thread fence, surrounded by compiler barriers
> (``call void asm sideeffect "", "~{memory}"``).
>
> The LLVM language reference [3] describes ``fence seq_cst`` in terms
> of the C11/C++11 primitive, but it looks like LLVM's codebase treats
> it like the GCC-style builtin. That's strictly correct, but it seems
> desirable to represent the GCC-style builtin with a ninth
> LLVM-internal memory ordering that's stricter than
> ``llvm::SequentiallyConsistent``. ``fence seq_cst`` could then fully
> utilize C11/C++11's semantics, without breaking the GCC-style builtin.
>
> From C11/C++11's point of view this other memory ordering isn't useful
> because the primitives offered are sufficient to express valid and
> performant code, but I believe that LLVM needs this new memory
> ordering to accurately represent the GCC-style builtin while fully
> taking advantage of the C11/C++11 memory model.
>
> Am I correct?
>
> I don't think it's worth implementing just yet since C11/C++11 are
> still relatively new, but I'd like to be a bit forward looking for
> PNaCl's sake.
>
> Thanks,
>
> JF
>
>
> [0] http://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/_005f_005fsync-Builtins.html
> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36793
> [2] C++11 Standard section 29.8 - Fences
> [3] http://llvm.org/docs/LangRef.html#fence-instruction
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
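The reply above says that a seq_cst fence still acts as an acquire/release fence for other atomic operations. A small C11 sketch of that property, using the usual message-passing pattern, is given here. The names ``payload``, ``ready``, ``producer`` and ``consumer`` are made up for illustration and do not come from the thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain, non-atomic data */
static atomic_bool ready;  /* atomic flag, only accessed with relaxed ordering */

/* Writer side: the fence sits between the data store and the relaxed flag
   store, so it acts (at least) as a release fence for the flag store. */
void producer(int v) {
    payload = v;
    atomic_thread_fence(memory_order_seq_cst);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader side: the fence sits between the relaxed flag load and the data
   load, so it acts (at least) as an acquire fence for the flag load.
   Returns false if the flag has not been observed as set yet. */
bool consumer(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_seq_cst);
    *out = payload;
    return true;
}
```

If ``consumer`` runs on another thread and observes ``ready`` as true, the two fences synchronize and the read of ``payload`` is not a data race. Here the flag is atomic, which is the case the reply covers; the legacy code discussed later in the thread uses a ``volatile`` flag instead.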
    struct {
      volatile int flag;
      int value;
    } s;

    int get_value_when_ready() {
      while (s.flag) ;
      __sync_synchronize();
      return s.value;
    }

This is "valid" legacy code on some processors, yet it's not valid to
replace __sync_synchronize with an atomic_thread_fence because, in
theory, LLVM could hoist the load of s.value. In practice it currently
doesn't, but it may in the future if my understanding is correct.

My main point is that LLVM needs to support code that was written
before C and C++ got a memory model. It doesn't matter that the code
has undefined behavior and relies on a GCC-style builtin to be
"correct". The current standards offer all you need to write new code
that can express the above intended behavior, but __sync_synchronize
isn't a 1:1 mapping to atomic_thread_fence(seq_cst): it has stronger
semantics, and that constrains which optimizations can be done on
``fence seq_cst``. LLVM therefore probably wants to distinguish both,
so that it can fully optimize C++11 code without leaving legacy code
in a bad position.

2013/7/31 Jeffrey Yasskin <jyasskin at google.com>:
> 2013/7/31 JF Bastien <jfb at google.com>:
>> Hi,
>>
>> TL;DR: should we add a new memory ordering to fences?
>>
>>
>> ``fence seq_cst`` is currently used to represent two things:
>> - GCC-style builtin ``__sync_synchronize()`` [0][1].
>> - C11/C++11's sequentially-consistent thread fence
>> ``std::atomic_thread_fence(std::memory_order_seq_cst)`` [2].
>>
>> As far as I understand:
>> - The former orders all memory and emits an actual fence instruction.
>>
>> - The latter only provides a total order with other
>> sequentially-consistent loads and stores, which means that it's
>> possible to move non-sequentially-consistent loads and stores around
>> it.
>
> It still acts as an acquire/release fence for any other atomic
> instruction. For non-atomic instructions, if you have a race, the
> behavior is undefined anyway, so you can't get a stronger guarantee
> than what "fence seq_cst" provides.
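JF's reply notes that the current standards offer what is needed to write new code expressing the intent of the legacy snippet. One possible C11 rendering is sketched here as an assumption, not as the form anyone in the thread proposed: the flag becomes an ``atomic_int``, and the names ``s11``, ``set_value_and_clear_flag`` and ``get_value_when_ready_c11`` are hypothetical. The polarity of the original loop (spin while the flag is nonzero) is kept unchanged.

```c
#include <stdatomic.h>

/* Same shape as the legacy struct, but the flag is atomic, not volatile. */
struct {
    atomic_int flag;
    int value;
} s11;

/* Hypothetical writer: publish the value, then clear the flag. The release
   fence orders the plain store to value before the relaxed flag store. */
void set_value_and_clear_flag(int v) {
    s11.value = v;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s11.flag, 0, memory_order_relaxed);
}

/* Reader: spin while the flag is nonzero, as in the original snippet.
   Because the flag load is atomic, the fence after it orders the later
   plain load of value, so the compiler may not hoist that load above
   the loop. */
int get_value_when_ready_c11(void) {
    while (atomic_load_explicit(&s11.flag, memory_order_relaxed))
        ;
    atomic_thread_fence(memory_order_seq_cst);
    return s11.value;
}
```

An acquire fence, or an acquire load of the flag, would also be enough on the reader side; the seq_cst fence is used only to stay close to the ``fence seq_cst`` form under discussion.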