Umesh Kalappa via llvm-dev
2020-Feb-10 13:05 UTC
[llvm-dev] atomic ops are optimized with incorrect semantics .
Hi All, With the "https://gcc.godbolt.org/z/yBYTrd" case . the atomic is converted to non atomic ops for x86 like from xchg dword ptr [100], eax to mov dword ptr [100], 1 the pass is responsible for this tranformation was instCombine i.e InstCombiner::visitAtomicRMWInst which converts the IR like %0 = atomicrmw xchg i32* inttoptr (i64 100 to i32*), i32 1 monotonic to store atomic i32 1, i32* inttoptr (i64 100 to i32*) monotonic, align 4 which is valid for relax(monotonic) and release ordering as per the code out there. we think that,its the inst lowering issue, where the atomic store was lowered to non-atomic store here. to work around we changed our case to strong ordering . and we are debugging the case with x86 and the same goes with an arm too. Any suggestions or thoughts on the transformation? , will be helpful to proceed further. FYI, the problem persists with LLVM-9, not with the LLVM-8. Thank you ~Umesh -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200210/6b759684/attachment.html>
David Chisnall via llvm-dev
2020-Feb-10 14:06 UTC
[llvm-dev] atomic ops are optimized with incorrect semantics .
Hi, It looks as if there are at least two kinds of at least borderline undefined behaviour here, which makes it difficult to evaluate this test case. 1. The atomic operation is in an infinite loop. 2. The address is a cast from a provenance-free integer. That said, I don't believe that this optimisation is taking advantage of either of those. The atomic compare and exchange is relaxed and so does not establish a happens before edge with anything (including itself) and there is no guarantee under the C++11 memory model that there is any ordering of the loads and stores between threads. Given that the load here is unused, a compare and exchange reduces to a store. If it were sequentially consistent, for example, then this transform would not be valid and, indeed, LLVM does not perform the transformation in this case. David On 10/02/2020 13:05, Umesh Kalappa via llvm-dev wrote:> Hi All, > > With the "https://gcc.godbolt.org/z/yBYTrd" case . > > the atomic is converted to non atomic ops for x86 like > > from > xchg dword ptr [100], eax > to > mov dword ptr [100], 1 > > the pass is responsible for this tranformation was instCombine > i.e InstCombiner::visitAtomicRMWInst > > which converts the IR like > %0 = atomicrmw xchg i32* inttoptr (i64 100 to i32*), i32 1 monotonic > to > store atomic i32 1, i32* inttoptr (i64 100 to i32*) monotonic, align 4 > > which is valid for relax(monotonic) and release ordering as per the code > out there. > > we think that,its the inst lowering issue, where the atomic store was > lowered to non-atomic store here. > > to work around we changed our case to strong ordering . > > and we are debugging the case with x86 and the same goes with an arm too. > > Any suggestions or thoughts on the transformation? , will be helpful to > proceed further. > > FYI, the problem persists with LLVM-9, not with the LLVM-8. > > Thank you > ~Umesh > > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Nicolai Hähnle via llvm-dev
2020-Feb-10 20:38 UTC
[llvm-dev] atomic ops are optimized with incorrect semantics .
Hi Umesh, On Mon, Feb 10, 2020 at 2:06 PM Umesh Kalappa via llvm-dev <llvm-dev at lists.llvm.org> wrote:> Hi All, > > With the "https://gcc.godbolt.org/z/yBYTrd" case . > > the atomic is converted to non atomic ops for x86 like > > from > xchg dword ptr [100], eax > to > mov dword ptr [100], 1 > > the pass is responsible for this tranformation was instCombine i.e InstCombiner::visitAtomicRMWInst > > which converts the IR like > %0 = atomicrmw xchg i32* inttoptr (i64 100 to i32*), i32 1 monotonic > to > store atomic i32 1, i32* inttoptr (i64 100 to i32*) monotonic, align 4Can you explain why you think this code generation is incorrect? A simple mov is an atomic store on x86 from what I understand - the godbolt example seems perfectly correct to me, what am I missing? Cheers, Nicolai> > which is valid for relax(monotonic) and release ordering as per the code out there. > > we think that,its the inst lowering issue, where the atomic store was lowered to non-atomic store here. > > to work around we changed our case to strong ordering . > > and we are debugging the case with x86 and the same goes with an arm too. > > Any suggestions or thoughts on the transformation? , will be helpful to proceed further. > > FYI, the problem persists with LLVM-9, not with the LLVM-8. > > Thank you > ~Umesh > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-- Lerne, wie die Welt wirklich ist, aber vergiss niemals, wie sie sein sollte.
Umesh Kalappa via llvm-dev
2020-Feb-11 09:43 UTC
[llvm-dev] atomic ops are optimized with incorrect semantics .
Thank you All and Nicolai, This is what we get with ARM like (CLANG 9.0.1.1) 1834 0000000080001ccc <bm>: 1835 80001ccc:¦ 52800028 ¦ mov¦w8, #0x1 ¦// #1 1836 80001cd0:¦ b9000008 ¦ str¦w8, [x0] 1837 80001cd4:¦ 17ffffff ¦ b¦ 80001cd0 <bm+0x4> (CLANG 8.0.0.2) 1833 0000000080001c90 <bm>: 1834 80001c90:> 320003e8 > orr>w8, wzr, #0x1 1835 80001c94:> 885f7c1f > ldxr> wzr, [x0] 1836 80001c98:> 881f7c08 > stxr> wzr, w8, [x0] 1837 80001c9c:> 17fffffe > b> 80001c94 <bm+0x4> may be in x86 ,you are right mov is atomic, but if we need to sync the operation with the previous insn's then just having atomic doesn't help right? ~Umesh On Tue, Feb 11, 2020 at 2:08 AM Nicolai Hähnle <nhaehnle at gmail.com> wrote:> Hi Umesh, > > On Mon, Feb 10, 2020 at 2:06 PM Umesh Kalappa via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > Hi All, > > > > With the "https://gcc.godbolt.org/z/yBYTrd" case . > > > > the atomic is converted to non atomic ops for x86 like > > > > from > > xchg dword ptr [100], eax > > to > > mov dword ptr [100], 1 > > > > the pass is responsible for this tranformation was instCombine i.e > InstCombiner::visitAtomicRMWInst > > > > which converts the IR like > > %0 = atomicrmw xchg i32* inttoptr (i64 100 to i32*), i32 1 monotonic > > to > > store atomic i32 1, i32* inttoptr (i64 100 to i32*) monotonic, align 4 > > Can you explain why you think this code generation is incorrect? A > simple mov is an atomic store on x86 from what I understand - the > godbolt example seems perfectly correct to me, what am I missing? > > Cheers, > Nicolai > > > > > > which is valid for relax(monotonic) and release ordering as per the code > out there. > > > > we think that,its the inst lowering issue, where the atomic store was > lowered to non-atomic store here. > > > > to work around we changed our case to strong ordering . > > > > and we are debugging the case with x86 and the same goes with an arm too. > > > > Any suggestions or thoughts on the transformation? , will be helpful to > proceed further. > > > > FYI, the problem persists with LLVM-9, not with the LLVM-8. > > > > Thank you > > ~Umesh > > > > > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > -- > Lerne, wie die Welt wirklich ist, > aber vergiss niemals, wie sie sein sollte. >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200211/cb7804ce/attachment.html>
Reasonably Related Threads
- [atomics][AArch64] Possible bug in cmpxchg lowering
- [LLVMdev] ScheduleDAGInstrs computes deps using IR Values that may be invalid
- Aarch64: unaligned access despite -mstrict-align
- [LLVMdev] LICM promoting memory to scalar
- [LLVMdev] LICM promoting memory to scalar