A customer has reported a performance problem, which I have eventually tracked down to the following situation: Consider this program: int foo(int *p, volatile int *q, int n) { int i, s = 0; for (i = 0; i < n; ++i) s += *p + *q; return s; } LLVM's analysis indicates that *p and *q can alias, even though *p is non-volatile whereas *q is volatile. I don't have the exact section from the C standard, but if I remember correctly, accessing volatile memory via a non-volatile object results in an undefined behavior. This would suggest that volatiles and non-volatiles may be considered not to alias automatically, even if TBAA would not be able to prove it. The LLVM's code (on x86) at -O2 looks like this: .text .globl foo .align 16, 0x90 .type foo, at function foo: # @foo .cfi_startproc # BB#0: # %entry xorl %eax, %eax testl %edx, %edx jle .LBB0_2 .align 16, 0x90 .LBB0_1: # %for.body addl (%rdi), %eax addl (%rsi), %eax decl %edx jne .LBB0_1 .LBB0_2: # %for.end ret .Ltmp0: .size foo, .Ltmp0-foo .cfi_endproc .section ".note.GNU-stack","", at progbits For comparison, GCC has only one load in the loop: .text .p2align 4,,15 .globl foo .type foo, @function foo: .LFB0: .cfi_startproc xorl %eax, %eax testl %edx, %edx jle .L3 movl (%rdi), %r8d xorl %ecx, %ecx .p2align 4,,10 .p2align 3 .L4: movl (%rsi), %edi addl $1, %ecx addl %r8d, %edi addl %edi, %eax cmpl %edx, %ecx jne .L4 .L3: rep ret .cfi_endproc .LFE0: .size foo, .-foo .ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3" .section .note.GNU-stack,"", at progbits The specifics in our case were that the AliasSetTracker indicated mod/ref for all pointers in the loop, where one of them was used in a volatile load, and all others were used in non-volatile loads. As a result, the loop had more loads than necessary. Any thoughts? Has there been any consideration for this type of a situation? -K -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
On Wed, Sep 4, 2013 at 2:33 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:> A customer has reported a performance problem, which I have eventually > tracked down to the following situation: > > Consider this program: > > int foo(int *p, volatile int *q, int n) { > int i, s = 0; > for (i = 0; i < n; ++i) > s += *p + *q; > return s; > } > > > LLVM's analysis indicates that *p and *q can alias, even though *p is > non-volatile whereas *q is volatile. I don't have the exact section from > the C standard, but if I remember correctly, accessing volatile memory via > a non-volatile object results in an undefined behavior.Accessing a variable declared as "volatile int" through a non-volatile pointer isn't allowed. Accessing a variable declared as non-volatile "int" through a volatile pointer is allowed. You might be able to reason out that either p and q don't alias, or q points to non-volatile memory... but that's substantially more complicated than what you're proposing. (The section in the standard is 6.5p7.) -Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130904/31e30f32/attachment.html>
On Wed, Sep 4, 2013 at 2:48 PM, Eli Friedman <eli.friedman at gmail.com> wrote:> On Wed, Sep 4, 2013 at 2:33 PM, Krzysztof Parzyszek < > kparzysz at codeaurora.org> wrote: > >> A customer has reported a performance problem, which I have eventually >> tracked down to the following situation: >> >> Consider this program: >> >> int foo(int *p, volatile int *q, int n) { >> int i, s = 0; >> for (i = 0; i < n; ++i) >> s += *p + *q; >> return s; >> } >> >> >> LLVM's analysis indicates that *p and *q can alias, even though *p is >> non-volatile whereas *q is volatile. I don't have the exact section from >> the C standard, but if I remember correctly, accessing volatile memory via >> a non-volatile object results in an undefined behavior. > > > Accessing a variable declared as "volatile int" through a non-volatile > pointer isn't allowed. Accessing a variable declared as non-volatile "int" > through a volatile pointer is allowed. > > You might be able to reason out that either p and q don't alias, or q > points to non-volatile memory... but that's substantially more complicated > than what you're proposing. > > (The section in the standard is 6.5p7.) >Furthermore, LLVM IR does not have the concept of "volatile variables". In LLVM IR, operations are volatile, not storage. There are no semantic rules prohibiting the mixing of volatile and non-volatile accesses. Dan -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130904/76fe7a61/attachment.html>
On 9/4/2013 4:48 PM, Eli Friedman wrote:> > Accessing a variable declared as "volatile int" through a non-volatile > pointer isn't allowed. Accessing a variable declared as non-volatile > "int" through a volatile pointer is allowed.Ok, that's not a problem. The problem here is the invariance of the non-volatile load. In the alias set tracker, the alias set that contains a volatile load is marked as "mod/ref". This then implies that any other load (volatile or not) from that set is not invariant in a given region. For that "variance" to happen that load would have to be accessing volatile memory, which isn't legal for non-volatile loads. -K -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
James Courtier-Dutton
2013-Sep-07 08:23 UTC
[LLVMdev] Aliasing of volatile and non-volatile
Are you sure this is an alias problem? What is happening is LLVM is leaving the code looking like this: int foo(int *p, volatile int *q, int n) { int i, s = 0; for (i = 0; i < n; ++i) s += *p + *q; return s; } but GCC is changing to code to look like this: int foo(int *p, volatile int *q, int n) { int i, s = 0; int t; t = *p; for (i = 0; i < n; ++i) s += t + *q; return s; } GCC is raising the *p out of the loop, recognizing the fact that memory access is more expensive than reg access. What is preventing LLVM from doing the same? It could have been even faster if it wished and changed the code to this: int foo(int *p, volatile int *q, int n) { int i, s = 0; int t; t = *p; s += t * n; for (i = 0; i < n; ++i) s += *q; return s; } On 4 September 2013 22:33, Krzysztof Parzyszek <kparzysz at codeaurora.org>wrote:> A customer has reported a performance problem, which I have eventually > tracked down to the following situation: > > Consider this program: > > int foo(int *p, volatile int *q, int n) { > int i, s = 0; > for (i = 0; i < n; ++i) > s += *p + *q; > return s; > } > > > LLVM's analysis indicates that *p and *q can alias, even though *p is > non-volatile whereas *q is volatile. I don't have the exact section from > the C standard, but if I remember correctly, accessing volatile memory via > a non-volatile object results in an undefined behavior. This would suggest > that volatiles and non-volatiles may be considered not to alias > automatically, even if TBAA would not be able to prove it. > > The LLVM's code (on x86) at -O2 looks like this: > > .text > .globl foo > .align 16, 0x90 > .type foo, at function > foo: # @foo > .cfi_startproc > # BB#0: # %entry > xorl %eax, %eax > testl %edx, %edx > jle .LBB0_2 > .align 16, 0x90 > .LBB0_1: # %for.body > addl (%rdi), %eax > addl (%rsi), %eax > decl %edx > jne .LBB0_1 > .LBB0_2: # %for.end > ret > .Ltmp0: > .size foo, .Ltmp0-foo > .cfi_endproc > .section ".note.GNU-stack","", at progbits > > > For comparison, GCC has only one load in the loop: > > .text > .p2align 4,,15 > .globl foo > .type foo, @function > foo: > .LFB0: > .cfi_startproc > xorl %eax, %eax > testl %edx, %edx > jle .L3 > movl (%rdi), %r8d > xorl %ecx, %ecx > .p2align 4,,10 > .p2align 3 > .L4: > movl (%rsi), %edi > addl $1, %ecx > addl %r8d, %edi > addl %edi, %eax > cmpl %edx, %ecx > jne .L4 > .L3: > rep > ret > .cfi_endproc > .LFE0: > .size foo, .-foo > .ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3" > .section .note.GNU-stack,"", at progbits > > > The specifics in our case were that the AliasSetTracker indicated mod/ref > for all pointers in the loop, where one of them was used in a volatile > load, and all others were used in non-volatile loads. As a result, the > loop had more loads than necessary. > > Any thoughts? Has there been any consideration for this type of a > situation? > > -K > > > -- > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted > by The Linux Foundation > ______________________________**_________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/**mailman/listinfo/llvmdev<http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130907/ba76d18a/attachment.html>
James Courtier-Dutton
2013-Sep-07 09:04 UTC
[LLVMdev] Aliasing of volatile and non-volatile
On 4 September 2013 22:33, Krzysztof Parzyszek <kparzysz at codeaurora.org>wrote:> A customer has reported a performance problem, which I have eventually > tracked down to the following situation: > > Consider this program: > > int foo(int *p, volatile int *q, int n) { > int i, s = 0; > for (i = 0; i < n; ++i) > s += *p + *q; > return s; > } > > > LLVM's analysis indicates that *p and *q can alias, even though *p is > non-volatile whereas *q is volatile. I don't have the exact section from > the C standard, but if I remember correctly, accessing volatile memory via > a non-volatile object results in an undefined behavior. This would suggest > that volatiles and non-volatiles may be considered not to alias > automatically, even if TBAA would not be able to prove it. > >Could the problem also be looked at like this: 1) If *p and *q are aliased, the results of *p is undefined. 2) If *p and *q and not aliased, the results of *p is defined. Surely, in this sort of situation, optimizations should should be done assuming (2), because if the case is (1) the result it undefined, so doing (2) won't change that. Kind Regards James -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130907/c77df181/attachment.html>
On 9/7/2013 3:23 AM, James Courtier-Dutton wrote:> Are you sure this is an alias problem?In principle no, but in some sense yes---it's related to the way AliasSetTracker treats volatile references.> but GCC is changing to code to look like this: > > int foo(int *p, volatile int *q, int n) { > int i, s = 0; > int t; > t = *p; > for (i = 0; i < n; ++i) > s += t + *q; > return s; > } > > GCC is raising the *p out of the loop, recognizing the fact that memory > access is more expensive than reg access. > What is preventing LLVM from doing the same?When a volatile load is added to an AliasSetTracker, the alias set associated with the load's memory location is marked as "mod/ref". This means that even if the alias set only contains loads (i.e. is "ref" but not "mod"), then after adding a volatile load it will become "mod" as well. This will prevent any load from being moved out of a loop in LICM, since LICM uses AST to see if the load's location has been changed in the loop. I have posted a patch on the llvm-commits mailing list: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130902/187072.html -K -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Possibly Parallel Threads
- [LLVMdev] Aliasing of volatile and non-volatile
- [LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?
- [LLVMdev] Failure to optimize ? operator
- [LLVMdev] GCC vs. LLVM difference on simple code example
- [RFC] New pass: LoopExitValues