Marc de Kruijf
2011-Mar-24 18:59 UTC
[LLVMdev] GCC vs. LLVM difference on simple code example
Hi,

I have a question on why gcc and llvm-gcc compile the following simple code snippet differently:

    extern int a;
    extern int *b;

    void foo() {
      int i;
      for (i = 1; i < 100; ++i)
        a += b[i];
    }

gcc compiles this function hoisting the load of the global variable "b" outside of the loop, while llvm-gcc keeps it inside the loop. This results in slower code on the part of llvm-gcc, and I'm wondering why this choice is made. Is it because of the memory consistency model?

With respect to memory consistency, does the C standard say whether a global variable used inside a function is loaded at the point of the use(s), or whether it can be loaded by the compiler earlier in the function? I had always thought that it was legal to hoist the load of a global variable outside of the loop as long as it was not declared volatile....

Here is the x86 assembly code generated by gcc 4.5.2. The load of "b" is highlighted:

            .file   "foo.c"
            .text
            .p2align 4,,15
            .globl  foo
            .type   foo, @function
    foo:
           *movl    b, %ecx*
            movl    $1, %eax
            movl    a, %edx
            pushl   %ebp
            movl    %esp, %ebp
            .p2align 4,,7
            .p2align 3
    .L2:
            addl    (%ecx,%eax,4), %edx
            addl    $1, %eax
            cmpl    $100, %eax
            movl    %edx, a
            jne     .L2
            popl    %ebp
            ret
            .size   foo, .-foo
            .ident  "GCC: (GNU) 4.5.2"
            .section        .note.GNU-stack,"",@progbits

And here is the code produced by llvm-gcc 4.2.1:

            .file   "foo.c"
            .text
            .globl  foo
            .align  16, 0x90
            .type   foo, @function
    foo:
            pushl   %ebp
            movl    %esp, %ebp
            movl    $1, %eax
            movl    a, %ecx
            .align  16, 0x90
    .LBB0_1:
           *movl    b, %edx*
            addl    (%edx,%eax,4), %ecx
            movl    %ecx, a
            incl    %eax
            cmpl    $100, %eax
            jne     .LBB0_1
            popl    %ebp
            ret
    .Ltmp0:
            .size   foo, .Ltmp0-foo
            .section        .note.GNU-stack,"",@progbits
            .ident  "GCC: (GNU) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build)"
Eli Friedman
2011-Mar-24 19:31 UTC
[LLVMdev] GCC vs. LLVM difference on simple code example
On Thu, Mar 24, 2011 at 11:59 AM, Marc de Kruijf <dekruijf at cs.wisc.edu> wrote:
> Hi,
> I have a question on why gcc and llvm-gcc compile the following simple code
> snippet differently:
> extern int a;
> extern int *b;
> void foo() {
>   int i;
>   for (i = 1; i < 100; ++i)
>     a += b[i];
> }

Missed optimization, which appears to be fixed in newer versions.

-Eli
Chris Lattner
2011-Mar-25 22:34 UTC
[LLVMdev] GCC vs. LLVM difference on simple code example
On Mar 24, 2011, at 11:59 AM, Marc de Kruijf wrote:
> Hi,
>
> I have a question on why gcc and llvm-gcc compile the following simple code snippet differently:
>
> extern int a;
> extern int *b;
>
> void foo() {
>   int i;
>   for (i = 1; i < 100; ++i)
>     a += b[i];
> }
>
> gcc compiles this function hoisting the load of the global variable "b" outside of the loop, while llvm-gcc keeps it inside the loop. This results in slower code on the part of llvm-gcc, and I'm wondering why this choice is made? Is it because of the memory consistency model? With respect to memory consistency, does the C standard say whether a global variable used inside a function is loaded at the point of the use(s), or whether it can be loaded by the compiler earlier in the function? I had always thought that it was legal to hoist the load of a global variable outside of the loop as long as it was not declared volatile....

The difference here is that llvm-gcc doesn't support the "-fstrict-aliasing" flag. If you pass -fno-strict-aliasing to gcc, you'll probably get similar code to llvm-gcc. Note that clang does support -fstrict-aliasing.

-Chris

>
> Here is the x86 assembly code generated by gcc 4.5.2. The load of "b" is highlighted:
>
>         .file   "foo.c"
>         .text
>         .p2align 4,,15
>         .globl  foo
>         .type   foo, @function
> foo:
>         movl    b, %ecx
>         movl    $1, %eax
>         movl    a, %edx
>         pushl   %ebp
>         movl    %esp, %ebp
>         .p2align 4,,7
>         .p2align 3
> .L2:
>         addl    (%ecx,%eax,4), %edx
>         addl    $1, %eax
>         cmpl    $100, %eax
>         movl    %edx, a
>         jne     .L2
>         popl    %ebp
>         ret
>         .size   foo, .-foo
>         .ident  "GCC: (GNU) 4.5.2"
>         .section        .note.GNU-stack,"",@progbits
>
> And here is the code produced by llvm-gcc 4.2.1:
>
>         .file   "foo.c"
>         .text
>         .globl  foo
>         .align  16, 0x90
>         .type   foo, @function
> foo:
>         pushl   %ebp
>         movl    %esp, %ebp
>         movl    $1, %eax
>         movl    a, %ecx
>         .align  16, 0x90
> .LBB0_1:
>         movl    b, %edx
>         addl    (%edx,%eax,4), %ecx
>         movl    %ecx, a
>         incl    %eax
>         cmpl    $100, %eax
>         jne     .LBB0_1
>         popl    %ebp
>         ret
> .Ltmp0:
>         .size   foo, .Ltmp0-foo
>         .section        .note.GNU-stack,"",@progbits
>         .ident  "GCC: (GNU) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build)"
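
For readers who hit the same codegen difference, one source-level workaround is to copy the global pointer into a local before the loop. The sketch below is illustrative only and is not taken from the thread: foo_hoisted and storage are hypothetical names, and b is defined here (rather than declared extern) just so the file links on its own. Because the address of the local p is never taken, no store in the loop can modify it, so the compiler does not need alias analysis of any kind to keep it in a register. The store to a is left inside the loop on purpose, since b[i] may legally refer to a itself.

```c
/* Definitions standing in for the thread's "extern int a; extern int *b;".
 * The backing array "storage" is a hypothetical name added for this sketch. */
int a = 0;
int storage[100];
int *b = storage;

/* Original form from the thread: whether the load of b is hoisted out of
 * the loop depends on what the compiler can prove about the store to a. */
void foo(void) {
    int i;
    for (i = 1; i < 100; ++i)
        a += b[i];
}

/* Hand-hoisted variant (hypothetical name): b is read exactly once.
 * a is still updated in memory every iteration, so the result matches
 * foo() even if some b[i] happens to be a itself. */
void foo_hoisted(void) {
    int *p = b;   /* single load of the global pointer */
    int i;
    for (i = 1; i < 100; ++i)
        a += p[i];
}
```

Both functions compute the same value of a as long as nothing else changes b while the loop runs; the second simply makes the loop invariance explicit instead of relying on the optimizer.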