Craig Topper via llvm-dev
2021-Feb-18 01:01 UTC
[llvm-dev] Potentially unsafe loop optimization
I ran the IR through clang -O2 on godbolt and didn't get a return. So I think something is happening in the middle end? ~Craig On Wed, Feb 17, 2021 at 4:57 PM Johannes Doerfert < johannesdoerfert at gmail.com> wrote:> llc generates the return in all version I tried: > https://godbolt.org/z/oh3fqh > > > On 2/17/21 6:46 PM, Craig Topper wrote: > > What version of llvm are you using? Godbolt is showing trunk and llvm 10 > > have a conditional branch, llvm 11 does not. > > > > ~Craig > > > > > > On Wed, Feb 17, 2021 at 4:41 PM Richard Kenner via llvm-dev < > > llvm-dev at lists.llvm.org> wrote: > > > >>> Long story short, from what I can see there is no miscompilation > >>> or change in semantics for that matter. > >> So why does the .s file not contain the loop exit test? > >> > >> .text > >> .file "c26006a.adb" > >> .globl _ada_c26006a # -- Begin function > >> _ada_c26006a > >> .p2align 4, 0x90 > >> .type _ada_c26006a, at function > >> _ada_c26006a: # @_ada_c26006a > >> .cfi_startproc > >> # %bb.0: # %entry > >> pushq %rbx > >> .cfi_def_cfa_offset 16 > >> subq $32, %rsp > >> .cfi_def_cfa_offset 48 > >> .cfi_offset %rbx, -16 > >> movw $8257, 16(%rsp) # imm = 0x2041 > >> movb $49, 18(%rsp) > >> movw $8257, (%rsp) # imm = 0x2041 > >> movb $50, 2(%rsp) > >> xorl %ebx, %ebx > >> jmp .LBB0_1 > >> .p2align 4, 0x90 > >> .LBB0_3: # %loop.cond.iter > >> # in Loop: Header=BB0_1 > Depth=1 > >> incb %bl > >> .LBB0_1: # %loop.cond > >> # =>This Inner Loop Header: > Depth=1 > >> movb %bl, 17(%rsp) > >> movb %bl, 1(%rsp) > >> movzwl (%rsp), %eax > >> xorw 16(%rsp), %ax > >> movzbl 2(%rsp), %ecx > >> xorb 18(%rsp), %cl > >> movzbl %cl, %ecx > >> orw %ax, %cx > >> jne .LBB0_3 > >> # %bb.2: # in Loop: Header=BB0_1 > Depth=1 > >> callq abort > >> jmp .LBB0_3 > >> .Lfunc_end0: > >> .size _ada_c26006a, .Lfunc_end0-_ada_c26006a > >> .cfi_endproc > >> # -- End function > >> .section ".note.GNU-stack","", at progbits > >> _______________________________________________ > >> LLVM Developers mailing list > >> llvm-dev at lists.llvm.org > >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210217/4e98f725/attachment.html>
Juneyoung Lee via llvm-dev
2021-Feb-18 01:22 UTC
[llvm-dev] Potentially unsafe loop optimization
FWIW, clang 11.0 reproduces the bug: https://godbolt.org/z/GvoK63 But clang trunk doesn't: https://godbolt.org/z/bed7eM On Thu, Feb 18, 2021 at 10:01 AM Craig Topper via llvm-dev < llvm-dev at lists.llvm.org> wrote:> I ran the IR through clang -O2 on godbolt and didn't get a return. So I > think something is happening in the middle end? > > ~Craig > > > On Wed, Feb 17, 2021 at 4:57 PM Johannes Doerfert < > johannesdoerfert at gmail.com> wrote: > >> llc generates the return in all version I tried: >> https://godbolt.org/z/oh3fqh >> >> >> On 2/17/21 6:46 PM, Craig Topper wrote: >> > What version of llvm are you using? Godbolt is showing trunk and llvm 10 >> > have a conditional branch, llvm 11 does not. >> > >> > ~Craig >> > >> > >> > On Wed, Feb 17, 2021 at 4:41 PM Richard Kenner via llvm-dev < >> > llvm-dev at lists.llvm.org> wrote: >> > >> >>> Long story short, from what I can see there is no miscompilation >> >>> or change in semantics for that matter. >> >> So why does the .s file not contain the loop exit test? >> >> >> >> .text >> >> .file "c26006a.adb" >> >> .globl _ada_c26006a # -- Begin function >> >> _ada_c26006a >> >> .p2align 4, 0x90 >> >> .type _ada_c26006a, at function >> >> _ada_c26006a: # @_ada_c26006a >> >> .cfi_startproc >> >> # %bb.0: # %entry >> >> pushq %rbx >> >> .cfi_def_cfa_offset 16 >> >> subq $32, %rsp >> >> .cfi_def_cfa_offset 48 >> >> .cfi_offset %rbx, -16 >> >> movw $8257, 16(%rsp) # imm = 0x2041 >> >> movb $49, 18(%rsp) >> >> movw $8257, (%rsp) # imm = 0x2041 >> >> movb $50, 2(%rsp) >> >> xorl %ebx, %ebx >> >> jmp .LBB0_1 >> >> .p2align 4, 0x90 >> >> .LBB0_3: # %loop.cond.iter >> >> # in Loop: Header=BB0_1 >> Depth=1 >> >> incb %bl >> >> .LBB0_1: # %loop.cond >> >> # =>This Inner Loop Header: >> Depth=1 >> >> movb %bl, 17(%rsp) >> >> movb %bl, 1(%rsp) >> >> movzwl (%rsp), %eax >> >> xorw 16(%rsp), %ax >> >> movzbl 2(%rsp), %ecx >> >> xorb 18(%rsp), %cl >> >> movzbl %cl, %ecx >> >> orw %ax, %cx >> >> jne .LBB0_3 >> >> # %bb.2: # in Loop: Header=BB0_1 >> Depth=1 >> >> callq abort >> >> jmp .LBB0_3 >> >> .Lfunc_end0: >> >> .size _ada_c26006a, .Lfunc_end0-_ada_c26006a >> >> .cfi_endproc >> >> # -- End function >> >> .section ".note.GNU-stack","", at progbits >> >> _______________________________________________ >> >> LLVM Developers mailing list >> >> llvm-dev at lists.llvm.org >> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> >> > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Juneyoung Lee Software Foundation Lab, Seoul National University -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210218/b9d2cff2/attachment.html>
Johannes Doerfert via llvm-dev
2021-Feb-18 05:42 UTC
[llvm-dev] Potentially unsafe loop optimization
So I looked at it again because the O3 output in IR has the return for trunk and 11 but for the latter it's not there in assembly, this was looking interesting: https://godbolt.org/z/EqhW4v I think there is UB in the jump based on the memcmp because only one byte of the 3 is initialized. To me it seems there is some type punning going on and the two integers are used as storage for 3 bytes instead of the 3 bytes array. I thought this may lead to the selection of the abort branch and bad consequences. However, on the IR level we do not recognize abort as noreturn as far as I can tell, thus that is not it. I took the IR 11 produced as input and run it again through 11 and trunk, same result. Then I replaced the bcmp with `f` and the return magically appeared for 11 too: https://godbolt.org/z/j4T4hW We are on to something here. bcmp is recognized and we act on it. The UB described earlier is present there too, bcmp compares 3 bytes in the two integer struct instead of the 3 byte array. I can manually do the optimization in the middle end that LLVM 11 does (removed two redundant stores from the entry) but no change: https://godbolt.org/z/7efdqb At this point I would guess the backend knows bcmp and utilizes the UB. Whatever it is, as far as I can tell it is technically correct given the UB in the input. However, understanding what made the difference might be worth while nevertheless. ~ Johannes On 2/17/21 7:01 PM, Craig Topper wrote:> I ran the IR through clang -O2 on godbolt and didn't get a return. So I > think something is happening in the middle end? > > ~Craig > > > On Wed, Feb 17, 2021 at 4:57 PM Johannes Doerfert < > johannesdoerfert at gmail.com> wrote: > >> llc generates the return in all version I tried: >> https://godbolt.org/z/oh3fqh >> >> >> On 2/17/21 6:46 PM, Craig Topper wrote: >>> What version of llvm are you using? Godbolt is showing trunk and llvm 10 >>> have a conditional branch, llvm 11 does not. >>> >>> ~Craig >>> >>> >>> On Wed, Feb 17, 2021 at 4:41 PM Richard Kenner via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>>> Long story short, from what I can see there is no miscompilation >>>>> or change in semantics for that matter. >>>> So why does the .s file not contain the loop exit test? >>>> >>>> .text >>>> .file "c26006a.adb" >>>> .globl _ada_c26006a # -- Begin function >>>> _ada_c26006a >>>> .p2align 4, 0x90 >>>> .type _ada_c26006a, at function >>>> _ada_c26006a: # @_ada_c26006a >>>> .cfi_startproc >>>> # %bb.0: # %entry >>>> pushq %rbx >>>> .cfi_def_cfa_offset 16 >>>> subq $32, %rsp >>>> .cfi_def_cfa_offset 48 >>>> .cfi_offset %rbx, -16 >>>> movw $8257, 16(%rsp) # imm = 0x2041 >>>> movb $49, 18(%rsp) >>>> movw $8257, (%rsp) # imm = 0x2041 >>>> movb $50, 2(%rsp) >>>> xorl %ebx, %ebx >>>> jmp .LBB0_1 >>>> .p2align 4, 0x90 >>>> .LBB0_3: # %loop.cond.iter >>>> # in Loop: Header=BB0_1 >> Depth=1 >>>> incb %bl >>>> .LBB0_1: # %loop.cond >>>> # =>This Inner Loop Header: >> Depth=1 >>>> movb %bl, 17(%rsp) >>>> movb %bl, 1(%rsp) >>>> movzwl (%rsp), %eax >>>> xorw 16(%rsp), %ax >>>> movzbl 2(%rsp), %ecx >>>> xorb 18(%rsp), %cl >>>> movzbl %cl, %ecx >>>> orw %ax, %cx >>>> jne .LBB0_3 >>>> # %bb.2: # in Loop: Header=BB0_1 >> Depth=1 >>>> callq abort >>>> jmp .LBB0_3 >>>> .Lfunc_end0: >>>> .size _ada_c26006a, .Lfunc_end0-_ada_c26006a >>>> .cfi_endproc >>>> # -- End function >>>> .section ".note.GNU-stack","", at progbits >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>>