Dennis Luehring via llvm-dev
2016-Dec-05 11:32 UTC
[llvm-dev] Clang Optimizer freaks out on "simple" goto code?
FYI: I found this example while reading https://github.com/jameysharp/corrode/issues/30#issuecomment-231969365 and compared the output of current gcc 6.2 and clang 3.9.

The gcc 6.2 result is quite small; clang 3.9 produces much more code for this example: https://godbolt.org/g/uWxr8F

Is that a missed optimization opportunity or just wrong behavior of the optimizer?
Philip Pfaffe via llvm-dev
2016-Dec-05 12:08 UTC
Hi Dennis,

While Clang's code is significantly larger, that is probably on purpose: Clang has vectorized the goto-loop.

To validate whether that was correct and a good idea, plug both results into a benchmark and look at the actual performance data.

Philip
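A minimal sketch of the kind of benchmark suggested above, assuming the code under test can be called as `int fn(int)`. The names `time_calls` and `triangular` are hypothetical and are not taken from the thread or from the linked godbolt example; `triangular` is only a stand-in. The `volatile` input and sink are there so the compiler cannot hoist the call out of the timing loop or discard its result.

```c
#include <time.h>

/* Hypothetical stand-in for the function under test; the real source sits
   behind the godbolt link and is not reproduced here. */
int triangular(int count) {
    int total = 0;
    while (count > 0) {
        total += count;
        count--;
    }
    return total;
}

/* Call fn(arg) `reps` times and return the elapsed CPU time in seconds.
   The volatile input is re-read on every iteration, and the volatile sink
   forces every result to be stored, so the calls stay inside the loop. */
double time_calls(int (*fn)(int), int arg, long reps) {
    volatile int input = arg;
    volatile int sink = 0;
    clock_t start = clock();
    for (long i = 0; i < reps; i++) {
        sink = fn(input);
    }
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Building the same file once with each compiler at the same optimization level and calling `time_calls` for both small and large values of `arg` would show whether the larger vectorized code pays off for the sizes that matter.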
mats petersson via llvm-dev
2016-Dec-05 12:31 UTC
On 5 December 2016 at 12:08, Philip Pfaffe via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> While Clang's code is significantly larger, that is probably on purpose:
> Clang has vectorized the goto-loop.
>
> To validate whether that was correct and a good idea, plug both results
> into a benchmark and look at the actual performance data.

And that, in turn, depends on the length of the loop. If the value of `count` is known at compile time, the compiler will know whether performing SSE operations "is worth it".

I had a case where I was passing an array of int and a length to a function, and clang generated a whole lot of instructions to unroll the loop and use SSE - not realizing that the common value for `length` was 1 and that it was never bigger than some small number (16 or 32). I didn't make any effort to improve the compiled code, as I realized it was relatively simple to inline the whole piece of code in LLVM-IR (it was part of my Pascal compiler project). But I believe if I had added an `assert(length < 16)` to the code, it would have done a decent job with it.

[Inlining it helps my Pascal compiler beat the FreePascal implementation by about 2-3x using that particular algorithm for solving sudoku - the call itself was quite an overhead, and "not having a loop when you don't need to" helps even more]

--
Mats
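One way to pass such a bound to the optimizer, as a sketch only: the function name `sum_small` and the limit of 16 are hypothetical, chosen to mirror the case described above. A plain `assert` from `<assert.h>` expands to nothing when `NDEBUG` is defined, so it carries no information in a typical release build; `__builtin_unreachable()` (a GCC and Clang builtin) states the same bound without a runtime check. Whether a given compiler version actually produces smaller code from this hint has not been verified here.

```c
#include <stddef.h>

/* Hypothetical example: sum `length` ints, where the caller promises
   that length is always below 16. */
int sum_small(const int *values, size_t length) {
    /* GCC/Clang-specific hint: reaching this branch would be undefined
       behavior, so the optimizer may assume length < 16 from here on. */
    if (length >= 16) {
        __builtin_unreachable();
    }
    int total = 0;
    for (size_t i = 0; i < length; i++) {
        total += values[i];
    }
    return total;
}
```

The cost of this design is that a caller who breaks the promise gets undefined behavior rather than a diagnostic, so it only makes sense when the bound is guaranteed elsewhere.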
Stephen Checkoway via llvm-dev
2016-Dec-05 17:06 UTC
> On Dec 5, 2016, at 06:08, Philip Pfaffe via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> While Clang's code is significantly larger, that is probably on purpose: Clang has vectorized the goto-loop.

Even with -mno-sse (and I suspect there's a better way to inhibit vectorization, but this worked), it looks like Clang is doing something a little strange. Where gcc uses

    .L7:
        add eax, edi
        sub edi, 1
        jne .L7

Clang has

    .LBB0_2:
        add eax, edi
        cmp edi, 1
        lea ecx, [rdi - 1]
        mov edi, ecx
        jg .LBB0_2

It also has a redundant xor eax, eax before the loop for some reason.

That said...

> To validate whether that was correct and a good idea, plug both results into a benchmark and look at the actual performance data.

I didn't actually do this.

--
Stephen Checkoway
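For readers without access to the godbolt link, here is a sketch of a goto-loop whose scalar form would match the assembly quoted above: an accumulator in eax, a counter in edi that is added and then decremented until it reaches zero. This is a hypothetical reconstruction inferred from that assembly, not the actual source behind the link, and the name `sum_goto` is made up for illustration.

```c
/* Hypothetical reconstruction: adds count + (count - 1) + ... + 1 using
   goto instead of a structured loop. Large counts would overflow int. */
int sum_goto(int count) {
    int total = 0;              /* corresponds to "xor eax, eax" */
    if (count <= 0) {
        goto done;              /* skip the loop for empty ranges */
    }
loop:
    total += count;             /* "add eax, edi" */
    count -= 1;                 /* "sub edi, 1" */
    if (count > 0) {
        goto loop;              /* loop back while the counter is positive */
    }
done:
    return total;
}
```

Under that assumption, both loops compute the same thing; the Clang version simply spends a separate compare, lea, and mov where gcc reuses the flags set by the subtraction.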