I actually submitted a patch to fix this very problem quite a while back, but I don't think it was ever added to the baseline: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110711/124136.html I have also explained why it is important to avoid the creation of new loops in an optimizer that performs analysis on the looping structure. I now need to keep this patch updated whenever we upgrade LLVM, to ensure that our optimizer (which is now in production) works correctly. Would it be possible to add this patch to the head? I have a version that works with 3.0 but have not yet updated to 3.1 or the head. Andrew On 10/17/2012 12:55 PM, Chris Lattner wrote:> On Oct 17, 2012, at 9:38 AM, Krzysztof Parzyszek<kparzysz at codeaurora.org> wrote: >>> The current implementation of the CFG simplification is >>> loop-agnostic. In the past I have observed that it can perform >>> transformations that can be detrimental to the loop structure. One >>> example that I recall, is converting a loop like this: >>> while (...) { >>> ... >>> if (cond) continue; >>> ... >>> } >>> into two nested loops. Specifically, the "continue" branch would go >>> back to the loop header making it appear as if there were two back >>> edges. >>> >>> >>> Well, there really are two backedges there. >> There is no reason for two back edges. This loop could look like this: >> >> header: >> ... >> ... >> if (cond) goto tail; else goto next; >> next: >> ... >> tail: >> if (loop-condition) goto header; >> >> >> This way there is one loop with some control flow nested in it, instead of something that looks like two loops. > What problem does that solve? LLVM's concept of loop is tied to the loop header block: this both forms are a single loop, not two nested loops. > > -Chris > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Thu, Oct 18, 2012 at 10:12 AM, Andrew Clinton <andrew at sidefx.com> wrote:> I actually submitted a patch to fix this very problem quite a while back, > but I don't think it was ever added to the baseline: > > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110711/124136.html > > I have also explained why it is important to avoid the creation of new loops > in an optimizer that performs analysis on the looping structure. > > I now need to keep this patch updated whenever we upgrade LLVM, to ensure > that our optimizer (which is now in production) works correctly. Would it > be possible to add this patch to the head? I have a version that works with > 3.0 but have not yet updated to 3.1 or the head.I'm not quite up for diving all the way into the previous thread, but do you have a test case that demonstrates a missed optimization due to this transformational problem? That should hopefully be enough to convince people the change is valuable. (I assume this is causing you real problems/missed optimizations or you wouldn't bother keeping it for your out of tree releases)> > Andrew > > > On 10/17/2012 12:55 PM, Chris Lattner wrote: >> >> On Oct 17, 2012, at 9:38 AM, Krzysztof Parzyszek<kparzysz at codeaurora.org> >> wrote: >>>> >>>> The current implementation of the CFG simplification is >>>> loop-agnostic. In the past I have observed that it can perform >>>> transformations that can be detrimental to the loop structure. One >>>> example that I recall, is converting a loop like this: >>>> while (...) { >>>> ... >>>> if (cond) continue; >>>> ... >>>> } >>>> into two nested loops. Specifically, the "continue" branch would go >>>> back to the loop header making it appear as if there were two back >>>> edges. >>>> >>>> >>>> Well, there really are two backedges there. >>> >>> There is no reason for two back edges. This loop could look like this: >>> >>> header: >>> ... >>> ... >>> if (cond) goto tail; else goto next; >>> next: >>> ... >>> tail: >>> if (loop-condition) goto header; >>> >>> >>> This way there is one loop with some control flow nested in it, instead >>> of something that looks like two loops. >> >> What problem does that solve? LLVM's concept of loop is tied to the loop >> header block: this both forms are a single loop, not two nested loops. >> >> -Chris >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On 10/18/2012 01:26 PM, David Blaikie wrote:> On Thu, Oct 18, 2012 at 10:12 AM, Andrew Clinton<andrew at sidefx.com> wrote: >> I actually submitted a patch to fix this very problem quite a while back, >> but I don't think it was ever added to the baseline: >> >> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110711/124136.html >> >> I have also explained why it is important to avoid the creation of new loops >> in an optimizer that performs analysis on the looping structure. >> >> I now need to keep this patch updated whenever we upgrade LLVM, to ensure >> that our optimizer (which is now in production) works correctly. Would it >> be possible to add this patch to the head? I have a version that works with >> 3.0 but have not yet updated to 3.1 or the head. > I'm not quite up for diving all the way into the previous thread, but > do you have a test case that demonstrates a missed optimization due to > this transformational problem? That should hopefully be enough to > convince people the change is valuable. (I assume this is causing you > real problems/missed optimizations or you wouldn't bother keeping it > for your out of tree releases) >I think part of the difficulty in explaining the benefit of this change is that the optimization pass where this change is of value to us is also a custom analysis pass. Basically, we're using LLVM as an optimization framework for a SIMD language, and the difficulty with the unwanted loop transformation comes when we have code like this: while (uniform_and_expensive_condition_with_side_effects()) { if (cheap_varying_condition()) do_this } If this gets transformed to something like this (which is similar to the example I posted in the earlier thread): while () // outer loop { while () // inner loop { if (!uniform_and_expensive_condition_with_side_effects()) return; if (cheap_varying_condition()) break; } do_this } If you think of "uniform_and_expensive_condition_with_side_effects()" as a function that is efficient only when it is run in lockstep for the SIMD array, and "cheap_varying_condition()" as a function that can be run efficiently with conditional masking, the transformation is detrimental for performance because "uniform_and_expensive_condition_with_side_effects()" now needs to both work with conditional masking and have an implementation that does this efficiently. In practice, we have some functions that don't work at all unless they are executed "loop uniformly" as in the source loop, so changing the CFG in this way breaks the optimized code. I know this explanation may be a little difficult to follow, especially if you are thinking of LLVM entirely in the context of optimizing serial programs for scalar architectures. So I have offered these other reasons to add the patch: 1) Conditional logic is naturally simpler than loops, so a pass designed to simplify the loop structure should not introduce more complexity in the form of additional loops when the CFG is just representing control flow 2) The patch doesn't degrade the assembly, at least in the most recent test that I performed for Cameron I think it was exactly the same instruction count or slightly better Andrew