thr3ads.net - llvm dev - [LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity? [Jan 2014]

If this information is useful, please help other people find it:
Share via:

Chandler Carruth

2014-Jan-22 11:39 UTC

[LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?

On Wed, Jan 22, 2014 at 1:01 AM, Andrew Trick <atrick at apple.com> wrote:
> On Jan 22, 2014, at 12:44 AM, Chandler Carruth <chandlerc at
gmail.com>
> wrote:
>
>
> On Wed, Jan 22, 2014 at 12:33 AM, Andrew Trick <atrick at apple.com>
wrote:
>
>> > There appear to be two chunks of "functionality"
provided by loop
>> passes:
>> >
>> > 1) A worklist of loops to process. This is very rarely used:
>> > 1.1) LoopSimplify and LoopUnswitch add loops to the queue.
>>
>> I’m making this up without much thought, but we may benefit from
>> iterative loop transforms like Rotate -> LICM -> Unswitch ->
Rotate ->
>> LICM. We might need to come up with a preferred alternative: we can
either
>> continue to convert transforms into a utilities, or we can invent new
pass
>> manager tricks. e.g. iterate over a set of function passes or
selectively
>> run a pass on “dirty” regions.
>
>
> This is a really good point. Owen pointed it out to me as well in another
> guise: when we unroll a loop, we need to re-run a bunch of the passes, but
> re-running them when we *haven't* successfully unrolled a loop is a
total
> waste.
>
> I'd like to think more about this, so a simpler option: what if we
*just*
> extract LoopSimplify and LCSSA? If all the LoopPasses preserve these, then
> them being function passes would be fine. This would allow us to at least
> *start* writing function passes over loops (like the LoopVectorizer) rather
> than being forced to phrase them as LoopPasses.
>
> I think I could implement this simpler option right away, and that would
> unblock a lot of our efforts around unrolling, vectorizing, and PGO.
>
>
> Great. LoopSimplify can be used as a utility to cleanup transformed loops
> if needed.
>
This would really only work with pretty specialized logic in the
LoopVectorizer and LSR to select only the cleanup aspects of LoopSimplify
that they need. And it wouldn't even have any utility in LSR.

Unfortunately, the loop pass manager makes doing this incrementally hard.
The loop vectorizer runs on the inner most loop, makes in not-simplified.
We then pop out to the outer loop and verify everything. The
now-function-pass LoopSimplify fails its verification because it verifies
*all* of the loops in the function, even though the inner loops aren't
going to be revisited.

My suggestion is to disable the whole-function verification in LoopSimplify
while we're fixing this. We can instead have LCSSA verify the specific loop
it is processing which will give us the same safety guarantees. Does that
sound OK?

That will let me do the following:

1) Hoist LoopSimplify to a function pass
2) Hoist LCSSA to a function pass
3) Make the LoopVectorizer a function pass
4) Make IVUsers a utility of LSR, run on-demand for the loop that LSR is
processing (no pass manager involvement)
5) Make LSR a function pass

As a consequence of this, we will actually fix the long-standing issue of
LoopSimplify vs. ScalarEvolution, and cut the number of runs of
LoopSimplify and LCSSA by half (!!!) so I think this is worth doing.

I have a patch that does #1 already, but wanted to check that you're OK
weakening the verification. Otherwise, I have to do 1, 2, 3, and 5 in a
single commit, or teach the LoopVectorizer and LSR to preserve
LoopSimplify... Yuck.

-Chandler
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140122/19ec6b23/attachment.html>

Chandler Carruth

2014-Jan-22 12:01 UTC

head link

[LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?

On Wed, Jan 22, 2014 at 3:39 AM, Chandler Carruth <chandlerc at
gmail.com>wrote:
> I have a patch that does #1 already, but wanted to check that you're OK
> weakening the verification. Otherwise, I have to do 1, 2, 3, and 5 in a
> single commit, or teach the LoopVectorizer and LSR to preserve
> LoopSimplify... Yuck.

This patch appears to cause slight changes in three test cases:

    LLVM :: Transforms/IndVarSimplify/lftr-reuse.ll
    LLVM :: Transforms/LoopSimplify/ashr-crash.ll
    LLVM :: Transforms/LoopStrengthReduce/2011-12-19-PostincQuadratic.ll

Looking at lftr-reuse.ll, we successfully hoist the 'icmp slt' into the
'outer:' block as the comment says would be nice (because the outer loop
is
simplified now, the test is checking for unsimplified). The LSR failure is
just that the loop basic blocks have different names (loopexit instead of
preheader).

The ashr-crash.ll case is minutely interesting -- we fail to hoist the
comparison that the test wants hoisted into the entry block. My suspicion
is that getting this to hoist with the heavily reduced pipeline used is
problematic, as the test seems more geared to tickle SCEV bugs than test
important optimization invariants.

All of the other regression tests pass, so this looks pretty good to me.
Let me know! I can mail the patch tomorrow if it helps.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140122/94d5eb58/attachment.html>

Andrew Trick

2014-Jan-22 20:02 UTC

head link

[LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?

On Jan 22, 2014, at 3:39 AM, Chandler Carruth <chandlerc at gmail.com>
wrote:
> 
> On Wed, Jan 22, 2014 at 1:01 AM, Andrew Trick <atrick at apple.com>
wrote:
> On Jan 22, 2014, at 12:44 AM, Chandler Carruth <chandlerc at
gmail.com> wrote:
> 
>> 
>> On Wed, Jan 22, 2014 at 12:33 AM, Andrew Trick <atrick at
apple.com> wrote:
>> > There appear to be two chunks of "functionality"
provided by loop passes:
>> >
>> > 1) A worklist of loops to process. This is very rarely used:
>> > 1.1) LoopSimplify and LoopUnswitch add loops to the queue.
>> 
>> I’m making this up without much thought, but we may benefit from
iterative loop transforms like Rotate -> LICM -> Unswitch -> Rotate
-> LICM. We might need to come up with a preferred alternative: we can either
continue to convert transforms into a utilities, or we can invent new pass
manager tricks. e.g. iterate over a set of function passes or selectively run a
pass on “dirty” regions.
>> 
>> This is a really good point. Owen pointed it out to me as well in
another guise: when we unroll a loop, we need to re-run a bunch of the passes,
but re-running them when we *haven't* successfully unrolled a loop is a
total waste.
>> 
>> I'd like to think more about this, so a simpler option: what if we
*just* extract LoopSimplify and LCSSA? If all the LoopPasses preserve these,
then them being function passes would be fine. This would allow us to at least
*start* writing function passes over loops (like the LoopVectorizer) rather than
being forced to phrase them as LoopPasses.
>> 
>> I think I could implement this simpler option right away, and that
would unblock a lot of our efforts around unrolling, vectorizing, and PGO.
> 
> Great. LoopSimplify can be used as a utility to cleanup transformed loops
if needed.
> 
> This would really only work with pretty specialized logic in the
LoopVectorizer and LSR to select only the cleanup aspects of LoopSimplify that
they need. And it wouldn't even have any utility in LSR.
> 
> Unfortunately, the loop pass manager makes doing this incrementally hard.
The loop vectorizer runs on the inner most loop, makes in not-simplified. We
then pop out to the outer loop and verify everything. The now-function-pass
LoopSimplify fails its verification because it verifies *all* of the loops in
the function, even though the inner loops aren't going to be revisited.
> 
> My suggestion is to disable the whole-function verification in LoopSimplify
while we're fixing this. We can instead have LCSSA verify the specific loop
it is processing which will give us the same safety guarantees. Does that sound
OK?
Yes, in general we cannot rely on LoopSimplify for correctness since it’s not
always possible to put loops in canonical form. Passes should check
isSimplifiedForm, so it should only be a question of missed optimization.
> That will let me do the following:
> 
> 1) Hoist LoopSimplify to a function pass
> 2) Hoist LCSSA to a function pass
> 3) Make the LoopVectorizer a function pass
> 4) Make IVUsers a utility of LSR, run on-demand for the loop that LSR is
processing (no pass manager involvement)
> 5) Make LSR a function pass
Great.
> As a consequence of this, we will actually fix the long-standing issue of
LoopSimplify vs. ScalarEvolution, and cut the number of runs of LoopSimplify and
LCSSA by half (!!!) so I think this is worth doing.
This was a really annoying issue that I spent a lot of time on in the past. So
thanks.
> I have a patch that does #1 already, but wanted to check that you're OK
weakening the verification. Otherwise, I have to do 1, 2, 3, and 5 in a single
commit, or teach the LoopVectorizer and LSR to preserve LoopSimplify... Yuck.

Yes. Thank you.

-Andy

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140122/86b25f3f/attachment.html>

Andrew Trick

2014-Jan-22 20:09 UTC

head link

[LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?

On Jan 22, 2014, at 4:01 AM, Chandler Carruth <chandlerc at gmail.com>
wrote:
> 
> On Wed, Jan 22, 2014 at 3:39 AM, Chandler Carruth <chandlerc at
gmail.com> wrote:
> I have a patch that does #1 already, but wanted to check that you're OK
weakening the verification. Otherwise, I have to do 1, 2, 3, and 5 in a single
commit, or teach the LoopVectorizer and LSR to preserve LoopSimplify... Yuck.
> 
> This patch appears to cause slight changes in three test cases:
> 
>     LLVM :: Transforms/IndVarSimplify/lftr-reuse.ll
>     LLVM :: Transforms/LoopSimplify/ashr-crash.ll
>     LLVM :: Transforms/LoopStrengthReduce/2011-12-19-PostincQuadratic.ll
> 
> Looking at lftr-reuse.ll, we successfully hoist the 'icmp slt' into
the 'outer:' block as the comment says would be nice (because the outer
loop is simplified now, the test is checking for unsimplified). The LSR failure
is just that the loop basic blocks have different names (loopexit instead of
preheader).
> 
> The ashr-crash.ll case is minutely interesting -- we fail to hoist the
comparison that the test wants hoisted into the entry block. My suspicion is
that getting this to hoist with the heavily reduced pipeline used is
problematic, as the test seems more geared to tickle SCEV bugs than test
important optimization invariants.
This looks like an example of the kind of order-of-loop-transform problems that
I spent a staggering amount of time debugging. So the underlying problem should
go away.

That said, it also looks like an example where we benefit from applying multiple
passes to the inner loop before optimizing the outer loop. If you want to file a
PR and disable it for now that’s cool.

-Andy
> All of the other regression tests pass, so this looks pretty good to me.
Let me know! I can mail the patch tomorrow if it helps.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140122/ee6f3e79/attachment.html>

llvm dev - Jan 2014 - [LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?

[LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?

[LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?

[LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?

[LLVMdev] Why should we have the LoopPass and LoopPassManager? Can we get rid of this complexity?