thr3ads.net - llvm dev - [llvm-dev] Making loop guards part of canonical loop structure [May 2019]

If this information is useful, please help other people find it:
Share via:

Kit Barton via llvm-dev

2019-May-28 13:46 UTC

[llvm-dev] Making loop guards part of canonical loop structure

Hi all,

TL;DR: Should the loop guard branch be part of the canonical loop structure?

Background:
-----------
While working on the next set of improvements to loop fusion, we discovered that
loop rotation will put a guard around a loop if it cannot prove the loop will
execute at least once, as seen in the following (simplified) example:

entry:
  br i1 %cmp1, label %for.body.lr.ph, label %for.cond.cleanup

for.body.lr.ph:                                   ; preds = %entry
  br label %for.body

for.body:                                         ; preds = %for.body.lr.ph,
%for.inc
  br label %for.inc

for.inc:                                          ; preds = %for.body
  br i1 %cmp, label %for.body, label %for.cond.for.cond.cleanup_crit_edge

for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.inc
  br label %for.cond.cleanup

for.cond.cleanup:                                 ; preds =
%for.cond.for.cond.cleanup_crit_edge, %entry
  br label %for.end

for.end:                                          ; preds = %for.cond.cleanup
  ret void

However, for loops that can be proven to execute at least once, there is no
guard.

This creates complications for several loop optimizations because they need to
handle both guarded loops and non-guarded loops. For example, the current loop
fusion pass needs to check whether two loops are control flow equivalent before
fusing them (i.e., if the first loop executes, the second loop is guaranteed to
execute). This is currently done by checking that the preheader of the first
loop dominates the preheader of the second loop, and the preheader of the second
loop post-dominates the preheader of the first loop. When one (or both) of the
loops have a guard, then this check no longer works. If the loop guard was part
of the canonical form, then this check could be done using the loop guard,
instead of the preheader.

Similarly, when checking for perfect loop nests, a loop guard around the inner
loop will make the nest appear imperfect. However, if the guard is considered to
be part of the inner loop, it is easier to prove a perfect nest.

Both of these examples have alternative ways to deal with the loop guards, if
they are present. However, I would like to have the discussion regarding making
guards part of the canonical loop structure. If people are supportive of this,
we can start doing the work to implement the guards and start modify loop
optimizations to deal with them appropriately. However, if this is not the
direction we want to go, we can deal with the guards in the specific
optimizations where they are relevant.

Proposed Solution:
------------------
Ensure that all loops in canonical form have a guard branch that dominates the
preheader. If such a guard branch does not exist, a trivial guard will be
created.

I have not looked at the implementation of this
extensively yet because I wanted feedback on this direction first. However, my
initial thought is to modify LoopSimplify to add the concept of a loop guard,
and provide similar guarantees that it currently provides for preheaders and
exit
blocks. Specifically, if a guard block is not found, one is created and inserted
immediately before the loop preheader to guarantee a structure similar to the
one in the example above. Note that as with the preheader and exit blocks, it is
possible that subsequent passes (e.g., SimplifyCFG) may remove these blocks thus
taking it out of canonical form. However, this is an existing situation that
loop optimizations already deal with.



Any comments/feedback are greatly appreciated.

Thanks,

Kit

Philip Reames via llvm-dev

2019-May-28 17:49 UTC

head link

[llvm-dev] Making loop guards part of canonical loop structure

Kit,

I'm a bit unsure about having the guarded loop be canonical form.  In
particular, I'm hesitant about introducing a conditional branch on
constant for loops which aren't found to have an obvious guard block. 
(Note the distinction: no obvious guard does not imply no guard, just
one we didn't find.)

I'd suggest that we maybe start with a small step in this direction,
rather than everything at once?

Introducing utilities on Loop to return the loop guard block if one is
available seems like a reasonable and necessary first step.  Once that's
done, we could consider extending LoopSimplify to make guarded loops
"more obvious" by moving code out of guarded blocks and the like. 
Note
that my "more obvious" here is about exposing existing guarded loops,
and not introducing new guards. 

I think we need both of these pieces if we were to go all the way
towards your proposal right?  If so, let's start incremental and see if
we have enough of a problem left to justify the canonicalization. 

Philip

On 5/28/19 6:46 AM, Kit Barton via llvm-dev wrote:> Hi all,
>
> TL;DR: Should the loop guard branch be part of the canonical loop
structure?
>
> Background:
> -----------
> While working on the next set of improvements to loop fusion, we discovered
that
> loop rotation will put a guard around a loop if it cannot prove the loop
will
> execute at least once, as seen in the following (simplified) example:
>
> entry:
>   br i1 %cmp1, label %for.body.lr.ph, label %for.cond.cleanup
>
> for.body.lr.ph:                                   ; preds = %entry
>   br label %for.body
>
> for.body:                                         ; preds =
%for.body.lr.ph, %for.inc
>   br label %for.inc
>
> for.inc:                                          ; preds = %for.body
>   br i1 %cmp, label %for.body, label %for.cond.for.cond.cleanup_crit_edge
>
> for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.inc
>   br label %for.cond.cleanup
>
> for.cond.cleanup:                                 ; preds =
%for.cond.for.cond.cleanup_crit_edge, %entry
>   br label %for.end
>
> for.end:                                          ; preds =
%for.cond.cleanup
>   ret void
>
> However, for loops that can be proven to execute at least once, there is no
> guard.
>
> This creates complications for several loop optimizations because they need
to
> handle both guarded loops and non-guarded loops. For example, the current
loop
> fusion pass needs to check whether two loops are control flow equivalent
before
> fusing them (i.e., if the first loop executes, the second loop is
guaranteed to
> execute). This is currently done by checking that the preheader of the
first
> loop dominates the preheader of the second loop, and the preheader of the
second
> loop post-dominates the preheader of the first loop. When one (or both) of
the
> loops have a guard, then this check no longer works. If the loop guard was
part
> of the canonical form, then this check could be done using the loop guard,
> instead of the preheader.
>
> Similarly, when checking for perfect loop nests, a loop guard around the
inner
> loop will make the nest appear imperfect. However, if the guard is
considered to
> be part of the inner loop, it is easier to prove a perfect nest.
>
> Both of these examples have alternative ways to deal with the loop guards,
if
> they are present. However, I would like to have the discussion regarding
making
> guards part of the canonical loop structure. If people are supportive of
this,
> we can start doing the work to implement the guards and start modify loop
> optimizations to deal with them appropriately. However, if this is not the
> direction we want to go, we can deal with the guards in the specific
> optimizations where they are relevant.
>
> Proposed Solution:
> ------------------
> Ensure that all loops in canonical form have a guard branch that dominates
the
> preheader. If such a guard branch does not exist, a trivial guard will be
> created.
>
> I have not looked at the implementation of this
> extensively yet because I wanted feedback on this direction first. However,
my
> initial thought is to modify LoopSimplify to add the concept of a loop
guard,
> and provide similar guarantees that it currently provides for preheaders
and exit
> blocks. Specifically, if a guard block is not found, one is created and
inserted
> immediately before the loop preheader to guarantee a structure similar to
the
> one in the example above. Note that as with the preheader and exit blocks,
it is
> possible that subsequent passes (e.g., SimplifyCFG) may remove these blocks
thus
> taking it out of canonical form. However, this is an existing situation
that
> loop optimizations already deal with.
>
>
>
> Any comments/feedback are greatly appreciated.
>
> Thanks,
>
> Kit
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Finkel, Hal J. via llvm-dev

2019-May-28 21:02 UTC

head link

[llvm-dev] Making loop guards part of canonical loop structure

On 5/28/19 8:46 AM, Kit Barton via llvm-dev wrote:> Hi all,
>
> TL;DR: Should the loop guard branch be part of the canonical loop
structure?
>
> Background:
> -----------
> While working on the next set of improvements to loop fusion, we discovered
that
> loop rotation will put a guard around a loop if it cannot prove the loop
will
> execute at least once, as seen in the following (simplified) example:
>
> entry:
>    br i1 %cmp1, label %for.body.lr.ph, label %for.cond.cleanup
>
> for.body.lr.ph:                                   ; preds = %entry
>    br label %for.body
>
> for.body:                                         ; preds =
%for.body.lr.ph, %for.inc
>    br label %for.inc
>
> for.inc:                                          ; preds = %for.body
>    br i1 %cmp, label %for.body, label %for.cond.for.cond.cleanup_crit_edge
>
> for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.inc
>    br label %for.cond.cleanup
>
> for.cond.cleanup:                                 ; preds =
%for.cond.for.cond.cleanup_crit_edge, %entry
>    br label %for.end
>
> for.end:                                          ; preds =
%for.cond.cleanup
>    ret void
>
> However, for loops that can be proven to execute at least once, there is no
> guard.
>
> This creates complications for several loop optimizations because they need
to
> handle both guarded loops and non-guarded loops. For example, the current
loop
> fusion pass needs to check whether two loops are control flow equivalent
before
> fusing them (i.e., if the first loop executes, the second loop is
guaranteed to
> execute). This is currently done by checking that the preheader of the
first
> loop dominates the preheader of the second loop, and the preheader of the
second
> loop post-dominates the preheader of the first loop. When one (or both) of
the
> loops have a guard, then this check no longer works. If the loop guard was
part
> of the canonical form, then this check could be done using the loop guard,
> instead of the preheader.
>
> Similarly, when checking for perfect loop nests, a loop guard around the
inner
> loop will make the nest appear imperfect. However, if the guard is
considered to
> be part of the inner loop, it is easier to prove a perfect nest.
>
> Both of these examples have alternative ways to deal with the loop guards,
if
> they are present. However, I would like to have the discussion regarding
making
> guards part of the canonical loop structure. If people are supportive of
this,
> we can start doing the work to implement the guards and start modify loop
> optimizations to deal with them appropriately. However, if this is not the
> direction we want to go, we can deal with the guards in the specific
> optimizations where they are relevant.
>
> Proposed Solution:
> ------------------
> Ensure that all loops in canonical form have a guard branch that dominates
the
> preheader. If such a guard branch does not exist, a trivial guard will be
> created.
>
> I have not looked at the implementation of this
> extensively yet because I wanted feedback on this direction first. However,
my
> initial thought is to modify LoopSimplify to add the concept of a loop
guard,
> and provide similar guarantees that it currently provides for preheaders
and exit
> blocks. Specifically, if a guard block is not found, one is created and
inserted
> immediately before the loop preheader to guarantee a structure similar to
the
> one in the example above. Note that as with the preheader and exit blocks,
it is
> possible that subsequent passes (e.g., SimplifyCFG) may remove these blocks
thus
> taking it out of canonical form. However, this is an existing situation
that
> loop optimizations already deal with.

Hi, Kit,

Maybe I'm missing something obvious, but how does this help? I 
understand how guards can make checking for perfect nesting, etc. 
harder, but does always having guards make this easier?

Also, we already have a situation with LCSSA form where trivial 
constructs are introduced, and then eliminated by local simplifications, 
sometimes multiple times as the pipeline progresses. Having more of them 
seems undesirable - it leads to more IR changes, which causes more 
analysis invalidation, and so on.

Thanks again,

Hal

>
>
>
> Any comments/feedback are greatly appreciated.
>
> Thanks,
>
> Kit
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Kit Barton via llvm-dev

2019-May-28 23:37 UTC

head link

[llvm-dev] Making loop guards part of canonical loop structure

I just realized that inadvertently hit reply, instead of reply-all, when
replying to Philip. I'm going to summarize the conversation we have had
since, so everyone can see the direction we are thinking to go.

We'll start by adding a guard detection API as a loop utility that can
identify existing guards. The initial patch will be as simple as
possible, to establish the API. From there we can extend the detection
to get more difficult cases, and potentially modify some parts of loop
simplification (or other optimizations) to make the detection easier to
do, it it makes sense. At that point we will have experience and can
revisit this discussion if it still makes sense to do so.

Kit


Philip Reames <listmail at philipreames.com> writes:
> Kit,
>
> I'm a bit unsure about having the guarded loop be canonical form.  In
> particular, I'm hesitant about introducing a conditional branch on
> constant for loops which aren't found to have an obvious guard block. 
> (Note the distinction: no obvious guard does not imply no guard, just
> one we didn't find.)
>
> I'd suggest that we maybe start with a small step in this direction,
> rather than everything at once?
>
> Introducing utilities on Loop to return the loop guard block if one is
> available seems like a reasonable and necessary first step.  Once
that's
> done, we could consider extending LoopSimplify to make guarded loops
> "more obvious" by moving code out of guarded blocks and the
like.  Note
> that my "more obvious" here is about exposing existing guarded
loops,
> and not introducing new guards. 
>
> I think we need both of these pieces if we were to go all the way
> towards your proposal right?  If so, let's start incremental and see if
> we have enough of a problem left to justify the canonicalization. 
>
> Philip
>
> On 5/28/19 6:46 AM, Kit Barton via llvm-dev wrote:
>> Hi all,
>>
>> TL;DR: Should the loop guard branch be part of the canonical loop
structure?
>>
>> Background:
>> -----------
>> While working on the next set of improvements to loop fusion, we
discovered that
>> loop rotation will put a guard around a loop if it cannot prove the
loop will
>> execute at least once, as seen in the following (simplified) example:
>>
>> entry:
>>   br i1 %cmp1, label %for.body.lr.ph, label %for.cond.cleanup
>>
>> for.body.lr.ph:                                   ; preds = %entry
>>   br label %for.body
>>
>> for.body:                                         ; preds =
%for.body.lr.ph, %for.inc
>>   br label %for.inc
>>
>> for.inc:                                          ; preds = %for.body
>>   br i1 %cmp, label %for.body, label
%for.cond.for.cond.cleanup_crit_edge
>>
>> for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.inc
>>   br label %for.cond.cleanup
>>
>> for.cond.cleanup:                                 ; preds =
%for.cond.for.cond.cleanup_crit_edge, %entry
>>   br label %for.end
>>
>> for.end:                                          ; preds =
%for.cond.cleanup
>>   ret void
>>
>> However, for loops that can be proven to execute at least once, there
is no
>> guard.
>>
>> This creates complications for several loop optimizations because they
need to
>> handle both guarded loops and non-guarded loops. For example, the
current loop
>> fusion pass needs to check whether two loops are control flow
equivalent before
>> fusing them (i.e., if the first loop executes, the second loop is
guaranteed to
>> execute). This is currently done by checking that the preheader of the
first
>> loop dominates the preheader of the second loop, and the preheader of
the second
>> loop post-dominates the preheader of the first loop. When one (or both)
of the
>> loops have a guard, then this check no longer works. If the loop guard
was part
>> of the canonical form, then this check could be done using the loop
guard,
>> instead of the preheader.
>>
>> Similarly, when checking for perfect loop nests, a loop guard around
the inner
>> loop will make the nest appear imperfect. However, if the guard is
considered to
>> be part of the inner loop, it is easier to prove a perfect nest.
>>
>> Both of these examples have alternative ways to deal with the loop
guards, if
>> they are present. However, I would like to have the discussion
regarding making
>> guards part of the canonical loop structure. If people are supportive
of this,
>> we can start doing the work to implement the guards and start modify
loop
>> optimizations to deal with them appropriately. However, if this is not
the
>> direction we want to go, we can deal with the guards in the specific
>> optimizations where they are relevant.
>>
>> Proposed Solution:
>> ------------------
>> Ensure that all loops in canonical form have a guard branch that
dominates the
>> preheader. If such a guard branch does not exist, a trivial guard will
be
>> created.
>>
>> I have not looked at the implementation of this
>> extensively yet because I wanted feedback on this direction first.
However, my
>> initial thought is to modify LoopSimplify to add the concept of a loop
guard,
>> and provide similar guarantees that it currently provides for
preheaders and exit
>> blocks. Specifically, if a guard block is not found, one is created and
inserted
>> immediately before the loop preheader to guarantee a structure similar
to the
>> one in the example above. Note that as with the preheader and exit
blocks, it is
>> possible that subsequent passes (e.g., SimplifyCFG) may remove these
blocks thus
>> taking it out of canonical form. However, this is an existing situation
that
>> loop optimizations already deal with.
>>
>>
>>
>> Any comments/feedback are greatly appreciated.
>>
>> Thanks,
>>
>> Kit
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Kit Barton via llvm-dev

2019-May-28 23:43 UTC

head link

[llvm-dev] Making loop guards part of canonical loop structure

Hi Hal,

You're right, for perfect loop nests I think simply knowing that the
inner loop has a guard will make things easier. This ties in with
Philip's earlier reply, that being able to reliably identify guarded
loops is a good first step.

Your comments about continually modifying the IR and thus invalidating
analysis makes a lot of sense. I just replied to the list with a summary
of a discussion that Philip and I had off the list (because I forgot to
use reply-all) and we're thinking to start by focusing on accurately
identifying existing guards before trying to add new (unnecessary)
guards. Please let me know if you're OK with this direction to start.

Kit


"Finkel, Hal J." <hfinkel at anl.gov> writes:
> On 5/28/19 8:46 AM, Kit Barton via llvm-dev wrote:
>> Hi all,
>>
>> TL;DR: Should the loop guard branch be part of the canonical loop
structure?
>>
>> Background:
>> -----------
>> While working on the next set of improvements to loop fusion, we
discovered that
>> loop rotation will put a guard around a loop if it cannot prove the
loop will
>> execute at least once, as seen in the following (simplified) example:
>>
>> entry:
>>    br i1 %cmp1, label %for.body.lr.ph, label %for.cond.cleanup
>>
>> for.body.lr.ph:                                   ; preds = %entry
>>    br label %for.body
>>
>> for.body:                                         ; preds =
%for.body.lr.ph, %for.inc
>>    br label %for.inc
>>
>> for.inc:                                          ; preds = %for.body
>>    br i1 %cmp, label %for.body, label
%for.cond.for.cond.cleanup_crit_edge
>>
>> for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.inc
>>    br label %for.cond.cleanup
>>
>> for.cond.cleanup:                                 ; preds =
%for.cond.for.cond.cleanup_crit_edge, %entry
>>    br label %for.end
>>
>> for.end:                                          ; preds =
%for.cond.cleanup
>>    ret void
>>
>> However, for loops that can be proven to execute at least once, there
is no
>> guard.
>>
>> This creates complications for several loop optimizations because they
need to
>> handle both guarded loops and non-guarded loops. For example, the
current loop
>> fusion pass needs to check whether two loops are control flow
equivalent before
>> fusing them (i.e., if the first loop executes, the second loop is
guaranteed to
>> execute). This is currently done by checking that the preheader of the
first
>> loop dominates the preheader of the second loop, and the preheader of
the second
>> loop post-dominates the preheader of the first loop. When one (or both)
of the
>> loops have a guard, then this check no longer works. If the loop guard
was part
>> of the canonical form, then this check could be done using the loop
guard,
>> instead of the preheader.
>>
>> Similarly, when checking for perfect loop nests, a loop guard around
the inner
>> loop will make the nest appear imperfect. However, if the guard is
considered to
>> be part of the inner loop, it is easier to prove a perfect nest.
>>
>> Both of these examples have alternative ways to deal with the loop
guards, if
>> they are present. However, I would like to have the discussion
regarding making
>> guards part of the canonical loop structure. If people are supportive
of this,
>> we can start doing the work to implement the guards and start modify
loop
>> optimizations to deal with them appropriately. However, if this is not
the
>> direction we want to go, we can deal with the guards in the specific
>> optimizations where they are relevant.
>>
>> Proposed Solution:
>> ------------------
>> Ensure that all loops in canonical form have a guard branch that
dominates the
>> preheader. If such a guard branch does not exist, a trivial guard will
be
>> created.
>>
>> I have not looked at the implementation of this
>> extensively yet because I wanted feedback on this direction first.
However, my
>> initial thought is to modify LoopSimplify to add the concept of a loop
guard,
>> and provide similar guarantees that it currently provides for
preheaders and exit
>> blocks. Specifically, if a guard block is not found, one is created and
inserted
>> immediately before the loop preheader to guarantee a structure similar
to the
>> one in the example above. Note that as with the preheader and exit
blocks, it is
>> possible that subsequent passes (e.g., SimplifyCFG) may remove these
blocks thus
>> taking it out of canonical form. However, this is an existing situation
that
>> loop optimizations already deal with.
>
>
> Hi, Kit,
>
> Maybe I'm missing something obvious, but how does this help? I 
> understand how guards can make checking for perfect nesting, etc. 
> harder, but does always having guards make this easier?
>
> Also, we already have a situation with LCSSA form where trivial 
> constructs are introduced, and then eliminated by local simplifications, 
> sometimes multiple times as the pipeline progresses. Having more of them 
> seems undesirable - it leads to more IR changes, which causes more 
> analysis invalidation, and so on.
>
> Thanks again,
>
> Hal
>
>
>>
>>
>>
>> Any comments/feedback are greatly appreciated.
>>
>> Thanks,
>>
>> Kit
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

David Greene via llvm-dev

2019-May-29 14:04 UTC

head link

[llvm-dev] Making loop guards part of canonical loop structure

This seems like potentially a good thing.  There have been cases where
I've wanted to know the guard associated with the loop.

Loop versioning can create many guards, including guards nested within
other guards.  I don't know how/if we want to capture that level of
complexity, but it was the first question that popped into my mind.

                       -David

Kit Barton via llvm-dev <llvm-dev at lists.llvm.org> writes:
> Hi all,
>
> TL;DR: Should the loop guard branch be part of the canonical loop
structure?
>
> Background:
> -----------
> While working on the next set of improvements to loop fusion, we discovered
that
> loop rotation will put a guard around a loop if it cannot prove the loop
will
> execute at least once, as seen in the following (simplified) example:
>
> entry:
>   br i1 %cmp1, label %for.body.lr.ph, label %for.cond.cleanup
>
> for.body.lr.ph:                                   ; preds = %entry
>   br label %for.body
>
> for.body:                                         ; preds =
%for.body.lr.ph, %for.inc
>   br label %for.inc
>
> for.inc:                                          ; preds = %for.body
>   br i1 %cmp, label %for.body, label %for.cond.for.cond.cleanup_crit_edge
>
> for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.inc
>   br label %for.cond.cleanup
>
> for.cond.cleanup:                                 ; preds =
%for.cond.for.cond.cleanup_crit_edge, %entry
>   br label %for.end
>
> for.end:                                          ; preds =
%for.cond.cleanup
>   ret void
>
> However, for loops that can be proven to execute at least once, there is no
> guard.
>
> This creates complications for several loop optimizations because they need
to
> handle both guarded loops and non-guarded loops. For example, the current
loop
> fusion pass needs to check whether two loops are control flow equivalent
before
> fusing them (i.e., if the first loop executes, the second loop is
guaranteed to
> execute). This is currently done by checking that the preheader of the
first
> loop dominates the preheader of the second loop, and the preheader of the
second
> loop post-dominates the preheader of the first loop. When one (or both) of
the
> loops have a guard, then this check no longer works. If the loop guard was
part
> of the canonical form, then this check could be done using the loop guard,
> instead of the preheader.
>
> Similarly, when checking for perfect loop nests, a loop guard around the
inner
> loop will make the nest appear imperfect. However, if the guard is
considered to
> be part of the inner loop, it is easier to prove a perfect nest.
>
> Both of these examples have alternative ways to deal with the loop guards,
if
> they are present. However, I would like to have the discussion regarding
making
> guards part of the canonical loop structure. If people are supportive of
this,
> we can start doing the work to implement the guards and start modify loop
> optimizations to deal with them appropriately. However, if this is not the
> direction we want to go, we can deal with the guards in the specific
> optimizations where they are relevant.
>
> Proposed Solution:
> ------------------
> Ensure that all loops in canonical form have a guard branch that dominates
the
> preheader. If such a guard branch does not exist, a trivial guard will be
> created.
>
> I have not looked at the implementation of this
> extensively yet because I wanted feedback on this direction first. However,
my
> initial thought is to modify LoopSimplify to add the concept of a loop
guard,
> and provide similar guarantees that it currently provides for preheaders
and exit
> blocks. Specifically, if a guard block is not found, one is created and
inserted
> immediately before the loop preheader to guarantee a structure similar to
the
> one in the example above. Note that as with the preheader and exit blocks,
it is
> possible that subsequent passes (e.g., SimplifyCFG) may remove these blocks
thus
> taking it out of canonical form. However, this is an existing situation
that
> loop optimizations already deal with.
>
>
>
> Any comments/feedback are greatly appreciated.
>
> Thanks,
>
> Kit
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Krzysztof Parzyszek via llvm-dev

2019-May-30 18:28 UTC

head link

[llvm-dev] Making loop guards part of canonical loop structure

On Hexagon, unguarded loops cannot be converted to hardware loops.

If the loop's latch branch alone handles the iteration count, including the
possibility of 0, then the loop cannot be converted to a hardware loop, because
hardware loops must iterate at least once.  If the entire loop is guarded
against zero iteration count, we can put the loop setup in the preheader, since
at that point the loop is guaranteed to execute at least once.

I am strongly in favor of having some way to create loop guards, even if they
are trivial.

-- 
Krzysztof Parzyszek  kparzysz at quicinc.com   LLVM compiler development

-----Original Message-----
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Finkel,
Hal J. via llvm-dev
Sent: Tuesday, May 28, 2019 4:02 PM
To: Kit Barton <kit.barton at gmail.com>; LLVM Dev Mailing List
<llvm-dev at lists.llvm.org>
Subject: [EXT] Re: [llvm-dev] Making loop guards part of canonical loop
structure


On 5/28/19 8:46 AM, Kit Barton via llvm-dev wrote:> Hi all,
>
> TL;DR: Should the loop guard branch be part of the canonical loop
structure?
>
> Background:
> -----------
> While working on the next set of improvements to loop fusion, we 
> discovered that loop rotation will put a guard around a loop if it 
> cannot prove the loop will execute at least once, as seen in the following
(simplified) example:
>
> entry:
>    br i1 %cmp1, label %for.body.lr.ph, label %for.cond.cleanup
>
> for.body.lr.ph:                                   ; preds = %entry
>    br label %for.body
>
> for.body:                                         ; preds =
%for.body.lr.ph, %for.inc
>    br label %for.inc
>
> for.inc:                                          ; preds = %for.body
>    br i1 %cmp, label %for.body, label 
> %for.cond.for.cond.cleanup_crit_edge
>
> for.cond.for.cond.cleanup_crit_edge:              ; preds = %for.inc
>    br label %for.cond.cleanup
>
> for.cond.cleanup:                                 ; preds =
%for.cond.for.cond.cleanup_crit_edge, %entry
>    br label %for.end
>
> for.end:                                          ; preds =
%for.cond.cleanup
>    ret void
>
> However, for loops that can be proven to execute at least once, there 
> is no guard.
>
> This creates complications for several loop optimizations because they 
> need to handle both guarded loops and non-guarded loops. For example, 
> the current loop fusion pass needs to check whether two loops are 
> control flow equivalent before fusing them (i.e., if the first loop 
> executes, the second loop is guaranteed to execute). This is currently 
> done by checking that the preheader of the first loop dominates the 
> preheader of the second loop, and the preheader of the second loop 
> post-dominates the preheader of the first loop. When one (or both) of 
> the loops have a guard, then this check no longer works. If the loop 
> guard was part of the canonical form, then this check could be done using
the loop guard, instead of the preheader.
>
> Similarly, when checking for perfect loop nests, a loop guard around 
> the inner loop will make the nest appear imperfect. However, if the 
> guard is considered to be part of the inner loop, it is easier to prove a
perfect nest.
>
> Both of these examples have alternative ways to deal with the loop 
> guards, if they are present. However, I would like to have the 
> discussion regarding making guards part of the canonical loop 
> structure. If people are supportive of this, we can start doing the 
> work to implement the guards and start modify loop optimizations to 
> deal with them appropriately. However, if this is not the direction we 
> want to go, we can deal with the guards in the specific optimizations where
they are relevant.
>
> Proposed Solution:
> ------------------
> Ensure that all loops in canonical form have a guard branch that 
> dominates the preheader. If such a guard branch does not exist, a 
> trivial guard will be created.
>
> I have not looked at the implementation of this extensively yet 
> because I wanted feedback on this direction first. However, my initial 
> thought is to modify LoopSimplify to add the concept of a loop guard, 
> and provide similar guarantees that it currently provides for 
> preheaders and exit blocks. Specifically, if a guard block is not 
> found, one is created and inserted immediately before the loop 
> preheader to guarantee a structure similar to the one in the example 
> above. Note that as with the preheader and exit blocks, it is possible 
> that subsequent passes (e.g., SimplifyCFG) may remove these blocks 
> thus taking it out of canonical form. However, this is an existing
situation that loop optimizations already deal with.

Hi, Kit,

Maybe I'm missing something obvious, but how does this help? I understand
how guards can make checking for perfect nesting, etc.
harder, but does always having guards make this easier?

Also, we already have a situation with LCSSA form where trivial constructs are
introduced, and then eliminated by local simplifications, sometimes multiple
times as the pipeline progresses. Having more of them seems undesirable - it
leads to more IR changes, which causes more analysis invalidation, and so on.

Thanks again,

Hal

>
>
>
> Any comments/feedback are greatly appreciated.
>
> Thanks,
>
> Kit
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
--
Hal Finkel
Lead, Compiler Technology and Programming Languages Leadership Computing
Facility Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

llvm dev - May 2019 - Making loop guards part of canonical loop structure

[llvm-dev] Making loop guards part of canonical loop structure

[llvm-dev] Making loop guards part of canonical loop structure

[llvm-dev] Making loop guards part of canonical loop structure

[llvm-dev] Making loop guards part of canonical loop structure

[llvm-dev] Making loop guards part of canonical loop structure

[llvm-dev] Making loop guards part of canonical loop structure

[llvm-dev] Making loop guards part of canonical loop structure