thr3ads.net - llvm dev - [llvm-dev] CFG manipulation and !llvm.loop metadata [Mar 2020]

If this information is useful, please help other people find it:
Share via:

Colin McEwan via llvm-dev

2020-Mar-20 17:37 UTC

[llvm-dev] CFG manipulation and !llvm.loop metadata

An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200320/34cdec77/attachment.html>
-------------- next part --------------
Hi all,

I have encountered some issues with the preservation of the location of
llvm.loop metadata (containing optimisation hints), and would appreciate some
feedback on the issue.

The IR language description states that llvm.loop metadata is attached to
terminator of a loop latch block, and accordingly Loop::getLoopID() searches for
it in all loop latches (and only successfully finds it if all latches reference
the same metadata)

However, transforms which modify the CFG, for example using SplitCriticalEdge(),
generally don't make any attempt to preserve this property. Some transforms
dealing specifically with loops use getLoopID and setLoopID to preserve and
reset the metadata after transformations, but function transforms such as GVN
and Jump Threading can modify control flow without any attempt to update the
location.

For example:

        preheader:
                ...
        loop.body: ; preds = %preheader, %loop.body
                ...
                br i1 %cmp, label %loop.body, %loop.exit, !llvm.loop !123
        loop.exit:

If a pass needs to split the critical edge from %loop.body -> %loop.body, it
will create something like

        preheader:
                ...
        loop.body: ; preds = %preheader, %loop.body.crit_edge
                ...
                br i1 %cmp, label %loop.body.crit_edge, %loop.exit, !llvm.loop
!123   // No longer a loop latch block
        loop.body.crit_edge:
                ... 
                br %loop.body                                                   
// Now a loop latch, with no !llvm.loop
        loop.exit:


Now the loop's latch block is %loop.body.crit_edge, and the !llvm.loop will
not be found by Loop::getLoopID(), meaning it is effectively lost. This can
happen in many different places.


I can think of a few approaches to address this, but each has its issues:

    1. Modify the framework's CFG manipulation tools to maintain the
location of the !llvm.loop. A major drawback of this is that LoopInfo is needed
to be able to tell whether a block is a loop latch or not (in the example of
splitting a latch block's back edge, we need LoopInfo to know whether the
edge we're splitting is the latch edge or exit edge) and it's not
necessary available or up to date when we manipulate the CFG.

    2. Have Loop::getLoopID() search other blocks in the loop for metadata. This
has potential compile-time implications, and would change the IR language
definition of the !llvm.loop as potentially existing (in a valid form) anywhere
in the loop.

    3. Fixup utility functions for function passes to use, to search a loop and
move any errant !llvm.loop to the latch block(s) of its loop.


Additionally, it should probably be explicitly stated in the IR language
reference that !llvm.loop preservation is best-effort and may be lost.

Does anyone have any opinions or other insight?


Many thanks,
-- 
Colin

Hiroshi Yamauchi via llvm-dev

2020-Mar-23 16:53 UTC

head link

[llvm-dev] CFG manipulation and !llvm.loop metadata

I see similar (potential) issues in other metadata like branch weights or
updating analysis results like BFI.

I don't have a solution but I suspect some sort of verification/checks
and/or enforcement through the API might be needed to get it right.


On Fri, Mar 20, 2020 at 10:47 AM Colin McEwan via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> I have encountered some issues with the preservation of the location of
> llvm.loop metadata (containing optimisation hints), and would appreciate
> some feedback on the issue.
>
> The IR language description states that llvm.loop metadata is attached to
> terminator of a loop latch block, and accordingly Loop::getLoopID()
> searches for it in all loop latches (and only successfully finds it if all
> latches reference the same metadata)
>
> However, transforms which modify the CFG, for example using
> SplitCriticalEdge(), generally don't make any attempt to preserve this
> property. Some transforms dealing specifically with loops use getLoopID and
> setLoopID to preserve and reset the metadata after transformations, but
> function transforms such as GVN and Jump Threading can modify control flow
> without any attempt to update the location.
>
> For example:
>
>         preheader:
>                 ...
>         loop.body: ; preds = %preheader, %loop.body
>                 ...
>                 br i1 %cmp, label %loop.body, %loop.exit, !llvm.loop !123
>         loop.exit:
>
> If a pass needs to split the critical edge from %loop.body ->
%loop.body,
> it will create something like
>
>         preheader:
>                 ...
>         loop.body: ; preds = %preheader, %loop.body.crit_edge
>                 ...
>                 br i1 %cmp, label %loop.body.crit_edge, %loop.exit,
> !llvm.loop !123   // No longer a loop latch block
>         loop.body.crit_edge:
>                 ...
>                 br %loop.body
>            // Now a loop latch, with no !llvm.loop
>         loop.exit:
>
>
> Now the loop's latch block is %loop.body.crit_edge, and the !llvm.loop
> will not be found by Loop::getLoopID(), meaning it is effectively lost.
> This can happen in many different places.
>
>
> I can think of a few approaches to address this, but each has its issues:
>
>     1. Modify the framework's CFG manipulation tools to maintain the
> location of the !llvm.loop. A major drawback of this is that LoopInfo is
> needed to be able to tell whether a block is a loop latch or not (in the
> example of splitting a latch block's back edge, we need LoopInfo to
know
> whether the edge we're splitting is the latch edge or exit edge) and
it's
> not necessary available or up to date when we manipulate the CFG.
>
>     2. Have Loop::getLoopID() search other blocks in the loop for
> metadata. This has potential compile-time implications, and would change
> the IR language definition of the !llvm.loop as potentially existing (in a
> valid form) anywhere in the loop.
>
>     3. Fixup utility functions for function passes to use, to search a
> loop and move any errant !llvm.loop to the latch block(s) of its loop.
>
>
> Additionally, it should probably be explicitly stated in the IR language
> reference that !llvm.loop preservation is best-effort and may be lost.
>
> Does anyone have any opinions or other insight?
>
>
> Many thanks,
> --
> Colin
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200323/95988b86/attachment.html>

Johannes Doerfert via llvm-dev

2020-Mar-25 06:36 UTC

head link

[llvm-dev] CFG manipulation and !llvm.loop metadata

Hi Colin,

you might want to look at D5344 on phabricator. Also Michael (CC'ed) is
a good person to talk to about these issues. He has done a lot of work
on loop transformation and metadata for loop transformations.

Cheers,
   Johannes


On 3/23/20 11:53 AM, Hiroshi Yamauchi via llvm-dev wrote:
 > I see similar (potential) issues in other metadata like branch weights or
 > updating analysis results like BFI.
 >
 > I don't have a solution but I suspect some sort of verification/checks
 > and/or enforcement through the API might be needed to get it right.
 >
 >
 > On Fri, Mar 20, 2020 at 10:47 AM Colin McEwan via llvm-dev <
 > llvm-dev at lists.llvm.org> wrote:
 >
 >> Hi all,
 >>
 >> I have encountered some issues with the preservation of the location
of
 >> llvm.loop metadata (containing optimisation hints), and would
appreciate
 >> some feedback on the issue.
 >>
 >> The IR language description states that llvm.loop metadata is 
attached to
 >> terminator of a loop latch block, and accordingly Loop::getLoopID()
 >> searches for it in all loop latches (and only successfully finds it 
if all
 >> latches reference the same metadata)
 >>
 >> However, transforms which modify the CFG, for example using
 >> SplitCriticalEdge(), generally don't make any attempt to preserve
this
 >> property. Some transforms dealing specifically with loops use 
getLoopID and
 >> setLoopID to preserve and reset the metadata after transformations,
but
 >> function transforms such as GVN and Jump Threading can modify 
control flow
 >> without any attempt to update the location.
 >>
 >> For example:
 >>
 >>         preheader:
 >>                 ...
 >>         loop.body: ; preds = %preheader, %loop.body
 >>                 ...
 >>                 br i1 %cmp, label %loop.body, %loop.exit, !llvm.loop 
!123
 >>         loop.exit:
 >>
 >> If a pass needs to split the critical edge from %loop.body -> 
%loop.body,
 >> it will create something like
 >>
 >>         preheader:
 >>                 ...
 >>         loop.body: ; preds = %preheader, %loop.body.crit_edge
 >>                 ...
 >>                 br i1 %cmp, label %loop.body.crit_edge, %loop.exit,
 >> !llvm.loop !123   // No longer a loop latch block
 >>         loop.body.crit_edge:
 >>                 ...
 >>                 br %loop.body
 >>            // Now a loop latch, with no !llvm.loop
 >>         loop.exit:
 >>
 >>
 >> Now the loop's latch block is %loop.body.crit_edge, and the
!llvm.loop
 >> will not be found by Loop::getLoopID(), meaning it is effectively
lost.
 >> This can happen in many different places.
 >>
 >>
 >> I can think of a few approaches to address this, but each has its 
issues:
 >>
 >>     1. Modify the framework's CFG manipulation tools to maintain
the
 >> location of the !llvm.loop. A major drawback of this is that LoopInfo
is
 >> needed to be able to tell whether a block is a loop latch or not (in
the
 >> example of splitting a latch block's back edge, we need LoopInfo
to know
 >> whether the edge we're splitting is the latch edge or exit edge)
and
it's
 >> not necessary available or up to date when we manipulate the CFG.
 >>
 >>     2. Have Loop::getLoopID() search other blocks in the loop for
 >> metadata. This has potential compile-time implications, and would
change
 >> the IR language definition of the !llvm.loop as potentially existing 
(in a
 >> valid form) anywhere in the loop.
 >>
 >>     3. Fixup utility functions for function passes to use, to search a
 >> loop and move any errant !llvm.loop to the latch block(s) of its loop.
 >>
 >>
 >> Additionally, it should probably be explicitly stated in the IR
language
 >> reference that !llvm.loop preservation is best-effort and may be lost.
 >>
 >> Does anyone have any opinions or other insight?
 >>
 >>
 >> Many thanks,
 >> --
 >> Colin
 >> _______________________________________________
 >> LLVM Developers mailing list
 >> llvm-dev at lists.llvm.org
 >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
 >>
 >
 >
 > _______________________________________________
 > LLVM Developers mailing list
 > llvm-dev at lists.llvm.org
 > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Florian Hahn via llvm-dev

2020-Mar-25 10:51 UTC

head link

[llvm-dev] CFG manipulation and !llvm.loop metadata

> On Mar 20, 2020, at 17:37, Colin McEwan via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Hi all,
> 
> I have encountered some issues with the preservation of the location of
llvm.loop metadata (containing optimisation hints), and would appreciate some
feedback on the issue.
> 
> The IR language description states that llvm.loop metadata is attached to
terminator of a loop latch block, and accordingly Loop::getLoopID() searches for
it in all loop latches (and only successfully finds it if all latches reference
the same metadata)
> 
> However, transforms which modify the CFG, for example using
SplitCriticalEdge(), generally don't make any attempt to preserve this
property. Some transforms dealing specifically with loops use getLoopID and
setLoopID to preserve and reset the metadata after transformations, but function
transforms such as GVN and Jump Threading can modify control flow without any
attempt to update the location.
> 
> For example:
> 
>        preheader:
>                ...
>        loop.body: ; preds = %preheader, %loop.body
>                ...
>                br i1 %cmp, label %loop.body, %loop.exit, !llvm.loop !123
>        loop.exit:
> 
> If a pass needs to split the critical edge from %loop.body ->
%loop.body, it will create something like
> 
>        preheader:
>                ...
>        loop.body: ; preds = %preheader, %loop.body.crit_edge
>                ...
>                br i1 %cmp, label %loop.body.crit_edge, %loop.exit,
!llvm.loop !123   // No longer a loop latch block
>        loop.body.crit_edge:
>                ... 
>                br %loop.body                                               
// Now a loop latch, with no !llvm.loop
>        loop.exit:
> 
> 
> Now the loop's latch block is %loop.body.crit_edge, and the !llvm.loop
will not be found by Loop::getLoopID(), meaning it is effectively lost. This can
happen in many different places.
> 
> 
> I can think of a few approaches to address this, but each has its issues:
> 
>    1. Modify the framework's CFG manipulation tools to maintain the
location of the !llvm.loop. A major drawback of this is that LoopInfo is needed
to be able to tell whether a block is a loop latch or not (in the example of
splitting a latch block's back edge, we need LoopInfo to know whether the
edge we're splitting is the latch edge or exit edge) and it's not
necessary available or up to date when we manipulate the CFG.
> 
>    2. Have Loop::getLoopID() search other blocks in the loop for metadata.
This has potential compile-time implications, and would change the IR language
definition of the !llvm.loop as potentially existing (in a valid form) anywhere
in the loop.
> 
>    3. Fixup utility functions for function passes to use, to search a loop
and move any errant !llvm.loop to the latch block(s) of its loop.
> 
> 
> Additionally, it should probably be explicitly stated in the IR language
reference that !llvm.loop preservation is best-effort and may be lost.
The LangRef spells out that transformations are required to drop all metadata
they do not know how to preserve
(https://llvm.org/docs/LangRef.html#metadata-nodes-and-metadata-strings). As you
mentioned, some utilities know how to preserve certain kinds of metadata, but
various places conservatively drop metadata. I expect we need to fix a lot of
places.

A related issue is preserving debug metadata and a lot of work has gone into
that area already.  A good first step might be a verifier that checks if a pass
drops !llvm.loop. That could look something like:

1. Create a Loopify pass that adds !llvm.loop for each loop in a function
(similar to Debugify
https://github.com/llvm/llvm-project/blob/f64e457cb75b61f6566de8327a1bfae498d5a296/llvm/lib/Transforms/Utils/Debugify.cpp)
2. Create a verifier that checks all loops have !llvm.loop metadata
3. Run -loopify before each transformation and the verifier afterwards
4. Fix issues.

Passes that create new loops might need special handling (or an option to also
automatically attach llvm.loop metadata to the new loops).

Cheers,
Florian

Colin McEwan via llvm-dev

2020-Mar-25 13:31 UTC

head link

[llvm-dev] Metadata [[ CFG manipulation and !llvm.loop metadata ]]

An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200325/4a925c0b/attachment-0001.html>
-------------- next part --------------> > Additionally, it should probably be explicitly stated in the IR
> language reference that !llvm.loop preservation is best-effort and may
> be lost.
> 
> The LangRef spells out that transformations are required to drop all
> metadata they do not know how to preserve
> (https://llvm.org/docs/LangRef.html#metadata-nodes-and-metadata-
> strings). As you mentioned, some utilities know how to preserve certain
> kinds of metadata, but various places conservatively drop metadata. I
> expect we need to fix a lot of places.
I had somehow missed that text until discovering it the other day, and I'm
increasingly concerned about the approach. I haven't seen many
transformations that explicitly take steps to remove metadata that they
don't understand. It's unclear what the scope of metadata that would
have to be invalidated might be -- any instruction touched is at least easily
within scope, but should a transform which modifies control flow clear the
metadata of every instruction in a modified block? It will depend on the
semantics implied by the metadata.

If transforms were to implement this more rigorously, it might lead to excessive
loss of optimisation hint metadata, or require changes to many existing
transforms, for each type of metadata that could be affected.
> A related issue is preserving debug metadata and a lot of work has gone
> into that area already.  A good first step might be a verifier that
> checks if a pass drops !llvm.loop. That could look something like:
> 
> 1. Create a Loopify pass that adds !llvm.loop for each loop in a
> function (similar to Debugify https://github.com/llvm/llvm-
> project/blob/f64e457cb75b61f6566de8327a1bfae498d5a296/llvm/lib/Transfor
> ms/Utils/Debugify.cpp)
> 2. Create a verifier that checks all loops have !llvm.loop metadata 3.
> Run -loopify before each transformation and the verifier afterwards 4.
> Fix issues.
> 
> Passes that create new loops might need special handling (or an option
> to also automatically attach llvm.loop metadata to the new loops).
Thanks, this is similar to the path I'm considering heading down at the
moment for our internal purposes.


I think it's worth considering creating an interface to handle potential
changes to validity of metadata, which can be called by passes or the framework.
This would keep knowledge of the continued validity of metadata in a common
place rather than spread across transforms, so create a separation of concerns
between IR modification and metadata validity, and would simplify enforcing
constraints on any new metadata.

Michael Kruse via llvm-dev

2020-Mar-25 15:07 UTC

head link

[llvm-dev] CFG manipulation and !llvm.loop metadata

When directives such as '#pragma clang loop vectorize(enable)' were
implemented, it needed to attach its metadata somewhere, and the loop
latch was considered the most stable between passes. Strictly
speaking, LLVM metadata does not exactly match the required semantics,
since preserving metadata is just best effort, but e.g.directive such
as '#pragma omp simd' are not allowed to be just dropped. If the
compiler is not able to vectorize the loop, it should at least issue a
warning diagnostic. That is, we intend for fix all the bugs where the
loop metadata is lost.

Am Fr., 20. März 2020 um 12:48 Uhr schrieb Colin McEwan via llvm-dev
<llvm-dev at lists.llvm.org>:> 1. Modify the framework's CFG manipulation tools to maintain the
location of the !llvm.loop. A major drawback of this is that LoopInfo is needed
to be able to tell whether a block is a loop latch or not (in the example of
splitting a latch block's back edge, we need LoopInfo to know whether the
edge we're splitting is the latch edge or exit edge) and it's not
necessary available or up to date when we manipulate the CFG.
This is a good idea whenever LoopInfo is available.

If not there's still the possibility to speculatively copy the
llvm.loop to both BBs, in case a backedge is split. One of the copies
is redundant and should be ignored. I am not sure what the chance of
it to be picked up by a different loop is.
> 2. Have Loop::getLoopID() search other blocks in the loop for metadata.
This has potential compile-time implications, and would change the IR language
definition of the !llvm.loop as potentially existing (in a valid form) anywhere
in the loop.
I don't think this is robust. The search might pick up loop metadata
intended for other loops.

> 3. Fixup utility functions for function passes to use, to search a loop
and move any errant !llvm.loop to the latch block(s) of its loop.
Isn't this the same as your item 1?

> Additionally, it should probably be explicitly stated in the IR language
reference that !llvm.loop preservation is best-effort and may be lost.
Feel free to create a patch and add me as a reviewer.

Michael

llvm dev - Mar 2020 - CFG manipulation and !llvm.loop metadata

[llvm-dev] CFG manipulation and !llvm.loop metadata

[llvm-dev] CFG manipulation and !llvm.loop metadata

[llvm-dev] CFG manipulation and !llvm.loop metadata

[llvm-dev] CFG manipulation and !llvm.loop metadata

[llvm-dev] Metadata [[ CFG manipulation and !llvm.loop metadata ]]

[llvm-dev] CFG manipulation and !llvm.loop metadata