thr3ads.net - llvm dev - [llvm-dev] !llvm.loop ID metadata clarification [Feb 2021]

If this information is useful, please help other people find it:
Share via:

Mehdi AMINI via llvm-dev

2021-Feb-19 01:11 UTC

[llvm-dev] !llvm.loop ID metadata clarification

Hi,

Our frontend (MLIR ;)) is emitting loops into LLVM IR and adding support to
emit the !loop metadatas as well:
https://llvm.org/docs/LangRef.html#llvm-loop
First as an example the metadata look like this:
https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/LoopUnroll/unroll-pragmas.ll#L58-L59

!1 = !{!1, !2}
!2 = !{!"llvm.loop.unroll.disable"}}

The !1 will be attached to all "latches" (back edges into the loop
header)
branches:

  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !1


I'm a bit confused by a few things with this metadata at the moment. First
the self-referencing of the metadata is called out in LangRef:
> For legacy reasons, the first item of a loop metadata node must be areference to itself. Before the advent of the ‘distinct’ keyword, this
forced the preservation of otherwise identical metadata nodes. Since the
loop-metadata node can be attached to multiple nodes, the ‘distinct’
keyword has become unnecessary.

The last sentence isn't clear to me, I don't see the relationship
between
"the loop-metadata node can be attached to multiple nodes" and
"the
‘distinct’ keyword has become unnecessary"?

Also It isn't clear why we don't use a distinct metadata instead of a
self-referencing one?

Then LangRef mentions:
> Loop metadata nodes cannot be used as unique identifiers. They areneither persistent for the same loop through transformations nor
necessarily unique to just one loop.

So the metadata itself can't be used as a unique identifier for the loop,
however in a loop all latches must have the same MDNode for LoopInfo to
recognize it as valid.

It isn't clear to me why it prevents using unique metadata here without the
self-referencing trick, and rely on content-based uniquing for equality?
Loop info could still rely on pointer-equality to ensure that the metadata
is consistent on every backedge, two loops with the same metadata would
share the same MDNode, but it shouldn't be an issue since these aren't
used
as unique identifiers per LangRef.

Thanks,

-- 
Mehdi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210218/f78ea8d4/attachment.html>

Michael Kruse via llvm-dev

2021-Feb-19 06:56 UTC

head link

[llvm-dev] !llvm.loop ID metadata clarification

I wrote the lines in the LangRef and can answer the questions.

When LoopIDs were first conceived, the idea was that each loop has its
own identifier (so you could, e.g. build a DenseMap to information for
a loop). To avoid that the MDNode of two different would be collapsed
into one, the design made the first item of the MDNode reference
itself. This worked because the uniquifier did not do a GVN or
similar, just compared the content (IMHO still a bad idea, the
uniquing algorithm could be improved later). Later, 'distinct' MDNodes
were introduced, but by then code exists that depend on this. Either
because they check for valid LoopID MDNodes eg. in an assert, skip the
first element in a LoopID MDNode, or, in case of llvm.access.group
(which allows either be a single LoopID or a list of LoopIDs) only
handle the list of LoopIDs case because if it;s only a single LoopID,
the MDNode again contains the self-reference LoopID.


There is a problem with using a LoopID as "identifier":

1. The MDNode changes when changing the loop properties, e.g.
LoopVectorize adds llvm.loop.isvectorized to loops it vectorize and
loops that it determined to be not vectorizable to avoid processing it
again. Hence, LoopIDs are not persistent.

2. There is some code that makes copies of code, such as LoopUnroll,
inlining or LoopVersioning. These copy all the instructions including
the MDNodes and you get multiple loops with the same LoopID. There is
even a LoopUnroll test case that checks that. Hence, a LoopID is not
unique.

Therefore, a LoopID does not fulfill the basic requirements to
identify a loop. Instead, we went to interpret the MDNode as a bag of
loop properties which can be changed.


With this reinterpretation, I did not change the name 'LoopID' nor
removed the self-reference to avoid breaking things.



Am Do., 18. Feb. 2021 um 19:12 Uhr schrieb Mehdi AMINI via llvm-dev
<llvm-dev at lists.llvm.org>:> The last sentence isn't clear to me, I don't see the relationship
between "the loop-metadata node can be attached to multiple nodes" and
"the ‘distinct’ keyword has become unnecessary"?
See above. The distinct/self-reference was meant to stop the same
LoopID to be assigned to multiple loops by the MDNode uniquifier.
However, there are other mechanisms that assign the same LoopID to two
different loops.

> Also It isn't clear why we don't use a distinct metadata instead of
a self-referencing one?
To not break existing code that assumes the self-reference.

>
> Then LangRef mentions:
>
> > Loop metadata nodes cannot be used as unique identifiers. They are
neither persistent for the same loop through transformations nor necessarily
unique to just one loop.
>
> So the metadata itself can't be used as a unique identifier for the
loop, however in a loop all latches must have the same MDNode for LoopInfo to
recognize it as valid.
A single loop can have multiple latches. These should be consistent,
otherwise we don't know which one is the correct information for this
loop. We could instead choose the first one encountered, but I don't
think its a solution for robustness. In contrast, metadata is defined
to always be able to be dropped without losing semantic correctes.

> It isn't clear to me why it prevents using unique metadata here without
the self-referencing trick, and rely on content-based uniquing for equality?
To not break existing code that assumes the self-reference.

> Loop info could still rely on pointer-equality to ensure that the metadata
is consistent on every backedge, two loops with the same metadata would share
the same MDNode, but it shouldn't be an issue since these aren't used as
unique identifiers per LangRef.
Correct.

However, it requires work to fix all the places that assumes the
self-reference, makes existing IR incompatible and may cause trouble
downstream. Hal Finkel and me found that this such a change is not
worth the risk.


Michael

llvm dev - Feb 2021 - !llvm.loop ID metadata clarification

[llvm-dev] !llvm.loop ID metadata clarification

[llvm-dev] !llvm.loop ID metadata clarification