thr3ads.net - llvm dev - [llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata) [Dec 2018]

If this information is useful, please help other people find it:
Share via:

Michael Kruse via llvm-dev

2018-Dec-05 00:50 UTC

[llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Dear LLVM community,

LLVM IR has a concept of 'LoopID' [1] which is a misnomer:

(a) LoopIDs are not unique: Any pass that duplicates IR will do it
including its metadata (e.g. LoopVersioning) such that thereafter
multiple loops are linked with the same LoopID. There is even a test
case (Transforms/LoopUnroll/unroll-pragmas-disabled.ll) for multiple
loops with the same LoopID.

(b) LoopIDs are not persistent: Adding or removing an item from a
LoopID can only be done by creating a new MDNode and assigning it to
the loop's branch(es). Passes such as LoopUnroll
(llvm.loop.unroll.disable) and LoopVectorize (llvm.loop.isvectorized)
use this to mark loops to be not transformed multiple times or to
avoid that a LoopVersioning original loop is transformed.

The only mechanism in LLVM that uses LoopIDs as identifiers is
llvm.mem.parallel_loop_access [2]. It is added to memory-accessing
instructions and points to one or multiple LoopIDs it is parallel to.
Because of (b), transformation passes can change a loop's llvm.loop
metadata such that llvm.mem.parallel_loop_access may point to an
orphaned LoopID.

Patch [3] fixes this issue. Instead of instructions referencing the
loops they are parallel to, it adds a new loop attribute
"llvm.loop.parallel_accesses" that references the instructions. Any
loop transformation that changes the loop metadata will by default
carry-over the llvm.loop.parallel_accesses property and the
information retained.

Since global metadata cannot reference instructions, [3] also
introduces a new distinct metadata type that represents a group of
instructions. An instruction will reference the a group using the
llvm.access.group metadata, and llvm.loop.parallel_accesses uses the
access group MDNode to reference to the instructions.

Currently, the metadata for parallel memory accesses looks like:

loop:
  load ... !llvm.mem.parallel_loop_access !0
  br label %loop, !llvm.loop !0

!0 = distinct !{!0}                                      ; 'LoopID'

My suggestion uses this pattern instead:

loop:
  load ... !llvm.access.group !2
  br label %loop, !llvm.loop !0

!0 = distinct !{!0, !1}                                  ; 'LoopID'
!1 = !{!"llvm.loop.parallel_accesses", !2}
!2 = distinct !{}                                        ; access group

Patch [4] modifies Clang to generate this kind of metadata. Accesses
in the same loop nest level are assigned to the same access group.


Fixing (a) and (b) instead would require much more work. For (a), any
code that copies entire basic blocks (this includes the inliner since
the same function can be inlined multiple times) would need to make a
copy of the LoopID and update any llvm.mem.parallel_loop_access in the
copy. For (b), in addition to MDNode::setOperand, we need a method to
change the MDNode's size without changing its address. Currently,
MDNode is optimized for an unchangeable number of operands.

I suggest to stop seeing 'LoopID' as an identifier entirely, but to
use it as a bag of loop attributes. Patch [5] updates the reference
manual. We may also think about renaming LoopID in the source code,
and dropping the 'distinct' and 'first item is self-reference'.
The
latter two were intended to avoid separate LoopIDs into into one
because their content is identical, i.e. uphold (a). As we have seen,
there are other reasons for loops to have identical LoopIDs. With
patches [3,4], llvm.loop metadata can be collapsed (unlike access
groups), thus the 'distinct' is not necessary anymore. Unfortunately,
there is code in LLVM (and maybe elsewhere) that depends on LoopIDs'
first item, i.e. we cannot get rid of it that easily.


Regards,
Michael


[1] https://llvm.org/docs/LangRef.html#llvm-loop
[2] https://llvm.org/docs/LangRef.html#llvm-mem-parallel-loop-access-metadata
[3] https://reviews.llvm.org/D52116
[4] https://reviews.llvm.org/D52117
[5] https://reviews.llvm.org/D55290

Finkel, Hal J. via llvm-dev

2018-Dec-12 16:10 UTC

head link

[llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

On 12/4/18 6:50 PM, Michael Kruse wrote:> Dear LLVM community,
>
> LLVM IR has a concept of 'LoopID' [1] which is a misnomer:
>
> (a) LoopIDs are not unique: Any pass that duplicates IR will do it
> including its metadata (e.g. LoopVersioning) such that thereafter
> multiple loops are linked with the same LoopID. There is even a test
> case (Transforms/LoopUnroll/unroll-pragmas-disabled.ll) for multiple
> loops with the same LoopID.
>
> (b) LoopIDs are not persistent: Adding or removing an item from a
> LoopID can only be done by creating a new MDNode and assigning it to
> the loop's branch(es). Passes such as LoopUnroll
> (llvm.loop.unroll.disable) and LoopVectorize (llvm.loop.isvectorized)
> use this to mark loops to be not transformed multiple times or to
> avoid that a LoopVersioning original loop is transformed.
>
> The only mechanism in LLVM that uses LoopIDs as identifiers is
> llvm.mem.parallel_loop_access [2]. It is added to memory-accessing
> instructions and points to one or multiple LoopIDs it is parallel to.
> Because of (b), transformation passes can change a loop's llvm.loop
> metadata such that llvm.mem.parallel_loop_access may point to an
> orphaned LoopID.
>
> Patch [3] fixes this issue. Instead of instructions referencing the
> loops they are parallel to, it adds a new loop attribute
> "llvm.loop.parallel_accesses" that references the instructions.
Any
> loop transformation that changes the loop metadata will by default
> carry-over the llvm.loop.parallel_accesses property and the
> information retained.
>
> Since global metadata cannot reference instructions, [3] also
> introduces a new distinct metadata type that represents a group of
> instructions. An instruction will reference the a group using the
> llvm.access.group metadata, and llvm.loop.parallel_accesses uses the
> access group MDNode to reference to the instructions.
>
> Currently, the metadata for parallel memory accesses looks like:
>
> loop:
>   load ... !llvm.mem.parallel_loop_access !0
>   br label %loop, !llvm.loop !0
>
> !0 = distinct !{!0}                                      ; 'LoopID'
>
> My suggestion uses this pattern instead:
>
> loop:
>   load ... !llvm.access.group !2
>   br label %loop, !llvm.loop !0
>
> !0 = distinct !{!0, !1}                                  ; 'LoopID'
> !1 = !{!"llvm.loop.parallel_accesses", !2}
> !2 = distinct !{}                                        ; access group
>
> Patch [4] modifies Clang to generate this kind of metadata. Accesses
> in the same loop nest level are assigned to the same access group.
I think that we should move forward with this solution. It will make
LLVM both more robust and most consistent in this regard:

 1. Robust: It will become less likely that we'll lose track of loops
marked as parallel because of an update to the loop ID. Given that any
change to the loop's properties (which happens, for example, when
user-requested transformations are performed) changes the metadata
identify of the loop id, this is not uncommon. As a result, we really
need the loop id to, in effect, point to the parallel access groupings
instead of the other-way around.

 2. Consistent: The only reason we need, AFAIK, the loop ids to be
distinct, semantically, is for parallel loop annotations in loop nests.
If I have loop-parallel annotations point at the loop ids, and loop ids
are not distinct, then there might be no way to distinguish an inner and
outer loop (and, thus, no way to make with respect to which level in a
loop nest a loop has been annotated as parallel). This new scheme
removes that requirement, and thus, the requirement that all loop ids
must be marked distinct.
>
>
> Fixing (a) and (b) instead would require much more work. For (a), any
> code that copies entire basic blocks (this includes the inliner since
> the same function can be inlined multiple times) would need to make a
> copy of the LoopID and update any llvm.mem.parallel_loop_access in the
> copy. For (b), in addition to MDNode::setOperand, we need a method to
> change the MDNode's size without changing its address. Currently,
> MDNode is optimized for an unchangeable number of operands.
>
> I suggest to stop seeing 'LoopID' as an identifier entirely, but to
> use it as a bag of loop attributes. Patch [5] updates the reference
> manual. We may also think about renaming LoopID in the source code,
> and dropping the 'distinct' and 'first item is
self-reference'. The
> latter two were intended to avoid separate LoopIDs into into one
> because their content is identical, i.e. uphold (a). As we have seen,
> there are other reasons for loops to have identical LoopIDs. With
> patches [3,4], llvm.loop metadata can be collapsed (unlike access
> groups), thus the 'distinct' is not necessary anymore.
Unfortunately,
> there is code in LLVM (and maybe elsewhere) that depends on LoopIDs'
> first item, i.e. we cannot get rid of it that easily.
I don't think it's worth changing this first element, unless we have
some other reason to do so.

 -Hal
>
>
> Regards,
> Michael
>
>
> [1] https://llvm.org/docs/LangRef.html#llvm-loop
> [2]
https://llvm.org/docs/LangRef.html#llvm-mem-parallel-loop-access-metadata
> [3] https://reviews.llvm.org/D52116
> [4] https://reviews.llvm.org/D52117
> [5] https://reviews.llvm.org/D55290
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Michael Kruse via llvm-dev

2018-Dec-12 16:29 UTC

head link

[llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Am Mi., 12. Dez. 2018 um 10:10 Uhr schrieb Finkel, Hal J. <hfinkel at
anl.gov>:> > As we have seen,
> > there are other reasons for loops to have identical LoopIDs. With
> > patches [3,4], llvm.loop metadata can be collapsed (unlike access
> > groups), thus the 'distinct' is not necessary anymore.
Unfortunately,
> > there is code in LLVM (and maybe elsewhere) that depends on
LoopIDs'
> > first item, i.e. we cannot get rid of it that easily.
>
> I don't think it's worth changing this first element, unless we
have
> some other reason to do so.
Would it be worthwhile to update the metadata uniquing algorithm to
consider shallow self-references?

Michael

llvm dev - Dec 2018 - RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

[llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

[llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

[llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)