On 01/31/2013 03:40 PM, Pekka Jääskeläinen wrote:
> Dear all,
>
> Here's an updated version of the parallel loop metadata patch.
> It includes documentation for the new metadata types with
> a semantics description.
Hi Pekka,

I think this already looks very nice. Some more comments:
> Index: llvm/include/llvm/Analysis/LoopInfo.h
> ==================================================================
> --- llvm.orig/include/llvm/Analysis/LoopInfo.h 2013-01-29 23:40:09.480348774 +0200
> +++ llvm/include/llvm/Analysis/LoopInfo.h 2013-01-31 16:13:16.517296071 +0200
> @@ -381,6 +381,19 @@
> /// isSafeToClone - Return true if the loop body is safe to clone in practice.
> bool isSafeToClone() const;
>
> + /// isParallel - Returns true if the loop should be considered as
> + /// a "parallel loop" with freely scheduled iterations. A parallel loop can
> + /// be assumed to not contain any dependencies between iterations by the compiler.
> + /// That is, any loop-carried dependency checking can be skipped completely when
> + /// parallelizing the loop on the target machine. Thus, if the parallel loop
> + /// information originates from the programmer, e.g. via the OpenMP parallel
> + /// for pragma, it is the programmer's responsibility to ensure the are no
> + /// loop-carried dependencies. The final execution order of the instructions
> + /// across iterations is not guaranteed, thus, the end result might or might
might or might _not_
> + /// implement actual concurrent execution of instructions across multiple
> + /// iterations.
This comment does not follow our new doxygen style. LLVM no longer
repeats the function name in the comment. Instead, a brief description
is followed by the full comment. Something like:
  /// Check if a loop is parallel
  ///
  /// Returns true if the loop should be considered ....
> + bool isParallel() const;
> +
> /// hasDedicatedExits - Return true if no exit block for the loop
> /// has a predecessor that is outside the loop.
Same here, do not repeat the function name.
> bool hasDedicatedExits() const;
> Index: llvm/lib/Analysis/LoopInfo.cpp
> ==================================================================
> --- llvm.orig/lib/Analysis/LoopInfo.cpp 2013-01-29 23:40:12.164348629 +0200
> +++ llvm/lib/Analysis/LoopInfo.cpp 2013-01-31 13:20:04.885692041 +0200
> @@ -233,6 +233,31 @@
> return true;
> }
>
> +
> +bool Loop::isParallel() const {
> +
> + BasicBlock *latch = getLoopLatch();
> + if (latch == NULL ||
> + latch->getTerminator()->getMetadata("llvm.loop.parallel") == NULL)
> + return false;
> +
> + // The loop branch contains the parallel loop metadata. In order to ensure
> + // that any parallel-loop-unaware optimization pass hasn't added loop-carried
> + // dependencies (thus converted the loop back to a sequential loop), check
> + // that all the memory instructions in the loop contain parallelism metadata.
> + for (block_iterator i = block_begin(), e = block_end(); i != e; ++i) {
> + for (BasicBlock::iterator ii = (*i)->begin(), ee = (*i)->end();
> + ii != ee; ii++) {
In LLVM we normally use uppercase names for iterators, e.g. 'II' and
'EE'.
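For illustration, here is a small standalone sketch of the check your
comment describes, written with that naming. Inst, Block and
isParallelSketch are made-up stand-ins backed by std::vector, not the
real LLVM classes and not code from your patch:

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for llvm::Instruction and llvm::BasicBlock.
struct Inst {
  bool AccessesMemory;    // models "this instruction reads or writes memory"
  bool HasParallelAccess; // models presence of !llvm.mem.parallel_loop_access
};
typedef std::vector<Inst> Block;

// Returns true only if the latch branch is marked parallel and every
// memory-accessing instruction in every block carries the access marker.
bool isParallelSketch(const std::vector<Block> &Blocks, bool LatchMarked) {
  if (!LatchMarked)
    return false;
  for (std::vector<Block>::const_iterator BI = Blocks.begin(),
                                          BE = Blocks.end();
       BI != BE; ++BI) {
    for (Block::const_iterator II = BI->begin(), EE = BI->end(); II != EE;
         ++II) {
      // One unmarked memory access turns the loop back into a sequential one.
      if (II->AccessesMemory && !II->HasParallelAccess)
        return false;
    }
  }
  return true;
}
```

The outer pair is named 'BI'/'BE' here only to keep the two levels
apart; the point is just the uppercase convention.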
>
> +'``llvm.loop``'
> +^^^^^^^^^^^^^^^
> +
> +It is sometimes useful to attach information to loop constructs. Currently,
> +loop metadata is implemented as metadata attached to the branch instruction
> +in the loop latch block. Loop-level metadata is prefixed with ``llvm.loop``.
> +
> +'``llvm.loop.parallel``' Metadata
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +This loop metadata can be used to communicate that a loop should be considered
> +a parallel loop. The semantics of parallel loops in this case is the one
> +with the strongest cross-iteration instruction ordering freedom: the
> +iterations in the loop can be considered completely independent of each
> +other (also known as embarrasingly parallel loops).
embarrassingly
> +
> +This metadata can originate from a programming language with parallel loop
> +constructs. In such a case it is completely the programmer's responsibility
> +to ensure the instructions from the different iterations of the loop can be
> +executed in an arbitrary order, in parallel, or intertwined. No loop-carried
> +dependency checking at all must be expected from the compiler.
> +
> +In order to fulfil the LLVM requirement for metadata to be ignorable
fulfill
> +safely, it is important to ensure that a parallel loop is converted to
> +a sequential loop in case an optimization (unknowingly of the parallel loop
> +semantics) converts the loop back to such. This happens when new memory
> +accesses that do not fulfil the requirement of free ordering across iterations
fulfill
> +are added to the loop. Therefore, this metadata is required, but not
> +sufficient, to consider the loop at hand a parallel loop. In order to consider
> +a loop a parallel loop, also all of its memory accessing instructions need to be
> +marked with the ```llvm.mem.parallel_loop_access``` metadata.
> +
> +'``llvm.mem``'
> +^^^^^^^^^^^^^^^
> +
> +Metadata types used to annotate memory accesses with information helpful
> +for optimizations are prefixed with ``llvm.mem``.
> +
> +'``llvm.mem.parallel_loop_access``' Metadata
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +In order to consider a loop a parallel loop, in addition to using
> +the ``llvm.loop.parallel`` metadata to mark the loop latch branch instruction,
> +also all of the memory accessing instructions in the loop body need to be
> +marked with the ``llvm.mem.parallel_loop_access`` metadata. If there
> +is at least one memory accessing instruction not marked with the metadata,
> +the loop, despite it possibly using the ``llvm.loop.parallel`` metadata,
> +must be considered a sequential loop. This causes parallel loops to be
> +converted to sequential loops due to optimization passes that are unaware of
> +the parallel semantics and that insert new memory instructions to the loop
> +body.
> +
> +Example of a loop that is considered parallel due to its correct use of
> +both ``llvm.loop.parallel`` and ```llvm.mem.parallel_loop_access```
> +metadata types:
> +
> +.. code-block:: llvm
> +
> + for.body:
> + %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
> + %arrayidx = getelementptr inbounds i32* %b, i64 %indvars.iv
> + %0 = load i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
> + ...
> + store i32 %0, i32* %arrayidx4, align 4, !llvm.mem.parallel_loop_access !0
> + ...
> + br i1 %exitcond, label %for.end, label %for.body, !llvm.loop.parallel !0
> + for.end: ; preds = %for.body
> + ret void
> + ...
> + !0 = metadata !{i32 1}
One point I am not entirely sure about: is the parallel_loop_access
metadata somehow connected to the loop.parallel metadata? I think we
also need to ensure that the parallel_loop_access metadata in a loop
actually comes from that loop and was not introduced in some other,
unrelated way. I don't have a good example of where this might happen,
but I could imagine things like inlining or LICM from inner loops. I
believe it should not be very difficult to couple the two, no?
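
To make the coupling idea a bit more concrete, here is a rough
standalone sketch. It assumes, purely hypothetically, that each
parallel loop carries a unique identifier and that each annotated
memory access names the loop it was annotated for. MemInst, LoopModel
and isParallelCoupled are invented for this mail; nothing like them
exists in LLVM or in your patch:

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a memory access: records the id of the loop it was
// declared parallel for (0 means "no parallel annotation").
struct MemInst {
  unsigned ParallelLoopId;
};

// Hypothetical model of a loop: its own id (0 means "not marked parallel")
// plus the memory-accessing instructions found in its body.
struct LoopModel {
  unsigned Id;
  std::vector<MemInst> MemInsts;
};

// A loop counts as parallel only if it is marked itself and every memory
// access in its body was annotated for this very loop.
bool isParallelCoupled(const LoopModel &L) {
  if (L.Id == 0)
    return false;
  for (std::vector<MemInst>::const_iterator II = L.MemInsts.begin(),
                                            EE = L.MemInsts.end();
       II != EE; ++II) {
    // An access annotated for a different loop (e.g. moved in from
    // elsewhere) or not annotated at all disables the parallel assumption.
    if (II->ParallelLoopId != L.Id)
      return false;
  }
  return true;
}
```

With such a coupling, an access that arrives with an annotation meant
for another loop would make the check fail instead of being silently
accepted.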
Cheers
Tobi