Hi,

I am continuing the discussion about Parallel Loop Metadata from here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-February/059168.html
and here:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-February/058999.html

Pekka suggested that we add two kinds of metadata: llvm.loop.parallel (attached to each loop latch) and llvm.mem.parallel (attached to each memory instruction!). I think that the motivation for the first metadata is clear - it says that the loop is data-parallel. I can also see us adding additional metadata such as llvm.loop.unrollcnt to allow users to control the unroll count of loops using pragmas. That's fine. Pekka, can you think of transformations that may need to invalidate or take this metadata into consideration?

Regarding the second metadata that you proposed, I am a bit skeptical. I don't fully understand its semantics and I am not sure why we need it. And even if we do need it, I think that it would require too many passes to change. It is very important to take the complexity of these features into account. In the past we rejected the parallel 'barrier' semantics because they would have required too many unrelated passes to change.

Nadav
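(To make the proposal concrete, here is a rough sketch of how the two annotations could look on a simple copy loop, written in the IR syntax of the time. The metadata names are the ones proposed in this thread; the exact encoding, including the self-referential loop-identifier node, is only an illustrative assumption, not a settled format.)

  ; Hypothetical IR for "for (i = 0; i < n; ++i) b[i] = a[i];"
  for.body:
    %i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
    %src = getelementptr inbounds float* %a, i64 %i
    %v   = load float* %src, align 4, !llvm.mem.parallel !0
    %dst = getelementptr inbounds float* %b, i64 %i
    store float %v, float* %dst, align 4, !llvm.mem.parallel !0
    %i.next = add nuw nsw i64 %i, 1
    %cmp = icmp ult i64 %i.next, %n
    br i1 %cmp, label %for.body, label %exit, !llvm.loop.parallel !0

  ; Self-referential node identifying this particular loop.
  !0 = metadata !{metadata !0}

The idea is that the latch branch marks the loop as parallel, while every memory access belonging to the parallel loop body points back at the same node.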
Hi Nadav,

On 02/07/2013 07:46 PM, Nadav Rotem wrote:
> Pekka suggested that we add two kinds of metadata: llvm.loop.parallel
> (attached to each loop latch) and llvm.mem.parallel (attached to each memory
> instruction!). I think that the motivation for the first metadata is clear -
> it says that the loop is data-parallel. I can also see us adding additional
> metadata such as llvm.loop.unrollcnt to allow users to control the unroll
> count of loops using pragmas. That's fine. Pekka, can you think of
> transformations that may need to invalidate or take this metadata into
> consideration?

Any pass that introduces new non-parallel memory instructions to the loop because it thinks the loop is sequential and it's OK to do so. I do not know of any such pass other than the one pointed out earlier, reg2mem (if the variables inside the loop body reuse stack slots). E.g., inlining should be safe, and so should unrolling an inner loop inside a parallel loop.

Anyway, the fact that I do not know of more such passes does not mean there aren't any, especially when you consider that there are out-of-tree passes in external projects that use LLVM. Therefore, the "safety first" approach of annotating the memory instructions and falling back to sequential semantics if non-annotated memory instructions are found sounds sensible to me. Your loop unroll metadata example does not need this, as it is not related to the parallel semantics; it works for both parallel and sequential loops.

The other way to go is the "jump in the cold water" approach: assume the parallel loop metadata itself is something that must be respected by all passes or breakage might happen. That is a bit rough and not allowed according to the metadata guidelines. It practically adds a new semantic construct, a parallel loop, to the LLVM IR. Thus, it is then something that all passes potentially *need* to know about in order not to accidentally break the code (by assuming it is a sequential loop and doing transformations that actually make it so).

> Regarding the second metadata that you proposed, I am a bit skeptical. I
> don't fully understand its semantics and I am not sure why we need it. And
> even if we do need it, I think that it would require too many passes to
> change.

It's there exactly to avoid the *need* for passes to know about parallel loop semantics. If they do know about it, they can *optimize* by retaining the loop as a parallel loop, but the fallback to a serial loop should be safe for parallel-loop-unaware passes. E.g., inlining and unrolling might want to update the metadata to still enable, e.g., the vectorizer to treat the loop as parallel. Some kind of helper function should definitely do this somewhere, to make it easy to add "parallel-loop-awareness" to passes.

> It is very important to take the complexity of these features into account.
> In the past we rejected the parallel 'barrier' semantics because they would
> have required too many unrelated passes to change.

The additional llvm.mem.parallel metadata tries to avoid exactly this. Simply put, llvm.loop.parallel by itself is not legal metadata (it cannot be ignored safely) without the other.

That being said, if you know of a better way to guarantee this type of "safe fallback", I'll be happy to implement it.

BR,
--
--Pekka
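(A small illustration of the fallback described above, again using the proposed names, and a made-up spill slot %spill.slot purely for the sake of the example: a parallel-loop-unaware pass such as reg2mem might insert a store into the loop body without copying the annotation.)

  for.body:
    ...
    store float %v, float* %dst, align 4, !llvm.mem.parallel !0
    ; Added by a pass that knows nothing about parallel loops:
    ; a spill to a stack slot, with no !llvm.mem.parallel attached.
    store float %v, float* %spill.slot, align 4
    ...
    br i1 %cmp, label %for.body, label %exit, !llvm.loop.parallel !0

A consumer that requires every memory access in the loop to carry !llvm.mem.parallel pointing at the loop's node would notice the unannotated store and simply treat the loop as sequential, which is exactly the "safe fallback" the per-instruction metadata is meant to buy.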
On Feb 7, 2013, at 10:55 AM, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> wrote:
> Hi Nadav,
>
> On 02/07/2013 07:46 PM, Nadav Rotem wrote:
>> Pekka suggested that we add two kinds of metadata: llvm.loop.parallel
>> (attached to each loop latch) and llvm.mem.parallel (attached to each memory
>> instruction!). I think that the motivation for the first metadata is clear -
>> it says that the loop is data-parallel. I can also see us adding additional
>> metadata such as llvm.loop.unrollcnt to allow users to control the unroll
>> count of loops using pragmas. That's fine. Pekka, can you think of
>> transformations that may need to invalidate or take this metadata into
>> consideration?
>
> Any pass that introduces new non-parallel memory instructions to the loop
> because it thinks the loop is sequential and it's OK to do so. I do not know
> of any such pass other than the one pointed out earlier, reg2mem (if the
> variables inside the loop body reuse stack slots). E.g., inlining
> should be safe, and so should unrolling an inner loop inside a parallel loop.

I suggest that we only add the 'llvm.loop.parallel' metadata and not llvm.mem.parallel. I believe that it should be the job of the consumer pass (e.g., the loop vectorizer) to scan the loop and detect parallelism violations. This is also the approach that we use when we optimize stack slots using lifetime markers. I understand that the consumer passes will have to be more conservative and miss some optimizations, but I still think that this is better than forcing different passes in the compiler to know about parallel metadata.
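(Under this alternative the IR would carry only the loop-level marker, e.g., with the same caveats as above about names and encoding:)

  for.body:
    %v = load float* %src, align 4        ; no per-access annotation
    store float %v, float* %dst, align 4
    ...
    br i1 %cmp, label %for.body, label %exit, !llvm.loop.parallel !0

  !0 = metadata !{metadata !0}

and a consumer such as the loop vectorizer would have to scan the body itself, giving up on the parallel assumption whenever it finds a memory access it cannot account for.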
On Thu, Feb 7, 2013 at 6:55 PM, Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi> wrote:
> Hi Nadav,
>
> On 02/07/2013 07:46 PM, Nadav Rotem wrote:
>> Pekka suggested that we add two kinds of metadata: llvm.loop.parallel
>> (attached to each loop latch) and llvm.mem.parallel (attached to each memory
>> instruction!). I think that the motivation for the first metadata is clear -
>> it says that the loop is data-parallel. I can also see us adding additional
>> metadata such as llvm.loop.unrollcnt to allow users to control the unroll
>> count of loops using pragmas. That's fine. Pekka, can you think of
>> transformations that may need to invalidate or take this metadata into
>> consideration?
>
> Any pass that introduces new non-parallel memory instructions to the loop
> because it thinks the loop is sequential and it's OK to do so. I do not know
> of any such pass other than the one pointed out earlier, reg2mem (if the
> variables inside the loop body reuse stack slots). E.g., inlining
> should be safe, and so should unrolling an inner loop inside a parallel loop.
>
> Anyway, the fact that I do not know of more such passes does not mean there
> aren't any, especially when you consider that there are out-of-tree passes
> in external projects that use LLVM. Therefore, the "safety first" approach
> of annotating the memory instructions and falling back to sequential
> semantics if non-annotated memory instructions are found sounds sensible
> to me.

Another possibility would be for passes to be only minimally parallel-metadata aware: a pass that doesn't know enough to correctly preserve parallel metadata simply does nothing when it sees any parallel markers (maybe trying to find the smallest region to which this "don't run yourself" applies, maybe not). In that way the metadata is guaranteed to remain correct, at the cost of missing out on some reorganisations that are done on non-parallel-metadata code. I don't know how well this would fit in with the general philosophy of LLVM passes, though.

(I'm also aware that we're coming at this from different directions: most people are interested in auto-parallelisation, where missing a parallelisation opportunity is just one of those unfortunate things, while I have a personal interest in DSLs which try to present LLVM code with huge "parallelise this" signs pointing at bits of it. It would be frustrating to have carefully made sure the DSL twisted things into a parallelisable form only to have parallelisation/vectorisation "fall at the last hurdle".)

Cheers,
Dave