thr3ads.net - llvm dev - [LLVMdev] LLVM Loop Vectorizer puzzle [May 2013]

Arnold Schwaighofer

2013-May-23 14:37 UTC

[LLVMdev] LLVM Loop Vectorizer puzzle

On May 23, 2013, at 9:15 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 23 May 2013 14:52, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> I would like us to grow a few annotations, among others, one to force
> vectorization irrespective of whether the loop vectorizer thinks it is
> beneficial or not - however, this is future music.
>
> Isn't that part of the ivdep implementation? I thought there was
> support for that already...

No, llvm.loop.parallel only communicates information about memory dependencies
(or the absence thereof), and the loop vectorizer only uses it for this. I don’t
think we should give it the additional semantics of forcing vectorization.

Of course, you could locally patch llvm to abuse it for other purposes...

(Note: I have not formed a strong opinion on this yet. These are just some
initial thoughts; I am not yet convinced that the attributes below are the right
set of attributes, or that the syntax is right ;)

I am thinking of something like:

llvm.vectorization.<param><value>

which would allow us to pass safety and optimization parameters from the front
end:

- Safety:
 #pragma vectorize [max_iterations <NUM>]

 For vectorization we might want to have an optional parameter giving the
distance up to which vectorization is safe:
 #pragma vectorize max_iterations 8
 would indicate that vectorization up to a distance of 8 is safe. This would
restrict the combinations of VF and unroll factor the vectorizer is allowed to
choose.

- Parameters controlling the vectorizer optimization choices:
 width, unroll factor, force vectorization at Os, don’t vectorize

#pragma vectorize width 4 unroll 2
Forces VF=4 and unroll=2

#pragma vectorize max_iterations 8
Allows the vectorizer to choose VF and unroll factor within the stated safe
distance.

#pragma vectorize off
Disable vectorization.

#pragma vectorize force
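
A minimal C sketch of how a safe distance such as max_iterations could restrict the vectorizer's choices. It assumes (this rule is not stated in the thread) that a VF/unroll pair is allowed when VF * unroll does not exceed max_iterations. The names is_allowed and count_allowed are hypothetical and are not LLVM functions.

```c
#include <assert.h>

/* Assumed rule: the loop body processes VF * unroll consecutive iterations
 * at once, so that product must stay within the safe distance. */
static int is_allowed(unsigned vf, unsigned unroll, unsigned max_iterations)
{
    assert(vf > 0 && unroll > 0 && max_iterations > 0);
    return vf * unroll <= max_iterations;
}

/* Counts the allowed (VF, unroll) pairs among the power-of-two factors
 * 1, 2, 4, ... up to limit. limit is expected to be small. */
static unsigned count_allowed(unsigned max_iterations, unsigned limit)
{
    unsigned count = 0;

    assert(limit > 0 && limit <= 1024);
    for (unsigned vf = 1; vf <= limit; vf *= 2)
        for (unsigned unroll = 1; unroll <= limit; unroll *= 2)
            if (is_allowed(vf, unroll, max_iterations))
                ++count;
    return count;
}
```

Under this assumed rule, "width 4 unroll 2" fits inside "max_iterations 8", while "width 4 unroll 4" does not.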

If we decide that

#pragma ivdep

should imply forced vectorization (which I am not sure it should), the front end
can then, in addition to the llvm.loop.parallel metadata, emit metadata to force
vectorization. But I don’t think we should overload the semantics of
llvm.loop.parallel.

Cameron McInally

2013-May-23 15:04 UTC

[LLVMdev] LLVM Loop Vectorizer puzzle

On Thu, May 23, 2013 at 10:37 AM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>
> On May 23, 2013, at 9:15 AM, Renato Golin <renato.golin at linaro.org> wrote:
>
> > On 23 May 2013 14:52, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> > I would like us to grow a few annotations, among others, one to force
> > vectorization irrespective of whether the loop vectorizer thinks it is
> > beneficial or not - however, this is future music.
> >
> > Isn't that part of the ivdep implementation? I thought there was support
> > for that already...
>
> No, llvm.loop.parallel only communicates information about memory
> dependencies (or the absence thereof), and the loop vectorizer only uses it
> for this. I don’t think we should give it the additional semantics of forcing
> vectorization.
>
> Of course, you could locally patch llvm to abuse it for other purposes...
>
>
> (Note, I have not formed a strong opinion on this yet, these are just some
> initial thoughts, I am not convinced yet that the attributes below are the
> right set of attributes, or that the syntax is right ;)
>
> I am thinking of something like:
>
> llvm.vectorization.<param><value>
>
>
> which would allow us to pass safety and optimization parameters from the
> front end:
>
>
> - Safety:
>  #pragma vectorize [max_iterations <NUM>]
>
>  For vectorization we might want to have an optional parameter giving the
> distance up to which vectorization is safe:
>  #pragma vectorize max_iterations 8
>  would indicate that vectorization up to a distance of 8 is safe. This would
> restrict the combinations of VF and unroll factor the vectorizer is allowed
> to choose.
>
> - Parameters controlling the vectorizer optimization choices:
>  width, unroll factor, force vectorization at Os, don’t vectorize
>
> #pragma vectorize width 4 unroll 2
> Forces VF=4 and unroll=2
>
> #pragma vectorize max_iterations 8
> Allows the vectorizer to choose.
>
> #pragma vectorize off
> Disable vectorization.
>
> #pragma vectorize force
>
>
> If we decide that
>
> #pragma ivdep
>
> should imply forced vectorization (which I am not sure it should), the
> front end can then, in addition to the llvm.loop.parallel metadata, emit
> metadata to force vectorization. But I don’t think we should overload the
> semantics of llvm.loop.parallel.

I'm not sure that ivdep should "force vectorization" either. My
interpretation of this pragma is that it tells the compiler not to consider
"assumed" dependencies. "Proven" dependencies are still valid, which can
prevent vectorization.
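
The distinction between "assumed" and "proven" dependencies can be sketched in C as follows. This is only an illustrative model under a simplified three-way classification. The enum, deps_allow_vectorization, add_one and running_sum are hypothetical names and do not correspond to anything in LLVM or ICC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of dependence analysis for one loop. */
enum dep_status {
    DEP_NONE,     /* proved: no loop-carried dependence */
    DEP_UNKNOWN,  /* could not decide, so a dependence is "assumed" */
    DEP_PROVEN    /* proved: a loop-carried dependence exists */
};

/* Returns true when dependences do not block vectorization. ivdep only
 * changes the "assumed" case; a proven dependence still blocks. */
static bool deps_allow_vectorization(enum dep_status status, bool has_ivdep)
{
    switch (status) {
    case DEP_NONE:    return true;
    case DEP_UNKNOWN: return has_ivdep;
    case DEP_PROVEN:  return false;
    }
    return false;
}

/* Assumed dependence: dst and src may overlap, which usually cannot be
 * ruled out from the signature alone. */
static void add_one(int *dst, const int *src, int n)
{
    assert(dst != NULL && src != NULL);
    for (int i = 0; i < n; ++i)
        dst[i] = src[i] + 1;
}

/* Proven dependence: iteration i reads the value written by iteration i-1. */
static void running_sum(int *a, int n)
{
    assert(a != NULL);
    for (int i = 1; i < n; ++i)
        a[i] += a[i - 1];
}
```

In this model an ivdep-style hint would help the first loop and would not change the outcome for the second.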

I apologize in advance if this seems nit-picky.

-Cameron

Redmond, Paul

2013-May-23 15:52 UTC

[LLVMdev] LLVM Loop Vectorizer puzzle

On 2013-05-23, at 10:37 AM, Arnold Schwaighofer wrote:
> 
> On May 23, 2013, at 9:15 AM, Renato Golin <renato.golin at linaro.org> wrote:
>
>> On 23 May 2013 14:52, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>> I would like us to grow a few annotations, among others, one to force
>> vectorization irrespective of whether the loop vectorizer thinks it is
>> beneficial or not - however, this is future music.
>>
>> Isn't that part of the ivdep implementation? I thought there was
>> support for that already...
>
> No, llvm.loop.parallel only communicates information about memory
> dependencies (or the absence thereof), and the loop vectorizer only uses it
> for this. I don’t think we should give it the additional semantics of forcing
> vectorization.
> 
> Of course, you could locally patch llvm to abuse it for other purposes...
> 
I was recently thinking about how to extend the parallel loop metadata to
support other hints. Does it make sense to use a single loop id metadata and
attach hints to it?

For example, here is a simple loop with llvm.loop.parallel and
llvm.mem.parallel_loop_access metadata:

loop.body:                                        ; preds = %loop.body, %loop.body.lr.ph
  %indvars.iv = phi i64 [ %4, %loop.body.lr.ph ], [ %indvars.iv.next, %loop.body ]
  %__index.addr.07 = phi i32 [ %__low, %loop.body.lr.ph ], [ %7, %loop.body ]
  %ref1 = load i32*** %3, align 8, !llvm.mem.parallel_loop_access !0
  %5 = load i32** %ref1, align 8, !llvm.mem.parallel_loop_access !0
  %arrayidx = getelementptr inbounds i32* %5, i64 %indvars.iv
  %6 = trunc i64 %indvars.iv to i32
  store i32 %6, i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
  %indvars.iv.next = add i64 %indvars.iv, 1
  %7 = add i32 %__index.addr.07, 1
  %exitcond = icmp eq i32 %7, %__high
  br i1 %exitcond, label %loop.end, label %loop.body, !llvm.loop.parallel !0

If I want to add metadata for the vector length how should it look? One thing
that would be nice is not having to check branches for different types of loop
metadata. How about changing llvm.loop.parallel to llvm.loop and making the
hints child nodes?

e.g.,

 br i1 %exitcond, label %loop.end, label %loop.body, !llvm.loop !0

...

!0 = metadata !{ metadata !1, metadata !2 }
!1 = metadata !{ metadata !"llvm.loop.parallel" }
!2 = metadata !{ metadata !"llvm.vectorization.vector_width", i32 8 }

I'm not even sure you would need the llvm.loop.parallel anymore since the
vectorizer could just look to see if the loop id on a parallel_loop_access
matches the loop id of the loop being vectorized.

Does this make any sense?
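
How a pass might query such a loop id node can be sketched in plain C. This is a hypothetical model of the proposed layout, not LLVM's metadata API. The structs loop_hint and loop_id and the functions find_hint and vector_width_or are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one hint child node, e.g.
 * !{ !"llvm.vectorization.vector_width", i32 8 }. */
struct loop_hint {
    const char *name;
    int has_value;  /* 0 for flag-only hints such as "llvm.loop.parallel" */
    int value;
};

/* Hypothetical stand-in for the loop id node whose children are hints. */
struct loop_id {
    const struct loop_hint *hints;
    size_t num_hints;
};

/* Returns the named hint, or NULL when the loop id does not carry it. */
static const struct loop_hint *find_hint(const struct loop_id *id,
                                         const char *name)
{
    assert(id != NULL && name != NULL);
    for (size_t i = 0; i < id->num_hints; ++i)
        if (strcmp(id->hints[i].name, name) == 0)
            return &id->hints[i];
    return NULL;
}

/* Returns the requested vector width, or fallback when no hint is attached. */
static int vector_width_or(const struct loop_id *id, int fallback)
{
    const struct loop_hint *h =
        find_hint(id, "llvm.vectorization.vector_width");
    return (h != NULL && h->has_value) ? h->value : fallback;
}
```

With this shape, a pass looks in one place (the loop id attached to the branch) and asks for the hints it understands, instead of checking the branch for several unrelated metadata kinds.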
> 
> If we decide that
>
> #pragma ivdep
>
> should imply forced vectorization (which I am not sure it should), the
> front end can then, in addition to the llvm.loop.parallel metadata, emit
> metadata to force vectorization. But I don’t think we should overload the
> semantics of llvm.loop.parallel.

ivdep doesn't force vectorization. It just says that if you can't prove there
is or isn't a dependency, then assume there isn't.

paul

Nadav Rotem

2013-May-23 16:02 UTC

[LLVMdev] LLVM Loop Vectorizer puzzle

On May 23, 2013, at 8:52 AM, "Redmond, Paul" <paul.redmond at intel.com> wrote:
>
> !0 = metadata !{ metadata !1, metadata !2 }
> !1 = metadata !{ metadata !"llvm.loop.parallel" }
> !2 = metadata !{ metadata !"llvm.vectorization.vector_width", i32 8 }
>
> I'm not even sure you would need the llvm.loop.parallel anymore since
> the vectorizer could just look to see if the loop id on a
> parallel_loop_access matches the loop id of the loop being vectorized.
> 
> Does this make any sense?
> 

Yes. It makes sense to me. 
>>
>> If we decide that
>>
>> #pragma ivdep
>>
>> should imply forced vectorization (which I am not sure it should), the
>> front end can then, in addition to the llvm.loop.parallel metadata, emit
>> metadata to force vectorization. But I don’t think we should overload the
>> semantics of llvm.loop.parallel.
>
> ivdep doesn't force vectorization. It just says that if you can't prove
> there is or isn't a dependency, then assume there isn't.

I think that we should come up with a better name. I am okay with providing ICC
aliases, but I think that we should come up with slightly less cryptic names for
clang.

Pekka Jääskeläinen

2013-May-23 18:13 UTC

[LLVMdev] LLVM Loop Vectorizer puzzle

On 05/23/2013 06:52 PM, Redmond, Paul wrote:
> I'm not even sure you would need the llvm.loop.parallel anymore since the
> vectorizer could just look to see if the loop id on a parallel_loop_access
> matches the loop id of the loop being vectorized.
>
> Does this make any sense?
Yes. However, I think you still need to use the self-referencing metadata
trick or similar to make the metadata identifying a loop unique (to avoid
merging it with other metadata nodes that have the same data). That is, e.g.,
the llvm.mem.parallel_loop_access has to refer to *the* original loop, not
just any llvm.loop metadata with the same child metadata.
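
Why a self-reference keeps two loop ids apart can be sketched with a small C model. It assumes that nodes with identical contents are merged by a content comparison. The struct loop_node and both functions are hypothetical and only mimic the idea; they are not LLVM data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a loop id node. */
struct loop_node {
    const struct loop_node *self;  /* first operand; NULL if the trick is unused */
    int vector_width;              /* stands in for the hint children */
};

/* Content comparison of the kind a uniquing table could use to merge nodes. */
static bool same_content(const struct loop_node *a, const struct loop_node *b)
{
    assert(a != NULL && b != NULL);
    return a->self == b->self && a->vector_width == b->vector_width;
}

/* Builds a node whose first operand refers to the node itself. */
static void init_self_referencing(struct loop_node *n, int vector_width)
{
    assert(n != NULL);
    n->self = n;
    n->vector_width = vector_width;
}
```

In this model two loops that both request a vector width of 8 compare equal without the self-reference, so they would be merged into one node. With the self-reference, each node's contents include its own identity and the two stay distinct.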

As for dropping the llvm.loop.parallel metadata and relying only on checking
the parallel_loop_access to identify parallel loops, I'm not so sure. Does it
retain all the info for all cases? Let's say you have a parallel loop without
memory accesses but with, say, a volatile inline asm block. In that case you do
not have a way to communicate that the iterations of said loop can be executed
in any order if you cannot mark the loop itself parallel.

Regards,
-- 
--Pekka
