On May 23, 2013, at 9:15 AM, Renato Golin <renato.golin at linaro.org> wrote:

> On 23 May 2013 14:52, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>> I would like us to grow a few annotations, among others, one to force vectorization irrespective of whether the loop vectorizer thinks it is beneficial or not - however, this is future music.
>
> Isn't that part of the ivdep implementation? I thought there was support for that already...

No, llvm.loop.parallel only communicates information about memory dependencies (or their absence), and the loop vectorizer only uses it for this. I don’t think we should give it the additional semantics of forcing vectorization.

Of course, you could locally patch llvm to abuse it for other purposes...

(Note: I have not formed a strong opinion on this yet. These are just some initial thoughts; I am not yet convinced that the attributes below are the right set of attributes, or that the syntax is right ;)

I am thinking of something like:

  llvm.vectorization.<param><value>

which would allow us to pass safety and optimization parameters from the front end:

- Safety:

  #pragma vectorize [max_iterations <NUM>]

  For vectorization we might want an optional parameter giving the distance up to which vectorization is safe:

  #pragma vectorize max_iterations 8

  would indicate that vectorization up to a distance of 8 is safe. This would restrict the combinations of VF and unroll factor the vectorizer is allowed to choose.

- Parameters controlling the vectorizer's optimization choices: width, unroll factor, force vectorization at Os, don’t vectorize

  #pragma vectorize width 4 unroll 2
    Forces VF=4 and unroll=2.

  #pragma vectorize max_iterations 8
    Allows the vectorizer to choose.

  #pragma vectorize off
    Disables vectorization.

  #pragma vectorize force

If we decide that

  #pragma ivdep

should imply forced vectorization (which I am not sure it should), the front end can then, in addition to the llvm.loop.parallel metadata, emit metadata to force vectorization. But I don’t think we should overload the semantics of llvm.loop.parallel.
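To make the "safe distance" idea concrete, here is a minimal sketch in Python (not LLVM code). It models a loop whose iteration i writes an element that iteration i + 8 reads, and compares in-order execution with a "vectorized" execution that performs all loads of a chunk before its stores. Reading max_iterations as a bound on the number of iterations in flight at once (roughly VF times the unroll factor) is an assumption made for illustration. The function names are hypothetical.

```python
def scalar_loop(a, dist):
    """Reference semantics: iterations run one at a time, in order."""
    a = list(a)
    for i in range(len(a) - dist):
        a[i + dist] = a[i] + 1
    return a


def chunked_loop(a, dist, width):
    """Model of a vectorized loop: each chunk does all loads, then all stores.

    `width` stands for the number of iterations executed together
    (assumed here to correspond to VF times the unroll factor).
    """
    a = list(a)
    n = len(a) - dist
    for start in range(0, n, width):
        stop = min(start + width, n)
        loaded = [a[i] + 1 for i in range(start, stop)]  # "vector load + add"
        for k, value in enumerate(loaded):               # "vector store"
            a[start + dist + k] = value
    return a


def is_safe(width, dist=8, n=64):
    """True if running `width` iterations together matches the scalar result."""
    data = list(range(n))
    return chunked_loop(data, dist, width) == scalar_loop(data, dist)


if __name__ == "__main__":
    for width in (4, 8, 16):
        print(width, is_safe(width))
```

In this model, widths up to the dependence distance reproduce the scalar result, while a width of 16 reads elements before the earlier iterations have written them.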
On Thu, May 23, 2013 at 10:37 AM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:

> If we decide that
>
>   #pragma ivdep
>
> should imply forced vectorization (which I am not sure it should), the front end can then, in addition to the llvm.loop.parallel metadata, emit metadata to force vectorization. But I don’t think we should overload the semantics of llvm.loop.parallel.

I'm not sure that ivdep should "force vectorization" either. My interpretation of this pragma is that it tells the compiler not to consider "assumed" dependencies. "Proven" dependencies are still valid, and those can prevent vectorization.

I apologize in advance if this seems nit-picky.

-Cameron
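The distinction between "assumed" and "proven" dependencies can be sketched as a small decision function. This is an illustrative Python model, not how any compiler implements it: the `Dep` enum and `may_vectorize` are hypothetical names, and treating every proven dependence as blocking is a simplification (a proven dependence with a large enough distance need not block vectorization).

```python
from enum import Enum


class Dep(Enum):
    NONE = "proven independent"
    UNKNOWN = "cannot be proven either way"  # an "assumed" dependence
    PROVEN = "proven dependence"


def may_vectorize(deps, ivdep=False):
    """Return True if the dependence analysis results permit vectorization.

    This only answers the legality question. Whether vectorizing is
    profitable, or forced, is a separate decision.
    """
    for dep in deps:
        if dep is Dep.PROVEN:
            return False  # ivdep never overrides a proven dependence
        if dep is Dep.UNKNOWN and not ivdep:
            return False  # conservative default: assume the dependence is real
    return True


if __name__ == "__main__":
    print(may_vectorize([Dep.UNKNOWN]))              # blocked without ivdep
    print(may_vectorize([Dep.UNKNOWN], ivdep=True))  # allowed with ivdep
    print(may_vectorize([Dep.PROVEN], ivdep=True))   # still blocked
```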
On 2013-05-23, at 10:37 AM, Arnold Schwaighofer wrote:

> On May 23, 2013, at 9:15 AM, Renato Golin <renato.golin at linaro.org> wrote:
>
>> On 23 May 2013 14:52, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>>> I would like us to grow a few annotations, among others, one to force vectorization irrespective of whether the loop vectorizer thinks it is beneficial or not - however, this is future music.
>>
>> Isn't that part of the ivdep implementation? I thought there was support for that already...
>
> No, llvm.loop.parallel only communicates information about memory dependencies (or their absence), and the loop vectorizer only uses it for this. I don’t think we should give it the additional semantics of forcing vectorization.
>
> Of course, you could locally patch llvm to abuse it for other purposes...

I was recently thinking about how to extend the parallel loop metadata to support other hints. Does it make sense to use a single loop id metadata and attach hints to it?

For example, here is a simple loop with llvm.loop.parallel and llvm.mem.parallel_loop_access metadata:

  loop.body:                                        ; preds = %loop.body, %loop.body.lr.ph
    %indvars.iv = phi i64 [ %4, %loop.body.lr.ph ], [ %indvars.iv.next, %loop.body ]
    %__index.addr.07 = phi i32 [ %__low, %loop.body.lr.ph ], [ %7, %loop.body ]
    %ref1 = load i32*** %3, align 8, !llvm.mem.parallel_loop_access !0
    %5 = load i32** %ref1, align 8, !llvm.mem.parallel_loop_access !0
    %arrayidx = getelementptr inbounds i32* %5, i64 %indvars.iv
    %6 = trunc i64 %indvars.iv to i32
    store i32 %6, i32* %arrayidx, align 4, !llvm.mem.parallel_loop_access !0
    %indvars.iv.next = add i64 %indvars.iv, 1
    %7 = add i32 %__index.addr.07, 1
    %exitcond = icmp eq i32 %7, %__high
    br i1 %exitcond, label %loop.end, label %loop.body, !llvm.loop.parallel !0

If I want to add metadata for the vector length, how should it look? One thing that would be nice is not having to check branches for different types of loop metadata.

How about changing llvm.loop.parallel to llvm.loop and making the hints child nodes? E.g.,

    br i1 %exitcond, label %loop.end, label %loop.body, !llvm.loop !0
    ...
    !0 = metadata !{ metadata !1, metadata !2 }
    !1 = metadata !{ metadata !"llvm.loop.parallel" }
    !2 = metadata !{ metadata !"llvm.vectorization.vector_width", i32 8 }

I'm not even sure you would need the llvm.loop.parallel anymore, since the vectorizer could just check whether the loop id on a parallel_loop_access matches the loop id of the loop being vectorized.

Does this make any sense?

> If we decide that
>
>   #pragma ivdep
>
> should imply forced vectorization (which I am not sure it should), the front end can then, in addition to the llvm.loop.parallel metadata, emit metadata to force vectorization. But I don’t think we should overload the semantics of llvm.loop.parallel.

ivdep doesn't force vectorization. It just says that if you can't prove there is or isn't a dependency, then assume there isn't.

paul
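The proposed layout (one llvm.loop node whose children are the individual hints) can be sketched with plain Python tuples. This is only a model of the lookup a consumer would do, not LLVM's metadata API; `find_hint` and `all_accesses_parallel` are hypothetical helpers, and the hint names are taken from the proposal above.

```python
def find_hint(loop_md, name):
    """Return the operands of the child node called `name`, or None if absent.

    A hint without operands yields an empty tuple, so callers should test
    the result with `is not None` rather than relying on truthiness.
    """
    for child in loop_md:
        if child and child[0] == name:
            return child[1:]
    return None


def all_accesses_parallel(access_tags, loop_md):
    """True if every memory access is tagged with this exact loop node.

    Identity (`is`) is used on purpose: the tag must refer to the same
    loop, not merely to a node with equal contents.
    """
    return all(tag is loop_md for tag in access_tags)


# Mirrors: !0 = !{ !1, !2 }
node1 = ("llvm.loop.parallel",)
node2 = ("llvm.vectorization.vector_width", 8)
loop_md = (node1, node2)

if __name__ == "__main__":
    print(find_hint(loop_md, "llvm.vectorization.vector_width"))
    print(find_hint(loop_md, "llvm.loop.parallel") is not None)
    print(all_accesses_parallel([loop_md, loop_md, loop_md], loop_md))
```

With this shape a pass inspects a single attachment on the branch and then searches its children, instead of probing the branch for each kind of loop metadata separately.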
On May 23, 2013, at 8:52 AM, "Redmond, Paul" <paul.redmond at intel.com> wrote:

> !0 = metadata !{ metadata !1, metadata !2 }
> !1 = metadata !{ metadata !"llvm.loop.parallel" }
> !2 = metadata !{ metadata !"llvm.vectorization.vector_width", i32 8 }
>
> I'm not even sure you would need the llvm.loop.parallel anymore since the vectorizer could just look to see if the loop id on a parallel_loop_access matches the loop id of the loop being vectorized.
>
> Does this make any sense?

Yes. It makes sense to me.

>> If we decide, that
>>
>>   #pragma ivdep
>>
>> should imply forced vectorization (which I am not sure it should), the front-end can than in addition to the llvm.loop.parallel metadata, emit meta data to force vectorization. But, I don’t think we should overload the semantics of llvm.loop.parallel.
>
> ivdep doesn't force vectorization. It just says if you can't prove there is or isn't a dependency the assume there isn't.

I think that we should come up with a better name. I am okay with providing ICC aliases, but I think that we should come up with slightly less cryptic names for clang.
On 05/23/2013 06:52 PM, Redmond, Paul wrote:

> I'm not even sure you would need the llvm.loop.parallel anymore since the
> vectorizer could just look to see if the loop id on a parallel_loop_access
> matches the loop id of the loop being vectorized.
>
> Does this make any sense?

Yes. However, I think you still need to use the self-referencing metadata trick or something similar to make the metadata identifying a loop unique (to avoid merging it with other metadata nodes that hold the same data). That is, the llvm.mem.parallel_loop_access has to refer to *the* original loop, not just any llvm.loop metadata with the same child metadata.

As for dropping the llvm.loop.parallel metadata and relying only on the parallel_loop_access check to identify parallel loops, I'm not so sure. Does it retain all the info in all cases? Let's say you have a parallel loop without memory accesses but with, say, a volatile inline asm block. In that case you have no way to communicate that the iterations of that loop can be executed in any order if you cannot mark the loop itself parallel.

Regards,
-- 
--Pekka
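The uniquing concern can be illustrated with a toy model in Python. It assumes a context that merges metadata nodes with identical operands and shows why a node that names itself as its first operand stays distinct. The `Node` and `Context` classes are hypothetical stand-ins, not LLVM's classes, and real uniquing rules may differ in detail.

```python
class Node:
    """A metadata node: just a list of operands."""

    def __init__(self, operands):
        self.operands = list(operands)


class Context:
    """Toy context that uniques nodes by their operands."""

    def __init__(self):
        self._uniqued = {}

    def get_node(self, *operands):
        # Structurally identical nodes are merged into a single node.
        key = tuple(operands)
        if key not in self._uniqued:
            self._uniqued[key] = Node(operands)
        return self._uniqued[key]

    def get_loop_id(self, *hints):
        # Self-reference trick: the first operand is the node itself, so no
        # two loop ids ever have identical operands and none can be merged.
        node = Node([None, *hints])
        node.operands[0] = node
        return node


if __name__ == "__main__":
    ctx = Context()
    width = ctx.get_node("llvm.vectorization.vector_width", 8)

    # Two loops with the same hints, described by ordinary nodes, collapse.
    print(ctx.get_node(width) is ctx.get_node(width))

    # With self-referencing loop ids, each loop keeps its own identity.
    print(ctx.get_loop_id(width) is ctx.get_loop_id(width))
```

In this model a parallel_loop_access tag that points at one loop's id cannot be mistaken for a tag on another loop with the same hints, which is the property the self-reference is meant to provide.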