Hal Finkel
2013-Jan-28 16:51 UTC
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Monday, January 28, 2013 10:45:36 AM
> Subject: Re: [LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
>
> Hi Pekka,
>
> I am okay with this patch, assuming that you follow the review of
> Tobias and Renato and provide a separate patch for the
> min-iter-count and a few test cases.
>
> I think that it would be a good idea to start a new thread and to
> discuss the best way to annotate loops in LLVM.

Is this sufficient to implement #pragma ivdep in clang?

 -Hal

>
> Thanks,
> Nadav
>
> On Jan 28, 2013, at 5:49 AM, Pekka Jääskeläinen
> <pekka.jaaskelainen at tut.fi> wrote:
>
> > Hi Renato,
> >
> > On 01/28/2013 03:22 PM, Renato Golin wrote:
> >> This seems an awfully specific check on a generic part of the code... If
> >
> > True. Perhaps the check is better encapsulated, e.g., in the Loop class?
> > Or, if there's such thing as a loop-carried data dependency analyzer,
> > the correct place could be there, as a trivial "no deps" analysis.
> >
> > > this metadata standard in any form? If this OpenCL specific? Does all
> >
> > This metadata is not standard in any form. Therefore the request
> > for comments. However, its meaning is generic, not OpenCL
> > specific at all. It specifies that the loop iterations can be
> > treated as independent, regardless of the memory operations the
> > body contains. Thus, the potential cross-iteration memory dependencies
> > can be considered a programming error.
> >
> > > OpenCL front-ends generate the same meta-data in that way? Etc...
> >
> > I have no knowledge of other OpenCL implementations than
> > pocl as I haven't seen their code.
> >
> >> It also converts the "min iteration count to vectorize" to a parameter so
> >> this can be controlled from the command line.
> >>
> >>
> >> Is this really necessary? Do you have use cases where this would make sense?
> >
> > Where a lower threshold could be useful? At least with loops having long
> > bodies and loops with outer loops that iterate the inner loop many times.
> >
> > In fact, shouldn't the default minimum be the minimum vector width of the
> > machine? The cost estimation routine should take care of the actual
> > profitability estimate?
> >
> >> I think you should send a test case with this patch, not separate.
> >
> > As soon as there's a consensus on the metadata format and where
> > the check shall reside in, I'll prepare a proper patch with
> > a vectorizer test case.
> >
> > Thanks for the comments so far,
> > --
> > Pekka
Renato Golin
2013-Jan-28 17:09 UTC
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
On 28 January 2013 16:51, Hal Finkel <hfinkel at anl.gov> wrote:

> Is this sufficient to implement #pragma ivdep in clang?

I'd assume so. It looks as though Loop->isParallel() or similar should account for both the OpenCL and #ivdep cases.

If we don't have #ivdep yet, this would be a good time to add it, at least setting the attribute in Clang for both.

Nadav, you mentioned ivdep a month ago, did that get any traction?

cheers,
--renato
Nadav Rotem
2013-Jan-28 17:20 UTC
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
> Is this sufficient to implement #pragma ivdep in clang?

I think so.
Pekka Jääskeläinen
2013-Jan-28 17:36 UTC
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
On 01/28/2013 06:51 PM, Hal Finkel wrote:

> Is this sufficient to implement #pragma ivdep in clang?

I'm not completely sure of this:

"Note: The proven dependencies that prevent vectorization are not ignored, only assumed dependencies are ignored."

http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_ivdep.htm

Thus, there's a slight difference. It cannot be used to disable dependency checking altogether (and just blame a sloppy programmer if there actually are dependencies); it only converts the "unknown alias" to "no alias". If there's a "yes" from the analyzer, it still prevents the vectorization. So, sort of a softened, programmer-friendlier version of the semantics.

The vagueness comes from the fact that whether a dependency can be "proven" or not depends on the intelligence of the dependency analysis implementation, doesn't it? Thus, #pragma ivdep with a non-existing loop dependence analyzer is equivalent to the semantics of the proposed metadata.

Also, it's a bit unclear what the real difference from #pragma parallel is:

http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_parallel.htm

It similarly states: "However, if dependencies are proven, they are not ignored." So conversely, if the compiler cannot prove a dependency for some reason, they *are* ignored?

OpenMP's 'omp for', on the other hand, can be used to mark a truly parallel loop where this metadata could be used if one wants to parallelize those loops using a finer-granularity mechanism than threads.

--
Pekka
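To make the distinction above concrete, here is a minimal C++ sketch of the two semantics as a decision table. It is only an illustration, not code from the patch or from LLVM: `DepResult`, `LoopHint` and `mayVectorize` are hypothetical names, and the three-valued analyzer answer is an assumption that mirrors the "proven" / "assumed" wording of the Intel documentation quoted above.

```cpp
// Hypothetical three-valued answer of a loop dependence analysis.
enum class DepResult { NoDependence, Unknown, Proven };

// Hypothetical annotation kinds: no hint, ivdep-like semantics
// ("unknown" becomes "no dependence"), and the proposed parallel-loop
// metadata (iterations are independent by declaration).
enum class LoopHint { None, IvdepLike, Parallel };

// Returns true if memory dependences do not block vectorization.
bool mayVectorize(LoopHint hint, DepResult dep) {
  switch (hint) {
  case LoopHint::Parallel:
    // Any cross-iteration dependence is a programming error,
    // so the analyzer's answer is not consulted at all.
    return true;
  case LoopHint::IvdepLike:
    // Only assumed (unknown) dependences are ignored; a proven
    // dependence still prevents vectorization.
    return dep != DepResult::Proven;
  case LoopHint::None:
    // Without a hint, the loop must be proven free of dependences.
    return dep == DepResult::NoDependence;
  }
  return false;
}
```

With an analyzer that can only ever answer `DepResult::Unknown`, `LoopHint::IvdepLike` and `LoopHint::Parallel` give the same result, which is the equivalence noted above. The two differ only when the analyzer reports `DepResult::Proven`.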
Pekka Jääskeläinen
2013-Jan-28 17:58 UTC
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
On 01/28/2013 07:36 PM, Pekka Jääskeläinen wrote:

> If there's a "yes" from the analyzer it still prevents the vectorization.
> So, sort of a softened programmer-friendlier version of the semantics.

That said, I cannot think of a case where it would *harm* if the dependency analyzer, when it can actually prove a dependency, serializes the code. Thus, the same metadata can be used in both cases, if one doesn't care about the compilation time possibly wasted on the unnecessary dependency checking.

--
Pekka
Hal Finkel
2013-Jan-28 18:00 UTC
[LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
----- Original Message -----
> From: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Nadav Rotem" <nrotem at apple.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Monday, January 28, 2013 11:36:21 AM
> Subject: Re: [LLVMdev] [PATCH] parallel loop awareness to the LoopVectorizer
>
> On 01/28/2013 06:51 PM, Hal Finkel wrote:
> > Is this sufficient to implement #pragma ivdep in clang?
>
> I'm not completely sure of this:
>
> "Note: The proven dependencies that prevent vectorization are not ignored,
> only assumed dependencies are ignored."
>
> http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_ivdep.htm
>
> Thus, there's a slight difference. It cannot be used to disable dependency
> checking altogether (and just blame a sloppy programmer if there actually
> are dependencies), but it just converts the "unknown alias" to "no alias".
> If there's a "yes" from the analyzer it still prevents the vectorization.
> So, sort of a softened programmer-friendlier version of the semantics.
>
> The vagueness comes from that it depends on the intelligence
> of the dependency analysis implementation whether a dependency can be "proven"
> or not, doesn't it? Thus, #pragma ivdep with a non-existing
> loop dependence analyzer is equivalent to the semantics of the proposed
> metadata.

And the user has no way of knowing which dependencies are proven and which are assumed, right? It seems like the user just needs to assume that nothing is proven ;) Nevertheless, based on this, we probably do need something with slightly weaker semantics for ivdep.

> Also, it's a bit unclear what is the real difference to the #pragma parallel:
> http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/cref_cls/common/cppref_pragma_parallel.htm
>
> It similarly states: "However, if dependencies are proven, they are not
> ignored." So conversely, if the compiler cannot prove a dependency for
> some reason, they *are* ignored?

Interesting.

> OpenMP's 'omp for', on the other hand, can be used to mark a truly parallel
> loop where this metadata could be used if one wants to parallelize those
> loops using a finer-granularity mechanism than threads.

Agreed; we should make sure to incorporate this into the upcoming OpenMP support. The loops will be outlined, but the outlined pieces can then be marked with this 'parallel' metadata.

Thanks again,
Hal

> --
> Pekka
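As a rough illustration of the outlining idea mentioned above, the following C++ sketch splits a loop into chunks that are processed by a separate function. It is not how Clang's OpenMP support is implemented; `outlinedChunk` and `runInChunks` are hypothetical names, the chunks run sequentially here instead of on worker threads, and `chunk` is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical outlined body: handles iterations [lo, hi) of the
// original loop. Each iteration reads and writes only element i, so
// the iterations are independent. This inner loop is the piece that
// could carry the 'parallel' annotation and be vectorized.
void outlinedChunk(std::vector<float> &out, const std::vector<float> &in,
                   std::size_t lo, std::size_t hi) {
  for (std::size_t i = lo; i < hi; ++i)
    out[i] = 2.0f * in[i] + 1.0f;
}

// Stand-in for a runtime that would hand chunks to worker threads.
// Here the chunks simply run one after another. Assumes chunk > 0
// and out.size() >= in.size().
void runInChunks(std::vector<float> &out, const std::vector<float> &in,
                 std::size_t chunk) {
  for (std::size_t lo = 0; lo < in.size(); lo += chunk) {
    std::size_t hi = lo + chunk < in.size() ? lo + chunk : in.size();
    outlinedChunk(out, in, lo, hi);
  }
}
```

The point of the sketch is that thread-level parallelism (the chunking) and the finer-grained parallelism inside each chunk are separate layers, so the annotation on the inner loop remains useful after outlining.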