Anadi Mishra
2013-Apr-17 02:55 UTC
[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation
Hello,

I am trying to vectorize the following loop but the vectorizer says: "Found a possible write-write reorder" and does not vectorize. Why?

    for (j=0; j < 8; j++)
    {
        jj = j << 3;
        m2[j][0] = diff[jj  ] + diff[jj+4];
        m2[j][1] = diff[jj+1] + diff[jj+5];
        m2[j][2] = diff[jj+2] + diff[jj+6];
        m2[j][3] = diff[jj+3] + diff[jj+7];
        m2[j][4] = diff[jj  ] - diff[jj+4];
        m2[j][5] = diff[jj+1] - diff[jj+5];
        m2[j][6] = diff[jj+2] - diff[jj+6];
        m2[j][7] = diff[jj+3] - diff[jj+7];
    }

Another question is regarding the isannotatedparallel() check. Is there a way to make clang (or any other frontend) generate parallel annotated IR?

Best,
Anadi.
Tobias Grosser
2013-Apr-17 06:08 UTC
[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation
On 04/17/2013 04:55 AM, Anadi Mishra wrote:
> Hello,
>
> I am trying to vectorize the following loop but the vectorizer says:
> "Found a possible write-write reorder" and does not vectorize.
> Why?

To my knowledge, the dependence analysis in the loop vectorizer is not yet able to prove the absence of dependences here.

> for (j=0; j < 8; j++)
> {
>     jj = j << 3;
>     m2[j][0] = diff[jj  ] + diff[jj+4];
>     m2[j][1] = diff[jj+1] + diff[jj+5];
>     m2[j][2] = diff[jj+2] + diff[jj+6];
>     m2[j][3] = diff[jj+3] + diff[jj+7];
>     m2[j][4] = diff[jj  ] - diff[jj+4];
>     m2[j][5] = diff[jj+1] - diff[jj+5];
>     m2[j][6] = diff[jj+2] - diff[jj+6];
>     m2[j][7] = diff[jj+3] - diff[jj+7];
> }
>
> Another question is regarding the isannotatedparallel() check. Is
> there a way to make clang (or any other frontend) to generate parallel
> annotated IR?

Did you try putting '#pragma ivdep' before the loop?

Tobias

P.S.: Please attach a full C file as a test case. The way the different data structures are declared may influence the analysis.
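The remark about proving the absence of dependences can be made concrete with a small check. The following is only a sketch under an assumption: that m2 is declared as a contiguous 8x8 array (the real declarations are in the attachment, which is not reproduced in the thread). The function name stores_are_disjoint is hypothetical. It counts how often each element of m2 would be stored to by the loop. Under that assumption every element is written exactly once, which suggests the stores themselves do not overlap and the question is what the analysis can prove. If m2 were instead an array of row pointers, rows could alias and this argument would not hold.

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 8, COLS = 8 };

/* Assumes m2 is a contiguous int[8][8], so m2[j][k] sits at flat
 * offset 8*j + k. Returns 1 if the loop from the original post stores
 * to every element exactly once, 0 otherwise. */
int stores_are_disjoint(void)
{
    int hits[ROWS * COLS];
    memset(hits, 0, sizeof hits);

    for (int j = 0; j < ROWS; j++) {
        for (int k = 0; k < COLS; k++) {
            int off = j * COLS + k;          /* offset of m2[j][k] */
            assert(off >= 0 && off < ROWS * COLS);
            hits[off]++;                     /* one store per statement */
        }
    }

    for (int i = 0; i < ROWS * COLS; i++)
        if (hits[i] != 1)
            return 0;                        /* element missed or hit twice */
    return 1;
}
```

Calling stores_are_disjoint() from a test program and checking that it returns 1 confirms the access pattern for the assumed layout only.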
Pekka Jääskeläinen
2013-Apr-17 10:31 UTC
[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation
Hi Anadi Mishra,

On 04/17/2013 05:55 AM, Anadi Mishra wrote:
> Another question is regarding the isannotatedparallel() check. Is
> there a way to make clang (or any other frontend) to generate parallel
> annotated IR?

Paul Redmond was adding support for "#pragma ivdep" that would use the parallel metadata, but I haven't been able to follow its progress lately.

FWIW, pocl's OpenCL kernel compiler adds the metadata to work-item loops. That is, if your loop body was an OpenCL kernel with each work-item executing a single iteration, it *might* get "horizontally vectorized" using the loop vectorizer if you use pocl's 'loopvec' work group method and if the memory access pattern is suitable.

This is quite fresh code which I'm still optimizing, but I've already managed to autovectorize some work groups using it.

BR,
--
Pekka
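As a purely conceptual illustration of a work-item loop, here is a plain C sketch. It is not pocl's code and does not use any pocl or OpenCL API: the names row_kernel and run_work_group, and the parameter layout, are hypothetical. The idea it shows is that the kernel body handles one iteration (one row), and a surrounding loop runs it once per work-item. That surrounding loop is the one a compiler could annotate as parallel and hand to the loop vectorizer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel body: the work of a single work-item, which is
 * one row (one iteration of j) of the loop from the original post. */
static void row_kernel(size_t j, const int *diff, int (*m2)[8])
{
    size_t jj = j << 3;
    for (int k = 0; k < 4; k++) {
        m2[j][k]     = diff[jj + k] + diff[jj + k + 4];
        m2[j][k + 4] = diff[jj + k] - diff[jj + k + 4];
    }
}

/* Hypothetical work-group driver: one loop iteration per work-item.
 * Iterations are independent by construction of the kernel model. */
void run_work_group(size_t local_size, const int *diff, int (*m2)[8])
{
    assert(local_size <= 8);   /* m2 has 8 rows in this sketch */
    for (size_t lid = 0; lid < local_size; lid++)
        row_kernel(lid, diff, m2);
}
```

With diff filled with 0..63 and a local size of 8, run_work_group writes the same values into m2 as the original loop would.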
Redmond, Paul
2013-Apr-17 14:58 UTC
[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation
On 2013-04-17, at 6:31 AM, Pekka Jääskeläinen wrote:
>
> Paul Redmond was adding support for "#pragma ivdep" that would use the
> parallel metadata, but I haven't been able to follow its progress lately.
>

I'm still working on it--just slowly :P

I'm hoping to have some more patches in the next week or two.

paul
Anadi Mishra
2013-Apr-17 16:29 UTC
[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation
On Wed, Apr 17, 2013 at 8:08 AM, Tobias Grosser <tobias at grosser.es> wrote:
> On 04/17/2013 04:55 AM, Anadi Mishra wrote:
>>
>> Hello,
>>
>> I am trying to vectorize the following loop but the vectorizer says:
>> "Found a possible write-write reorder" and does not vectorize.
>> Why?
>
> To my knowledge, the dependence analysis in the loop vectorizer is not yet
> able to prove the absence of dependences here.

While that is true, the debug message printed by the vectorizer is misleading, and it should not be.

>> Another question is regarding the isannotatedparallel() check. Is
>> there a way to make clang (or any other frontend) to generate parallel
>> annotated IR?
>
> Did you try to put '#pragma ivdep' before the loop.

Thanks for the suggestion, it worked using the latest llvm from svn. Thanks Pekka and Paul for your inputs.

> Tobias
>
> P.S.: Please attach a full C file as test case. The way the different data
> structures are declared my influence the analysis.

Please find attached the example.

Best,
Anadi.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: example.c
Type: text/x-csrc
Size: 611 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130417/a101a71e/attachment.c>
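For readers without the attachment, a self-contained test case along these lines might look like the sketch below. The declarations of diff and m2 and the function names are assumptions, not the contents of example.c. The thread uses the spelling '#pragma ivdep'. Whether a given compiler honors that spelling varies, so the sketch uses two spellings that exist in later compilers: `#pragma clang loop vectorize(assume_safety)` for Clang and `#pragma GCC ivdep` for GCC. Both ask the compiler to assume the loop has no loop-carried dependences, and neither guarantees that the loop is actually vectorized.

```c
#include <assert.h>

/* Assumed declarations; the thread's example.c is not reproduced here. */
int diff[64];
int m2[8][8];

/* The loop from the original post, with a "trust me" annotation. */
void butterfly_rows(void)
{
    int j, jj;
#if defined(__clang__)
#pragma clang loop vectorize(assume_safety)
#elif defined(__GNUC__)
#pragma GCC ivdep
#endif
    for (j = 0; j < 8; j++) {
        jj = j << 3;
        m2[j][0] = diff[jj  ] + diff[jj+4];
        m2[j][1] = diff[jj+1] + diff[jj+5];
        m2[j][2] = diff[jj+2] + diff[jj+6];
        m2[j][3] = diff[jj+3] + diff[jj+7];
        m2[j][4] = diff[jj  ] - diff[jj+4];
        m2[j][5] = diff[jj+1] - diff[jj+5];
        m2[j][6] = diff[jj+2] - diff[jj+6];
        m2[j][7] = diff[jj+3] - diff[jj+7];
    }
}

/* Fills diff with 0..63, runs the loop, and spot-checks two results. */
void check_butterfly_rows(void)
{
    for (int i = 0; i < 64; i++)
        diff[i] = i;
    butterfly_rows();
    assert(m2[0][0] == diff[0] + diff[4]);
    assert(m2[7][7] == diff[59] - diff[63]);
}
```

The annotation only removes the need for the compiler to prove independence; the result must be the same with or without it, which is what check_butterfly_rows spot-checks.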