thr3ads.net - search: "canvectorizeinstrs"

2017 Apr 14

Separate LoopVectorize LLVM pass

Hello. I am trying to build my own copy of LoopVectorize.cpp as a separate pass, outside the LLVM trunk, as described in http://llvm.org/docs/CMake.html#embedding-llvm-in-your-project. Has anybody tried something like this? Near the end of the .cpp file I added: /* this line seems to be required - it allows running this pass as an embedded pass by giving opt -my-loop-vectorize

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org> wrote:
> LV: We don't allow storing to uniform addresses

This triggers because the variable was not recognized as a reduction variable in canVectorizeInstrs(), while canVectorizeMemory() did recognize that sum[q] is loop invariant. I'm guessing the nested loop was unrolled because of its low trip count and then removed, so it ended up as:

float foo( int start , int end , float * A )
{
  float sum[4] = {0.,0.,0.,0.};
  for (int i = start ; i < end...

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

I am trying my luck on this global reduction kernel:

float foo( int start , int end , float * A )
{
  float sum[4] = {0.,0.,0.,0.};
  for (int i = start ; i < end ; ++i ) {
    for (int q = 0 ; q < 4 ; ++q )
      sum[q] += A[i*4+q];
  }
  return sum[0]+sum[1]+sum[2]+sum[3];
}

LV: Checking a loop in "foo"
LV: Found a loop: for.cond1
LV: Found an induction variable.
LV: We
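The diagnostic discussed in this thread, "We don't allow storing to uniform addresses", concerns stores whose address is the same on every iteration of the loop being vectorized. The sketch below is only an illustration of that distinction, not code from the thread: foo_array restates the posted kernel, and foo_scalars is a hypothetical rewrite (the name and the rewrite are assumptions) that keeps the four partial sums in local scalars so that no store to an array element remains in the source loop. Whether a particular LLVM version then vectorizes it is not something this sketch claims.

```c
/* Kernel as posted in the thread: four partial sums kept in an array.
   With respect to the i-loop, the address of each sum[q] does not
   depend on i, so a store to it targets a loop-invariant ("uniform")
   address. */
float foo_array(int start, int end, const float *A)
{
    float sum[4] = {0.f, 0.f, 0.f, 0.f};
    for (int i = start; i < end; ++i) {
        for (int q = 0; q < 4; ++q)
            sum[q] += A[i * 4 + q];
    }
    return sum[0] + sum[1] + sum[2] + sum[3];
}

/* Hypothetical rewrite (an assumption, not from the thread): the inner
   loop is unrolled by hand and each partial sum is a local scalar that
   is accumulated across iterations, which is the usual shape of a
   scalar reduction. */
float foo_scalars(int start, int end, const float *A)
{
    float s0 = 0.f, s1 = 0.f, s2 = 0.f, s3 = 0.f;
    for (int i = start; i < end; ++i) {
        s0 += A[i * 4 + 0];
        s1 += A[i * 4 + 1];
        s2 += A[i * 4 + 2];
        s3 += A[i * 4 + 3];
    }
    return s0 + s1 + s2 + s3;
}
```

Both functions add the same values in the same order, so they should return the same result for the same input under default floating-point settings; only the way the partial sums are stored differs.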

[LLVMdev] loop vectorizer and storing to uniform addresses

2013 Nov 08

...wrote:
> On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org> wrote:
>
> > LV: We don't allow storing to uniform addresses
>
> This is triggering because it didn't recognize as a reduction variable
> during the canVectorizeInstrs() but did recognize that sum[q] is loop
> invariant in canVectorizeMemory().
>
> I'm guessing the nested loop was unrolled because of the low
> trip-count, and removed, so it ended up as:
>
> float foo( int start , int end , float * A )
> {
> float sum[4] = {0.,0.,0...

Query on unswitching + vectorization

2018 May 14

* Looks like some sort of pass ordering issue; it will vectorize if indvars runs sometime between loop unswitch and the vectorizer.

That insight is helpful. I scheduled induction variable canonicalization (indvars) before loop vectorization and got the loop vectorized. The indvars pass depends heavily on SCEV. If there is a scalar like tmp that is of real type, we may not be able to get the

Missing vectorization of loop due to load late in the loop

2020 May 05

Hi, TL;DR: A loop doesn't get vectorized due to the interaction of loop-rotate, licm and instcombine. What to do about it? Full story: In the benchmarks for our out-of-tree target we have a case that we would like to get vectorized, but currently it isn't. I've done some digging to see why and have some kind of idea what prevents it, but I don't know what the best way to fix
