search for: canvectorizeinstrs

Displaying 6 results from an estimated 6 matches for "canvectorizeinstrs".

2017 Apr 14
2
Separate LoopVectorize LLVM pass
Hello. I am trying to create my own LoopVectorize.cpp pass as a separate pass from the LLVM trunk, as described in http://llvm.org/docs/CMake.html#embedding-llvm-in-your-project. Did anybody try something like this? I added close to the end of the .cpp file: /* this line seems to be required - it allows to run this pass as an embedded pass by giving opt -my-loop-vectorize
2013 Nov 08
0
[LLVMdev] loop vectorizer and storing to uniform addresses
On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org> wrote: > LV: We don't allow storing to uniform addresses > This is triggering because it didn't recognize as a reduction variable during the canVectorizeInstrs() but did recognize that sum[q] is loop invariant in canVectorizeMemory(). I'm guessing the nested loop was unrolled because of the low trip-count, and removed, so it ended up as: float foo( int start , int end , float * A ) { float sum[4] = {0.,0.,0.,0.}; for (int i = start ; i < end...
2013 Nov 08
3
[LLVMdev] loop vectorizer and storing to uniform addresses
I am trying my luck on this global reduction kernel: float foo( int start , int end , float * A ) { float sum[4] = {0.,0.,0.,0.}; for (int i = start ; i < end ; ++i ) { for (int q = 0 ; q < 4 ; ++q ) sum[q] += A[i*4+q]; } return sum[0]+sum[1]+sum[2]+sum[3]; } LV: Checking a loop in "foo" LV: Found a loop: for.cond1 LV: Found an induction variable. LV: We
2013 Nov 08
1
[LLVMdev] loop vectorizer and storing to uniform addresses
...wrote: > On 7 November 2013 17:18, Frank Winter <fwinter at jlab.org > <mailto:fwinter at jlab.org>> wrote: > > LV: We don't allow storing to uniform addresses > > > This is triggering because it didn't recognize as a reduction variable > during the canVectorizeInstrs() but did recognize that sum[q] is loop > invariant in canVectorizeMemory(). > > I'm guessing the nested loop was unrolled because of the low > trip-count, and removed, so it ended up as: > > float foo( int start , int end , float * A ) > { > float sum[4] = {0.,0.,0...
2018 May 14
1
Query on unswitching + vectorization
* Looks like some sort of pass ordering issue; it will vectorize if indvars runs sometime between loop unswitch and the vectorizer. That insight is helpful. I scheduled Canonicalization of induction variable before loop vectorization and could get the loop vectorized. The indvars are heavily dependent on SCEV. If there a scalar like tmp which is of real type, we may not be able to get the
2020 May 05
2
Missing vectorization of loop due to load late in the loop
Hi, TL;DR: A loop doesn't get vectorized due to the interaction of loop- rotate, licm and instcombine. What to do about it? Full story: In the benchmarks for our out-of-tree target we have a case that we would like to get vectorized, but currently it isn't. I've done some digging to see why and have some kind of idea what prevents it, but I don't know what the best way to fix