Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] LLVM Loop Vectorizer puzzle"
2013 May 23
0
[LLVMdev] LLVM Loop Vectorizer puzzle
Hi,
The TinyTripCountVectorThreshold only applies to loops with a known (constant) trip count. If a loop has a trip count below this value we don’t attempt to vectorize the loop. The loop below has an unknown trip count.
Once we decide to vectorize a loop, we emit code to check whether we can execute one iteration of the vectorized body. This is the code quoted below.
On May 22, 2013, at 10:23
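A rough sketch of the two checks described in the reply above, not the actual LoopVectorize.cpp code. The names `attemptVectorization`, `splitIterations`, and `kTinyTripCountThreshold` are hypothetical; the value 16 is the threshold quoted in another message in this listing, and using 0 to mean "trip count not a known constant" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed threshold value (16 is quoted elsewhere in this listing).
constexpr unsigned kTinyTripCountThreshold = 16;

// Compile-time decision: only a *known* constant trip count can be "tiny".
// constantTripCount == 0 models "not a known constant" in this sketch.
inline bool attemptVectorization(unsigned constantTripCount) {
    const bool known = constantTripCount > 0u;
    return !(known && constantTripCount < kTinyTripCountThreshold);
}

// Run-time guard: the vector body runs only if at least one full
// vector iteration fits; otherwise every iteration stays scalar.
struct IterationSplit {
    std::uint64_t vectorIterations;
    std::uint64_t scalarIterations;
};

inline IterationSplit splitIterations(std::uint64_t n, std::uint64_t vf) {
    assert(vf > 0 && "vectorization factor must be positive");
    if (n < vf)
        return {0, n};          // guard fails: run the scalar loop only
    return {n / vf, n % vf};    // vector body plus scalar remainder
}
```

With an unknown trip count the first check lets the loop through, and the decision about whether the vector body executes is deferred to the run-time guard.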
2013 May 23
2
[LLVMdev] LLVM Loop Vectorizer puzzle
Hi,
Just from personal interest, is there a canonical way in IR+metadata to
express "This small constant trip-count loop is desired to be converted
into a sequence of vector operations directly"? I.e., mapping a 4-element i32
loop into a linear sequence of <4 x i32> operations. Obviously this may not
always be a win, but I'm just wondering if there's a way to communicate
2013 May 23
0
[LLVMdev] LLVM Loop Vectorizer puzzle
----- Original Message -----
> 
> 
> 
> Hi,
> 
> Just from personal interest, is there a canonical way in IR+metadata
> to express "This small constant trip-count loop is desired to be
> converted into a sequence of vector operations directly"? I.e.,
> mapping a 4-element i32 loop into a linear sequence of <4 x i32>
> operations. Obviously this may not
2013 May 20
2
[LLVMdev] Polly issue
Hi,
     When I test "matmul" in the polly directory, I get the following 
performance data:
//===============================
--> 12. Compare the runtime of the executables
time ./matmul.normalopt.exe
0:23.53 real, 23.48 user, 0.00 sys
time ./matmul.polly.interchanged.exe
0:22.86 real, 22.82 user, 0.01 sys
time ./matmul.polly.interchanged+tiled.exe
0:22.87 real, 22.83 user, 0.00 sys
2013 Dec 02
1
[LLVMdev] Are 7th LLVM Developer meeting's PPTs available?
Hi, everyone:
     The 7th LLVM developer meeting was held on November 6-7, 2013. I 
can't find any information other than the abstracts. Any help?
     Thanks.
Regards,
maxs
2013 Jun 03
1
[LLVMdev] LLVM Loop Vectorizer is enabled by default???
In http://llvm.org/docs/Vectorizers.html, it says "LLVM’s Loop
Vectorizer is now enabled by default for -O3". But when I run the following
command: opt -O3 -debug-pass=Arguments test.ll -o /dev/null, I can't see
the "loop-vectorize" option in the result. Any advice?
     My opt version is:
=====================================
$ opt --version
LLVM (http://llvm.org/):
  LLVM
2013 Sep 27
2
[LLVMdev] Trip count and Loop Vectorizer
Hi Nadav,
Thanks for the response. I forgot to mention that there is an upper limit of 16 for the trip count check:
TinyTripCountVectorThreshold = 16;
if (TC > 0u && TC < TinyTripCountVectorThreshold)
So right now, for any loop with a trip count of 0, or with a value >= 16, LV will unroll. With the change to the lower bound, it will also include the loop with 0 trip count.
SCEV returns 0
2013 Sep 27
0
[LLVMdev] Trip count and Loop Vectorizer
Hi Sriram, 
Thanks for performing this analysis. The problem here, both for memcpy and the vectorizer, is that we can’t predict the size of “n”, even though the only use of “n” is for the loop bound for the alloca [4 x [8 x i32]]. If you change the unroll condition to TC >= 0 then you will disable loop unrolling for all loops because getSmallConstantTripCount returns an unsigned number. You
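To illustrate the point about unsigned values, here is a minimal sketch. It is not the LoopVectorize.cpp source: the function names are hypothetical, the threshold of 16 is the value quoted in this thread, and treating a trip count of 0 as "unknown" is an assumption of the sketch.

```cpp
#include <cassert>

constexpr unsigned kThreshold = 16;  // value quoted in the thread

// Condition as quoted: true only for a known (non-zero) trip count
// that is below the threshold.
inline bool tinyAsQuoted(unsigned tc) {
    return tc > 0u && tc < kThreshold;
}

// Variant with the lower bound changed to ">= 0". For an unsigned value
// that comparison is always true, so the condition reduces to the upper
// bound alone and no longer separates 0 ("unknown") from small counts.
inline bool tinyWithInclusiveLowerBound(unsigned tc) {
    return tc < kThreshold;
}

inline void selfCheck() {
    assert(!tinyAsQuoted(0));               // unknown count is not "tiny"
    assert(tinyWithInclusiveLowerBound(0)); // now it is treated as tiny
}
```

The only input on which the two predicates differ is 0, which is exactly the case where the trip count could not be determined.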
2013 Sep 27
2
[LLVMdev] Trip count and Loop Vectorizer
Hi,
I am trying to get a small loop to *not vectorize* for cases where it doesn't make sense. For instance, this loop:
void foo(int a[4][8], int n)
{
    int b[4][8];
    for(int i = 0; i < 4; i++) {
        for(int j = 0; j < n; j++) {
            a[i][j] = b[i][j];
        }
    }
}
* Has a maximum of 8 ints to copy. LLVM tries to use memcpy for the inner loop. It is not helpful to perform
2013 Nov 06
1
[LLVMdev] loop vectorizer
On 06/11/13 08:54, Arnold wrote:
>
>
> Sent from my iPhone
>
> On Nov 5, 2013, at 7:39 PM, Frank Winter <fwinter at jlab.org> wrote:
>
>> Good that you bring this up. I still have no solution to this 
>> vectorization problem.
>>
>> However, I can rewrite the code and insert a second loop which 
>>
2012 Dec 21
2
[LLVMdev] assert in InnerLoopVectorizer::createEmptyLoop
I am seeing an assert when I compile the attached program with clang:
$ clang -fno-strict-aliasing -target mips64el-unknown-linux -O3
-fomit-frame-pointer -S test1.c -o test1.ll -emit-llvm
It asserts when LoopVectorize.cpp:506 is executed. It looks like it is
complaining because it is trying to zero-extend an i64 (type
Count->getType() returns i64) to an i32 (IdxTy).
if (Count->getType()
2013 Nov 06
0
[LLVMdev] loop vectorizer
Sent from my iPhone
> On Nov 5, 2013, at 7:39 PM, Frank Winter <fwinter at jlab.org> wrote:
> 
> Good that you bring this up. I still have no solution to this vectorization problem.
> 
> However, I can rewrite the code and insert a second loop which eliminates the 'urem' and 'div' instructions in the index calculations. In this case, the inner loop's trip
2013 Nov 06
3
[LLVMdev] loop vectorizer
Good that you bring this up. I still have no solution to this 
vectorization problem.
However, I can rewrite the code and insert a second loop which 
eliminates the 'urem' and 'div' instructions in the index calculations. 
In this case, the inner loop's trip count would be equal to the SIMD 
length and the loop vectorizer ignores the loop. Unrolling the loop and 
SLP is not an
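The kind of rewrite described above might look roughly like the following. This is a hypothetical illustration, not the code from the thread: the function names, the element-wise scaling, and `kSimdLen = 4` are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kSimdLen = 4;  // assumed SIMD length

// Flattened form: a single loop whose index is split with '/' and '%'.
inline void scaleFlat(std::vector<float>& data, std::size_t rows, float s) {
    assert(data.size() >= rows * kSimdLen);
    for (std::size_t i = 0; i < rows * kSimdLen; ++i) {
        std::size_t r = i / kSimdLen;   // 'div' in the index calculation
        std::size_t c = i % kSimdLen;   // 'urem' in the index calculation
        data[r * kSimdLen + c] *= s;
    }
}

// Rewritten form: a second (inner) loop replaces the division and the
// remainder. Its trip count is exactly kSimdLen.
inline void scaleNested(std::vector<float>& data, std::size_t rows, float s) {
    assert(data.size() >= rows * kSimdLen);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < kSimdLen; ++c)
            data[r * kSimdLen + c] *= s;
}
```

Both functions compute the same result; the difference is that the inner loop of the nested form has a small constant trip count equal to the SIMD length, which is the situation the message describes.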
2008 Sep 09
1
puzzle about contrasts
Hi,
I'm trying to redefine the contrasts for a linear model.
With a 2 level factor, x, with levels A and B, a two level
factor outputs A and B - A from an lm fit, say
lm(y ~ x). I would like to set the contrasts so that
the coefficients output are -0.5 (A + B) and B - A,
but I can't get the sign correct for the first coefficient
(Intercept).
Here is a toy example,
set.seed(12161952)
y
2012 Dec 21
0
[LLVMdev] assert in InnerLoopVectorizer::createEmptyLoop
Please file a bug for these types of issues.
On Fri, Dec 21, 2012 at 1:49 PM, Akira Hatanaka <ahatanak at gmail.com> wrote:
> I am seeing an assert when I compile the attached program with clang:
>
> $ clang -fno-strict-aliasing -target mips64el-unknown-linux -O3
> -fomit-frame-pointer -S test1.c -o test1.ll -emit-llvm
>
>
> It asserts when LoopVectorize.cpp:506 is
2013 May 23
0
[LLVMdev] LLVM Loop Vectorizer puzzle
On Thu, May 23, 2013 at 12:02 PM, Nadav Rotem <nrotem at apple.com> wrote:
>
> On May 23, 2013, at 8:52 AM, "Redmond, Paul" <paul.redmond at intel.com>
> wrote:
>
>
> !0 = metadata !{ metadata !1, metadata !2 }
> !1 = metadata !{ metadata !"llvm.loop.parallel" }
> !2 = metadata !{ metadata !"llvm.vectorization.vector_width", i32 8
2013 May 23
3
[LLVMdev] LLVM Loop Vectorizer puzzle
On May 23, 2013, at 8:52 AM, "Redmond, Paul" <paul.redmond at intel.com> wrote:
> 
> !0 = metadata !{ metadata !1, metadata !2 }
> !1 = metadata !{ metadata !"llvm.loop.parallel" }
> !2 = metadata !{ metadata !"llvm.vectorization.vector_width", i32 8 }
> 
> I'm not even sure you would need the llvm.loop.parallel anymore since the
2013 May 23
2
[LLVMdev] LLVM Loop Vectorizer puzzle
On May 23, 2013, at 10:37 AM, Cameron McInally <cameron.mcinally at nyu.edu> wrote:
> In all fairness, I do not believe that ivdep is an ICC-specific pragma. There are many compilers that support ivdep and lots of legacy (and modern) codes that benefit from it. Seems silly, to me at least, to reinvent the wheel. 
Hi Cameron, 
The history of the ivdep pragma is fascinating. I did not
2013 Jul 05
0
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
On 07/04/2013 01:39 PM, Stéphane Letz wrote:
> Hi,
>
> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the generated C code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generated as LLVM IR cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some
2013 Jul 04
3
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
Hi,
Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the generated C code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generated as LLVM IR cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some information that is needed by the vectorization passes to