similar to: [LLVMdev] Auto-vectorization and phi nodes


2013 Feb 19
2
[LLVMdev] Auto-vectorization and phi nodes
Hi Vesa, The pass IndVars changes the induction variables to allow SCEV to analyze them and enable other optimizations. This is the canonicalization phase. Later on, LSR lowers the canonicalized induction variables to induction variables that map nicely to the target's addressing modes. In many cases it can remove some of the induction variables. I suspect that the loop vectorizer does
2013 Feb 19
0
[LLVMdev] Auto-vectorization and phi nodes
----- Original Message ----- > From: "Vesa Norilo" <vnorilo at siba.fi> > To: llvmdev at cs.uiuc.edu > Sent: Tuesday, February 19, 2013 4:40:26 AM > Subject: [LLVMdev] Auto-vectorization and phi nodes > > Hi all, > > Sorry if this is a dumb or FAQ or the wrong list! > > I'm currently investigating LLVM vectorization of my generated code. > My
2013 Feb 19
0
[LLVMdev] Auto-vectorization and phi nodes
Hi Nadav and Hal and thanks for the help! To the best of my understanding, indvars doesn't complain and an induction variable is detected. However, the loop vectorizer says: LV: Checking a loop in "add_vector" LV: Found a loop: Loop LV: Found an induction variable. LV: Found an unidentified PHI. %a.ptr = phi float* [ %a, %Top ], [ %a.next, %Loop ] LV: Can't vectorize the
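For readers without the full IR (the snippet above is truncated), here is a rough source-level sketch of the kind of loop that produces a pointer-typed PHI like %a.ptr, next to an indexed form that carries only one integer induction variable. The name add_vector comes from the log; the signatures, the _ptr/_idx suffixes, and the assumption that the poster's generated code resembles the first form are all guesses for illustration. No claim is made here about which form a particular LLVM version will vectorize.

```cpp
#include <cstddef>

// Pointer-bumping form: each pointer is carried around the loop, so a
// straightforward lowering gives one pointer-typed PHI per pointer
// (comparable to the %a.ptr PHI in the log). Hypothetical signature.
void add_vector_ptr(const float* a, const float* b, float* out, std::size_t n) {
    const float* end = a + n;
    while (a != end) {
        *out++ = *a++ + *b++;
    }
}

// Indexed form: a single integer induction variable i; every address is
// computed from the unchanged base pointers plus i.
void add_vector_idx(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions compute the same result; the only difference is how the loop-carried state is expressed.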
2013 Feb 19
1
[LLVMdev] Auto-vectorization and phi nodes
On Feb 19, 2013, at 10:09 AM, Vesa Norilo <vnorilo at siba.fi> wrote: > Hi Nadav and Hal and thanks for the help! > > To the best of my understanding, indvars doesn't complain and an induction variable is detected. However, the loop vectorizer says: > > LV: Checking a loop in "add_vector" > LV: Found a loop: Loop > LV: Found an induction variable. >
2013 Nov 10
1
[LLVMdev] C++11 features in LLVM & Clang / bounding support for old host compilers
> Not everyone thinks that such a thing would be bonware [1]. > Regards, > Nate > [1] http://en.windows7sins.org/ Probably shouldn't feed this any more, but.. I work primarily with Visual Studio and LLVM myself and have absolutely no objection to dropping VS2010. In fact, I strongly support such a move, as the VC++ compiler has improved quite a bit recently, especially in the
2016 Feb 19
3
Metadata and compile time performance
Dear LLVMers, I’m investigating the response time of my JIT, and according to profiling, optimization takes 85% of the compile time, while the rest is being split evenly between the front-end and machine code generation. Much of the optimizer time is spent in various alias analysis passes. I’m happy with the generated code quality and wouldn’t like to lower the optimization level (O2). Would
2013 Oct 14
0
[LLVMdev] Vectorization of pointer PHI nodes
On 14 October 2013 18:15, Nadav Rotem <nrotem at apple.com> wrote: > 1. We have 4 stores to consecutive locations, but the last element is the > constant zero, and not an additional SUB. At the moment we don’t have > support for idempotence operations, but this is something that we should > add. > The fourth write is not necessary for GCC to vectorize it (nor was in the
2013 Oct 14
4
[LLVMdev] Vectorization of pointer PHI nodes
This is almost ideal for SLP vectorization, except for two problems: 1. We have 4 stores to consecutive locations, but the last element is the constant zero, and not an additional SUB. At the moment we don’t have support for idempotence operations, but this is something that we should add. 2. The values that we are subtracting come from 3 loads. We usually load 4 elements from memory, or
2013 Jul 05
2
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
On 5 Jul 2013, at 04:11, Tobias Grosser <tobias at grosser.es> wrote: > On 07/04/2013 01:39 PM, Stéphane Letz wrote: >> Hi, >> >> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The following IR implements the following nested loop: for (int i = start ; i < end ; ++i ) for (int p = 0 ; p < 4 ; ++p ) a[i*4+p] = b[i*4+p] + c[i*4+p]; define void @main(i64 %arg0, i64 %arg1, i1 %arg2, i64 %arg3, float* noalias %arg4, float* noalias %arg5, float* noalias %arg6) { entrypoint: br i1 %arg2, label %L0, label %L1 L0:
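Since the IR above is cut off, the nested loop quoted in the post can be written out as a compilable function, together with a flattened single-loop version that touches the same elements in the same order. This is only a sketch: the function names add_nested and add_flat are made up, 64-bit bounds are assumed from the i64 arguments in the IR signature, and the noalias attributes from the IR are not modelled.

```cpp
#include <cstdint>

// Nested form, following the loop quoted in the post: four consecutive
// elements are processed per outer iteration.
void add_nested(float* a, const float* b, const float* c,
                std::int64_t start, std::int64_t end) {
    for (std::int64_t i = start; i < end; ++i)
        for (int p = 0; p < 4; ++p)
            a[i * 4 + p] = b[i * 4 + p] + c[i * 4 + p];
}

// Flattened form: because i*4+p runs over every index from 4*start up to
// (but not including) 4*end, one loop over that range is equivalent.
void add_flat(float* a, const float* b, const float* c,
              std::int64_t start, std::int64_t end) {
    for (std::int64_t k = start * 4; k < end * 4; ++k)
        a[k] = b[k] + c[k];
}
```

The flattened version has a unit-stride access pattern and a single induction variable, which is the shape the thread title suggests the poster expected the vectorizer to find.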
2013 Jul 05
0
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
On Jul 5, 2013, at 9:50 AM, Stéphane Letz <letz at grame.fr> wrote: > > On 5 Jul 2013, at 04:11, Tobias Grosser <tobias at grosser.es> wrote: > >> On 07/04/2013 01:39 PM, Stéphane Letz wrote: >>> Hi, >>> >>> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or
2013 Nov 06
0
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The loop vectorizer relies on cleanup passes to be run after it: from Transforms/IPO/PassManagerBuilder.cpp: // Add the various vectorization passes and relevant cleanup passes for // them since we are no longer in the middle of the main scalar pipeline. MPM.add(createLoopVectorizePass(DisableUnrollLoops)); MPM.add(createInstructionCombiningPass());
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The instcombine pass cleans up a lot. Any idea why there are still shufflevector, insertelement, *and* bitcast (!!) etc. instructions left? The original loop is so clean, a textbook example I'd say. There is no need to shuffle anything. At least I don't see it. Frank vector.ph: ; preds = %L5 %broadcast.splatinsert1 = insertelement <4 x
2013 Nov 01
2
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
I am trying a setup where the one loop is rewritten as two loops. This avoids the 'rem' and 'div' instructions in the index calculation (which give the loop vectorizer a hard time). However, with this setup the loop vectorizer complains about a too small loop. LV: Checking a loop in "main" LV: Found a loop: L3 LV: Found a loop with a very small trip count. This loop
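The rewrite described above can be illustrated with a small sketch. The post does not show the actual index calculation, so the padded-row layout used here (rows of width elements stored stride apart) and the names scale_single and scale_split are purely hypothetical; the point is only how splitting one loop into two removes the division and remainder.

```cpp
#include <cstddef>

// Single loop: row and column are recovered from the flat counter k with
// a division and a remainder on every iteration.
void scale_single(float* data, std::size_t rows, std::size_t width,
                  std::size_t stride, float s) {
    for (std::size_t k = 0; k < rows * width; ++k) {
        std::size_t row = k / width;
        std::size_t col = k % width;
        data[row * stride + col] *= s;
    }
}

// Two loops: row and col are separate induction variables, so no div or
// rem is needed. The inner loop now runs only width times.
void scale_split(float* data, std::size_t rows, std::size_t width,
                 std::size_t stride, float s) {
    for (std::size_t row = 0; row < rows; ++row)
        for (std::size_t col = 0; col < width; ++col)
            data[row * stride + col] *= s;
}
```

The trade-off matches what the post reports: once the inner loop has a short, fixed trip count, the vectorizer may consider it too small to be worth vectorizing.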
2013 Oct 30
3
[LLVMdev] loop vectorizer
----- Original Message ----- > > > I ran the BB vectorizer as I guess this is the SLP vectorizer. No, while the BB vectorizer is doing a form of SLP vectorization, there is a separate SLP vectorization pass which uses a different algorithm. You can pass -vectorize-slp to opt. -Hal > > BBV: using target information > BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_...
2013 Oct 30
2
[LLVMdev] loop vectorizer
The debug messages are misleading. They should read “trying to vectorize a list of …”; The problem is that the SCEV analysis is unable to detect that C[ir0] and C[ir1] are consecutive. Is this loop from an important benchmark? Thanks, Nadav On Oct 30, 2013, at 11:13 AM, Frank Winter <fwinter at jlab.org> wrote: > The SLP vectorizer apparently did something in the prologue of the
2013 Oct 28
2
[LLVMdev] loop vectorizer says Bad stride
Verifying function running passes ... LV: Checking a loop in "bar" LV: Found a loop: L0 LV: Found an induction variable. LV: We need to do 0 pointer comparisons. LV: Checking memory dependencies LV: Bad stride - Not an AddRecExpr pointer %13 = getelementptr float* %arg2, i32 %1 SCEV: ((4 * (sext i32 {(256 + %arg0),+,1}<nw><%L0> to i64)) + %arg2) LV: Src Scev: {((4 * (sext
2018 May 14
1
Query on unswitching + vectorization
* Looks like some sort of pass ordering issue; it will vectorize if indvars runs sometime between loop unswitch and the vectorizer. That insight is helpful. I scheduled Canonicalization of induction variable before loop vectorization and could get the loop vectorized. The indvars are heavily dependent on SCEV. If there is a scalar like tmp which is of real type, we may not be able to get the
2017 Aug 10
2
PHI nodes and connected ICMp
Hello, I have one more question about how phi nodes and their corresponding ICmp instructions are associated. maybe it is simple, but at first I thought that we always compare against one of incoming value. Is it true that I can have only two cases: %indvars.iv = phi i64 [ %indvars.iv.next, %1 ], [ 0, %0 ] ... %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %exitcond = icmp eq i64
2013 Nov 01
0
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
In the case when coming from C it was probably the loop unroller and SLP vectorizer which vectorized the code. Potentially I could do the same in the IR. However, the loop body that is generated in the IR can get very large. Thus, the loop unroller will refuse to unroll the loop in a large number of (important) cases. Isn't there a way to convince the loop vectorizer that it should