thr3ads.net - similar to: "[LLVMdev] Auto-vectorization and phi nodes"

[LLVMdev] Auto-vectorization and phi nodes

2013 Feb 19

Hi Vesa, The pass IndVars changes the induction variables to allow SCEV to analyze them and enable other optimizations. This is the canonicalization phase. Later on, LSR lowers the canonicalized induction variables to induction variables that map nicely to the target's addressing modes. In many cases it can remove some of the induction variables. I suspect that the loop vectorizer does

[LLVMdev] Auto-vectorization and phi nodes

2013 Feb 19

----- Original Message ----- > From: "Vesa Norilo" <vnorilo at siba.fi> > To: llvmdev at cs.uiuc.edu > Sent: Tuesday, February 19, 2013 4:40:26 AM > Subject: [LLVMdev] Auto-vectorization and phi nodes > > Hi all, > > Sorry if this is a dumb or FAQ or the wrong list! > > I'm currently investigating LLVM vectorization of my generated code. > My

[LLVMdev] Auto-vectorization and phi nodes

2013 Feb 19

Hi Nadav and Hal and thanks for the help! To the best of my understanding, indvars doesn't complain and an induction variable is detected. However, the loop vectorizer says: LV: Checking a loop in "add_vector" LV: Found a loop: Loop LV: Found an induction variable. LV: Found an unidentified PHI. %a.ptr = phi float* [ %a, %Top ], [ %a.next, %Loop ] LV: Can't vectorize the
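To make the distinction in the log above concrete, here is a minimal C sketch; it is not the original poster's code. It contrasts a loop that walks its arrays by bumping pointers (the kind of source that can end up as a pointer PHI such as `%a.ptr`) with an index-based loop driven by one integer induction variable. The function names and signatures are hypothetical.

```c
#include <stddef.h>

/* Hypothetical pointer-bumping form: each array is walked by its own
   pointer, so pointer values are carried from one iteration to the next. */
void add_vector_ptr(float *out, const float *a, const float *b, size_t n) {
    const float *end = a + n;
    while (a != end) {
        *out++ = *a++ + *b++;
    }
}

/* Hypothetical index-based form: a single integer induction variable,
   with every address computed from it. */
void add_vector_idx(float *out, const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions compute the same result; they differ only in how the loop-carried state is expressed.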

[LLVMdev] Auto-vectorization and phi nodes

2013 Feb 19

On Feb 19, 2013, at 10:09 AM, Vesa Norilo <vnorilo at siba.fi> wrote: > Hi Nadav and Hal and thanks for the help! > > To the best of my understanding, indvars doesn't complain and an induction variable is detected. However, the loop vectorizer says: > > LV: Checking a loop in "add_vector" > LV: Found a loop: Loop > LV: Found an induction variable. >

[LLVMdev] C++11 features in LLVM & Clang / bounding support for old host compilers

2013 Nov 10

> Not everyone thinks that such a thing would be bonware [1]. > Regards, > Nate > [1] http://en.windows7sins.org/ Probably shouldn't feed this any more, but.. I work primarily with Visual Studio and LLVM myself and have absolutely no objection to dropping VS2010. In fact, I strongly support such a move, as the VC++ compiler has improved quite a bit recently, especially in the

Metadata and compile time performance

2016 Feb 19

Dear LLVMers, I’m investigating the response time of my JIT, and according to profiling, optimization takes 85% of the compile time, while the rest is being split evenly between the front-end and machine code generation. Much of the optimizer time is spent in various alias analysis passes. I’m happy with the generated code quality and wouldn’t like to lower the optimization level (O2). Would

[LLVMdev] Vectorization of pointer PHI nodes

2013 Oct 14

On 14 October 2013 18:15, Nadav Rotem <nrotem at apple.com> wrote: > 1. We have 4 stores to consecutive locations, but the last element is the > constant zero, and not an additional SUB. At the moment we don’t have > support for idempotence operations, but this is something that we should > add. > The fourth write is not necessary for GCC to vectorize it (nor was in the

[LLVMdev] Vectorization of pointer PHI nodes

2013 Oct 14

This is almost ideal for SLP vectorization, except for two problems: 1. We have 4 stores to consecutive locations, but the last element is the constant zero, and not an additional SUB. At the moment we don’t have support for idempotence operations, but this is something that we should add. 2. The values that we are subtracting come from 3 loads. We usually load 4 elements from memory, or
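The code under discussion is not shown in this excerpt, so the following is only a guess at its shape, written as a minimal C sketch: three subtractions stored to consecutive locations, followed by a constant zero in the fourth slot. The function name, parameter names, and the exact load pattern are assumptions.

```c
/* Hypothetical shape of the pattern described above: three subtractions
   fed by three loads from each input, plus a constant zero in lane 3. */
void sub3_pad_zero(float *out, const float *a, const float *b) {
    out[0] = a[0] - b[0];
    out[1] = a[1] - b[1];
    out[2] = a[2] - b[2];
    out[3] = 0.0f; /* a constant store, not a fourth subtraction */
}
```

The four stores are consecutive, but only three of the four lanes share the same operation, which is the mismatch the message points out.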

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

Le 5 juil. 2013 à 04:11, Tobias Grosser <tobias at grosser.es> a écrit : > On 07/04/2013 01:39 PM, Stéphane Letz wrote: >> Hi, >> >> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06

The following IR implements the following nested loop: for (int i = start ; i < end ; ++i ) for (int p = 0 ; p < 4 ; ++p ) a[i*4+p] = b[i*4+p] + c[i*4+p]; define void @main(i64 %arg0, i64 %arg1, i1 %arg2, i64 %arg3, float* noalias %arg4, float* noalias %arg5, float* noalias %arg6) { entrypoint: br i1 %arg2, label %L0, label %L1 L0:
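As a reference point for the quoted loop, here is a minimal self-contained C sketch of the same nest, next to a flattened variant that touches the same elements. The function names are hypothetical, and `int64_t` is used for the bounds (the quoted pseudocode uses `int`) to match the `i64` arguments in the IR signature.

```c
#include <stdint.h>

/* Nested form, as in the quoted loop: the inner loop always runs 4 times. */
void add_nested(float *a, const float *b, const float *c,
                int64_t start, int64_t end) {
    for (int64_t i = start; i < end; ++i)
        for (int64_t p = 0; p < 4; ++p)
            a[i * 4 + p] = b[i * 4 + p] + c[i * 4 + p];
}

/* Flattened form: i*4+p covers exactly the range [start*4, end*4),
   so a single loop over k visits the same elements in the same order. */
void add_flat(float *a, const float *b, const float *c,
              int64_t start, int64_t end) {
    for (int64_t k = start * 4; k < end * 4; ++k)
        a[k] = b[k] + c[k];
}
```

Because the index expression is contiguous across both loop levels, the two forms are equivalent for any `start` and `end`.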

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On Jul 5, 2013, at 9:50 AM, Stéphane Letz <letz at grame.fr> wrote: > > Le 5 juil. 2013 à 04:11, Tobias Grosser <tobias at grosser.es> a écrit : > >> On 07/04/2013 01:39 PM, Stéphane Letz wrote: >>> Hi, >>> >>> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06

The loop vectorizer relies on cleanup passes to be run after it: from Transforms/IPO/PassManagerBuilder.cpp: // Add the various vectorization passes and relevant cleanup passes for // them since we are no longer in the middle of the main scalar pipeline. MPM.add(createLoopVectorizePass(DisableUnrollLoops)); MPM.add(createInstructionCombiningPass());

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06

The instcombine pass cleans up a lot. Any idea why there are still shufflevector, insertelement, *and* bitcast (!!) etc. instructions left? The original loop is so clean, a textbook example I'd say. There is no need to shuffle anything. At least I don't see it. Frank vector.ph: ; preds = %L5 %broadcast.splatinsert1 = insertelement <4 x

[LLVMdev] loop vectorizer: this loop is not worth vectorizing

2013 Nov 01

I am trying a setup where the one loop is rewritten as two loops. This avoids the 'rem' and 'div' instructions in the index calculation (which give the loop vectorizer a hard time). However, with this setup the loop vectorizer complains about a too small loop. LV: Checking a loop in "main" LV: Found a loop: L3 LV: Found a loop with a very small trip count. This loop
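The poster's actual loop is not shown here, so the following minimal C sketch only illustrates the general rewrite being described: a single loop that recovers two indices with a division and a remainder, and an equivalent pair of loops that needs neither. All names, the `stride` parameter, and the loop body are assumptions.

```c
#include <stddef.h>

/* Single loop: the (row, column) pair is recovered from k with a
   division and a remainder on every iteration. */
void scale_single(float *dst, const float *src,
                  size_t rows, size_t cols, size_t stride) {
    for (size_t k = 0; k < rows * cols; ++k) {
        size_t r = k / cols;
        size_t c = k % cols;
        dst[r * stride + c] = 2.0f * src[r * stride + c];
    }
}

/* Two loops: the same indices, with no division or remainder. */
void scale_split(float *dst, const float *src,
                 size_t rows, size_t cols, size_t stride) {
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            dst[r * stride + c] = 2.0f * src[r * stride + c];
}
```

In the split form the inner loop's trip count is `cols`; when that count is small, the inner loop is the kind of short loop the log in this message reports.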

[LLVMdev] loop vectorizer

2013 Oct 30

----- Original Message ----- > > > I ran the BB vectorizer as I guess this is the SLP vectorizer. No, while the BB vectorizer is doing a form of SLP vectorization, there is a separate SLP vectorization pass which uses a different algorithm. You can pass -vectorize-slp to opt. -Hal > > BBV: using target information > BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_...

[LLVMdev] loop vectorizer

2013 Oct 30

The debug messages are misleading. They should read “trying to vectorize a list of …”; The problem is that the SCEV analysis is unable to detect that C[ir0] and C[ir1] are consecutive. Is this loop from an important benchmark ? Thanks, Nadav On Oct 30, 2013, at 11:13 AM, Frank Winter <fwinter at jlab.org> wrote: > The SLP vectorizer apparently did something in the prologue of the

[LLVMdev] loop vectorizer says Bad stride

2013 Oct 28

Verifying function running passes ... LV: Checking a loop in "bar" LV: Found a loop: L0 LV: Found an induction variable. LV: We need to do 0 pointer comparisons. LV: Checking memory dependencies LV: Bad stride - Not an AddRecExpr pointer %13 = getelementptr float* %arg2, i32 %1 SCEV: ((4 * (sext i32 {(256 + %arg0),+,1}<nw><%L0> to i64)) + %arg2) LV: Src Scev: {((4 * (sext

Query on unswitching + vectorization

2018 May 14

* Looks like some sort of pass ordering issue; it will vectorize if indvars runs sometime between loop unswitch and the vectorizer. That insight is helpful. I scheduled canonicalization of induction variables before loop vectorization and could get the loop vectorized. The indvars are heavily dependent on SCEV. If there is a scalar like tmp which is of real type, we may not be able to get the

PHI nodes and connected ICMp

2017 Aug 10

Hello, I have one more question about how phi nodes and their corresponding ICmp instructions are associated. Maybe it is simple, but at first I thought that we always compare against one of the incoming values. Is it true that I can have only two cases: %indvars.iv = phi i64 [ %indvars.iv.next, %1 ], [ 0, %0 ] ... %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %exitcond = icmp eq i64

[LLVMdev] loop vectorizer: this loop is not worth vectorizing

2013 Nov 01

In the case when coming from C it was probably the loop unroller and SLP vectorizer which vectorized the code. Potentially I could do the same in the IR. However, the loop body that is generated in the IR can get very large. Thus, the loop unroller will refuse to unroll the loop in a large number of (important) cases. Isn't there a way to convince the loop vectorizer that it should
