thr3ads.net - similar to: "Loop vectorization with the loop containing bitcast"

Displaying 20 results from an estimated 5000 matches similar to: "Loop vectorization with the loop containing bitcast"

2015 Feb 26

[LLVMdev] RFC: Loop versioning for LICM

Hi Ashutosh, Have you been following the recent Loop Access Analysis work? LAA was split out from the Loop Vectorizer that have been performing the kind of loop versioning that you describe. The main reason was to be able to share this functionality with other passes. Loop Access Analysis is an analysis pass that computes basic memory dependence and the runtime checks. The versioning decision

Question about a May-alias case

2018 Jun 13

Question about a May-alias case

Hello All, I have a question about a May-alias case. Let's look at one simple C example. char *buf[4]; char c; void test(int idx) { char *a = buf[3 - idx]; char *b = buf[idx]; *a = *b; c++; *a = *b; } We can imagine the second "*a = *b" could be removed. Let's look at the IR snippet with -O3 for above example. 1 define void @test(i32 %idx) { 2 entry:

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2012 Jan 17

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

Hi, On Fri, Dec 30, 2011 at 3:09 AM, Tobias Grosser <tobias at grosser.es> wrote: > As it seems my intuition is wrong, I am very eager to see and understand > an example where a search limit of 4000 is really needed. > To make the ball roll again, I attached a testcase that can be tuned to understand the impact on compile time for different sizes of a basic block. One can also

[LLVMdev] -indvars issues?

2012 Mar 08

[LLVMdev] -indvars issues?

Hi, Is the -indvars pass functional? I've done some small test to check it, but this fails to canonicalize: > int *x; > int *y; > int i; > ... > for (i = 1; i < 100; i+=2) { > x[i] = y[i] + 3; > } The IR produced after -indvars: > br label %for.cond > > for.cond: ; preds = %for.inc, %entry > %indvars.iv =

[LLVMdev] RFC: Loop versioning for LICM

2015 Feb 26

[LLVMdev] RFC: Loop versioning for LICM

I like to propose a new loop multi versioning optimization for LICM. For now I kept this for LICM only, but it can be used in multiple places. The main motivation is to allow optimizations stuck because of memory alias dependencies. Most of the time when alias analysis is unsure about memory access and it says may-alias. This un surety from alias analysis restrict some of the memory based

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Dec 30

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

On 12/29/2011 06:32 PM, Hal Finkel wrote: > On Thu, 2011-12-29 at 15:00 +0100, Tobias Grosser wrote: >> On 12/14/2011 01:25 AM, Hal Finkel wrote: >> One thing that I would still like to have is a test case where >> bb-vectorize-search-limit is needed to avoid exponential compile time >> growth and another test case that is not optimized, if >>

Canonicalize induction variables

2016 Aug 25

Canonicalize induction variables

But even for a very simple loop: int test1 (int *x, int *y, int *z, int k) { int sum = 0; for (int i = 10; i < k; i++) { z[i] = x[i] / y[i]; } return sum; } The initial value of induction variable is not zero after compiling with -O3 -mcpu=power8 x.cpp -S -c -emit-llvm -fno-unroll-loops (see bottom of the email for IR) Also I can write somewhat more complicated loop where step

MemorySSA question

2017 Dec 19

MemorySSA question

Hi, I am new to MemorySSA and wanted to understand its capabilities. Hence I wrote the following program (test.c): int N; void test(int *restrict a, int *restrict b, int *restrict c, int *restrict d, int *restrict e) { int i; for (i = 0; i < N; i = i + 5) { a[i] = b[i] + c[i]; } for (i = 0; i < N - 5; i = i + 5) { e[i] = a[i] * d[i]; } } I compiled this program using

[LLVMdev] RFC: Loop versioning for LICM

2015 Mar 04

[LLVMdev] RFC: Loop versioning for LICM

> On Mar 3, 2015, at 1:29 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com <mailto:Ashutosh.Nema at amd.com>> wrote: > > Hi Adam, > > Thanks for looking into LoopVersioning work. > > I have gone through recent LoopAccessAnalysis changes and found some of the stuff > overlaps (i.e. runtime memory check, loop access analysis etc.). LoopVersioning can > use

MemorySSA question

2017 Dec 19

MemorySSA question

On Tue, Dec 19, 2017 at 9:10 AM, Siddharth Bhat via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I could be entirely wrong, but from my understanding of memorySSA, each > def defines an "abstract heap state" which has the coarsest possible > definition - any write will be modelled as a "new heap state". > This is true for def-def relationships, but

[LLVMdev] Scalar Evolution and Loop Trip Count.

2013 Jul 11

[LLVMdev] Scalar Evolution and Loop Trip Count.

Hi, Scalar evolution seems to be wrapping around the trip count in the following loop. void add (int *restrict a, int *restrict b, int *restrict c) { char i; for (i = 0; i < 255; i++) a[i] = b[i] + c[i]; } When I run scalar evolution on the bit code, I get a backedge-taken count which is obviously wrong. $> cat loop.ll ; Function Attrs: nounwind define void @add(i32* noalias

[LLVMdev] alias analysis on llvm internal globals

2015 Apr 25

[LLVMdev] alias analysis on llvm internal globals

Hi I have this program in which fooBuf can only take on NULL or the address of local_fooBuf, and fooBuf and local_fooBuf have scope of the foo function. Therefore there is no way for the fooPtr argument to alias with fooBuf. However, LLVM basicaa and globalsmodref-aa say the 2 pointers may alias. I am thinking whether i should implement a limited form of point-to alias on the fooBuf pointer in

[LLVMdev] loop vectorizer

2013 Oct 30

[LLVMdev] loop vectorizer

----- Original Message ----- > > > I ran the BB vectorizer as I guess this is the SLP vectorizer. No, while the BB vectorizer is doing a form of SLP vectorization, there is a separate SLP vectorization pass which uses a different algorithm. You can pass -vectorize-slp to opt. -Hal > > BBV: using target information > BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_...

loop unrolling introduces conditional branch

2015 Aug 22

loop unrolling introduces conditional branch

Hi, Mehdi, For example, I have this very simple source code: void foo( int n, int array_x[]) { for (int i=0; i < n; i++) array_x[i] = i; } After I use "clang -emit-llvm -o bc_from_clang.bc -c try.cc", I get bc_from_clang.bc. With my code (using LLVM IRbuilder API), I get bc_from_api.bc. Attachment please find thse two files. I also past the IR here.

[LLVMdev] Loop vectorizer dosen't find loop bounds

2013 Oct 28

[LLVMdev] Loop vectorizer dosen't find loop bounds

I am trying to vectorize the function void bar(float *c, float *a, float *b) { const int width = 256; for (int i = 0 ; i < 256 ; ++i ) { c[ i ] = a[ i ] + b[ i ]; c[ width + i ] = a[ width + i ] + b[ width + i ]; } } using the following commands clang -emit-llvm -S loop.c opt loop.ll -O3 -debug-only=loop-vectorize -S -o - LV: Checking a loop in

loop unrolling introduces conditional branch

2015 Aug 22

loop unrolling introduces conditional branch

Thanks for your point that out. I just add DataLayout in my code such as "mod->setDataLayout("e-m:e-i64:64-f80:128-n8:16:32:64-S128");", still no luck. I'm really confused about this. Do I need to add more passes before -loop-unroll? On Sat, Aug 22, 2015 at 11:36 AM, Mehdi Amini <mehdi.amini at apple.com> wrote: > > On Aug 22, 2015, at 7:27 AM, Xiangyang

[LLVMdev] loop vectorizer

2013 Oct 30

[LLVMdev] loop vectorizer

The SLP vectorizer apparently did something in the prologue of the function (where storing of arguments on the stack happens) which then got eliminated later on (since I don't see any vector instructions in the final IR). Below the debug output of the SLP pass: Args: opt -O1 -vectorize-slp -debug loop.ll -S SLP: Analyzing blocks in _Z3barmmPfS_S_. SLP: Found 2 stores to vectorize. SLP:

Question about a May-alias case

2018 Jun 13

Question about a May-alias case

Hi Eli, Thanks for good comment! I missed to initalize the buf. Let's slightly change the example as below. char subbuf1[2]; char subbuf2[2]; char subbuf3[2]; char subbuf4[2]; char *buf[4] = {subbuf1, subbuf2, subbuf3, subbuf4}; char c; void test(int idx) { char *a = buf[3 - idx]; char *b = buf[idx]; *a = *b; c++; *a = *b; } I think we can say the 'buf' does not

LoopStrengthReduction generates false code

2020 Jun 09

LoopStrengthReduction generates false code

Hi. In my backend I get false code after using StrengthLoopReduction. In the generated code the loop index variable is multiplied by 8 (correct, everything is 64 bit aligned) to get an address offset, and the index variable is incremented by 1*8, which is not correct. It should be incremented by 1 only. The factor 8 appears again. I compared the debug output

[LLVMdev] SimplifyIndVar looses nsw flags

2013 Jun 25

[LLVMdev] SimplifyIndVar looses nsw flags

Hello, I'm using LLVM to reason about memory safety of programs. One goal is to prove that certain array accesses are always safe. Currently, one of these proofs fails because of a missing no-signed-wrap (nsw) flag. I found that it has been "lost" during the SimplifyIndVar pass. Here's the example: int foo(int a[]) { int sum = 0; for (int i = 0; i < 1000; ++i)

similar to: Loop vectorization with the loop containing bitcast