thr3ads.net - search: "indvar"

loop unrolling introduces conditional branch

2015 Aug 20

2

...ditional branch at end of every "unrolled" part. For example, consider the following code: void foo( int n, int array_x[]) { for (int i=0; i < n; i++) array_x[i] = i; } Then I use this command "opt-3.5 try.bc -mem2reg -loops -loop-simplify -loop-rotate -lcssa -indvars -loop-unroll -unroll-count=3 -simplifycfg -S", it gives me this IR: define void @_Z3fooiPi(i32 %n, i32* %array_x) #0 { %1 = icmp slt i32 0, %n br i1 %1, label %.lr.ph, label %._crit_edge .lr.ph:...

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 20

3

...[InstCombine] icmp sgt (shl nsw X, C1), C0 --> icmp sgt X, C0 >> C1" The Loop Vectorizer generates code with more instructions: ==== Loop Vectorizer from rL292492 ==== for.body5: ; preds = %for.inc16.for.body5_crit_edge, %for.cond.preheader %indvar = phi i64 [ %indvar.next, %for.inc16.for.body5_crit_edge ], [ 0, %for.cond.preheader ] %1 = phi i8 [ %.pre, %for.inc16.for.body5_crit_edge ], [ 1, %for.cond.preheader ] %count.122 = phi i32 [ %count.2, %for.inc16.for.body5_crit_edge ], [ 0, %for.cond.preheader ] %i.119 = phi i64 [ %inc17, %fo...

[LLVMdev] reg2mem pass

2007 Sep 05

2

...2; } return sum; } ------------------------------------------------------------- I could get the corresponding LLVM assembly with llvm-gcc and llvm-dis: ------------------------------------------------------------- int %foo() { entry: br label %bb8.outer bb8.outer: ; preds = %bb10, %entry %indvar26 = phi uint [ 0, %entry ], [ %indvar.next27, %bb10 ] ; <uint> [#uses=2] %sum.0.pn.ph = phi int [ 0, %entry ], [ %sum.1, %bb10 ] ; <int> [#uses=1] %i.0.0.ph = cast uint %indvar26 to int ; <int> [#uses=1] br label %bb8 bb3: ; preds = %bb8 %indvar.next = add uint %indvar, 1...

[LLVMdev] Loops Prevent Function Pointer Inlining?

2014 Sep 24

3

I've CC'ed Chad Rosier as I think this behaviour is a side-effect of his revert of IndVarSimplify.cpp (git c6b1a7e577a0b9e9cff9f9b7ac35a2cde7c448d8, SVN 217962). The change basically makes the IndVar pass change: ; <label>:4 ; preds = %6, %0 %i.0 = phi i32 [ 0, %0 ], [ %11, %6 ] %5 = icmp eq i32 %i.0, 0 br i1 %5, label %6, label %17 To...

[LLVMdev] -indvars issues?

2012 Mar 08

2

Hi, Is the -indvars pass functional? I ran a small test to check it, but this loop fails to canonicalize: > int *x; > int *y; > int i; > ... > for (i = 1; i < 100; i+=2) { > x[i] = y[i] + 3; > } The IR produced after -indvars: > br label %for.cond > > for.cond:...

[LLVMdev] Detecting reduction operations

2009 Oct 13

0

...dependencies? There is some initial work on > dependence analysis, but it is still pretty young. We also have support for > dependence between memory operations that are not loop aware. > > -Chris I think the dependence analysis will have to be loop aware. For example: bb: %indvar = phi i64 [ 0, %bb.nph ], [ %indvar.next, %bb ] %sum = phi i32 [ 0, %bb.nph ], [ %3, %bb ] %1 = getelementptr i32* %X, i64 %indvar %2 = load i32* %1, align 4 %3 = add i32 %2, %sum %indvar.next = add i64 %indvar, 1 %exitcond = icmp eq i64 %indvar.next,...

[LLVMdev] Preserving NSW/NUW bits

2014 Sep 02

2

David/All, Just a quick question about NSW/NUW bits, if you've got a second. I noticed you've been doing a little work on this as of late. I have a bit of code that looks like the following: %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %2 = add i64 %indvars.iv.next, -1 %tmp = trunc i64 %2 to i32 %cmp = icmp slt i32 %tmp, %0 br i1 %cmp, label %for.body, label %for.end.loopexit I'm trying to fold the 2nd add instruction into the compare by changing the condition from '...

loop unrolling introduces conditional branch

2015 Aug 22

3

...; preds = %for.cond ret void } attributes #0 = { nounwind } ******************************** API Generate IR End *********************************************************** Then I use "opt file.bc -mem2reg -loops -loop-simplify -loop-rotate -lcssa -indvars -loop-unroll -unroll-count=4 -irce -simplifycfg -S" to run both .bc files. The first .bc file gives me this: ***************************** Clang Generate IR with LoopUnrolling Start********************************************** ; ModuleID = 'bc_from_clang.bc' target datalayout = "...

[LLVMdev] Vectorized LLVM IR

2010 May 29

3

...t_array_ptr3, align 8 %output_array_ptr0 = getelementptr inbounds float** %outputs, i64 0 %output0 = load float** %output_array_ptr0, align 8 %out = icmp sgt i32 %count, 0 br i1 %out, label %convert, label %return convert: %count_64 = zext i32 %count to i64 br label %loop loop: %indvar = phi i64 [ 0, %convert ], [ %indvar.next, %loop ] %output_ptr0 = getelementptr float* %output0, i64 %indvar %input_ptr1 = getelementptr float* %input1, i64 %indvar %fTemp0 = load float* %input_ptr1, align 4 %input_ptr0 = getelementptr float* %input0, i64 %indvar %fTemp1 = load float* %in...

[LLVMdev] Detecting reduction operations

2009 Oct 12

3

On Oct 12, 2009, at 4:01 PM, Scott Ricketts wrote: > To be more specific, it would be helpful to have some utilities for > finding dependencies (true, output, and anti-). Where is a good place > to start for this kind of analysis? Hi Scott, Do you mean loop carried dependencies? There is some initial work on dependence analysis, but it is still pretty young. We also have support

MemorySSA question

2017 Dec 19

4

...of the file "out" are shown below: . . . for.body: ; preds = %for.body.lr.ph, %for.body ; 3 = MemoryPhi({for.body.lr.ph,liveOnEntry},{for.body,1}) %indvars.iv35 = phi i64 [ 0, %for.body.lr.ph ], [ %indvars.iv.next36, %for.body ] %arrayidx = getelementptr inbounds i32, i32* %b, i64 %indvars.iv35 ; MemoryUse(3) %2 = load i32, i32* %arrayidx, align 4, !tbaa !2 %arrayidx2 = getelementptr inbounds i32, i32* %c, i64 %indvars.iv35 ; MemoryUse(3) %3...

loop unrolling introduces conditional branch

2015 Aug 20

2

..." part. For > example, consider the following code > > void foo( int n, int array_x[]) > { > for (int i=0; i < n; i++) > array_x[i] = i; > } > > Then I use this command "opt-3.5 try.bc -mem2reg -loops -loop-simplify > -loop-rotate -lcssa -indvars -loop-unroll -unroll-count=3 -simplifycfg -S", > it gives me this IR: > > define void @_Z3fooiPi(i32 %n, i32* %array_x) #0 { > %1 = icmp slt i32 0, %n > br i1 %1, label %.lr.ph, label %._crit_edge > > .lr.ph:...

loop unrolling introduces conditional branch

2015 Aug 22

2

...= %for.cond > ret void > } > > attributes #0 = { nounwind } > > ******************************** API Generate IR End > *********************************************************** > > Then I use "opt file.bc -mem2reg -loops -loop-simplify -loop-rotate -lcssa > -indvars -loop-unroll -unroll-count=4 -irce -simplifycfg -S" to run both > .bc files. > The first .bc file gives me this: > > ***************************** Clang Generate IR with LoopUnrolling > Start********************************************** > ; ModuleID = 'bc_from_clang.bc...

[LLVMdev] scalar-evolution + indvars fail to get the loop trip count?

2008 Dec 09

1

Hi, It seems the scalar-evolution and indvars passes fail to get the loop trip count in the following case: int foo(int x, int y, int lam[256], int alp[256]) { int i; int z = y; for (i = 255; i >= 0; i--) { z += x; lam[i] = alp[i]; } return z; } The final optimized ll code is: define i32 @foo(i32 %x, i32 %y, i32* %lam, i32...

[InstCombine] rL292492 affected LoopVectorizer and caused 17.30%/11.37% perf regressions on Cortex-A53/Cortex-A15 LNT machines

2017 Jan 22

2

...[InstCombine] icmp sgt (shl nsw X, C1), C0 --> icmp sgt X, C0 >> C1" The Loop Vectorizer generates code with more instructions: ==== Loop Vectorizer from rL292492 ==== for.body5: ; preds = %for.inc16.for.body5_crit_edge, %for.cond.preheader %indvar = phi i64 [ %indvar.next, %for.inc16.for.body5_crit_edge ], [ 0, %for.cond.preheader ] %1 = phi i8 [ %.pre, %for.inc16.for.body5_crit_edge ], [ 1, %for.cond.preheader ] %count.122 = phi i32 [ %count.2, %for.inc16.for.body5_crit_edge ], [ 0, %for.cond.preheader ] %i.119 = phi i64 [ %inc17, %fo...

[LLVMdev] Little bug in LoopInfo after Rotate?

2008 Jul 12

3

Hello, I have two for loops (one inside the other) that, after indvars, looprotate, etc. (the important one here is the loop rotate), look similar to this (I've stripped the real operations): define i32 @f() nounwind { entry: br label %bb1 bb1: ; preds = %bb3, %bb1, %entry %i.0.reg2mem.0.ph = phi i32 [ 0, %entry ], [ %i.0.reg2mem.0.ph, %bb1 ], [ %indv...

[LLVMdev] Vectorized LLVM IR

2010 May 29

0

... %output0 = load float** %output_array_ptr0, align 8 > %out = icmp sgt i32 %count, 0 > br i1 %out, label %convert, label %return > convert: > %count_64 = zext i32 %count to i64 > br label %loop > loop: > %indvar = phi i64 [ 0, %convert ], [ %indvar.next, %loop ] > %output_ptr0 = getelementptr float* %output0, i64 %indvar > %input_ptr1 = getelementptr float* %input1, i64 %indvar > %fTemp0 = load float* %input_ptr1, align 4 > %input_ptr0...

Canonicalize induction variables

2016 Aug 25

4

...) { z[i] = x[i] / y[i]; } return sum; } again this is not canonicalized in the above sense (see IR at the end of the email). Maybe this condition is too complicated? IR for test1 for.body: ; preds = %for.body.preheader, %for.body * %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 10, %for.body.preheader ]* %arrayidx = getelementptr inbounds i32, i32* %x, i64 %indvars.iv %0 = load i32, i32* %arrayidx, align 4, !tbaa !1 %arrayidx2 = getelementptr inbounds i32, i32* %y, i64 %indvars.iv %1 = load i32, i32* %arrayidx2, ali...

loop unrolling introduces conditional branch

2015 Aug 21

2

...>>> void foo( int n, int array_x[]) >>> { >>> for (int i=0; i < n; i++) >>> array_x[i] = i; >>> } >>> >>> Then I use this command "opt-3.5 try.bc -mem2reg -loops -loop-simplify >>> -loop-rotate -lcssa -indvars -loop-unroll -unroll-count=3 -simplifycfg -S", >>> it gives me this IR: >>> >>> define void @_Z3fooiPi(i32 %n, i32* %array_x) #0 { >>> %1 = icmp slt i32 0, %n >>> br i1 %1, label %.lr.ph, label %._crit_edge >>...

loop unrolling introduces conditional branch

2015 Aug 22

2

...>> { >>>>> for (int i=0; i < n; i++) >>>>> array_x[i] = i; >>>>> } >>>>> >>>>> Then I use this command "opt-3.5 try.bc -mem2reg -loops -loop-simplify >>>>> -loop-rotate -lcssa -indvars -loop-unroll -unroll-count=3 -simplifycfg -S", >>>>> it gives me this IR: >>>>> >>>>> define void @_Z3fooiPi(i32 %n, i32* %array_x) #0 { >>>>> %1 = icmp slt i32 0, %n >>>>> br i1 %1, label %.lr.ph...
