Vaivaswatha N
2014-Nov-28 07:48 UTC
[LLVMdev] ScalarEvolution: Suboptimal handling of globals
Hi,

For the program below, where "incr" and "Arr" are globals
================================
int incr;
float Arr[1000];

int foo ()
{
  float x = 0;
  int newInc = incr+1;
  for (int i = 0; i < 1000; i++) {
    for (int j = 0; j < 1000; j += incr) {
      x += (Arr[i] + Arr[j]);
    }
  }
  return x;
}
================================

The SCEV expression computed for the variable "j" is
  %j.0 = phi i32 [ 0, %for.body ], [ %add8, %for.inc ]
  --> %j.0 Exits: <<Unknown>>
As is evident, this isn't a useful computation.

Whereas if I use the variable newInc to be the increment for "j", i.e., "j += newInc" in the inner loop, the computed SCEV is
  %j.0 = phi i32 [ 0, %for.body ], [ %add8, %for.inc ]
  --> {0,+,(1 + %0)}<nsw><%for.cond1>
where %0 here is a LoadInst for "incr". In this situation, the analysis is working as expected to compute the add-recurrence for "j".

I would've expected a similar computation for the first case too, something like
  {0,+,%0}<nsw><%for.cond1>
where %0 would be the LoadInst for "incr".

Is this a bug or is there something more involved that causes the deficiency?

Thanks,
- Vaivaswatha
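For readers unfamiliar with the notation: an add-recurrence {start,+,step}<loop> denotes a value equal to start + step * k on the k-th iteration of that loop, counting k from 0. The following is a minimal sketch in C of that reading, applied to the inner loop above. The names addrec_at and check_addrec are illustrative only and are not part of LLVM.

```c
#include <assert.h>

/* Value of the add-recurrence {start,+,step} on iteration k (k = 0, 1, 2, ...).
   Hypothetical helper for illustration; not an LLVM function. */
int addrec_at(int start, int step, int k)
{
    return start + step * k;
}

/* Runs the loop "for (j = 0; j < bound; j += step)" and checks each value
   of j against the closed form {0,+,step}. Returns the iteration count. */
int check_addrec(int step, int bound)
{
    assert(step > 0); /* assumption: a non-positive step would never exit */
    int k = 0;
    for (int j = 0; j < bound; j += step) {
        assert(j == addrec_at(0, step, k));
        k++;
    }
    return k;
}
```

The closed form only holds when step has the same value on every iteration, which is why it matters whether the analysis can see the increment as loop-invariant.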
Sanjoy Das
2014-Nov-28 08:29 UTC
[LLVMdev] ScalarEvolution: Suboptimal handling of globals
What pass ordering are you using? If the "j += incr" expression incurs a per-iteration load from @incr, then that can confuse SCEV. Loop invariant code motion cleans it up.

With "opt -mem2reg -licm -S scev.ll | opt -analyze -scalar-evolution" on the unoptimized IR generated by clang, I get:

  ...
  %j.0 = phi i32 [ 0, %6 ], [ %19, %18 ]
  --> {0,+,%3}<nsw><%9> Exits: <<Unknown>>
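The effect described above can be approximated at the source level. The sketch below, in C, contrasts the original loop shape, where the increment names the global on every iteration, with a hand-hoisted version that reads the global once before the loops. This is only an analogy for what -licm does on the IR, and it assumes nothing inside the loops writes to incr, which holds for this program. The names foo_reload, foo_hoisted and step are illustrative.

```c
#include <assert.h>

int incr;            /* global step, as in the original program */
float Arr[1000];

/* Original shape: "j += incr" refers to the global on every iteration,
   which in unoptimized IR becomes a load inside the inner loop. */
int foo_reload(void)
{
    float x = 0;
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < 1000; j += incr) {
            x += (Arr[i] + Arr[j]);
        }
    }
    return x;
}

/* Hand-hoisted shape: the global is read once, so the step is plainly
   loop-invariant. Valid only because the loops never modify incr. */
int foo_hoisted(void)
{
    const int step = incr;
    assert(step > 0); /* assumption: a non-positive step would never exit */
    float x = 0;
    for (int i = 0; i < 1000; i++) {
        for (int j = 0; j < 1000; j += step) {
            x += (Arr[i] + Arr[j]);
        }
    }
    return x;
}
```

Both functions return the same value for any positive incr; the difference is only in where the read of the global sits relative to the loops.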
Vaivaswatha N
2014-Nov-28 12:15 UTC
[LLVMdev] ScalarEvolution: Suboptimal handling of globals
Thank you very much. -licm was what I was missing.

- Vaivaswatha