Vaivaswatha N
2014-Nov-28 07:48 UTC
[LLVMdev] ScalarEvolution: Suboptimal handling of globals
Hi,
For the program below, where "incr" and "Arr" are globals:
================================
int incr;
float Arr[1000];

int foo ()
{
  float x = 0;
  int newInc = incr+1;
  for (int i = 0; i < 1000; i++) {
    for (int j = 0; j < 1000; j += incr) {
      x += (Arr[i] + Arr[j]);
    }
  }
  return x;
}
================================
The SCEV expression computed for the variable "j" is
  %j.0 = phi i32 [ 0, %for.body ], [ %add8, %for.inc ]
  -->  %j.0 Exits: <<Unknown>>
As is evident, this isn't a useful computation.

Whereas if I use the variable newInc to be the increment for "j",
i.e., "j += newInc" in the inner loop, the computed SCEV is
  %j.0 = phi i32 [ 0, %for.body ], [ %add8, %for.inc ]
  -->  {0,+,(1 + %0)}<nsw><%for.cond1>
where %0 here is a LoadInst for "incr". In this situation, the
analysis is working as expected to compute the add-recurrence for "j".

I would've expected a similar computation for the first case too, something
like
  {0,+,%0}<nsw><%for.cond1>
where %0 would be the LoadInst for "incr".
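
For readers unfamiliar with the notation: an add-recurrence {start,+,step}<loop> describes a value equal to start + k*step on the k-th iteration of the loop (counting from 0), provided step does not change inside the loop. The following is a minimal C sketch of that reading, not LLVM code; the helper names addrec_at and j_after are hypothetical.

```c
/* Closed form of {start,+,step}: the value on iteration k (k = 0, 1, 2, ...).
 * Hypothetical helper, for illustration only. */
int addrec_at(int start, int step, int k)
{
  return start + k * step;
}

/* The same value obtained by stepping the way the inner loop's "j" does,
 * starting from 0 and adding a fixed step k times. */
int j_after(int step, int k)
{
  int j = 0;
  for (int n = 0; n < k; n++)
    j += step;
  return j;
}
```

With a loop-invariant step the two functions agree for every k, which is what lets the analysis replace the phi with a closed form.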
Is this a bug or is there something more involved that causes the deficiency?
Thanks,
- Vaivaswatha
Sanjoy Das
2014-Nov-28 08:29 UTC
[LLVMdev] ScalarEvolution: Suboptimal handling of globals
What pass ordering are you using? If the "j += incr" expression incurs
a per-iteration load from @incr, then the increment is not loop-invariant
as far as SCEV can tell, which prevents it from forming an add-recurrence.
Loop invariant code motion (-licm) hoists the load out of the loop and
cleans this up.

With "opt -mem2reg -licm -S scev.ll | opt -analyze -scalar-evolution"
on the unoptimized IR generated by clang, I get:

...
  %j.0 = phi i32 [ 0, %6 ], [ %19, %18 ]
  -->  {0,+,%3}<nsw><%9> Exits: <<Unknown>>
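
To picture the effect of -licm at the source level, here is a rough C sketch. It is not the pass's actual output. foo_reload mirrors the original loop, where unoptimized IR loads the global incr on every inner iteration; foo_hoisted reads it once into a local, which is roughly the shape SCEV sees after the load is hoisted. The names foo_reload, foo_hoisted and step are hypothetical, and both functions assume incr > 0, since otherwise the inner loop does not terminate.

```c
/* Sketch only: a source-level picture of hoisting the load of "incr".
 * foo_reload, foo_hoisted and step are hypothetical names. */
int incr;
float Arr[1000];

/* Original shape: in unoptimized IR, "incr" is loaded from memory
 * on every iteration of the inner loop. */
int foo_reload(void)
{
  float x = 0;
  for (int i = 0; i < 1000; i++) {
    for (int j = 0; j < 1000; j += incr) {
      x += (Arr[i] + Arr[j]);
    }
  }
  return x;
}

/* Hoisted shape: "incr" is read once before the loops, so the increment
 * of "j" is loop-invariant and "j" evolves as {0,+,step}.
 * Assumes incr > 0. */
int foo_hoisted(void)
{
  float x = 0;
  const int step = incr;
  for (int i = 0; i < 1000; i++) {
    for (int j = 0; j < 1000; j += step) {
      x += (Arr[i] + Arr[j]);
    }
  }
  return x;
}
```

Nothing in the loop nest stores to memory once x lives in a register, so reading incr once is equivalent to re-reading it, which is why the hoist is legal here.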
Vaivaswatha N
2014-Nov-28 12:15 UTC
[LLVMdev] ScalarEvolution: Suboptimal handling of globals
Thank you very much. -licm was what I was missing.
- Vaivaswatha