I noticed something fishy going on in LSR, but I am not sure why or how to
fix it.
*Symptom*
I managed to get a reduced test (lsr.ll), which is compiled from the
following CUDA code (I believe similar problems exist for C++ in general).
for (int i = n; i != 0; --i) {
int t0 = i * a;
use(t0);
t1 = t0 + i; // equivalent to t0 = i * (a + 1)
use(t1);
}
The in-loop cost excluding "use" is
1 mul: i * a
1 add: t += i
1 sub: --i
1 compare: i != 0
LSR transforms that into (lsr.opt.ll)
int t0 = n * a;
int t1 = n * (a + 1);
int a1 = a + 1;
for (int i = n; i != 0; --i) {
use(t0);
use(t1);
t0 -= a;
t1 -= a1;
}
The in-loop cost excluding "use" is
3 sub: --i, t0 -= a, t1 -= a1
1 compare: i != 0
The total cost is unchanged, but the LSR'ed version uses more registers.
Each indvar (i, t0, or t1) needs one register in the LSR'ed version,
whereas in the original code, t0 and t1 can share the same register. The
strides "a" and "a1" also cost registers. Register pressure
might not be an
issue for this reduced case, but it does degrades the performance in the
real non-reduced benchmark.
*Investigation*
>From the debug output, LSR chooses to put t0 and t1 in different IV chains,
because their SCEVs look different:
%0 = mul nsw i32 %i, %a
--> {(%a * %n),+,(-1 * %a)}<%loop>
%1 = add nsw i32 %0, %i
--> {((1 + %a) * %n),+,(-1 + (-1 * %a))}<%loop>
Looks like this makes LSR to later process them separately and promote both
to induction variables. If LSR realized %1 can be computed as %0 + %i
instead of %i * (%a + 1), it could merge them into the same chain.
I am not sure if this is the root cause, because LSRInstance::Solve seems
to be able to merge IV chains sometimes. So it could be that the IV chains
are not merged in the way I want?
Jingyue
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150717/20f43a0e/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lsr.ll
Type: application/octet-stream
Size: 431 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150717/20f43a0e/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lsr.opt.ll
Type: application/octet-stream
Size: 823 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150717/20f43a0e/attachment-0001.obj>