Displaying 1 result from an estimated 1 matches for "r1501".
Did you mean:
1501
2010 Nov 03
1
[LLVMdev] LLVM x86 Code Generator discards Instruction-level Parallelism
...imes:
.
.
{
p1 = p1 * a;
p2 = p2 * b;
p3 = p3 * c;
p4 = p4 * d;
}
.
.
Compiling with NVCC, Ocelot, and LLVM, I can confirm the interleaved instruction
schedule with a four-instruction reuse distance. An excerpt follows:
.
.
%r1500 = fmul float %r1496, %r24 ; compute %1500
%r1501 = fmul float %r1497, %r23
%r1502 = fmul float %r1498, %r22
%r1503 = fmul float %r1499, %r21
%r1504 = fmul float %r1500, %r24 ; first use of %1500
%r1505 = fmul float %r1501, %r23
%r1506 = fmul float %r1502, %r22
%r1507 = fmul float %r1503, %r21
%r1508 = fmul float %r1504, %r24 ; f...