2013 Mar 11
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
...y.next_loopiv = add i64 %polly.loopiv10, 1
%p_ = add i64 %polly.loopiv10, %p_.moved.to.4.cloned
%p_newGEPInst9.cloned = getelementptr float* inttoptr (i64 47246749696 to float*), i64 %p_
%p_newGEPInst12.cloned = getelementptr float* inttoptr (i64 47380971520 to float*), i64 %polly.loopiv10
%_p_scalar_5 = load float* %p_newGEPInst9.cloned, align 4, !tbaa !1
%_p_scalar_6 = load float* %p_newGEPInst12.cloned, align 4, !tbaa !2
%p_7 = fmul float %_p_scalar_5, %_p_scalar_6
%p_8 = fadd float %_p_scalar_, %p_7
store float %p_8, float* inttoptr (i64 47380979712 to float*), align 8192, !tbaa !0...
2013 Mar 11
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
Dear all,
The attached notunrolled.ll is a module containing a reduction kernel. What I'm
trying to do is to unroll it in such a way that the partial reduction over the
unrolled iterations is performed in a register and then stored to
memory only once. Currently, LLVM's unroller together with all the standard
optimizations produces code that stores the value to memory after every
unrolled iteration, which is
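
For illustration only, here is a minimal C sketch of the transformation being asked about. It is not taken from notunrolled.ll: the function names, the dot-product shape of the kernel, and the unroll factor of 4 are all assumptions made for the example. It also assumes that the accumulator location does not alias either input array, which is what makes it legal to keep the running sum in a register.

```c
#include <stddef.h>

/* Hypothetical baseline: the accumulator lives in memory, so every
   iteration reads and writes *sum (one store per iteration). */
void reduce_in_memory(const float *a, const float *b, size_t n, float *sum)
{
    for (size_t i = 0; i < n; ++i)
        *sum = *sum + a[i] * b[i];
}

/* Hand-unrolled by 4 (assumed factor) with the accumulator cached in a
   local variable. Assumes sum does not alias a or b. The additions stay
   in the original order, so no floating-point reassociation is needed. */
void reduce_in_register(const float *a, const float *b, size_t n, float *sum)
{
    float acc = *sum;               /* load the accumulator once */
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {    /* unrolled body, no stores */
        acc += a[i]     * b[i];
        acc += a[i + 1] * b[i + 1];
        acc += a[i + 2] * b[i + 2];
        acc += a[i + 3] * b[i + 3];
    }
    for (; i < n; ++i)              /* remainder iterations */
        acc += a[i] * b[i];

    *sum = acc;                     /* store the accumulator once */
}
```

Both functions compute the same value. The second one simply makes explicit the single load and single store that one would hope the optimizer could derive on its own once it can prove that the store does not alias the loads.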