thr3ads.net - search: "_p_scalar

Displaying 2 results from an estimated 2 matches for "_p_scalar_6".


[LLVMdev] How to unroll reduction loop with caching accumulator on register?

2013 Mar 11

...iv10, %p_.moved.to.4.cloned
%p_newGEPInst9.cloned = getelementptr float* inttoptr (i64 47246749696 to float*), i64 %p_
%p_newGEPInst12.cloned = getelementptr float* inttoptr (i64 47380971520 to float*), i64 %polly.loopiv10
%_p_scalar_5 = load float* %p_newGEPInst9.cloned, align 4, !tbaa !1
%_p_scalar_6 = load float* %p_newGEPInst12.cloned, align 4, !tbaa !2
%p_7 = fmul float %_p_scalar_5, %_p_scalar_6
%p_8 = fadd float %_p_scalar_, %p_7
store float %p_8, float* inttoptr (i64 47380979712 to float*), align 8192, !tbaa !0
%exitcond = icmp eq i64 %polly.next_loopiv, 512
br i1 %exitcond, lab...

[LLVMdev] How to unroll reduction loop with caching accumulator on register?

2013 Mar 11

Dear all, the attached notunrolled.ll is a module containing a reduction kernel. What I'm trying to do is unroll it in such a way that the partial reduction over the unrolled iterations is performed in a register and then stored to memory only once. Currently LLVM's unroller, together with all the standard optimizations, produces code that stores the value to memory after every unrolled iteration, which is...
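The transformation the message asks for can be sketched at the source level. The following C is only an illustration, not the poster's kernel: the function names `reduce_naive` and `reduce_unrolled`, the pointer parameters, and the unroll factor of 4 are all assumptions made for the example. The first function updates the accumulator in memory on every iteration; the second keeps the partial sum in a local variable (which a compiler can hold in a register) and stores it once after the loop.

```c
#include <stddef.h>

/* Hypothetical kernel: the accumulator lives in memory, so every
   iteration performs a load and a store through acc_mem. */
void reduce_naive(const float *a, const float *b, size_t n, float *acc_mem)
{
    for (size_t i = 0; i < n; ++i)
        *acc_mem += a[i] * b[i];
}

/* Same reduction, manually unrolled by 4 (an assumed factor).
   The partial sum is kept in the local variable acc and written
   back to memory exactly once. */
void reduce_unrolled(const float *a, const float *b, size_t n, float *acc_mem)
{
    float acc = *acc_mem;          /* single load of the accumulator */
    size_t i = 0;

    /* Main unrolled body: four products per trip, added in the
       original order so the summation order is unchanged. */
    for (; i + 4 <= n; i += 4) {
        acc += a[i]     * b[i];
        acc += a[i + 1] * b[i + 1];
        acc += a[i + 2] * b[i + 2];
        acc += a[i + 3] * b[i + 3];
    }

    /* Remainder loop for trip counts that are not a multiple of 4. */
    for (; i < n; ++i)
        acc += a[i] * b[i];

    *acc_mem = acc;                /* single store of the result */
}
```

Both functions add the products in the same order, so for the same inputs they should compute the same sum; the difference is only in how often the accumulator touches memory. Note that the second form assumes `acc_mem` does not alias `a` or `b`; in the naive form a compiler generally cannot make that assumption on its own.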
