search for: _p_scalar_

Displaying 2 results from an estimated 2 matches for "_p_scalar_".

2013 Mar 11
0
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
....moved.to.4.cloned = shl nsw i64 %1, 9
  br label %polly.loop_body
CUDA.AfterLoop.x:                                 ; preds = %polly.loop_body, %"Loop Function Root"
  ret void
polly.loop_body:                                  ; preds = %polly.loop_body, %CUDA.LoopHeader.x.preheader
  %_p_scalar_ = phi float [ 0.000000e+00, %CUDA.LoopHeader.x.preheader ], [ %p_8, %polly.loop_body ]
  %polly.loopiv10 = phi i64 [ 0, %CUDA.LoopHeader.x.preheader ], [ %polly.next_loopiv, %polly.loop_body ]
  %polly.next_loopiv = add i64 %polly.loopiv10, 1
  %p_ = add i64 %polly.loopiv10, %p_.moved.to.4.cloned...
2013 Mar 11
2
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
Dear all, The attached notunrolled.ll is a module containing a reduction kernel. What I'm trying to do is unroll it in such a way that the partial reduction over the unrolled iterations is performed in a register and then stored to memory only once. Currently LLVM's unroller, together with all the standard optimizations, produces code that stores the value to memory after every unrolled iteration, which is
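
For illustration only, the difference the post describes can be sketched in plain C. This is not the poster's kernel: the function names, the unroll factor of 4, and the flat float array are assumptions made for the example. The first function updates the output location on every iteration; the second keeps the partial sum in a local variable (which a compiler can hold in a register) across the unrolled iterations and stores it once. One possible reason a compiler keeps the per-iteration store in code shaped like the first function is that it cannot prove the input and output pointers do not alias; whether that applies to the attached module is not stated in the post.

```c
#include <stddef.h>

/* Hypothetical baseline: the accumulator lives in memory (*out), so every
 * iteration performs a load, an add, and a store through the pointer. */
void reduce_store_each(const float *in, size_t n, float *out)
{
    *out = 0.0f;
    for (size_t i = 0; i < n; ++i)
        *out += in[i];
}

/* Hypothetical target shape: the accumulator is a local scalar, the loop is
 * manually unrolled by an assumed factor of 4, and memory is written once. */
void reduce_in_register(const float *in, size_t n, float *out)
{
    float acc = 0.0f;   /* local accumulator; may be kept in a register */
    size_t i = 0;

    /* Unrolled body: four additions per trip, in the original order. */
    for (; i + 4 <= n; i += 4) {
        acc += in[i];
        acc += in[i + 1];
        acc += in[i + 2];
        acc += in[i + 3];
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; ++i)
        acc += in[i];

    *out = acc;         /* single store after the whole reduction */
}
```

Because the additions happen in the same order in both versions, they produce the same floating-point result as long as the output does not overlap the input; only the number of stores differs.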