Displaying 14 results from an estimated 14 matches for "afterloop".
2013 Mar 11
0
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
...ll ptx_device i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
%PositionOfBlockInGrid.x = shl i32 %ctaid.x, 9
%BlockLB.Add.ThreadPosInBlock.x = add i32 %PositionOfBlockInGrid.x, %tid.x
%isThreadLBgtLoopUB.x = icmp sgt i32 %BlockLB.Add.ThreadPosInBlock.x, 65535
br i1 %isThreadLBgtLoopUB.x, label %CUDA.AfterLoop.x, label %CUDA.LoopHeader.x.preheader
CUDA.LoopHeader.x.preheader: ; preds = %"Loop Function Root"
%1 = sext i32 %BlockLB.Add.ThreadPosInBlock.x to i64
store float 0.000000e+00, float* inttoptr (i64 47380979712 to float*), align 8192, !tbaa !0
%p_.moved.to.4.cl...
2013 Mar 11
2
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
Dear all,
The attached notunrolled.ll is a module containing a reduction kernel. What I'm
trying to do is unroll it in such a way that the partial reduction over the
unrolled iterations is performed in a register and then stored to
memory only once. Currently LLVM's unroller, together with all the standard
optimizations, produces code that stores the value to memory after every
unrolled iteration, which is
2010 May 28
4
[LLVMdev] Combining Branch Statements - Missing Optimization Pass?
...hen11
%storemerge26 = phi double [ %phitmp, %then11 ], [ 1.000000e+01, %endif.endif15_crit_edge ] ; <double> [#uses=1]
%lsr.iv.next = add i32 %lsr.iv, 1 ; <i32> [#uses=2]
%exitcond = icmp eq i32 %lsr.iv.next, 10001 ; <i1> [#uses=1]
br i1 %exitcond, label %afterloop, label %loop
afterloop: ; preds = %endif15
%r_fadd19 = fadd double %anotherFloat.1, %storemerge26 ; <double> [#uses=1]
%phitmp32 = fcmp ogt double %r_fadd19, 3.535200e+03 ; <i1> [#uses=1]
%storemerge27 = zext i1 %phitmp32 to i32 ; <...
2010 May 28
0
[LLVMdev] Combining Branch Statements - Missing Optimization Pass?
...erge26 = phi double [ %phitmp, %then11 ], [ 1.000000e+01, %endif.endif15_crit_edge ] ; <double> [#uses=1]
> %lsr.iv.next = add i32 %lsr.iv, 1 ; <i32> [#uses=2]
> %exitcond = icmp eq i32 %lsr.iv.next, 10001 ; <i1> [#uses=1]
> br i1 %exitcond, label %afterloop, label %loop
>
> afterloop: ; preds = %endif15
> %r_fadd19 = fadd double %anotherFloat.1, %storemerge26 ; <double> [#uses=1]
> %phitmp32 = fcmp ogt double %r_fadd19, 3.535200e+03 ; <i1> [#uses=1]
> %storemerge27 = zext i1 %phi...
2017 Jan 28
3
[RFC][PIR] Parallel LLVM IR -- Stage 0 -- IR extension
...fork label %task, label %latch
task:
%aptr = getelementptr i32, i32* %A, i32 0, i32 %i
%aval = load i32* %aptr
%cptr = getelementptr i32, i32* %C, i32 0, i32 %i
store i32 %aval, i32* %aptr
halt label %latch
latch:
%inc = add i32, i32 %i, i32 1
br label %header
exit:
join label %afterloop
afterloop:
...
(2) Reasoning:
The proposed approach is crafted such that the semantics of the parallel
program is represented correctly in almost native, low-level IR right
after front-end and preserved at any point till the final lowering to
sequential IR or parallel runtime library calls. To...
2010 Nov 22
0
[LLVMdev] Emitting LLVM IR for control flow
...adly get 3.000000 instead of 2.000000. This happens, I believe,
because the instruction
%faddtmp = fadd double %x1.0, 1.000000e+000 ; <double> [#uses=2]
is being generated before
%ltcmptmp = fcmp ult double %c.0, 2.000000e+000 ; <i1> [#uses=1]
br i1 %ltcmptmp, label %loop, label %afterloop
and therefore the loop body is emitted first and only afterwards we determine
whether the loop should exit. I was wondering if this is the intended
behaviour, since the fibi(x) example in chapter 7 uses this extra loop to
return correct values for fibonacci numbers, or perhaps a known bug in
Kaleid...
2017 Mar 08
5
(no subject)
...= load i32* %aptr
> > %cptr = getelementptr i32, i32* %C, i32 0, i32 %i
> > store i32 %aval, i32* %aptr
> > halt label %latch
> >
> > latch:
> > %inc = add i32, i32 %i, i32 1
> > br label %header
> >
> > exit:
> > join label %afterloop
> >
> > afterloop:
> > ...
> >
> >
> >
> > (2) Reasoning:
> > The proposed approach is crafted such that the semantics of the parallel
> > program is represented correctly in almost native, low-level IR right
> > after front-end and pr...
2017 Mar 08
3
(no subject)
...%cptr = getelementptr i32, i32* %C, i32 0, i32 %i
>>> store i32 %aval, i32* %aptr
>>> halt label %latch
>>>
>>> latch:
>>> %inc = add i32, i32 %i, i32 1
>>> br label %header
>>>
>>> exit:
>>> join label %afterloop
>>>
>>> afterloop:
>>> ...
>>>
>>>
>>>
>>> (2) Reasoning:
>>> The proposed approach is crafted such that the semantics of the
>>> parallel program is represented correctly in almost native,
>>> low-level IR...
2017 Mar 08
3
(no subject)
...32, i32* %C, i32 0, i32 %i
>>>> store i32 %aval, i32* %aptr
>>>> halt label %latch
>>>>
>>>> latch:
>>>> %inc = add i32, i32 %i, i32 1
>>>> br label %header
>>>>
>>>> exit:
>>>> join label %afterloop
>>>>
>>>> afterloop:
>>>> ...
>>>>
>>>>
>>>>
>>>> (2) Reasoning:
>>>> The proposed approach is crafted such that the semantics of the parallel
>>>> program is represented correctly in alm...
2017 Mar 08
4
(no subject)
...2 %i
> >>> store i32 %aval, i32* %aptr
> >>> halt label %latch
> >>>
> >>> latch:
> >>> %inc = add i32, i32 %i, i32 1
> >>> br label %header
> >>>
> >>> exit:
> >>> join label %afterloop
> >>>
> >>> afterloop:
> >>> ...
> >>>
> >>>
> >>>
> >>> (2) Reasoning:
> >>> The proposed approach is crafted such that the semantics of the
> >>> parallel program is represented correctly...
2017 Mar 08
2
(no subject)
...>>>>> halt label %latch
>>>>>>
>>>>>> latch:
>>>>>> %inc = add i32, i32 %i, i32 1
>>>>>> br label %header
>>>>>>
>>>>>> exit:
>>>>>> join label %afterloop
>>>>>>
>>>>>> afterloop:
>>>>>> ...
>>>>>>
>>>>>>
>>>>>>
>>>>>> (2) Reasoning:
>>>>>> The proposed approach is crafted such that the semantics of the
>...
2017 Mar 08
2
(no subject)
...l, i32* %aptr
> > >>> halt label %latch
> > >>>
> > >>> latch:
> > >>> %inc = add i32, i32 %i, i32 1
> > >>> br label %header
> > >>>
> > >>> exit:
> > >>> join label %afterloop
> > >>>
> > >>> afterloop:
> > >>> ...
> > >>>
> > >>>
> > >>>
> > >>> (2) Reasoning:
> > >>> The proposed approach is crafted such that the semantics of the
> > >>...
2017 Mar 08
3
[RFC][PIR] Parallel LLVM IR -- Stage 0 --
...>>>>> halt label %latch
>>>>>>
>>>>>> latch:
>>>>>> %inc = add i32, i32 %i, i32 1
>>>>>> br label %header
>>>>>>
>>>>>> exit:
>>>>>> join label %afterloop
>>>>>>
>>>>>> afterloop:
>>>>>> ...
>>>>>>
>>>>>>
>>>>>>
>>>>>> (2) Reasoning:
>>>>>> The proposed approach is crafted such that the semantics of the
>...
2017 Mar 08
2
[RFC][PIR] Parallel LLVM IR -- Stage 0 --
...>>>>
>>>>>>>> latch:
>>>>>>>> %inc = add i32, i32 %i, i32 1
>>>>>>>> br label %header
>>>>>>>>
>>>>>>>> exit:
>>>>>>>> join label %afterloop
>>>>>>>>
>>>>>>>> afterloop:
>>>>>>>> ...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> (2) Reasoning:
>>>>>>>> T...