Displaying 9 results from an estimated 9 matches for "loop_body".
2013 Mar 01
2
[LLVMdev] Interesting post increment situation in DAG combiner
...N, short __attribute__ ((aligned (16))) *A,
short __attribute__ ((aligned (16))) val)
{
unsigned i,j;
for (i=0; i<N; i++) {
for (j=0; j<N; j++) {
A[i*N+j] += val;
}
}
}
The innermost loop looks like this right before the DAG selection begins.
p.loop_body.us65: ; preds =
%p.loop_body.lr.ph.us78, %p.loop_body.us65
%p_arrayidx.us69.phi = phi i16* [ %p_arrayidx.us69.gep,
%p.loop_body.lr.ph.us78 ], [ %p_arrayidx.us69.inc, %p.loop_body.us65 ]
%p.loopiv48.us66 = phi i32 [ 0, %p.loop_body.lr.ph.us78 ], [
%p.next_loopiv.us67,...
2015 Jul 16
3
[LLVMdev] why LoopUnswitch pass does not constant fold conditional branch and merge blocks
Hi,
I have a general question on the LoopUnswitch pass.
Consider the following IR snippet:
define i32 @test(i1 %cond) {
br label %loop_begin
loop_begin:
br i1 %cond, label %loop_body, label %loop_exit
loop_body:
br label %do_something
do_something:
call void @some_func() noreturn nounwind
br label %loop_begin
loop_exit:
ret i32 0
}
declare void @some_func() noreturn
After running it through "opt -loop-unswitch -S", it unswitched loop on %cond and produced...
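For readers unfamiliar with the transformation, here is a minimal source-level sketch of loop unswitching in general. It is not the IR from this thread and not actual `opt` output; the function names `sum_original` and `sum_unswitched` and the `flag` parameter are hypothetical. A branch on a loop-invariant condition is hoisted out of the loop, leaving one specialized copy of the loop per outcome.

```c
#include <stddef.h>

/* Hypothetical example: 'flag' is loop-invariant but tested every iteration. */
int sum_original(const int *a, size_t n, int flag)
{
    int s = 0;
    for (size_t i = 0; i < n; i++) {
        if (flag)
            s += a[i];
        else
            s -= a[i];
    }
    return s;
}

/* Hand-unswitched form: the invariant test runs once, and each copy of the
   loop contains only the code for its own outcome. */
int sum_unswitched(const int *a, size_t n, int flag)
{
    int s = 0;
    if (flag) {
        for (size_t i = 0; i < n; i++)
            s += a[i];
    } else {
        for (size_t i = 0; i < n; i++)
            s -= a[i];
    }
    return s;
}
```

In each specialized copy the condition is a known constant, so the branch on it can in principle be folded away and the remaining blocks merged.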
2013 Mar 01
0
[LLVMdev] Interesting post increment situation in DAG combiner
...gned
> (16))) val)
> {
> unsigned i,j;
> for (i=0; i<N; i++) {
> for (j=0; j<N; j++) {
> A[i*N+j] += val;
> }
> }
> }
>
> The innermost loop looks like this right before the DAG selection
> begins.
>
> p.loop_body.us65: ; preds =
> %p.loop_body.lr.ph.us78, %p.loop_body.us65
> %p_arrayidx.us69.phi = phi i16* [ %p_arrayidx.us69.gep,
> %p.loop_body.lr.ph.us78 ], [ %p_arrayidx.us69.inc, %p.loop_body.us65
> ]
> %p.loopiv48.us66 = phi i32 [ 0, %p.loop_body.lr.ph.us78...
2013 Mar 01
1
[LLVMdev] Interesting post increment situation in DAG combiner
...(16))) val) { unsigned i,j; for
> > (i=0; i<N; i++) {
> > for (j=0; j<N; j++) {
> > A[i*N+j] += val;
> > }
> > }
> > }
> >
> > The innermost loop looks like this right before the DAG selection
> > begins.
> >
> > p.loop_body.us65: ; preds =
> > %p.loop_body.lr.ph.us78, %p.loop_body.us65
> > %p_arrayidx.us69.phi = phi i16* [ %p_arrayidx.us69.gep,
> > %p.loop_body.lr.ph.us78 ], [ %p_arrayidx.us69.inc, %p.loop_body.us65
> > ]
> > %p.loopiv48.us66 = phi i32 [ 0, %p.l...
2013 Mar 11
0
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
...A.LoopHeader.x.preheader: ; preds = %"Loop Function
Root"
%1 = sext i32 %BlockLB.Add.ThreadPosInBlock.x to i64
store float 0.000000e+00, float* inttoptr (i64 47380979712 to float*),
align 8192, !tbaa !0
%p_.moved.to.4.cloned = shl nsw i64 %1, 9
br label %polly.loop_body
CUDA.AfterLoop.x: ; preds =
%polly.loop_body, %"Loop Function Root"
ret void
polly.loop_body: ; preds =
%polly.loop_body, %CUDA.LoopHeader.x.preheader
%_p_scalar_ = phi float [ 0.000000e+00, %CUDA.LoopHeader.x.preheade...
2013 Mar 11
2
[LLVMdev] How to unroll reduction loop with caching accumulator on register?
Dear all,
The attached notunrolled.ll is a module containing a reduction kernel. What I'm
trying to do is to unroll it in such a way that the partial reduction over the
unrolled iterations is performed in a register and then stored to
memory only once. Currently LLVM's unroller together with all standard
optimizations produces code that stores the value to memory after every
unrolled iteration, which is
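The difference the poster describes can be sketched in plain C. This is an illustration only, not the contents of notunrolled.ll: the function names, the `out`/`in` parameters, and the unroll factor of 4 are all assumptions. The first version updates memory on every iteration; the second keeps the partial sum in a local variable that the compiler can hold in a register and stores it once after the loop.

```c
#include <stddef.h>

/* Hypothetical kernel: the running sum lives in memory (*out), so every
   iteration reads and writes it. */
void reduce_store_every_iteration(float *out, const float *in, size_t n)
{
    *out = 0.0f;
    for (size_t i = 0; i < n; i++)
        *out += in[i];
}

/* Same reduction, unrolled by an assumed factor of 4, with the partial sum
   kept in a local accumulator and written back exactly once. */
void reduce_register_accumulator(float *out, const float *in, size_t n)
{
    float acc = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        acc += in[i];
        acc += in[i + 1];
        acc += in[i + 2];
        acc += in[i + 3];
    }
    for (; i < n; i++)   /* remainder iterations */
        acc += in[i];

    *out = acc;          /* single store after the loop */
}
```

The additions happen in the same order in both versions, so the floating-point result is unchanged. Whether a compiler may do this rewrite on its own depends, among other things, on proving that `out` does not alias `in`.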
2013 Mar 01
0
[LLVMdev] parallel loop metadata simplification
----- Original Message -----
> From: "Paul Redmond" <paul.redmond at intel.com>
> To: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, February 28, 2013 1:30:57 PM
> Subject: [LLVMdev] parallel loop metadata simplification
>
> Hi,
>
> I've been working on clang codegen for #pragma ivdep and creating the
>
2013 Feb 28
5
[LLVMdev] parallel loop metadata simplification
Hi,
I've been working on clang codegen for #pragma ivdep and creating the llvm.mem.parallel_loop_access metadata seems quite difficult. The main problem is that there are so many places where loads and stores are created and all of them need to be changed when emitting a parallel loop. Note that creating llvm.loop.parallel is not a problem.
One option is to modify IRBuilder to enable
2013 May 15
2
[LLVMdev] [polly] Polly Loop info and LoopSimplify functionality
...p_header
polly.loop_after: ; preds =
%polly.loop_header
br label %polly.merge_new_and_old ; This is the exit from the loop
polly.loop_header: ; preds =
%polly.stmt.for.body6, %polly.start
... <some code>
br i1 %9, label %polly.loop_body, label %polly.loop_after
polly.loop_body: ; preds =
%polly.loop_header
br label %polly.stmt.for.body6
polly.stmt.for.body6: ; preds = %polly.loop_body
... <some code>
br label %polly.loop_header
The question is - is the polly...