thr3ads.net - search: "loop

Displaying 9 results from an estimated 9 matches for "loop_body".

[LLVMdev] Interesting post increment situation in DAG combiner

2013 Mar 01

...N, short __attribute__ ((aligned (16))) *A, short __attribute__ ((aligned (16))) val) { unsigned i,j; for (i=0; i<N; i++) { for (j=0; j<N; j++) { A[i*N+j] += val; } } } The innermost loop looks like this right before the DAG selection begins. p.loop_body.us65: ; preds = %p.loop_body.lr.ph.us78, %p.loop_body.us65 %p_arrayidx.us69.phi = phi i16* [ %p_arrayidx.us69.gep, %p.loop_body.lr.ph.us78 ], [ %p_arrayidx.us69.inc, %p.loop_body.us65 ] %p.loopiv48.us66 = phi i32 [ 0, %p.loop_body.lr.ph.us78 ], [ %p.next_loopiv.us67,...

[LLVMdev] why LoopUnswitch pass does not constant fold conditional branch and merge blocks

2015 Jul 16

Hi, I have a general question on LoopUnswitch pass. Consider the following IR snippet: define i32 @test(i1 %cond) { br label %loop_begin loop_begin: br i1 %cond, label %loop_body, label %loop_exit loop_body: br label %do_something do_something: call void @some_func() noreturn nounwind br label %loop_begin loop_exit: ret i32 0 } declare void @some_func() noreturn After running it through "opt -loop-unswitch -S", it unswitched loop on %cond and produced...
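For readers unfamiliar with the transformation the post asks about, here is a minimal hand-written C sketch of loop unswitching in general: a branch on a loop-invariant condition is hoisted out of the loop, leaving one specialized loop per outcome. The function names `sum_before` and `sum_after` are hypothetical, and this is not the output of `opt` for the IR quoted above.

```c
#include <stddef.h>

/* Before: the loop-invariant flag `cond` is tested on every iteration. */
int sum_before(const int *a, size_t n, int cond) {
    int s = 0;
    for (size_t i = 0; i < n; i++) {
        if (cond)
            s += a[i];
        else
            s -= a[i];
    }
    return s;
}

/* After unswitching by hand: the test on `cond` runs once, and each
 * branch gets its own copy of the loop with no conditional in the body. */
int sum_after(const int *a, size_t n, int cond) {
    int s = 0;
    if (cond) {
        for (size_t i = 0; i < n; i++)
            s += a[i];
    } else {
        for (size_t i = 0; i < n; i++)
            s -= a[i];
    }
    return s;
}
```

Both versions compute the same result; the second trades code size for a branch-free loop body.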

[LLVMdev] Interesting post increment situation in DAG combiner

2013 Mar 01

...gned > (16))) val) > { > unsigned i,j; > for (i=0; i<N; i++) { > for (j=0; j<N; j++) { > A[i*N+j] += val; > } > } > } > > The innermost loop looks like this right before the DAG selection > begins. > > p.loop_body.us65: ; preds = > %p.loop_body.lr.ph.us78, %p.loop_body.us65 > %p_arrayidx.us69.phi = phi i16* [ %p_arrayidx.us69.gep, > %p.loop_body.lr.ph.us78 ], [ %p_arrayidx.us69.inc, %p.loop_body.us65 > ] > %p.loopiv48.us66 = phi i32 [ 0, %p.loop_body.lr.ph.us78...

[LLVMdev] Interesting post increment situation in DAG combiner

2013 Mar 01

...(16))) val) { unsigned i,j; for > > (i=0; i<N; i++) { > > for (j=0; j<N; j++) { > > A[i*N+j] += val; > > } > > } > > } > > > > The innermost loop looks like this right before the DAG selection > > begins. > > > > p.loop_body.us65: ; preds = > > %p.loop_body.lr.ph.us78, %p.loop_body.us65 > > %p_arrayidx.us69.phi = phi i16* [ %p_arrayidx.us69.gep, > > %p.loop_body.lr.ph.us78 ], [ %p_arrayidx.us69.inc, %p.loop_body.us65 > ] > > %p.loopiv48.us66 = phi i32 [ 0, %p.l...

[LLVMdev] How to unroll reduction loop with caching accumulator on register?

2013 Mar 11

...A.LoopHeader.x.preheader: ; preds = %"Loop Function Root" %1 = sext i32 %BlockLB.Add.ThreadPosInBlock.x to i64 store float 0.000000e+00, float* inttoptr (i64 47380979712 to float*), align 8192, !tbaa !0 %p_.moved.to.4.cloned = shl nsw i64 %1, 9 br label %polly.loop_body CUDA.AfterLoop.x: ; preds = %polly.loop_body, %"Loop Function Root" ret void polly.loop_body: ; preds = %polly.loop_body, %CUDA.LoopHeader.x.preheader %_p_scalar_ = phi float [ 0.000000e+00, %CUDA.LoopHeader.x.preheade...

[LLVMdev] How to unroll reduction loop with caching accumulator on register?

2013 Mar 11

Dear all, The attached notunrolled.ll is a module containing a reduction kernel. What I'm trying to do is unroll it in such a way that the partial reduction over the unrolled iterations is performed in a register and then stored to memory only once. Currently LLVM's unroller, together with all standard optimizations, produces code that stores the value to memory after every unrolled iteration, which is
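The goal described in this post can be illustrated at the source level with a small C sketch. It is not the poster's kernel: the function names, the unroll factor of 4, and the array-sum reduction are all assumptions chosen for illustration. The first function keeps the running sum in memory, so each iteration may need a load and a store (the compiler cannot assume `out` and `in` do not alias). The second accumulates in a local variable, which can live in a register, and writes to memory once.

```c
#include <stddef.h>

/* Running sum kept in memory through `out`: without aliasing
 * information, every iteration may reload and re-store *out. */
void reduce_in_memory(const float *in, size_t n, float *out) {
    *out = 0.0f;
    for (size_t i = 0; i < n; i++)
        *out += in[i];
}

/* Manually unrolled by 4 (hypothetical factor) with a local
 * accumulator: partial sums stay in `acc`, one store at the end. */
void reduce_in_register(const float *in, size_t n, float *out) {
    float acc = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        acc += in[i];
        acc += in[i + 1];
        acc += in[i + 2];
        acc += in[i + 3];
    }
    /* Remainder iterations when n is not a multiple of 4. */
    for (; i < n; i++)
        acc += in[i];
    *out = acc; /* single store to memory */
}
```

The additions happen in the same order in both versions, so the floating-point result is unchanged; only the number of memory accesses differs.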

[LLVMdev] parallel loop metadata simplification

2013 Mar 01

----- Original Message ----- > From: "Paul Redmond" <paul.redmond at intel.com> > To: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu> > Sent: Thursday, February 28, 2013 1:30:57 PM > Subject: [LLVMdev] parallel loop metadata simplification > > Hi, > > I've been working on clang codegen for #pragma ivdep and creating the >

[LLVMdev] parallel loop metadata simplification

2013 Feb 28

Hi, I've been working on clang codegen for #pragma ivdep and creating the llvm.mem.parallel_loop_access metadata seems quite difficult. The main problem is that there are so many places where loads and stores are created and all of them need to be changed when emitting a parallel loop. Note that creating llvm.loop.parallel is not a problem. One option is to modify IRBuilder to enable

[LLVMdev] [polly] Polly Loop info and LoopSimplify functionality

2013 May 15

...p_header polly.loop_after: ; preds = %polly.loop_header br label %polly.merge_new_and_old // This is exit from the loop polly.loop_header: ; preds = %polly.stmt.for.body6, %polly.start ... <some code> br i1 %9, label %polly.loop_body, label %polly.loop_after polly.loop_body: ; preds = %polly.loop_header br label %polly.stmt.for.body6 polly.stmt.for.body6: ; preds = %polly.loop_body ... <some code> br label %polly.loop_header The question is - is the polly...
