thr3ads.net - search: "ctaid"

Displaying 6 results from an estimated 6 matches for "ctaid".

Did you mean: catid

[LLVMdev] How to unroll reduction loop with caching accumulator on register?

2013 Mar 11

[LLVMdev] How to unroll reduction loop with caching accumulator on register?

...rget triple = "nvptx64-unknown-unknown" @__kernelgen_version = constant [15 x i8] c"0.2/1654:1675M\00" define ptx_kernel void @__kernelgen_matvec_loop_7(i32* nocapture) #0 { "Loop Function Root": %tid.x = tail call ptx_device i32 @llvm.nvvm.read.ptx.sreg.tid.x() %ctaid.x = tail call ptx_device i32 @llvm.nvvm.read.ptx.sreg.ctaid.x() %PositionOfBlockInGrid.x = shl i32 %ctaid.x, 9 %BlockLB.Add.ThreadPosInBlock.x = add i32 %PositionOfBlockInGrid.x, %tid.x %isThreadLBgtLoopUB.x = icmp sgt i32 %BlockLB.Add.ThreadPosInBlock.x, 65535 br i1 %isThreadLBgtLoopUB.x,...

[LLVMdev] How to unroll reduction loop with caching accumulator on register?

2013 Mar 11

[LLVMdev] How to unroll reduction loop with caching accumulator on register?

Dear all, Attached notunrolled.ll is a module containing reduction kernel. What I'm trying to do is to unroll it in such way, that partial reduction on unrolled iterations would be performed on register, and then stored to memory only once. Currently llvm's unroller together with all standard optimizations produce code, which stores value to memory after every unrolled iteration, which is

Information about the number of indices in memory accesses

2020 Sep 23

Information about the number of indices in memory accesses

...simple stuff i am getting the right answer but when the index expression becomes more complex multiple GEPs are introduced. For instance: *(A+2*(blockDim.x*blockIdx.x+threadIdx.x+1)+2+3) = 5; produces: %6 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x() %7 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x() %8 = mul i32 %6, %7, %9 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() %10 = add i32 %8, %9, %11 = add i32 %10, 1, %12 = mul i32 2, %11, %13 = zext i32 %12 to i64, %14 = getelementptr inbounds i32, i32* %0, i64 %13 %15 = getelementptr inbounds i32, i32* %14, i64 2 %16 =...

Information about the number of indices in memory accesses

2020 Oct 03

Information about the number of indices in memory accesses

...becomes more complex multiple GEPs are introduced. For > > instance: > > > > *(A+2*(blockDim.x*blockIdx.x+threadIdx.x+1)+2+3) = 5; > > > > produces: > > > > %6 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x() > > %7 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x() > > %8 = mul i32 %6, %7, > > %9 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() > > %10 = add i32 %8, %9, > > %11 = add i32 %10, 1, > > %12 = mul i32 2, %11, > > %13 = zext i32 %12 to i64, > > %14 = getelementptr inbounds i32, i32* %0, i64 %13...

Legal names for Functions and other Identifiers

2017 Jun 22

Legal names for Functions and other Identifiers

Thanks for the heads up Philip ! I did come across a strange case where LLVM allowed "%" to be a part of a function's name. This was in the context of my patch https://reviews.llvm.org/D33985, where I prefix the name of the source function and the Scop ( A special kind of Region that Polly can optimize, the name of the Scop is the name of the Region ) to the name of the PTX kernel

Information about the number of indices in memory accesses

2020 Oct 03

Information about the number of indices in memory accesses

...gt; > instance: >>> > >>> > *(A+2*(blockDim.x*blockIdx.x+threadIdx.x+1)+2+3) = 5; >>> > >>> > produces: >>> > >>> > %6 = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x() >>> > %7 = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x() >>> > %8 = mul i32 %6, %7, >>> > %9 = call i32 @llvm.nvvm.read.ptx.sreg.tid.x() >>> > %10 = add i32 %8, %9, >>> > %11 = add i32 %10, 1, >>> > %12 = mul i32 2, %11, >>> > %13 = zext i32 %12 to i64, >>> &...

search for: ctaid