search for: selp

Displaying 8 results from an estimated 8 matches for "selp".

2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...odr)) __inline__ int __any(int a) { int result; asm __volatile__ ("{ \n\t" ".reg .pred \t%%p1; \n\t" ".reg .pred \t%%p2; \n\t" "setp.ne.u32 \t%%p1, %1, 0; \n\t" "vote.any.pred \t%%p2, %%p1; \n\t" "selp.s32 \t%0, 1, 0, %%p2; \n\t" "}" : "=r"(result) : "r"(a)); return result; } > clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown test.cu -o test.ll > cat test.ll ; ModuleID = 'test.cu' target datalayout = "e-p:64:64...
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...> int result; > asm __volatile__ ("{ \n\t" > ".reg .pred \t%%p1; \n\t" > ".reg .pred \t%%p2; \n\t" > "setp.ne.u32 \t%%p1, %1, 0; \n\t" > "vote.any.pred \t%%p2, %%p1; \n\t" > "selp.s32 \t%0, 1, 0, %%p2; \n\t" > "}" : "=r"(result) : "r"(a)); > return result; > } > > > clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown test.cu -o test.ll > > cat test.ll > ; ModuleID = 'test.cu'...
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...> int result; > asm __volatile__ ("{ \n\t" > ".reg .pred \t%%p1; \n\t" > ".reg .pred \t%%p2; \n\t" > "setp.ne.u32 \t%%p1, %1, 0; \n\t" > "vote.any.pred \t%%p2, %%p1; \n\t" > "selp.s32 \t%0, 1, 0, %%p2; \n\t" > "}" : "=r"(result) : "r"(a)); > return result; > } > > > clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown > test.cu -o test.ll > > cat test.ll > ; ModuleID = 'test.cu...
2010 Mar 27
0
data fitting and confidence band
...w many times the pointwise confidence interval at x=0.5 contains the true value at 0.5 # i.e. what is the so called "coverage rate"? # ========================================================================================= pos = which(x==0.5) sum(abs(estlp[pos,] - m(x[pos])) <= 1.96*selp[pos,]) # equidistant x outputs 946 # non-equidistant x outputs 938 sum(abs(estss[pos,] - m(x[pos])) <= 1.96*sess[pos,]) # equidistant x outputs 895 # non-equidistant x outputs 936 T...
2015 Nov 05
1
[PATCH envytools] envydis: gk110: Add support for dadd with an immediate src
...3ull, N("add"), T(frm2a), N("f64"), DSTD, T(neg33), T(abs31), SRC1D, T(neg3b), T(di2) }, { 0x0400000000000001ull, 0x37c0000000000003ull, N("mul"), T(frm2a), T(neg3b), N("f64"), DSTD, SRC1D, T(di2) }, { 0x0500000000000001ull, 0x37c0000000000003ull, N("selp"), DST, SRC1, T(i3bi2), T(pnot2d), PSRC3 }, { 0x07c0000000000001ull, 0x37c0000000000003ull, N("rshf"), N("b32"), DST, SESTART, T(us64_28), SRC1, SRC3, SEEND, T(shfclamp), T(sui2b) }, // d = (s1 >> s2) | (s3 << (32 - s2)) -- 2.5.0
2010 Aug 26
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...'m not sure why these would be easier with one model over another. > It's a lot of hand-lowering and manual optimization either way.  Can you > explain? > The codegen is smart enough to translate a simple if-else block like if (pred) return A; else return B; into one instruction selp A, B, pred Also codegen has branch-folding support so it would be easier (this is my guess, I've not yet started). I didn't try many examples, but I was convinced that it should be easier. >> All in all, I would propose a PTX backend in codegen approach after I >> have implement...
2010 Aug 23
2
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
Che-Liang Chiou <clchiou at gmail.com> writes: > Hi there, > > Thank Nick for kindly reviewing the patch. Here is the link to the > source code of the PTX backend; it would help Nick review the patch. > http://lime.csie.ntu.edu.tw/~clchiou/llvm-ptx-backend.tar.gz Great! > I decided to take the code generator approach (referred to as codegen > approach) rather than C
2015 Feb 23
2
[Mesa-dev] [PATCH 2/2] nvc0/ir: improve precision of double RCP/RSQ results
Does this give correct results for special floats (0, infs)? We tried to improve (for single floats) x86 rcp in llvmpipe with newton-raphson, but unfortunately not being able to give correct results for these two cases (without even more additional code) meant it got all disabled in the end (you can still see that code in the driver) since the problems are at least as bad as those due to bad