search for: __device__

Displaying 10 results from an estimated 10 matches for "__device__".

2018 Mar 23
2
cuda cross compiling issue for target aarch64-linux-androideabi
...inux-androideabi/include/math_functions.hpp:3477:19: error: no matching function for call to '__isinf' if (a == 0.0 || __isinf(a)) { ^~~~~~~ ../cuda/targets/aarch64-linux-androideabi/include/math_functions_dbl_ptx3.hpp:165:38: note: candidate function not viable: call to __device__ function from __host__ function __MATH_FUNCTIONS_DBL_PTX3_DECL__ int __isinf(double a) __THROW ^ In file included from <built-in>:1: In file included from ../prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_...
2018 Mar 23
0
cuda cross compiling issue for target aarch64-linux-androideabi
...math_functions.hpp:3477:19: > error: no matching function for call to '__isinf' > if (a == 0.0 || __isinf(a)) { > ^~~~~~~ > ../cuda/targets/aarch64-linux-androideabi/include/math_functions_dbl_ptx3.hpp:165:38: > note: candidate function not viable: call to __device__ function > from __host__ function > __MATH_FUNCTIONS_DBL_PTX3_DECL__ int __isinf(double a) __THROW > ^ > In file included from <built-in>:1: > In file included from > ../prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1...
2017 Aug 16
3
CUDA separate compilation
Clang currently doesn't support CUDA separate compilation and thus extern __device__ functions and variables cannot be used. Could someone give me any pointers where to look or what has to be done to support this? If at all possible, I'd like to see what's missing and possibly try to tackle it.
2016 Oct 14
2
LLVM/CLANG: CUDA compilation fail for inline assembly code
...ects/galois/lonestargpu/download> benchmark suite with LLVM/CLANG. This suite has the following piece of code (more info here <https://devtalk.nvidia.com/default/topic/481465/cuda-programming-and-performance/any-way-to-know-on-which-sm-a-thread-is-running-/2/?offset=21#4996171> ):
    static __device__ uint get_smid(void) {
      uint ret;
      asm("mov.u32 %0, %smid;" : "=r"(ret) );
      return ret;
    }
The original make file has nvcc compiler with a flag -Xptxas -v. It compiles with nvcc. LLVM has -Xcuda-ptxas <arg>, which I believe is the comparable command for compiling PTX c...
2013 Mar 20
2
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
Thanks a lot Justin, I will remove the toolkit header. Just one last question..(maybe ;) ) If I do away with toolkit headers it says unknown type name '__device__'. Does this function qualifier have an alternative ? or I can just do away with ?
2013 Mar 20
0
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
On Wed, Mar 20, 2013 at 11:29 AM, upit <uday_pitambare at yahoo.com> wrote: > OK. That helps. > It does flash a warning though > > [DEVICE-C++] nbody.kernel.cpp > nbody.kernel.cpp:29:9: warning: '__constant__' macro redefined > #define __constant__ __attribute__((address_space(2))) > ^ > /opt/cuda/include/host_defines.h:183:9: note: previous
2013 Mar 21
0
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
...4 -Xclang -target-cpu -Xclang sm_20 -S On Wed, Mar 20, 2013 at 3:29 PM, upit <uday_pitambare at yahoo.com> wrote: > Thanks a lot Justin, > > I will remove the toolkit header. Just one last question..(maybe ;) ) If I > do away with toolkit headers it says unknown type name '__device__'. Does > this function qualifier have an alternative ? or I can just do away with ? > > > > > > -- > View this message in context: > http://llvm.1065342.n5.nabble.com/UNREACHABLE-executed-error-while-trying-to-generate-PTX-tp56026p56093.html > Sent from the LLVM -...
2013 Mar 20
2
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
OK. That helps. It does flash a warning though [DEVICE-C++] nbody.kernel.cpp nbody.kernel.cpp:29:9: warning: '__constant__' macro redefined #define __constant__ __attribute__((address_space(2))) ^ /opt/cuda/include/host_defines.h:183:9: note: previous definition is here #define __constant__ \ ^ 1 warning generated. Another question is What about extern __shared__ ? I
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and
2016 Mar 10
4
instrumenting device code with gpucc
...lways fails on the > first cudaMalloc call in host code (the kernel had not even been launched > yet), with the error code being 30 (cudaErrorUnknown). In my > instrumentation pass, I only inserted a hook function upon each access to > device memory, with their signatures being: "__device__ void > _Cool_MemRead_Hook(uint64_t addr)". I've compiled these hook functions > into a shared object, and linked the axpy binary with it. > > I'm really sorry to bother you again, but I wonder whether any step I did > was apparently wrong, or there's any gpucc-spe...