similar to: Separate compilation of CUDA code?

Displaying 20 results from an estimated 1000 matches similar to: "Separate compilation of CUDA code?"

2017 Jun 09
1
NVPTX Back-end: relocatable device code support for dynamic parallelism
Hi everyone, CUDA allows to call some runtime functions also from the device code. On a multi-GPU system this allows the GPU to determine its device id on its own via cudaGetDevice(). Unfortunately i cannot get it working when compiling with clang. When compiling with nvcc relocatable device code needs to be set to true (-rdc=true) and the cudadevrt is needed when linking [0]. I did not
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated
2017 Jun 17
2
Separate compilation of CUDA code?
Hi, I wonder whether the current version of LLVM supports separate compilation and linking of device code, i.e., is there a flag analogous to nvcc's --relocatable-device-code flag? If not, is there any plan to support this? Thanks! Yuanfeng Peng -------------- next part -------------- An HTML attachment was scrubbed... URL:
2016 Aug 01
3
[GPUCC] link against libdevice
OK, I see the problem. You were right that we weren't picking up libdevice. CUDA 7.0 only ships with the following libdevice binaries (found /path/to/cuda/nvvm/libdevice): libdevice.compute_20.10.bc libdevice.compute_30.10.bc libdevice.compute_35.10.bc If you ask for sm_50 with cuda 7.0, clang can't find a matching libdevice binary, and it will apparently silently give up and try to
2016 Aug 01
0
[GPUCC] link against libdevice
Hi Justin, Thanks for your response! The clang & llvm I'm using was built from source. Below is the output of compiling with -v. Any suggestions would be appreciated! *clang version 3.9.0 (trunk 270145) (llvm/trunk 270133)* *Target: x86_64-unknown-linux-gnu* *Thread model: posix* *InstalledDir: /usr/local/bin* *Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8*
2016 Aug 01
2
[GPUCC] link against libdevice
Hi, Yuanfeng. What version of clang are you using? CUDA is only known to work at tip of head, so you must build clang yourself from source. I suspect that's your problem, but if building from source doesn't fix it, please attach the output of compiling with -v. Regards, -Justin On Sun, Jul 31, 2016 at 9:24 PM, Chandler Carruth <chandlerc at google.com> wrote: > Directly
2015 Aug 21
3
[CUDA/NVPTX] is inlining __syncthreads allowed?
Hi Justin, Is a compiler allowed to inline a function that calls __syncthreads? I saw nvcc does that, but not sure it's valid though. For example, void foo() { __syncthreads(); } if (threadIdx.x % 2 == 0) { ... foo(); } else { ... foo(); } Before inlining, all threads meet at one __syncthreads(). After inlining if (threadIdx.x % 2 == 0) { ... __syncthreads(); } else { ...
2015 Apr 08
5
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
Hi, I wanted to ask whether there is ongoing effort (or an already established tool) that enables to convert CUDA kernels (that uses CUDA specific intrinsics, e.g., threadId.x, __syncthreads(), ...) to LLVM IR. I am aware that I can do this for OpenCL with the help of libclc but I can not find something similar for CUDA. Thanks -------------- next part -------------- An HTML attachment was
2016 Oct 27
3
problem on compiling cuda program with clang++
Hi all, I compiled the *llvm3.9* source code on the *Nvidia TX1* board. And now I am following the document in the docs/CompileCudaWithLLVM.rst to compile cuda program with clang++. However, when I compile `axpy.cu` using `nvcc`, *nvcc* can generate the correct the binary; while compiling `axpy.cu` using clang++, the detailed command is `clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_53
2008 Nov 04
1
Help needed using 3rd party C library/functions from within R (Nvidia CUDA)
Hello, I'm trying to combine the parallel computing power available through NVIDIA CUDA (www.nvidia.com/cuda) from within R. CUDA is an extension to the C language, so I thought it would be possible to do this. If I have a C file with an empty function which includes a needed CUDA library (cutil.h) and compile this to an .so file using a NVIDIA compiler (nvcc), called 'myFunc.so' I
2015 Aug 21
2
[CUDA/NVPTX] is inlining __syncthreads allowed?
I'm using 7.0. I am attaching the reduced example. nvcc sync.cu -arch=sm_35 -ptx gives // .globl _Z3foov .visible .entry _Z3foov( ) { .reg .pred %p<2>; .reg .s32 %r<3>; mov.u32 %r1, %tid.x; and.b32 %r2, %r1, 1; setp.eq.b32 %p1, %r2, 1; @!%p1 bra BB7_2; bra.uni
2016 Oct 14
2
LLVM/CLANG: CUDA compilation fail for inline assembly code
Hi, I am sorry for sending this query again here, but maybe I sent it to wrong list yesterday. I am trying to compile LonestarGPU-rev2.0 <http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download> benchmark suite with LLVM/CLANG. This suite has a following piece of code (more info here
2015 Apr 08
2
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
On Wed, Apr 8, 2015 at 10:12 AM, Dmitry Mikushin <dmitry at kernelgen.org> wrote: > A tool of this kind here: https://github.com/apc-llc/nvcc-llvm-ir > > 2015-04-08 19:01 GMT+02:00 Ahmed ElTantawy <ahmede at ece.ubc.ca>: > >> Hi, >> >> I wanted to ask whether there is ongoing effort (or an already >> established tool) that enables to convert CUDA
2017 Oct 05
2
CUDA tools?
Hi, again. So, kmod-nvidia installed. Trouble is, I have no tool to test it. And my user might need nvcc, which, of course, is only provided by the NVidia CUDA, which won't install, because it conflicts with kmod-nvidia. Has *anyone* dealt with this? If so, what was your solution? mark
2011 Aug 15
2
[LLVMdev] Cuda programs on LLVM
Hello , How to execute a cuda program using llvm? More specifically, nvcc produces some temporary files during its compilation. I want to convert the .cu.cpp to .ll format and optimize it. The .cu.cpp file contains typedefs and enums used by cuda runtime and also the host part of the code and the ptx file contains the kernel definition. How can i run the program after optimization? Will Rhodin
2012 Nov 08
3
[LLVMdev] translating from OpenMP to CUDA
Hi, Is it possible to translate an OpenMP program to CUDA using LLVM? I read that dragonegg has a OpenMP front-end and LLVM has a PTX back-end. I don't know how mature these tools are. Please let me know. Thanks. -Apala Postdoctoral Scholar Department of Computer Science, University of Chicago Computation Institute, Argonne National Laboratory http://sites.google.com/site/apalaguha/home/
2013 Jul 18
2
question about Makeconf and nvcc/CUDA
Dear R development: I'm not sure if this is the appropriate list, but it's a start. I would like to put together a package which contains a CUDA program on Windows 7. I believe that it has to do with the Makeconf file in the etc directory. But when I just use the nvcc with the shared option, I can use the dyn.load command, but when I use the is.loaded function, it shows FALSE.
2012 Nov 09
0
[LLVMdev] translating from OpenMP to CUDA
The PTX back-end is robust (it's based on the sources used by nvcc), but I'm not sure about the OpenMP representation in LLVM IR. I believe the OpenMP constructs are already lowered into libgomp calls before leaving DragonEgg. It's been awhile since I've loooked at it though. If you use the PTX back-end and have any issues, please don't hesitate to post to the list and cc:
2016 Mar 15
2
instrumenting device code with gpucc
Hi Jingyue, Sorry to ask again, but how exactly could I glue the fatbin with the instrumented host code? Or does it mean we actually cannot instrument both the host & device code at the same time? Thanks! yuanfeng On Tue, Mar 15, 2016 at 10:09 AM, Jingyue Wu <jingyue at google.com> wrote: > Including fatbin into host code should be done in frontend. > > On Mon, Mar 14, 2016
2016 Mar 12
2
instrumenting device code with gpucc
Hey Jingyue, Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect anchor didn't go away; ptxas is still complaining about the duplicate definition of of function '_ZL21__nvvm_reflect_anchorv' . Did I misused the nvvm-reflect pass? Thanks! yuanfeng On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote: > According to the examples you