similar to: making cuda-based R package CRAN friendly

Displaying 20 results from an estimated 6000 matches similar to: "making cuda-based R package CRAN friendly"

2020 Jul 30
2
Status of CUDA 11 support
Hi, I work in a large CUDA codebase and use Clang to build some of our CUDA code to improve compilation speed. We're planning to upgrade to CUDA 11 soon, and it appears that CUDA 11 is not yet supported in LLVM. >From the LLVM commits history, I can see that work on CUDA 11 has started. Is this currently being worked on? What is the remaining work left? And is any help needed to finish
2018 Mar 23
0
cuda cross compiling issue for target aarch64-linux-androideabi
+Artem Belevich <tra at google.com> On Fri, Mar 23, 2018 at 7:53 PM Bharath Bhoopalam via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I was wondering if anyone has encountered this issue when cross compiling > cuda on Nvidia TX2 running android. > > The error is > In file included from <built-in>:1: > In file included from >
2018 Mar 23
2
cuda cross compiling issue for target aarch64-linux-androideabi
I was wondering if anyone has encountered this issue when cross compiling cuda on Nvidia TX2 running android. The error is In file included from <built-in>:1: In file included from prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_wrapper.h:219: ../cuda/targets/aarch64-linux-androideabi/include/math_functions.hpp:3477:19: error: no matching function
2014 Jun 03
1
cuda-memcheck to debug CUDA-enabled R packages
I'm building a simple R extension around a CUDA-enabled dynamic library, and I want to run the whole package with cuda-memcheck for debugging purposes. I can run it just fine with Valgrind: $ R --no-save -d valgrind < test.R However, if I try the same thing with cuda-memcheck, $ R --no-save -d cuda-memcheck < test.R I get: *** Further command line arguments ('--no-save ')
2011 Feb 25
2
Missing R.h
Hi, I'm trying to install a module - gputools - and keep getting compile time errors about missing R.h Does anyone know where this file can be found? Thanks!
2017 Oct 06
0
CUDA tools?
On Thu, 2017-10-05 at 17:07 -0400, m.roth at 5-cent.us wrote: > vychytraly . wrote: > > On Thu, Oct 5, 2017 at 9:51 PM, <m.roth at 5-cent.us> wrote: > > > > > > So, kmod-nvidia installed. Trouble is, I have no tool to test it. And my > > > user might need nvcc, which, of course, is only provided by the NVidia > > > CUDA, which won't install,
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated
2020 Nov 19
0
JIT compiling CUDA source code
I have made a bit of progress... When compiling CUDA source code in memory, the Compilation instance returned by Driver::BuildCompilation() contains two clang Commands: one for the host and one for the CUDA device. I can execute both commands using EmitLLVMOnlyActions. I add the Module from the host compilation to my JIT as usual, but... what to do with the Module from the device compilation? If I
2016 Dec 21
2
llvm/cuda: Indentify kernel functions and optimizations
Hi, I am trying to instrument CUDA kernel functions only (llvm-3.9.0). Is there a way to identify cuda kernel functions? I see that in llvm IR for CUDA has nvvm annotations section, where kernel functions are identified for NVPTX usage. I can parse the whole IR for this kernel metadata and then proceed, but this is very clumsy. Other way is to work with cuda-device-only IR. But then I am not
2013 Nov 18
0
Quadrified GTX 480 VT-d passthrough. CUDA 5.5 in Linux partial success!
Hi everyone, after following in the footsteps of the following discussion (http://lists.xenproject.org/archives/html/xen-users/2013-09/msg00106.html) I had been able to turn my GTX 480 into a Quadro 6000. When I VT-d passthrough it to a Debian jessie VM it shows up fine and CUDA 5.5 seems to function properly up to a point: lspci -v: 00:04.0 VGA compatible controller: NVIDIA Corporation GF100GL
2016 Dec 21
0
llvm/cuda: Indentify kernel functions and optimizations
https://github.com/llvm-mirror/llvm/blob/652375a8cc49615de31fd9d424753795059185b6/lib/Target/NVPTX/NVPTXUtilities.h#L58 Does this solve your problem? On Wed, Dec 21, 2016 at 2:29 PM, Gurunath Kadam via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > I am trying to instrument CUDA kernel functions only (llvm-3.9.0). > > Is there a way to identify cuda kernel
2020 Sep 24
2
cuda __shfl_sync problem
Hi, First of all, i'm not sure if i should be posting this here or in cfe-dev, but here it goes. In order to instrument CUDA kernels i first generate device IR with: clang++ -x cuda --cuda-device-only -emit-llvm --cuda-gpu-arch=sm_52 -o device.bc I also have a library that contains the instrumentation stubs for which i generate IR similarly and i link it with the device IR
2020 Nov 19
1
JIT compiling CUDA source code
Sound right now like you are emitting an LLVM module? The best strategy is probably to use to emit a PTX module and then pass that to the CUDA driver. This is what we do on the Julia side in CUDA.jl. Nvidia has a somewhat helpful tutorial on this at https://github.com/NVIDIA/cuda-samples/blob/c4e2869a2becb4b6d9ce5f64914406bf5e239662/Samples/vectorAdd_nvrtc/vectorAdd.cpp and
2020 Sep 25
2
cuda __shfl_sync problem
Do you mean in llc? Because i don't see such an option i'm afraid. ~George On 24-09-2020 20:54, Johannes Doerfert wrote: > Not that I am an expert but it looks like it defaults to the minimal > PTX version that supports the compute capability. You might be able to > choose PTX 6.0 though. > > ~ Johannes > > > On 9/24/20 1:02 PM, George K via llvm-dev wrote:
2015 Apr 08
2
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
On Wed, Apr 8, 2015 at 10:12 AM, Dmitry Mikushin <dmitry at kernelgen.org> wrote: > A tool of this kind here: https://github.com/apc-llc/nvcc-llvm-ir > > 2015-04-08 19:01 GMT+02:00 Ahmed ElTantawy <ahmede at ece.ubc.ca>: > >> Hi, >> >> I wanted to ask whether there is ongoing effort (or an already >> established tool) that enables to convert CUDA
2016 Oct 27
0
problem on compiling cuda program with clang++
(+llvm-dev) My question was whether your host machine, the one which is running the compiler, is ARM (as opposed to x86 or POWER). The header you pointed to was in "aarch64-linux-gnu", which made me think you might be on an ARM system. If you are not running linux x86, it is not likely to work. If you are running linux x86, we will need much more details about your system in order to
2017 Aug 02
2
CUDA compilation "No available targets are compatible with this triple." problem
Yes, I followed the guide. The same error showed up: >clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_35 -L/usr/local/cuda/lib64 -I/usr/local/cuda/include -lcudart_static -ldl -lrt -pthread error: unable to create target: 'No available targets are compatible with this triple.' ________________________________ From: Kevin Choi <code.kchoi at gmail.com> Sent: Wednesday, August 2,
2016 Oct 27
0
problem on compiling cuda program with clang++
Hi, it looks like you're compiling CUDA for an ARM host? This is not a configuration we have tested, nor is it something we have the capability of testing at the moment. You may be able to make it work by providing the appropriate -isystem flags to clang so that it can find your headers, but who knows, it may be more complicated than that. Regards, -Justin On Wed, Oct 26, 2016 at 9:59 PM,
2011 Feb 10
1
BLAS optimization by CUBLAS
Dear colleagues! In early 2009 there was a discussion about fast BLAS library initiated by Sachin. He reported a faster BLAS library made by Nvidia CUBLAS library. Uwe Ligges showed an interest for placing the optimized rblas.dll into windows/contrib section managed by him. Unfortunately there is no any CUBLAS version of rblas.dll in this section at present. So, is anybody interested in CUBLAS
2017 Sep 27
2
OrcJIT + CUDA Prototype for Cling
Dear LLVM-Developers and Vinod Grover, we are trying to extend the cling C++ interpreter (https://github.com/root-project/cling) with CUDA functionality for Nvidia GPUs. I already developed a prototype based on OrcJIT and am seeking for feedback. I am currently a stuck with a runtime issue, on which my interpreter prototype fails to execute kernels with a CUDA runtime error. === How to use the