search for: fatbinari

Displaying 14 results from an estimated 14 matches for "fatbinari".

2017 Sep 27
2
OrcJIT + CUDA Prototype for Cling
Dear LLVM-Developers and Vinod Grover, we are trying to extend the cling C++ interpreter (https://github.com/root-project/cling) with CUDA functionality for Nvidia GPUs. I have already developed a prototype based on OrcJIT and am seeking feedback. I am currently stuck on a runtime issue: my interpreter prototype fails to execute kernels, aborting with a CUDA runtime error. === How to use the
2017 Nov 14
1
OrcJIT + CUDA Prototype for Cling
Hi Lang, thank you very much. I've used your code, and creating the object file works. I think the problem lies after the object file is created. When I link the object file with ld, I get an executable that works correctly. After switching the clang and llvm libraries from the packaged (.deb) version to my own build with debug options, I get an assert() failure. In void
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated
2020 Nov 17
2
JIT compiling CUDA source code
We have an application that allows the user to compile and execute C++ code on the fly, using Orc JIT v2 via the LLJIT class, and we would like to extend it so the user can provide CUDA source code as well, for GPU programming. But I am having a hard time figuring out how to do it. To JIT-compile C++ code, we basically do the following: 1. call Driver::BuildCompilation(), which returns a
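
(A hedged sketch of the in-memory compile-and-JIT flow this thread describes, assuming LLVM/Clang 11-era APIs. It drives EmitLLVMOnlyAction directly; the "-cc1"-style arguments would come from the Commands that Driver::BuildCompilation() produces. Error handling is mostly elided, and none of this code is from the thread itself.)

    // Hedged sketch (LLVM/Clang 11-era APIs): compile C++ in memory with
    // EmitLLVMOnlyAction, then hand the resulting module to an Orc LLJIT.
    #include "clang/CodeGen/CodeGenAction.h"
    #include "clang/Frontend/CompilerInstance.h"
    #include "clang/Frontend/CompilerInvocation.h"
    #include "llvm/ExecutionEngine/Orc/LLJIT.h"
    #include "llvm/Support/Error.h"
    #include "llvm/Support/TargetSelect.h"

    using namespace llvm;
    using namespace llvm::orc;

    // Args are "-cc1"-style frontend arguments; in the flow above they
    // would come from the Commands returned by Driver::BuildCompilation().
    Expected<std::unique_ptr<LLJIT>> jitCompile(ArrayRef<const char *> Args) {
      InitializeNativeTarget();
      InitializeNativeTargetAsmPrinter();

      clang::CompilerInstance CI;
      CI.createDiagnostics();
      auto Inv = std::make_shared<clang::CompilerInvocation>();
      clang::CompilerInvocation::CreateFromArgs(*Inv, Args,
                                                CI.getDiagnostics());
      CI.setInvocation(std::move(Inv));

      // Run the frontend and IR generation; the module is created in Ctx.
      auto Ctx = std::make_unique<LLVMContext>();
      clang::EmitLLVMOnlyAction Action(Ctx.get());
      if (!CI.ExecuteAction(Action))
        return createStringError(inconvertibleErrorCode(),
                                 "compilation failed");

      // Hand the module to LLJIT; symbols can then be looked up and run.
      auto J = LLJITBuilder().create();
      if (!J)
        return J.takeError();
      if (Error E = (*J)->addIRModule(
              ThreadSafeModule(Action.takeModule(), std::move(Ctx))))
        return std::move(E);
      return std::move(*J);
    }

A JIT'd symbol would then be found via (*J)->lookup("name") and called through its address.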
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and
2018 Sep 10
9
[RfC] A proposal of adding SPIR-V Toolchain in Clang
Hello, in 2015 Khronos switched to the new portable intermediate format SPIR-V, which replaced the original SPIR. The advantage is that it offers higher portability across different toolchains. There was a talk about it at a Dev Meeting: http://llvm.org/devmtg/2017-03//2017/02/20/accepted-sessions.html#17 LLVM currently supports only the SPIR format for OpenCL in Clang. Several Khronos
2020 Nov 19
1
JIT compiling CUDA source code
It sounds like you are emitting an LLVM module right now? The best strategy is probably to emit a PTX module and then pass that to the CUDA driver. This is what we do on the Julia side in CUDA.jl. Nvidia has a somewhat helpful tutorial on this at https://github.com/NVIDIA/cuda-samples/blob/c4e2869a2becb4b6d9ce5f64914406bf5e239662/Samples/vectorAdd_nvrtc/vectorAdd.cpp and
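
(For reference, the driver-API route suggested above looks roughly like the following hedged sketch. The PTX string would come from NVRTC or the NVPTX back-end; "vectorAdd" is a placeholder kernel name, error checks are omitted, and the code is illustrative, not from the thread.)

    // Hedged sketch: load a PTX image via the CUDA driver API and launch
    // a kernel from it. "vectorAdd" is a placeholder kernel name.
    #include <cuda.h>

    void launchFromPTX(const char *ptx) {
      cuInit(0);
      CUdevice dev;
      cuDeviceGet(&dev, 0);
      CUcontext ctx;
      cuCtxCreate(&ctx, 0, dev);

      CUmodule mod;
      cuModuleLoadData(&mod, ptx);          // driver JIT-compiles the PTX
      CUfunction fn;
      cuModuleGetFunction(&fn, mod, "vectorAdd");

      // Launch one block of one thread; a parameterless kernel assumed.
      cuLaunchKernel(fn, 1, 1, 1, 1, 1, 1, 0, /*stream=*/nullptr,
                     /*kernelParams=*/nullptr, /*extra=*/nullptr);
      cuCtxSynchronize();
    }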
2018 Sep 11
3
[RfC] A proposal of adding SPIR-V Toolchain in Clang
On Mon, 10 Sep 2018 at 18:47, Nicholas Wilson via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I was going to wait until Neil Trevett got back to me about becoming a > SPIR-V TSG advisor but this seems like just as good an opportunity. Please > see the previous discussion [1] if you have not already; there were many > relevant points made. > > First, I’d like to note
2017 Jun 09
1
NVPTX Back-end: relocatable device code support for dynamic parallelism
Hi everyone, CUDA allows some runtime functions to be called from device code as well. On a multi-GPU system this lets the GPU determine its own device id via cudaGetDevice(). Unfortunately I cannot get it working when compiling with clang. When compiling with nvcc, relocatable device code needs to be enabled (-rdc=true) and cudadevrt is needed when linking [0]. I did not
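
(A minimal reproducer for the situation described above might look as follows; cudaGetDevice() here is the device-runtime variant, which is why nvcc requires -rdc=true and linking cudadevrt. The code is illustrative, not from the thread.)

    // Hedged reproducer: a runtime call from device code. With nvcc this
    // needs relocatable device code: nvcc -rdc=true repro.cu -lcudadevrt
    #include <cstdio>

    __global__ void whoAmI() {
      int dev = -1;
      cudaGetDevice(&dev);                  // device-side runtime call
      printf("kernel running on device %d\n", dev);
    }

    int main() {
      whoAmI<<<1, 1>>>();
      cudaDeviceSynchronize();
      return 0;
    }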
2018 Sep 12
3
[RfC] A proposal of adding SPIR-V Toolchain in Clang
> On Sep 11, 2018, at 7:39 PM, Tom Stellard via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On 09/11/2018 12:50 PM, Richard Smith via llvm-dev wrote: >> On Mon, 10 Sep 2018 at 18:47, Nicholas Wilson via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> I was going to wait until Neil Trevett got back to me
2020 Nov 19
0
JIT compiling CUDA source code
I have made a bit of progress... When compiling CUDA source code in memory, the Compilation instance returned by Driver::BuildCompilation() contains two clang Commands: one for the host and one for the CUDA device. I can execute both commands using EmitLLVMOnlyAction instances. I add the Module from the host compilation to my JIT as usual, but... what to do with the Module from the device compilation? If I
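
(One plausible way to handle the device-side Module, consistent with the PTX suggestion elsewhere in this thread: lower it to PTX with the NVPTX back-end and load the result through the CUDA driver API, as in the sketch above. A hedged sketch against LLVM 11-era APIs; the TargetRegistry header moved to llvm/MC/ in LLVM 14. Not code from the thread.)

    // Hedged sketch: lower the device-side llvm::Module to a PTX string
    // using the NVPTX back-end (LLVM must be built with the NVPTX target).
    #include "llvm/ADT/SmallString.h"
    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/TargetRegistry.h" // llvm/MC/ in LLVM >= 14
    #include "llvm/Support/TargetSelect.h"
    #include "llvm/Support/raw_ostream.h"
    #include "llvm/Target/TargetMachine.h"

    std::string emitPTX(llvm::Module &M) {
      LLVMInitializeNVPTXTargetInfo();
      LLVMInitializeNVPTXTarget();
      LLVMInitializeNVPTXTargetMC();
      LLVMInitializeNVPTXAsmPrinter();

      const std::string Triple = "nvptx64-nvidia-cuda";
      std::string Err;
      const llvm::Target *T = llvm::TargetRegistry::lookupTarget(Triple, Err);
      llvm::TargetMachine *TM = T->createTargetMachine(
          Triple, /*CPU=*/"sm_35", /*Features=*/"", llvm::TargetOptions(),
          llvm::None);

      // Run codegen; PTX is textual, so ask for an assembly file.
      llvm::SmallString<0> PTX;
      llvm::raw_svector_ostream OS(PTX);
      llvm::legacy::PassManager PM;
      TM->addPassesToEmitFile(PM, OS, nullptr, llvm::CGFT_AssemblyFile);
      PM.run(M);
      return PTX.str().str();
    }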
2018 Sep 13
2
[RfC] A proposal of adding SPIR-V Toolchain in Clang
On Wed, 12 Sep 2018 at 16:52, Tom Stellard via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On 09/12/2018 02:32 PM, Matthias Braun wrote: > > > > > >> On Sep 11, 2018, at 7:39 PM, Tom Stellard via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> > >> On 09/11/2018 12:50 PM, Richard Smith via llvm-dev wrote: > >>> On Mon,
2018 Sep 12
3
[RfC] A proposal of adding SPIR-V Toolchain in Clang
On Tue, 11 Sep 2018 at 19:40, Tom Stellard via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On 09/11/2018 12:50 PM, Richard Smith via llvm-dev wrote: > > On Mon, 10 Sep 2018 at 18:47, Nicholas Wilson via llvm-dev < > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > > > I was going to wait until Neil Trevett got back to me
2016 Mar 10
4
instrumenting device code with gpucc
It's hard to tell what is wrong without a concrete example. E.g., what is the program you are instrumenting? What is the definition of the hook function? How did you link that definition with the binary? One thing suspicious to me is that you may have linked the definition of _Cool_MemRead_Hook as a host function instead of a device function. AFAIK, PTX assembly cannot be linked. So, if you