similar to: JIT compiling CUDA source code

Displaying 20 results from an estimated 3000 matches similar to: "JIT compiling CUDA source code"

2020 Nov 19
1
JIT compiling CUDA source code
Sounds like right now you are emitting an LLVM module? The best strategy is probably to emit a PTX module and then pass that to the CUDA driver. This is what we do on the Julia side in CUDA.jl. Nvidia has a somewhat helpful tutorial on this at https://github.com/NVIDIA/cuda-samples/blob/c4e2869a2becb4b6d9ce5f64914406bf5e239662/Samples/vectorAdd_nvrtc/vectorAdd.cpp and
2020 Nov 19
0
JIT compiling CUDA source code
I have made a bit of progress... When compiling CUDA source code in memory, the Compilation instance returned by Driver::BuildCompilation() contains two clang Commands: one for the host and one for the CUDA device. I can execute both commands using EmitLLVMOnlyActions. I add the Module from the host compilation to my JIT as usual, but... what to do with the Module from the device compilation? If I
2010 Aug 18
0
[LLVMdev] clang: call extern function using JIT
I tried what you said; now I get: LLVM ERROR: Program used external function 'yipee' which could not be resolved! Stack dump: 0. Running pass 'X86 Machine Code Emitter' on function '@main'. It did not even get as far as a breakpoint. Óscar Fuentes wrote: > > gafferuk <gafferuk at gmail.com> writes: > >> I'm confused. The function I wish to call is
2010 Aug 18
1
[LLVMdev] clang: call extern function using JIT
Here's my full code listing; I'm totally stuck. // Whistle.cpp : Defines the entry point for the console application. // #include "stdafx.h" #include "clang/CodeGen/CodeGenAction.h" #include "clang/Driver/Compilation.h" #include "clang/Driver/Driver.h" #include "clang/Driver/Tool.h" #include
2010 Aug 18
2
[LLVMdev] clang: call extern function using JIT
gafferuk <gafferuk at gmail.com> writes: > I'm confused. The function I wish to call has a return type of int. > I'm calling it with int dd = yipee(1); > > What's wrong? Declare the function: int yipee(int); int main() { int dd = yipee(1); return 0; } If that still crashes, put a breakpoint on `yipee' and see if the execution gets there, if the argument is
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated
2017 Sep 27
2
OrcJIT + CUDA Prototype for Cling
Dear LLVM-Developers and Vinod Grover, we are trying to extend the cling C++ interpreter (https://github.com/root-project/cling) with CUDA functionality for Nvidia GPUs. I already developed a prototype based on OrcJIT and am seeking feedback. I am currently stuck on a runtime issue in which my interpreter prototype fails to execute kernels with a CUDA runtime error. === How to use the
2017 Nov 14
1
OrcJIT + CUDA Prototype for Cling
Hi Lang, thank you very much. I've used your code, and creating the object file works. I think the problem occurs after creating the object file. When I link the object file with ld I get an executable, which works correctly. After changing the clang and llvm libraries from the packaged version (.deb) to a self-compiled version with debug options, I get an assert() failure. In void
2020 Aug 10
2
[EXTERNAL] Re: Orc JIT v2 breaks OpenMP in 11.x branch?
Yeah, I remember encountering that error before when getting it to pass the libomp test suite. If you have a struct named "ident_t" somewhere the compiler will rename it because of the conflict with the runtime declaration. This should be solved by casting the usage to the function type found in the definition (i.e. bitcasting a struct.ident_t.21 to struct.ident_t) which solved the
2016 Sep 09
2
defaults for FP contraction [e.g. fused multiply-add]: suggestion and patch to be slightly more aggressive and to make Clang's optimization settings closer to having the same meaning as when they are given to GCC [at least for "-O3"]
Dear all, In the process of investigating a performance difference between Clang & GCC when both compile the same non-toolchain program while using the "same"* compiler flags, I have found something that may be worth changing in Clang, developed a patch, and confirmed that the patch has its intended effect. *: "same" in quotes b/c the essence of the problem is that the
2020 Aug 10
2
[EXTERNAL] Re: Orc JIT v2 breaks OpenMP in 11.x branch?
Yep, it happens three times, then crashes afterwards, since I removed the assert... arg 0: expected %struct.ident_t* got %struct.ident_t.21* value @0 = private unnamed_addr global %struct.ident_t.21 { i32 0, i32 514, i32 0, i32 0, i8* getelementptr inbounds ([23 x i8], [23 x i8]* @.str, i32 0, i32 0) }, align 8 arg 0: expected %struct.ident_t* got %struct.ident_t.21* value @1 = private
2020 Jun 03
2
[cfe-dev] [RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM
On Tue, Jun 2, 2020 at 6:38 PM Richard Smith via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Tue, 2 Jun 2020 at 05:08, Andrzej Warzynski via cfe-dev < > cfe-dev at lists.llvm.org> wrote: > >> *TL;DR* >> >> We propose some non-trivial refactoring in Clang and LLVM to enable >> further work on Flang driver. >> >> *SUMMARY* >> We
2020 Aug 10
2
[EXTERNAL] Re: Orc JIT v2 breaks OpenMP in 11.x branch?
Thanks, Joseph and Johannes. I have not merged in anything, I am using the code from the repository as is. What is this -debug-only option, and to whom would I pass it? I am running our own JIT application, which uses clang to compile modules on the fly via clang::CompilerInstance::ExecuteAction(). Working on the assumption that there is a mismatch in the declared type of an OpenMP runtime
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and
2013 Oct 03
0
[LLVMdev] libclang JIT frontend
Hi, I'm not sure if this is a libclang, llvm::cl or clang-interpreter issue so I'll try posting here for a response. I am using libclang as a frontend to the LLVM JIT (3.3 release). I started from the clang-interpreter example and have everything working (given a C/C++ source file I can have it JIT'd to memory and executed) for a single run. When I try to compile a second source
2020 Jun 02
12
[RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM
*TL;DR* We propose some non-trivial refactoring in Clang and LLVM to enable further work on Flang driver. *SUMMARY* We would like to start extracting the driver/frontend code from Clang (alongside the code that the driver/frontend depends on, e.g. Diagnostics) and move the components that could be re-used by non-C-based languages to LLVM. From our initial investigation we see that these
2020 Aug 10
2
[EXTERNAL] Re: Orc JIT v2 breaks OpenMP in 11.x branch?
Hi, That patch was from an ongoing effort to consolidate OpenMP generation in clang. If memory serves, the implementation there is still a little incomplete. It's supposed to use types from OMPConstants rather than ones it defined itself, and the methods used to create the functions shouldn't need to be static. However, attempting this caused a lot of errors so there might be an underlying
2020 Aug 10
2
Orc JIT v2 breaks OpenMP in 11.x branch?
Hi Geoff, Nothing in that backtrace leaps out at me. Based on the stack trace and description my first guess would be a clang misconfiguration rather than a JIT bug. How is that clang invocation being made? Is it from inside a callback from ORC, or is it before you add your module to the JIT? -- Lang. On Mon, Aug 3, 2020 at 5:41 AM Geoff Levner <glevner at gmail.com> wrote: > Here,
2015 Apr 08
5
[LLVMdev] CUDA front-end (CUDA to LLVM IR)
Hi, I wanted to ask whether there is an ongoing effort (or an already established tool) that enables converting CUDA kernels (that use CUDA-specific intrinsics, e.g., threadIdx.x, __syncthreads(), ...) to LLVM IR. I am aware that I can do this for OpenCL with the help of libclc but I cannot find anything similar for CUDA. Thanks
2020 Jul 30
2
Status of CUDA 11 support
Hi, I work in a large CUDA codebase and use Clang to build some of our CUDA code to improve compilation speed. We're planning to upgrade to CUDA 11 soon, and it appears that CUDA 11 is not yet supported in LLVM. From the LLVM commit history, I can see that work on CUDA 11 has started. Is this currently being worked on? What is the remaining work left? And is any help needed to finish