thr3ads.net - similar to: "[LLVMdev] How to use llvm as the backend for cuda?"

Displaying 20 results from an estimated 9000 matches similar to: "[LLVMdev] How to use llvm as the backend for cuda?"

2020 Sep 25

cuda __shfl_sync problem

Do you mean in llc? Because i don't see such an option i'm afraid. ~George On 24-09-2020 20:54, Johannes Doerfert wrote: > Not that I am an expert but it looks like it defaults to the minimal > PTX version that supports the compute capability. You might be able to > choose PTX 6.0 though. > > ~ Johannes > > > On 9/24/20 1:02 PM, George K via llvm-dev wrote:

cuda __shfl_sync problem

2020 Sep 24

cuda __shfl_sync problem

Hi, First of all, i'm not sure if i should be posting this here or in cfe-dev, but here it goes. In order to instrument CUDA kernels i first generate device IR with: clang++ -x cuda --cuda-device-only -emit-llvm --cuda-gpu-arch=sm_52 -o device.bc I also have a library that contains the instrumentation stubs for which i generate IR similarly and i link it with the device IR

PTX generation from CUDA file for compute capability 1.0 (sm_10)

2016 Jun 02

PTX generation from CUDA file for compute capability 1.0 (sm_10)

Hello Bergström/Eric, Thanks for the reply. The G80(sm_10) architecture was ported on FPGA by a group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf). Our group have some further research interest on this work. I was working on modifying the Clang-LLVM for a couple of months and achieved the required changes. But Clang-LLVM is only allowing me to generate PTX for sm_20,

[LLVMdev] Cuda programs on LLVM

2011 Aug 15

[LLVMdev] Cuda programs on LLVM

Hi Adarsh, to my knowledge there is no publicly available CUDA-Frontend for LLVM yet. The work of Helge Rhodin you mentioned is on the backend-side: It allows to generate PTX code from LLVM IR. It is still being maintained, although I think the currently available source code is a little outdated. There is also a PTX backend in the current version of LLVM that makes use of LLVM's

OrcJIT + CUDA Prototype for Cling

2017 Sep 27

OrcJIT + CUDA Prototype for Cling

Dear LLVM-Developers and Vinod Grover, we are trying to extend the cling C++ interpreter (https://github.com/root-project/cling) with CUDA functionality for Nvidia GPUs. I already developed a prototype based on OrcJIT and am seeking for feedback. I am currently a stuck with a runtime issue, on which my interpreter prototype fails to execute kernels with a CUDA runtime error. === How to use the

Debug info for Cuda

2017 Nov 06

Debug info for Cuda

06.11.2017 14:56, Robinson, Paul пишет: >> Hi everybody, >> As you know, Cuda/NVPTX target has very limited support of the debug >> info in Clang/LLVM. Currently, LLVM supports only emission of the line >> numbers debug info. >> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM >> translates the source code to LLVM IR, which is then lowered to

[LLVMdev] Cuda programs on LLVM

2011 Aug 15

[LLVMdev] Cuda programs on LLVM

Hello , How to execute a cuda program using llvm? More specifically, nvcc produces some temporary files during its compilation. I want to convert the .cu.cpp to .ll format and optimize it. The .cu.cpp file contains typedefs and enums used by cuda runtime and also the host part of the code and the ptx file contains the kernel definition. How can i run the program after optimization? Will Rhodin

RFC: Debug info for Cuda

2017 Nov 06

RFC: Debug info for Cuda

Hi everybody, As you know, Cuda/NVPTX target has very limited support of the debug info in Clang/LLVM. Currently, LLVM supports only emission of the line numbers debug info. This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to PTX (parallel thread execution) intermediate file. This PTX file represents special kind of

LLVM/CLANG: CUDA compilation fail for inline assembly code

2016 Oct 14

LLVM/CLANG: CUDA compilation fail for inline assembly code

Hi, I am sorry for sending this query again here, but maybe I sent it to wrong list yesterday. I am trying to compile LonestarGPU-rev2.0 <http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download> benchmark suite with LLVM/CLANG. This suite has a following piece of code (more info here

Debug info for Cuda

2017 Nov 08

Debug info for Cuda

Nobody blames ptxas. I'm not saying that these are the troubles, I'm just saying that it has some features and we have some problems to be solved. But lack of labels, label arithmetics in DWARF sections is the real problem, because LLVM actively uses it in DWARF sections Best regards, Alexey Bataev 8 нояб. 2017 г., в 5:35, Madhur Amilkanthwar <madhur13490 at

OrcJIT + CUDA Prototype for Cling

2017 Nov 14

OrcJIT + CUDA Prototype for Cling

Hi Lang, thank You very much. I've used Your code and the creating of the object file works. I think the problem is after creating the object file. When I link the object file with ld I get an executable, which is working right. After changing the clang and llvm libraries from the package control version (.deb) to a own compiled version with debug options, I get an assert() fault. In void

PTX generation from CUDA file for compute capability 1.0 (sm_10)

2016 Jun 02

PTX generation from CUDA file for compute capability 1.0 (sm_10)

Hello, When generating the PTX output from CUDA file(.cu file), the minimum target that is accepted by LLVM is sm_20. But I have a specific requirement to generate PTX output for compute capability 1.0 (sm_10). Is there any previous version of LLVM supporting this? Thank you, Ginu -------------- next part -------------- An HTML attachment was scrubbed... URL:

JIT compiling CUDA source code

2020 Nov 19

JIT compiling CUDA source code

Sound right now like you are emitting an LLVM module? The best strategy is probably to use to emit a PTX module and then pass that to the CUDA driver. This is what we do on the Julia side in CUDA.jl. Nvidia has a somewhat helpful tutorial on this at https://github.com/NVIDIA/cuda-samples/blob/c4e2869a2becb4b6d9ce5f64914406bf5e239662/Samples/vectorAdd_nvrtc/vectorAdd.cpp and

[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all

2013 Feb 09

[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all

The lack of an open-source vector math library (which is what you suggest here) prompted me to start a project "vecmathlib", available at < https://bitbucket.org/eschnett/vecmathlib>. This library provides almost all math functions available in libm, implemented in a vectorised manner, i.e. suitable for SSE2/AVX/MIC/PTX etc. In its current state the library has rough edges, e.g.

[CUDA] Lost debug information when compiling CUDA code

2017 Jun 14

[CUDA] Lost debug information when compiling CUDA code

Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated

[LLVMdev] translating from OpenMP to CUDA

2012 Nov 09

[LLVMdev] translating from OpenMP to CUDA

The PTX back-end is robust (it's based on the sources used by nvcc), but I'm not sure about the OpenMP representation in LLVM IR. I believe the OpenMP constructs are already lowered into libgomp calls before leaving DragonEgg. It's been awhile since I've loooked at it though. If you use the PTX back-end and have any issues, please don't hesitate to post to the list and cc:

JIT compiling CUDA source code

2020 Nov 17

JIT compiling CUDA source code

We have an application that allows the user to compile and execute C++ code on the fly, using Orc JIT v2, via the LLJIT class. And we would like to extend it to allow the user to provide CUDA source code as well, for GPU programming. But I am having a hard time figuring out how to do it. To JIT compile C++ code, we do basically as follows: 1. call Driver::BuildCompilation(), which returns a

[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all

2013 Jun 05

[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all

Dear all, FWIW, I've tested libdevice.compute_20.10.bc and libdevice.compute_30.10.bc from /cuda/nvvm/libdevice shipped with CUDA 5.5 preview. IR is compatible with LLVM 3.4 trunk that we use. Results are correct, performance - almost the same as what we had before with cicc-sniffed IR, or maybe <10% better. Will test libdevice.compute_35.10.bc once we will get K20 support. Thanks for

[LLVMdev] translating from OpenMP to CUDA

2012 Nov 08

[LLVMdev] translating from OpenMP to CUDA

Hi, Is it possible to translate an OpenMP program to CUDA using LLVM? I read that dragonegg has a OpenMP front-end and LLVM has a PTX back-end. I don't know how mature these tools are. Please let me know. Thanks. -Apala Postdoctoral Scholar Department of Computer Science, University of Chicago Computation Institute, Argonne National Laboratory http://sites.google.com/site/apalaguha/home/

[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all

2013 Feb 08

[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all

Yes, it helps a lot and we are working on it. A few questions, 1) What will be your use model of this library? Will you run optimization phases after linking with the library? If so, what are they? 2) Do you care if the names of functions differ from those in libm? For example, it would be gpusin() instead of sin(). 3) Do you need a different library for different host

similar to: [LLVMdev] How to use llvm as the backend for cuda?