similar to: [LLVMdev] How to use llvm as the backend for cuda?

Displaying 20 results from an estimated 7000 matches similar to: "[LLVMdev] How to use llvm as the backend for cuda?"

2020 Sep 25
2
cuda __shfl_sync problem
Do you mean in llc? Because i don't see such an option i'm afraid. ~George On 24-09-2020 20:54, Johannes Doerfert wrote: > Not that I am an expert but it looks like it defaults to the minimal > PTX version that supports the compute capability. You might be able to > choose PTX 6.0 though. > > ~ Johannes > > > On 9/24/20 1:02 PM, George K via llvm-dev wrote:
2020 Sep 24
2
cuda __shfl_sync problem
Hi, First of all, i'm not sure if i should be posting this here or in cfe-dev, but here it goes. In order to instrument CUDA kernels i first generate device IR with: clang++ -x cuda --cuda-device-only -emit-llvm --cuda-gpu-arch=sm_52 -o device.bc I also have a library that contains the instrumentation stubs for which i generate IR similarly and i link it with the device IR
2013 Feb 08
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Yes, it helps a lot and we are working on it. A few questions, 1) What will be your use model of this library? Will you run optimization phases after linking with the library? If so, what are they? 2) Do you care if the names of functions differ from those in libm? For example, it would be gpusin() instead of sin(). 3) Do you need a different library for different host
2016 Jun 02
3
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello Bergström/Eric, Thanks for the reply. The G80(sm_10) architecture was ported on FPGA by a group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf). Our group have some further research interest on this work. I was working on modifying the Clang-LLVM for a couple of months and achieved the required changes. But Clang-LLVM is only allowing me to generate PTX for sm_20,
2011 Aug 15
0
[LLVMdev] Cuda programs on LLVM
Hi Adarsh, to my knowledge there is no publicly available CUDA-Frontend for LLVM yet. The work of Helge Rhodin you mentioned is on the backend-side: It allows to generate PTX code from LLVM IR. It is still being maintained, although I think the currently available source code is a little outdated. There is also a PTX backend in the current version of LLVM that makes use of LLVM's
2013 Jun 05
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear all, FWIW, I've tested libdevice.compute_20.10.bc and libdevice.compute_30.10.bc from /cuda/nvvm/libdevice shipped with CUDA 5.5 preview. IR is compatible with LLVM 3.4 trunk that we use. Results are correct, performance - almost the same as what we had before with cicc-sniffed IR, or maybe <10% better. Will test libdevice.compute_35.10.bc once we will get K20 support. Thanks for
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
> The issue is really that there is no standard math library for PTX. Well, formally, that could very well be true. Moreover, in some parts CPU math standard is impossible to accomplish on parallel architectures, consider, for example errno behavior. But here we are speaking more about practical side. And the practical side is: past 5 years CUDA claims to accelerate compute applications, and
2013 Feb 07
5
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin, gentlemen, I'm afraid I have to escalate this issue at this point. Since it was discussed for the first time last summer, it was sufficient for us for a while to have lowering of math calls into intrinsics disabled at DragonEgg level, and link them against CUDA math functions at LLVM IR level. Now I can say: this is not sufficient any longer, and we need NVPTX backend to deal with
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear Yuan, Sorry for delay with reply, Answers on your questions could be different, depending on the math library placement in the code generation pipeline. At KernelGen, we currently have a user-level CUDA math module, adopted from cicc internals [1]. It is intended to be linked with the user LLVM IR module, right before proceeding with the final optimization and backend. Last few months we
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
The X86 back-end just calls into libm: // Always use a library call for pow. setOperationAction(ISD::FPOW , MVT::f32 , Expand); setOperationAction(ISD::FPOW , MVT::f64 , Expand); setOperationAction(ISD::FPOW , MVT::f80 , Expand); The issue is really that there is no standard math library for PTX. I agree that this is a pain for most users, but I
2013 Jun 05
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Thanks for the info! I would be glad to hear of any issues you have encountered on this path. I tried to make sure the 3.3 release was fully compatible with the libdevice implementation shipping with 5.5 (and as far as I know, it is). It's just not an officially supported configuration. Also, I've been meaning to address your -drvcuda issue. How would you feel about making that a part
2017 Sep 27
2
OrcJIT + CUDA Prototype for Cling
Dear LLVM-Developers and Vinod Grover, we are trying to extend the cling C++ interpreter (https://github.com/root-project/cling) with CUDA functionality for Nvidia GPUs. I already developed a prototype based on OrcJIT and am seeking for feedback. I am currently a stuck with a runtime issue, on which my interpreter prototype fails to execute kernels with a CUDA runtime error. === How to use the
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
I would be very hesitant to expose all math library functions as intrinsics. I believe linking with a target-specific math library is the correct approach, as it decouples the back end from the needs of the source program/language. Users should be free to use any math library implementation they choose. Intrinsics are meant for functions that compile down to specific isa features, like fused
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin, I don't understand, why, for instance, X86 backend handles pow automatically, and NVPTX should be a PITA requiring user to bring his own pow implementation. Even at a very general level, this limits the interest of users to LLVM NVPTX backend. Could you please elaborate on the rationale behind your point? Why the accuracy modes I suggested are not sufficient, in your opinion? - D.
2017 Nov 06
2
Debug info for Cuda
06.11.2017 14:56, Robinson, Paul пишет: >> Hi everybody, >> As you know, Cuda/NVPTX target has very limited support of the debug >> info in Clang/LLVM. Currently, LLVM supports only emission of the line >> numbers debug info. >> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM >> translates the source code to LLVM IR, which is then lowered to
2011 Aug 15
2
[LLVMdev] Cuda programs on LLVM
Hello , How to execute a cuda program using llvm? More specifically, nvcc produces some temporary files during its compilation. I want to convert the .cu.cpp to .ll format and optimize it. The .cu.cpp file contains typedefs and enums used by cuda runtime and also the host part of the code and the ptx file contains the kernel definition. How can i run the program after optimization? Will Rhodin
2017 Nov 06
5
RFC: Debug info for Cuda
Hi everybody, As you know, Cuda/NVPTX target has very limited support of the debug info in Clang/LLVM. Currently, LLVM supports only emission of the line numbers debug info. This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to PTX (parallel thread execution) intermediate file. This PTX file represents special kind of
2016 Oct 14
2
LLVM/CLANG: CUDA compilation fail for inline assembly code
Hi, I am sorry for sending this query again here, but maybe I sent it to wrong list yesterday. I am trying to compile LonestarGPU-rev2.0 <http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download> benchmark suite with LLVM/CLANG. This suite has a following piece of code (more info here
2017 Nov 08
2
Debug info for Cuda
Nobody blames ptxas. I'm not saying that these are the troubles, I'm just saying that it has some features and we have some problems to be solved. But lack of labels, label arithmetics in DWARF sections is the real problem, because LLVM actively uses it in DWARF sections Best regards, Alexey Bataev 8 нояб. 2017 г., в 5:35, Madhur Amilkanthwar <madhur13490 at
2017 Nov 14
1
OrcJIT + CUDA Prototype for Cling
Hi Lang, thank You very much. I've used Your code and the creating of the object file works. I think the problem is after creating the object file. When I link the object file with ld I get an executable, which is working right. After changing the clang and llvm libraries from the package control version (.deb) to a own compiled version with debug options, I get an assert() fault. In void