Displaying 20 results from an estimated 7000 matches similar to: "[LLVMdev] How to use llvm as the backend for cuda?"
2020 Sep 25
2
cuda __shfl_sync problem
Do you mean in llc? Because I don't see such an option, I'm afraid.
~George
On 24-09-2020 20:54, Johannes Doerfert wrote:
> Not that I am an expert but it looks like it defaults to the minimal
> PTX version that supports the compute capability. You might be able to
> choose PTX 6.0 though.
>
> ~ Johannes
>
>
> On 9/24/20 1:02 PM, George K via llvm-dev wrote:
2020 Sep 24
2
cuda __shfl_sync problem
Hi,
First of all, I'm not sure if I should be posting this here or in
cfe-dev, but here it goes.
In order to instrument CUDA kernels, I first generate device IR with:
clang++ -x cuda --cuda-device-only -emit-llvm --cuda-gpu-arch=sm_52 -o device.bc
I also have a library that contains the instrumentation stubs, for which
I generate IR similarly, and I link it with the device IR
2013 Feb 08
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Yes, it helps a lot and we are working on it.
A few questions,
1) What will be your use model of this library? Will you run optimization phases after linking with the library? If so, what are they?
2) Do you care if the names of functions differ from those in libm? For example, it would be gpusin() instead of sin().
3) Do you need a different library for different host
2016 Jun 02
3
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello Bergström/Eric,
Thanks for the reply. The G80 (sm_10) architecture was ported to an FPGA by a
group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf).
Our group has some further research interest in this work. I worked
on modifying Clang-LLVM for a couple of months and made the
required changes. But Clang-LLVM only allows me to generate PTX for
sm_20,
2011 Aug 15
0
[LLVMdev] Cuda programs on LLVM
Hi Adarsh,
to my knowledge, there is no publicly available CUDA frontend for LLVM yet.
The work of Helge Rhodin you mentioned is on the backend side: it allows
generating PTX code from LLVM IR. It is still being maintained,
although I think the currently available source code is a little outdated.
There is also a PTX backend in the current version of LLVM that makes
use of LLVM's
2013 Jun 05
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear all,
FWIW, I've tested libdevice.compute_20.10.bc and libdevice.compute_30.10.bc
from /cuda/nvvm/libdevice shipped with CUDA 5.5 preview. IR is compatible
with LLVM 3.4 trunk that we use. Results are correct, performance - almost
the same as what we had before with cicc-sniffed IR, or maybe <10% better.
Will test libdevice.compute_35.10.bc once we get K20 support.
Thanks for
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
> The issue is really that there is no standard math library for PTX.
Well, formally, that could very well be true. Moreover, parts of the CPU
math standard are impossible to implement on parallel architectures;
consider, for example, errno behavior. But here we are speaking more about
the practical side. And the practical side is: for the past 5 years CUDA has
claimed to accelerate compute applications, and
2013 Feb 07
5
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin, gentlemen,
I'm afraid I have to escalate this issue at this point. Since it was
discussed for the first time last summer, it was sufficient for us for a
while to have lowering of math calls into intrinsics disabled at the DragonEgg
level, and to link them against CUDA math functions at the LLVM IR level. Now I
can say: this is no longer sufficient, and we need the NVPTX backend to
deal with
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear Yuan,
Sorry for the delay in replying.
The answers to your questions may differ, depending on where the math library
is placed in the code generation pipeline. At KernelGen, we currently have
a user-level CUDA math module, adopted from cicc internals [1]. It is
intended to be linked with the user LLVM IR module, right before proceeding
with the final optimization and backend. For the last few months we
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
The X86 back-end just calls into libm:
// Always use a library call for pow.
setOperationAction(ISD::FPOW , MVT::f32 , Expand);
setOperationAction(ISD::FPOW , MVT::f64 , Expand);
setOperationAction(ISD::FPOW , MVT::f80 , Expand);
The issue is really that there is no standard math library for PTX. I
agree that this is a pain for most users, but I
2013 Jun 05
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Thanks for the info! I would be glad to hear of any issues you have
encountered on this path. I tried to make sure the 3.3 release was fully
compatible with the libdevice implementation shipping with 5.5 (and as far
as I know, it is). It's just not an officially supported configuration.
Also, I've been meaning to address your -drvcuda issue. How would you feel
about making that a part
2017 Sep 27
2
OrcJIT + CUDA Prototype for Cling
Dear LLVM-Developers and Vinod Grover,
we are trying to extend the cling C++ interpreter
(https://github.com/root-project/cling) with CUDA functionality for
Nvidia GPUs.
I have already developed a prototype based on OrcJIT and am seeking
feedback. I am currently stuck on a runtime issue in which my
interpreter prototype fails to execute kernels with a CUDA runtime error.
=== How to use the
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
I would be very hesitant to expose all math library functions as
intrinsics. I believe linking with a target-specific math library is the
correct approach, as it decouples the back end from the needs of the source
program/language. Users should be free to use any math library
implementation they choose. Intrinsics are meant for functions that
compile down to specific ISA features, like fused
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin,
I don't understand why, for instance, the X86 backend handles pow
automatically, while NVPTX should be a PITA requiring the user to bring their own
pow implementation. Even at a very general level, this limits users' interest
in the LLVM NVPTX backend. Could you please elaborate on the rationale
behind your point? Why are the accuracy modes I suggested not sufficient,
in your opinion?
- D.
2017 Nov 06
2
Debug info for Cuda
06.11.2017 14:56, Robinson, Paul wrote:
>> Hi everybody,
>> As you know, the CUDA/NVPTX target has very limited support for debug
>> info in Clang/LLVM. Currently, LLVM supports only emission of
>> line-number debug info.
>> This is caused by limitations of the CUDA/NVPTX codegen. Clang/LLVM
>> translates the source code to LLVM IR, which is then lowered to
2011 Aug 15
2
[LLVMdev] Cuda programs on LLVM
Hello,
How can I execute a CUDA program using LLVM?
More specifically, nvcc produces some temporary files during its
compilation. I want to convert the .cu.cpp to .ll format and optimize it.
The .cu.cpp file contains typedefs and enums used by the CUDA runtime and also
the host part of the code,
and the ptx file contains the kernel definition. How can I run the program
after optimization? Will Rhodin
2017 Nov 06
5
RFC: Debug info for Cuda
Hi everybody,
As you know, the CUDA/NVPTX target has very limited support for debug info in Clang/LLVM. Currently, LLVM supports only emission of line-number debug info.
This is caused by limitations of the CUDA/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to a PTX (Parallel Thread Execution) intermediate file. This PTX file represents a special kind of
2016 Oct 14
2
LLVM/CLANG: CUDA compilation fail for inline assembly code
Hi,
I am sorry for sending this query again here, but maybe I sent it to the wrong
list yesterday.
I am trying to compile the LonestarGPU-rev2.0
<http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download>
benchmark suite with LLVM/CLANG.
This suite has the following piece of code (more info here
2017 Nov 08
2
Debug info for Cuda
Nobody blames ptxas. I'm not saying that these are the troubles, I'm just saying that it has some features and we have some problems to be solved.
But the lack of labels and label arithmetic in DWARF sections is the real problem, because LLVM actively uses them in DWARF sections
Best regards,
Alexey Bataev
On 8 Nov 2017, at 5:35, Madhur Amilkanthwar <madhur13490 at
2017 Nov 14
1
OrcJIT + CUDA Prototype for Cling
Hi Lang,
thank you very much. I've used your code, and creating the object
file works. I think the problem occurs after creating the object file. When
I link the object file with ld, I get an executable that works correctly.
After changing the clang and llvm libraries from the packaged
version (.deb) to a self-compiled version with debug options, I get an
assert() failure.
In
void