similar to: [LLVMdev] Cuda programs on LLVM

Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] Cuda programs on LLVM"

2011 Aug 15
0
[LLVMdev] Cuda programs on LLVM
Hi Adarsh, to my knowledge there is no publicly available CUDA frontend for LLVM yet. The work of Helge Rhodin you mentioned is on the backend side: it allows generating PTX code from LLVM IR. It is still being maintained, although I think the currently available source code is a little outdated. There is also a PTX backend in the current version of LLVM that makes use of LLVM's
2010 Apr 27
5
[LLVMdev] PTX target for LLVM!
Hey everybody, good news for everyone interested in the PTX backend: We decided to release the current source code under the GPL - you can find the latest tarball here: http://www.prog.uni-saarland.de/projects/anysl You will find the README in the attachment, which should hopefully answer a lot of questions concerning the implementation and the current status. If you have further questions,
2011 Aug 29
0
[LLVMdev] PTX target for LLVM!
Hi everyone, I downloaded the latest version of the LLVM PTX backend from http://www.prog.uni-saarland.de/projects/anysl and made the required changes to all the files mentioned in the README. But I get the following error when I compile it:

llvm[3]: Compiling PTXBackend.cpp for Release build
In file included from PTXBackend.h:70:0,
                 from PTXBackend.cpp:36:
PTXPasses.h: In constructor
2016 Oct 14
2
LLVM/CLANG: CUDA compilation fail for inline assembly code
Hi, I am sorry for sending this query here again, but maybe I sent it to the wrong list yesterday. I am trying to compile the LonestarGPU-rev2.0 <http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download> benchmark suite with LLVM/CLANG. This suite has the following piece of code (more info here
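For readers unfamiliar with the construct at issue, here is a minimal sketch of GCC-style inline PTX assembly in a CUDA kernel. The kernel and the choice of the %laneid special register are illustrative assumptions, not code taken from LonestarGPU:

#include <cstdio>

// Read the PTX special register %laneid via inline assembly.
// "%%" escapes to a literal "%" in the emitted PTX.
__global__ void laneid_kernel(unsigned *out) {
    unsigned lane;
    asm volatile("mov.u32 %0, %%laneid;" : "=r"(lane));
    out[threadIdx.x] = lane;
}

int main() {
    unsigned *d_out, h_out[32];
    cudaMalloc(&d_out, 32 * sizeof(unsigned));
    laneid_kernel<<<1, 32>>>(d_out);
    cudaMemcpy(h_out, d_out, 32 * sizeof(unsigned), cudaMemcpyDeviceToHost);
    printf("lane of thread 5: %u\n", h_out[5]);  // expect 5
    cudaFree(d_out);
    return 0;
}

Constraint handling such as "=r" is where clang and nvcc have historically diverged, so a failure on code of this shape is plausible even when nvcc accepts it.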
2012 Nov 08
3
[LLVMdev] translating from OpenMP to CUDA
Hi, Is it possible to translate an OpenMP program to CUDA using LLVM? I read that dragonegg has an OpenMP front-end and LLVM has a PTX back-end. I don't know how mature these tools are. Please let me know. Thanks. -Apala Postdoctoral Scholar Department of Computer Science, University of Chicago Computation Institute, Argonne National Laboratory http://sites.google.com/site/apalaguha/home/
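Independent of tooling, such a translation maps each OpenMP parallel loop onto a CUDA kernel whose threads cover the iteration space. A hand-written sketch of that correspondence (illustrative only; no tool mentioned in this thread emits exactly this):

// OpenMP source loop:
//   #pragma omp parallel for
//   for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];
//
// CUDA equivalent: one thread per iteration, guarded against overshoot.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}
// Host launch: saxpy<<<(n + 255) / 256, 256>>>(n, a, d_x, d_y);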
2015 Aug 21
2
[CUDA/NVPTX] is inlining __syncthreads allowed?
I'm using 7.0. I am attaching the reduced example. nvcc sync.cu -arch=sm_35 -ptx gives

// .globl _Z3foov
.visible .entry _Z3foov(
)
{
        .reg .pred      %p<2>;
        .reg .s32       %r<3>;

        mov.u32         %r1, %tid.x;
        and.b32         %r2, %r1, 1;
        setp.eq.b32     %p1, %r2, 1;
        @!%p1 bra       BB7_2;
        bra.uni
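The attached sync.cu is not reproduced above, but the PTX (a branch on %tid.x & 1) suggests a shape like the following hedged reconstruction, where inlining places the barrier under divergent control flow:

__device__ void bar() {
    __syncthreads();           // safe when reached by all threads of the block
}

__global__ void foo() {
    if (threadIdx.x & 1)       // after inlining bar(), the barrier ends up
        bar();                 // inside a branch only odd threads take
}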
2012 Nov 09
0
[LLVMdev] translating from OpenMP to CUDA
The PTX back-end is robust (it's based on the sources used by nvcc), but I'm not sure about the OpenMP representation in LLVM IR. I believe the OpenMP constructs are already lowered into libgomp calls before leaving DragonEgg. It's been a while since I've looked at it, though. If you use the PTX back-end and have any issues, please don't hesitate to post to the list and cc:
2016 Jun 02
3
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello Bergström/Eric, Thanks for the reply. The G80 (sm_10) architecture was ported to an FPGA by a group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf). Our group has some further research interest in this work. I have been working on modifying Clang-LLVM for a couple of months and have made the required changes. But Clang-LLVM only allows me to generate PTX for sm_20,
2016 Oct 27
3
problem on compiling cuda program with clang++
Hi all, I compiled the *llvm3.9* source code on the *Nvidia TX1* board, and now I am following the document in docs/CompileCudaWithLLVM.rst to compile a CUDA program with clang++. However, when I compile `axpy.cu` using `nvcc`, *nvcc* generates the correct binary, while compiling `axpy.cu` using clang++ (the detailed command is `clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_53
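The axpy.cu referenced is the example from the clang CUDA document; a minimal sketch along those lines (details such as the exact kernel body may differ from the file in the docs):

#include <cstdio>

// y = a*x + y, one thread per element, single block.
__global__ void axpy(float a, float *x, float *y) {
    y[threadIdx.x] = a * x[threadIdx.x] + y[threadIdx.x];
}

int main() {
    const int n = 4;
    float hx[n] = {1, 2, 3, 4}, hy[n] = {10, 20, 30, 40};
    float *dx, *dy;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dy, n * sizeof(float));
    cudaMemcpy(dx, hx, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, n * sizeof(float), cudaMemcpyHostToDevice);
    axpy<<<1, n>>>(2.0f, dx, dy);
    cudaMemcpy(hy, dy, n * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i)
        printf("%g ", hy[i]);   // expect 12 24 36 48
    printf("\n");
    cudaFree(dx);
    cudaFree(dy);
    return 0;
}

With clang, linking typically also needs the CUDA runtime, e.g. -L<cuda>/lib64 -lcudart; that is an assumption about this particular setup, since the quoted command is cut off before any linker flags.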
2010 Oct 07
1
[LLVMdev] Status of PTX Backend
Hi, The PTX backend we developed (a CBackend-style approach; it does not use the target-independent code generator) is already more advanced. An older version is published here: http://sourceforge.net/projects/llvmptxbackend/ We recently eliminated a bug which increased the number of required registers per thread. Surprisingly, without that bug the generated code is already comparable to code generated
2010 Aug 10
1
[LLVMdev] PTX backend, BSD license
On Tue, 10 Aug 2010 14:21:43 -0500 "Villmow, Micah" <Micah.Villmow at amd.com> wrote:

> > -----Original Message-----
> > From: llvmdev-bounces at cs.uiuc.edu
> > [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of David A. Greene
> > Sent: Tuesday, August 10, 2010 12:05 PM
> > To: Helge Rhodin
> > Cc: llvmdev at cs.uiuc.edu
> >
2017 Sep 27
2
OrcJIT + CUDA Prototype for Cling
Dear LLVM-Developers and Vinod Grover, we are trying to extend the cling C++ interpreter (https://github.com/root-project/cling) with CUDA functionality for Nvidia GPUs. I have already developed a prototype based on OrcJIT and am seeking feedback. I am currently stuck on a runtime issue: my interpreter prototype fails to execute kernels with a CUDA runtime error. === How to use the
2010 Aug 10
0
[LLVMdev] PTX backend, BSD license
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of David A. Greene
> Sent: Tuesday, August 10, 2010 12:05 PM
> To: Helge Rhodin
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] PTX backend, BSD license
>
> Helge Rhodin <helge.rhodin at alice-dsl.net> writes:
>
> >> But I
2010 Aug 10
4
[LLVMdev] PTX backend, BSD license
Helge Rhodin <helge.rhodin at alice-dsl.net> writes:

>> But I didn't study their code thoroughly, so I might be wrong about this.
>
> Yes, we don't use the target-independent code generator and the
> backend is based on the CBackend. We decided to not use the code
> generator because PTX code is also an intermediate language. The
> graphics driver
2013 Jul 18
2
question about Makeconf and nvcc/CUDA
Dear R development: I'm not sure if this is the appropriate list, but it's a start. I would like to put together a package which contains a CUDA program on Windows 7. I believe the key is the Makeconf file in the etc directory. When I just build with nvcc and the shared option, the dyn.load command works, but the is.loaded function then returns FALSE.
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did:

1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu;
2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated
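One quick way to check where the debug info disappears (a suggested diagnostic, not from the original thread) is to look for debug metadata in the device bitcode before instrumenting it:

// a.cu -- trivial kernel used to inspect the pipeline:
//   clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu    (step 1, as quoted)
//   llvm-dis a-cuda-nvptx64-nvidia-cuda-sm_35.bc           (then grep for !DILocation)
// If no !DILocation entries appear, the -g information was already lost
// before step 2's instrumentation ran.
__global__ void a_kernel(int *p) {
    p[threadIdx.x] = threadIdx.x;   // a line to break on in cuda-gdb
}

int main() {
    int *d;
    cudaMalloc(&d, 32 * sizeof(int));
    a_kernel<<<1, 32>>>(d);
    cudaDeviceSynchronize();
    cudaFree(d);
    return 0;
}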
2017 Nov 14
1
OrcJIT + CUDA Prototype for Cling
Hi Lang, thank you very much. I've used your code, and the creation of the object file works. I think the problem is after creating the object file: when I link the object file with ld, I get an executable which works correctly. After changing the clang and llvm libraries from the package-manager version (.deb) to a self-compiled version with debug options, I get an assert() failure. In void
2010 Mar 27
2
[LLVMdev] PTX target for LLVM?
Hi, I am interested to know: are there any LLVM targets in the works for Nvidia's PTX ISA? Also, if anyone knows about Ocelot (a project done by some students at my school): it does the opposite of what I am trying to do (it translates PTX to LLVM IR to run CUDA kernels on the CPU). Thanks in advance. -Puyan
2011 Sep 03
2
[LLVMdev] PTX optimizations
Hi everyone, I am trying to add some optimizations to LLVM's PTX backend, but I am unaware of the existing optimizations. Can you please point me to them? Thank you :)
2018 Jun 21
2
NVPTX - Reordering load instructions
Hi all, I'm looking into the performance difference of a benchmark compiled with NVCC vs NVPTX (coming from Julia, not CUDA C) and I'm seeing a significant difference due to PTX instruction ordering. The relevant source code consists of two nested loops that get fully unrolled, doing some basic arithmetic with values loaded from shared memory:

> #define BLOCK_SIZE 16
>
>
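The Julia-derived kernel itself is cut off above; as a stand-in, here is a hedged CUDA sketch of the pattern described: BLOCK_SIZE x BLOCK_SIZE shared-memory tiles with a fully unrolled inner loop, where the placement of ld.shared relative to the arithmetic is exactly what an instruction reorder would change:

#define BLOCK_SIZE 16

// Tiled multiply-accumulate over shared memory; assumes n is a multiple of
// BLOCK_SIZE. The unrolled inner loop is where PTX load ordering (batched
// ld.shared vs. loads interleaved with FMAs) shows up in performance.
__global__ void tile_mul(const float *A, const float *B, float *C, int n) {
    __shared__ float As[BLOCK_SIZE][BLOCK_SIZE];
    __shared__ float Bs[BLOCK_SIZE][BLOCK_SIZE];
    int row = blockIdx.y * BLOCK_SIZE + threadIdx.y;
    int col = blockIdx.x * BLOCK_SIZE + threadIdx.x;
    float acc = 0.0f;
    for (int t = 0; t < n / BLOCK_SIZE; ++t) {
        // Stage one tile of each operand in shared memory.
        As[threadIdx.y][threadIdx.x] = A[row * n + t * BLOCK_SIZE + threadIdx.x];
        Bs[threadIdx.y][threadIdx.x] = B[(t * BLOCK_SIZE + threadIdx.y) * n + col];
        __syncthreads();
        #pragma unroll
        for (int k = 0; k < BLOCK_SIZE; ++k)   // fully unrolled by the compiler
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();
    }
    C[row * n + col] = acc;
}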