thr3ads.net - similar to: "Automatic GPU Code Generation"

Displaying 20 results from an estimated 8000 matches similar to: "Automatic GPU Code Generation"

2018 Dec 11

Automatic GPU Code Generation

Thank you. I am asking about generating PTX code directly, either automatically or through directives, without involving CUDA. That is, I want to avoid the source-to-source compiler approach, where C code is converted automatically into CUDA; instead, I mean converting C code directly to PTX assembly. On Tue, Dec 11, 2018 at 12:19 PM Madhur Amilkanthwar <madhur13490 at gmail.com> wrote:

2012 Apr 02

[LLVMdev] GSoC 2012 Proposal: Automatic GPGPU code generation for llvm

Hi all, I am a PhD student from Huazhong University of Sci&Tech, China. The following is my GSoC 2012 proposal. Comments are welcome! *Title: Automatic GPGPU Code Generation for LLVM* *Abstract* Very often, manually developing a GPGPU application is a time-consuming, complex, error-prone and iterative process. In this project, I propose to build an automatic GPGPU code generation framework

2012 Apr 04

[LLVMdev] GSoC 2012 Proposal: Automatic GPGPU code generation for llvm

On 04/03/2012 03:13 PM, Hongbin Zheng wrote: > Hi Yabin, > > Instead of compiling the LLVM IR to a PTX asm string in a ScopPass, you > can also improve llc/lli or create new tools to support code > generation for heterogeneous platforms[1], i.e. generate code for more > than one target architecture at the same time. Something like this is > not very complicated and had

2012 Apr 03

[LLVMdev] GSoC 2012 Proposal: Automatic GPGPU code generation for llvm

Hi Justin, 2012/4/3 Justin Holewinski <justin.holewinski at gmail.com> > *Motivation* >> With the broad proliferation of GPU computing, it is very important to >> provide an easy and automatic tool to develop or port the applications to >> GPU for normal developers, especially for those domain experts who want to >> harness the huge computing power of GPU. Polly

2012 Apr 04

[LLVMdev] GSoC 2012 Proposal: Automatic GPGPU code generation for llvm

On Wed, Apr 4, 2012 at 4:49 AM, Tobias Grosser <tobias at grosser.es> wrote: > On 04/03/2012 03:13 PM, Hongbin Zheng wrote: > > Hi Yabin, > > > > Instead of compiling the LLVM IR to a PTX asm string in a ScopPass, you > > can also improve llc/lli or create new tools to support code > > generation for heterogeneous platforms[1], i.e. generate code for

2012 Apr 03

[LLVMdev] GSoC 2012 Proposal: Automatic GPGPU code generation for llvm

Hi Justin, the non-translatable IR with GPU code replaced by appropriate CUDA Driver > API calls. One of the CUDA driver APIs (cuLaunch) needs a PTX asm string as its input. So if I want to provide a one-touch solution and not introduce any changes to tools outside Polly, I must prepare the PTX string before I can generate the correct non-translatable IR part. As you suggest, it may

2016 Jun 02

PTX generation from CUDA file for compute capability 1.0 (sm_10)

Hello Bergström/Eric, Thanks for the reply. The G80 (sm_10) architecture was ported to an FPGA by a group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf). Our group has some further research interest in this work. I worked on modifying Clang-LLVM for a couple of months and achieved the required changes. But Clang-LLVM only allows me to generate PTX for sm_20,

2016 Jun 02

PTX generation from CUDA file for compute capability 1.0 (sm_10)

Hello, When generating PTX output from a CUDA file (.cu file), the minimum target accepted by LLVM is sm_20. But I have a specific requirement to generate PTX output for compute capability 1.0 (sm_10). Is there any previous version of LLVM that supports this? Thank you, Ginu

2012 Apr 03

[LLVMdev] GSoC 2012 Proposal: Automatic GPGPU code generation for llvm

On Mon, Apr 2, 2012 at 7:16 AM, Yabin Hu <yabin.hwu at gmail.com> wrote: > Hi all, > > I am a PhD student from Huazhong University of Sci&Tech, China. The > following is my GSoC 2012 proposal. > Comments are welcome! > > *Title: Automatic GPGPU Code Generation for LLVM* > > *Abstract* > Very often, manually developing a GPGPU application is a

2013 Dec 06

[LLVMdev] PTX generation examples?

OK, fine -- an example of MCJIT that sets up for PTX JIT would also be helpful. On Dec 6, 2013, at 12:32 PM, Eli Bendersky <eliben at google.com> wrote: > > You'll have to switch to MCJIT for this purpose. Legacy JIT doesn't emit PTX. > > Eli -- Larry Gritz lg at larrygritz.com

2013 Dec 09

[LLVMdev] PTX generation examples?

Ah, that's helpful. I knew that I'd need to end up with PTX as text, not a true binary, but I would have figured that it would come out of MCJIT. Thanks for helping to steer me away from the wrong trail. OK, one more question: Can anybody clarify the pros and cons of generating the PTX through the standard LLVM distro, versus using the "libnvvm" that comes with the Cuda SDK?

2020 Sep 24

cuda __shfl_sync problem

Hi, First of all, I'm not sure if I should be posting this here or in cfe-dev, but here it goes. In order to instrument CUDA kernels, I first generate device IR with: clang++ -x cuda --cuda-device-only -emit-llvm --cuda-gpu-arch=sm_52 -o device.bc I also have a library that contains the instrumentation stubs, for which I generate IR similarly, and I link it with the device IR
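The device-IR step quoted in this post can be scripted. Below is a minimal Python sketch that only assembles the argument list; the helper name `device_ir_command` and the input file `kernel.cu` are hypothetical, and the `-c` flag and the explicit source argument are assumptions added here, since the command shown in the post names no input file.

```python
from typing import List


def device_ir_command(source: str, output: str = "device.bc",
                      gpu_arch: str = "sm_52") -> List[str]:
    """Build a clang++ argument list that emits device-side LLVM bitcode."""
    return [
        "clang++",
        "-x", "cuda",                    # treat the input as CUDA source
        "--cuda-device-only",            # compile only the device side
        "-emit-llvm",                    # produce LLVM IR rather than PTX
        "-c",                            # assumption: not in the quoted command
        f"--cuda-gpu-arch={gpu_arch}",   # sm_52 in the post
        source,                          # assumption: hypothetical input file
        "-o", output,
    ]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run to execute it.
    print(" ".join(device_ir_command("kernel.cu")))
```

Keeping the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.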

2020 Sep 25

cuda __shfl_sync problem

Do you mean in llc? Because I don't see such an option, I'm afraid. ~George On 24-09-2020 20:54, Johannes Doerfert wrote: > Not that I am an expert but it looks like it defaults to the minimal > PTX version that supports the compute capability. You might be able to > choose PTX 6.0 though. > > ~ Johannes > > > On 9/24/20 1:02 PM, George K via llvm-dev wrote:
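One possible way to request a specific PTX version from llc is through a subtarget feature rather than a dedicated option. The Python sketch below assembles such a command under the assumption that the installed NVPTX backend exposes the version as a feature named like `ptx60`; `llc -march=nvptx64 -mattr=help` should list what a given build supports. The helper name `llc_ptx_command` and the file names are hypothetical.

```python
from typing import List, Optional


def llc_ptx_command(ir_file: str, output: str, gpu_arch: str = "sm_52",
                    ptx_version: Optional[str] = None) -> List[str]:
    """Build an llc argument list that lowers LLVM IR to PTX text."""
    cmd = ["llc", "-march=nvptx64", f"-mcpu={gpu_arch}"]
    if ptx_version is not None:
        # Assumption: "6.0" maps to the subtarget feature "+ptx60".
        feature = "+ptx" + ptx_version.replace(".", "")
        cmd.append(f"-mattr={feature}")
    cmd += [ir_file, "-o", output]
    return cmd


if __name__ == "__main__":
    print(" ".join(llc_ptx_command("device.bc", "device.ptx",
                                   ptx_version="6.0")))
```

Leaving `ptx_version` unset keeps whatever default llc picks for the chosen compute capability, which matches the behaviour described in the quoted reply.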

2013 Dec 09

[LLVMdev] PTX generation examples?

There is no MCJIT support for PTX at the moment (mainly because PTX does not have a binary format, and is not machine code per se). To generate PTX at run time, you just set up a standard codegen pass manager as you would for an offline compiler. The output will be a string buffer that contains the PTX, which you can load into the CUDA runtime. As for determining if PTX support is compiled

2013 Oct 09

[LLVMdev] Backend vs JIT : GPU

Hi guys, I am trying to understand the OpenCL compilation flow on GPUs in order to develop an OpenCL runtime for new hardware. I understand that the OpenCL compiler is part of a vendor's runtime library, which is the heart of OpenCL. Since an OpenCL kernel is compiled at run time, at a high level its compilation takes place in two steps: i. source code is first converted to intermediate code. ii.

2017 Nov 06

RFC: Debug info for Cuda

Hi everybody, As you know, the CUDA/NVPTX target has very limited debug info support in Clang/LLVM. Currently, LLVM supports emitting only line-number debug info. This is caused by limitations of the CUDA/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to a PTX (parallel thread execution) intermediate file. This PTX file represents a special kind of

2012 Jun 12

[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it

Dear LLVM NVPTX maintainers, Just to have the issue recorded, I don't know how important it is: clang generates linkonce_odr out of __inline__, and NVPTX generates .weak out of linkonce_odr (how it happens - a big question, btw, because I can't find anything related in NVPTX asm printer - does it chain to some other printer?), and finally ptxas (both 4.2 and 5) fails to compile it to

2010 Aug 11

[LLVMdev] Upstream PTX backend that uses target independent code generator if possible

Che-Liang Chiou <clchiou at gmail.com> writes: > My implementation of predicated instructions is similar to the ARM > backend. I traced the ARM and PowerPC backends for reference. Cool. > If, David, you were talking about an implementation of predication in LLVM IR, > I didn't do that. It was partly because I was not (and am still not) > very familiar with LLVM's design; so I

2011 Aug 15

[LLVMdev] Cuda programs on LLVM

Hello, How can I execute a CUDA program using LLVM? More specifically, nvcc produces some temporary files during its compilation. I want to convert the .cu.cpp file to .ll format and optimize it. The .cu.cpp file contains typedefs and enums used by the CUDA runtime and also the host part of the code, and the PTX file contains the kernel definition. How can I run the program after optimization? Will Rhodin

2010 Aug 19

[LLVMdev] Upstream PTX backend that uses target independent code generator if possible

Hi there, Thanks to Nick for kindly reviewing the patch. Here is the link to the source code of the PTX backend; it should help Nick review the patch. http://lime.csie.ntu.edu.tw/~clchiou/llvm-ptx-backend.tar.gz The source code at the above link is a working prototype, so it will not be upstreamed as is; I will refactor it and add unimplemented features while upstreaming it. That said, the source code
