similar to: [LLVMdev] [NVPTX] launch_bounds support?

Displaying 8 results from an estimated 8 matches similar to: "[LLVMdev] [NVPTX] launch_bounds support?"

2013 Apr 02
0
[LLVMdev] [NVPTX] launch_bounds support?
Yes, this is supported through metadata. An example usage of these annotations is given in the test/CodeGen/NVPTX/annotations.ll unit test. I'll try to remember to add this to the NVPTX documentation I'm putting together at http://llvm.org/docs/NVPTXUsage.html. On Mon, Apr 1, 2013 at 8:06 AM, Dmitry Mikushin <dmitry at kernelgen.org> wrote: > Dear all, > > Is anybody
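[Editor's note, not part of the thread] The annotations referred to above live in the module-level "nvvm.annotations" named metadata; launch_bounds-style limits are typically expressed with keys such as "maxntidx" (max threads per block) and "minctasm" (min blocks per SM). The sketch below uses the modern LLVM C++ metadata API (the 2013-era value-based API looked different); the function name and parameters are illustrative, and the exact keys should be checked against test/CodeGen/NVPTX/annotations.ll for the LLVM version in use.

    // Sketch: attach a "maxntidx" annotation (a __launch_bounds__-style
    // thread limit) to a kernel via the "nvvm.annotations" named metadata.
    #include "llvm/IR/Constants.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/Metadata.h"
    #include "llvm/IR/Module.h"

    using namespace llvm;

    void addLaunchBoundsAnnotation(Module &M, Function &Kernel,
                                   unsigned MaxThreads) {
      LLVMContext &Ctx = M.getContext();
      Metadata *Ops[] = {
          ValueAsMetadata::get(&Kernel),   // the kernel function
          MDString::get(Ctx, "maxntidx"),  // annotation key (assumed)
          ConstantAsMetadata::get(
              ConstantInt::get(Type::getInt32Ty(Ctx), MaxThreads))};
      M.getOrInsertNamedMetadata("nvvm.annotations")
          ->addOperand(MDNode::get(Ctx, Ops));
    }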
2013 Dec 06
2
[LLVMdev] PTX generation examples?
OK, fine -- an example of MCJIT that sets up for PTX JIT would also be helpful. On Dec 6, 2013, at 12:32 PM, Eli Bendersky <eliben at google.com> wrote: > > You'll have to switch to MCJIT for this purpose. Legacy JIT doesn't emit PTX. > > Eli -- Larry Gritz lg at larrygritz.com
2013 Dec 09
0
[LLVMdev] PTX generation examples?
There is no MCJIT support for PTX at the moment (mainly because PTX does not have a binary format, and is not machine code per se). To generate PTX at run-time, you just set up a standard codegen pass manager, as you would for an off-line compiler. The output will be a string buffer that contains the PTX, which you can load into the CUDA runtime. As for determining if PTX support is compiled
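[Editor's note, not part of the thread] A rough illustration of that setup in the LLVM C++ API: create a TargetMachine for an NVPTX triple, run a legacy codegen pass manager, and point the "file" output at an in-memory stream so the result is simply a string of PTX. The emitPTX name, the triple, and the "sm_50" CPU are placeholders, and some identifiers (e.g. the CodeGenFileType enum spelling) have changed across LLVM releases, so treat this as a sketch rather than copy-paste code.

    // Sketch: assumes LLVM was built with the NVPTX backend enabled.
    #include "llvm/ADT/SmallString.h"
    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/IR/Module.h"
    #include "llvm/MC/TargetRegistry.h"
    #include "llvm/Support/TargetSelect.h"
    #include "llvm/Support/raw_ostream.h"
    #include "llvm/Target/TargetMachine.h"
    #include <memory>
    #include <string>

    std::string emitPTX(llvm::Module &M) {
      using namespace llvm;
      InitializeAllTargets();
      InitializeAllTargetMCs();
      InitializeAllAsmPrinters();

      std::string Err;
      std::string Triple = "nvptx64-nvidia-cuda";  // placeholder triple
      const Target *T = TargetRegistry::lookupTarget(Triple, Err);
      if (!T)
        return "";  // NVPTX backend not compiled into this LLVM

      std::unique_ptr<TargetMachine> TM(T->createTargetMachine(
          Triple, /*CPU=*/"sm_50", /*Features=*/"", TargetOptions(),
          std::nullopt));

      SmallString<0> PTX;
      raw_svector_ostream OS(PTX);
      legacy::PassManager PM;
      // PTX is textual, so ask the backend for "assembly" output
      // (older releases spell the enumerator CGFT_AssemblyFile).
      if (TM->addPassesToEmitFile(PM, OS, nullptr,
                                  CodeGenFileType::AssemblyFile))
        return "";  // target cannot emit assembly
      PM.run(M);
      return PTX.str().str();
    }

The returned string can then be handed to the CUDA driver API (e.g. cuModuleLoadData) for JIT compilation to SASS.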
2013 Dec 09
1
[LLVMdev] PTX generation examples?
Ah, that's helpful. I knew that I'd need to end up with PTX as text, not a true binary, but I would have figured that it would come out of MCJIT. Thanks for helping to steer me away from the wrong trail. OK, one more question: Can anybody clarify the pros and cons of generating the PTX through the standard LLVM distro, versus using the "libnvvm" that comes with the CUDA SDK?
2013 Jun 25
2
[LLVMdev] About writing a modulePass in addPreEmitPass() for NVPTX
Oops! No need for a call graph! In fact, what I want to do is to find which function is the kernel function and which functions are called by that kernel. Since OpenCL will make all functions called by kernels inline, I can use the noinline function attribute to distinguish them. Sorry for bothering you. Antony Yu
2016 Oct 14
2
LLVM/CLANG: CUDA compilation fail for inline assembly code
Hi, I am sorry for sending this query again here, but maybe I sent it to the wrong list yesterday. I am trying to compile the LonestarGPU-rev2.0 <http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download> benchmark suite with LLVM/Clang. This suite has the following piece of code (more info here
2019 Dec 16
3
Guidance on working with the NVIDIA GPU back-end
Hi all, I'm primarily a hardware person but would like to do some compiler-architecture co-design research. Are there any good references for the NVPTX backend? I'd like to change that backend to have a limited number of physical registers rather than an unlimited number of virtual ones (for more realistic modeling in a uarch simulator). Being able to do register allocation and other
2013 Jun 25
0
[LLVMdev] About writing a modulePass in addPreEmitPass() for NVPTX
You shouldn't rely on that; it's an implementation detail. Instead, you can trace back to the original Function object and check for kernel metadata. See http://llvm.org/docs/NVPTXUsage.html#marking-functions-as-kernels On Tue, Jun 25, 2013 at 11:09 AM, Antony Yu <swpenim at gmail.com> wrote: > Oops! No need for a call graph! > In fact, what I want to do is to find which function
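[Editor's note, not part of the thread] To make the suggestion concrete, here is a small sketch, written against the modern pointer-typed metadata C++ API, that walks "nvvm.annotations" and reports whether a given Function carries the "kernel" marker. The isMarkedKernel name is illustrative; the 2013-era value-based metadata API used different calls, and the NVPTX backend also has its own internal helper for this check.

    // Sketch: report whether F is marked as a kernel via the module-level
    // "nvvm.annotations" metadata that the NVPTX backend consumes.
    #include "llvm/IR/Function.h"
    #include "llvm/IR/Metadata.h"
    #include "llvm/IR/Module.h"

    using namespace llvm;

    bool isMarkedKernel(const Module &M, const Function &F) {
      const NamedMDNode *Annots = M.getNamedMetadata("nvvm.annotations");
      if (!Annots)
        return false;
      for (const MDNode *Op : Annots->operands()) {
        if (Op->getNumOperands() < 3)
          continue;
        // Each annotation is a triple: (function, key string, value).
        auto *VM = dyn_cast_or_null<ValueAsMetadata>(Op->getOperand(0).get());
        auto *Key = dyn_cast_or_null<MDString>(Op->getOperand(1).get());
        if (VM && Key && VM->getValue() == &F && Key->getString() == "kernel")
          return true;
      }
      return false;
    }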