Displaying 8 results from an estimated 8 matches for "gpugems3_ch39".

2010 Aug 06
4
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...d I would like to upstream it if possible. This backend is implemented by LLVM's target independent code generator framework; I think this will make it easier to maintain. I have tested this backend to translate a work-efficient parallel scan kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html ) into PTX code. The generated PTX code was then executed on real hardware, and the result is correct. So far I have to hack clang to generate bitcode for this backend, but I will try to patch clang to parse CUDA (or OpenCL) while I am upstreaming this backend. I am new to LLVM. Any commen...
2010 Aug 10
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...nerator.
>
> But I didn't study their code thoroughly, so I might be wrong about this.
I haven't had a chance to look at it yet either.
>>> I have tested this backend to translate a work-efficient parallel scan
>>> kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
>>> ) into PTX code.  The generated PTX code was then executed on real
>>> hardware, and the result is correct.
>>
>> How much of the LLVM IR does this support?  What's missing?
> Have to add some intrinsics, calling conventions, and address spaces.
> I...
2010 Aug 09
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...you do a comparison of the two? Perhaps there are holes in each that can be filled by the other. It would be a shame to have two completely different PTX backends.
> I have tested this backend to translate a work-efficient parallel scan
> kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
> ) into PTX code. The generated PTX code was then executed on real
> hardware, and the result is correct.
How much of the LLVM IR does this support? What's missing?
> So far I have to hack clang to generate bitcode for this backend, but
> I will try to patch clang to parse...
2010 Aug 10
4
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...instruction set features that are hard to exploit if not using code generator. But I didn't study their code thoroughly, so I might be wrong about this.
>> I have tested this backend to translate a work-efficient parallel scan
>> kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
>> ) into PTX code.  The generated PTX code was then executed on real
>> hardware, and the result is correct.
>
> How much of the LLVM IR does this support?  What's missing?
Have to add some intrinsics, calling conventions, and address spaces. I would say these are relati...
2010 Aug 10
3
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...y their code thoroughly, so I might be wrong about
> this.
>
> I haven't had a chance to look at it yet either.
>
> >>> I have tested this backend to translate a work-efficient parallel
> scan
> >>> kernel (
> http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
> >>> ) into PTX code.  The generated PTX code was then executed on real
> >>> hardware, and the result is correct.
> >>
> >> How much of the LLVM IR does this support?  What's missing?
> > Have to add some intrinsics, calling conventions, an...
2010 Aug 07
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...am it if possible.  This backend is implemented by LLVM's target
> independent code generator framework; I think this will make it easier
> to maintain.
>
> I have tested this backend to translate a work-efficient parallel scan
> kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
> ) into PTX code.  The generated PTX code was then executed on real
> hardware, and the result is correct.
>
> So far I have to hack clang to generate bitcode for this backend, but
> I will try to patch clang to parse CUDA (or OpenCL) while I am
> upstreaming this backend....
2010 Aug 11
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...might be wrong about
>> this.
>>
>> I haven't had a chance to look at it yet either.
>>
>> >>> I have tested this backend to translate a work-efficient parallel
>> scan
>> >>> kernel (
>> http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
>> >>> ) into PTX code.  The generated PTX code was then executed on real
>> >>> hardware, and the result is correct.
>> >>
>> >> How much of the LLVM IR does this support?  What's missing?
>> > Have to add some intrinsics, cal...
2017 Jul 11
8
[LLD] Linker Relaxation
Here's an example using the gcc toolchain for embedded 32 bit RISC-V (my HiFive1 board):
    #include <stdio.h>
    int foo(int i){
        if (i < 100){
            printf("%d\n", i);
        }
        return i;
    }
    int main(){
        foo(10);
        return 0;
    }
After compiling to a .o with -O2 -march=RV32IC we get (just looking at foo)
    00000000 <foo>:
       0: 1141    addi sp,sp,-16