Displaying 8 results from an estimated 8 matches for "gpugems3_ch39".
2010 Aug 06
4
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...d I would like to
upstream it if possible. This backend is implemented by LLVM's target
independent code generator framework; I think this will make it easier
to maintain.
I have tested this backend to translate a work-efficient parallel scan
kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
) into PTX code. The generated PTX code was then executed on real
hardware, and the result is correct.
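For readers unfamiliar with the kernel mentioned above, here is a minimal sequential sketch in C of the work-efficient scan from the linked GPU Gems 3 chapter: an up-sweep (reduce) phase followed by a down-sweep phase. It is not the poster's kernel or anything produced by the PTX backend. The function name exclusive_scan_pow2 is hypothetical, and the sketch assumes the array length is a power of two and runs the "parallel" inner loops one step at a time on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: in-place exclusive prefix sum.
   Assumes n is zero or a power of two. Each inner loop corresponds
   to one parallel step on a GPU, emulated sequentially here. */
void exclusive_scan_pow2(int *data, size_t n)
{
    if (n == 0) {
        return;
    }
    assert((n & (n - 1)) == 0); /* precondition: power of two */

    /* Up-sweep: build partial sums in place, as in a balanced tree. */
    for (size_t stride = 1; stride < n; stride *= 2) {
        for (size_t i = 2 * stride - 1; i < n; i += 2 * stride) {
            data[i] += data[i - stride];
        }
    }

    /* Clear the root, which currently holds the total sum. */
    data[n - 1] = 0;

    /* Down-sweep: push prefix sums back down the tree. */
    for (size_t stride = n / 2; stride >= 1; stride /= 2) {
        for (size_t i = 2 * stride - 1; i < n; i += 2 * stride) {
            int left = data[i - stride];
            data[i - stride] = data[i]; /* left child gets the parent's prefix */
            data[i] += left;            /* right child adds the left subtree sum */
        }
    }
}
```

Called on the eight-element array {3, 1, 7, 0, 4, 1, 6, 3}, this sketch leaves {0, 3, 4, 11, 11, 15, 16, 22} in place. Both phases do O(n) additions in total, which is what "work-efficient" refers to.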
So far I have to hack clang to generate bitcode for this backend, but
I will try to patch clang to parse CUDA (or OpenCL) while I am
upstreaming this backend.
I am new to LLVM. Any commen...
2010 Aug 10
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...nerator.
>
> But I didn't study their code thoroughly, so I might be wrong about this.
I haven't had a chance to look at it yet either.
>>> I have tested this backend to translate a work-efficient parallel scan
>>> kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
>>> ) into PTX code. The generated PTX code was then executed on real
>>> hardware, and the result is correct.
>>
>> How much of the LLVM IR does this support? What's missing?
> Have to add some intrinsics, calling conventions, and address spaces.
> I...
2010 Aug 09
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...you do a comparison of the two? Perhaps
there are holes in each that can be filled by the other. It would be
a shame to have two completely different PTX backends.
> I have tested this backend to translate a work-efficient parallel scan
> kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
> ) into PTX code. The generated PTX code was then executed on real
> hardware, and the result is correct.
How much of the LLVM IR does this support? What's missing?
> So far I have to hack clang to generate bitcode for this backend, but
> I will try to patch clang to parse...
2010 Aug 10
4
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...instruction set features that are hard to exploit
if not using code generator.
But I didn't study their code thoroughly, so I might be wrong about this.
>> I have tested this backend to translate a work-efficient parallel scan
>> kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
>> ) into PTX code. The generated PTX code was then executed on real
>> hardware, and the result is correct.
>
> How much of the LLVM IR does this support? What's missing?
Have to add some intrinsics, calling conventions, and address spaces.
I would say these are relati...
2010 Aug 10
3
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...y their code thoroughly, so I might be wrong about
> this.
>
> I haven't had a chance to look at it yet either.
>
> >>> I have tested this backend to translate a work-efficient parallel
> scan
> >>> kernel (
> http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
> >>> ) into PTX code. The generated PTX code was then executed on real
> >>> hardware, and the result is correct.
> >>
> >> How much of the LLVM IR does this support? What's missing?
> > Have to add some intrinsics, calling conventions, an...
2010 Aug 07
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...am it if possible. This backend is implemented by LLVM's target
> independent code generator framework; I think this will make it easier
> to maintain.
>
> I have tested this backend to translate a work-efficient parallel scan
> kernel ( http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
> ) into PTX code. The generated PTX code was then executed on real
> hardware, and the result is correct.
>
> So far I have to hack clang to generate bitcode for this backend, but
> I will try to patch clang to parse CUDA (or OpenCL) while I am
> upstreaming this backend....
2010 Aug 11
0
[LLVMdev] Upstream PTX backend that uses target independent code generator if possible
...might be wrong about
>> this.
>>
>> I haven't had a chance to look at it yet either.
>>
>> >>> I have tested this backend to translate a work-efficient parallel
>> scan
>> >>> kernel (
>> http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html
>> >>> ) into PTX code. The generated PTX code was then executed on real
>> >>> hardware, and the result is correct.
>> >>
>> >> How much of the LLVM IR does this support? What's missing?
>> > Have to add some intrinsics, cal...
2017 Jul 11
8
[LLD] Linker Relaxation
Here's an example using the gcc toolchain for embedded 32-bit RISC-V (my
HiFive1 board):
#include <stdio.h>

int foo(int i){
    if (i < 100){
        printf("%d\n", i);
    }
    return i;
}

int main(){
    foo(10);
    return 0;
}
After compiling to a .o with -O2 -march=RV32IC we get (just looking at foo):
00000000 <foo>:
0: 1141 addi sp,sp,-16