search for: ptxas

Displaying 20 results from an estimated 71 matches for "ptxas".

2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
...e recorded, I don't know how important it is: clang generates linkonce_odr out of __inline__, and NVPTX generates .weak out of linkonce_odr (how it happens - a big question, btw, because I can't find anything related in NVPTX asm printer - does it chain to some other printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin. Below is the test case: > cat test3.cu __inline__ __attribute__((device)) __attribute__((used)) void test() { return; } > clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device test3.cu -o test3.ll > cat test3.ll ; ModuleI...
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
...know how important it is: > > clang generates linkonce_odr out of __inline__, and NVPTX generates .weak > out of linkonce_odr (how it happens - a big question, btw, because I can't > find anything related in NVPTX asm printer - does it chain to some other > printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin. > Below is the test case: > > > cat test3.cu > > __inline__ __attribute__((device)) __attribute__((used)) void test() > { > return; > } > > > clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device ...
2012 Jul 18
2
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
Dear NVPTX community, PTXAS fails to compile the ptx code generated by NVPTX. Is it an issue of backend or an issue of PTXAS or a known reasonable restriction? Thanks, - Dima. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target tr...
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
...d. NVPTX emits the global variables in the order as in the LLVM IR and does not sort them. It is a bug in the NVPTX backend. Thanks. Yuan From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com] Sent: Wednesday, July 18, 2012 7:44 AM To: LLVM-Dev Cc: Justin Holewinski; Yuan Lin Subject: [NVPTX] PTXAS - Unimplemented feature: labels as initial values Dear NVPTX community, PTXAS fails to compile the ptx code generated by NVPTX. Is it an issue of backend or an issue of PTXAS or a known reasonable restriction? Thanks, - Dima. > cat test.ll ; ModuleID = '__kernelgen_main_module' targe...
2016 Jun 07
3
NVPTX compilation problems - ptxas error
...a8a6bc) I have attached my source code, which I am trying to compile with clang++ (it is just for testing, so the code is not meant to do anything useful), and here is the command I am using: $ clang++ --cuda-gpu-arch=sm_20 loops.cu -o loops-clang -L/opt/cuda-7.5/lib64 -lcudart_static -ldl -lrt -pthread ptxas /tmp/loops-36dc47.s, line 5; error : Unsupported .version 4.2; current version is '3.2' ptxas fatal : Ptx assembly aborted due to errors clang-3.9: error: ptxas command failed with exit code 255 (use -v to see invocation) Compilation with nvcc works by the way. I will be glad about...
2016 Apr 07
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
...ced that an internal function '* _ZL21__nvvm_reflect_anchorv() *' is defined in both a.bc & b.bc, and when merging these two files, one of the two definitions was renamed to '*_ZL21__nvvm_reflect_anchorv.2()*', and written into c.bc. Then I did *llc c.bc -o c.s -march=nvptx ; ptxas c.s -o c.o* However, ptxas would give the following complaint: *ptxas c.s, line 171; error : Duplicate definition of function '_ZL21__nvvm_reflect_anchorv'* *ptxas c.s, line 171; fatal : Parsing error near '.2': syntax error* So I inspected c.s and found the issue above was...
2016 Apr 08
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, however, about what the purpose of nvvm_reflect_anchorv() is here, and why does the front-end always generate this function? Since the current PTX emission doesn't mangle dots, it would be a reasonable workaround for me to prevent the front-end from generating th...
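The '.' failure discussed in this thread can be checked mechanically. The following is a minimal Python sketch, assuming the identifier grammar given in NVIDIA's PTX ISA documentation (a letter followed by letters, digits, `_` or `$`; or one of `_`, `$`, `%` followed by at least one such character). The helper name `is_valid_ptx_identifier` is hypothetical and is not part of LLVM or the CUDA toolkit.

```python
import re

# Identifier grammar as assumed from the PTX ISA documentation:
#   [a-zA-Z][a-zA-Z0-9_$]*   or   [_$%][a-zA-Z0-9_$]+
_PTX_IDENT = re.compile(r"(?:[A-Za-z][A-Za-z0-9_$]*|[_$%][A-Za-z0-9_$]+)")


def is_valid_ptx_identifier(name: str) -> bool:
    """Return True if `name` matches the assumed PTX identifier grammar."""
    return _PTX_IDENT.fullmatch(name) is not None


if __name__ == "__main__":
    # The second name carries the ".2" suffix added when the two modules
    # were merged, which is what ptxas rejects in the thread above.
    for symbol in ("_ZL21__nvvm_reflect_anchorv",
                   "_ZL21__nvvm_reflect_anchorv.2"):
        print(symbol, is_valid_ptx_identifier(symbol))
```

A check like this only flags the problematic names; renaming them consistently across the module is a separate step that this sketch does not attempt.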
2016 Mar 12
2
instrumenting device code with gpucc
Hey Jingyue, Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect anchor didn't go away; ptxas is still complaining about the duplicate definition of function '_ZL21__nvvm_reflect_anchorv' . Did I misuse the nvvm-reflect pass? Thanks! yuanfeng On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote: > According to the examples you sent, I believe t...
2011 May 16
0
[LLVMdev] TargetRegisterInfo and "infinite" register files
...TX, which is more of an intermediate form than a final assembly language. The format is essentially three-address code, with "virtual" registers instead of "physical" registers. After PTX code generation, the PTX assembly is compiled to a device binary with a proprietary tool (ptxas) that does final register allocation (based on device and user constraints). However, exploiting register re-use at the LLVM/PTX level has shown performance improvement over blindly using a new "physical" register for each def and letting ptxas figure out all of the register allocation d...
2016 Mar 10
4
instrumenting device code with gpucc
...>> >> To link the modified axpy-sm_20.bc to the final binary, you need several >> extra steps: >> 1. Compile axpy-sm_20.bc to PTX assembly using llc: llc axpy-sm_20.bc -o >> axpy-sm_20.ptx -march=<nvptx or nvptx64> >> 2. Compile the PTX assembly to SASS using ptxas >> 3. Make the SASS a fat binary using NVIDIA's fatbinary tool >> 4. Link the fat binary to the host code using ld. >> >> Clang does step 2-4 by invoking subcommands. Therefore, you can use >> "clang -###" to dump all the subcommands, and then find the on...
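Steps 1 and 2 of the list above can be sketched as command lines built in Python. This is only an illustration under stated assumptions: the function name `device_compile_commands` and the `.cubin` output name are hypothetical, and the `ptxas` options `--gpu-name` and `--output-file` are the long spellings as best recalled, so compare them against what `clang -###` prints on your system. Steps 3 and 4 (fatbinary and ld) are left out because their exact arguments are not given in the snippet.

```python
import os
import shlex


def device_compile_commands(bitcode, arch="sm_20", march="nvptx64"):
    """Build (but do not run) the llc and ptxas command lines.

    bitcode: path to the device-side LLVM bitcode, e.g. "axpy-sm_20.bc"
    arch:    GPU architecture passed to both tools (assumed: sm_20)
    march:   llc target, "nvptx" or "nvptx64"
    """
    stem, _ = os.path.splitext(bitcode)
    ptx = stem + ".ptx"
    cubin = stem + ".cubin"  # hypothetical output name

    # Step 1: bitcode -> PTX assembly.
    llc_cmd = ["llc", bitcode, "-march=" + march, "-mcpu=" + arch, "-o", ptx]
    # Step 2: PTX assembly -> SASS (option spellings are an assumption).
    ptxas_cmd = ["ptxas", "--gpu-name", arch, "--output-file", cubin, ptx]
    return [llc_cmd, ptxas_cmd]


if __name__ == "__main__":
    for cmd in device_compile_commands("axpy-sm_20.bc"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Returning argument lists rather than a shell string keeps the sketch free of quoting problems if the commands are later passed to `subprocess.run`.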
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
...> > With David's http://reviews.llvm.org/rL265060, do you think > __nvvm_reflect_anchor is still necessary? > > On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng Peng via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, >> however, about what the purpose of nvvm_reflect_anchorv() is here, and why >> does the front-end always generate this function? Since the current PTX >> emission doesn't mangle dots, it would be a reasonable workaround for me to >> prev...
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...global.u32 [%SP+0], %r0; // inline asm .reg .pred %p1; .reg .pred %p2; setp.ne.u32 %p1, %r0, 0; vote.any.pred %p2, %p1; selp.s32 %r0, 1, 0, %p2; // inline asm st.global.u32 [%SP+4], %r0; mov.b32 func_retval0, %r0; ret; } > ptxas test.ptx -o test.cubin ptxas test.ptx, line 33; error : Duplicate definition of variable '%p1' ptxas test.ptx, line 34; error : Duplicate definition of variable '%p2' ptxas test.ptx, line 36; error : Instruction 'vote' requires .target sm_12 or higher ptxas fatal : P...
2017 Nov 06
5
RFC: Debug info for Cuda
...odegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to a PTX (parallel thread execution) intermediate file. This PTX file is a special kind of assembler code in text format, which contains the code itself + (possibly) debug info. Then this PTX file is compiled by the ptxas tool into the CUDA binary representation. Debug info representation in PTX file. ======================== According to PTX Writer's Guide to Interoperability, Debug information (http://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/index.html#debug-information) , debug informatio...
2011 May 16
6
[LLVMdev] TargetRegisterInfo and "infinite" register files
...TX, which is more of an intermediate form than a final assembly language. The format is essentially three-address code, with "virtual" registers instead of "physical" registers. After PTX code generation, the PTX assembly is compiled to a device binary with a proprietary tool (ptxas) that does final register allocation (based on device and user constraints). However, exploiting register re-use at the LLVM/PTX level has shown performance improvement over blindly using a new "physical" register for each def and letting ptxas figure out all of the register allocation d...
2016 Jun 30
0
NVPTX compilation problems - ptxas error
...In my case I had cuda-5.5 installed at /usr/local/cuda-5.5 with /usr/local/cuda linked to it. And I have cuda-7.5 installed elsewhere; that is the version in my $PATH and the one I was linking against during compilation. When I ran the compilation with the -v flag, I found that /usr/local/cuda/ptxas was being invoked (cuda-5.5) instead of the version of ptxas (cuda-7.5) I had in my $PATH. That is the default location that cuda is normally installed to, so clang may be defaulting to that. If you have root access, you might try installing cuda-7.5 to /usr/local/cuda. Doing that fixed it f...
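The mismatch described in this reply can be checked before compiling. Below is a minimal Python sketch that compares the `ptxas` found on `$PATH` with the one under the default CUDA root. The function names are hypothetical, and the `bin/ptxas` layout under the root is an assumption (the snippet itself writes the path as /usr/local/cuda/ptxas).

```python
import os
import shutil


def ptxas_candidates(default_root="/usr/local/cuda", path=None):
    """Return (ptxas found on PATH, ptxas under default_root).

    Each element is None when no executable is found there.
    Assumes the toolkit layout <root>/bin/ptxas.
    """
    on_path = shutil.which("ptxas", path=path)
    default = os.path.join(default_root, "bin", "ptxas")
    usable = os.path.isfile(default) and os.access(default, os.X_OK)
    return on_path, (default if usable else None)


def may_mismatch(on_path, in_default):
    """True when both exist and resolve to different files."""
    if on_path is None or in_default is None:
        return False
    return os.path.realpath(on_path) != os.path.realpath(in_default)


if __name__ == "__main__":
    found = ptxas_candidates()
    print("PATH:", found[0])
    print("default root:", found[1])
    print("possible version mismatch:", may_mismatch(*found))
```

If the two differ, Clang's `--cuda-path=<dir>` option can be used to point the driver at the intended installation instead of reinstalling CUDA into the default location.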
2016 Mar 13
2
instrumenting device code with gpucc
...hould be able to figure out which one. > > On Fri, Mar 11, 2016 at 4:56 PM, Yuanfeng Peng < > yuanfeng.jack.peng at gmail.com> wrote: > >> Hey Jingyue, >> >> Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect >> anchor didn't go away; ptxas is still complaining about the duplicate >> definition of function '_ZL21__nvvm_reflect_anchorv' . Did I misuse >> the nvvm-reflect pass? >> >> Thanks! >> yuanfeng >> >> On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com>...
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...1; > .reg .pred %p2; > setp.ne.u32 %p1, %r0, 0; > vote.any.pred %p2, %p1; > selp.s32 %r0, 1, 0, %p2; > > // inline asm > st.global.u32 [%SP+4], %r0; > mov.b32 func_retval0, %r0; > ret; > } > > > ptxas test.ptx -o test.cubin > ptxas test.ptx, line 33; error : Duplicate definition of variable '%p1' > ptxas test.ptx, line 34; error : Duplicate definition of variable '%p2' > ptxas test.ptx, line 36; error : Instruction 'vote' requires .target sm_12 or higher ...
2016 Mar 15
2
instrumenting device code with gpucc
...On Fri, Mar 11, 2016 at 4:56 PM, Yuanfeng Peng < >>> yuanfeng.jack.peng at gmail.com> wrote: >>> >>>> Hey Jingyue, >>>> >>>> Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect >>>> anchor didn't go away; ptxas is still complaining about the duplicate >>>> definition of function '_ZL21__nvvm_reflect_anchorv' . Did I misuse >>>> the nvvm-reflect pass? >>>> >>>> Thanks! >>>> yuanfeng >>>> >>>> On Fri, Mar 11, 20...
2017 Nov 08
2
Debug info for Cuda
Nobody blames ptxas. I'm not saying that these are the troubles, I'm just saying that it has some features and we have some problems to be solved. But the lack of labels and label arithmetic in DWARF sections is the real problem, because LLVM actively uses them in DWARF sections. Best regards, Alexey Bataev 8 Nov. 2...
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...pred %p1; > .reg .pred %p2; > setp.ne.u32 %p1, %r0, 0; > vote.any.pred %p2, %p1; > selp.s32 %r0, 1, 0, %p2; > > // inline asm > st.global.u32 [%SP+4], %r0; > mov.b32 func_retval0, %r0; > ret; > } > > > ptxas test.ptx -o test.cubin > ptxas test.ptx, line 33; error : Duplicate definition of variable '%p1' > ptxas test.ptx, line 34; error : Duplicate definition of variable '%p2' > ptxas test.ptx, line 36; error : Instruction 'vote' requires .target > sm_12 or high...