Dmitry N. Mikushin
2012-Jun-12 22:11 UTC
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
Dear LLVM NVPTX maintainers,

Just to have the issue recorded, I don't know how important it is:

clang generates linkonce_odr out of __inline__, and NVPTX generates .weak out of linkonce_odr (how it happens - a big question, btw, because I can't find anything related in NVPTX asm printer - does it chain to some other printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin. Below is the test case:

> cat test3.cu
__inline__ __attribute__((device)) __attribute__((used)) void test()
{
    return;
}

> clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device test3.cu -o test3.ll
> cat test3.ll
; ModuleID = 'test3.cu'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"

@llvm.used = appending global [1 x i8*] [i8* bitcast (void ()* @_Z4testv to i8*)], section "llvm.metadata"

define linkonce_odr ptx_device void @_Z4testv() nounwind inlinehint {
entry:
  ret void
}

> llc -march=nvptx64 -mcpu=sm_20 test3.ll -o test3.ptx
> cat test3.ptx
//
// Generated by LLVM NVPTX Back-End
//

.version 3.0
.target sm_20, texmode_independent
.address_size 64


.weak _Z4testv
.func _Z4testv(

) // @_Z4testv
{
    .reg .pred %p<396>;
    .reg .s16 %rc<396>;
    .reg .s16 %rs<396>;
    .reg .s32 %r<396>;
    .reg .s64 %rl<396>;
    .reg .f32 %f<396>;
    .reg .f64 %fl<396>;

// BB#0: // %entry
    ret;
}

1) ptxas @ CUDA 4.2:

> ptxas -arch=sm_20 -m64 test3.ptx -o -
ptxas test3.ptx, line 10; fatal : Parsing error near '.weak': syntax error
ptxas fatal : Ptx assembly aborted due to errors

2) ptxas @ CUDA 5:

> ~/cuda/bin/ptxas -arch=sm_20 -m64 test3.ptx -o -
ptxas test3.ptx, line 10; error : Feature '.weak directive' requires PTX ISA .version 3.1 or later
ptxas test3.ptx, line 10; fatal : Parsing error near '_Z4testv': syntax error
ptxas fatal : Ptx assembly aborted due to errors

3) ptxas @ CUDA 5, changed .version to 3.1: still error, because according to 3.1 PTX spec, .weak must be followed by .func:

> ~/cuda/bin/ptxas -arch=sm_20 -m64 test3.ptx -o -
ptxas test3.ptx, line 10; fatal : Parsing error near '_Z4testv': syntax error
ptxas fatal : Ptx assembly aborted due to errors

Best,
- Dima.
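Editor's sketch: the third attempt in the message above fails because .weak appears as a standalone directive instead of as a prefix of the .func declaration. One possible stopgap, until the back-end emits the right form, is to post-process the generated PTX text. The helper below is hypothetical (fuseWeakWithFunc is not part of LLVM or the CUDA toolkit). It assumes the exact two-line layout shown in test3.ptx, and its output has not been checked against ptxas here.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: rewrites the two-line form
//     .weak NAME
//     .func NAME(
// into the single declaration
//     .weak .func NAME(
// and raises ".version 3.0" to ".version 3.1", the version the CUDA 5
// ptxas message asks for. All other lines are passed through unchanged.
std::string fuseWeakWithFunc(const std::string &ptx) {
  std::istringstream in(ptx);
  std::vector<std::string> lines;
  for (std::string line; std::getline(in, line);)
    lines.push_back(line);

  const std::string weak = ".weak ";
  const std::string func = ".func ";
  std::string out;
  for (std::size_t i = 0; i < lines.size(); ++i) {
    const std::string &cur = lines[i];
    if (cur == ".version 3.0") {
      out += ".version 3.1\n";
      continue;
    }
    if (cur.compare(0, weak.size(), weak) == 0 && i + 1 < lines.size()) {
      const std::string name = cur.substr(weak.size());
      const std::string &next = lines[i + 1];
      const std::size_t end = func.size() + name.size();
      // The next line must declare exactly NAME, not a longer identifier
      // that merely starts with NAME.
      const bool sameSymbol =
          !name.empty() && next.compare(0, end, func + name) == 0 &&
          (next.size() == end ||
           !(std::isalnum(static_cast<unsigned char>(next[end])) ||
             next[end] == '_' || next[end] == '$'));
      if (sameSymbol) {
        out += weak + next + "\n";
        ++i; // the .func line has been consumed
        continue;
      }
    }
    out += cur + "\n";
  }
  return out;
}
```

Applied to the listing above, this would turn lines 10 and 11 of test3.ptx into a single ".weak .func _Z4testv(" line. Input that already uses the fused form is left as it is, so running the helper twice is harmless.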
Justin Holewinski
2012-Jun-13 18:24 UTC
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
On Tue, Jun 12, 2012 at 6:11 PM, Dmitry N. Mikushin <maemarcus at gmail.com> wrote:

> clang generates linkonce_odr out of __inline__, and NVPTX generates .weak
> out of linkonce_odr (how it happens - a big question, btw, because I can't
> find anything related in NVPTX asm printer - does it chain to some other
> printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin.
>
> [...]
>
> 3) ptxas @ CUDA 5, changed .version to 3.1: still error, because according
> to 3.1 PTX spec, .weak must be followed by .func:

Thanks for the report. Unfortunately, this does not appear to have a trivial fix. As you mentioned, it is not the NVPTX back-end itself that is emitting the ".weak", but the default MCAsmStreamer implementation. Setting WeakDefDirective in the NVPTXMCAsmInfo class seems to trigger an emission of ".weak_directive", which doesn't help things. Setting LinkOnceDirective helps the ".weak" case, but there is code in LLVM that causes a special label to be produced when LinkOnceDirective is set, which again messes with the PTX assembler.

This is a case where we really need a custom MCAsmStreamer, but this will take a bit of time. Is this a blocker for you?

--
Thanks,
Justin Holewinski
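Editor's sketch: the reply above says that PTX needs its own printing logic, because the generic streamer emits the linkage as a separate ".weak NAME" line. The toy code below only illustrates that idea: the linkage is printed as a prefix of the same declaration. Every name in it (Linkage, linkageDirective, functionHeader) is hypothetical and stands in for nothing in LLVM. The linkage-to-directive mapping is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the linkage kinds relevant to this thread.
enum class Linkage { External, LinkOnceODR, Internal };

// Hypothetical: the PTX linking directive to print in front of ".func".
// The mapping is an illustrative assumption, not LLVM's actual behaviour.
std::string linkageDirective(Linkage linkage) {
  switch (linkage) {
  case Linkage::External:
    return ".visible ";
  case Linkage::LinkOnceODR:
    return ".weak ";
  case Linkage::Internal:
    return "";
  }
  return "";
}

// Builds the opening of a function declaration with the linkage as a prefix
// on the same line, instead of a standalone directive before it.
std::string functionHeader(Linkage linkage, const std::string &name) {
  return linkageDirective(linkage) + ".func " + name + "(";
}
```

For the function in the test case, functionHeader(Linkage::LinkOnceODR, "_Z4testv") yields ".weak .func _Z4testv(", the shape the PTX 3.1 spec is reported to require.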
Dmitry N. Mikushin
2012-Jun-14 18:21 UTC
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
Hi Justin,

Thanks for the explanation! Not really a blocker, but a bit of extra work, as I'm going to convert /opt/cuda/nvvm/ci_include.h to LLVM IR in order to fuse it with a non-C frontend language at the IR level. This header contains __inline__-s.

- D.

2012/6/13 Justin Holewinski <justin.holewinski at gmail.com>:

> This is a case where we really need a custom MCAsmStreamer, but this will
> take a bit of time. Is this a blocker for you?