Displaying 20 results from an estimated 4000 matches similar to: "NVPTX compilation problems - ptxas error"
2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
Dear LLVM NVPTX maintainers,
Just to have the issue recorded, I don't know how important it is:
clang generates linkonce_odr out of __inline__, and NVPTX generates .weak
out of linkonce_odr (how it happens - a big question, btw, because I can't
find anything related in NVPTX asm printer - does it chain to some other
printer?), and finally ptxas (both 4.2 and 5) fails to compile it to
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
On Tue, Jun 12, 2012 at 6:11 PM, Dmitry N. Mikushin <maemarcus at gmail.com> wrote:
> Dear LLVM NVPTX maintainers,
>
> Just to have the issue recorded, I don't know how important it is:
>
> clang generates linkonce_odr out of __inline__, and NVPTX generates .weak
> out of linkonce_odr (how it happens - a big question, btw, because I can't
> find anything related
2016 Jun 30
0
NVPTX compilation problems - ptxas error
Hello Lorenz,
I think I may have found what is causing the problem. I had the same
problem and was able to solve it, see below.
Do you have more than one version of cuda installed? I think the way
clang invokes the compilation of the cuda parts of the code may be
causing the problem. It may be falling back onto an older version of cuda.
In my case I had cuda-5.5 installed at
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
In PTX, variables need to be defined before they are referenced. NVPTX emits the global variables in the same order as they appear in the LLVM IR and does not sort them. It is a bug in the NVPTX backend.
Thanks.
Yuan
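
A minimal sketch of the ordering requirement described above, written in Python for illustration only; it is not the NVPTX backend's code. It assumes each global's initializer references are already known as a simple name-to-names mapping. The `order_globals` helper and the example global names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def order_globals(refs):
    """Return global names so that every referenced global comes first.

    `refs` maps each global name to the names its initializer references.
    This is a hypothetical, simplified stand-in for LLVM IR initializers.
    """
    # TopologicalSorter expects node -> predecessors, which matches
    # "global -> globals that must be defined before it".
    # static_order() raises graphlib.CycleError on circular references.
    return list(TopologicalSorter(refs).static_order())


# Hypothetical module: the initializer of `table` takes the addresses
# of `a` and `b`, so both must be emitted before `table`.
refs = {"table": ["a", "b"], "a": [], "b": []}
order = order_globals(refs)
print(order)
```

Emitting globals in this order would satisfy the define-before-reference rule for acyclic initializers; mutually referencing globals would need separate handling.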
From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com]
Sent: Wednesday, July 18, 2012 7:44 AM
To: LLVM-Dev
Cc: Justin Holewinski; Yuan Lin
Subject: [NVPTX] PTXAS - Unimplemented feature: labels as
2012 Jul 18
2
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
Dear NVPTX community,
PTXAS fails to compile the ptx code generated by NVPTX. Is it an issue in the
backend, an issue in PTXAS, or a known, reasonable restriction?
Thanks,
- Dima.
> cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"
%struct.__st_parameter_dt.0.4
2017 Jun 09
1
NVPTX Back-end: relocatable device code support for dynamic parallelism
Hi everyone,
CUDA also allows some runtime functions to be called from device code. On
a multi-GPU system this allows the GPU to determine its device id on its
own via cudaGetDevice().
Unfortunately, I cannot get it working when compiling with clang. When
compiling with nvcc relocatable device code needs to be set to true
(-rdc=true) and the cudadevrt is needed when linking [0]. I did not
2018 Jun 21
2
NVPTX - Reordering load instructions
We already have a pass that vectorizes loads and stores in nvptx and
amdgpu. Not at my laptop, I forget the exact filename, but it's called
load-store vectorizer.
I think the question is, why is LSV not vectorizing this code?
I think the answer is, llvm can't tell that the loads are aligned. Ptxas
can, but only because it's (apparently) doing vectorization *after* it
resolves the
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Dmitry,
You might be better served by filing this as a bug (http://llvm.org/bugs/). Please include a test case and the steps to reproduce (i.e., what you've provided below).
Chad
On Jul 10, 2012, at 3:15 PM, Dmitry N. Mikushin wrote:
> Hi,
>
> Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result in a clash of
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Hi,
Looks like "{" and "}" are lost when trying to use the combination of Clang
and NVPTX, which may result in a clash of definitions of the function-scope
and asm-scope. Here is an example:
> cat test.cu
__attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int
__any(int a) {
int result;
asm __volatile__ ("{ \n\t"
".reg .pred
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Yes, sure, good idea, because it might also be Clang-related.
http://llvm.org/bugs/show_bug.cgi?id=13322
2012/7/11 Chad Rosier <mcrosier at apple.com>
> Dmitry,
> You might be better served by filing this as a bug (http://llvm.org/bugs/).
> Please include a test case and the steps to reproduce (i.e., what you've
> provided below).
>
> Chad
>
> On Jul 10, 2012,
2017 Nov 06
5
RFC: Debug info for Cuda
Hi everybody,
As you know, Cuda/NVPTX target has very limited support of the debug info in Clang/LLVM. Currently, LLVM supports only emission of the line numbers debug info.
This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to PTX (parallel thread execution) intermediate file. This PTX file represents special kind of
2017 Nov 08
2
Debug info for Cuda
Nobody blames ptxas. I'm not saying that these are the troubles, I'm just saying that it has some features and we have some problems to be solved.
But the lack of labels and label arithmetic in DWARF sections is the real problem, because LLVM actively uses them in DWARF sections.
Best regards,
Alexey Bataev
On Nov 8, 2017, at 5:35, Madhur Amilkanthwar <madhur13490 at
2017 Nov 06
2
Debug info for Cuda
On 06.11.2017 14:56, Robinson, Paul wrote:
>> Hi everybody,
>> As you know, Cuda/NVPTX target has very limited support of the debug
>> info in Clang/LLVM. Currently, LLVM supports only emission of the line
>> numbers debug info.
>> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM
>> translates the source code to LLVM IR, which is then lowered to
2020 Jan 15
2
Debug info for CUDA code
Hi Alexey,
Almost a year has passed and Nvidia has finally fixed the ptxas issue in CUDA 10.2 according to: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-compiler-resolved-issues However, I cannot yet use it with the llvm 9.0.0 release because CUDA 10.2 is not supported yet. Are there other branches of the llvm repo that support CUDA 10.2 now? Or do I need to wait for llvm 10
2019 Mar 11
2
Debug info for CUDA code
Hi Alexey,
Is there any option for clang to turn on debug info for the host code only but not the device code? I've been using something like -ggdb3 -O0 but this generates debug info for both host and device. I'm trying to work around the aforementioned ptxas bug.
Thanks,
Char
At 2019-02-28 02:09:54, "Alexey Bataev" <a.bataev at outlook.com> wrote:
Hi Char, it looks like
2016 Apr 07
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Hi,
I needed to compile a cuda source file (say, a.cu) into IR (a.bc), and then
merge a.bc with another bitcode file (b.bc, compiled from b.cu). So I
used *llvm-link a.bc b.bc -o c.bc*
However, I noticed that an internal function '*_ZL21__nvvm_reflect_anchorv()*'
is defined in both a.bc & b.bc, and when merging these two files, one of
the two definitions was renamed to
2019 Feb 27
3
Debug info for CUDA code
Hi Alexey,
I submitted the bug report to nvidia. While they are working on it, can you share some insight into what could potentially cause this? I just want to get a sense of whether such a bug requires a significant amount of work to fix, which can help me make some decisions moving forward with my project.
Thanks,
Char
At 2019-02-27 03:19:02, "Alexey Bataev" <a.bataev at outlook.com>
2016 Mar 12
2
instrumenting device code with gpucc
Hey Jingyue,
Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect
anchor didn't go away; ptxas is still complaining about the duplicate
definition of function '_ZL21__nvvm_reflect_anchorv'. Did I misuse
the nvvm-reflect pass?
Thanks!
yuanfeng
On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote:
> According to the examples you
2019 Feb 26
2
Debug info for CUDA code
Hi Alexey,
Just want to make sure I understand what you said because I'm not familiar with the llvm pipeline, it's this line:
"/net/gs/vol3/software/modules-sw/cuda/10.0/Linux/RHEL6/x86_64/bin/ptxas" -m64 -g --dont-merge-basicblocks --return-at-end -v --gpu-name sm_75 --output-file /tmp/60663577.1.login.q/testparticles-4fd988.o /tmp/60663577.1.login.q/testparticles-1d20c4.s
that
2016 Apr 08
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Yeah, '.' is the direct reason for the ptxas failure here. I'm curious,
however, about what the purpose of nvvm_reflect_anchorv() is here, and why
does the front-end always generate this function? Since the current PTX
emission doesn't mangle dots, it would be a reasonable workaround for me to
prevent the front-end from generating this function in the first place.
Is there any