similar to: RFC: Debug info for Cuda

Displaying 20 results from an estimated 7000 matches similar to: "RFC: Debug info for Cuda"

2017 Nov 06
2
Debug info for Cuda
06.11.2017 14:56, Robinson, Paul пишет: >> Hi everybody, >> As you know, Cuda/NVPTX target has very limited support of the debug >> info in Clang/LLVM. Currently, LLVM supports only emission of the line >> numbers debug info. >> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM >> translates the source code to LLVM IR, which is then lowered to
2017 Nov 08
2
Debug info for Cuda
Nobody blames ptxas. I'm not saying that these are the troubles, I'm just saying that it has some features and we have some problems to be solved. But lack of labels, label arithmetics in DWARF sections is the real problem, because LLVM actively uses it in DWARF sections Best regards, Alexey Bataev 8 нояб. 2017 г., в 5:35, Madhur Amilkanthwar <madhur13490 at
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Hi, Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of definitions of the function-scope and asm-scope. Here is an example: > cat test.cu __attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int __any(int a) { int result; asm __volatile__ ("{ \n\t" ".reg .pred
2016 Apr 07
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Hi, I needed to compile a cuda source file (say, a.cu) into IR (a.bc), and then merge a.bc with another bitcode file (b.bc, compiled from b.cu). So I used *llvm-link a.bc b.bc -o c.bc* However, I noticed that an internal function '* _ZL21__nvvm_reflect_anchorv() *' is defined in both a.bc & b.bc, and when merging these two files, one of the two definitions was renamed to
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Dmitry, You might be better served by filing this as a bug (http://llvm.org/bugs/). Please include a test case and the steps to reproduce (i.e., what you've provided below). Chad On Jul 10, 2012, at 3:15 PM, Dmitry N. Mikushin wrote: > Hi, > > Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of
2016 Apr 08
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, however, about what the purpose of nvvm_reflect_anchorv() is here, and why does the front-end always generate this function? Since the current PTX emission doesn't mangle dots, it would be a reasonable workaround for me to prevent the front-end from generating this function in the first place. Is there any
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Yes, sure, good idea, because might be also Clang-related. http://llvm.org/bugs/show_bug.cgi?id=13322 2012/7/11 Chad Rosier <mcrosier at apple.com> > Dmitry, > You might be better served by filing this as a bug (http://llvm.org/bugs/). > Please include a test case and the steps to reproduce (i.e., what you've > provided below). > > Chad > > On Jul 10, 2012,
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
David's change makes nvvm_reflect_anchor unnecessary. The issue with dots in names generated by llvm still needs to be fixed. On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote: > Artem, > > With David's http://reviews.llvm.org/rL265060, do you think > __nvvm_reflect_anchor is still necessary? > > On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng
2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
Dear LLVM NVPTX maintainers, Just to have the issue recorded, I don't know how important it is: clang generates linkonce_odr out of __inline__, and NVPTX generates .weak out of linkonce_odr (how it happens - a big question, btw, because I can't find anything related in NVPTX asm printer - does it chain to some other printer?), and finally ptxas (both 4.2 and 5) fails to compile it to
2018 Jun 21
2
NVPTX - Reordering load instructions
Hi all, I'm looking into the performance difference of a benchmark compiled with NVCC vs NVPTX (coming from Julia, not CUDA C) and I'm seeing a significant difference due to PTX instruction ordering. The relevant source code consists of two nested loops that get fully unrolled, doing some basic arithmetic with values loaded from shared memory: > #define BLOCK_SIZE 16 > >
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
On Tue, Jun 12, 2012 at 6:11 PM, Dmitry N. Mikushin <maemarcus at gmail.com>wrote: > Dear LLVM NVPTX maintainers, > > Just to have the issue recorded, I don't know how important it is: > > clang generates linkonce_odr out of __inline__, and NVPTX generates .weak > out of linkonce_odr (how it happens - a big question, btw, because I can't > find anything related
2019 Dec 16
3
Guidance on working with the NVIDIA GPU back-end
Hi all, I'm primarily a hardware person but would like to do some compiler-architecture co-design research. Are there any good references for the NVPTX backend? I'd like to change that backend to have a limited number of physical registers rather than an unlimited number of virtual ones (for more realistic modeling in a uarch simulator). Being able to do register allocation and other
2018 Jun 21
2
NVPTX - Reordering load instructions
We already have a pass that vectorizes loads and stores in nvptx and amdgpu. Not at my laptop, I forget the exact filename, but it's called load-store vectorizer. I think the question is, why is LSV not vectorizing this code? I think the answer is, llvm can't tell that the loads are aligned. Ptxas can, but only because it's (apparently) doing vectorization *after* it reesolves the
2019 Feb 27
3
Debug info for CUDA code
Hi Alexey, I submitted the bug report to nvidia. While they are working on it, can you share some insight in what could potentially cause this? I just want to get a sense if such a bug require significant amount of work to fix, which can help me make some decision moving forward with my project. Thanks, Char At 2019-02-27 03:19:02, "Alexey Bataev" <a.bataev at outlook.com>
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and
2020 Jan 15
2
Debug info for CUDA code
Hi Alexey, Almost a year has passed and Nvidia finally fixes the ptxas issue in CUDA 10.2 according to: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-compiler-resolved-issues However, I can not yet use it with llvm 9.0.0 release because CUDA 10.2 is not supported yet. Is there other branches of the llvm repo that supports CUDA 10.2 now? Or do I need to wait for llvm 10
2019 Feb 26
2
Debug info for CUDA code
Hi Alexey, Just want to make sure I understand what you said because I'm not familiar with the llvm pipeline, it's this line: /net/gs/vol3/software/modules-sw/cuda/10.0/Linux/RHEL6/x86_64/bin/ptxas" -m64 -g --dont-merge-basicblocks --return-at-end -v --gpu-name sm_75 --output-file /tmp/60663577.1.login.q/testparticles-4fd988.o /tmp/60663577.1.login.q/testparticles-1d20c4.s that
2019 Mar 11
2
Debug info for CUDA code
Hi Alexey, Is there any option for clang to turn on debug for the host code only but not the device code? I've been using something like -ggdb3 -O0 but this generate debug info for both host and device. I'm trying to work around the aforementioned ptxas bug. Thanks, Char At 2019-02-28 02:09:54, "Alexey Bataev" <a.bataev at outlook.com> wrote: Hi Char, it looks like
2019 Feb 26
1
Debug info for CUDA code
Hi Alexey, Thanks for the great work! The version I checked out works most of the time. But I do encounter crashes sometimes. I can't file a bug report on https://bugs.llvm.org/ because I don't have an account. I sent an email to bugs-admin at lists.llvm.org for an account already but I haven't heard back. Meanwhile, can you take a look at the issue? I'm attaching the bug report
2013 Mar 01
4
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I'm building this with llvm-c, and accessing these intrinsics via calling the intrinsic as if it were a function. class F_SREG<string OpStr, NVPTXRegClass regclassOut, Intrinsic IntOp> : NVPTXInst<(outs regclassOut:$dst), (ins), OpStr, [(set regclassOut:$dst, (IntOp))]>; def INT_PTX_SREG_TID_X : F_SREG<"mov.u32 \t$dst, %tid.x;",