similar to: Guidance on working with the NVIDIA GPU back-end

Displaying 20 results from an estimated 5000 matches similar to: "Guidance on working with the NVIDIA GPU back-end"

2017 Nov 06
5
RFC: Debug info for Cuda
Hi everybody, As you know, Cuda/NVPTX target has very limited support of the debug info in Clang/LLVM. Currently, LLVM supports only emission of the line numbers debug info. This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to PTX (parallel thread execution) intermediate file. This PTX file represents special kind of
2012 Jul 18
2
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
Dear NVPTX community, PTXAS fails to compile the ptx code generated by NVPTX. Is it an issue of backend or an issue of PTXAS or a known reasonable restriction? Thanks, - Dima. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" %struct.__st_parameter_dt.0.4
2017 Nov 06
2
Debug info for Cuda
06.11.2017 14:56, Robinson, Paul пишет: >> Hi everybody, >> As you know, Cuda/NVPTX target has very limited support of the debug >> info in Clang/LLVM. Currently, LLVM supports only emission of the line >> numbers debug info. >> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM >> translates the source code to LLVM IR, which is then lowered to
2017 Nov 08
2
Debug info for Cuda
Nobody blames ptxas. I'm not saying that these are the troubles, I'm just saying that it has some features and we have some problems to be solved. But lack of labels, label arithmetics in DWARF sections is the real problem, because LLVM actively uses it in DWARF sections Best regards, Alexey Bataev 8 нояб. 2017 г., в 5:35, Madhur Amilkanthwar <madhur13490 at
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
In ptx, variables need to be defined before referenced. NVPTX emits the global variables in the order as in the LLVM IR and does not sort them. It is a bug in the NVPTX backend. Thanks. Yuan From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com] Sent: Wednesday, July 18, 2012 7:44 AM To: LLVM-Dev Cc: Justin Holewinski; Yuan Lin Subject: [NVPTX] PTXAS - Unimplemented feature: labels as
2019 Feb 27
3
Debug info for CUDA code
Hi Alexey, I submitted the bug report to nvidia. While they are working on it, can you share some insight in what could potentially cause this? I just want to get a sense if such a bug require significant amount of work to fix, which can help me make some decision moving forward with my project. Thanks, Char At 2019-02-27 03:19:02, "Alexey Bataev" <a.bataev at outlook.com>
2019 Feb 26
2
Debug info for CUDA code
Hi Alexey, Just want to make sure I understand what you said because I'm not familiar with the llvm pipeline, it's this line: /net/gs/vol3/software/modules-sw/cuda/10.0/Linux/RHEL6/x86_64/bin/ptxas" -m64 -g --dont-merge-basicblocks --return-at-end -v --gpu-name sm_75 --output-file /tmp/60663577.1.login.q/testparticles-4fd988.o /tmp/60663577.1.login.q/testparticles-1d20c4.s that
2020 Jan 15
2
Debug info for CUDA code
Hi Alexey, Almost a year has passed and Nvidia finally fixes the ptxas issue in CUDA 10.2 according to: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-compiler-resolved-issues However, I can not yet use it with llvm 9.0.0 release because CUDA 10.2 is not supported yet. Is there other branches of the llvm repo that supports CUDA 10.2 now? Or do I need to wait for llvm 10
2019 Mar 11
2
Debug info for CUDA code
Hi Alexey, Is there any option for clang to turn on debug for the host code only but not the device code? I've been using something like -ggdb3 -O0 but this generate debug info for both host and device. I'm trying to work around the aforementioned ptxas bug. Thanks, Char At 2019-02-28 02:09:54, "Alexey Bataev" <a.bataev at outlook.com> wrote: Hi Char, it looks like
2018 Dec 14
8
Debug info for CUDA code
Are you planning to release this as soon as it's ready or you want to make it into a major release? Is it possible to let me know (maybe by replying to this thread) once the code is ready? I know sometimes it takes a while to get things in the major release. I greatly appreciate your work on this! Thanks, Char 在 2018-12-15 05:19:50,"Alexey Bataev" <a.bataev at outlook.com>
2019 Feb 26
1
Debug info for CUDA code
Hi Alexey, Thanks for the great work! The version I checked out works most of the time. But I do encounter crashes sometimes. I can't file a bug report on https://bugs.llvm.org/ because I don't have an account. I sent an email to bugs-admin at lists.llvm.org for an account already but I haven't heard back. Meanwhile, can you take a look at the issue? I'm attaching the bug report
2018 Jun 21
2
NVPTX - Reordering load instructions
Hi all, I'm looking into the performance difference of a benchmark compiled with NVCC vs NVPTX (coming from Julia, not CUDA C) and I'm seeing a significant difference due to PTX instruction ordering. The relevant source code consists of two nested loops that get fully unrolled, doing some basic arithmetic with values loaded from shared memory: > #define BLOCK_SIZE 16 > >
2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
Dear LLVM NVPTX maintainers, Just to have the issue recorded, I don't know how important it is: clang generates linkonce_odr out of __inline__, and NVPTX generates .weak out of linkonce_odr (how it happens - a big question, btw, because I can't find anything related in NVPTX asm printer - does it chain to some other printer?), and finally ptxas (both 4.2 and 5) fails to compile it to
2019 Jan 23
2
Debug info for CUDA code
Hi Char, I found the problem, for some reason the last patch was applied correctly. Just committed the fixed version. Tried to compile axpy.cu, everything works. ------------- Best regards, Alexey Bataev 23.01.2019 13:37, treinz пишет: > Hi Alexey, > > I tried the b7195a6 from the llvm github mirror, which does include > your commit D46189 <https://reviews.llvm.org/D46189> (see
2018 Jun 21
2
NVPTX - Reordering load instructions
We already have a pass that vectorizes loads and stores in nvptx and amdgpu. Not at my laptop, I forget the exact filename, but it's called load-store vectorizer. I think the question is, why is LSV not vectorizing this code? I think the answer is, llvm can't tell that the loads are aligned. Ptxas can, but only because it's (apparently) doing vectorization *after* it reesolves the
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
On Tue, Jun 12, 2012 at 6:11 PM, Dmitry N. Mikushin <maemarcus at gmail.com>wrote: > Dear LLVM NVPTX maintainers, > > Just to have the issue recorded, I don't know how important it is: > > clang generates linkonce_odr out of __inline__, and NVPTX generates .weak > out of linkonce_odr (how it happens - a big question, btw, because I can't > find anything related
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Hi, Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of definitions of the function-scope and asm-scope. Here is an example: > cat test.cu __attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int __any(int a) { int result; asm __volatile__ ("{ \n\t" ".reg .pred
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Dmitry, You might be better served by filing this as a bug (http://llvm.org/bugs/). Please include a test case and the steps to reproduce (i.e., what you've provided below). Chad On Jul 10, 2012, at 3:15 PM, Dmitry N. Mikushin wrote: > Hi, > > Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Yes, sure, good idea, because might be also Clang-related. http://llvm.org/bugs/show_bug.cgi?id=13322 2012/7/11 Chad Rosier <mcrosier at apple.com> > Dmitry, > You might be better served by filing this as a bug (http://llvm.org/bugs/). > Please include a test case and the steps to reproduce (i.e., what you've > provided below). > > Chad > > On Jul 10, 2012,