thr3ads.net - similar to: "[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it"

Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it"

[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it

2012 Jun 13

[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it

On Tue, Jun 12, 2012 at 6:11 PM, Dmitry N. Mikushin <maemarcus at gmail.com>wrote: > Dear LLVM NVPTX maintainers, > > Just to have the issue recorded, I don't know how important it is: > > clang generates linkonce_odr out of __inline__, and NVPTX generates .weak > out of linkonce_odr (how it happens - a big question, btw, because I can't > find anything related

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

2012 Jul 11

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

Hello, FYI, this is a bug http://llvm.org/bugs/show_bug.cgi?id=13324 When compiling the following code for sm_20, func params are by some reason given with .align 0, which is invalid. Problem does not occur if compiled for sm_10. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple =

[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values

2012 Jul 18

[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values

Dear NVPTX community, PTXAS fails to compile the ptx code generated by NVPTX. Is it an issue of backend or an issue of PTXAS or a known reasonable restriction? Thanks, - Dima. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" %struct.__st_parameter_dt.0.4

[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values

2012 Jul 18

[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values

In ptx, variables need to be defined before referenced. NVPTX emits the global variables in the order as in the LLVM IR and does not sort them. It is a bug in the NVPTX backend. Thanks. Yuan From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com] Sent: Wednesday, July 18, 2012 7:44 AM To: LLVM-Dev Cc: Justin Holewinski; Yuan Lin Subject: [NVPTX] PTXAS - Unimplemented feature: labels as

[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken

2012 Jul 10

[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken

Hi, Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of definitions of the function-scope and asm-scope. Here is an example: > cat test.cu __attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int __any(int a) { int result; asm __volatile__ ("{ \n\t" ".reg .pred

[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken

2012 Jul 10

[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken

Dmitry, You might be better served by filing this as a bug (http://llvm.org/bugs/). Please include a test case and the steps to reproduce (i.e., what you've provided below). Chad On Jul 10, 2012, at 3:15 PM, Dmitry N. Mikushin wrote: > Hi, > > Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of

[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken

2012 Jul 10

[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken

Yes, sure, good idea, because might be also Clang-related. http://llvm.org/bugs/show_bug.cgi?id=13322 2012/7/11 Chad Rosier <mcrosier at apple.com> > Dmitry, > You might be better served by filing this as a bug (http://llvm.org/bugs/). > Please include a test case and the steps to reproduce (i.e., what you've > provided below). > > Chad > > On Jul 10, 2012,

NVPTX compilation problems - ptxas error

2016 Jun 07

NVPTX compilation problems - ptxas error

Hello everybody, i am currently testing the NVPTX back-end and playing around with the IR it generates. Unfortunately i have come to an compilation error i cannot solve on my own. Maybe someone reading this knows what is causing the trouble and has a possible solution. I am using Ubuntu 16.04, Cuda 7.5 and clang version 3.9.0 (https://github.com/llvm-mirror/clang.git

instrumenting device code with gpucc

2016 Mar 05

instrumenting device code with gpucc

On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

2012 Nov 09

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

Dear all, I'm attaching a patch that should fix the issue mentioned above. It simply makes the same check seen in the same file for global variables: emitPTXAddressSpace(PTy->getAddressSpace(), O); if (GVar->getAlignment() == 0) O << " .align " << (int) TD->getPrefTypeAlignment(ETy); else O << " .align " <<

instrumenting device code with gpucc

2016 Mar 10

instrumenting device code with gpucc

It's hard to tell what is wrong without a concrete example. E.g., what is the program you are instrumenting? What is the definition of the hook function? How did you link that definition with the binary? One thing suspicious to me is that you may have linked the definition of _Cool_MemRead_Hook as a host function instead of a device function. AFAIK, PTX assembly cannot be linked. So, if you

PTX generation from CUDA file for compute capability 1.0 (sm_10)

2016 Jun 02

PTX generation from CUDA file for compute capability 1.0 (sm_10)

Hello Bergström/Eric, Thanks for the reply. The G80(sm_10) architecture was ported on FPGA by a group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf). Our group have some further research interest on this work. I was working on modifying the Clang-LLVM for a couple of months and achieved the required changes. But Clang-LLVM is only allowing me to generate PTX for sm_20,

[LLVMdev] NVPTX: __iAtomicCAS support ?

2012 May 16

[LLVMdev] NVPTX: __iAtomicCAS support ?

Dear colleagues, I'm looking if we can replace nvopencc with LLVM NVPTX in our project. It turns NVPTX won't work with the code nvopencc can handle (please see the log below). So are atomic intrinsics not supported or am I doing call in a wrong way? Thanks, - Dima. SOURCE ======== dmikushin at hp2:~> cat kernelgen_monitor.ll ; ModuleID =

[LLVMdev] NVPTX: __iAtomicCAS support ?

2012 May 16

[LLVMdev] NVPTX: __iAtomicCAS support ?

> -----Original Message----- > From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com] > Sent: Wednesday, May 16, 2012 5:44 AM > To: LLVM-Dev > Cc: Justin Holewinski > Subject: NVPTX: __iAtomicCAS support ? > > Dear colleagues, > > I'm looking if we can replace nvopencc with LLVM NVPTX in our project. > It turns NVPTX won't work with the code nvopencc

NVPTX - Reordering load instructions

2018 Jun 21

NVPTX - Reordering load instructions

We already have a pass that vectorizes loads and stores in nvptx and amdgpu. Not at my laptop, I forget the exact filename, but it's called load-store vectorizer. I think the question is, why is LSV not vectorizing this code? I think the answer is, llvm can't tell that the loads are aligned. Ptxas can, but only because it's (apparently) doing vectorization *after* it reesolves the

Missing R.h

2011 Feb 25

Missing R.h

Hi, I'm trying to install a module - gputools - and keep getting compile time errors about missing R.h Does anyone know where this file can be found? Thanks!

RFC: Debug info for Cuda

2017 Nov 06

RFC: Debug info for Cuda

Hi everybody, As you know, Cuda/NVPTX target has very limited support of the debug info in Clang/LLVM. Currently, LLVM supports only emission of the line numbers debug info. This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to PTX (parallel thread execution) intermediate file. This PTX file represents special kind of

Debug info for Cuda

2017 Nov 08

Debug info for Cuda

Nobody blames ptxas. I'm not saying that these are the troubles, I'm just saying that it has some features and we have some problems to be solved. But lack of labels, label arithmetics in DWARF sections is the real problem, because LLVM actively uses it in DWARF sections Best regards, Alexey Bataev 8 нояб. 2017 г., в 5:35, Madhur Amilkanthwar <madhur13490 at

Debug info for Cuda

2017 Nov 06

Debug info for Cuda

06.11.2017 14:56, Robinson, Paul пишет: >> Hi everybody, >> As you know, Cuda/NVPTX target has very limited support of the debug >> info in Clang/LLVM. Currently, LLVM supports only emission of the line >> numbers debug info. >> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM >> translates the source code to LLVM IR, which is then lowered to

[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?

2016 Apr 08

[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?

Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, however, about what the purpose of nvvm_reflect_anchorv() is here, and why does the front-end always generate this function? Since the current PTX emission doesn't mangle dots, it would be a reasonable workaround for me to prevent the front-end from generating this function in the first place. Is there any

similar to: [LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it