similar to: [GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?

Displaying 20 results from an estimated 800 matches similar to: "[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?"

2016 Apr 08
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
Yeah, '.' is the direct reason for the ptxas failure here. I'm curious, however, about what the purpose of nvvm_reflect_anchorv() is here, and why the front-end always generates this function. Since the current PTX emission doesn't mangle dots, it would be a reasonable workaround for me to prevent the front-end from generating this function in the first place. Is there any
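For context, a minimal sketch of how the offending name can arise (the module names here are hypothetical): when two device bitcode files each carry an internal-linkage definition of the anchor, llvm-link keeps both and renames the second copy with a numeric suffix:

    # both a.bc and b.bc define _ZL21__nvvm_reflect_anchorv internally
    llvm-link a.bc b.bc -o merged.bc
    # merged.bc now contains a second symbol named
    #   _ZL21__nvvm_reflect_anchorv.1
    # and '.' is not a valid character in a PTX identifier, so ptxas
    # rejects the emitted assembly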
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
David's change makes nvvm_reflect_anchor unnecessary. The issue with dots in names generated by llvm still needs to be fixed. On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote: > Artem, > > With David's http://reviews.llvm.org/rL265060, do you think > __nvvm_reflect_anchor is still necessary? > > On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng
2016 Mar 12
2
instrumenting device code with gpucc
Hey Jingyue, Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect anchor didn't go away; ptxas is still complaining about the duplicate definition of function '_ZL21__nvvm_reflect_anchorv'. Did I misuse the nvvm-reflect pass? Thanks! yuanfeng On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote: > According to the examples you
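For reference, a sketch of how the pass would be invoked with the LLVM 3.9-era legacy pass manager (file names hypothetical); as the replies above suggest, it folds __nvvm_reflect calls but does not necessarily remove the anchor function itself:

    # run NVVMReflect over each device bitcode file before linking
    opt -nvvm-reflect device.bc -o device.reflect.bc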
2016 Mar 13
2
instrumenting device code with gpucc
Hey Jingyue, Thanks for being so responsive! I finally figured out a way to resolve the issue: all I have to do is use `-only-needed` when merging the device bitcodes with llvm-link. However, since we actually need to instrument the host code as well, I encountered another issue when I tried to glue the instrumented host code and fatbin together. When I only instrumented the device code, I
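A minimal sketch of the fix described above (file names hypothetical); -only-needed tells llvm-link to import only the symbols the destination module actually references, so the duplicate anchor definition is dropped:

    llvm-link -only-needed kernel.bc instrumentation.bc -o linked.bc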
2016 Mar 15
2
instrumenting device code with gpucc
Hi Jingyue, Sorry to ask again, but how exactly could I glue the fatbin with the instrumented host code? Or does it mean we actually cannot instrument both the host & device code at the same time? Thanks! yuanfeng On Tue, Mar 15, 2016 at 10:09 AM, Jingyue Wu <jingyue at google.com> wrote: > Including fatbin into host code should be done in frontend. > > On Mon, Mar 14, 2016
2016 Mar 10
4
instrumenting device code with gpucc
It's hard to tell what is wrong without a concrete example. E.g., what is the program you are instrumenting? What is the definition of the hook function? How did you link that definition with the binary? One thing suspicious to me is that you may have linked the definition of _Cool_MemRead_Hook as a host function instead of a device function. AFAIK, PTX assembly cannot be linked. So, if you
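To illustrate the host-vs-device distinction being raised here (a minimal sketch; the hook's real signature is not shown in the thread):

    // Without __device__, this is a host function: it never reaches the
    // device-side link step, and kernel calls to it go unresolved.
    __device__ void _Cool_MemRead_Hook(void* addr) {
      // instrumentation logic goes here
    }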
2016 Jul 29
2
[GPUCC] link against libdevice
Hi, I was trying to compile scalarProd.cu (from CUDA SDK) with the following command: clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc --cuda-gpu-arch=sm_50 scalarProd.cu but ended up with the following error: ptxas fatal : Unresolved extern function '__nv_mul24' Seems to me that libdevice was not automatically linked. I wonder what flags I need to pass to clang to have
2016 Aug 01
0
[GPUCC] link against libdevice
Directly CC-ing some folks who may be able to help. On Fri, Jul 29, 2016 at 6:27 AM Yuanfeng Peng via llvm-dev <llvm-dev at lists.llvm.org> wrote: > Hi, > > I was trying to compile scalarProd.cu (from CUDA SDK) with the following > command: > > clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc > --cuda-gpu-arch=sm_50 scalarProd.cu > > but ended up with
2016 Aug 01
2
[GPUCC] link against libdevice
Hi, Yuanfeng. What version of clang are you using? CUDA is only known to work at tip of head, so you must build clang yourself from source. I suspect that's your problem, but if building from source doesn't fix it, please attach the output of compiling with -v. Regards, -Justin On Sun, Jul 31, 2016 at 9:24 PM, Chandler Carruth <chandlerc at google.com> wrote: > Directly
2016 Aug 01
0
[GPUCC] link against libdevice
Hi Justin, Thanks for your response! The clang & llvm I'm using were built from source. Below is the output of compiling with -v. Any suggestions would be appreciated! clang version 3.9.0 (trunk 270145) (llvm/trunk 270133) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/bin Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
2016 Aug 01
3
[GPUCC] link against libdevice
OK, I see the problem. You were right that we weren't picking up libdevice. CUDA 7.0 only ships with the following libdevice binaries (found in /path/to/cuda/nvvm/libdevice): libdevice.compute_20.10.bc, libdevice.compute_30.10.bc, libdevice.compute_35.10.bc. If you ask for sm_50 with CUDA 7.0, clang can't find a matching libdevice binary, and it will apparently silently give up and try to
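A hedged example of the resulting workaround: request an architecture that CUDA 7.0's libdevice actually covers (or install a CUDA release that ships a compute_50 libdevice):

    # sm_35 matches libdevice.compute_35.10.bc, so __nv_mul24 resolves
    clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc \
        --cuda-gpu-arch=sm_35 scalarProd.cu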
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Hi, Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result in a clash of definitions between the function scope and the asm scope. Here is an example: > cat test.cu __attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int __any(int a) { int result; asm __volatile__ ("{ \n\t" ".reg .pred
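For reference, a self-contained sketch of this style of brace-scoped inline PTX (a hypothetical reconstruction, not the original test case):

    __device__ __inline__ int my_any(int a) {
      int result;
      // The outer "{ ... }" opens a PTX scope so %p1 stays local to this
      // asm block; if the braces are dropped during lowering, inlined
      // copies redeclare %p1 in the same scope and ptxas errors out.
      asm __volatile__("{ \n\t"
                       ".reg .pred %p1; \n\t"
                       "setp.ne.s32 %p1, %1, 0; \n\t"
                       "vote.any.pred %p1, %p1; \n\t"
                       "selp.s32 %0, 1, 0, %p1; \n\t"
                       "}"
                       : "=r"(result)
                       : "r"(a));
      return result;
    }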
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Dmitry, You might be better served by filing this as a bug (http://llvm.org/bugs/). Please include a test case and the steps to reproduce (i.e., what you've provided below). Chad On Jul 10, 2012, at 3:15 PM, Dmitry N. Mikushin wrote: > Hi, > > Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result in a clash of
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Yes, sure, good idea, because it might also be Clang-related. http://llvm.org/bugs/show_bug.cgi?id=13322 2012/7/11 Chad Rosier <mcrosier at apple.com> > Dmitry, > You might be better served by filing this as a bug (http://llvm.org/bugs/). > Please include a test case and the steps to reproduce (i.e., what you've > provided below). > > Chad > > On Jul 10, 2012,
2017 Jun 14
4
[CUDA] Lost debug information when compiling CUDA code
Hi, I needed to debug some CUDA code in my project; however, although I used -g when compiling the source code, no source-level information is available in cuda-gdb or cuda-memcheck. Specifically, below is what I did: 1) For a CUDA file a.cu, generate IR files: clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu; 2) Instrument the device code a-cuda-nvptx64-nvidia-cuda-sm_35.bc (generated
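A hedged reconstruction of the pipeline being described (the snippet is cut off at step 2; the plugin and pass name below are hypothetical):

    # 1) emit host and device IR with debug info
    clang++ -g -emit-llvm --cuda-gpu-arch=sm_35 -c a.cu
    # 2) instrument the device bitcode
    opt -load MyInstr.so -my-instr-pass \
        a-cuda-nvptx64-nvidia-cuda-sm_35.bc -o a-instr.bc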
2017 Nov 08
2
Debug info for Cuda
Nobody blames ptxas. I'm not saying that these are the troubles; I'm just saying that it has some features and we have some problems to be solved. But the lack of labels and label arithmetic in DWARF sections is the real problem, because LLVM actively uses them in DWARF sections. Best regards, Alexey Bataev On Nov 8, 2017, at 5:35, Madhur Amilkanthwar <madhur13490 at
2017 Nov 06
5
RFC: Debug info for Cuda
Hi everybody, As you know, the Cuda/NVPTX target has very limited support for debug info in Clang/LLVM. Currently, LLVM supports only the emission of line-number debug info. This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM translates the source code to LLVM IR, which is then lowered to a PTX (parallel thread execution) intermediate file. This PTX file represents a special kind of
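For concreteness, the lowering step described here can be observed directly (a hedged sketch using clang's CUDA driver flags; the input file name is hypothetical):

    # emit the device-side PTX that ptxas would later consume
    clang++ -S --cuda-device-only --cuda-gpu-arch=sm_35 a.cu -o a.ptx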
2017 Nov 06
2
Debug info for Cuda
06.11.2017 14:56, Robinson, Paul writes: >> Hi everybody, >> As you know, Cuda/NVPTX target has very limited support of the debug >> info in Clang/LLVM. Currently, LLVM supports only emission of the line >> numbers debug info. >> This is caused by limitations of the Cuda/NVPTX codegen. Clang/LLVM >> translates the source code to LLVM IR, which is then lowered to
2013 Mar 01
4
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I'm building this with llvm-c, and accessing these intrinsics by calling the intrinsic as if it were a function. class F_SREG<string OpStr, NVPTXRegClass regclassOut, Intrinsic IntOp> : NVPTXInst<(outs regclassOut:$dst), (ins), OpStr, [(set regclassOut:$dst, (IntOp))]>; def INT_PTX_SREG_TID_X : F_SREG<"mov.u32 \t$dst, %tid.x;",
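For comparison, the same special register is reachable from CUDA source through a clang builtin that lowers to the intrinsic matched above (a minimal sketch):

    __global__ void kernel(int* out) {
      // clang lowers this builtin to llvm.nvvm.read.ptx.sreg.tid.x, which
      // the F_SREG pattern above selects to "mov.u32 $dst, %tid.x;"
      out[0] = __nvvm_read_ptx_sreg_tid_x();
    }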