similar to: cuda cross compiling issue for target aarch64-linux-androideabi

2018 Mar 23
0
cuda cross compiling issue for target aarch64-linux-androideabi
+Artem Belevich <tra at google.com> On Fri, Mar 23, 2018 at 7:53 PM Bharath Bhoopalam via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I was wondering if anyone has encountered this issue when cross compiling > cuda on Nvidia TX2 running android. > > The error is > In file included from <built-in>:1: > In file included from >
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com> wrote: > Hi Jingyue, > > My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother > you, but I'm having trouble with gpucc in my project, and I would be really > grateful for your help! > > Currently we're trying to instrument CUDA code using LLVM 3.9, and
2016 Mar 10
4
instrumenting device code with gpucc
It's hard to tell what is wrong without a concrete example. E.g., what is the program you are instrumenting? What is the definition of the hook function? How did you link that definition with the binary? One thing suspicious to me is that you may have linked the definition of _Cool_MemRead_Hook as a host function instead of a device function. AFAIK, PTX assembly cannot be linked. So, if you
2016 Mar 13
2
instrumenting device code with gpucc
Hey Jingyue, Thanks for being so responsive! I finally figured out a way to resolve the issue: all I have to do is to use `-only-needed` when merging the device bitcodes with llvm-link. However, since we actually need to instrument the host code as well, I encountered another issue when I tried to glue the instrumented host code and fatbin together. When I only instrumented the device code, I
2016 Mar 15
2
instrumenting device code with gpucc
Hi Jingyue, Sorry to ask again, but how exactly could I glue the fatbin with the instrumented host code? Or does it mean we actually cannot instrument both the host & device code at the same time? Thanks! yuanfeng On Tue, Mar 15, 2016 at 10:09 AM, Jingyue Wu <jingyue at google.com> wrote: > Including fatbin into host code should be done in frontend. > > On Mon, Mar 14, 2016
2016 Mar 12
2
instrumenting device code with gpucc
Hey Jingyue, Though I tried `opt -nvvm-reflect` on both bc files, the nvvm reflect anchor didn't go away; ptxas is still complaining about the duplicate definition of function '_ZL21__nvvm_reflect_anchorv'. Did I misuse the nvvm-reflect pass? Thanks! yuanfeng On Fri, Mar 11, 2016 at 10:10 AM, Jingyue Wu <jingyue at google.com> wrote: > According to the examples you
2019 Mar 11
2
Debug info for CUDA code
Hi Alexey, Is there any option for clang to turn on debug info for the host code only but not the device code? I've been using something like -ggdb3 -O0, but this generates debug info for both host and device. I'm trying to work around the aforementioned ptxas bug. Thanks, Char At 2019-02-28 02:09:54, "Alexey Bataev" <a.bataev at outlook.com> wrote: Hi Char, it looks like
2019 Feb 27
3
Debug info for CUDA code
Hi Alexey, I submitted the bug report to Nvidia. While they are working on it, can you share some insight into what could potentially cause this? I just want to get a sense of whether such a bug requires a significant amount of work to fix, which can help me make some decisions moving forward with my project. Thanks, Char At 2019-02-27 03:19:02, "Alexey Bataev" <a.bataev at outlook.com>
2019 Feb 26
2
Debug info for CUDA code
Hi Alexey, Just want to make sure I understand what you said because I'm not familiar with the llvm pipeline, it's this line: "/net/gs/vol3/software/modules-sw/cuda/10.0/Linux/RHEL6/x86_64/bin/ptxas" -m64 -g --dont-merge-basicblocks --return-at-end -v --gpu-name sm_75 --output-file /tmp/60663577.1.login.q/testparticles-4fd988.o /tmp/60663577.1.login.q/testparticles-1d20c4.s that
2016 Oct 27
3
problem on compiling cuda program with clang++
Hi all, I compiled the *llvm3.9* source code on the *Nvidia TX1* board, and now I am following the document in docs/CompileCudaWithLLVM.rst to compile a CUDA program with clang++. However, when I compile `axpy.cu` using `nvcc`, *nvcc* generates the correct binary; while compiling `axpy.cu` using clang++, the detailed command is `clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_53
2017 Aug 02
2
CUDA compilation "No available targets are compatible with this triple." problem
Yes, I followed the guide. The same error showed up: >clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_35 -L/usr/local/cuda/lib64 -I/usr/local/cuda/include -lcudart_static -ldl -lrt -pthread error: unable to create target: 'No available targets are compatible with this triple.' ________________________________ From: Kevin Choi <code.kchoi at gmail.com> Sent: Wednesday, August 2,
2017 Aug 02
2
CUDA compilation "No available targets are compatible with this triple." problem
Hi, I have trouble compiling CUDA code with Clang. The following is a command I tried: > clang++ axpy.cu -o axpy --cuda-gpu-arch=sm_35 --cuda-path=/usr/local/cuda The error message is error: unable to create target: 'No available targets are compatible with this triple.' The info of the LLVM I'm using is as follows: > clang++ --version clang version 6.0.0
2020 Jan 15
2
Debug info for CUDA code
Hi Alexey, Almost a year has passed and Nvidia has finally fixed the ptxas issue in CUDA 10.2 according to: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#cuda-compiler-resolved-issues However, I cannot yet use it with the llvm 9.0.0 release because CUDA 10.2 is not supported yet. Are there other branches of the llvm repo that support CUDA 10.2 now? Or do I need to wait for llvm 10
2018 May 01
3
Compiling CUDA with clang on Windows
Dear all, In the official document <https://llvm.org/docs/CompileCudaWithLLVM.html>, it is mentioned that CUDA compilation is supported on Windows as of 2017-01-05. I used msys2 to install clang 5.0.1. Then I installed CUDA 8.0. However, I could not compile any CUDA code with the prescribed settings. I wonder whether anyone has successfully compiled CUDA code with clang on Windows.
2019 Feb 26
1
Debug info for CUDA code
Hi Alexey, Thanks for the great work! The version I checked out works most of the time. But I do encounter crashes sometimes. I can't file a bug report on https://bugs.llvm.org/ because I don't have an account. I sent an email to bugs-admin at lists.llvm.org for an account already but I haven't heard back. Meanwhile, can you take a look at the issue? I'm attaching the bug report
2019 Jan 23
2
Debug info for CUDA code
Hi Char, I found the problem, for some reason the last patch was applied correctly. Just committed the fixed version. Tried to compile axpy.cu, everything works. ------------- Best regards, Alexey Bataev 23.01.2019 13:37, treinz wrote: > Hi Alexey, > > I tried the b7195a6 from the llvm github mirror, which does include > your commit D46189 <https://reviews.llvm.org/D46189> (see
2018 Dec 14
8
Debug info for CUDA code
Are you planning to release this as soon as it's ready, or do you want to make it into a major release? Is it possible to let me know (maybe by replying to this thread) once the code is ready? I know sometimes it takes a while to get things into a major release. I greatly appreciate your work on this! Thanks, Char At 2018-12-15 05:19:50, "Alexey Bataev" <a.bataev at outlook.com>
2013 Jul 17
3
[LLVMdev] regarding compiling clang for different platform
Hi, I am new to LLVM. I want to use llvm and clang on Android. I have downloaded the Android toolchain and ran configure for llvm using the following command ./configure --build=arm-linux-androideabi --host=arm-linux-androideabi --target=arm-linux-androideabi --with-float=hard --with-fpu=neon --enable-targets=arm --enable-optimized --enable-assertions and was getting the error "checking
2012 Sep 06
1
[LLVMdev] [NVPTX] powf intrinsic in unimplemented
Dear all, During app compilation we have a crash in NVPTX backend: LLVM ERROR: Cannot select: 0x732b270: i64 = ExternalSymbol'__powisf2' [ID=18] As I understand LLVM tries to lower the following call %28 = call ptx_device float @llvm.powi.f32(float 2.000000e+00, i32 %8) nounwind readonly to device intrinsic. The table llvm/IntrinsicsNVVM.td does not contain such intrinsic, however it
2018 Mar 15
2
[RFC] Stop giving a default CPU to the LTO plugin?
Hello everyone, this is most likely Arm specific, but could affect other targets where there is a somewhat complex relationship between the triple and the mcpu option. At present, when clang is used as a linker driver for the gold-plugin and an explicit -mcpu is not given to clang, clang will always generate a -Wl,-plugin-opt=mcpu=<default CPU> where the default CPU is based