Displaying 20 results from an estimated 5000 matches similar to: "PTX generation from CUDA file for compute capability 1.0 (sm_10)"
2016 Jun 02
3
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello Bergström/Eric,
Thanks for the reply. The G80 (sm_10) architecture was ported to an FPGA by a
group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf).
Our group has further research interest in this work. I have been modifying
Clang-LLVM for a couple of months and have made the required changes. But
Clang-LLVM only allows me to generate PTX for
sm_20,
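For illustration, a minimal sketch (not taken from the thread; the kernel and
the sm_20 value are placeholders) of how a compute capability is requested
when emitting PTX with clang:

  // Compile with something like:
  //   clang++ -x cuda axpy.cu --cuda-gpu-arch=sm_20 --cuda-device-only -S -o axpy.ptx
  // to emit PTX for the requested compute capability; sm_10 is rejected,
  // which is the limitation described above.
  __global__ void axpy(float a, const float *x, float *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
      y[i] = a * x[i] + y[i];
  }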
2018 Mar 23
2
cuda cross compiling issue for target aarch64-linux-androideabi
I was wondering if anyone has encountered this issue when cross-compiling
CUDA on an NVIDIA TX2 running Android.
The error is
In file included from <built-in>:1:
In file included from
prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_wrapper.h:219:
../cuda/targets/aarch64-linux-androideabi/include/math_functions.hpp:3477:19:
error: no matching function
2018 Mar 23
0
cuda cross compiling issue for target aarch64-linux-androideabi
+Artem Belevich <tra at google.com>
On Fri, Mar 23, 2018 at 7:53 PM Bharath Bhoopalam via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I was wondering if anyone has encountered this issue when cross-compiling
> CUDA on an NVIDIA TX2 running Android.
>
> The error is
> In file included from <built-in>:1:
> In file included from
>
2013 Mar 20
2
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
Thanks a lot, Justin.
I will remove the toolkit header. Just one last question (maybe ;) ): if I
do away with the toolkit headers, it says "unknown type name '__device__'".
Does this function qualifier have an alternative, or can I just do away with it?
2016 Apr 09
2
[GPUCC] how to remove _ZL21__nvvm_reflect_anchorv() automatically?
David's change makes __nvvm_reflect_anchor unnecessary. The issue with dots
in names generated by LLVM still needs to be fixed.
On Apr 9, 2016 8:32 AM, "Jingyue Wu" <jingyue at google.com> wrote:
> Artem,
>
> With David's http://reviews.llvm.org/rL265060, do you think
> __nvvm_reflect_anchor is still necessary?
>
> On Fri, Apr 8, 2016 at 9:37 AM, Yuanfeng
2011 Feb 25
2
Missing R.h
Hi,
I'm trying to install a module - gputools - and keep getting compile-time
errors about a missing R.h.
Does anyone know where this file can be found?
Thanks!
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Hi,
Looks like "{" and "}" are lost when using the combination of Clang and
NVPTX, which may result in a clash between function-scope and asm-scope
definitions. Here is an example:
> cat test.cu
__attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int
__any(int a) {
int result;
asm __volatile__ ("{ \n\t"
".reg .pred
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Dmitry,
You might be better served by filing this as a bug (http://llvm.org/bugs/). Please include a test case and the steps to reproduce (i.e., what you've provided below).
Chad
On Jul 10, 2012, at 3:15 PM, Dmitry N. Mikushin wrote:
> Hi,
>
> Looks like "{" and "}" are lost when trying to use the combination of Clang and NVPTX, which may result into clash of
2020 Jul 30
2
Status of CUDA 11 support
Hi,
I work in a large CUDA codebase and use Clang to build some of our CUDA code to improve compilation speed. We're planning to upgrade to CUDA 11 soon, and it appears that CUDA 11 is not yet supported in LLVM.
From the LLVM commit history, I can see that work on CUDA 11 has started. Is this currently being worked on? What work remains? And is any help needed to finish
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
Yes, sure, good idea, because it might also be Clang-related.
http://llvm.org/bugs/show_bug.cgi?id=13322
2012/7/11 Chad Rosier <mcrosier at apple.com>
> Dmitry,
> You might be better served by filing this as a bug (http://llvm.org/bugs/).
> Please include a test case and the steps to reproduce (i.e., what you've
> provided below).
>
> Chad
>
> On Jul 10, 2012,
2013 Mar 21
0
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
Not really. Clang does not have a way to annotate device vs. kernel
functions in C/C++ mode. You're probably better off trying to use OpenCL
or CUDA mode in clang.
In the clang unit tests, there is a cuda.h header (tests/SemaCUDA/cuda.h)
that provides very basic support for these keywords.
If you compile as CUDA (use a .cu extension, or "-x cuda") and use this
header, you will have
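As a rough idea of what that header provides, a hedged sketch (not the actual
contents of tests/SemaCUDA/cuda.h) that maps the CUDA keywords onto clang
attributes:

  // The CUDA qualifiers are just macros for clang attributes, which is why
  // plain C/C++ mode reports "unknown type name '__device__'" without a
  // header like this.
  #define __device__   __attribute__((device))
  #define __global__   __attribute__((global))
  #define __host__     __attribute__((host))
  #define __shared__   __attribute__((shared))
  #define __constant__ __attribute__((constant))

  struct dim3 {
    unsigned x, y, z;
    dim3(unsigned x = 1, unsigned y = 1, unsigned z = 1) : x(x), y(y), z(z) {}
  };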
2017 Aug 16
3
CUDA separate compilation
Clang currently doesn't support CUDA separate compilation, and thus extern
__device__ functions and variables cannot be used.
Could someone give me pointers on where to look or what has to be done to
support this? If at all possible, I'd like to see what's missing and
possibly try to tackle it.
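To make the limitation concrete, a hedged sketch (file names and symbols are
illustrative) of the pattern that needs separate compilation, i.e. what nvcc
supports via -rdc=true but clang cannot yet resolve:

  // a.cu -- defines a device variable and a device function
  __device__ int counter = 0;
  __device__ int bump() { return ++counter; }

  // b.cu -- declares them extern and uses them from a kernel; resolving
  // these references requires device-side linking across translation units.
  extern __device__ int counter;
  extern __device__ int bump();

  __global__ void kernel(int *out) { *out = bump() + counter; }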
2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
Dear LLVM NVPTX maintainers,
Just to have the issue recorded (I don't know how important it is):
clang generates linkonce_odr from __inline__, and NVPTX generates .weak
from linkonce_odr (how that happens is a big question, by the way, because I
can't find anything related in the NVPTX asm printer; does it chain to some
other printer?), and finally ptxas (both 4.2 and 5) fails to compile it to
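A hedged sketch (illustrative only, not the reporter's code) of the kind of
source that triggers the chain described above:

  // clang gives this inline device function linkonce_odr linkage in LLVM IR;
  // the NVPTX backend then prints it as a .weak function in PTX, which is
  // what the report says ptxas rejects.
  __device__ __inline__ int twice(int x) { return 2 * x; }

  __global__ void kernel(int *out) { *out = twice(*out); }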
2012 Jul 11
2
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
Hello,
FYI, this is a bug http://llvm.org/bugs/show_bug.cgi?id=13324
When compiling the following code for sm_20, function params are for some
reason given .align 0, which is invalid. The problem does not occur when
compiling for sm_10.
> cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple =
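The IR above is truncated by the archive; a hedged CUDA-level sketch (not the
reporter's test case) of the construct in question, a device function whose
by-value parameter is declared in PTX with an alignment:

  // The aggregate parameter of a non-inlined device function is passed by
  // value; its PTX .param declaration carries an .align directive, which the
  // report says came out as the invalid ".align 0" for sm_20.
  struct Vec4 { float x, y, z, w; };

  __device__ __noinline__ float sum4(Vec4 v) {
    return v.x + v.y + v.z + v.w;
  }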
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
On Tue, Jun 12, 2012 at 6:11 PM, Dmitry N. Mikushin <maemarcus at gmail.com>wrote:
> Dear LLVM NVPTX maintainers,
>
> Just to have the issue recorded, I don't know how important it is:
>
> clang generates linkonce_odr out of __inline__, and NVPTX generates .weak
> out of linkonce_odr (how it happens - a big question, btw, because I can't
> find anything related
2016 Mar 05
2
instrumenting device code with gpucc
On Fri, Mar 4, 2016 at 5:50 PM, Yuanfeng Peng <yuanfeng.jack.peng at gmail.com>
wrote:
> Hi Jingyue,
>
> My name is Yuanfeng Peng, I'm a PhD student at UPenn. I'm sorry to bother
> you, but I'm having trouble with gpucc in my project, and I would be really
> grateful for your help!
>
> Currently we're trying to instrument CUDA code using LLVM 3.9, and
2013 Mar 22
2
[LLVMdev] UNREACHABLE executed! error while trying to generate PTX
Well, I tried the command line you gave, and I get the following error:
clang++ nbody.kernel.cu -Xclang -fcuda-is-device
-I/home/upitamba/llvm-3.2.src/tools/clang/test/SemaCUDA/ -Xclang -triple
-Xclang nvptx64 -Xclang -target-cpu -Xclang sm_10 -S
fatal error: error in backend: Cannot select: 0x334a870: v4f32 =
NVPTXISD::MoveParam 0x334a770 [ORD=1] [ID=22]
0x334a770: v4f32 =
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear Yuan,
Sorry for the delay with my reply.
The answers to your questions could differ, depending on where the math
library is placed in the code generation pipeline. In KernelGen, we currently
have a user-level CUDA math module, adapted from cicc internals [1]. It is
intended to be linked with the user's LLVM IR module, right before proceeding
with the final optimization and backend. Over the last few months we
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin,
I don't understand why, for instance, the X86 backend handles pow
automatically, while NVPTX should be a PITA that requires the user to bring
their own pow implementation. Even at a very general level, this limits
users' interest in the LLVM NVPTX backend. Could you please elaborate on the
rationale behind your point? Why, in your opinion, are the accuracy modes I
suggested not sufficient?
- D.
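For what it's worth, a hedged sketch (illustrative, not from the thread) of
the usage at stake: a kernel that calls pow, which CPU backends lower
automatically, while on NVPTX the question is who supplies the device
implementation (for example libdevice's __nv_pow):

  #include <math.h>

  __global__ void powers(const double *x, double *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
      y[i] = pow(x[i], 3.5);  // needs a device-side pow implementation
  }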
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
> The issue is really that there is no standard math library for PTX.
Well, formally, that could very well be true. Moreover, in some respects the
CPU math standard is impossible to achieve on parallel architectures;
consider, for example, errno behavior. But here we are speaking more about
the practical side. And the practical side is: for the past 5 years CUDA has
claimed to accelerate compute applications, and