Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] [NVPTX] powf intrinsic is unimplemented"
2013 Feb 07
5
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin, gentlemen,
I'm afraid I have to escalate this issue at this point. Since it was
first discussed last summer, it was sufficient for us for a
while to have lowering of math calls into intrinsics disabled at the DragonEgg
level, and to link them against CUDA math functions at the LLVM IR level. Now I
can say: this is no longer sufficient, and we need the NVPTX backend to
deal with
2013 Feb 09
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
The lack of an open-source vector math library (which is what you suggest
here) prompted me to start a project "vecmathlib", available at <
https://bitbucket.org/eschnett/vecmathlib>. This library provides almost
all math functions available in libm, implemented in a vectorised manner,
i.e. suitable for SSE2/AVX/MIC/PTX etc.
In its current state the library has rough edges, e.g.
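The kind of vectorisable math function a library like vecmathlib provides can be sketched roughly as follows — an illustrative branch-free polynomial kernel, not vecmathlib's actual code; the names `sin_poly` and `vsin` are made up for this example:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Branch-free polynomial approximation of sin(x), valid near [-1, 1].
// Taylor coefficients are used here for clarity; a real library would
// use minimax coefficients plus range reduction.
static float sin_poly(float x) {
  const float c3 = -1.0f / 6.0f;
  const float c5 =  1.0f / 120.0f;
  const float c7 = -1.0f / 5040.0f;
  float x2 = x * x;
  return x * (1.0f + x2 * (c3 + x2 * (c5 + x2 * c7)));
}

// Array form: a straight-line, branch-free loop like this is what an
// auto-vectoriser (SSE2/AVX/MIC) or a PTX backend can map onto SIMD or
// SIMT lanes.
void vsin(const float *in, float *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = sin_poly(in[i]);
}
```

Because the kernel contains no branches or table lookups, every lane executes the same instructions, which is exactly the property SSE2/AVX/PTX targets need.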
2013 Feb 08
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Yes, it helps a lot and we are working on it.
A few questions,
1) What will be your use model of this library? Will you run optimization phases after linking with the library? If so, what are they?
2) Do you care if the names of functions differ from those in libm? For example, it would be gpusin() instead of sin().
3) Do you need a different library for different host
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear Yuan,
Sorry for the delay in replying,
The answers to your questions could differ, depending on the math library's
placement in the code generation pipeline. At KernelGen, we currently have
a user-level CUDA math module, adopted from cicc internals [1]. It is
intended to be linked with the user LLVM IR module, right before proceeding
with the final optimization and backend. Last few months we
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
I would be very hesitant to expose all math library functions as
intrinsics.  I believe linking with a target-specific math library is the
correct approach, as it decouples the back end from the needs of the source
program/language.  Users should be free to use any math library
implementation they choose.  Intrinsics are meant for functions that
compile down to specific ISA features, like fused
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin,
I don't understand why, for instance, the X86 backend handles pow
automatically, while NVPTX should be a PITA requiring the user to bring his own
pow implementation. Even at a very general level, this limits users' interest
in the LLVM NVPTX backend. Could you please elaborate on the rationale
behind your point? Why are the accuracy modes I suggested not sufficient,
in your opinion?
- D.
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
The X86 back-end just calls into libm:
  // Always use a library call for pow.
  setOperationAction(ISD::FPOW             , MVT::f32  , Expand);
  setOperationAction(ISD::FPOW             , MVT::f64  , Expand);
  setOperationAction(ISD::FPOW             , MVT::f80  , Expand);
The issue is really that there is no standard math library for PTX.  I
agree that this is a pain for most users, but I
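For reference, "Expand" here means the SelectionDAG legalizer replaces the FPOW node with a call into the target's runtime math library — on X86, libm. A minimal sketch of what the expanded code amounts to (the helper names below are made up for illustration):

```cpp
#include <cassert>
#include <cmath>

// "Expand" for ISD::FPOW: the legalizer rewrites the node into a call to
// the standard libm routine for the value type in question.
float       pow_f32(float x, float y)             { return powf(x, y); } // MVT::f32
double      pow_f64(double x, double y)           { return pow(x, y);  } // MVT::f64
long double pow_f80(long double x, long double y) { return powl(x, y); } // MVT::f80
```

On PTX no such libm symbols exist to call into, which is why the same Expand action cannot simply be reused there.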
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
> The issue is really that there is no standard math library for PTX.
Well, formally, that could very well be true. Moreover, in some respects the CPU
math standard is impossible to satisfy on parallel architectures;
consider, for example, errno behavior. But here we are speaking more about
the practical side. And the practical side is: for the past 5 years CUDA has
claimed to accelerate compute applications, and
2013 Jun 05
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear all,
FWIW, I've tested libdevice.compute_20.10.bc and libdevice.compute_30.10.bc
from /cuda/nvvm/libdevice shipped with CUDA 5.5 preview. IR is compatible
with LLVM 3.4 trunk that we use. Results are correct; performance is almost
the same as what we had before with cicc-sniffed IR, or maybe <10% better.
Will test libdevice.compute_35.10.bc once we get K20 support.
Thanks for
2013 Jun 05
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Thanks for the info!  I would be glad to hear of any issues you have
encountered on this path.  I tried to make sure the 3.3 release was fully
compatible with the libdevice implementation shipping with 5.5 (and as far
as I know, it is).  It's just not an officially supported configuration.
Also, I've been meaning to address your -drvcuda issue.  How would you feel
about making that a part
2016 Jul 01
2
Missing TargetPrefix for NVVM intrinsics
Justins:
I noticed that the intrinsics in IntrinsicsNVVM don't specify a
TargetPrefix. This seems like a simple omission, so I was going to
simply throw a `let TargetPrefix = "nvvm"` block around them, but this
doesn't quite work.
There seem to be three prefixes that are used in this file. About 900
are int_nvvm_*, 30 are int_ptx_*, and 1 is int_cuda. It isn't clear to
me
2012 Apr 25
0
[LLVMdev] [PATCH][RFC] NVPTX Backend
On 4/24/2012 1:50 PM, Justin Holewinski wrote:
>
> Hi LLVMers,
>
> We at NVIDIA would like to contribute back to the LLVM open-source 
> community by up-streaming the NVPTX back-end for LLVM.  This back-end 
> is based on the sources used by NVIDIA, and currently provides 
> significantly more functionality than the current PTX back-end.  Some 
> functionality is currently
2018 Mar 23
2
cuda cross compiling issue for target aarch64-linux-androideabi
I was wondering if anyone has encountered this issue when cross compiling
cuda on Nvidia TX2 running android.
The error is
In file included from <built-in>:1:
In file included from
prebuilts/clang/host/linux-x86/clang-4667116/lib64/clang/7.0.1/include/__clang_cuda_runtime_wrapper.h:219:
../cuda/targets/aarch64-linux-androideabi/include/math_functions.hpp:3477:19:
error: no matching function
2018 Mar 23
0
cuda cross compiling issue for target aarch64-linux-androideabi
+Artem Belevich <tra at google.com>
On Fri, Mar 23, 2018 at 7:53 PM Bharath Bhoopalam via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I was wondering if anyone has encountered this issue when cross compiling
> cuda on Nvidia TX2 running android.
>
> The error is
> In file included from <built-in>:1:
> In file included from
>
2013 Mar 01
1
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
The identifier INT_PTX_SREG_TID_X is the name of an instruction as the
back-end sees it, and has very little to do with the name you should use in
your IR.  Your best bet is to look at the include/llvm/IR/IntrinsicsNVVM.td
file and see the definitions for each intrinsic.  Then, the name mapping is
just:
int_foo_bar -> llvm.foo.bar()
int_ prefix becomes llvm., and all underscores turn into
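That mapping is mechanical, and can be sketched as a small helper (the function name is hypothetical; this is not code from LLVM itself):

```cpp
#include <cassert>
#include <string>

// Sketch of the TableGen-record -> IR-intrinsic name mapping described
// above: the "int_" prefix becomes "llvm." and the remaining
// underscores become dots, e.g. int_nvvm_barrier0 -> llvm.nvvm.barrier0.
std::string intrinsicIRName(const std::string &tdName) {
  // Strip the "int_" prefix.
  std::string rest = tdName.substr(4);
  // Replace every remaining underscore with a dot.
  for (char &c : rest)
    if (c == '_') c = '.';
  return "llvm." + rest;
}
```

So to find the IR name of an intrinsic, look up its `int_*` record in include/llvm/IR/IntrinsicsNVVM.td and apply this rewrite.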
2012 Jun 30
0
[LLVMdev] [NVPTX] Backend failure in LegalizeDAG due to unimplemented expand in target lowering
Thanks for the insight, Eli,
So instead of setOperationAction(ISD::STORE, MVT::i1, Expand); one
should probably do setOperationAction(ISD::STORE, MVT::i1, Custom);
and implement it in NVPTXTargetLowering::LowerOperation.
But this issue makes a good point about code efficiency: I suspect
such an expansion will be very ugly in terms of performance. Probably we
can do much better if bool would use
2012 Jun 29
2
[LLVMdev] [NVPTX] Backend failure in LegalizeDAG due to unimplemented expand in target lowering
On Fri, Jun 29, 2012 at 2:11 PM, Dmitry N. Mikushin <maemarcus at gmail.com> wrote:
> Hi again,
>
> Kind people on #llvm helped me to utilize bugpoint to reduce the
> previously submitted test case. For the record, it could be done with the
> following command:
>
> $ bugpoint -llc-safe test.ll
>
> The resulting IR is attached, and it is crashing in the same way. Is
>
2012 Apr 24
4
[LLVMdev] [PATCH][RFC] NVPTX Backend
Hi LLVMers,
We at NVIDIA would like to contribute back to the LLVM open-source community by up-streaming the NVPTX back-end for LLVM.  This back-end is based on the sources used by NVIDIA, and currently provides significantly more functionality than the current PTX back-end.  Some functionality is currently disabled due to dependencies on LLVM core changes that we are also in the process of
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
In PTX, variables need to be defined before being referenced. The NVPTX backend emits global variables in the same order as in the LLVM IR and does not sort them. This is a bug in the NVPTX backend.
Thanks.
Yuan
From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com]
Sent: Wednesday, July 18, 2012 7:44 AM
To: LLVM-Dev
Cc: Justin Holewinski; Yuan Lin
Subject: [NVPTX] PTXAS - Unimplemented feature: labels as
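The fix this implies is to order globals so that every variable is emitted before any initializer that references it — a topological sort over initializer dependencies. A rough sketch under that assumption (names and structure are illustrative, not the actual NVPTX backend code; cycles are not handled, since valid initializers cannot be cyclic):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Minimal model of a module-level global: its name and the names of the
// globals its initializer references.
struct Global {
  std::string name;
  std::vector<std::string> refs;
};

// Depth-first visit: emit all dependencies of a global before the
// global itself.
static void visit(const std::string &name,
                  const std::map<std::string, Global> &all,
                  std::map<std::string, bool> &done,
                  std::vector<std::string> &order) {
  if (done[name]) return;
  done[name] = true;
  for (const auto &dep : all.at(name).refs)
    visit(dep, all, done, order);
  order.push_back(name);
}

// Returns an emission order in which every global precedes its users,
// regardless of the order the globals appear in the IR.
std::vector<std::string> emissionOrder(const std::vector<Global> &globals) {
  std::map<std::string, Global> all;
  for (const auto &g : globals) all[g.name] = g;
  std::map<std::string, bool> done;
  std::vector<std::string> order;
  for (const auto &g : globals)
    visit(g.name, all, done, order);
  return order;
}
```

With this ordering, a global whose initializer holds a label (e.g. a function-pointer table) is always emitted after the symbol it points at, which is what ptxas requires.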
2012 Jun 30
0
[LLVMdev] [NVPTX] Backend failure in LegalizeDAG due to unimplemented expand in target lowering
Hi Duncan,
> did you declare i1 to be an illegal type?
No. How?
2012/6/30 Duncan Sands <baldrick at free.fr>:
> Hi Dmitry,
>> So instead of setOperationAction(ISD::STORE, MVT::i1, Expand); one
>> should probably do setOperationAction(ISD::STORE, MVT::i1, Custom);
>> and implement it in NVPTXTargetLowering::LowerOperation.
>>
>> But this issue makes a