
Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] [NVPTX] Assertion `RegNo < NumRegs && "Attempting to access record for invalid register number!"' failed."

2012 Aug 02
2
[LLVMdev] [NVPTX] Strange assertion around BlockToChain.clear(); in Release+Asserts build
Hi, after building our project in release mode, we caught an assertion which we had not seen before: hello_f: /tmp/rpmbuild_debug/BUILD/llvm/build/include/llvm/ADT/DenseMap.h:126: void llvm::DenseMap<KeyT, ValueT, KeyInfoT>::clear() [with KeyT = llvm::MachineBasicBlock*, ValueT = <unnamed>::BlockChain*, KeyInfoT = llvm::DenseMapInfo<llvm::MachineBasicBlock*>]: Assertion
2012 Aug 03
0
[LLVMdev] [NVPTX] Strange assertion around BlockToChain.clear(); in Release+Asserts build
Dear NVPTX community, I've created a bug, http://llvm.org/bugs/show_bug.cgi?id=13521, with a repro case for this issue. Please help us fix it. For the last 1.5 months we have regularly encountered and worked around or fixed 1-2 bugs per week in the NVPTX backend. This is definitely more work than we can handle entirely on our own... We would really appreciate some collaboration. Thanks, - D.
2012 Aug 03
1
[LLVMdev] [NVPTX] Strange assertion around BlockToChain.clear(); in Release+Asserts build
Unfortunately, I cannot reproduce this. Based on your bugzilla comment, it does look like a miscompile with your system compiler. Does the same issue occur if you build LLVM as static libraries? On 08/03/2012 12:24 AM, Dmitry N. Mikushin wrote: > Dear NVPTX community, > > I've created a bug, http://llvm.org/bugs/show_bug.cgi?id=13521, with > a repro case for this issue. > >
2013 Apr 01
2
[LLVMdev] [NVPTX] launch_bounds support?
Dear all, Is anybody working on CUDA launch bounds support? At the PTX level, __attribute__((launch_bounds(MAX_THREADS_PER_BLOCK, MIN_BLOCKS_PER_MP))) should be emitted as the .maxntid / .minnctapersm specifications. Thanks, - D.
2012 May 16
2
[LLVMdev] NVPTX: __iAtomicCAS support ?
Dear colleagues, I'm looking into whether we can replace nvopencc with LLVM NVPTX in our project. It turns out NVPTX won't work with code that nvopencc can handle (please see the log below). So are atomic intrinsics not supported, or am I making the call in the wrong way? Thanks, - Dima. SOURCE ======== dmikushin at hp2:~> cat kernelgen_monitor.ll ; ModuleID =
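For reference, the operation in question can also be expressed directly in LLVM IR instead of through an nvopencc-style __iAtomicCAS call; whether the NVPTX backend of that time lowered it is exactly what this thread asks. A minimal sketch, assuming the LLVM 3.1/3.2-era cmpxchg syntax and a hypothetical function name:

  ; A 32-bit compare-and-swap in LLVM IR (LLVM 3.1/3.2-era syntax).
  ; If atomics are supported, NVPTX would lower this to a PTX atom.cas instruction.
  target triple = "nvptx64-nvidia-cuda"

  define i32 @cas_example(i32* %ptr, i32 %expected, i32 %desired) {
  entry:
    ; Atomically: if *%ptr == %expected, store %desired; the old value is returned.
    %old = cmpxchg i32* %ptr, i32 %expected, i32 %desired seq_cst
    ret i32 %old
  }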
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear Yuan, Sorry for the delay in replying. The answers to your questions could differ depending on where the math library is placed in the code generation pipeline. At KernelGen, we currently have a user-level CUDA math module adopted from cicc internals [1]. It is intended to be linked with the user's LLVM IR module right before the final optimization passes and the backend. Over the last few months we
2013 Feb 08
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Yes, it helps a lot and we are working on it. A few questions: 1) What will be your usage model for this library? Will you run optimization phases after linking with the library? If so, which ones? 2) Do you care if the names of the functions differ from those in libm? For example, it would be gpusin() instead of sin(). 3) Do you need a different library for different host
2012 May 16
0
[LLVMdev] NVPTX: __iAtomicCAS support ?
> -----Original Message----- > From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com] > Sent: Wednesday, May 16, 2012 5:44 AM > To: LLVM-Dev > Cc: Justin Holewinski > Subject: NVPTX: __iAtomicCAS support ? > > Dear colleagues, > > I'm looking into whether we can replace nvopencc with LLVM NVPTX in our project. > It turns out NVPTX won't work with the code nvopencc
2013 Apr 02
0
[LLVMdev] [NVPTX] launch_bounds support?
Yes, this is supported through metadata. An example usage of these annotations is given in the test/CodeGen/NVPTX/annotations.ll unit test. I'll try to remember to add this to the NVPTX documentation I'm putting together at http://llvm.org/docs/NVPTXUsage.html. On Mon, Apr 1, 2013 at 8:06 AM, Dmitry Mikushin <dmitry at kernelgen.org> wrote: > Dear all, > > Is anybody
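For concreteness, a minimal sketch of what such an annotated module might look like, in the metadata syntax of that era; the key names ("kernel", "maxntidx", "minctasm") are given from memory, and test/CodeGen/NVPTX/annotations.ll remains the authoritative example:

  ; Hypothetical kernel with launch-bounds annotations (LLVM 3.3-era metadata syntax).
  target triple = "nvptx64-nvidia-cuda"

  define void @my_kernel() {
    ret void
  }

  !nvvm.annotations = !{!0, !1, !2}
  ; Mark @my_kernel as a kernel entry point.
  !0 = metadata !{void ()* @my_kernel, metadata !"kernel", i32 1}
  ; Maximum threads per block in x -> emitted as the PTX .maxntid directive.
  !1 = metadata !{void ()* @my_kernel, metadata !"maxntidx", i32 256}
  ; Minimum resident CTAs per multiprocessor -> emitted as PTX .minnctapersm.
  !2 = metadata !{void ()* @my_kernel, metadata !"minctasm", i32 2}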
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
I would be very hesitant to expose all math library functions as intrinsics. I believe linking with a target-specific math library is the correct approach, as it decouples the back end from the needs of the source program/language. Users should be free to use any math library implementation they choose. Intrinsics are meant for functions that compile down to specific ISA features, like fused
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin, I don't understand why, for instance, the X86 backend handles pow automatically, while NVPTX should be a PITA that requires the user to bring their own pow implementation. Even at a very general level, this limits users' interest in the LLVM NVPTX backend. Could you please elaborate on the rationale behind your point? Why, in your opinion, are the accuracy modes I suggested not sufficient? - D.
2013 Jun 05
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear all, FWIW, I've tested libdevice.compute_20.10.bc and libdevice.compute_30.10.bc from /cuda/nvvm/libdevice shipped with the CUDA 5.5 preview. The IR is compatible with the LLVM 3.4 trunk that we use. Results are correct, and performance is almost the same as what we had before with cicc-sniffed IR, or maybe <10% better. We will test libdevice.compute_35.10.bc once we get K20 support. Thanks for
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
The X86 back-end just calls into libm:

  // Always use a library call for pow.
  setOperationAction(ISD::FPOW, MVT::f32, Expand);
  setOperationAction(ISD::FPOW, MVT::f64, Expand);
  setOperationAction(ISD::FPOW, MVT::f80, Expand);

The issue is really that there is no standard math library for PTX. I agree that this is a pain for most users, but I
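To illustrate what Expand buys the X86 target, here is a minimal sketch at the IR level (the function name is hypothetical): the pow intrinsic below becomes a libcall that libm satisfies at link time, whereas PTX has no analogous standard library for NVPTX to expand into.

  declare float @llvm.pow.f32(float, float)

  define float @pow_example(float %x, float %y) {
  entry:
    ; On X86 this lowers to a call to powf() in libm; on NVPTX there is no
    ; standard libm equivalent to expand to, so the call has to be resolved
    ; by linking a GPU math library at the IR level instead.
    %r = call float @llvm.pow.f32(float %x, float %y)
    ret float %r
  }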
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
> The issue is really that there is no standard math library for PTX. Well, formally, that could very well be true. Moreover, in some respects the CPU math standard is impossible to meet on parallel architectures; consider, for example, errno behavior. But here we are speaking more about the practical side. And the practical side is this: for the past 5 years CUDA has claimed to accelerate compute applications, and
2013 Jun 05
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Thanks for the info! I would be glad to hear of any issues you have encountered on this path. I tried to make sure the 3.3 release was fully compatible with the libdevice implementation shipping with 5.5 (and as far as I know, it is). It's just not an officially supported configuration. Also, I've been meaning to address your -drvcuda issue. How would you feel about making that a part
2012 Jul 18
2
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
Dear NVPTX community, PTXAS fails to compile the PTX code generated by NVPTX. Is this an issue in the backend, an issue in PTXAS, or a known, reasonable restriction? Thanks, - Dima. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" %struct.__st_parameter_dt.0.4
2013 Feb 07
5
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin, gentlemen, I'm afraid I have to escalate this issue at this point. Since it was first discussed last summer, it was sufficient for us for a while to have the lowering of math calls into intrinsics disabled at the DragonEgg level, and to link them against CUDA math functions at the LLVM IR level. Now I can say this is no longer sufficient, and we need the NVPTX backend to deal with
2013 Feb 09
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
The lack of an open-source vector math library (which is what you suggest here) prompted me to start a project, "vecmathlib", available at <https://bitbucket.org/eschnett/vecmathlib>. This library provides almost all math functions available in libm, implemented in a vectorised manner, i.e. suitable for SSE2/AVX/MIC/PTX etc. In its current state the library has rough edges, e.g.
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
In PTX, variables need to be defined before they are referenced. NVPTX emits global variables in the same order as in the LLVM IR and does not sort them. It is a bug in the NVPTX backend. Thanks. Yuan From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com] Sent: Wednesday, July 18, 2012 7:44 AM To: LLVM-Dev Cc: Justin Holewinski; Yuan Lin Subject: [NVPTX] PTXAS - Unimplemented feature: labels as
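A minimal hypothetical sketch of the ordering problem described above (not the original test.ll): the first global's initializer refers to a symbol that is only defined later in the module, so PTX emitted in IR order contains a forward reference.

  target triple = "nvptx64-nvidia-cuda"

  ; @ptr is emitted first, but its initializer refers to @value, which is
  ; only defined below; without sorting the globals, the generated PTX
  ; references @value before its definition, which ptxas of that era rejected.
  @ptr = global i32* @value
  @value = global i32 42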
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
OK, as I said, the most precise way to figure out what's wrong is to emit LLVM IR first (use clang -emit-llvm ...) and check how it differs from working examples, for instance the NVPTX regression tests. ----- Original message ----- > I'm building this with llvm-c, and accessing these intrinsics via calling > the intrinsic as if it were a function. > > class F_SREG<string