Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all"
2013 Feb 09
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
The lack of an open-source vector math library (which is what you suggest
here) prompted me to start a project "vecmathlib", available at <
https://bitbucket.org/eschnett/vecmathlib>. This library provides almost
all math functions available in libm, implemented in a vectorised manner,
i.e. suitable for SSE2/AVX/MIC/PTX etc.
In its current state the library has rough edges, e.g.
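For illustration only, a tiny sketch of what "vectorised" means in this context; the vec4d type and vsin name are mine, not vecmathlib's actual API, and the body is a scalar fallback rather than real SSE2/AVX intrinsics:

// Hypothetical illustration, not the vecmathlib API: a vectorised sin()
// processes several lanes per call instead of one scalar at a time.
#include <cmath>
#include <cstddef>

struct vec4d { double lane[4]; };      // stand-in for an SSE2/AVX register

vec4d vsin(vec4d x) {
  vec4d r;
  for (std::size_t i = 0; i < 4; ++i)  // a real library would use intrinsics
    r.lane[i] = std::sin(x.lane[i]);
  return r;
}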
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear Yuan,
Sorry for the delay in replying.
The answers to your questions could differ, depending on where the math
library is placed in the code generation pipeline. At KernelGen, we currently have
a user-level CUDA math module, adopted from cicc internals [1]. It is
intended to be linked with the user LLVM IR module, right before proceeding
with the final optimization and backend. Last few months we
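For context, the linking step described above might look roughly like this with the current LLVM C++ API; the cuda_math.bc path is a placeholder, and KernelGen's actual pipeline may differ:

#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Linker/Linker.h"
#include "llvm/Support/SourceMgr.h"

// Link a prebuilt math bitcode module into the user's module right before
// the final optimizations and the backend run.
bool linkMathModule(llvm::Module &UserM, llvm::LLVMContext &Ctx) {
  llvm::SMDiagnostic Err;
  std::unique_ptr<llvm::Module> MathM =
      llvm::parseIRFile("cuda_math.bc", Err, Ctx);  // placeholder path
  if (!MathM)
    return false;
  // Linker::linkModules returns true on error.
  return !llvm::Linker::linkModules(UserM, std::move(MathM));
}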
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Hi Justin,
I don't understand why, for instance, the X86 backend handles pow
automatically, while NVPTX should be a pain requiring the user to bring their
own pow implementation. Even at a very general level, this limits users'
interest in the LLVM NVPTX backend. Could you please elaborate on the
rationale behind your point? Why are the accuracy modes I suggested not
sufficient, in your opinion?
- D.
2013 Feb 17
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
> The issue is really that there is no standard math library for PTX.
Well, formally, that could very well be true. Moreover, in some respects the
CPU math standard is impossible to satisfy on parallel architectures;
consider, for example, errno behavior. But here we are speaking more about
the practical side. And the practical side is this: for the past 5 years CUDA
has claimed to accelerate compute applications, and
2013 Feb 08
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Yes, it helps a lot, and we are working on it.
A few questions:
1) What will your usage model for this library be? Will you run optimization phases after linking with the library? If so, which ones?
2) Do you care if the names of the functions differ from those in libm? For example, it would be gpusin() instead of sin().
3) Do you need a different library for different host
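If the answer to 2) is that different names are acceptable, a small rewrite over the IR can redirect the calls before linking. A hedged sketch; the gpusin name is taken from the question above, while the helper itself is hypothetical and assumes identical signatures:

#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"

// Redirect every use of the libm name to the GPU library name,
// e.g. sin() -> gpusin().
void renameMathCall(llvm::Module &M, llvm::StringRef LibmName,
                    llvm::StringRef GpuName) {
  if (llvm::Function *Old = M.getFunction(LibmName)) {
    llvm::FunctionCallee New =
        M.getOrInsertFunction(GpuName, Old->getFunctionType());
    Old->replaceAllUsesWith(New.getCallee());
  }
}

// Usage: renameMathCall(M, "sin", "gpusin");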
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
I would be very hesitant to expose all math library functions as
intrinsics. I believe linking with a target-specific math library is the
correct approach, as it decouples the back end from the needs of the source
program/language. Users should be free to use any math library
implementation they choose. Intrinsics are meant for functions that
compile down to specific ISA features, like fused
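The "fused" example presumably refers to fused multiply-add, which does collapse to a single instruction, whereas something like pow stays an ordinary external call for the linked math library to resolve. A hedged C++ IRBuilder sketch contrasting the two (the function and its arguments are mine):

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Module.h"

// fma maps onto one ISA feature, so an intrinsic fits; pow is left as a
// plain call that whatever math library gets linked in must provide.
llvm::Value *emitFmaThenPow(llvm::IRBuilder<> &B, llvm::Module &M,
                            llvm::Value *X, llvm::Value *Y, llvm::Value *Z) {
  llvm::Function *Fma = llvm::Intrinsic::getDeclaration(
      &M, llvm::Intrinsic::fma, {B.getFloatTy()});
  llvm::Value *F = B.CreateCall(Fma, {X, Y, Z});        // llvm.fma.f32
  llvm::FunctionCallee Pow = M.getOrInsertFunction(
      "powf", B.getFloatTy(), B.getFloatTy(), B.getFloatTy());
  return B.CreateCall(Pow, {F, Z});                     // ordinary libcall
}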
2013 Feb 17
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
The X86 back-end just calls into libm:
// Always use a library call for pow.
setOperationAction(ISD::FPOW , MVT::f32 , Expand);
setOperationAction(ISD::FPOW , MVT::f64 , Expand);
setOperationAction(ISD::FPOW , MVT::f80 , Expand);
The issue is really that there is no standard math library for PTX. I
agree that this is a pain for most users, but I
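For comparison, a device back end that also wants FPOW expanded to a call could at least point the resulting libcall at a device routine. A hypothetical fragment for a TargetLowering constructor; the __nv_powf / __nv_pow names are an assumption here, and the device library still has to be linked in separately:

// Hypothetical: keep FPOW as a libcall, but name a device library routine
// so a later link against that library can resolve it.
setOperationAction(ISD::FPOW, MVT::f32, Expand);
setOperationAction(ISD::FPOW, MVT::f64, Expand);
setLibcallName(RTLIB::POW_F32, "__nv_powf");
setLibcallName(RTLIB::POW_F64, "__nv_pow");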
2013 Jun 05
0
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Dear all,
FWIW, I've tested libdevice.compute_20.10.bc and libdevice.compute_30.10.bc
from /cuda/nvvm/libdevice shipped with the CUDA 5.5 preview. The IR is
compatible with the LLVM 3.4 trunk that we use. Results are correct, and
performance is almost the same as what we had before with cicc-sniffed IR, or
maybe <10% better. We will test libdevice.compute_35.10.bc once we get K20 support.
Thanks for
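For reference, the file naming above suggests a small helper to pick the right variant per SM version; this is only a sketch and simplifies the real mapping (later CUDA releases fold several SM versions onto each libdevice file):

#include <string>

// Pick the libdevice variant for a given SM version, following the
// CUDA 5.5 layout under /cuda/nvvm/libdevice mentioned above.
std::string libdevicePath(unsigned SmMajor, unsigned SmMinor) {
  const std::string Dir = "/cuda/nvvm/libdevice/";
  if (SmMajor == 3 && SmMinor >= 5)
    return Dir + "libdevice.compute_35.10.bc";
  if (SmMajor >= 3)
    return Dir + "libdevice.compute_30.10.bc";
  return Dir + "libdevice.compute_20.10.bc";
}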
2013 Jun 05
2
[LLVMdev] [NVPTX] We need an LLVM CUDA math library, after all
Thanks for the info! I would be glad to hear of any issues you have
encountered on this path. I tried to make sure the 3.3 release was fully
compatible with the libdevice implementation shipping with 5.5 (and as far
as I know, it is). It's just not an officially supported configuration.
Also, I've been meaning to address your -drvcuda issue. How would you feel
about making that a part
2012 Sep 06
1
[LLVMdev] [NVPTX] powf intrinsic is unimplemented
Dear all,
During app compilation we have a crash in NVPTX backend:
LLVM ERROR: Cannot select: 0x732b270: i64 = ExternalSymbol'__powisf2' [ID=18]
As I understand it, LLVM tries to lower the following call
%28 = call ptx_device float @llvm.powi.f32(float 2.000000e+00, i32 %8)
nounwind readonly
to a device intrinsic. The table llvm/IntrinsicsNVVM.td does not contain
such an intrinsic; however, it
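One possible workaround until the intrinsic is handled is to rewrite llvm.powi calls into ordinary pow calls before instruction selection. A hedged IRBuilder sketch; powf here stands in for whatever routine the linked device math library actually provides:

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"

// Replace "call float @llvm.powi.f32(float %x, i32 %n)" with a call to a
// plain powf(float, float), converting the integer exponent first.
void lowerPowiCall(llvm::CallInst *CI) {
  llvm::Module *M = CI->getModule();
  llvm::IRBuilder<> B(CI);
  llvm::Value *X = CI->getArgOperand(0);
  llvm::Value *N = B.CreateSIToFP(CI->getArgOperand(1), B.getFloatTy());
  llvm::FunctionCallee PowF = M->getOrInsertFunction(
      "powf", B.getFloatTy(), B.getFloatTy(), B.getFloatTy());
  CI->replaceAllUsesWith(B.CreateCall(PowF, {X, N}));
  CI->eraseFromParent();
}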
2016 Jul 01
2
Missing TargetPrefix for NVVM intrinsics
Justins:
I noticed that the intrinsics in IntrinsicsNVVM don't specify a
TargetPrefix. This seems like a simple omission, so I was going to
simply throw a `let TargetPrefix = "nvvm" ` block around them, but this
doesn't quite work.
There seem to be three prefixes that are used in this file. About 900
are int_nvvm_*, 30 are int_ptx_*, and 1 is int_cuda. It isn't clear to
me
2013 Apr 01
2
[LLVMdev] [NVPTX] launch_bounds support?
Dear all,
Is anybody working on CUDA launch bounds support?
At the PTX level, __attribute__((launch_bounds(MAX_THREADS_PER_BLOCK,
MIN_BLOCKS_PER_MP))) should be emitted as a .maxntid / .minnctapersm
specification.
Thanks,
- D.
2016 Oct 14
2
LLVM/CLANG: CUDA compilation fail for inline assembly code
Hi,
I am sorry for sending this query again here, but maybe I sent it to the
wrong list yesterday.
I am trying to compile LonestarGPU-rev2.0
<http://iss.ices.utexas.edu/?p=projects/galois/lonestargpu/download>
benchmark suite with LLVM/CLANG.
This suite has the following piece of code (more info here
2013 Mar 01
4
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
I'm building this with llvm-c, and accessing these intrinsics via calling
the intrinsic as if it were a function.
class F_SREG<string OpStr, NVPTXRegClass regclassOut, Intrinsic IntOp> :
              NVPTXInst<(outs regclassOut:$dst), (ins),
                        OpStr,
                        [(set regclassOut:$dst, (IntOp))]>;
def INT_PTX_SREG_TID_X : F_SREG<"mov.u32 \t$dst, %tid.x;",
2012 May 01
2
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
> -----Original Message-----
> From: Dan Bailey [mailto:dan at dneg.com]
> Sent: Sunday, April 29, 2012 8:46 AM
> To: Justin Holewinski
> Cc: Jim Grosbach; llvm-commits at cs.uiuc.edu; Vinod Grover;
> llvmdev at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH][RFC] NVPTX Backend
>
> Justin,
>
> Firstly, this is great! It seems to be so much further forward in
2013 Mar 01
1
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
The identifier INT_PTX_SREG_TID_X is the name of an instruction as the
back-end sees it, and has very little to do with the name you should use in
your IR. Your best bet is to look at the include/llvm/IR/IntrinsicsNVVM.td
file and see the definitions for each intrinsic. Then, the name mapping is
just:
int_foo_bar -> llvm.foo.bar()
int_ prefix becomes llvm., and all underscores turn into
2013 Apr 02
0
[LLVMdev] [NVPTX] launch_bounds support?
Yes, this is supported through metadata. An example usage of these
annotations is given in the test/CodeGen/NVPTX/annotations.ll unit test.
I'll try to remember to add this to the NVPTX documentation I'm putting
together at http://llvm.org/docs/NVPTXUsage.html.
On Mon, Apr 1, 2013 at 8:06 AM, Dmitry Mikushin <dmitry at kernelgen.org>wrote:
> Dear all,
>
> Is anybody
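A hedged sketch of emitting those annotations from the C++ API; the "maxntidx" / "minctasm" names follow my reading of the NVPTX convention, so double-check them against annotations.ll:

#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Metadata.h"
#include "llvm/IR/Module.h"

// Attach launch_bounds-style limits to a kernel via !nvvm.annotations.
void addLaunchBounds(llvm::Module &M, llvm::Function *Kernel,
                     unsigned MaxThreads, unsigned MinBlocks) {
  llvm::LLVMContext &Ctx = M.getContext();
  llvm::Type *I32 = llvm::Type::getInt32Ty(Ctx);
  llvm::NamedMDNode *Annots = M.getOrInsertNamedMetadata("nvvm.annotations");
  auto Add = [&](const char *Name, unsigned Val) {
    llvm::Metadata *Ops[] = {
        llvm::ValueAsMetadata::get(Kernel),
        llvm::MDString::get(Ctx, Name),
        llvm::ConstantAsMetadata::get(llvm::ConstantInt::get(I32, Val))};
    Annots->addOperand(llvm::MDNode::get(Ctx, Ops));
  };
  Add("maxntidx", MaxThreads);  // -> .maxntid
  Add("minctasm", MinBlocks);   // -> .minnctapersm
}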
2012 May 02
0
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
Justin Holewinski wrote:
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
Hi Timothy,
I'm not sure what you mean by this working for other intrinsics, but
in this case, I think you want the intrinsic name
llvm.nvvm.read.ptx.sreg.tid.x.
For me, this looks like:
%x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
Pete
On Fri, Mar 1, 2013 at 11:51 AM, Timothy Baldridge <tbaldridge at gmail.com> wrote:
> I'm building this with llvm-c, and accessing these
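For completeness, the same call generated from the C++ API rather than textual IR; a minimal sketch assuming a recent LLVM where the NVPTX intrinsic enums live in IntrinsicsNVPTX.h (the C API has equivalent calls):

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/IntrinsicsNVPTX.h"
#include "llvm/IR/Module.h"

// Emit "%x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()".
llvm::Value *emitTidX(llvm::IRBuilder<> &B, llvm::Module &M) {
  llvm::Function *TidX = llvm::Intrinsic::getDeclaration(
      &M, llvm::Intrinsic::nvvm_read_ptx_sreg_tid_x);
  return B.CreateCall(TidX, {}, "x");
}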
2012 Apr 27
2
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
Thanks for the feedback!
The attached patch addresses the style issues that have been found.
From: Jim Grosbach [mailto:grosbach at apple.com]
Sent: Wednesday, April 25, 2012 2:22 PM
To: Justin Holewinski
Cc: llvm-commits at cs.uiuc.edu; llvmdev at cs.uiuc.edu; Vinod Grover
Subject: Re: [llvm-commits] [PATCH][RFC] NVPTX Backend
Hi Justin,
Cool stuff, to be sure. Excited to see this.
As a