Displaying 20 results from an estimated 9000 matches similar to: "[LLVMdev] [PATCH][RFC] NVPTX Backend"
2012 Apr 25
0
[LLVMdev] [PATCH][RFC] NVPTX Backend
On 4/24/2012 1:50 PM, Justin Holewinski wrote:
>
> Hi LLVMers,
>
> We at NVIDIA would like to contribute back to the LLVM open-source
> community by up-streaming the NVPTX back-end for LLVM. This back-end
> is based on the sources used by NVIDIA, and currently provides
> significantly more functionality than the current PTX back-end. Some
> functionality is currently
2012 Apr 25
0
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
Hi Justin,
Cool stuff, to be sure. Excited to see this.
As a precursor to more involved technical feedback, I suggest going through and fixing up the coding style and formatting issues. Just glancing through, I see lots of things like function names starting with capital letters, compound statements with the opening brace on the line following an if/for/while/etc., single-statements after an
2012 Apr 27
2
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
Thanks for the feedback!
The attached patch addresses the style issues that have been found.
From: Jim Grosbach [mailto:grosbach at apple.com]
Sent: Wednesday, April 25, 2012 2:22 PM
To: Justin Holewinski
Cc: llvm-commits at cs.uiuc.edu; llvmdev at cs.uiuc.edu; Vinod Grover
Subject: Re: [llvm-commits] [PATCH][RFC] NVPTX Backend
Hi Justin,
Cool stuff, to be sure. Excited to see this.
As a
2014 Apr 21
2
[LLVMdev] [NVPTX] Eliminate common sub-expressions in a group of similar GEPs
Hi Hal,
Thanks for your comments! I'm inlining my responses below.
Jingyue
On Sat, Apr 19, 2014 at 6:38 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Jingyue,
>
> I can't speak for the NVPTX backend, but I think this looks useful as an
> (optional) target-independent pass. A few thoughts:
>
> - Running GVN tends to be pretty expensive; have you tried EarlyCSE
2014 Apr 19
4
[LLVMdev] [NVPTX] Eliminate common sub-expressions in a group of similar GEPs
Hi,
We wrote an optimization that eliminates common sub-expressions in a group
of similar GEPs for the NVPTX backend. It speeds up some of our benchmarks
by up to 20%, which convinces us to try to upstream it. Here's a brief
description of why we wrote this optimization, what we did, and how we did
it.
Loops in CUDA programs are often extensively unrolled by programmers and
compilers,
2012 Apr 29
0
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
Justin,
Firstly, this is great! It seems to be so much further forward in terms
of features
2013 Feb 04
1
[LLVMdev] Problem with PTX assembly printing (NVPTX backend)
Hi Nikolaos,
The following commands work great for me.
$ clang -S -emit-llvm -target nvptx -x cl -include clc/clctypes.h ../data-types/scalar.cl
$ llc -mcpu=sm_30 scalar.s
You can follow Justin's blog [1]. It helped me a lot to understand where to
start.
[1] http://jholewinski.org/blog/llvm-3-0-ptx-backend/
Best,
Ankur
On Mon, Feb 4, 2013 at 11:40 PM, Justin Holewinski <
justin.holewinski
2012 May 07
0
[LLVMdev] NVPTX annotation metadata emission
This new metadata format is currently optional. The old ptx_kernel calling convention should still work.
The only thing you should have to change when converting from PTX -> NVPTX is the address space map. The calling conventions and intrinsics should be compatible with both.
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
2012 May 01
2
[LLVMdev] [llvm-commits] [PATCH][RFC] NVPTX Backend
> -----Original Message-----
> From: Dan Bailey [mailto:dan at dneg.com]
> Sent: Sunday, April 29, 2012 8:46 AM
> To: Justin Holewinski
> Cc: Jim Grosbach; llvm-commits at cs.uiuc.edu; Vinod Grover;
> llvmdev at cs.uiuc.edu
> Subject: Re: [llvm-commits] [PATCH][RFC] NVPTX Backend
>
> Justin,
>
> Firstly, this is great! It seems to be so much further forward in
2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
Dear LLVM NVPTX maintainers,
Just to have the issue recorded, I don't know how important it is:
clang generates linkonce_odr out of __inline__, and NVPTX generates .weak
out of linkonce_odr (how it happens - a big question, btw, because I can't
find anything related in NVPTX asm printer - does it chain to some other
printer?), and finally ptxas (both 4.2 and 5) fails to compile it to
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
On Tue, Jun 12, 2012 at 6:11 PM, Dmitry N. Mikushin <maemarcus at gmail.com> wrote:
> Dear LLVM NVPTX maintainers,
>
> Just to have the issue recorded, I don't know how important it is:
>
> clang generates linkonce_odr out of __inline__, and NVPTX generates .weak
> out of linkonce_odr (how it happens - a big question, btw, because I can't
> find anything related
2013 Feb 04
0
[LLVMdev] Problem with PTX assembly printing (NVPTX backend)
On Mon, Feb 4, 2013 at 1:09 PM, <nkavv at physics.auth.gr> wrote:
> Hi Justin,
>
>
> Has anyone had similar problems with the NVPTX backend? Shouldn't this
>>> code be linked to the AsmPrinter library for NVPTX (already)?
>>>
>>
>> What do you mean by "doesn't work"? The AsmPrinter library really houses
>> the MCInst
2012 May 07
2
[LLVMdev] NVPTX annotation metadata emission
Hi everybody,
I have noticed that the new NVPTX backend requires new metadata to identify
the kernels in the module:
define void @metadata_kernel(float* %a) {
ret void
}
!nvvm.annotations = !{!1}
!1 = metadata !{void (float*)* @metadata_kernel, metadata !"kernel", i32 1}
Is clang going to support the emission of this metadata soon? Or do I have
to write it on my own? :)
Thanks,
2015 Aug 21
2
[CUDA/NVPTX] is inlining __syncthreads allowed?
I'm using 7.0. I am attaching the reduced example.
nvcc sync.cu -arch=sm_35 -ptx
gives
// .globl _Z3foov
.visible .entry _Z3foov(
)
{
.reg .pred %p<2>;
.reg .s32 %r<3>;
mov.u32 %r1, %tid.x;
and.b32 %r2, %r1, 1;
setp.eq.b32 %p1, %r2, 1;
@!%p1 bra BB7_2;
bra.uni
2013 Feb 04
3
[LLVMdev] Problem with PTX assembly printing (NVPTX backend)
Hi Justin,
>> Has anyone had similar problems with the NVPTX backend? Shouldn't this
>> code be linked to the AsmPrinter library for NVPTX (already)?
>
> What do you mean by "doesn't work"? The AsmPrinter library really houses
> the MCInst printer, which isn't implemented for NVPTX yet. The older
> assembly printer works just fine. This is
2012 Jul 18
2
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
Dear NVPTX community,
PTXAS fails to compile the PTX code generated by NVPTX. Is this an issue in
the backend, an issue in PTXAS, or a known, reasonable restriction?
Thanks,
- Dima.
> cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"
%struct.__st_parameter_dt.0.4
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
In PTX, variables must be defined before they are referenced. NVPTX emits global variables in the same order as they appear in the LLVM IR and does not sort them. This is a bug in the NVPTX backend.
Thanks.
Yuan
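The ordering requirement described above can be illustrated with a small sketch. It is not the NVPTX backend's code; it only shows the general idea of emitting each global after the globals its initializer references. The function name `emission_order`, the `refs` mapping, and the sample global names are hypothetical, and the sketch assumes every referenced global also appears in the IR-order list.

```python
def emission_order(ir_order, refs):
    """Order globals so each one follows the globals its initializer references.

    ir_order: global names in the order they appear in the IR (hypothetical input).
    refs: maps a global name to the names its initializer refers to.
    """
    emitted = []
    state = {}  # name -> "visiting" or "done"

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            # Mutually referencing initializers cannot be ordered this way.
            raise ValueError(f"cyclic initializer reference involving {name!r}")
        state[name] = "visiting"
        for dep in refs.get(name, ()):
            visit(dep)  # emit dependencies first
        state[name] = "done"
        emitted.append(name)

    # Walking in IR order keeps unrelated globals in their original order.
    for name in ir_order:
        visit(name)
    return emitted


# "table" is initialized with the addresses of "a" and "b", so it must come last.
print(emission_order(["table", "a", "b"], {"table": ["a", "b"]}))
```

A depth-first walk is used here because it leaves globals without dependencies in their IR order. Globals whose initializers reference each other cyclically would need separate handling, which this sketch only reports as an error.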
From: Dmitry N. Mikushin [mailto:maemarcus at gmail.com]
Sent: Wednesday, July 18, 2012 7:44 AM
To: LLVM-Dev
Cc: Justin Holewinski; Yuan Lin
Subject: [NVPTX] PTXAS - Unimplemented feature: labels as
2013 Apr 01
2
[LLVMdev] [NVPTX] launch_bounds support?
Dear all,
Is anybody working on CUDA launch bounds support?
At the PTX level, __attribute__((launch_bounds(MAX_THREADS_PER_BLOCK,
MIN_BLOCKS_PER_MP))) should be emitted as the .maxntid / .minnctapersm
directives.
Thanks,
- D.
2013 Jun 07
2
[LLVMdev] How to generate constant memory for ptx code by NVPTX?
Hello,
I am working on compiling an OpenCL kernel to PTX code with clang, NVPTX, and
libclc.
I have a kernel that contains a constant variable declared at file scope, like
this:
constant float one_f = 1.0f;
__kernel void test( ...){ ... }
It is then compiled to LLVM IR:
@one_f = addrspace(4) constant float 1.000000e+00, align 4
define void @test(...){ ... }
Finally ptx:
.visible .global .align 4 .f32
2012 Nov 09
3
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
Hi Dmitry,
> You're right, global variables use preferred alignment. And - yes,
> preferred alignment in this case is bigger: 8 instead of 4. NVIDIA's
> prop. compiler gives 4. However, since CUDA 5.0 ptx modules are
> linkable with each other, I think alignments for externally visible
> functions and data should all follow ABI rules.
giving it an alignment of 8 does