Displaying 16 results from an estimated 16 matches for "cubin".
2017 Jun 09
1
NVPTX Back-end: relocatable device code support for dynamic parallelism
...t; "/tmp/tmpxft_00007040_00000000-13_cuda_id_test.cpp3.i" -o "/tmp/tmpxft_00007040_00000000-6_cuda_id_test.ptx"
#$ ptxas -arch=sm_35 -m64 --compile-only "/tmp/tmpxft_00007040_00000000-6_cuda_id_test.ptx" -o "/tmp/tmpxft_00007040_00000000-14_cuda_id_test.sm_35.cubin"
#$ fatbinary --create="/tmp/tmpxft_00007040_00000000-2_cuda_id_test.fatbin" -64 --cmdline="--compile-only " "--image=profile=sm_35,file=/tmp/tmpxft_00007040_00000000-14_cuda_id_test.sm_35.cubin" "--image=profile=compute_35,file=/tmp/tmpxft_00007040_0000000...
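The ptxas step in the log above can be sketched as an argument list. This is a minimal Python illustration that only builds the command and does not run it. The flags are the ones visible in the log. The helper name and the file names are hypothetical.

```python
import shlex


def ptxas_compile_only_argv(ptx_path, cubin_path, arch="sm_35"):
    """Build (but do not run) a ptxas command like the one in the log.

    The flags -arch, -m64, --compile-only and -o are copied from the log;
    the function name and default are assumptions for illustration.
    """
    return ["ptxas", f"-arch={arch}", "-m64", "--compile-only",
            ptx_path, "-o", cubin_path]


# Hypothetical file names, not the truncated /tmp paths from the log.
argv = ptxas_compile_only_argv("kernel.ptx", "kernel.sm_35.cubin")
print(shlex.join(argv))
```

Passing the list to subprocess.run(argv, check=True) would execute it, assuming ptxas is on PATH.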
2012 Jul 18
2
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
...isible .global .align 4096 .u64 a = _2E_cst12;
.visible .global .align 4096 .b8 _2E_cst12[3] = {97, 97, 0};
.visible .global .align 4096 .b8 _2E_cst2[26] = {85, 115, 97, 103, 101, 58,
32, 37, 115, 32, 60, 110, 120, 62, 32, 60, 110, 121, 62, 32, 60, 110, 122,
62, 10, 0};
> ptxas test.ptx -o test.cubin
ptxas test.ptx, line 10; error : Unimplemented feature: labels as initial
values
ptxas test.ptx, line 12; error : Unimplemented feature: labels as initial
values
ptxas test.ptx, line 10; error : Label expected for forward reference of
'_2E_cst1'
ptxas test.ptx, line 12; error : Labe...
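The .b8 initializers in the snippet above are NUL-terminated ASCII strings emitted as byte lists. A small Python sketch decodes the two arrays shown. The helper name is hypothetical, and the byte values are copied from the snippet.

```python
def c_string(raw: bytes) -> str:
    """Decode a NUL-terminated ASCII byte array, dropping the terminator."""
    return raw.split(b"\0", 1)[0].decode("ascii")


# Byte values copied from the _2E_cst12 and _2E_cst2 initializers above.
cst12 = bytes([97, 97, 0])
cst2 = bytes([85, 115, 97, 103, 101, 58, 32, 37, 115, 32, 60, 110, 120, 62,
              32, 60, 110, 121, 62, 32, 60, 110, 122, 62, 10, 0])

print(repr(c_string(cst12)))  # 'aa'
print(repr(c_string(cst2)))   # 'Usage: %s <nx> <ny> <nz>\n'
```

The lengths match the declared array sizes (3 and 26 bytes), so the global `a` in the snippet is initialized with the address of one of these string constants.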
2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
...is:
clang generates linkonce_odr out of __inline__, and NVPTX generates .weak
out of linkonce_odr (how it happens - a big question, btw, because I can't
find anything related in NVPTX asm printer - does it chain to some other
printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin.
Below is the test case:
> cat test3.cu
__inline__ __attribute__((device)) __attribute__((used)) void test()
{
return;
}
> clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device
test3.cu -o test3.ll
> cat test3.ll
; ModuleID = 'test3.cu'
target datalayout = "...
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...%r0;
// inline asm
.reg .pred %p1;
.reg .pred %p2;
setp.ne.u32 %p1, %r0, 0;
vote.any.pred %p2, %p1;
selp.s32 %r0, 1, 0, %p2;
// inline asm
st.global.u32 [%SP+4], %r0;
mov.b32 func_retval0, %r0;
ret;
}
> ptxas test.ptx -o test.cubin
ptxas test.ptx, line 33; error : Duplicate definition of variable '%p1'
ptxas test.ptx, line 34; error : Duplicate definition of variable '%p2'
ptxas test.ptx, line 36; error : Instruction 'vote' requires .target
sm_12 or higher
ptxas fatal : Ptx assembly aborted due...
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
...isible .global .align 4096 .u64 a = _2E_cst12;
.visible .global .align 4096 .b8 _2E_cst12[3] = {97, 97, 0};
.visible .global .align 4096 .b8 _2E_cst2[26] = {85, 115, 97, 103, 101, 58, 32, 37, 115, 32, 60, 110, 120, 62, 32, 60, 110, 121, 62, 32, 60, 110, 122, 62, 10, 0};
> ptxas test.ptx -o test.cubin
ptxas test.ptx, line 10; error : Unimplemented feature: labels as initial values
ptxas test.ptx, line 12; error : Unimplemented feature: labels as initial values
ptxas test.ptx, line 10; error : Label expected for forward reference of '_2E_cst1'
ptxas test.ptx, line 12; error : Labe...
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...%p2;
> setp.ne.u32 %p1, %r0, 0;
> vote.any.pred %p2, %p1;
> selp.s32 %r0, 1, 0, %p2;
>
> // inline asm
> st.global.u32 [%SP+4], %r0;
> mov.b32 func_retval0, %r0;
> ret;
> }
>
> > ptxas test.ptx -o test.cubin
> ptxas test.ptx, line 33; error : Duplicate definition of variable '%p1'
> ptxas test.ptx, line 34; error : Duplicate definition of variable '%p2'
> ptxas test.ptx, line 36; error : Instruction 'vote' requires .target sm_12 or higher
> ptxas fatal : Ptx...
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
...rates linkonce_odr out of __inline__, and NVPTX generates .weak
> out of linkonce_odr (how it happens - a big question, btw, because I can't
> find anything related in NVPTX asm printer - does it chain to some other
> printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin.
> Below is the test case:
>
> > cat test3.cu
>
> __inline__ __attribute__((device)) __attribute__((used)) void test()
> {
> return;
> }
>
> > clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device
> test3.cu -o test3.ll
> > cat test3.l...
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
....reg .pred %p2;
> setp.ne.u32 %p1, %r0, 0;
> vote.any.pred %p2, %p1;
> selp.s32 %r0, 1, 0, %p2;
>
> // inline asm
> st.global.u32 [%SP+4], %r0;
> mov.b32 func_retval0, %r0;
> ret;
> }
>
> > ptxas test.ptx -o test.cubin
> ptxas test.ptx, line 33; error : Duplicate definition of variable '%p1'
> ptxas test.ptx, line 34; error : Duplicate definition of variable '%p2'
> ptxas test.ptx, line 36; error : Instruction 'vote' requires .target
> sm_12 or higher
> ptxas fatal :...
2007 Nov 11
1
nv50 microcode/shader format
Hello there,
With the use of CUDA and ptx I managed to make a disassembler for at least
one of the nv50 microcode/shader formats. This might come in handy for some
people, hence I'm posting it here:
http://www.cs.rug.nl/~wladimir/decuda/
I'm not yet sure if pixel/vertex shaders use the same instruction format as
the compute shaders used by CUDA, but you'd think so, as nv50 is
2012 Jul 11
2
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
...%f1;
add.f32 %f1, %f3, %f0;
add.rn.f32 %f0, %f2, %f1;
sub.f32 %f2, %f2, %f0;
add.rn.f32 %f1, %f2, %f1;
ld.param.u64 %rl0, [__internal_dsmul_param_0];
st.f32 [%rl0], %f1;
st.f32 [%rl0+4], %f0;
ret;
}
> ptxas -arch=sm_20 test.ptx -o ptx.cubin
ptxas test.ptx, line 13; error : Alignment must be a power of two
ptxas test.ptx, line 14; error : Alignment must be a power of two
ptxas fatal : Ptx assembly aborted due to errors
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/piperm...
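The ptxas error above rejects alignments that are not a power of two, and the thread title says the generated alignment was zero. A minimal sketch of such a check in Python follows. The function name is hypothetical, and this is not ptxas's actual implementation.

```python
def is_valid_alignment(n: int) -> bool:
    """True if n is a positive power of two (1, 2, 4, 8, ...).

    A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    Zero is excluded explicitly because 0 & -1 is also 0.
    """
    return n > 0 and (n & (n - 1)) == 0


for align in (0, 4, 12, 4096):
    print(align, is_valid_alignment(align))
```

Under this check an alignment of 0 is rejected, while 4 and 4096 are accepted.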
2013 Oct 09
2
[LLVMdev] Backend vs JIT : GPU
Hi guys,
I am trying to understand the OpenCL compilation flow on GPUs in order to
develop an OpenCL runtime for new hardware.
I understand that the OpenCL compiler is part of a vendor's runtime library,
which is the heart of OpenCL. Since an OpenCL kernel is compiled at runtime,
at a high level its compilation takes place in two steps:
i. The source code is first converted to intermediate code.
ii.
2016 Mar 09
2
RFC: Proposing an LLVM subproject for parallelism runtime and support libraries
...==================
Platform    Format  Location    Setter
=========== ======= =========== =========================
CUDA        PTX     disk        `AddCudaPtxOnDisk`
CUDA        PTX     memory      `AddCudaPtxInMemory`
CUDA        cubin   disk        `AddCudaCubinOnDisk`
CUDA        cubin   memory      `AddCudaCubinInMemory`
OpenCL      text    disk        `AddOpenCLTextOnDisk`
OpenCL      text    memory      `AddOpenCLTextInMemory`
OpenCL binary disk...
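The table above maps a (platform, format, location) triple to a setter name. A minimal Python sketch of that mapping as a lookup follows. It includes only the rows that are fully visible in the snippet. The dictionary and the helper are hypothetical illustrations, not part of the proposed library.

```python
# Rows copied from the table; the truncated "OpenCL binary" row is left out
# because its setter name is not visible in the snippet.
SETTERS = {
    ("CUDA", "PTX", "disk"): "AddCudaPtxOnDisk",
    ("CUDA", "PTX", "memory"): "AddCudaPtxInMemory",
    ("CUDA", "cubin", "disk"): "AddCudaCubinOnDisk",
    ("CUDA", "cubin", "memory"): "AddCudaCubinInMemory",
    ("OpenCL", "text", "disk"): "AddOpenCLTextOnDisk",
    ("OpenCL", "text", "memory"): "AddOpenCLTextInMemory",
}


def setter_for(platform: str, fmt: str, location: str) -> str:
    """Return the setter name for a table row; raises KeyError if absent."""
    return SETTERS[(platform, fmt, location)]


print(setter_for("CUDA", "cubin", "memory"))  # AddCudaCubinInMemory
```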
2016 Mar 09
2
RFC: Proposing an LLVM subproject for parallelism runtime and support libraries
...latform Format Location Setter
> =========== ======= =========== =========================
> CUDA        PTX     disk        `AddCudaPtxOnDisk`
> CUDA        PTX     memory      `AddCudaPtxInMemory`
> CUDA        cubin   disk        `AddCudaCubinOnDisk`
> CUDA        cubin   memory      `AddCudaCubinInMemory`
> OpenCL      text    disk        `AddOpenCLTextOnDisk`
> OpenCL      text    memory      `AddOpenCLTextInMemory`
> OpenCL bin...
2016 Mar 10
2
RFC: Proposing an LLVM subproject for parallelism runtime and support libraries
...ormat Location Setter
>> =========== ======= =========== =========================
>> CUDA        PTX     disk        `AddCudaPtxOnDisk`
>> CUDA        PTX     memory      `AddCudaPtxInMemory`
>> CUDA        cubin   disk        `AddCudaCubinOnDisk`
>> CUDA        cubin   memory      `AddCudaCubinInMemory`
>> OpenCL      text    disk        `AddOpenCLTextOnDisk`
>> OpenCL      text    memory      `AddOpenCLTextInMemory`
>> Ope...
2012 Nov 09
0
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
...d.rn.f32 %f0, %f2, %f1;
> sub.f32 %f2, %f2, %f0;
> add.rn.f32 %f1, %f2, %f1;
> ld.param.u64 %rl0, [__internal_dsmul_param_0];
> st.f32 [%rl0], %f1;
> st.f32 [%rl0+4], %f0;
> ret;
> }
>
>> ptxas -arch=sm_20 test.ptx -o ptx.cubin
> ptxas test.ptx, line 13; error : Alignment must be a power of two
> ptxas test.ptx, line 14; error : Alignment must be a power of two
> ptxas fatal : Ptx assembly aborted due to errors
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: align0.p...
2019 Sep 09
3
Google’s TensorFlow team would like to contribute MLIR to the LLVM Foundation
On Mon, 9 Sep 2019 at 22:22, Chris Lattner <clattner at google.com> wrote:
> Including a bunch of content, eg a full langref doc:
> https://github.com/tensorflow/mlir/blob/master/g3doc/LangRef.md
Thanks Chris, that looks awesome!
This one could perhaps be improved with time:
https://github.com/tensorflow/mlir/blob/master/g3doc/ConversionToLLVMDialect.md
Which I think was Hal's