Displaying 15 results from an estimated 15 matches for "ptx64".
2012 Jan 16
1
[LLVMdev] PTX backend fails instruction selection for load of sext
Loads (on ptx64) with an sext of a computed index operand fail instruction selection:
LLVM ERROR: Cannot select: 0x7ff01401c210: i64,ch = load 0x10580e820, 0x7ff01401b510, 0x7ff01401b910<LD4[%memref1], sext from i32> [ID=8]
0x7ff01401b510: i64 = PTXISD::LOAD_PARAM 0x10580e820, 0x7ff01401b410 [ORD=2] [ID=6...
2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
...related in NVPTX asm printer - does it chain to some other
printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin.
Below is the test case:
> cat test3.cu
__inline__ __attribute__((device)) __attribute__((used)) void test()
{
return;
}
> clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device test3.cu -o test3.ll
> cat test3.ll
; ModuleID = 'test3.cu'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"
@llvm.used = appending global [1 x i8*] [i8* bitcast (void ()* @_Z4...
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
..."setp.ne.u32 \t%%p1, %1, 0; \n\t"
"vote.any.pred \t%%p2, %%p1; \n\t"
"selp.s32 \t%0, 1, 0, %%p2; \n\t"
"}" : "=r"(result) : "r"(a));
return result;
}
> clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown test.cu -o test.ll
> cat test.ll
; ModuleID = 'test.cu'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"
define ptx_device i32 @_Z5__anyi(i32 %a) nounwind inlinehint {
entry:
%a.addr = alloca...
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...%1, 0; \n\t"
> "vote.any.pred \t%%p2, %%p1; \n\t"
> "selp.s32 \t%0, 1, 0, %%p2; \n\t"
> "}" : "=r"(result) : "r"(a));
> return result;
> }
>
> > clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown test.cu -o test.ll
> > cat test.ll
> ; ModuleID = 'test.cu'
> target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
> target triple = "ptx64-unknown-unknown"
>
> define ptx_device i32 @_Z5__anyi(i32 %a) nounwind inlinehin...
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
...other
> printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin.
> Below is the test case:
>
> > cat test3.cu
>
> __inline__ __attribute__((device)) __attribute__((used)) void test()
> {
> return;
> }
>
> > clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device test3.cu -o test3.ll
> > cat test3.ll
> ; ModuleID = 'test3.cu'
> target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
> target triple = "ptx64-unknown-unknown"
>
> @llvm.used = appending globa...
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...%1, 0; \n\t"
> "vote.any.pred \t%%p2, %%p1; \n\t"
> "selp.s32 \t%0, 1, 0, %%p2; \n\t"
> "}" : "=r"(result) : "r"(a));
> return result;
> }
>
> > clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown test.cu -o test.ll
> > cat test.ll
> ; ModuleID = 'test.cu'
> target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
> target triple = "ptx64-unknown-unknown"
>
> define ptx_device i32 @_Z5__anyi(i32 %a) nounwind inlin...
2012 Jul 18
2
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
...pile the ptx code generated by NVPTX. Is it an issue with the
backend, an issue with PTXAS, or a known reasonable restriction?
Thanks,
- Dima.
> cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"
%struct.__st_parameter_dt.0.4 = type { %struct.__st_parameter_common.1.5,
i64, i64*, i64*, i8*, i8*, i32, i32, i8*, i8*, i32, i32, i8*, [256 x i8],
i32*, i64, i8*, i32, i32, i8*, i8*, i32, i32, i8*, i8*, i32, i32, i8*, i8*,
i32, [4 x i8] }
%struct.__st_parameter_common.1.5 =...
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
...pile the ptx code generated by NVPTX. Is it an issue with the backend, an issue with PTXAS, or a known reasonable restriction?
Thanks,
- Dima.
> cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"
%struct.__st_parameter_dt.0.4 = type { %struct.__st_parameter_common.1.5, i64, i64*, i64*, i8*, i8*, i32, i32, i8*, i8*, i32, i32, i8*, [256 x i8], i32*, i64, i8*, i32, i32, i8*, i8*, i32, i32, i8*, i8*, i32, i32, i8*, i8*, i32, [4 x i8] }
%struct.__st_parameter_common.1.5 =...
2012 May 16
2
[LLVMdev] NVPTX: __iAtomicCAS support ?
...cs not supported, or am I
making the call the wrong way?
Thanks,
- Dima.
SOURCE
========
dmikushin at hp2:~> cat kernelgen_monitor.ll
; ModuleID = '/opt/kernelgen/include/kernelgen_monitor.cu'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"
%struct.kernelgen_callback_t = type { i32, i32,
%"struct.kernelgen::kernel_t"*, i32, i32,
%struct.kernelgen_callback_data_t* }
%"struct.kernelgen::kernel_t" = type opaque
%struct.kernelgen_callback_data_t = type opaque
define ptx_kernel void @_Z17kernelge...
2011 Aug 17
3
[LLVMdev] AMDIL Target Triple patch
Here is a patch for LLVM 2.9 that adds AMDIL as a valid target triple to LLVM.
If it doesn't apply cleanly, I'll post an updated patch against LLVM TOT next.
Micah
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AMDIL_OpenSource.patch
Type: application/octet-stream
Size: 1488 bytes
Desc: AMDIL_OpenSource.patch
2012 Mar 30
0
[LLVMdev] Why this fails on X86_64 host?
Hi Justin,
I have an LLVM IR file generated by my own code generator.
When I run
llc -march=ptx64 ./gpu_kernel.ll
on it, I get the following error:
LLVM ERROR: Cannot select: 0x269a7a0: ch = store 0x2666370, 0x2697760,
0x269a2a0, 0x2698d90<ST4[%p_arrayidx5], trunc to i32> [ID=20]
0x2697760: i64 = add 0x2699ea0, 0x2699590 [ORD=23] [ID=16]
0x2699ea0: i64 = shl 0x2699fa0, 0x269a...
2012 May 16
0
[LLVMdev] NVPTX: __iAtomicCAS support ?
...;
> Thanks,
> - Dima.
>
> SOURCE
> ========
>
> dmikushin at hp2:~> cat kernelgen_monitor.ll
> ; ModuleID = '/opt/kernelgen/include/kernelgen_monitor.cu'
> target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
> target triple = "ptx64-unknown-unknown"
>
> %struct.kernelgen_callback_t = type { i32, i32,
> %"struct.kernelgen::kernel_t"*, i32, i32,
> %struct.kernelgen_callback_data_t* }
> %"struct.kernelgen::kernel_t" = type opaque
> %struct.kernelgen_callback_data_t = type opaque
>
>...
2012 May 11
1
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Hi guys,
Just catching up on an interesting thread :)
On May 7, 2012, at 1:15 AM, Tobias Grosser wrote:
> I believe this can be a way worth going,
> but I doubt now is the right moment for it. I don't share your opinion
> that it is easy to move LLVM-IR in this direction, but I rather believe
> that this is an engineering project that will take several months of
> full
2012 Jul 11
2
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
...wing code for sm_20, func params are for some reason
emitted with .align 0, which is invalid. The problem does not occur when
compiling for sm_10.
> cat test.ll
; ModuleID = '__kernelgen_main_module'
target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx64-unknown-unknown"
%struct.float2 = type { float, float }
define ptx_device void @__internal_dsmul(%struct.float2* noalias nocapture
sret %agg.result, %struct.float2* nocapture byval %x, %struct.float2*
nocapture byval %y) nounwind inlinehint alwaysinline {
entry:
%y1 = getelementptr inbound...
2012 Nov 09
0
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
...)
O << " .align " << (int) TD->getPrefTypeAlignment(ETy);
else
O << " .align " << GVar->getAlignment();
Could you please review and commit? Do you think it needs a test case?
Thanks,
- D.
dmikushin at hp2:~/forge/align0> llc -march=nvptx64 -mcpu=sm_20 align0.ll -o -
//
// Generated by LLVM NVPTX Back-End
//
.version 3.1
.target sm_20
.address_size 64
// .globl __internal_dsmul
.visible .func __internal_dsmul(
.param .b64 __internal_dsmul_param_0,
.param .align 4 .b8 __internal_dsmul_param_1[8],
.param .align 4 .b8 __internal_d...