Displaying 15 results from an estimated 15 matches for "ptx64".

2012 Jan 16
1
[LLVMdev] PTX backend fails instruction selection for load of sext
Loads (on ptx64) with an sext of a computed index operand fail instruction selection: LLVM ERROR: Cannot select: 0x7ff01401c210: i64,ch = load 0x10580e820, 0x7ff01401b510, 0x7ff01401b910<LD4[%memref1], sext from i32> [ID=8] 0x7ff01401b510: i64 = PTXISD::LOAD_PARAM 0x10580e820, 0x7ff01401b410 [ORD=2] [ID=6...
2012 Jun 12
2
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
...related in NVPTX asm printer - does it chain to some other printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin. Below is the test case: > cat test3.cu __inline__ __attribute__((device)) __attribute__((used)) void test() { return; } > clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device test3.cu -o test3.ll > cat test3.ll ; ModuleID = 'test3.cu' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" @llvm.used = appending global [1 x i8*] [i8* bitcast (void ()* @_Z4...
2012 Jul 10
2
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
..."setp.ne.u32 \t%%p1, %1, 0; \n\t" "vote.any.pred \t%%p2, %%p1; \n\t" "selp.s32 \t%0, 1, 0, %%p2; \n\t" "}" : "=r"(result) : "r"(a)); return result; } > clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown test.cu -o test.ll > cat test.ll ; ModuleID = 'test.cu' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" define ptx_device i32 @_Z5__anyi(i32 %a) nounwind inlinehint { entry: %a.addr = alloca...
2012 Jul 10
0
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...%1, 0; \n\t" > "vote.any.pred \t%%p2, %%p1; \n\t" > "selp.s32 \t%0, 1, 0, %%p2; \n\t" > "}" : "=r"(result) : "r"(a)); > return result; > } > > > clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown test.cu -o test.ll > > cat test.ll > ; ModuleID = 'test.cu' > target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" > target triple = "ptx64-unknown-unknown" > > define ptx_device i32 @_Z5__anyi(i32 %a) nounwind inlinehin...
2012 Jun 13
0
[LLVMdev] [NVPTX] For linkonce_odr NVPTX generates .weak, but even newest PTXAS can't handle it
...other > printer?), and finally ptxas (both 4.2 and 5) fails to compile it to cubin. > Below is the test case: > > > cat test3.cu > > __inline__ __attribute__((device)) __attribute__((used)) void test() > { > return; > } > > > clang -cc1 -emit-llvm -triple ptx64-unknown-unknown -fcuda-is-device > test3.cu -o test3.ll > > cat test3.ll > ; ModuleID = 'test3.cu' > target datalayout = "e-p:64:64-i64:64:64-f64:64: > 64-n1:8:16:32:64" > target triple = "ptx64-unknown-unknown" > > @llvm.used = appending globa...
2012 Jul 10
1
[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken
...%1, 0; \n\t" > "vote.any.pred \t%%p2, %%p1; \n\t" > "selp.s32 \t%0, 1, 0, %%p2; \n\t" > "}" : "=r"(result) : "r"(a)); > return result; > } > > > clang -cc1 -emit-llvm -fcuda-is-device -triple ptx64-unknown-unknown > test.cu -o test.ll > > cat test.ll > ; ModuleID = 'test.cu' > target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" > target triple = "ptx64-unknown-unknown" > > define ptx_device i32 @_Z5__anyi(i32 %a) nounwind inlin...
2012 Jul 18
2
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
...pile the ptx code generated by NVPTX. Is it an issue of backend or an issue of PTXAS or a known reasonable restriction? Thanks, - Dima. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" %struct.__st_parameter_dt.0.4 = type { %struct.__st_parameter_common.1.5, i64, i64*, i64*, i8*, i8*, i32, i32, i8*, i8*, i32, i32, i8*, [256 x i8], i32*, i64, i8*, i32, i32, i8*, i8*, i32, i32, i8*, i8*, i32, i32, i8*, i8*, i32, [4 x i8] } %struct.__st_parameter_common.1.5 =...
2012 Jul 18
0
[LLVMdev] [NVPTX] PTXAS - Unimplemented feature: labels as initial values
...pile the ptx code generated by NVPTX. Is it an issue of backend or an issue of PTXAS or a known reasonable restriction? Thanks, - Dima. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" %struct.__st_parameter_dt.0.4 = type { %struct.__st_parameter_common.1.5, i64, i64*, i64*, i8*, i8*, i32, i32, i8*, i8*, i32, i32, i8*, [256 x i8], i32*, i64, i8*, i32, i32, i8*, i8*, i32, i32, i8*, i8*, i32, i32, i8*, i8*, i32, [4 x i8] } %struct.__st_parameter_common.1.5 =...
2012 May 16
2
[LLVMdev] NVPTX: __iAtomicCAS support ?
...cs not supported or am I doing call in a wrong way? Thanks, - Dima. SOURCE ======== dmikushin@hp2:~> cat kernelgen_monitor.ll ; ModuleID = '/opt/kernelgen/include/kernelgen_monitor.cu' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" %struct.kernelgen_callback_t = type { i32, i32, %"struct.kernelgen::kernel_t"*, i32, i32, %struct.kernelgen_callback_data_t* } %"struct.kernelgen::kernel_t" = type opaque %struct.kernelgen_callback_data_t = type opaque define ptx_kernel void @_Z17kernelge...
2011 Aug 17
3
[LLVMdev] AMDIL Target Triple patch
Here is a patch for LLVM 2.9 that adds AMDIL as a valid target triple to LLVM. If it doesn't apply cleanly, I'll follow up with an updated patch for LLVM TOT. Micah -------------- next part -------------- A non-text attachment was scrubbed... Name: AMDIL_OpenSource.patch Type: application/octet-stream Size: 1488 bytes Desc: AMDIL_OpenSource.patch
2012 Mar 30
0
[LLVMdev] Why this fails on X86_64 host?
Hi Justin, I have an LLVM IR file generated by my own code generator. When I run llc -march=ptx64 ./gpu_kernel.ll on it, I get the following error: LLVM ERROR: Cannot select: 0x269a7a0: ch = store 0x2666370, 0x2697760, 0x269a2a0, 0x2698d90<ST4[%p_arrayidx5], trunc to i32> [ID=20] 0x2697760: i64 = add 0x2699ea0, 0x2699590 [ORD=23] [ID=16] 0x2699ea0: i64 = shl 0x2699fa0, 0x269a...
2012 May 16
0
[LLVMdev] NVPTX: __iAtomicCAS support ?
...; > Thanks, > - Dima. > > SOURCE > ======== > > dmikushin@hp2:~> cat kernelgen_monitor.ll > ; ModuleID = '/opt/kernelgen/include/kernelgen_monitor.cu' > target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" > target triple = "ptx64-unknown-unknown" > > %struct.kernelgen_callback_t = type { i32, i32, > %"struct.kernelgen::kernel_t"*, i32, i32, > %struct.kernelgen_callback_data_t* } > %"struct.kernelgen::kernel_t" = type opaque > %struct.kernelgen_callback_data_t = type opaque > &...
2012 May 11
1
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Hi guys, Just catching up on an interesting thread :) On May 7, 2012, at 1:15 AM, Tobias Grosser wrote: > I believe this can be a way worth going, > but I doubt now is the right moment for it. I don't share your opinion > that it is easy to move LLVM-IR in this direction, but I rather believe > that this is an engineering project that will take several months of > full
2012 Jul 11
2
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
...wing code for sm_20, function parameters are for some reason emitted with .align 0, which is invalid. The problem does not occur when compiling for sm_10. > cat test.ll ; ModuleID = '__kernelgen_main_module' target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64" target triple = "ptx64-unknown-unknown" %struct.float2 = type { float, float } define ptx_device void @__internal_dsmul(%struct.float2* noalias nocapture sret %agg.result, %struct.float2* nocapture byval %x, %struct.float2* nocapture byval %y) nounwind inlinehint alwaysinline { entry: %y1 = getelementptr inbound...
2012 Nov 09
0
[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
...) O << " .align " << (int) TD->getPrefTypeAlignment(ETy); else O << " .align " << GVar->getAlignment(); Could you please review and commit? Do you think it needs a test case? Thanks, - D. dmikushin@hp2:~/forge/align0> llc -march=nvptx64 -mcpu=sm_20 align0.ll -o - // // Generated by LLVM NVPTX Back-End // .version 3.1 .target sm_20 .address_size 64 // .globl __internal_dsmul .visible .func __internal_dsmul( .param .b64 __internal_dsmul_param_0, .param .align 4 .b8 __internal_dsmul_param_1[8], .param .align 4 .b8 __internal_d...