thr3ads.net - similar to: "[LLVMdev] change type allocoted register"

Displaying 20 results from an estimated 500 matches similar to: "[LLVMdev] change type allocoted register"

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

2012 Nov 09

Dear all, I'm attaching a patch that should fix the issue mentioned above. It simply makes the same check seen in the same file for global variables:

    emitPTXAddressSpace(PTy->getAddressSpace(), O);
    if (GVar->getAlignment() == 0)
      O << " .align " << (int) TD->getPrefTypeAlignment(ETy);
    else
      O << " .align " <<

[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params

2012 Jul 11

Hello, FYI, this is a bug: http://llvm.org/bugs/show_bug.cgi?id=13324 When compiling the following code for sm_20, function parameters are for some reason emitted with .align 0, which is invalid. The problem does not occur when compiling for sm_10.

    > cat test.ll
    ; ModuleID = '__kernelgen_main_module'
    target datalayout = "e-p:64:64-i64:64:64-f64:64:64-n1:8:16:32:64"
    target triple =

[LLVMdev] Re : ANN: libclc (OpenCL C library implementation)

2011 Oct 20

Hello, I am the developer of Clover, and so much activity around OpenCL these days is really exciting. Here is my point of view, mainly on Clover and how the projects could use each other. Clover is built in a way that allows a certain level of modularity. Although POCL would be very difficult to merge into Clover (or Clover into POCL), as these two projects are doing nearly exactly the same things

[LLVMdev] Functions: sret and readnone

2009 Nov 05

It's been a while and I finally had the time to look into this. What I did was to build a custom AliasAnalysis pass, as Chris suggested, that returns AliasAnalysis::Mod for values passed to the sample function in the sret spot, and NoModRef for all other values. I'm also returning AliasAnalysis::AccessesArguments in the pass' getModRefBehavior methods. However, I haven't been

[LLVMdev] Cross compile LLVM

2012 Dec 13

Hi; I am trying to cross-compile LLVM for the Android NDK. I am using CMake as the build system, so I have defined the CMAKE_SYSTEM_NAME variable in order to turn on the CMAKE_CROSSCOMPILING flag, which LLVM uses. As far as I understand from the LLVM structure, when cross-compiling, executables are generated in two versions; one goes under the target build directory ${CMAKE_BINARY_DIR} and the other goes

How to specify the RegisterClass of an IMPLICIT_DEF?

2018 Apr 12

Hi, I'm implementing build_vector as an IMPLICIT_DEF followed by INSERT_SUBREGs. This is the approach used for the SPARC architecture. def : Pat<(build_vector (f32 fpimm:$a1), (f32 fpimm:$a2)), (INSERT_SUBREG(INSERT_SUBREG (v2f32 (IMPLICIT_DEF)), (i32 (COPY_TO_REGCLASS (MOVSUTO_A_iSLo (bitcast_fpimm_to_i32 f32:$a1)), FPUaOffsetClass)), A_UNIT_PART),

[LLVMdev] Tablegen question

2009 Apr 15

If I force it to use v2f32 for my register class, it still fails with:

    d:\hq\main\sw\appeng\tools\hpc\opencl\compiler\llvm\test\AMDIL>TableGen.exe -gen-dag-isel -I../../include/ test.td > output
    GPRV2F32:v2f32:$src1 MACRO_DISTANCE_FAST_v2f32: (set GPRF32:f32:$dst, (intrinsic_w_chain:f32 84:iPTR, GPRV2F32:v2f32:$src0, GPRV2F32:v2f32:$src1))
    TableGen.exe: In

[LLVMdev] RFC: ErLLVM - Implemented HiPE Calling Convention

2012 Apr 24

This patch (and the others that will follow) are rebased on svn r155440: "AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8 immediate. We can't use it here because the shuffle code does not check that the lower part of the word is identical to the upper part" Patch 1/3: The attached commits add a new calling convention to support the LLVM backend for

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 16

Alex, From my experience working with GPU vector registers, there is no support for swizzles in the manner that you would normally code them, and in my case I have 6^4 permutations on src registers and 24 combinations in the dst registers. The way that I ended up handling this was to have different register classes for 1, 2, 3 and 4 component vectors. This made the generic cases very simple

[LLVMdev] Tablegen question

2009 Apr 15

On Apr 15, 2009, at 1:11 PM, Villmow, Micah wrote:
> If I force it to use v2f32 for my register class, it still fails with:
> d:\hq\main\sw\appeng\tools\hpc\opencl\compiler\llvm\test\AMDIL>TableGen.exe -gen-dag-isel -I../../include/ test.td > output
> GPRV2F32:v2f32:$src1 MACRO_DISTANCE_FAST_v2f32: (set GPRF32:f32:$dst, (intrinsic_w_chain:f32

[LLVMdev] Alias in LLVM 3.0

2012 Feb 27

We use alias extensively in our library to support OpenCL generating code for both our CPUs and GPUs. During the transition to LLVM 3.0 with the new type system, we're seeing two problems. Both involve type conversions occurring across an alias. In one case, one of the types is pointer to an opaque type, and ends up creating an assert in the verifier where it is checking that argument types

Node deletion during DAG Combination ?

2018 Jun 20

Hi, I'm trying to optimize 'extract_vector_elt' for my SIMD microcontroller. The idea is, during DAG combination, to merge a load/extract sequence into an architecture-specific node. During instruction selection, this node will then be selected to an architecture-specific instruction. By 'combination of DAG nodes' I understand 'replacing a set of DAG nodes by

[LLVMdev] Functions: sret and readnone

2009 Oct 05

Hi all, I'm currently building a DSL for a computer graphics project that is not unlike NVIDIA's Cg. I have an intrinsic with the following signature:

    float4 sample(texture tex, float2 coords);

that is translated to this LLVM IR code:

    declare void @"sample"(%float4* noalias nocapture sret, %texture, %float2) nounwind readnone

The type float4 is basically an array of four

[LLVMdev] TableGen pattern

2009 May 19

Hello, I am trying to convert the subtree (vector_shuffle v2f32, v2f32 (build_vector imm1, imm2)) to a machine instruction that takes two v2f32 operands and two immediates. I tried the following TableGen pattern: (set v2f32Reg:$dst, (vector_shuffle v2f32Reg:$src1, v2f32Reg:$src2, (build_vector imm:$c1, imm:$c2))) TableGen barfs about type

[LLVMdev] Tablegen question

2009 Apr 15

On Apr 15, 2009, at 11:15 AM, Villmow, Micah wrote:
> I still think there is a bug somewhere, but not sure where yet.
> This is what is generated in intrinsic.gen:
> case Intrinsic::opencl_math_fdistance: // llvm.opencl.math.fdistance
>   ResultTy = Type::FloatTy;
>   ArgTys.push_back(Tys[0]);
>   ArgTys.push_back(Tys[0]);
>   break;

OK. That looks right to me.

[LLVMdev] Vector promotions for calling conventions

2010 Jul 05

The X86-64 calling convention (annoyingly) specifies that "struct x { float a,b,c,d; }" is passed or returned in the low 2 elements of two separate XMM registers. For example, returning that would return "a,b" in the low elements of XMM0 and "c,d" in the low elements of XMM1. Both llvm-gcc and clang currently generate atrocious IR for these structs, which you can

[LLVMdev] Instruction pattern type inference problem

2007 Apr 23

On Sun, 22 Apr 2007, Christopher Lamb wrote:
> 1. Is there a good reason that v2f32 types are excluded from the
> isFloatingPoint filter? Looks like a bug to me.
>
> v2f32 = 22, // 2 x f32
> v4f32 = 23, // 4 x f32 <== start ??
> v2f64 = 24, // 2 x f64 <== end
>
> static inline bool isFloatingPoint(ValueType VT) {

create list entry from variable

2009 Aug 24

Hi; assume i<-10. How can I create a list having key=10 and value=11? list(i=11) generates a list with 'i' [1] 11 and not 10 [1] 11. Any help? Thanks

[LLVMdev] Functions: sret and readnone

2009 Oct 05

On Oct 5, 2009, at 7:21 AM, Stephan Reiter wrote:
> Hi all,
>
> I'm currently building a DSL for a computer graphics project that is
> not unlike NVIDIA's Cg. I have an intrinsic with the following
> signature
>
> float4 sample(texture tex, float2 coords);
>
> that is translated to this LLVM IR code:
>
> declare void @"sample"(%float4* noalias

[LLVMdev] vector shuffle emulation/expand in backend?

2012 Mar 02

I'm having some trouble implementing vector support in our custom backend. It seems that LLVM cannot emulate a shuffle with extracts, inserts and builds? I've enabled vector registers with addRegisterClass(MVT::v2i32, TCE::V2I32RegsRegisterClass); addRegisterClass(MVT::v2f32, TCE::V2F32RegsRegisterClass); and created patterns for most vector instructions, including insert, extract and
