thr3ads.net - similar to: "[LLVMdev] question on scalarization"

Displaying 20 results from an estimated 1100 matches similar to: "[LLVMdev] question on scalarization"

[LLVMdev] Vector immediates in tablegen w/o build_vector?

2011 Dec 16

[LLVMdev] Vector immediates in tablegen w/o build_vector?

I have two patterns in tablegen that do look like the exact same thing: Pat 1) def MOV_v4i16 : ILFormat<IL_OP_MOV, (outs GPRV4I16:$dst), (ins i16imm:$val), asm, [(set GPRV4I16:$dst, (build_vector (i16 imm:$val)))]>; Pat 2) def v4i16imm : Operand<v4i16>; def MOV_v4i16 : ILFormat<IL_OP_MOV, (outs GPRV4I16:$dst), (ins v4i16imm:$val), asm, [(set

llvm-dev Digest, Vol 166, Issue 22

2018 Apr 09

llvm-dev Digest, Vol 166, Issue 22

Hi Krzysztof, Sure, please see below. DAG.dump.() before and after, annotated with what I believe the DAG means. I've spent some time debugging the method but it's proving difficult to determine where the logic is misfiring. Disabling the entire combine causes a lot of failing x86-64 tests - I may have to learn an upstream vector ISA to make progress on this. Thank you >From your

Legalizing vector types

2020 Jan 03

Legalizing vector types

Hi all, I am working on a target that has support for v4i16 vectors, and no support for v4i8 / v8i8 / v8i16 V4i8 is promoted to v4i16 which is nice V8i16 is split to 2 x v4i16 which is nice as well Now v8i8 is scalarized, which is not so nice. Ideally I would like v8i8 to be first promoted to v8i16 then split to 2xv4i16 (or split to 2xV4i8 then promoted to 2xv4i16) Is there a way to achieve

[LLVMdev] type legalizer promoting BUILD_VECTORs

2009 Feb 02

[LLVMdev] type legalizer promoting BUILD_VECTORs

Hi Bob, > LLVM's type legalizer is changing the types of BUILD_VECTORs in a way > that seems wrong to me, but I'm not sure if this is a bug or if some > targets may be relying on it. > > On a 32-bit target, the default action for legalizing i8 and i16 types > is to promote them. If you then have a BUILD_VECTOR to construct a > legal vector type composed of

[LLVMdev] How to define macros in a tablegen file?

2012 Jun 19

[LLVMdev] How to define macros in a tablegen file?

Hi, I was wondering if there is a way to specify macros to help shorten rewriting patterns like these: def : Pat <(v4i8 (mul (v4i8 IntRegs:$a), (v4i8 IntRegs:$b))), (v4i8 (VTRUNEHB (v4i16 (VTRUNEWH (v2i32 (VMPYH (v2i16 (EXTRACT_SUBREG (v4i16 (VSXTBH (v4i8 IntRegs:$a))), subreg_hireg)), (v2i16 (EXTRACT_SUBREG (v4i16 (VSXTBH (v4i8

[LLVMdev] Lowering to MMX

2011 Oct 25

[LLVMdev] Lowering to MMX

Hi Nicolas, > I found out that the performance regression is due to removing support > for lowering 64-bit vector operations to MMX, and using SSE2 instead. My > code uses a mix of MMX intrinsics and v4i16 operations, so it ping-pongs > back and forth between MMX and SSE2 instructions in the generated code. > > To get more optimal code, I see three options, and I was wondering

[LLVMdev] Lowering to MMX

2011 Oct 20

[LLVMdev] Lowering to MMX

Hi all, I'm working on a graphics project which uses LLVM for dynamic code generation, and I noticed a major performance regression when upgrading from LLVM 2.8 to 3.0-rc1 (LLVM 2.9 didn't support Win64 so I skipped it entirely). I found out that the performance regression is due to removing support for lowering 64-bit vector operations to MMX, and using SSE2 instead. My code uses a

[LLVMdev] type legalizer promoting BUILD_VECTORs

2009 Feb 02

[LLVMdev] type legalizer promoting BUILD_VECTORs

LLVM's type legalizer is changing the types of BUILD_VECTORs in a way that seems wrong to me, but I'm not sure if this is a bug or if some targets may be relying on it. On a 32-bit target, the default action for legalizing i8 and i16 types is to promote them. If you then have a BUILD_VECTOR to construct a legal vector type composed of i8 or i16 values, the type legalizer will

Question on pattern matching extractelt

2019 Nov 28

Question on pattern matching extractelt

Hi, I have an issue with pattern matching. I have the following SelectionDAG: t13: i32 = extract_vector_elt t2, Constant:i64<1> That I am trying to match with the following pattern: def : Pat<(extractelt (v4i16 SingleReg:$v), 1), (SRADd1 SingleReg :$v, (i64 16))>; But for some reason the pattern does not match. It seems to be due to the fact extract_vector_elt's result

[LLVMdev] type legalizer promoting BUILD_VECTORs

2009 Feb 03

[LLVMdev] type legalizer promoting BUILD_VECTORs

On Feb 2, 2009, at 1:22 PM, Duncan Sands wrote: > another way this could be done is to say that the operands of a > BUILD_VECTOR don't have to have the same type as the element type > of the built vector. Then when the type legalizer sees a > v4i16 = BUILD_VECTOR(i16, i16, i16, i16) it can turn this into a > v4i16 = BUILD_VECTOR(i32, i32, i32, i32) and it will be happy >

[LLVMdev] type legalizer promoting BUILD_VECTORs

2009 Feb 03

[LLVMdev] type legalizer promoting BUILD_VECTORs

(Resend, since it didn't seem to reach the mailing list the first time) Hi Bob, > LLVM's type legalizer is changing the types of BUILD_VECTORs in a way > that seems wrong to me, but I'm not sure if this is a bug or if some > targets may be relying on it. > > On a 32-bit target, the default action for legalizing i8 and i16 types > is to promote them. If you

[LLVMdev] Lowering to MMX

2011 Oct 25

[LLVMdev] Lowering to MMX

On Oct 20, 2011, at 8:42 AM, Nicolas Capens wrote: > Hi all, > > I'm working on a graphics project which uses LLVM for dynamic code > generation, and I noticed a major performance regression when upgrading > from LLVM 2.8 to 3.0-rc1 (LLVM 2.9 didn't support Win64 so I skipped it > entirely). > > I found out that the performance regression is due to removing

[LLVMdev] Lowering to MMX

2011 Oct 26

[LLVMdev] Lowering to MMX

Hi Bill, Comments inline: On 24/10/2011 9:50 PM, Bill Wendling wrote: > On Oct 20, 2011, at 8:42 AM, Nicolas Capens wrote: > >> Hi all, >> >> I'm working on a graphics project which uses LLVM for dynamic code >> generation, and I noticed a major performance regression when upgrading >> from LLVM 2.8 to 3.0-rc1 (LLVM 2.9 didn't support Win64 so I

Return value from TargetLowering::LowerOperation?

2016 Jan 25

Return value from TargetLowering::LowerOperation?

Hi, On 01/22/2016 05:02 PM, Tom Stellard wrote: > On Fri, Jan 22, 2016 at 01:58:49PM +0100, Mikael Holmén via llvm-dev wrote: >> Hi, >> >> I'm a litle bit puzzled by the TargetLowering::LowerOperation function, >> and what different callers of this function assumes about the returned >> value. >> > SelectionDAGLegalize::LegalizeOp() is your best

Avoiding during my pass the optimization (copy propagation) of my LLVM IR code (at generation)

2016 Dec 30

Avoiding during my pass the optimization (copy propagation) of my LLVM IR code (at generation)

Hello. I'm writing an LLVM pass that is working on LLVM IR. To my surprise the following LLVM pass code generates optimized code - it does copy propagation on it. Value *vecShuffleOnePtr = Builder.CreateGEP(ptr_B, vecShuffleOne, "VectorGep"); ... packed_gather_params.push_back(vecShuffleOnePtr); CallInst *callGather =

[LLVMdev] [PTX] Should we keep backward-compatibility of PTX?

2011 Mar 10

[LLVMdev] [PTX] Should we keep backward-compatibility of PTX?

Hi Justin, There are some backward incompatible features of PTX; for example, special registers are redefined as v4i32 (they were v4i16) in PTX 2.0. And CUDA 4.0 was rolled out last week. I heard that some instructions are deprecated. I am not sure how stable (or unstable) PTX specification is. Do you have a rough assessment of its stability? If PTX specification is still fast evolving, I would

Question about quad-register

2017 Sep 10

Question about quad-register

Hi All, If the target supports quad-register R0:R1:R2:R3 (Rn is 32-bit register), is it possible mapping quad-register to v4i32 so that the following example work? typedef int v4si __attribute__ ((vector_size (16))); void foo(v4si i) { v4si j = i; } I don't know how to write CallingConv.td to represent the concept of occupying quad-register R0:R1:R2:R3 once seeing

[LLVMdev] How to define macros in a tablegen file?

2012 Jun 19

[LLVMdev] How to define macros in a tablegen file?

If the patterns only include SDNodes, then pattern fragments will work. I might be wrong, but I've yet to find a way to do it with machine instructions, which is what you seem to have here. Micah > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] > On Behalf Of Sebastian Pop > Sent: Tuesday, June 19, 2012 3:39 PM > To:

Question about store with unaligned memory address

2016 Jan 29

Question about store with unaligned memory address

Hi Krzysztof, Thanks for response. The method is working almost of test cases which use load and store instructions connected with chain. There is other situation. Let's look at a example as follows: typedef unsigned short int UV __attribute__((vector_size (8))); void test (UV *x, UV *y) { *x = *y / ((UV) { 4, 4, 4, 4 }); } The target does not support vector type so CodeGen tries to

[LLVMdev] Expand vector type

2012 Feb 29

[LLVMdev] Expand vector type

Hi, * Is there a way to setup LLVM to automatically convert vec3s to vec4s? Yes, if you specify v3i16 and friends as "promote" instead of "legal", llvm will promote it to a v4i16. The ARM NEON backend does this already. I'm surprised you haven't got this happening already as you mention that LLVM widens your loads to 4-element vectors. (this should happen during

similar to: [LLVMdev] question on scalarization