thr3ads.net - similar to: "[LLVMdev] SplitVecRes with SIGN_EXTEND

Displaying 20 results from an estimated 900 matches similar to: "[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported"

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

On Wed, Dec 9, 2009 at 8:40 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote:
> I have code that is generating sign extend in reg on a v8i32, but the backend does not support this data type. This then asserts in LegalizeVectorTypes.cpp:389 because there is no function to split this vector into smaller sizes. Would a correct solution be to add this case so to

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

Thanks Eli, I'll see if I can get something working and submit a patch.
Micah
-----Original Message-----
From: Eli Friedman [mailto:eli.friedman at gmail.com]
Sent: Wednesday, December 09, 2009 11:18 PM
To: Villmow, Micah
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported
On Wed, Dec 9, 2009 at 8:40 PM, Villmow, Micah <Micah.Villmow at

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

Eli, I have a simple SplitVecRes function that implements what you mentioned, splitting the LHS just as in BinaryOp, but passing through the RHS. The problem is that the second operand is MVT::Other, but when cast to a VTSDNode, it turns out to be a vector type with the same length as the LHS SDValue. This causes a split on the LHS side to work correctly, but then it fails instruction selection
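The mismatch described in this message can be pictured with a toy model. This is only a sketch under assumptions: `VecType`, `SextInReg`, `halve`, and `splitSextInReg` are hypothetical names invented here, not LLVM types or APIs, and the idea that the type carried by the second operand should be halved together with the first operand is an inference from the snippet, not something the thread excerpt confirms.

```cpp
#include <cassert>

// Toy stand-in for a vector value type: element width and element count.
// Hypothetical; it only mimics the shape of a type such as v8i32.
struct VecType {
  unsigned elemBits;
  unsigned numElems;
};

// Toy stand-in for a sign-extend-in-register node: the type of the value
// operand plus the narrower type carried by the second operand.
struct SextInReg {
  VecType result;
  VecType from;
};

// Halve the element count and keep the element width.
VecType halve(VecType vt) {
  assert(vt.numElems % 2 == 0 && "expected an even element count");
  return {vt.elemBits, vt.numElems / 2};
}

// Split both the value type and the carried type, so that each half stays
// internally consistent (same element count on both operands).
void splitSextInReg(const SextInReg &n, SextInReg &lo, SextInReg &hi) {
  lo = {halve(n.result), halve(n.from)};
  hi = lo;
}
```

In this model, splitting a node with a v8i32 value and a v8i16 carried type gives two halves that each pair v4i32 with v4i16. Passing the carried type through unchanged would leave each half pairing four elements with eight, which matches the kind of inconsistency the message reports.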

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

On Thu, Dec 10, 2009 at 12:46 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote:
> Eli,
> I have a simple SplitVecRes function that implements what you mentioned, splitting the LHS just as in BinaryOp, but passing through the RHS. The problem is that the second operand is MVT::Other, but when cast to a VTSDNode, it turns out to be a vector type with the same length as the LHS

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

Eli, I think I was able to get it working. Thanks for the help. Does this look correct to you?

void DAGTypeLegalizer::SplitVecRes_SIGN_EXTEND_INREG(SDNode *N, SDValue &Lo, SDValue &Hi) {
  SDValue LHSLo, LHSHi;
  GetSplitVector(N->getOperand(0), LHSLo, LHSHi);
  DebugLoc dl = N->getDebugLoc();
  EVT LoVT, HiVT;

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

Ok, it doesn't work. The problem is that LLVM then asserts later on in SelectionDAG:2642 because it checks whether the second operand is an integer type, and if not it assumes it is floating point and asserts with the message "Cannot *_EXTEND_INREG FP types". So, it seems that the root problem here is the 'MVT::Other' still hanging around. How do I convert this SDValue to an int vector

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 10

Eli, I don't see how this helps with splitting the Other node, as it isn't the Dest that is the problem, but the second source value. Is there any place in the code that I can look at to see how to split a VTSDNode?
Thanks,
Micah
> -----Original Message-----
> From: Eli Friedman [mailto:eli.friedman at gmail.com]
> Sent: Thursday, December 10, 2009 1:25 PM
> To: Villmow, Micah

[LLVMdev] SplitVecRes with SIGN_EXTEND_INREG unsupported

2009 Dec 11

After more digging, it seems that the SIGN_EXTEND_INREG is getting generated in DAGCombiner.cpp:3033.

// fold (sext (truncate x)) -> (sextinreg x).
if (!LegalOperations || TLI.isOperationLegal(ISD::SIGN_EXTEND_INREG, N0.getValueType())) {
  if (Op.getValueType().bitsLT(VT))
    Op = DAG.getNode(ISD::ANY_EXTEND, N0.getDebugLoc(), VT,
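For readers unfamiliar with the fold named in that comment, here is a scalar sketch of why (sext (truncate x)) and (sextinreg x) agree. It is an illustration only: `signExtendInReg` and `sextOfTrunc16` are hypothetical helpers rather than LLVM functions, the 32-bit and 16-bit widths are chosen arbitrarily, and the casts and right shift assume two's-complement behaviour (guaranteed from C++20, implementation-defined but near-universal before that).

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of a sign-extend-in-register: treat the low `bits` bits of x
// as a signed value and copy its sign bit through the upper bits, without
// changing the width of the container.
int32_t signExtendInReg(int32_t x, unsigned bits) {
  assert(bits >= 1 && bits <= 32);
  const unsigned shift = 32 - bits;
  // Shift left as unsigned so the sign bit of the narrow value lands in
  // bit 31 without signed overflow.
  const uint32_t up = static_cast<uint32_t>(x) << shift;
  // Arithmetic right shift then replicates that bit downwards.
  return static_cast<int32_t>(up) >> shift;
}

// The unfolded form: truncate to 16 bits, then sign-extend back to 32.
int32_t sextOfTrunc16(int32_t x) {
  return static_cast<int32_t>(static_cast<int16_t>(x));
}
```

Under those assumptions the two functions return the same value for every 32-bit input when `bits` is 16, which is the scalar analogue of the rewrite the combiner performs.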

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 23

Hello all, This code https://godbolt.org/g/tTyxpf is a dot product reduction loop multiplying sign-extended 16-bit values to produce a 32-bit accumulated result. The x86 backend is currently not able to optimize it as well as gcc and icc. The IR we are getting from the loop vectorizer has several v8i32 adds and muls inside the loop. These are fed by v8i16 loads and sexts from v8i16 to v8i32. The
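The code behind the godbolt link is not reproduced in this listing. As a rough sketch of the kind of loop the message describes, assuming plain arrays of 16-bit inputs and a 32-bit accumulator (the function name `dotProduct` and its signature are made up for illustration):

```cpp
#include <cstddef>
#include <cstdint>

// Dot product reduction: each 16-bit element is sign-extended to 32 bits,
// the pair is multiplied, and the product is added to a 32-bit accumulator.
// Like any 32-bit accumulation, the sum can overflow for large inputs.
int32_t dotProduct(const int16_t *a, const int16_t *b, std::size_t n) {
  int32_t sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
  return sum;
}
```

The explicit casts correspond to the sexts from v8i16 to v8i32 mentioned above once the loop is vectorized, and the multiply and add correspond to the v8i32 muls and adds.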

Masked intrinsics and non-default address spaces

2016 Feb 15

Masked load/store are overloaded intrinsics, the only generic type is the type of the value being loaded/stored. The signature of the intrinsic is generated based on this type. The type of the pointer argument is generated as a pointer to the return type with default addrspace. E.g.:

declare <8 x i32> @llvm.masked.load.v8i32(<8 x i32>*, i32, <8 x i1>, <8 x i32>)

The

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 23

~Craig

On Mon, Jul 23, 2018 at 4:24 PM Hal Finkel <hfinkel at anl.gov> wrote:
> On 07/23/2018 05:22 PM, Craig Topper wrote:
> Hello all,
> This code https://godbolt.org/g/tTyxpf is a dot product reduction loop multiplying sign-extended 16-bit values to produce a 32-bit accumulated result. The x86 backend is currently not able to optimize it as well as gcc

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 24

On Tue, Jul 24, 2018 at 6:10 AM Hal Finkel <hfinkel at anl.gov> wrote:
> On 07/23/2018 06:37 PM, Craig Topper wrote:
> ~Craig
> On Mon, Jul 23, 2018 at 4:24 PM Hal Finkel <hfinkel at anl.gov> wrote:
>> On 07/23/2018 05:22 PM, Craig Topper wrote:
>> Hello all,
>> This code https://godbolt.org/g/tTyxpf

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 23

On 07/23/2018 06:23 PM, Hal Finkel via llvm-dev wrote:
> On 07/23/2018 05:22 PM, Craig Topper wrote:
>> Hello all,
>> This code https://godbolt.org/g/tTyxpf is a dot product reduction loop multiplying sign-extended 16-bit values to produce a 32-bit accumulated result. The x86 backend is currently not able to optimize it as well as gcc and icc.

[LLVMdev] Cannot split vector result of AVX intrinsic _mm256_rsqrt_ps

2014 Dec 13

I'm getting this on LLVM trunk:

SplitVectorResult #0: 0x27e6250: v8f32 = llvm.x86.avx.rsqrt.ps.256 0x2739310, 0x2739420 [ORD=16] [ID=0]
LLVM ERROR: Do not know how to split the result of this operator!
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Oddly, when I build the same code without -flto I don't see this issue. I see a similar bug was reported

Matching ConstantFPSDNode tablegen

2018 Jun 07

I'm trying to match a ConstantFPSDNode == 0 in a dag pattern for tablegen but am having some issues. LLVM doesn't seem to accept a floating point constant literal match like:

%v = call <4 x float> @foo(i32 15, float %s, float 0.0, <8 x i32> %rsrc, <4 x i32> %samp, i1 0, i32 0, i32 0)
ret <4 x float> %v

def : XXXPat<(v4f32 (int_foo i32:$mask, f32:$s, 0,

[LLVMdev] widen_load fails on AVX

2012 Jan 11

Hello Chris, We caught this failure:

./llc -mattr=+avx ../../test/CodeGen/X86/widen_load-2.ll
llc: LegalizeTypes.cpp:831: void llvm::DAGTypeLegalizer::SetSplitVector(llvm::SDValue, llvm::SDValue, llvm::SDValue): Assertion `Lo.getValueType().getVectorElementType() == Op.getValueType().getVectorElementType() && 2*Lo.getValueType().getVectorNumElements() ==

[LLVMdev] SplitVecRes_LOAD

2010 May 05

I was going through the function DAGTypeLegalizer::SplitVecRes_LOAD in LegalizeVectorTypes.cpp. I noticed that it is using getSizeInBits()/8 to compute IncrementSize, which is the offset for the load of second half of the vector. I have a situation where the frontend is producing load for a <2 x i1> type, and the architecture has i1 registers (but not v2i1 registers). The store size of i1 is
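The arithmetic concern raised here can be shown with plain integers. This is a minimal sketch, not LLVM code: `truncatingByteSize` and `storeByteSize` are hypothetical helpers, and the rounded-up variant is offered only as one way to compute a byte offset for sub-byte types, not as the fix adopted upstream.

```cpp
#include <cassert>

// Size in bytes computed the way the message describes: bits / 8.
// Integer division truncates, so any type narrower than 8 bits yields 0.
unsigned truncatingByteSize(unsigned bits) { return bits / 8; }

// Size in bytes rounded up to a whole byte.
unsigned storeByteSize(unsigned bits) { return (bits + 7) / 8; }
```

For a half that is one bit wide, the truncating form gives an increment of 0, so the second half would be loaded from the same address as the first, while the rounded-up form gives 1. For sizes that are multiples of 8 bits the two agree.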

[LLVMdev] Question about LLVM NEON intrinsics

2012 Sep 21

Hi all, I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units. Using

llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s

on the following ll code:

; ModuleID = 'vmax.ll'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
target triple =

MCRegisterClass mandatory vs preferred alignment?

2015 Aug 31

On 08/31/2015 03:59 PM, Matthias Braun wrote:
> Looks to me like the alignment is specified in tablegen. From Target.td:
>
> class RegisterClass<string namespace, list<ValueType> regTypes, int alignment,
>                     dag regList, RegAltNameIndex idx = NoRegAltName>
>
> X86RegisterInfo.td:
>
> def VR256 : RegisterClass<"X86", [v32i8,

MCRegisterClass mandatory vs preferred alignment?

2015 Aug 31

Looking around today, it appears that TargetRegisterClass and MCRegisterClass each include only a single alignment. This is documented as being the minimum legal alignment, but it appears to often be greater than this in practice. For instance, on x86 the alignment of %ymm0 is listed as 32, not 1. Does anyone know why this is? Additionally, where are these alignments actually defined? I