similar to: X86 TRUNCATE cost for AVX & AVX2 mode

2016 Apr 12
2
X86 TRUNCATE cost for AVX & AVX2 mode
<Copied Cong> Thanks Elena. Mostly I was interested in why such a high cost (30) is kept for TRUNCATE v16i32 to v16i8 in SSE41. Looking at the code, it appears that TRUNCATE v16i32 to v16i8 in SSE41 is very expensive compared with SSE2. I feel this number should be the same as, or close to, the cost listed for the same operation in SSE2ConversionTbl. The patch below from Cong Hou reduces the cost for the same operation in SSE2
2009 Nov 10
4
[LLVMdev] Altivec vs the type legalizer
PPC Altivec supports vector type v16i8 (and others) where the element type is not legal (in llvm's implementation). When we have a BUILD_VECTOR of these types with constant elements, LegalizeTypes first promotes the element types to i32, then builds a constant pool entry of type v16i32. This is wrong. I can fix it by truncating the elements back to i8 in ExpandBUILD_VECTOR. Does
2009 Nov 10
1
[LLVMdev] Altivec vs the type legalizer
Hi Dale, I think Bob is right: the type legalizer shouldn't be turning v16i8 into v16i32, what should happen is that the return type of the BUILD_VECTOR continues to be v16i8, but the type of the operands changes to i32, so you end up with a BUILD_VECTOR that takes 16 lots of i32, and produces a v16i8. The target then has all the info it needs to produce the best code, but needs to be careful
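The arrangement described in this message, a BUILD_VECTOR that takes 16 operands of type i32 and produces a v16i8, can be modeled in a few lines of Python. This is only an illustrative sketch: `build_vector_v16i8` is a hypothetical helper, not an LLVM API, and it assumes that each promoted operand is truncated to the low 8 bits when the vector is formed.

```python
def build_vector_v16i8(operands):
    """Model a BUILD_VECTOR that takes 16 i32 operands and yields a v16i8."""
    if len(operands) != 16:
        raise ValueError("v16i8 needs exactly 16 operands")
    # Assumption: each promoted i32 operand keeps only its low 8 bits.
    return [op & 0xFF for op in operands]


# Operands that were promoted to i32 and carry bits above bit 7.
promoted = [0x100 + i for i in range(16)]
lanes = build_vector_v16i8(promoted)
print(lanes)
```

The result type stays v16i8 throughout; only the operand type is wider, which is why no v16i32 value needs to exist in this model.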
2009 Nov 10
0
[LLVMdev] Altivec vs the type legalizer
On Nov 9, 2009, at 6:11 PM, Dale Johannesen wrote: > PPC Altivec supports vector type v16i8 (and others) where the element > type is not legal (in llvm's implementation). When we have a > BUILD_VECTOR of these types with constant elements, LegalizeTypes > first promotes the element types to i32, then builds a constant pool > entry of type v16i32. This is wrong. I can
2009 Nov 10
0
[LLVMdev] Altivec vs the type legalizer
Hi Dale, > PPC Altivec supports vector type v16i8 (and others) where the element > type is not legal (in llvm's implementation). When we have a > BUILD_VECTOR of these types with constant elements, LegalizeTypes first > promotes the element types to i32, then builds a constant pool entry of > type v16i32. are you sure? I would expect it to build v4i32. Ciao, Duncan.
2009 Nov 10
1
[LLVMdev] Altivec vs the type legalizer
On Nov 9, 2009, at 6:33 PM, Duncan Sands wrote: > Hi Dale, > >> PPC Altivec supports vector type v16i8 (and others) where the >> element type is not legal (in llvm's implementation). When we have >> a BUILD_VECTOR of these types with constant elements, LegalizeTypes >> first promotes the element types to i32, then builds a constant >> pool entry of
2016 May 15
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
Hi, In the future, we will address this issue. Regards Michael Zuckerman From: Eric Christopher [mailto:echristo at gmail.com] Sent: Sunday, May 01, 2016 19:54 To: Zuckerman, Michael <michael.zuckerman at intel.com>; Craig Topper <craig.topper at gmail.com> Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa
2016 May 01
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
Hi, Not for now, but I will add these three builtins to CGBuiltin.cpp. If you want, you can be a reviewer of this change. Regards Michael Zuckerman From: Craig Topper [mailto:craig.topper at gmail.com] Sent: Thursday, April 28, 2016 04:53 To: Zuckerman, Michael <michael.zuckerman at intel.com> Subject: Re: r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps
2013 May 10
4
[LLVMdev] Predicated Vector Operations
Jeff Bush <jeffbush001 at gmail.com> writes: > Ah, I think I get it now. This was mentioned earlier in the thread, > but it didn't click at the time. It sounds like I can do instruction > selection with a pattern like (omitting selection of the sources): > > let Constraints = "$dst = $oldvalue" in { > def MASKEDARITH : MyInstruction< >
2016 Jul 29
2
Help with ISEL matching for an SDAG
I have the following selection DAG:
SelectionDAG has 9 nodes:
  t0: ch = EntryToken
  t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
  t16: i32,ch = load<LD1[%ptr](tbaa=<0x10023c9f448>), anyext from i8> t0, t2, undef:i64
  t15: v16i8 = BUILD_VECTOR t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16
  t11: ch,glue = CopyToReg t0, Register:v16i8 %V2, t15
2008 Dec 17
1
Noobie question, regression across levels
NB: No reply needed (Ben was extremely helpful!) I just started using R last week and am still scratching my head. I have a data set and want to run a separate regression for each level of a factor (treating each one separately). The data is currently arranged so that the value of the factor along which I want to "split" my data is one column among many. Best way to do
2014 Oct 26
2
[LLVMdev] Masked vector intrinsics and name mangling
Hal, thank you for your opinion. I was just confused when I saw such a long name: "llvm.masked.load.v16i32.p0i32.v16i32.i32.v16i1". If we stay with a short name, we take a step towards the instruction form. - Elena -----Original Message----- From: Hal Finkel [mailto:hfinkel at anl.gov] Sent: Sunday, October 26, 2014 17:06 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re:
2014 Oct 26
2
[LLVMdev] Masked vector intrinsics and name mangling
Hi, The proposed masked vector intrinsics are overloaded: one intrinsic ID for multiple types. After name mangling it will look like: %res = call <16 x i32> @llvm.masked.load.v16i32.p0i32.v16i32.i32.v16i1(i32* %addr, <16 x i32> %passthru, i32 4, <16 x i1> %mask). 6 types x 3 vector sizes = 18 names for one operation. I propose to remove name mangling from these intrinsics: %res
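To make the "6 types x 3 vector sizes = 18 names" count concrete, here is a small Python sketch that composes a mangled name from its parts and enumerates the combinations. It is not LLVM's mangling code. The suffix order (result vector, pointer, pass-through vector, alignment, mask) is read off the single example in the message, and the six element types and three vector widths below are assumptions chosen for illustration, since the message does not list them.

```python
from itertools import product


def masked_load_name(elem, lanes):
    """Compose a mangled name in the suffix order seen in the example."""
    vec = f"v{lanes}{elem}"
    return ".".join([
        "llvm.masked.load",
        vec,            # result vector type
        f"p0{elem}",    # pointer operand, address space 0
        vec,            # pass-through vector type
        "i32",          # alignment operand
        f"v{lanes}i1",  # mask type
    ])


# Assumed element types and vector widths; the message does not name them.
ELEM_BITS = {"i8": 8, "i16": 16, "i32": 32, "i64": 64, "f32": 32, "f64": 64}
VECTOR_BITS = [128, 256, 512]

names = [
    masked_load_name(elem, bits // ELEM_BITS[elem])
    for elem, bits in product(ELEM_BITS, VECTOR_BITS)
]
print(len(names))
print(masked_load_name("i32", 16))
```

Every combination of element type and vector width yields a distinct name, which is the multiplication the message is pointing at.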
2013 May 11
0
[LLVMdev] Predicated Vector Operations
On Fri, May 10, 2013 at 9:53 AM, <dag at cray.com> wrote: > Jeff Bush <jeffbush001 at gmail.com> writes: > >> Ah, I think I get it now. This was mentioned earlier in the thread, >> but it didn't click at the time. It sounds like I can do instruction >> selection with a pattern like (omitting selection of the sources): >> >> let Constraints =
2013 May 09
2
[LLVMdev] Predicated Vector Operations
On May 9, 2013, at 3:05 PM, Jeff Bush <jeffbush001 at gmail.com> wrote: > On Thu, May 9, 2013 at 8:10 AM, <dag at cray.com> wrote: >> Jeff Bush <jeffbush001 at gmail.com> writes: >> >>> %tx = select %mask, %x, <0.0, 0.0, 0.0 ...> >>> %ty = select %mask, %y, <0.0, 0.0, 0.0 ...> >>> %sum = fadd %tx, %ty >>> %newvalue
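The select-based sequence quoted in this message (two selects against a zero vector followed by an fadd) can be modeled lane by lane in Python. This is a rough sketch rather than anything from the thread: `select` and `masked_fadd` are hypothetical helper names, vectors are plain lists, and the step that produces `%newvalue` is cut off in the excerpt, so it is not modeled here.

```python
def select(mask, a, b):
    """Lane-wise select: take a[i] where mask[i] is true, otherwise b[i]."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def masked_fadd(mask, x, y):
    """Zero the inactive lanes of both inputs, then add lane by lane."""
    zeros = [0.0] * len(mask)
    tx = select(mask, x, zeros)  # %tx = select %mask, %x, <0.0, ...>
    ty = select(mask, y, zeros)  # %ty = select %mask, %y, <0.0, ...>
    return [a + b for a, b in zip(tx, ty)]  # %sum = fadd %tx, %ty


mask = [True, False, True, False]
total = masked_fadd(mask, [1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
print(total)
```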
2014 Oct 26
2
[LLVMdev] Masked vector intrinsics and name mangling
> On Oct 26, 2014, at 8:22 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > ----- Original Message ----- >> From: "Elena Demikhovsky" <elena.demikhovsky at intel.com> >> To: "Hal Finkel" <hfinkel at anl.gov> >> Cc: llvmdev at cs.uiuc.edu >> Sent: Sunday, October 26, 2014 10:17:49 AM >> Subject: RE: [LLVMdev] Masked vector
2013 Jul 23
3
[LLVMdev] Vector DAG Patterns
Hi All, Been having a problem constructing a suitable pattern to represent some vector operations in the DAG. Stuff like andx/orx operations where elements of a vector are anded/ored together. My approach thus far has been to extract the sub elements of the vector and and/or those elements. This is ok for 4 vectors of i32s, but becomes cumbersome for v16i8s. Example instruction: andx $dst
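For readers unfamiliar with this kind of operation, the sketch below models in Python what ANDing or ORing all elements of a vector together means. It says nothing about how to write the DAG pattern itself. The function names simply mirror the andx/orx mnemonics from the message, and treating them as plain reductions over a list of lane values is an assumption.

```python
from functools import reduce
from operator import and_, or_


def andx(vec):
    """AND all lanes of the vector together into one scalar."""
    return reduce(and_, vec)


def orx(vec):
    """OR all lanes of the vector together into one scalar."""
    return reduce(or_, vec)


# A v16i8-like vector: fifteen lanes of 0xFF and one lane of 0x0F.
v16i8 = [0xFF] * 15 + [0x0F]
print(andx(v16i8), orx(v16i8))
```

Extracting every lane by hand is equivalent to unrolling this reduction, which is why the approach grows from 4 extracts for v4i32 to 16 for v16i8.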
2013 May 10
0
[LLVMdev] Predicated Vector Operations
Ah, I think I get it now. This was mentioned earlier in the thread, but it didn't click at the time. It sounds like I can do instruction selection with a pattern like (omitting selection of the sources): let Constraints = "$dst = $oldvalue" in { def MASKEDARITH : MyInstruction< (outs VectorReg:$dst), (ins MaskReg:$mask, VectorReg:$src1, VectorReg:$src2,
2013 Jul 26
0
[LLVMdev] Vector DAG Patterns
To elaborate, it is not only cumbersome writing these patterns for vectors of 16 characters (v16i8), it does not work. When I compile with this pattern for an andx operation on v16i8: [(set RC:$dst, (and (i8 (vector_extract (vt VC:$src), 0)), (and (i8 (vector_extract (vt VC:$src), 1)), (and (i8 (vector_extract (vt VC:$src), 2)), (and (i8 (vector_extract (vt
2010 Jul 15
2
[LLVMdev] v16i32/v16f32
Hi, I find types such as v16i32 and v16f32 missing in my LLVM version 2.7. The following page does not list them either: http://llvm.org/docs/doxygen/html/classllvm_1_1MVT.html Is that intentional for any reason, or can I just add them? Thanks, shrey