Displaying 20 results from an estimated 500 matches similar to: "X86 TRUNCATE cost for AVX & AVX2 mode"
2016 Apr 12
2
X86 TRUNCATE cost for AVX & AVX2 mode
<Copied Cong>
Thanks Elena.
Mostly I was interested in why such a high cost (30) is kept for TRUNCATE v16i32 to v16i8 in SSE41.
Looking at the code, it appears that TRUNCATE v16i32 to v16i8 is much more expensive in SSE41
than in SSE2. I feel this number should be the same as, or close to, the cost given for the same
operation in SSE2ConversionTbl.
The patch below from Cong Hou reduces the cost of the same operation in SSE2
2009 Nov 10
4
[LLVMdev] Altivec vs the type legalizer
PPC Altivec supports vector type v16i8 (and others) where the element
type is not legal (in llvm's implementation). When we have a
BUILD_VECTOR of these types with constant elements, LegalizeTypes
first promotes the element types to i32, then builds a constant pool
entry of type v16i32. This is wrong. I can fix it by truncating the
elements back to i8 in ExpandBUILD_VECTOR. Does
2009 Nov 10
1
[LLVMdev] Altivec vs the type legalizer
Hi Dale, I think Bob is right: the type legalizer shouldn't be turning v16i8
into v16i32, what should happen is that the return type of the BUILD_VECTOR
continues to be v16i8, but the type of the operands changes to i32, so you
end up with a BUILD_VECTOR that takes 16 lots of i32, and produces a v16i8.
The target then has all the info it needs to produce the best code, but needs
to be careful
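
As a rough illustration of the node shape described in this excerpt, the sketch below models a BUILD_VECTOR that takes 16 i32 operands and produces a v16i8, keeping only the low 8 bits of each operand. The function name `build_vector_v16i8` and the sample values are hypothetical; this is plain Python for illustration, not LLVM code.

```python
def build_vector_v16i8(operands):
    """Model a BUILD_VECTOR with 16 promoted (i32) operands and a v16i8 result."""
    if len(operands) != 16:
        raise ValueError("expected exactly 16 operands")
    # The result type stays v16i8, so each i32 operand is truncated to 8 bits.
    return [op & 0xFF for op in operands]


# Hypothetical operands whose upper bits are set by promotion to i32.
promoted = [0x100 + i for i in range(16)]
result = build_vector_v16i8(promoted)
```

The point of the sketch is that the vector stays 16 lanes of 8 bits each; only the operand type is widened.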
2009 Nov 10
0
[LLVMdev] Altivec vs the type legalizer
On Nov 9, 2009, at 6:11 PM, Dale Johannesen wrote:
> PPC Altivec supports vector type v16i8 (and others) where the element
> type is not legal (in llvm's implementation). When we have a
> BUILD_VECTOR of these types with constant elements, LegalizeTypes
> first promotes the element types to i32, then builds a constant pool
> entry of type v16i32. This is wrong. I can
2009 Nov 10
0
[LLVMdev] Altivec vs the type legalizer
Hi Dale,
> PPC Altivec supports vector type v16i8 (and others) where the element
> type is not legal (in llvm's implementation). When we have a
> BUILD_VECTOR of these types with constant elements, LegalizeTypes first
> promotes the element types to i32, then builds a constant pool entry of
> type v16i32.
are you sure? I would expect it to build v4i32.
Ciao,
Duncan.
2009 Nov 10
1
[LLVMdev] Altivec vs the type legalizer
On Nov 9, 2009, at 6:33 PM, Duncan Sands wrote:
> Hi Dale,
>
>> PPC Altivec supports vector type v16i8 (and others) where the
>> element type is not legal (in llvm's implementation). When we have
>> a BUILD_VECTOR of these types with constant elements, LegalizeTypes
>> first promotes the element types to i32, then builds a constant
>> pool entry of
2016 May 15
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
Hi,
In the future, we will address this issue.
Regards
Michael Zuckerman
From: Eric Christopher [mailto:echristo at gmail.com]
Sent: Sunday, May 01, 2016 19:54
To: Zuckerman, Michael <michael.zuckerman at intel.com>; Craig Topper <craig.topper at gmail.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa
2016 May 01
2
r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set
Hi,
For now no.
But I will add these three builtins to CGBuiltin.cpp.
If you want, you can be a reviewer of this change.
Regards
Michael Zuckerman
From: Craig Topper [mailto:craig.topper at gmail.com]
Sent: Thursday, April 28, 2016 04:53
To: Zuckerman, Michael <michael.zuckerman at intel.com>
Subject: Re: r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps
2013 May 10
4
[LLVMdev] Predicated Vector Operations
Jeff Bush <jeffbush001 at gmail.com> writes:
> Ah, I think I get it now. This was mentioned earlier in the thread,
> but it didn't click at the time. It sounds like I can do instruction
> selection with a pattern like (omitting selection of the sources):
>
> let Constraints = "$dst = $oldvalue" in {
> def MASKEDARITH : MyInstruction<
>
2016 Jul 29
2
Help with ISEL matching for an SDAG
I have the following selection DAG:
SelectionDAG has 9 nodes:
t0: ch = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
t16: i32,ch = load<LD1[%ptr](tbaa=<0x10023c9f448>), anyext from i8> t0,
t2, undef:i64
t15: v16i8 = BUILD_VECTOR t16, t16, t16, t16, t16, t16, t16, t16, t16,
t16, t16, t16, t16, t16, t16, t16
t11: ch,glue = CopyToReg t0, Register:v16i8 %V2, t15
2008 Dec 17
1
Noobie question, regression across levels
NB: No reply needed (Ben was extremely helpful!)
I've just started using R last week and am still scratching my head.
I have a data set and want to run a separate regression across each level of
a factor (treating each one separately). The data right now is arranged such
that the value of the factor along which I want to "split" my data is one
column among many.
Best way to do
2014 Oct 26
2
[LLVMdev] Masked vector intrinsics and name mangling
Hal, thank you for your opinion.
I was just confused when I saw such a long name: "llvm.masked.load.v16i32.p0i32.v16i32.i32.v16i1".
If we stay with a short name, we take a step towards the instruction form.
- Elena
-----Original Message-----
From: Hal Finkel [mailto:hfinkel at anl.gov]
Sent: Sunday, October 26, 2014 17:06
To: Demikhovsky, Elena
Cc: llvmdev at cs.uiuc.edu
Subject: Re:
2014 Oct 26
2
[LLVMdev] Masked vector intrinsics and name mangling
Hi,
The proposed masked vector intrinsics are overloaded - one intrinsic ID for multiple types.
After name mangling it will look like:
%res = call <16 x i32> @llvm.masked.load.v16i32.p0i32.v16i32.i32.v16i1(i32* %addr, <16 x i32> %passthru, i32 4, <16 x i1> %mask)
6 types x 3 vector sizes = 18 names for one operation
I propose to remove name mangling from these intrinsics:
%res
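
The "6 types x 3 vector sizes = 18 names" count in this excerpt can be reproduced with a small sketch. The excerpt does not say which element types or vector widths are meant, so the sets below (i8, i16, i32, i64, f32, f64 and 128/256/512 bits) are assumptions, and `mangled_name` is a hypothetical helper that only mimics the suffix layout of the one name quoted above.

```python
# Assumed element types (name -> bit width); the excerpt does not list them.
ELEMENT_BITS = {"i8": 8, "i16": 16, "i32": 32, "i64": 64, "f32": 32, "f64": 64}
# Assumed vector register widths in bits; also not stated in the excerpt.
VECTOR_BITS = [128, 256, 512]


def mangled_name(elem, vector_bits):
    """Build a name following the layout of the quoted masked-load example."""
    lanes = vector_bits // ELEMENT_BITS[elem]
    vec = f"v{lanes}{elem}"
    # Suffixes: result type, pointer type, passthru type, alignment type, mask type.
    return f"llvm.masked.load.{vec}.p0{elem}.{vec}.i32.v{lanes}i1"


names = [mangled_name(e, b) for e in ELEMENT_BITS for b in VECTOR_BITS]
```

Under these assumptions `names` holds 18 distinct strings, one of which is the name quoted in the excerpt.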
2013 May 11
0
[LLVMdev] Predicated Vector Operations
On Fri, May 10, 2013 at 9:53 AM, <dag at cray.com> wrote:
> Jeff Bush <jeffbush001 at gmail.com> writes:
>
>> Ah, I think I get it now. This was mentioned earlier in the thread,
>> but it didn't click at the time. It sounds like I can do instruction
>> selection with a pattern like (omitting selection of the sources):
>>
>> let Constraints =
2013 May 09
2
[LLVMdev] Predicated Vector Operations
On May 9, 2013, at 3:05 PM, Jeff Bush <jeffbush001 at gmail.com> wrote:
> On Thu, May 9, 2013 at 8:10 AM, <dag at cray.com> wrote:
>> Jeff Bush <jeffbush001 at gmail.com> writes:
>>
>>> %tx = select %mask, %x, <0.0, 0.0, 0.0 ...>
>>> %ty = select %mask, %y, <0.0, 0.0, 0.0 ...>
>>> %sum = fadd %tx, %ty
>>> %newvalue
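
The select-then-add sequence quoted above can be mimicked lane by lane. This is a minimal sketch in plain Python with made-up four-lane inputs; `select` is a hypothetical helper standing in for the IR instruction, and the truncated `%newvalue` line is not modeled.

```python
def select(mask, if_true, if_false):
    """Per-lane select: take if_true where the mask bit is set, else if_false."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


# Made-up four-lane inputs for illustration.
mask = [True, False, True, False]
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
zeros = [0.0] * len(mask)

tx = select(mask, x, zeros)             # %tx = select %mask, %x, <0.0, ...>
ty = select(mask, y, zeros)             # %ty = select %mask, %y, <0.0, ...>
total = [a + b for a, b in zip(tx, ty)] # %sum = fadd %tx, %ty
```

Lanes where the mask is clear contribute 0.0 to the sum, which is what the zero vectors in the quoted IR arrange.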
2014 Oct 26
2
[LLVMdev] Masked vector intrinsics and name mangling
> On Oct 26, 2014, at 8:22 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ----- Original Message -----
>> From: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: llvmdev at cs.uiuc.edu
>> Sent: Sunday, October 26, 2014 10:17:49 AM
>> Subject: RE: [LLVMdev] Masked vector
2013 Jul 23
3
[LLVMdev] Vector DAG Patterns
Hi All,
Been having a problem constructing a suitable pattern to represent some
vector operations in the DAG. Stuff like andx/orx operations where
elements of a vector are anded/ored together.
My approach thus far has been to extract the sub elements of the vector
and and/or those elements. This is ok for 4 vectors of i32s, but becomes
cumbersome for v16i8s. Example instruction:
andx $dst
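
For reference, the andx/orx behaviour described in this excerpt (all elements of a vector anded or ored together into one scalar) can be sketched as a reduction. The function names and the i8 masking below are assumptions for illustration, not the poster's instruction definitions.

```python
from functools import reduce
from operator import and_, or_


def andx(vector):
    """AND all lanes together; the result is masked to 8 bits (i8 lanes assumed)."""
    return reduce(and_, vector) & 0xFF


def orx(vector):
    """OR all lanes together; the result is masked to 8 bits (i8 lanes assumed)."""
    return reduce(or_, vector) & 0xFF


# A made-up v16i8 value: fifteen 0xFF lanes and one 0x0F lane.
v16i8 = [0xFF] * 15 + [0x0F]
and_result = andx(v16i8)
or_result = orx(v16i8)
```

Written as a chain of extracts, the same reduction over 16 lanes needs 15 nested operations, which is the verbosity the poster is describing.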
2013 May 10
0
[LLVMdev] Predicated Vector Operations
Ah, I think I get it now. This was mentioned earlier in the thread,
but it didn't click at the time. It sounds like I can do instruction
selection with a pattern like (omitting selection of the sources):
let Constraints = "$dst = $oldvalue" in {
def MASKEDARITH : MyInstruction<
(outs VectorReg:$dst),
(ins MaskReg:$mask, VectorReg:$src1, VectorReg:$src2,
2013 Jul 26
0
[LLVMdev] Vector DAG Patterns
To elaborate, it is not only cumbersome writing these patterns for
vectors of 16 characters (v16i8), it does not work.
When I compile with this pattern for an andx operation on v16i8:
[(set RC:$dst,
(and (i8 (vector_extract(vt VC:$src), 0 ) ),
(and (i8 (vector_extract(vt VC:$src), 1 ) ),
(and (i8 (vector_extract(vt VC:$src), 2 ) ),
(and (i8 (vector_extract(vt
2010 Jul 15
2
[LLVMdev] v16i32/v16f32
Hi
I find that types such as v16i32 and v16f32 are missing in my LLVM version 2.7.
The following page does not list them either:
http://llvm.org/docs/doxygen/html/classllvm_1_1MVT.html
Is that intentional for any reason, or can I just add them?
thanks
shrey