thr3ads.net - search: "v16i16"

MCRegisterClass mandatory vs preferred alignment?

2015 Aug 31

2

...in tablegen. From Target.td:
>
> class RegisterClass<string namespace, list<ValueType> regTypes, int alignment,
> dag regList, RegAltNameIndex idx = NoRegAltName>
>
> X86RegisterInfo.td:
>
> def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
> 256, (sequence "YMM%u", 0, 15)>;
> def VR256X : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
> 256, (sequence "YMM%u", 0, 31)>;
>
> Seems...

MCRegisterClass mandatory vs preferred alignment?

2015 Aug 31

3

Looking around today, it appears that TargetRegisterClass and MCRegisterClass only include a single alignment. This is documented as being the minimum legal alignment, but it appears to often be greater than this in practice. For instance, on x86 the alignment of %ymm0 is listed as 32, not 1. Does anyone know why this is? Additionally, where are these alignments actually defined? I

[RFC] Introducing a vector reduction add instruction.

2015 Nov 13

2

Hi. When a reduction instruction is vectorized in a loop, it will be turned into an instruction with vector operands of the same operation type. This new instruction has a special property that can give us more flexibility during instruction selection later: this operation is valid as long as the reduction of all elements of the result vector is identical to the reduction of all elements of its

[LLVMdev] AVX spill alignment

2011 Aug 25

2

Hey guys, Are spills/reloads of AVX registers using aligned stores/loads? I can't seem to find the code that aligns the stack slots to 32-bytes. Could someone point me in the right direction? Thanks, Cameron

[LLVMdev] AVX spill alignment

2011 Sep 01

0

...eloads of AVX registers using aligned stores/loads?
Yes.
> I can't
> seem to find the code that aligns the stack slots to 32-bytes. Could
> someone point me in the right direction?
The register class has 256-bit spill alignment:
def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
                          256, (sequence "YMM%u", 0, 15)> {
  let SubRegClasses = [(FR32 sub_ss), (FR64 sub_sd), (VR128 sub_xmm)];
}
/jakob
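The reply above describes the 256 in the VR256 definition as a 256-bit spill alignment, which matches the 32-byte stack slot alignment asked about. A minimal sketch of that arithmetic follows. Both helper names are hypothetical and are not LLVM APIs; the rounding helper only illustrates how a slot offset could be placed on such a boundary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): converts an alignment written in
// bits, as in the RegisterClass definition, to bytes.
constexpr std::uint64_t alignBitsToBytes(std::uint64_t AlignBits) {
  return AlignBits / 8;
}

// Hypothetical helper: rounds Offset up to the next multiple of AlignBytes.
// AlignBytes is assumed to be a nonzero power of two.
inline std::uint64_t alignOffsetTo(std::uint64_t Offset,
                                   std::uint64_t AlignBytes) {
  assert(AlignBytes != 0 && (AlignBytes & (AlignBytes - 1)) == 0 &&
         "alignment must be a power of two");
  return (Offset + AlignBytes - 1) & ~(AlignBytes - 1);
}

// The 256 written in the VR256 definition corresponds to 32 bytes.
static_assert(alignBitsToBytes(256) == 32, "256 bits is 32 bytes");
```

For example, a slot that would otherwise start at byte offset 40 is moved to offset 64 under a 32-byte alignment, while an offset that is already a multiple of 32 is left unchanged.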

[LLVMdev] RFC: ErLLVM - Implemented HiPE Calling Convention

2012 May 02

0

On 04/24/12 17:10, Yiannis Tsiouris wrote: > This patch (and the others that will follow) are rebased on svn r155440: > > "AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8 > immediate. We can't use it here because the shuffle code does not check that > the lower part of the word is identical to the upper part" > > Patch 1/3: > > The attached commits add a new calling convention to support the LLVM backend > for HiPE compi...

[LLVMdev] RFC: ErLLVM - Implemented HiPE Calling Convention

2012 Apr 24

2

This patch (and the others that will follow) are rebased on svn r155440: "AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8 immediate. We can't use it here because the shuffle code does not check that the lower part of the word is identical to the upper part" Patch 1/3: The attached commits add a new calling convention to support the LLVM backend for HiPE compiler, as described in a previous ema...

[LLVMdev] RFC: ErLLVM - Implemented HiPE Calling Convention

2012 May 02

1

...u Cc: erllvm at softlab.ntua.gr Subject: Re: [LLVMdev] RFC: ErLLVM - Implemented HiPE Calling Convention On 04/24/12 17:10, Yiannis Tsiouris wrote: > This patch (and the others that will follow) are rebased on svn r155440: > > "AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8 > immediate. We can't use it here because the shuffle code does not check that > the lower part of the word is identical to the upper part" > > Patch 1/3: > > The attached commits add a new calling convention to support the LLVM backend > for HiPE compi...

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 23

4

...better it would be great if we > could have something like two v32i8 loads, two shufflevectors to extract > the even elements and the odd elements to create four v16i8 pieces. > > > Why v*i8 loads? I thought that we have 16-bit and 32-bit types here? > Oops that should have been v16i16. Mixed up my 256-bit types. > > Sign extend each of those pieces. Multiply the two even pieces and the two > odd pieces separately, sum those results with a v8i32 add. Then another > v8i32 add to accumulate the previous loop iterations. Then ensures that no > pieces exceed the targ...
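The sequence described in the excerpt above (split each v16i16 load into even and odd elements, sign extend, multiply the even and odd pieces separately, add the products, then accumulate) can be modeled in scalar C++. This is only a sketch under assumptions: the function names are hypothetical, the even and odd pieces are taken to hold 8 elements each, and the loops stand in for the shuffles and vector operations rather than showing what the vectorizer emits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using V8i32 = std::array<std::int32_t, 8>;

// Wrapping 32-bit add, mirroring the modular behavior of a vector add.
inline std::int32_t wrapAdd(std::int32_t X, std::int32_t Y) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(X) +
                                   static_cast<std::uint32_t>(Y));
}

// Scalar model of one vector iteration over 16 i16 elements per input.
V8i32 dotStep(const std::int16_t *A, const std::int16_t *B, V8i32 Acc) {
  for (std::size_t Lane = 0; Lane < 8; ++Lane) {
    // "Shuffle": pick the even and odd elements, then sign extend to i32.
    std::int32_t EvenA = A[2 * Lane], EvenB = B[2 * Lane];
    std::int32_t OddA = A[2 * Lane + 1], OddB = B[2 * Lane + 1];
    // Multiply even and odd pieces separately; first v8i32 add.
    std::int32_t Pair = wrapAdd(EvenA * EvenB, OddA * OddB);
    // Second v8i32 add: accumulate the previous loop iterations.
    Acc[Lane] = wrapAdd(Acc[Lane], Pair);
  }
  return Acc;
}

// Hypothetical driver: runs the step over whole 16-element chunks and
// reduces the eight accumulator lanes to one value after the loop.
std::int32_t dotProduct(const std::vector<std::int16_t> &A,
                        const std::vector<std::int16_t> &B) {
  assert(A.size() == B.size() && A.size() % 16 == 0 &&
         "sketch handles whole 16-element chunks only");
  V8i32 Acc{};
  for (std::size_t I = 0; I < A.size(); I += 16)
    Acc = dotStep(A.data() + I, B.data() + I, Acc);
  std::int32_t Sum = 0;
  for (std::int32_t Lane : Acc)
    Sum = wrapAdd(Sum, Lane);
  return Sum;
}
```

Each accumulator lane holds a partial sum of one even/odd pair per iteration, so the horizontal reduction happens only once, after the loop.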

[LoopVectorizer] Improving the performance of dot product reduction loop

2018 Jul 24

4

...>> could have something like two v32i8 loads, two shufflevectors to extract >> the even elements and the odd elements to create four v16i8 pieces. >> >> >> Why v*i8 loads? I thought that we have 16-bit and 32-bit types here? >> > > Oops that should have been v16i16. Mixed up my 256-bit types. > > >> >> Sign extend each of those pieces. Multiply the two even pieces and the >> two odd pieces separately, sum those results with a v8i32 add. Then another >> v8i32 add to accumulate the previous loop iterations. >> >> > I...

[LLVMdev] AVX spill alignment

2011 Sep 01

1

...s/loads?
>
> Yes.
>
>> I can't
>> seem to find the code that aligns the stack slots to 32-bytes. Could
>> someone point me in the right direction?
>
> The register class has 256-bit spill alignment:
>
> def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
> 256, (sequence "YMM%u", 0, 15)> {
> let SubRegClasses = [(FR32 sub_ss), (FR64 sub_sd), (VR128 sub_xmm)];
> }
>
> /jakob

IR canonicalization: shufflevector or vector trunc?

2017 Jan 17

2

...I'm not sure how that would apply to vector types. Ie, let's say v256 is a legal type in your example. DataLayout doesn't appear to specify what configurations of a 256-bit vector are legal, so I don't think we can currently use that to say v2i128 should be treated differently than v16i16. Is this a valid argument to not canonicalize the IR? On Mon, Jan 16, 2017 at 10:16 AM, Rackover, Zvi <zvi.rackover at intel.com> wrote: > Suppose we prefer the ‘trunc’ form, then what about cases such as: > > define <2 x i16> @shuffle(<16 x i16> %x) { > > %shu...

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

2012 Jul 30

2

...MVT list:
    v4i8 = 14, // 4 x i8
    v8i8 = 15, // 8 x i8
    v16i8 = 16, // 16 x i8
    v32i8 = 17, // 32 x i8
    v2i16 = 18, // 2 x i16
    v4i16 = 19, // 4 x i16
    v8i16 = 20, // 8 x i16
    v16i16 = 21, // 16 x i16
    v2i32 = 22, // 2 x i32
So, for my platform with the 'and' I promote all i8 and i16 types, so the first type that is legal is v2i32. If I add the v1i32 then it works, however, it breaks when I added v1i16 (which I need for the v2i8 case). So...

How to create vector pointer type?

2019 Oct 21

2

Hello, Say the original type is Integer i16*, I want to create a v16i16* type to replace it.
static Type *getVectorPtr(Type *Ty) {
    PointerType *PointerTy = dyn_cast<PointerType>(Ty);
    assert(PointerTy && "PointerType expected");
    unsigned addSpace = PointerTy->g...

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

2012 Jul 30

0

...MVT list:
    v4i8 = 14, // 4 x i8
    v8i8 = 15, // 8 x i8
    v16i8 = 16, // 16 x i8
    v32i8 = 17, // 32 x i8
    v2i16 = 18, // 2 x i16
    v4i16 = 19, // 4 x i16
    v8i16 = 20, // 8 x i16
    v16i16 = 21, // 16 x i16
    v2i32 = 22, // 2 x i32
So, for my platform with the 'and' I promote all i8 and i16 types, so the first type that is legal is v2i32. If I add the v1i32 then it works, however, it breaks when I added v1i16 (which I need for the v2i8 case). So...

IR canonicalization: shufflevector or vector trunc?

2017 Jan 21

2

...t would apply to vector types. > > Ie, let's say v256 is a legal type in your example. DataLayout doesn't > appear to specify what configurations of a 256-bit vector are legal, so I > don't think we can currently use that to say v2i128 should be treated > differently than v16i16. > > Is this a valid argument to not canonicalize the IR? > > > > On Mon, Jan 16, 2017 at 10:16 AM, Rackover, Zvi <zvi.rackover at intel.com> > wrote: > > Suppose we prefer the ‘trunc’ form, then what about cases such as: > > define <2 x i16> @shuffle(<...

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

2012 Jul 30

2

.../ 4 x i8
> v8i8 = 15, // 8 x i8
> v16i8 = 16, // 16 x i8
> v32i8 = 17, // 32 x i8
> v2i16 = 18, // 2 x i16
> v4i16 = 19, // 4 x i16
> v8i16 = 20, // 8 x i16
> v16i16 = 21, // 16 x i16
> v2i32 = 22, // 2 x i32
>
> So, for my platform with the 'and' I promote all i8 and i16 types, so
> the first type that is legal is v2i32.
>
> If I add the v1i32 then it works, however, it breaks when I added
> v1i16(wh...

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

2012 Jul 30

0

.../ 4 x i8
> v8i8 = 15, // 8 x i8
> v16i8 = 16, // 16 x i8
> v32i8 = 17, // 32 x i8
> v2i16 = 18, // 2 x i16
> v4i16 = 19, // 4 x i16
> v8i16 = 20, // 8 x i16
> v16i16 = 21, // 16 x i16
> v2i32 = 22, // 2 x i32
>
> So, for my platform with the 'and' I promote all i8 and i16 types, so
> the first type that is legal is v2i32.
>
> If I add the v1i32 then it works, however, it breaks when I added
> v1i16(...

[LLVMdev] RFC: ErLLVM - An LLVM backend for Erlang

2012 Apr 24

0

Hi, Following Chris' advice, I will rebase the patches and break them into 3 distinct emails (one at a time) to make it easier for a reviewer to approve or comment. Please note that the three patches, while code-wise independent, are strongly connected *semantically*, meaning that including just a subset of these patches in LLVM's code base is quite weak if the others are

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

2012 Jul 30

0

I don't know what your target architecture looks like, but I suspect that <4 x i8> should not be legalized to <1 x i32>. I think that what you are seeing is that <4 x i8> is first split into <2 x i8>, and later promoted to <2 x i32>. At the moment different targets can only affect type-legalization by declaring different legal types. A number of us discussed the
