thr3ads.net - similar to: "[LLVMdev] Enabling Vector-select"

2011 Oct 16

[LLVMdev] Enabling Vector-select

Hi Nadav, great work, thanks a lot! I did not have the time to migrate our OpenCL driver to the latest trunk yet, but I followed your commits and tried out some small tests which worked as expected :). The last thing missing for us now is AVX support in the JIT, but that is a different issue. However, there is one thing I do not fully understand: what if somebody actually wants a vector of

2013 Aug 12

[LLVMdev] vector type legalization

Hi Nadav,
On 2013-08-12 12:59 PM, "Nadav Rotem" <nrotem at apple.com> wrote:
> Hi Paul,
>
> You can read about it here:
> http://blog.llvm.org/2011/12/llvm-31-vector-changes.html
>
>> Hi,
>>
>> I am trying to understand how vector type legalization works. In particular, I'm looking at i8 vector types on x86 (with sse42 features)

2013 Aug 12

[LLVMdev] vector type legalization

Hi, I am trying to understand how vector type legalization works. In particular, I'm looking at i8 vector types on x86 (with sse42 features) v3i8 gets widened to v4i8 and then operations get unrolled (scalarized) because v4i8 is not a legal type whereas v4i8 gets promoted to v4i32. Why doesn't v3i8 (or even v4i8) get widened to v16i8? Alternatively, v3i8 could be widened to v4i8 then

2013 Aug 12

[LLVMdev] vector type legalization

This is a bug in the implementation of WidenVecRes_Binary. On line 1546 it assumes that “Widen” is the last phase of type-legalization and we check if the result is a legal type. But actually we want to continue and promote the elements of the vector. In other cases we may want to widen (to the next power of two) and later split in half because the vector is too big. On Aug 12, 2013, at 10:46
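
The point made in this message, that widening need not be the last legalization step, can be illustrated with a toy model. This is only a sketch and not LLVM code: `VecTy`, `isLegal`, and `legalize` are hypothetical names, the target is assumed to accept only 128-bit vectors, and element widths are assumed to be powers of two no larger than 128 bits. The loop re-checks legality after every step, so a widened type can still be promoted or split afterwards.

```cpp
#include <string>
#include <vector>

// Toy vector type: lane count and lane width in bits (hypothetical, not llvm::EVT).
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
};

inline bool isPow2(unsigned N) { return N != 0 && (N & (N - 1)) == 0; }

// Assumed target rule: only power-of-two lane counts that fill 128 bits are legal.
inline bool isLegal(VecTy T) {
  return isPow2(T.NumElts) && T.NumElts * T.EltBits == 128;
}

// Apply one action at a time and re-check, instead of treating "widen" as final.
// Assumes T.EltBits is a power of two and at most 128.
inline std::vector<std::string> legalize(VecTy &T) {
  std::vector<std::string> Steps;
  while (!isLegal(T)) {
    if (!isPow2(T.NumElts)) {
      // Widen the lane count to the next power of two.
      unsigned N = 1;
      while (N < T.NumElts)
        N <<= 1;
      T.NumElts = N;
      Steps.push_back("widen");
    } else if (T.NumElts * T.EltBits > 128) {
      // Too big for a register: split in half.
      T.NumElts /= 2;
      Steps.push_back("split");
    } else {
      // Too small: promote each element to twice its width.
      T.EltBits *= 2;
      Steps.push_back("promote");
    }
  }
  return Steps;
}
```

Under these assumptions, a 3 x i8 input is widened to 4 x i8 and then promoted twice to reach 4 x i32, while a 5 x i64 input is widened to 8 x i64 and then split twice to reach 2 x i64.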

2013 Aug 12

[LLVMdev] vector type legalization

Hi Paul, You can read about it here: http://blog.llvm.org/2011/12/llvm-31-vector-changes.html > Hi, > > I am trying to understand how vector type legalization works. In particular, I'm looking at i8 vector types on x86 (with sse42 features) > > v3i8 gets widened to v4i8 and then operations get unrolled (scalarized) because v4i8 is not a legal type whereas v4i8 gets This

2012 Jul 30

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

Hrmm.... PromoteVectorOp doesn't seem to follow this at all.
http://llvm.org/svn/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
SDValue VectorLegalizer::PromoteVectorOp(SDValue Op) {
  // Vector "promotion" is basically just bitcasting and doing the operation
  // in a different type. For example, x86 promotes ISD::AND on v2i32 to
  // v1i64.
  EVT VT =
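
The comment quoted above describes promotion as a bitcast followed by the same operation in a different type. A minimal standalone sketch of that idea, outside of LLVM, is shown here for a bitwise AND on four 8-bit lanes performed as a single 32-bit AND. The names `V4i8`, `andViaBitcast`, and `andPerLane` are hypothetical. The trick relies on both types having the same total size and on the operation being bitwise, which is also why the size mismatch discussed in this thread is a problem.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using V4i8 = std::array<std::uint8_t, 4>;

// AND two 4 x i8 values by reinterpreting each as one 32-bit integer.
inline V4i8 andViaBitcast(const V4i8 &A, const V4i8 &B) {
  std::uint32_t X, Y;
  std::memcpy(&X, A.data(), sizeof X); // "bitcast" 4 x i8 -> 1 x i32
  std::memcpy(&Y, B.data(), sizeof Y);
  std::uint32_t R = X & Y;             // the operation in the promoted type
  V4i8 Out;
  std::memcpy(Out.data(), &R, sizeof R); // "bitcast" back to 4 x i8
  return Out;
}

// Reference: the same AND done lane by lane.
inline V4i8 andPerLane(const V4i8 &A, const V4i8 &B) {
  V4i8 Out;
  for (std::size_t I = 0; I < Out.size(); ++I)
    Out[I] = static_cast<std::uint8_t>(A[I] & B[I]);
  return Out;
}
```

Both functions return the same lanes for any input, because AND acts on each bit independently and the byte layout is restored by the second copy.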

2012 Jul 30

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

v4i8 itself is a legal type, just not on the 'AND' operation. So there seem to be multiple problems here. 1) PromoteVectorOp doesn't handle the case where the types are not the same size, this occurs because of #2 2) getTypeToPromoteTo doesn't actually check to see if the type it should promote to makes any sense. 3) PromoteVectorOp also doesn't handle the case where

2012 Jul 30

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

Sorry, <4 x i8> should convert to a <1 x i32>. What currently is happening is that it is returning a <2 x i32> because <1 x i32> does not exist. Micah
> -----Original Message-----
> From: Rotem, Nadav [mailto:nadav.rotem at intel.com]
> Sent: Monday, July 30, 2012 10:51 AM
> To: Villmow, Micah; Developers Mailing List
> Subject: RE: Vector promotion broken

2012 Jul 30

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

Notice that PromoteVectorOp is called after the type legalization legalized all of the types in the program. It legalizes the *operations*, not the types. So, you should only see legal types (legal types are types that fit into your registers). So, if your target has v2i32, I suspect that v4i8 is illegal because it has a different size.
-----Original Message-----
From: Villmow, Micah

2012 Jul 30

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

If v4i8 is a legal type then getTypeToPromoteTo should return the pair v4i8 and 'legal'. This looks like the root of the problem.
-----Original Message-----
From: Villmow, Micah [mailto:Micah.Villmow at amd.com]
Sent: Monday, July 30, 2012 22:10
To: Rotem, Nadav; Developers Mailing List
Subject: RE: Vector promotion broken for <2 x [i8|i16]>
v4i8 itself is a legal type, just not

2012 Jul 30

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

I don't know what your target architecture looks like, but I suspect that <4 x i8> should not be legalized to <1 x i32>. I think that what you are seeing is that <4 x i8> is first split into <2 x i8>, and later promoted to <2 x i32>. At the moment different targets can only affect type-legalization by declaring different legal types. A number of us discussed the

2013 Aug 13

[LLVMdev] vector type legalization

Hi Nadav, I believe the implementation to keep on widening the vector to the next power of two must be in TargetLowering.h because that is where we decide whether to Widen the vector or not, and the size to which we widen it. In this case, we stop at 4xi8 and do not check if it is legal or not. But the comment says ‘try to widen vector elements until a legal type is found’. Also, there is a

2012 Jul 30

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

No, that is correct. I am adding the new types so that I can bitcast v2i8 into a v1i16 and then perform the 'and' operation and have legalize types turn the v1i16 into a scalar. Though I am having trouble understanding how x86 supports the <1 x i64> type. Based on looking at the code, it should fail because v1i64 is not supported on the x86 platform as far as I can tell. Micah

2015 Nov 13

[RFC] Introducing a vector reduction add instruction.

Hi When a reduction instruction is vectorized in a loop, it will be turned into an instruction with vector operands of the same operation type. This new instruction has a special property that can give us more flexibility during instruction selection later: this operation is valid as long as the reduction of all elements of the result vector is identical to the reduction of all elements of its
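
The snippet is cut off, so the following is only a sketch of one reading of the property it describes: two different result vectors are interchangeable for a reduction as long as their elements add up to the same total. The function names are hypothetical and the lane rotation is an arbitrary choice made for illustration, not something taken from the RFC.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <numeric>

using I32x4 = std::array<std::int32_t, 4>;

// Ordinary lane-wise add: lane I of the accumulator gets lane I of the input.
inline I32x4 addLaneWise(const I32x4 &Acc, const I32x4 &X) {
  I32x4 Out{};
  for (std::size_t I = 0; I < 4; ++I)
    Out[I] = Acc[I] + X[I];
  return Out;
}

// A different placement: lane I of the accumulator gets lane (I + 1) % 4.
inline I32x4 addRotated(const I32x4 &Acc, const I32x4 &X) {
  I32x4 Out{};
  for (std::size_t I = 0; I < 4; ++I)
    Out[I] = Acc[I] + X[(I + 1) % 4];
  return Out;
}

// Horizontal reduction: the sum of all lanes.
inline std::int32_t reduceAdd(const I32x4 &V) {
  return std::accumulate(V.begin(), V.end(), std::int32_t{0});
}
```

The two add functions generally produce different vectors, yet `reduceAdd` returns the same value for both, which is the kind of freedom an instruction selector could exploit when only the final reduced value is used.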

2012 Jul 30

[LLVMdev] Vector promotion broken for <2 x [i8|i16]>

> Though I am having trouble in understanding how x86 supports the <1 x i64> type. Based on looking at the code, it should fail because v1i64 is not supported on the x86 platform as far as I can tell.

The Type-Legalizer can handle vector types in the following ways:
1. Split - this splits vectors into two halves. For example on SSE4, <4 x i64> is split to <2 x i64>
2.

2013 Aug 12

[LLVMdev] vector type legalization

Hi Nadav,
From: Nadav Rotem <nrotem at apple.com>
Date: Monday, 12 August, 2013 1:59 PM
To: Paul Redmond <paul.redmond at intel.com>
Cc: LLVM Developers Mailing List <llvmdev at cs.uiuc.edu>
Subject: Re: [LLVMdev] vector type legalization
This is a bug in the

2011 Oct 01

[LLVMdev] Vector-select status update

Hi, As of recently, the LLVM code-generator started supporting vector-select instructions (select instructions where the predicate operand is a vector of booleans). This support includes efficient sequences for targets which have dedicated blend instructions (such as SSE4 and AVX), a slower implementation using vector AND/OR/XOR instructions for unoptimized targets, and scalarization for
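
The slower fallback mentioned here, a select built from vector bitwise instructions, can be sketched in plain C++. This is an illustration of the general technique rather than the code LLVM emits: `U32x4`, `Cond4`, and `selectBitwise` are hypothetical names, and the loop stands in for what would be whole-register AND, AND-NOT, and OR instructions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using U32x4 = std::array<std::uint32_t, 4>;
using Cond4 = std::array<bool, 4>;

// Per-lane select without a blend instruction: (A & M) | (B & ~M).
inline U32x4 selectBitwise(const Cond4 &Cond, const U32x4 &A, const U32x4 &B) {
  U32x4 Out{};
  for (std::size_t I = 0; I < 4; ++I) {
    // Expand each boolean into an all-ones or all-zeros lane mask.
    std::uint32_t M = Cond[I] ? 0xFFFFFFFFu : 0u;
    Out[I] = (A[I] & M) | (B[I] & ~M);
  }
  return Out;
}
```

Each lane takes its value from `A` where the condition is true and from `B` otherwise. A dedicated blend instruction does the same job in one step, which is why the targets that have one get the more efficient sequence.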

2013 Oct 25

[LLVMdev] Bug #16941

Nadav, The problem appears only for vectors longer than the available hardware register (in doubleword elements, i.e. more than 4 on SSE4 and more than 8 on AVX). Select does a weird thing. The <8 x i1> mask comes as two XMM registers, select converts them to a single XMM register (i.e. 8 x 16 bit), immediately after it converts back to two XMM registers and does the blend. Conversion back and forth has

2013 Oct 26

[LLVMdev] Bug #16941

Hi Dmitry, Yes, this is a known problem with legalizing vector masks. The type <8 x i1> is legalized to <8 x i16> on SSE, but your operands are legalized to <4 x i32>. Type-legalization is performed per-node and we don’t have a good way to support instructions that mix the mask and operand type. Why does ISPC generate illegal vector types? Does ISPC rely on the LLVM codegen to
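
To make the mismatch concrete, here is a small sketch of the conversion that becomes necessary when a mask held as eight 16-bit lanes has to drive operands held as two groups of four 32-bit lanes. The type and function names are hypothetical, and the all-ones/all-zeros lane encoding of the mask is an assumption made for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

using Mask8x16 = std::array<std::int16_t, 8>;
using Mask4x32 = std::array<std::int32_t, 4>;

// Split an 8-lane i16 mask into two 4-lane i32 masks. Lanes are assumed to be
// -1 (all ones) for true and 0 for false, so sign extension preserves them.
inline std::pair<Mask4x32, Mask4x32> splitAndExtend(const Mask8x16 &M) {
  Mask4x32 Lo{}, Hi{};
  for (std::size_t I = 0; I < 4; ++I) {
    Lo[I] = M[I];     // implicit sign extension from i16 to i32
    Hi[I] = M[I + 4];
  }
  return {Lo, Hi};
}
```

If the mask had been produced in the operand layout to begin with, this step and its inverse would not be needed, which matches the redundant back-and-forth conversion reported in the thread.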

2013 Mar 09

[LLVMdev] Vector splitting vs widening

----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, March 6, 2013 3:40:50 PM
> Subject: Re: [LLVMdev] Vector splitting vs widening
>
> Hi Hal,
>
> The