thr3ads.net - search: "v4i1"

2013 Mar 05

4

[LLVMdev] Vector splitting vs widening

...Scalarize node result 0: 0x2348420: v1f32 = extract_subvector 0x23434a0, 0x2348320 [ID=0]
Scalarize node result 0: 0x2348220: v1f32 = extract_subvector 0x23434a0, 0x23466e0 [ID=0]
Split node result: 0x23469e0: v4f32 = extract_subvector 0x23435a0, 0x23466e0 [ID=0]
Split node operand: 0x2346be0: v4i1 = setcc 0x23467e0, 0x23469e0, 0x23436a0 [ID=0]
Split node result: 0x2348620: v2f32 = extract_subvector 0x23435a0, 0x2346de0 [ID=0]
Widen node result 0: 0x2348820: v2i1 = setcc 0x2346ee0, 0x2348620, 0x23436a0 [ID=0]
llc: lib/CodeGen/SelectionDAG/LegalizeTypes.h:599: llvm::SDValue llvm::DAGTypeLeg...

[LLVMdev] Boolean floats and v4i1

2012 Jun 25

2

...your platform. For example, the AND operation is really only an AND operation on the sign bits of the underlying floating-point numbers; it does not AND all of the bits (and it always changes them so that the operation always returns -1.0 or 1.0). But I need this AND to be used for the promoted v4i1 values (so I need to mark it as legal and match it to the associated vector logical operation). Thanks again, Hal > > Nadav > > > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu > [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: >...

[LLVMdev] Boolean floats and v4i1

2012 Jun 25

3

...s. For those inputs that are logically vectors of booleans the system uses the following convention: positive numbers are true, everything else (including NaNs) is false. The outputs of logical operations are -1.0 and 1.0. I am not sure how to best support this in LLVM. LLVM does not have an MVT::v4i1. One thing that I can do (without modifying LLVM core) is to add v4i64 to the vector registers, and pretend that the v4i1 is being promoted to that type (I match loads and stores to pairs of memory operations and fp<->int conversions). This works somewhat (CodeGen will happily generate vector...

[LLVMdev] Boolean floats and v4i1

2012 Jun 25

0

You could set the AND operation action to custom. The problem is that you would have no way of knowing if the type 'v4i64' originated from v4i1 or v4i64. And I don't think that you can use SimplifyDemandedBits (to discover if only the high bit is set) during the legalizer because the DAG is in a strange state, but I could be mistaken on this one. Okay, here is another idea. There are several DAGCombine invocations, including one b...

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 25

2

Hi All, I have a problem with VectorLegalizer::ExpandStore() with v4i1. Let's look at an example.
* LLVM IR
store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27
* SelectionDAG before vector legalization
ch = store<ST1[%16](align=4), trunc to v4i1> t0, t128, t32, undef:i64
* SelectionDAG after vector legalization
ch = store<ST1[%16](align=4), tr...

[LLVMdev] Boolean floats and v4i1

2012 Jun 25

0

...") them to something that works on your platform. Nadav -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: Monday, June 25, 2012 06:28 To: LLVM Developers Mailing List Subject: [LLVMdev] Boolean floats and v4i1 Hello, I'm working on support for the SIMD instruction set on our new BG/Q supercomputer. This instruction set is v4f64 (with the exception of some int <-> fp conversions, floating-point only). The vectorized comparisons, logical operations and selects also exclusively use floating-poin...

[LLVMdev] Vector splitting vs widening

2013 Mar 09

1

...Dev" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, March 6, 2013 3:40:50 PM
> Subject: Re: [LLVMdev] Vector splitting vs widening
>
> Hi Hal,
>
> > The problem is essentially the following: there are no vector f32 types (yet), so the <v4i1> = setcc <v4f32> node needs to be split and scalarized. The operand splitting seems to start correctly, but because <v4i1> is itself a legal type, after splitting the node into <v2i1> = setcc <v2f32>, the process becomes confused. The operands are agai...

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 28

0

Hi All, could someone please comment on whether the question below is wrong or not?
2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>:
> Hi All,
>
> I have a problem with VectorLegalizer::ExpandStore() with v4i1.
>
> Let's look at an example.
>
> * LLVM IR
> store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27
>
> * SelectionDAG before vector legalization
> ch = store<ST1[%16](align=4), trunc to v4i1> t0, t128, t32, undef:i64
>
> * SelectionDAG after vector le...

[LLVMdev] Vector splitting vs widening

2013 Mar 06

0

Hi Hal, > The problem is essentially the following: there are no vector f32 types (yet), so the <v4i1> = setcc <v4f32> node needs to be split and scalarized. The operand splitting seems to start correctly, but because <v4i1> is itself a legal type, after splitting the node into <v2i1> = setcc <v2f32>, the process becomes confused. The operands are again split (as they sho...

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 28

2

...m-dev at lists.llvm.org> wrote:
> Hi All,
>
> Could someone please comment on whether the question below is wrong or not?
>
> 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>:
>> Hi All,
>>
>> I have a problem with VectorLegalizer::ExpandStore() with v4i1.
>>
>> Let's look at an example.
>>
>> * LLVM IR
>> store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27
>>
>> * SelectionDAG before vector legalization
>> ch = store<ST1[%16](align=4), trunc to v4i1> t0, t128, t32, undef:i64
>>...

[LLVMdev] Vector splitting vs widening

2013 Mar 05

0

...0x2348420: v1f32 = extract_subvector 0x23434a0, 0x2348320 [ID=0]
> > Scalarize node result 0: 0x2348220: v1f32 = extract_subvector 0x23434a0, 0x23466e0 [ID=0]
> > Split node result: 0x23469e0: v4f32 = extract_subvector 0x23435a0, 0x23466e0 [ID=0]
> > Split node operand: 0x2346be0: v4i1 = setcc 0x23467e0, 0x23469e0, 0x23436a0 [ID=0]
> > Split node result: 0x2348620: v2f32 = extract_subvector 0x23435a0, 0x2346de0 [ID=0]
> > Widen node result 0: 0x2348820: v2i1 = setcc 0x2346ee0, 0x2348620, 0x23436a0 [ID=0]
> > llc: lib/CodeGen/SelectionDAG/LegalizeTypes.h:599: llv...

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 29

0

...VL should make it easier. Without AVX512BW and VL (i.e., all of today's x86 targets), the optimal representation of the result of a compare is determined by how it is consumed, and it is not a good idea to have such an optimization in multiple different places. If the legalizer has to blindly legalize v4i1 without knowing how it is consumed, it is best to look at what happens to v8i1. We can then let the same optimizer work to get the optimal ASM code out in the end, whether the vectorization factor is 4 or 8. In the end, I may be agreeing with Rob, but not because of the reasons Rob mentioned. One of the...

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 10

0