search for: v4i1

Displaying 20 results from an estimated 21 matches for "v4i1".

2013 Mar 05
4
[LLVMdev] Vector splitting vs widening
...Scalarize node result 0: 0x2348420: v1f32 = extract_subvector 0x23434a0, 0x2348320 [ID=0] Scalarize node result 0: 0x2348220: v1f32 = extract_subvector 0x23434a0, 0x23466e0 [ID=0] Split node result: 0x23469e0: v4f32 = extract_subvector 0x23435a0, 0x23466e0 [ID=0] Split node operand: 0x2346be0: v4i1 = setcc 0x23467e0, 0x23469e0, 0x23436a0 [ID=0] Split node result: 0x2348620: v2f32 = extract_subvector 0x23435a0, 0x2346de0 [ID=0] Widen node result 0: 0x2348820: v2i1 = setcc 0x2346ee0, 0x2348620, 0x23436a0 [ID=0] llc: lib/CodeGen/SelectionDAG/LegalizeTypes.h:599: llvm::SDValue llvm::DAGTypeLeg...
2012 Jun 25
2
[LLVMdev] Boolean floats and v4i1
...your platform. For example, the AND operation is really only an AND operation on the sign bits of the underlying floating-point numbers, it does not AND all of the bits (and it always changes them so that the operation always returns -1.0 or 1.0). But I need this AND to be used for the promoted v4i1 values (so I need to mark it as legal and match it to the associated vector logical operation). Thanks again, Hal > > Nadav > > > -----Original Message----- > From: llvmdev-bounces at cs.uiuc.edu > [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: >...
2012 Jun 25
3
[LLVMdev] Boolean floats and v4i1
...s. For those inputs that are logically vectors of booleans the system uses the following convention: positive numbers are true, everything else (including NaNs) is false. The outputs of logical operations are -1.0 and 1.0. I am not sure how to best support this in LLVM. LLVM does not have an MVT::v4i1. One thing that I can do (without modifying LLVM core) is to add v4i64 to the vector registers, and pretend that the v4i1 is being promoted to that type (I match loads and stores to pairs of memory operations and fp<->int conversions). This works somewhat (CodeGen will happily generate vector...
2012 Jun 25
0
[LLVMdev] Boolean floats and v4i1
You could set the AND operation action to custom. The problem is that you would have no way of knowing if the type 'v4i64' originated from v4i1 or v4i64. And I don't think that you can use SimplifyDemandedBits (to discover if only the high bit is set) during the legalizer because the DAG is in a strange state, but I could be mistaken on this one. Okay, here is another idea. There are several DAGCombine invocations, including one b...
2016 Jun 25
2
Question about VectorLegalizer::ExpandStore() with v4i1
Hi All, I have a problem with VectorLegalizer::ExpandStore() with v4i1. Let's see an example. * LLVM IR store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27 * SelectionDAG before vector legalization ch = store<ST1[%16](align=4), trunc to v4i1> t0, t128, t32, undef:i64 * SelectionDAG after vector legalization ch = store<ST1[%16](align=4), tr...
2012 Jun 25
0
[LLVMdev] Boolean floats and v4i1
...") them to something that works on your platform. Nadav -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: Monday, June 25, 2012 06:28 To: LLVM Developers Mailing List Subject: [LLVMdev] Boolean floats and v4i1 Hello, I'm working on support for the SIMD instruction set on our new BG/Q supercomputer. This instruction set is v4f64 (with the exception of some int <-> fp conversions, floating-point only). The vectorized comparisons, logical operations and selects also exclusively use floating-poin...
2013 Mar 09
1
[LLVMdev] Vector splitting vs widening
...Dev" <llvmdev at cs.uiuc.edu> > Sent: Wednesday, March 6, 2013 3:40:50 PM > Subject: Re: [LLVMdev] Vector splitting vs widening > > Hi Hal, > > > > > > > The problem is essentially the following: there are no vector f32 > types (yet), so the <v4i1> = setcc <v4f32> node needs to be split > and scalarized. The operand splitting seems to start correctly, but > because <v4i1> is itself a legal type, after splitting the node into > <v2i1> = setcc <v2f32>, the process becomes confused. The operands > are agai...
2016 Jun 28
0
Question about VectorLegalizer::ExpandStore() with v4i1
Hi All, Can someone please comment on whether the question below is wrong or not? 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>: > Hi All, > > I have a problem with VectorLegalizer::ExpandStore() with v4i1. > > Let's see an example. > > * LLVM IR > store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27 > > * SelectionDAG before vector legalization > ch = store<ST1[%16](align=4), trunc to v4i1> t0, t128, t32, undef:i64 > > * SelectionDAG after vector le...
2013 Mar 06
0
[LLVMdev] Vector splitting vs widening
Hi Hal, > The problem is essentially the following: there are no vector f32 types (yet), so the <v4i1> = setcc <v4f32> node needs to be split and scalarized. The operand splitting seems to start correctly, but because <v4i1> is itself a legal type, after splitting the node into <v2i1> = setcc <v2f32>, the process becomes confused. The operands are again split (as they sho...
2016 Jun 28
2
Question about VectorLegalizer::ExpandStore() with v4i1
...m-dev at lists.llvm.org> wrote: > Hi All, > > Can someone please comment on whether the question below is wrong or not? > > 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>: >> Hi All, >> >> I have a problem with VectorLegalizer::ExpandStore() with v4i1. >> >> Let's see an example. >> >> * LLVM IR >> store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27 >> >> * SelectionDAG before vector legalization >> ch = store<ST1[%16](align=4), trunc to v4i1> t0, t128, t32, undef:i64 >>...
2013 Mar 05
0
[LLVMdev] Vector splitting vs widening
...0x2348420: v1f32 = extract_subvector 0x23434a0, 0x2348320 [ID=0] > > Scalarize node result 0: 0x2348220: v1f32 = extract_subvector 0x23434a0, 0x23466e0 [ID=0] > > Split node result: 0x23469e0: v4f32 = extract_subvector 0x23435a0, 0x23466e0 [ID=0] > > Split node operand: 0x2346be0: v4i1 = setcc 0x23467e0, 0x23469e0, 0x23436a0 [ID=0] > > Split node result: 0x2348620: v2f32 = extract_subvector 0x23435a0, 0x2346de0 [ID=0] > > Widen node result 0: 0x2348820: v2i1 = setcc 0x2346ee0, 0x2348620, 0x23436a0 [ID=0] > > llc: lib/CodeGen/SelectionDAG/LegalizeTypes.h:599: llv...
2016 Jun 29
0
Question about VectorLegalizer::ExpandStore() with v4i1
...VL should make it easier. Without AVX512BW and VL (i.e., all of today's x86 targets), optimal representation of the result of compare is determined by how it is consumed, and it is not a good idea to have such optimization in multiple different places. If the legalizer has to blindly legalize v4i1 without knowing how it is consumed, it is best to look at what happens to v8i1. We can then let the same optimizer work to get the optimal ASM code out in the end, whether vectorization factor is 4 or 8. In the end, I may be agreeing with Rob, but not for the reasons Rob mentioned. One of the...
2011 Mar 10
0
[LLVMdev] Vector select/compare support in LLVM
...p nesting, etc). Saving masks in xmm registers may result in vector-register pressure which will cause spilling of these registers. I agree with you that GP registers are also a precious resource. > I am not sure what is the best way to store masks. > > In my private branch, I added the [v4i1 .. v64i1] types. I also implemented a new type of target lowering: "PACK". This lowering packs vectors of i1s into integer registers. For example, the <4 x i1> type would get packed into the i8 type. I modified LegalizeTypes and LegalizeVectorTypes and added legalization for SETCC,...
2011 Mar 10
2
[LLVMdev] Vector select/compare support in LLVM
...for each loop nesting, etc). Saving masks in xmm registers may result in vector-register pressure which will cause spilling of these registers. I agree with you that GP registers are also a precious resource. I am not sure what is the best way to store masks. In my private branch, I added the [v4i1 .. v64i1] types. I also implemented a new type of target lowering: "PACK". This lowering packs vectors of i1s into integer registers. For example, the <4 x i1> type would get packed into the i8 type. I modified LegalizeTypes and LegalizeVectorTypes and added legalization for SETCC,...
2011 Mar 08
3
[LLVMdev] Vector select/compare support in LLVM
...ose registers. I started by adding several new types to ValueTypes (.td and .h). I added ‘v4i1, v8i1, v16i1 … v64i1’. For x86, I mapped the v8i1 .. v8i64 to general purpose x86 registers. I started playing with a small program, which performed a vector CMP on 4 elements. The legalizer promoted the v4i1 to the next legal pow-of-two type, which was v8i1. I changed WidenVecRes_SETCC and added a new method WidenVecOp_Select to handle the legalization of these types. The widening of the Select and SETCC ops was simple since I only widened the operands which needed widening. I am not sure if this is co...
2011 Mar 10
0
[LLVMdev] Vector select/compare support in LLVM
...g of these registers. I agree with > you that GP registers are also a precious resource. GPRs are more precious than vector registers in my experience. Spilling a vector register isn't that painful. Spilling a GPR holding an address is disastrous. > In my private branch, I added the [v4i1 .. v64i1] types. I also > implemented a new type of target lowering: "PACK". This lowering packs Is PACK in the X86 namespace? It seems a pretty target-specific thing. > I also plan to experiment with promoting <4 x i1> to <4 x i32>. At > this point I can't re...
2011 Mar 14
1
[LLVMdev] Vector select/compare support in LLVM
...g of these registers. I agree with > you that GP registers are also a precious resource. GPRs are more precious than vector registers in my experience. Spilling a vector register isn't that painful. Spilling a GPR holding an address is disastrous. > In my private branch, I added the [v4i1 .. v64i1] types. I also > implemented a new type of target lowering: "PACK". This lowering packs Is PACK in the X86 namespace? It seems a pretty target-specific thing. > I also plan to experiment with promoting <4 x i1> to <4 x i32>. At > this point I can't re...
2011 Mar 09
0
[LLVMdev] Vector select/compare support in LLVM
"Rotem, Nadav" <nadav.rotem at intel.com> writes: > I can think of two ways to represent masks in x86: sparse and > packed. In the sparse method, the masks are kept in <4 x 32bit> > registers, which are mapped to xmm registers. This is the ‘native’ way > of using masks. This argues for the sparse representation, I think. > _Sparse_ After my discussion with
2019 Feb 09
2
how experimental are the llvm.experimental.vector.reduce.* functions?
On Sat, Feb 9, 2019 at 6:25 PM Simon Pilgrim <llvm-dev at redking.me.uk> wrote: > The add/sub (+mul) overflow intrinsics are being updated to support > vectors to match the related add/sub saturation intrinsics. We haven't > updated the docs yet as legalization, vectorization and various minor bits > of plumbing still need to be finished before it can be officially supported