thr3ads.net - similar to: "Question about VectorLegalizer::ExpandStore() with v4i1"

Displaying 20 results from an estimated 900 matches similar to: "Question about VectorLegalizer::ExpandStore() with v4i1"

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 28

Question about VectorLegalizer::ExpandStore() with v4i1

On Tue, Jun 28, 2016 at 2:45 AM, jingu kang via llvm-dev <llvm-dev at lists.llvm.org> wrote: > Hi All, > > Can someone comment below question whether it is wrong or not please? > > 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>: >> Hi All, >> >> I have a problem with VectorLegalizer::ExpandStore() with v4i1. >> >> Let's

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 28

Question about VectorLegalizer::ExpandStore() with v4i1

Hi All, Can someone comment below question whether it is wrong or not please? 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>: > Hi All, > > I have a problem with VectorLegalizer::ExpandStore() with v4i1. > > Let's see a example. > > * LLVM IR > store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27 > > * SelectionDAG before vector

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 25

Question about VectorLegalizer::ExpandStore() with v4i1

Hi All, I have a problem with VectorLegalizer::ExpandStore() with v4i1. Let's see a example. * LLVM IR store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27 * SelectionDAG before vector legalization ch = store<ST1[%16](align=4), trunc to v4i1> t0, t128, t32, undef:i64 * SelectionDAG after vector legalization ch = store<ST1[%16](align=4), trunc to i1> t0, t133, t32,

[LLVMdev] Boolean floats and v4i1

2012 Jun 25

[LLVMdev] Boolean floats and v4i1

You could set the AND operation action to custom. The problem is that you would have no way of knowing if the type 'v4i64' originated from v4i1 or v4i64. And I don't think that you can use SimplifyDemandedBits (to discover if only the high bit is set) during the legalizer because the DAG is in a strange state, but I could be mistaken on this one. Okay, here is another idea.

[LLVMdev] branch on vector compare?

2012 Sep 05

[LLVMdev] branch on vector compare?

Am 05.09.2012 00:24, schrieb Stephen: > Roland Scheidegger <sroland <at> vmware.com> writes: >> This looks quite similar to something I filed a bug on (12312). Michael >> Liao submitted fixes for this, so I think >> if you change it to >> %16 = fcmp ogt <4 x float> %15, %cr >> %17 = sext <4 x i1> %16 to <4 x i32> >> %18 =

[LLVMdev] Boolean floats and v4i1

2012 Jun 25

[LLVMdev] Boolean floats and v4i1

Hi Hal, Why do say that the type v4i64 is broken ? You can specify that this type has no legal operations and the codegen will lower ("legalize") them to something that works on your platform. Nadav -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: Monday, June 25, 2012 06:28 To: LLVM Developers

[LLVMdev] Boolean floats and v4i1

2012 Jun 25

[LLVMdev] Boolean floats and v4i1

Hello, I'm working on support for the SIMD instruction set on our new BG/Q supercomputer. This instruction set is v4f64 (with the exception of some int <-> fp conversions, floating-point only). The vectorized comparisons, logical operations and selects also exclusively use floating-point inputs. For those inputs that are logically vectors of booleans the system uses the following

[LLVMdev] Boolean floats and v4i1

2012 Jun 25

[LLVMdev] Boolean floats and v4i1

On Mon, 25 Jun 2012 05:45:57 +0000 "Rotem, Nadav" <nadav.rotem at intel.com> wrote: > Hi Hal, > > Why do say that the type v4i64 is broken ? You can specify that this > type has no legal operations and the codegen will lower ("legalize") > them to something that works on your platform. For example, the AND operation is really only an AND operation

Use Galois field New Instructions (GFNI) to combine affine instructions

2020 May 18

Use Galois field New Instructions (GFNI) to combine affine instructions

On 5/18/20 8:24 PM, Craig Topper wrote: > I can tell you that your avx512 issue is that v64i8 gfni instructions also > require avx512bw to be enabled to make v64i8 a supported type. The C > intrinsics handling in the front end know this rule. But since you > generated your own intrinsics you bypassed that. Indeed that's the issue... I was stick with what Intel announces here

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 08

[LLVMdev] Vector select/compare support in LLVM

Hello, I started working on adding vector support for the SELECT and CMP instructions in the codegen (bugs: 3384, 1784, 2314). Currently, the codegen scalarizes vector CMPs into multiple scalar CMPs. It is easy to add similar scalarization support to the SELECT instruction. However, using multiple scalar operations is slower than using vector operations. In LLVM, vector-compare operations

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 10

[LLVMdev] Vector select/compare support in LLVM

Hey, I am currently forced to create the BLENDVPS intrinsic as an external call (via Intrinsic::x86_sse41_blendvps) which has the following signature (from IntrinsicsX86.td): def int_x86_sse41_blendvps : GCCBuiltin<"__builtin_ia32_blendvps">, Intrinsic<[llvm_v4f32_ty],[llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty],[IntrNoMem]> Thus, it expects the mask (first operand if

avx512 JIT backend generates wrong code on <4 x float>

2016 Jun 29

avx512 JIT backend generates wrong code on <4 x float>

Hi Frank, I recommend trying trunk LLVM. AVX-512 development has been very active recently. -Hal ----- Original Message ----- > From: "Frank Winter via llvm-dev" <llvm-dev at lists.llvm.org> > To: "LLVM Dev" <llvm-dev at lists.llvm.org> > Sent: Wednesday, June 29, 2016 2:41:39 PM > Subject: [llvm-dev] avx512 JIT backend generates wrong code on <4

avx512 JIT backend generates wrong code on <4 x float>

2016 Jun 30

avx512 JIT backend generates wrong code on <4 x float>

Hi Hal! Thanks, but unfortunately it didn't help. The exact same assembler instructions are generated for both 3.8 (yesterday) and trunk (from today). So, this really looks like a bug. Best, Frank On 06/29/2016 03:48 PM, Hal Finkel wrote: > Hi Frank, > > I recommend trying trunk LLVM. AVX-512 development has been very active recently. > > -Hal > > ----- Original

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 10

[LLVMdev] Vector select/compare support in LLVM

After I implemented a new type of legalization (the packing of i1 vectors), I found that x86 does not have a way to load packed masks into SSE registers. So, I guess that legalizing of <4 x i1> to <4 x i32> is the way to go. Cheers, Nadav -----Original Message----- From: Rotem, Nadav Sent: Thursday, March 10, 2011 11:04 To: 'David A. Greene' Cc: llvmdev at cs.uiuc.edu

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 10

[LLVMdev] Vector select/compare support in LLVM

Hi David, The MOVMSKPS instruction is cheap (2 cycles). Not to be confused with VMASKMOV, the AVX masked move, which is expensive. One of the arguments for packing masks is that it reduces vector-registers pressure. Auto-vectorizing compilers maintain multiple masks for different execution paths (for each loop nesting, etc). Saving masks in xmm registers may result in vector-register

[LLVMdev] branch on vector compare?

2012 Sep 04

[LLVMdev] branch on vector compare?

Roland Scheidegger <sroland <at> vmware.com> writes: > This looks quite similar to something I filed a bug on (12312). Michael > Liao submitted fixes for this, so I think > if you change it to > %16 = fcmp ogt <4 x float> %15, %cr > %17 = sext <4 x i1> %16 to <4 x i32> > %18 = bitcast <4 x i32> %17 to i128 > %19 = icmp ne i128 %18, 0

[LLVMdev] Vector splitting vs widening

2013 Mar 06

[LLVMdev] Vector splitting vs widening

Hi Hal, > The problem is essentially the following: there are no vector f32 types (yet), so the <v4i1> = setcc <v4f32> node needs to be split and scalarized. The operand splitting seems to start correctly, but because <v4i1> is itself a legal type, after splitting the node into <v2i1> = setcc <v2f32>, the process becomes confused. The operands are again split

[LLVMdev] Vector splitting vs widening

2013 Mar 09

[LLVMdev] Vector splitting vs widening

----- Original Message ----- > From: "Nadav Rotem" <nrotem at apple.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu> > Sent: Wednesday, March 6, 2013 3:40:50 PM > Subject: Re: [LLVMdev] Vector splitting vs widening > > Hi Hal, > > > > > > > The

Issue with libguestfs-test-tool on a guest hosted on VMWare ESXi

2018 Mar 23

Issue with libguestfs-test-tool on a guest hosted on VMWare ESXi

I am using a debian 9 guest, hosted on a ESXi platform with nested virtualisation enabled. On this debian 9 guest when I run libguesfs-test-tool, it fails with an error: "qemu-system-x86_64: /build/qemu-DqynNa/qemu-2.8+dfsg/target-i386/kvm.c:1805: kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed." Instead when I use a wrapper script and hook it with the env

RFC: Generic IR reductions

2017 Feb 01

RFC: Generic IR reductions

Hi All, Renato wrote: >As they say: if it's not broken, don't fix it. >Let's talk about the reductions that AVX512 and SVE can't handle with IR semantics, but let's not change the current IR semantics for no reason. Main problem for SVE: We can't write straight-line IR instruction sequence for reduction last value compute, without knowing #elements in vector to

similar to: Question about VectorLegalizer::ExpandStore() with v4i1