similar to: RFC: New intrinsics masked.expandload and masked.compressstore

Displaying 20 results from an estimated 300 matches similar to: "RFC: New intrinsics masked.expandload and masked.compressstore"

2016 Sep 25
5
RFC: New intrinsics masked.expandload and masked.compressstore
| |Hi Elena, | |Technically speaking, this seems straightforward. | |I wonder, however, how target-independent this is in a practical |sense; will there be an efficient lowering when targeting any other |ISA? I don't want to get into the territory where, because the |vectorizer is supposed to be architecture independent, we need to |add target-independent intrinsics for all
2016 Sep 26
2
RFC: New intrinsics masked.expandload and masked.compressstore
| |How would this work in this case? The result would need to affect the |legality and cost of the memory instruction. From your poster, it looks |like we're talking about loops with constructs like this: | |for (i =0; i < N; i++) { | if (topVal > b[i]) { | *dst = a[i]; | dst++; | } |} | |is this loop vectorizable at all without these constructs? Good
2020 May 18
2
Use Galois field New Instructions (GFNI) to combine affine instructions
Hello everyone, On the last couple of days, I have been experimenting with teaching LLVM how to combine a set of affine instructions into an instruction that uses the GFNI [1] AVX512 extension, especially GF2P8AFFINEQB [2]. While the general idea seems to work, I have some questions about my current implementation (see below). FTR, I have named this transformation AffineCombineExpr (ACE).
2020 May 18
2
Use Galois field New Instructions (GFNI) to combine affine instructions
On 5/18/20 8:24 PM, Craig Topper wrote: > I can tell you that your avx512 issue is that v64i8 gfni instructions also > require avx512bw to be enabled to make v64i8 a supported type. The C > intrinsics handling in the front end know this rule. But since you > generated your own intrinsics you bypassed that. Indeed that's the issue... I was stick with what Intel announces here
2019 Oct 22
4
Complex proposal v3 + roundtable agenda
Ahead of the Wednesday’s roundtable at the developers’ conference, here is version three of the proposal for first-class complex types in LLVM. I was not able to add Krzysztof Parzyszek’s suggestion of a “cunzip” intrinsic returning two vectors as I could not find examples of intrinsics that return two values at the IR level. The Hexagon intrinsics declared to return two values do not actually
2015 Mar 15
2
[LLVMdev] Indexed Load and Store Intrinsics - proposal
hi Hao, I started to upstream and the second patch is stalled under review now. - Elena -----Original Message----- From: Hao Liu [mailto:haoliuts at gmail.com] Sent: Friday, March 13, 2015 05:56 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Indexed Load and Store Intrinsics - proposal Hi Elena, I think such intrinsics are very useful. Do you have any plan to
2014 Dec 18
8
[LLVMdev] Indexed Load and Store Intrinsics - proposal
Hi, Recent Intel architectures AVX-512 and AVX2 provide vector gather and/or scatter instructions. Gather/scatter instructions allow read/write access to multiple memory addresses. The addresses are specified using a base address and a vector of indices. We'd like Vectorizers to tap this functionality, and propose to do so by introducing new intrinsics: VectorValue = @llvm.sindex.load
2016 Jun 25
2
Question about VectorLegalizer::ExpandStore() with v4i1
Hi All, I have a problem with VectorLegalizer::ExpandStore() with v4i1. Let's see a example. * LLVM IR store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27 * SelectionDAG before vector legalization ch = store<ST1[%16](align=4), trunc to v4i1> t0, t128, t32, undef:i64 * SelectionDAG after vector legalization ch = store<ST1[%16](align=4), trunc to i1> t0, t133, t32,
2012 Sep 21
5
[LLVMdev] Question about LLVM NEON intrinsics
Hi all, I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units. Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on following ll code: ; ModuleID = 'vmax.ll' target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32" target triple =
2012 Sep 21
0
[LLVMdev] Question about LLVM NEON intrinsics
On Fri, Sep 21, 2012 at 1:28 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi all, > > I would like to know if LLVM Neon intrinsics are designed to support only 'Legal' types for NEON units. > Using llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on following ll code: > > > ; ModuleID = 'vmax.ll' > target datalayout =
2016 Jun 28
0
Question about VectorLegalizer::ExpandStore() with v4i1
Hi All, Can someone comment below question whether it is wrong or not please? 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>: > Hi All, > > I have a problem with VectorLegalizer::ExpandStore() with v4i1. > > Let's see a example. > > * LLVM IR > store <4 x i1> %edgeMask_for.body1314, <4 x i1>* %27 > > * SelectionDAG before vector
2012 Sep 21
2
[LLVMdev] RE : Question about LLVM NEON intrinsics
Hi Eli, Thanks for the answer, it clarifies the situation for me. Do you know if there is Pass in LLVM that could be adapted to 'legalize' intrinsics calls ? Or shall I define my own intrinsics for non supported types ? Best Regards Seb ________________________________________ De : Eli Friedman [eli.friedman at gmail.com] Date d'envoi : vendredi 21 septembre 2012 11:54 À : Sebastien
2010 Jul 15
2
[LLVMdev] v16i32/v16f32
Hi I find types such as v16i32, v16f32 missing in my llvm version 2.7 So does the following page not list them http://llvm.org/docs/doxygen/html/classllvm_1_1MVT.html is that intentional for any reason or can I just add them ? thanks shrey
2016 Jun 28
2
Question about VectorLegalizer::ExpandStore() with v4i1
On Tue, Jun 28, 2016 at 2:45 AM, jingu kang via llvm-dev <llvm-dev at lists.llvm.org> wrote: > Hi All, > > Can someone comment below question whether it is wrong or not please? > > 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>: >> Hi All, >> >> I have a problem with VectorLegalizer::ExpandStore() with v4i1. >> >> Let's
2008 Apr 01
3
Xen without APIC
Hi, I am trying to boot Xen on top of a simulator. I am having problems with the APIC module in my simulator, not related to Xen. So, I want to boot Xen without APIC support. Is there a way to disable APIC support in Xen? I am fine even Xen is restricted to uni-processor environment. Thanks, Bhaskar _______________________________________________ Xen-users mailing list
2012 Sep 21
0
[LLVMdev] Question about LLVM NEON intrinsics
On Sep 21, 2012, at 2:58 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi Eli, > > Thanks for the answer, it clarifies the situation for me. Do you know if there is Pass in LLVM that could be adapted to 'legalize' intrinsics calls ? > Or shall I define my own intrinsics for non supported types ? You should never generate these sorts of intrinsics with
2012 Sep 21
0
[LLVMdev] Question about LLVM NEON intrinsics
On 21 September 2012 09:28, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > declare <16 x float> @llvm.arm.neon.vmaxs.v16f32(<16 x float>, <16 x float>) nounwind readnone > > llc fails with following message: > > SplitVectorResult #0: 0x2258350: v16f32 = llvm.arm.neon.vmaxs 0x2258250, 0x2258050, 0x2258150 [ORD=3] [ID=0] > > LLVM ERROR: Do not
2019 Aug 29
2
Complex proposal v2
All, Here is the second revision of the proposal for a complex type in LLVM. It clarifies a few things that came up during discussion and adds additional operations for complex types. -David Proposal to Support Complex Operations in LLVM ---------------------------------------------- Revision History v1 - Initial proposal [1] v2 - This proposal - Added complex of
2010 Jul 15
0
[LLVMdev] v16i32/v16f32
On Wed, Jul 14, 2010 at 6:48 PM, shreyas krishnan <shreyas76 at gmail.com> wrote: > Hi >   I find  types such as v16i32, v16f32  missing in my llvm version 2.7 > > So does the following page not list them > http://llvm.org/docs/doxygen/html/classllvm_1_1MVT.html > > is that intentional for any reason or can I just add them  ? As far as I know, they're not there
2017 Aug 06
2
VBROADCAST Implementation Issues
i want to implement gather for v64i32. i wrote following code. def GATHER_256B : I<0x68, MRMSrcMem, (outs VR_2048:$dst), (ins i2048mem:$src), "GATHER_256B\t{$src, $dst|$dst, $src}", [(set VR_2048:$dst, (v64i32 (masked_gather addr:$src)))], IIC_MOV_MEM>, TA; def: Pat<(v64f32 (masked_gather addr:$src)), (GATHER_256B