thr3ads.net - similar to: "Enabling scalarized conditional stores in the loop vectorizer"

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 13

Conceptually speaking, I think we really ought to enable this. Practically, I'm going to test it on our benchmarks (on x86), and see if we have any regressions - this seems like a fairly major change. Re targets - let's see where we stand w.r.t. regressions first. What kind of performance testing have you already run on this? Do you know of specific targets where the cost model is known to

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 13

I added this feature for libquantum (http://llvm.org/viewvc/llvm-project?view=revision&revision=200270) waiting for an update to the cost model modeling the scalarization of stores which you recently added. Assuming no serious regressions this SGTM.

> On Dec 13, 2016, at 5:41 AM, Matthew Simpson <mssimpso at codeaurora.org> wrote:
>
> Hi Michael,
>
> Thanks for

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 13

Hi Michael,

Thanks for testing this on your benchmarks and target. I think the results will help guide the direction we go. I tested the feature with spec2k/2k6 on AArch64/Kryo and saw minor performance swings, aside from a large (30%) improvement in spec2k6/libquantum. The primary loop in that benchmark has a conditional store, so I expected it to benefit. Regarding the cost model, I think the
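
To make "conditional store" concrete, here is a minimal C sketch. It is not the libquantum source and not what the loop vectorizer actually emits; the function names, the VF constant, and the mask/xor loop body are all hypothetical, chosen only to show the shape of the transformation. The first function is a scalar loop whose store is guarded by a test. The second models, by hand, a vectorized loop in which the loads and arithmetic are done for a block of lanes while the store is scalarized into one guarded store per lane.

```c
#include <stddef.h>

/* Scalar reference: a[i] is only written when the test passes. */
void toggle_scalar(unsigned *a, size_t n, unsigned control, unsigned target) {
    for (size_t i = 0; i < n; ++i) {
        if (a[i] & control)
            a[i] ^= target; /* the conditional store */
    }
}

#define VF 4 /* hypothetical vectorization factor */

/* Hand-written model of the vectorized loop (an assumption, for illustration). */
void toggle_scalarized(unsigned *a, size_t n, unsigned control, unsigned target) {
    size_t i = 0;
    for (; i + VF <= n; i += VF) {
        unsigned val[VF];
        int pred[VF];
        /* "Vector" part: load, test and xor all VF lanes unconditionally. */
        for (int l = 0; l < VF; ++l) {
            unsigned x = a[i + l];
            pred[l] = (x & control) != 0;
            val[l] = x ^ target;
        }
        /* Scalarized part: one predicated store per lane. */
        for (int l = 0; l < VF; ++l) {
            if (pred[l])
                a[i + l] = val[l];
        }
    }
    /* Scalar remainder for the last n % VF elements. */
    for (; i < n; ++i) {
        if (a[i] & control)
            a[i] ^= target;
    }
}
```

Both functions leave the array in the same state; the second only differs in how the work is grouped, which is the part a cost model has to price.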

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 13

----- Original Message -----
> From: "Arnold Schwaighofer via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Matthew Simpson" <mssimpso at codeaurora.org>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Tuesday, December 13, 2016 9:17:08 AM
> Subject: Re: [llvm-dev] Enabling scalarized conditional stores in the loop vectorizer

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

Hi Dibyendu,

Are you using a recent compiler? What architecture are you targeting? The target will determine whether the vectorizer thinks vectorization is profitable without having to manually force the vector width. For example, top-of-trunk vectorizes your snippet with "clang -O2 -mllvm -enable-cond-stores-vec" and "--target=aarch64-unknown-linux-gnu". However, with

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

Hi Michael-

Since you bring up libquantum performance, can you let me know what the IR will look like for this small code snippet (libquantum-like) with -enable-cond-stores-vec? I ask because I don’t see vectorization kicking in unless -force-vector-width=<> is specified. Let me know if I am missing something.

-Thx

struct nodeTy { unsigned int c1; unsigned int c2; unsigned

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

Hi Matt-

Yeah I used a pretty recent llvm (post 3.9) on an x86-64 (both AMD and Intel).

-dibyendu

From: Matthew Simpson [mailto:mssimpso at codeaurora.org]
Sent: Wednesday, December 14, 2016 10:03 PM
To: Das, Dibyendu <Dibyendu.Das at amd.com>
Cc: Michael Kuperstein <mkuper at google.com>; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Enabling scalarized conditional stores in

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 15

If there are no objections, I'll submit a patch for review that sets the default value of "-enable-cond-stores-vec" to "true". Thanks!

-- Matt

On Wed, Dec 14, 2016 at 12:55 PM, Michael Kuperstein via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I haven't verified what Matt described is what actually happens, but
> assuming it is - that is a known

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 15

Thanks Michael and Dibyendu for doing the experimentation and bringing this to our attention. It might be the case Matt described here. I will take a look at it.

Farhana

From: Michael Kuperstein [mailto:mkuper at google.com]
Sent: Wednesday, December 14, 2016 9:56 AM
To: Das, Dibyendu <Dibyendu.Das at amd.com>; Aleen, Farhana A <farhana.a.aleen at intel.com>
Cc: Matthew

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

I haven't verified that what Matt described is what actually happens, but assuming it is - that is a known issue in the x86 cost model. Vectorizing interleaved memory accesses on x86 was, until recently, disabled by default. It's been enabled since r284779, but the cost model is very conservative, and basically assumes we're going to scalarize interleaved ops. I believe Farhana is working

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

2020 Nov 06

On 11/6/20 8:49 AM, Roger Ferrer Ibáñez wrote:

Hi Sjoerd,

Trying to remember how everything fits together here, but could get.active.lane.mask not create the %mask of the VP intrinsics? Or in other words, in the vectoriser, who's producing the %mask and %evl that is consumed by the VP intrinsics?

I'm not sure what would be the best way here. I think about the Loop Vectorizer. I imagine

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

2020 Nov 06

On 11/6/20 12:39 PM, Sjoerd Meijer wrote:

Hello Simon,

Thanks for your replies, very useful. And yes, thanks for the example and making the target differences clear:

; Some examples:
; RISC-V V & VE(*):
;   %mask = (splat i1 1)
;   %evl = min(256, %n - %i)
; MVE/SVE :
;   %mask = get.active.lane.mask(%i, %n)
;   %evl = call @llvm.vscale()
; AVX:
;   %mask = icmp (%i + (seq

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

2020 Nov 06

Hello Simon,

Thanks for your replies, very useful. And yes, thanks for the example and making the target differences clear:

; Some examples:
; RISC-V V & VE(*):
;   %mask = (splat i1 1)
;   %evl = min(256, %n - %i)
; MVE/SVE :
;   %mask = get.active.lane.mask(%i, %n)
;   %evl = call @llvm.vscale()
; AVX:
;   %mask = icmp (%i + (seq <8 x i32> 0,1,2,.,)), %n,
;   %evl

[RFC] Adding Intrinsics for Masked Vector Integer Division and Remainder

2017 Oct 17

Introduction
==========

We would like to add support for masked vector signed/unsigned integer division and remainder in the LLVM IR by introducing new target-independent intrinsics. This follows similar work which was done already for masked vector loads and stores - http://lists.llvm.org/pipermail/llvm-dev/2014-October/078059.html. Another relevant reference is the masked scatter/gather

Loop Vectorize: Testing cost model driven transformations

2016 Nov 28

Note: This is a continuation of a discussion over at https://reviews.llvm.org/D26869.

Hi all,

In a discussion over on llvm-commits, we are debating how best to test loop vectorization transformations that are guided by the cost model. The cost model is currently used primarily for determining the vectorization and interleave factors. Both of these parameters are easily overridden with command

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

2020 Nov 05

For RISC-V V and VE being explicit about %evl is important for performance & correctness and that is what VP does. The get.active.lane.mask intrinsic is used as a hint for the MVE, SVE backends to use hardware tail-predication (the backends reverse engineer that hint by pattern matching for get.active.lane.mask in the mask parameter of "some" masked intrinsics). IMHO, it's more
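
The two predication styles contrasted in this thread can be modeled in a few lines of C. This is only a sketch under stated assumptions: the function names and the LANES constant are hypothetical, MAX_VL is taken from the min(256, %n - %i) expression quoted in the thread, and nothing here is generated by or part of LLVM. The first function models a lane mask in which lane l is active when i + l < n; the second models an explicit vector length that caps the remaining trip count.

```c
#include <stdint.h>

#define LANES 8    /* hypothetical fixed lane count for the mask model */
#define MAX_VL 256 /* from the min(256, %n - %i) expression in the thread */

/* Mask style: bit l of the result is set iff lane l is in bounds (i + l < n). */
uint32_t active_lane_mask(uint64_t i, uint64_t n) {
    uint32_t mask = 0;
    for (unsigned l = 0; l < LANES; ++l) {
        if (i + l < n)
            mask |= 1u << l;
    }
    return mask;
}

/* EVL style: number of elements to process this iteration; assumes i <= n. */
uint64_t explicit_vector_length(uint64_t i, uint64_t n) {
    uint64_t remaining = n - i;
    return remaining < MAX_VL ? remaining : MAX_VL;
}
```

With n = 20, the mask model yields all eight lanes active at i = 0 and only the low four at i = 16, while the EVL model simply returns 20 at i = 0. The mask is a per-lane predicate; the EVL is a single count.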

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

2020 Nov 09

; RISC-V V & VE(*):
;   %mask = get.active.lane.mask(%i, %i)
;   %evl = min(256, %n - %i)
; MVE/SVE/AVX :
;   %mask = get.active.lane.mask(%i, %n)
;   %evl = call @llvm.vscale()

For VE, we want to do as much predication as possible through %evl and as little as possible with %mask. This has performance implications on VE and RISC-V - VE does not generate a mask from %evl but %evl is

[LV] [ScalarEvolution] Feedback on bug 34965 - After r311849 Loop Vectorizer crashes with "The instruction should be scalarized"

2017 Oct 26

Hi!

I uploaded a tentative patch for the following bug in LV (https://bugs.llvm.org/show_bug.cgi?id=34965) but I have some concerns about it. I would appreciate it if someone with more experience in SE/PSE could provide some feedback about the current tentative fix and the alternative solutions described in the comments.

Thanks!
Diego Caballero, Intel Vectorizer Team

Loop Vectorize: Testing cost model driven transformations

2016 Nov 30

That's right. In your example, if the target isn't specified anywhere, an llc invocation would be equivalent to "llc -mtriple=x86_64-unknown-linux-gnu -mcpu=generic". TTI queries (e.g., in CodeGenPrepare) would be based on this. From opt, if the target triple is left unspecified, we will use the "base" TTI implementation (not x86).

-- Matt

On Wed, Nov 30, 2016 at 2:07

Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)

2020 Nov 02

Hi all,

At the Barcelona Supercomputing Center, we have been working on an end-to-end vectorizer using scalable vectors for the RISC-V Vector extension in the context of the EPI Project <https://www.european-processor-initiative.eu/accelerator/>. We earlier shared a demo of our prototype implementation (https://repo.hca.bsc.es/epic/z/9eYRIF, see below) with the folks involved with LLVM
