similar to: Enabling scalarized conditional stores in the loop vectorizer


2016 Dec 13
0
Enabling scalarized conditional stores in the loop vectorizer
Conceptually speaking, I think we really ought to enable this. Practically, I'm going to test it on our benchmarks (on x86), and see if we have any regressions - this seems like a fairly major change. Re targets - let's see where we stand w.r.t regressions first. What kind of performance testing have you already run on this? Do you know of specific targets where the cost model is known to
2016 Dec 13
0
Enabling scalarized conditional stores in the loop vectorizer
I added this feature for libquantum (http://llvm.org/viewvc/llvm-project?view=revision&revision=200270) waiting for an update to the cost model modeling the scalarization of stores which you recently added. Assuming no serious regressions this SGTM. > On Dec 13, 2016, at 5:41 AM, Matthew Simpson <mssimpso at codeaurora.org> wrote: > > Hi Michael, > > Thanks for
2016 Dec 13
4
Enabling scalarized conditional stores in the loop vectorizer
Hi Michael, Thanks for testing this on your benchmarks and target. I think the results will help guide the direction we go. I tested the feature with spec2k/2k6 on AArch64/Kryo and saw minor performance swings, aside from a large (30%) improvement in spec2k6/libquantum. The primary loop in that benchmark has a conditional store, so I expected it to benefit. Regarding the cost model, I think the
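For readers unfamiliar with the term, a "conditional store" is a store that executes only when a per-iteration predicate holds. The following is a minimal sketch in C of a libquantum-like loop of that shape. It is not the benchmark's actual source: the struct layout, the field names, and the function name `flip_if_controlled` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical libquantum-like node; the field names are illustrative only. */
struct node {
    uint64_t state;
    float amplitude;
};

/* Flip the `target` bit of every node whose `control` bit is set. */
void flip_if_controlled(struct node *reg, size_t n, int control, int target)
{
    assert(reg != NULL || n == 0);
    const uint64_t control_mask = (uint64_t)1 << control;
    const uint64_t target_mask = (uint64_t)1 << target;

    for (size_t i = 0; i < n; ++i) {
        /* The store below runs only for some iterations, so a vectorizer
           must either mask it or scalarize it behind a per-lane branch. */
        if (reg[i].state & control_mask) {
            reg[i].state ^= target_mask;
        }
    }
}
```

Because `state` sits inside a struct, the loads and stores are strided rather than contiguous. Whether a given compiler vectorizes such a loop depends on the target and its cost model, as the replies in this thread discuss.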
2016 Dec 13
1
Enabling scalarized conditional stores in the loop vectorizer
----- Original Message ----- > From: "Arnold Schwaighofer via llvm-dev" <llvm-dev at lists.llvm.org> > To: "Matthew Simpson" <mssimpso at codeaurora.org> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Tuesday, December 13, 2016 9:17:08 AM > Subject: Re: [llvm-dev] Enabling scalarized conditional stores in the loop vectorizer
2016 Dec 14
2
Enabling scalarized conditional stores in the loop vectorizer
Hi Dibyendu, Are you using a recent compiler? What architecture are you targeting? The target will determine whether the vectorizer thinks vectorization is profitable without having to manually force the vector width. For example, top-of-trunk vectorizes your snippet with "clang -O2 -mllvm -enable-cond-stores-vec" and "--target=aarch64-unknown-linux-gnu". However, with
2016 Dec 14
0
Enabling scalarized conditional stores in the loop vectorizer
Hi Michael- Since you bring up libquantum performance can you let me know what the IR will look like for this small code snippet (libquantum-like) with -enable-cond-stores-vec? I ask because I don’t see vectorization kicking in unless -force-vector-width=<> is specified. Let me know if I am missing something. -Thx struct nodeTy { unsigned int c1; unsigned int c2; unsigned
2016 Dec 14
0
Enabling scalarized conditional stores in the loop vectorizer
Hi Matt- Yeah I used a pretty recent llvm (post 3.9) on an x86-64 ( both AMD and Intel ). -dibyendu From: Matthew Simpson [mailto:mssimpso at codeaurora.org] Sent: Wednesday, December 14, 2016 10:03 PM To: Das, Dibyendu <Dibyendu.Das at amd.com> Cc: Michael Kuperstein <mkuper at google.com>; llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] Enabling scalarized conditional stores in
2016 Dec 15
0
Enabling scalarized conditional stores in the loop vectorizer
If there are no objections, I'll submit a patch for review that sets the default value of "-enable-cond-stores-vec" to "true". Thanks! -- Matt On Wed, Dec 14, 2016 at 12:55 PM, Michael Kuperstein via llvm-dev < llvm-dev at lists.llvm.org> wrote: > I haven't verified what Matt described is what actually happens, but > assuming it is - that is a known
2016 Dec 15
0
Enabling scalarized conditional stores in the loop vectorizer
Thanks Michael and Dibyendu for doing the experimentation and bringing this to our attention. What Matt described may well be the case here. I will take a look at it. Farhana From: Michael Kuperstein [mailto:mkuper at google.com] Sent: Wednesday, December 14, 2016 9:56 AM To: Das, Dibyendu <Dibyendu.Das at amd.com>; Aleen, Farhana A <farhana.a.aleen at intel.com> Cc: Matthew
2016 Dec 14
4
Enabling scalarized conditional stores in the loop vectorizer
I haven't verified what Matt described is what actually happens, but assuming it is - that is a known issue in the x86 cost model. Vectorizing interleaved memory accesses on x86 was, until recently, disabled by default. It's been enabled since r284779, but the cost model is very conservative, and basically assumes we're going to scalarize interleaved ops. I believe Farhana is working
2020 Nov 06
4
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
On 11/6/20 8:49 AM, Roger Ferrer Ibáñez wrote: Hi Sjoerd, Trying to remember how everything fits together here, but could get.active.lane.mask not create the %mask of the VP intrinsics? Or in other words, in the vectoriser, who's producing the %mask and %evl that is consumed by the VP intrinsics? I'm not sure what would be the best way here. I think about the Loop Vectorizer. I imagine
2020 Nov 06
2
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
On 11/6/20 12:39 PM, Sjoerd Meijer wrote: Hello Simon, Thanks for your replies, very useful. And yes, thanks for the example and making the target differences clear: ; Some examples: ; RISC-V V & VE(*): ; %mask = (splat i1 1) ; %evl = min(256, %n - %i) ; MVE/SVE : ; %mask = get.active.lane.mask(%i, %n) ; %evl = call @llvm.vscale() ; AVX: ; %mask = icmp (%i + (seq
2020 Nov 06
0
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
Hello Simon, Thanks for your replies, very useful. And yes, thanks for the example and making the target differences clear: ; Some examples: ; RISC-V V & VE(*): ; %mask = (splat i1 1) ; %evl = min(256, %n - %i) ; MVE/SVE : ; %mask = get.active.lane.mask(%i, %n) ; %evl = call @llvm.vscale() ; AVX: ; %mask = icmp (%i + (seq <8 x i32> 0,1,2,.,)), %n, ; %evl
2017 Oct 17
3
[RFC] Adding Intrinsics for Masked Vector Integer Division and Remainder
Introduction ========== We would like to add support for masked vector signed/unsigned integer division and remainder in the LLVM IR by introducing new target-independent intrinsics. This follows similar work which was done already for masked vector loads and stores - http://lists.llvm.org/pipermail/llvm-dev/2014-October/078059.html. Another relevant reference is the masked scatter/gather
2016 Nov 28
2
Loop Vectorize: Testing cost model driven transformations
Note: This is a continuation of a discussion over at https://reviews.llvm.org/D26869. Hi all, In a discussion over on llvm-commits, we are debating how best to test loop vectorization transformations that are guided by the cost model. The cost model is currently used primarily for determining the vectorization and interleave factors. Both of these parameters are easily overridden with command
2020 Nov 05
2
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
For RISC-V V and VE being explicit about %evl is important for performance & correctness and that is what VP does. The get.active.lane.mask intrinsic is used as a hint for the MVE, SVE backends to use hardware tail-predication (the backends reverse engineer that hint by pattern matching for get.active.lane.mask in the mask parameter of "some" masked intrinsics). IMHO, it's more
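The difference between predicating a loop tail with an explicit vector length and with a lane mask can be modeled in scalar C. This is only a sketch of the two styles, not LLVM IR and not any backend's code generation. `CHUNK` is a hypothetical vector width, and the function names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { CHUNK = 8 }; /* hypothetical vector width, in lanes */

/* EVL style: every lane is unmasked, but only the first `evl` lanes of
   each chunk are processed, with evl = min(CHUNK, n - i). */
void add_evl(int *dst, const int *a, const int *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    for (size_t i = 0; i < n; i += CHUNK) {
        size_t evl = (n - i < CHUNK) ? n - i : CHUNK;
        for (size_t lane = 0; lane < evl; ++lane)
            dst[i + lane] = a[i + lane] + b[i + lane];
    }
}

/* Mask style: every lane of each chunk is visited, and a per-lane
   predicate (i + lane < n) decides whether the lane is active. */
void add_masked(int *dst, const int *a, const int *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    for (size_t i = 0; i < n; i += CHUNK) {
        for (size_t lane = 0; lane < CHUNK; ++lane) {
            int active = (i + lane) < n;
            if (active)
                dst[i + lane] = a[i + lane] + b[i + lane];
        }
    }
}
```

Both functions compute the same result. They differ only in how the final partial chunk is handled, which is the distinction the %evl and %mask parameters express.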
2020 Nov 09
0
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
; RISC-V V & VE(*): ; %mask = get.active.lane.mask(%i, %i) ; %evl = min(256, %n - %i) ; MVE/SVE/AVX : ; %mask = get.active.lane.mask(%i, %n) ; %evl = call @llvm.vscale() For VE, we want to do as much predication as possible through %evl and as little as possible with %mask. This has performance implications on VE and RISC-V - VE does not generate a mask from %evl but %evl is
2017 Oct 26
2
[LV] [ScalarEvolution] Feedback on bug 34965 - After r311849 Loop Vectorizer crashes with "The instruction should be scalarized"
Hi! I uploaded a tentative patch for the following bug in LV (https://bugs.llvm.org/show_bug.cgi?id=34965) but I have some concerns about it. I would appreciate it if someone with more experience in SE/PSE could provide some feedback on the current tentative fix and the alternative solutions described in the comments. Thanks! Diego Caballero, Intel Vectorizer Team
2016 Nov 30
3
Loop Vectorize: Testing cost model driven transformations
That's right. In your example, if the target isn't specified anywhere, an llc invocation would be equivalent to "llc -mtriple=x86_64-unknown-linux-gnu -mcpu=generic". TTI queries (in e.g., CodeGenPrepare) would be based on this. From opt, if the target triple is left unspecified, we will use the "base" TTI implementation (not x86). -- Matt On Wed, Nov 30, 2016 at 2:07
2020 Nov 02
2
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
Hi all, At the Barcelona Supercomputing Center, we have been working on an end-to-end vectorizer using scalable vectors for the RISC-V Vector extension in the context of the EPI Project <https://www.european-processor-initiative.eu/accelerator/>. We earlier shared a demo of our prototype implementation (https://repo.hca.bsc.es/epic/z/9eYRIF, see below) with the folks involved with LLVM