thr3ads.net - similar to: "RFC: Generic IR reductions"

2017 Feb 01

RFC: Generic IR reductions

Hi, Renato.
> So I vote "it depends". :)
My preference is to let the vectorizer emit one kind of "reduce vector into scalar" instead of letting the vectorizer choose one of many different ways. I'm perfectly fine with @llvm.reduce.op.typein.typeout.ordered?(%vector) being that "one kind of reduce vector into scalar". I think we are converging enough at the detail
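
The `ordered?` suffix in the proposed intrinsic name matters mostly for floating point, where addition is not associative. As a rough illustration only (plain Python modelling the arithmetic, not LLVM IR; the helper names `reduce_ordered` and `reduce_tree` are made up for this sketch and do not come from the thread), a strict left-to-right reduction and a pairwise reduction over the same lanes can return different scalars:

```python
def reduce_ordered(lanes):
    """Strict left-to-right sum, modelling an 'ordered' reduction."""
    acc = 0.0
    for x in lanes:
        acc += x
    return acc


def reduce_tree(lanes):
    """Pairwise sum, modelling an 'unordered' reduction.

    Assumes the lane count is a power of two, as for a fixed-width vector.
    """
    lanes = list(lanes)
    while len(lanes) > 1:
        half = len(lanes) // 2
        # Add the upper half onto the lower half, lane by lane.
        lanes = [lanes[i] + lanes[i + half] for i in range(half)]
    return lanes[0]


if __name__ == "__main__":
    v = [1e17, 1.0, -1e17, 1.0]
    # The two orders round differently because 1e17 + 1.0 is not representable.
    print(reduce_ordered(v), reduce_tree(v))
```

For integer lanes both functions agree, which is presumably why an ordering flag would only be needed for some element types.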

2017 Feb 02

RFC: Generic IR reductions

Thanks for the summary, some more comments inline.
On 1 February 2017 at 22:02, Renato Golin <renato.golin at linaro.org> wrote:
> On 1 February 2017 at 21:22, Saito, Hideki <hideki.saito at intel.com> wrote:
>> I think we are converging enough at the detail level, but having a big
>> difference in the opinions at the "vision" level. :)
>
> Vision is

2017 Jan 31

RFC: Generic IR reductions

+cc Simon who's also interested in reductions for the any_true, all_true predicate vectors.
On 31 January 2017 at 20:19, Renato Golin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi Amara,
>
> We also had some discussions on the SVE side of reductions on the main
> SVE thread, but this description is much more detailed than we had
> before.
>
> I don't

2017 Jan 31

RFC: Generic IR reductions

Hi Amara, We also had some discussions on the SVE side of reductions on the main SVE thread, but this description is much more detailed than we had before. I don't want to discuss specifically about SVE, as the spec is not out yet, but I think we can cover a lot of ground until very close to SVE and do the final step when we get there. On 31 January 2017 at 17:27, Amara Emerson via

2017 Feb 01

RFC: Generic IR reductions

> One that we have had multiple times and the usual consensus is: if it can be represented in plain IR, it must. Adding multiple semantics for the same concept, especially stiff ones like builtins, adds complexity to the optimiser. > Regardless of the merits in this case, builtins should only be introduced IFF there is no other way. So first we should discuss adding it to IR with generic

2018 Jan 06

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

Amara,
> I support this direction
Thanks for the support.
> but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early bailouts in some form.
It's not like I have specific application code in

2017 Jan 31

RFC: Generic IR reductions

Hi all,
During the Nov 2016 dev meeting, we had a hackers’ lab session where we discussed some issues about loop idiom recognition, IR representation and cost modelling. I took an action to write up an RFC about introducing reduction intrinsics to LLVM to represent horizontal operations across vectors. Vector reductions have been discussed before, notably here:
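
For readers unfamiliar with the term, a "horizontal" operation combines the lanes of a single vector into one scalar, in contrast to the usual lane-wise ("vertical") operation between two vectors. The following is only a scalar model in Python, not the RFC's design: the `REDUCTIONS` table, its key names and `horizontal_reduce` are hypothetical and were chosen for this sketch.

```python
from functools import reduce
import operator

# Hypothetical table of reduction kinds; the names are not taken from the RFC.
REDUCTIONS = {
    "add": operator.add,
    "mul": operator.mul,
    "and": operator.and_,
    "or": operator.or_,
    "smax": max,
    "smin": min,
}


def vertical_add(a, b):
    """Lane-wise add of two equal-length vectors: the result is a vector."""
    return [x + y for x, y in zip(a, b)]


def horizontal_reduce(kind, lanes):
    """Collapse all lanes of one vector into a single scalar."""
    if not lanes:
        raise ValueError("a vector needs at least one lane")
    return reduce(REDUCTIONS[kind], lanes)


if __name__ == "__main__":
    print(vertical_add([1, 2, 3, 4], [10, 20, 30, 40]))
    print(horizontal_reduce("add", [1, 2, 3, 4]))
```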

2016 May 24

Working on FP SCEV Analysis

Adding support for FP inductions through isInductionPHI() is certainly possible, I have a relatively small local patch that does exactly that for simple fp add-recurrence cases, along with changes to the vectorizer to make it aware of FP inductions. It won't give you the powerful reasoning capabilities of SCEV, but for the B-like cases it should work.
Amara
On 20 May 2016 at 19:31, Adam

2017 Feb 01

RFC: Generic IR reductions

Constant propagation:
%sum = add <N x float> %a, %b
@llvm.reduce(ext <N x double> %sum)
If %a and %b are vectors of constants, then %sum also becomes a vector of constants. At this point you have @llvm.reduce(ext <N x double> %sum) and don't know what kind of reduction you need.
- Elena
-----Original Message-----
From: Renato Golin [mailto:renato.golin at linaro.org]

2017 Feb 01

RFC: Generic IR reductions

> If you mean "patterns may not be matched, and reduction instructions will not be generated, making the code worse", then this is just a matter of making the patterns obvious and the back-ends robust enough to cope with it, no?
The back-end should be as robust as possible, I agree. The problem that I see is in adding another kind of complexity to the optimizer that works between the

2018 Jan 07

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

On 01/05/2018 06:28 PM, Saito, Hideki wrote:
> Amara,
>
>> I support this direction
> Thanks for the support.
>
>> but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early bailouts in

2017 Feb 01

RFC: Generic IR reductions

> My proposal was to have a reduction intrinsic that can infer the type by the predecessors.
> For example:
> @llvm.reduce(ext <N x double> ( add <N x float> %a, %b))
And if we don't have %b? We just want to sum all elements of %a? Something like
@llvm.reduce(ext <N x double> ( add <N x float> %a, zeroinitializer))
Don't we have a problem with constant

2018 Jan 09

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

Thanks, Hal. I plan to post a patch w/o HW Legality early bailout first. That should enable further discussion on where the initial very high cost for "illegal masked load/store/gather/scatter" should be coming from --- like should LoopVectorize provide it? Or should it be provided by TTI? I prefer the latter (TTI) but the first revision of the patch will intentionally do the former

2018 Jan 05

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

All, I'm trying to refactor LoopVectorize such that it has better conformance to VPlan vision going forward (http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now moved to VPlan.h, and I have a patch under review to move LoopVectorizationPlanner class out of LoopVectorize.cpp (https://reviews.llvm.org/D41420). Next thing I'm working on is

2016 May 20

Working on FP SCEV Analysis

Hi Hideki,
I like this summary overall, thanks. More below.
> On May 20, 2016, at 10:04 AM, Saito, Hideki <hideki.saito at intel.com> wrote:
>
> To the best of my experience, handling case B (secondary induction) is must-have, and if I’m not mistaken,
> people aren’t opposed to that.
>
> For me, handling case A (primary induction) is “why not?”, but I certainly

2018 Jan 05

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

> On 5 Jan 2018, at 21:01, Saito, Hideki via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> All,
>
> I'm trying to refactor LoopVectorize such that it has better conformance to VPlan vision going forward
> (http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now
> moved to VPlan.h, and I have a patch under review

2016 May 20

Working on FP SCEV Analysis

To the best of my experience, handling case B (secondary induction) is must-have, and if I’m not mistaken, people aren’t opposed to that. For me, handling case A (primary induction) is “why not?”, but I certainly admit that that can be very naïve thinking coming from lack of good understanding on SCEV and their proper usages. Now, let’s assume we can postpone discussion about case A. What is the

2020 Apr 08

RFC: Promoting experimental reduction intrinsics to first class intrinsics

Hi,
It’s been a few years now since I added some intrinsics for doing vector reductions. We’ve been using them exclusively on AArch64, and I’ve seen some traffic a while ago on list for other targets too. Sander did some work last year to refine the semantics after some discussion. Are we at the point where we can drop the “experimental” from the name? IMO all targets should begin to transition

2017 Feb 03

RFC: Generic IR reductions

Yes, SVE can vectorize early exit loops by using speculative (first-faulting) loads, which essentially give a predicate of the lanes loaded successfully. For uncounted loops with these special loads, the loop predicate tests can be done using a 'ptest' instruction, checking if the last element is active.
Amara
On 3 February 2017 at 10:15, Simon Pilgrim <llvm-dev at redking.me.uk>

2018 Jul 23

[LoopVectorizer] Improving the performance of dot product reduction loop

~Craig
On Mon, Jul 23, 2018 at 4:24 PM Hal Finkel <hfinkel at anl.gov> wrote:
>
> On 07/23/2018 05:22 PM, Craig Topper wrote:
>
> Hello all,
>
> This code https://godbolt.org/g/tTyxpf is a dot product reduction loop
> multiplying sign extended 16-bit values to produce a 32-bit accumulated
> result. The x86 backend is currently not able to optimize it as well as gcc
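
The linked Godbolt code is not reproduced in this excerpt, so the following is only a guess at the loop's shape: a scalar Python model of a dot product that sign-extends 16-bit inputs and accumulates into a 32-bit result. The helper names `sext16`, `wrap32` and `dot_product_i16` are hypothetical, and the two's-complement wraparound of the accumulator is a modelling assumption of this sketch, not something stated in the message.

```python
def sext16(bits):
    """Interpret the low 16 bits of an int as a signed 16-bit value."""
    bits &= 0xFFFF
    return bits - 0x10000 if bits & 0x8000 else bits


def wrap32(value):
    """Wrap an int to signed 32-bit two's complement (assumed behaviour)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def dot_product_i16(a, b):
    """Accumulate sign-extended 16-bit products into a 32-bit sum."""
    acc = 0
    for x, y in zip(a, b):
        # Each product of two 16-bit values fits in 32 bits; only the
        # running sum can exceed the 32-bit range.
        acc = wrap32(acc + sext16(x) * sext16(y))
    return acc


if __name__ == "__main__":
    print(dot_product_i16([1, 2, 3], [4, 5, 6]))
```

Because the 32-bit integer adds are associative under wraparound, a vectorizer is free to keep several partial sums and combine them at the end, which is what makes this loop a reduction candidate.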
