Displaying 20 results from an estimated 35 matches for "scalarised".
2013 Nov 15
2
[LLVMdev] [PATCH] Add a Scalarize pass
...decision about what to scalarise and what not to scalarise, without
any help from llvmpipe. The problem I'm trying to solve is that
codegen is too late to get the benefit of other IR optimisations.
So in my case I do not want to _change_ the decision about which
vectors get scalarised and how. I just want to do it earlier.
It would be a shame if that meant that llvmpipe had to duplicate
exactly the decisions that codegen makes wrt scalarisation,
since codegen can easily make those decisions available through
TargetTransformInfo.
That's why I thought using T...
2013 Nov 15
0
[LLVMdev] [PATCH] Add a Scalarize pass
...t to scalarise and what not to scalarise, without
> any help from llvmpipe. The problem I'm trying to solve is that
> codegen is too late to get the benefit of other IR optimisations.
>
> So in my case I do not want to _change_ the decision about which
> vectors get scalarised and how. I just want to do it earlier.
> It would be a shame if that meant that llvmpipe had to duplicate
> exactly the decisions that codegen makes wrt scalarisation,
> since codegen can easily make those decisions available through
> TargetTransformInfo.
>
> That...
2013 Nov 14
2
[LLVMdev] [PATCH] Add a Scalarize pass
Richard Sandiford <rsandifo at linux.vnet.ibm.com> writes:
> Are you worried that adding it to PMB will increase compile time?
> The pass exits very early for any target that doesn't opt-in to doing
> scalarisation at the IR level, without even looking at the function.
As an alternative, adding Scalarizer and InstCombine passes to
SystemZPassConfig::addIRPasses() would probably
2013 Nov 14
0
[LLVMdev] [PATCH] Add a Scalarize pass
On Nov 14, 2013, at 2:32 PM, Richard Sandiford <rsandifo at linux.vnet.ibm.com> wrote:
> Richard Sandiford <rsandifo at linux.vnet.ibm.com> writes:
>> Are you worried that adding it to PMB will increase compile time?
>> The pass exits very early for any target that doesn't opt-in to doing
>> scalarisation at the IR level, without even looking at the function.
2016 Feb 09
2
Vectorization with fast-math on irregular ISA sub-sets
...enablement work, so be it.
There might be a slight issue with legacy IR bitcode, but if that's going to be a problem in practice, we can design some scheme to let auto-upgrade do the right thing.
>
> If the scalarisation is in IR, then any NEON intrinsic in C code will
> get wrongly scalarised. Builtins can be lowered in either IR
> operations or builtins, and the back-end has no way of knowing the
> origin.
>
> If the scalarization is lower down, then we risk also changing inline
> ASM snippets, which is even worse.
Yes, but we don't do that, so that's not a pra...
2016 Feb 09
2
Vectorization with fast-math on irregular ISA sub-sets
----- Original Message -----
> From: "James Molloy" <James.Molloy at arm.com>
> To: "Renato Golin" <renato.golin at linaro.org>
> Cc: "Nadav Rotem" <nrotem at apple.com>, "Arnold Schwaighofer" <aschwaighofer at apple.com>, "Hal Finkel"
> <hfinkel at anl.gov>, "LLVM Dev" <llvm-dev at
2013 Nov 14
2
[LLVMdev] [PATCH] Add a Scalarize pass
Hi Richard,
Thanks for working on this. Comments below.
> I don't understand the basis for the last statement though. Do you mean
> that you think most cases produce better code if scalarised at the SD stage
> rather than at the IR level? Could you give an example?
You presented an example that shows that scalarizing vectors allows further optimizations. But I don’t think that this example represents the kind of problems that we run into in general C++ code. We currently consider...
2013 Oct 25
3
[LLVMdev] Is there pass to break down <4 x float> to scalars
...posed.
>
If I got it right, this may not be necessary, or it may even be harmful.
Say you decide that <4 x i32> vectors should be left alone, so that your
pass only scalarises the others. But when the vectorizer passes again (to
try and use CPU vector instructions), it might not match the scalarised
version with the vector, and you end up with data movement between scalar
and vector pipelines, which normally slows down CPUs (at least in ARM's
case). Also, problematic cases like <5 x i32> could be better split into
3+2 pairs, rather than 4+1.
If you scalarise everything, then the vec...
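The 3+2 versus 4+1 point in the message above can be illustrated with a small sketch. It assumes the goal is to use the fewest parts that fit a maximum legal width and to keep those parts as even as possible; `splitBalanced` and its parameters are hypothetical names for illustration, not an LLVM API.

```cpp
#include <vector>

// Hypothetical helper: split NumLanes vector lanes into the fewest parts
// that each fit in MaxWidth lanes, keeping the part sizes as even as possible.
std::vector<unsigned> splitBalanced(unsigned NumLanes, unsigned MaxWidth) {
  std::vector<unsigned> Parts;
  if (NumLanes == 0 || MaxWidth == 0)
    return Parts; // nothing to split, or no legal width

  // Fewest parts needed: ceil(NumLanes / MaxWidth).
  unsigned NumParts = (NumLanes + MaxWidth - 1) / MaxWidth;

  // Spread the lanes evenly; the first Extra parts get one additional lane.
  unsigned Base = NumLanes / NumParts;
  unsigned Extra = NumLanes % NumParts;
  for (unsigned I = 0; I < NumParts; ++I)
    Parts.push_back(Base + (I < Extra ? 1 : 0));
  return Parts;
}
```

With `NumLanes = 5` and `MaxWidth = 4` this yields parts of 3 and 2 lanes rather than 4 and 1. Whether the even split is actually cheaper depends on the target, which is the kind of decision the thread suggests leaving to target hooks.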
2013 Nov 14
0
[LLVMdev] [PATCH] Add a Scalarize pass
Nadav Rotem <nrotem at apple.com> writes:
>> I don't understand the basis for the last statement though. Do you mean
>> that you think most cases produce better code if scalarised at the SD stage
>> rather than at the IR level? Could you give an example?
>
> You presented an example that shows that scalarizing vectors allows
> further optimizations. But I don’t think that this example represents
> the kind of problems that we run into in general C++ code....
2013 Oct 25
0
[LLVMdev] Is there pass to break down <4 x float> to scalars
...but I found in the llvmpipe case
that this made things worse with TBAA, because DAGCombiner::GatherAllAliases
has some fairly strict limits. So I disabled that by default; use
-decompose-vector-load-store to reenable.
The main motivation for z was instead to get InstCombine to rewrite
things like scalarised selects.
I haven't submitted it yet because it's less of a win than the TBAA
DAGCombiner patch I posted, so I didn't want to distract from that.
It would also need some TargetTransformInfo hooks to decide which
vectors should be decomposed.
Thanks,
Richard
-------------- next part --...
2013 Oct 25
3
[LLVMdev] Is there pass to break down <4 x float> to scalars
Hi, LLVM community,
I am writing some code by hand in LLVM IR. For simplicity, I wrote it using <4
x float>. Now I have found that some of the element stores are useless.
For example, if I store {0.0, 1.0, 2.0, 3.0} to a <4 x float> %a, maybe
only %a.xy is live in my program. Our target doesn't have SIMD
instructions, which means we have to lower each vector to many scalar
instructions. I found
2016 Feb 11
4
Vectorization with fast-math on irregular ISA sub-sets
..., 2016 8:30:50 AM
> Subject: Re: Vectorization with fast-math on irregular ISA sub-sets
>
> On 9 February 2016 at 20:29, Hal Finkel <hfinkel at anl.gov> wrote:
> >> If the scalarisation is in IR, then any NEON intrinsic in C code
> >> will
> >> get wrongly scalarised. Builtins can be lowered in either IR
> >> operations or builtins, and the back-end has no way of knowing the
> >> origin.
> >>
> >> If the scalarization is lower down, then we risk also changing
> >> inline
> >> ASM snippets, which is even wors...
2020 Nov 05
4
[Proposal] Introducing the concept of invalid costs to the IR cost model
Hi,
I'd like to propose a change to our cost interfaces so that instead of returning
an unsigned value from functions like getInstructionCost, getUserCost, etc., we
instead return a wrapper class that encodes an integer cost along with extra
state. The extra state can be used to express:
1. A cost as infinitely expensive in order to prevent certain optimisations
taking place. For example,
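A minimal sketch of the kind of wrapper the proposal describes, assuming a single extra "invalid" state that is sticky under addition and always compares as more expensive than any valid cost. The class name `SketchCost` and all of its members are hypothetical and are not taken from the proposal or from LLVM's headers.

```cpp
#include <cstdint>

// Hypothetical cost wrapper: an integer cost plus a validity state.
class SketchCost {
public:
  enum class State { Valid, Invalid };

  // Implicit construction from an integer gives a valid cost.
  SketchCost(std::int64_t Value = 0) : Value(Value), St(State::Valid) {}

  // An invalid cost stands for "infinitely expensive / cannot be costed".
  static SketchCost invalid() {
    SketchCost C;
    C.St = State::Invalid;
    return C;
  }

  bool isValid() const { return St == State::Valid; }

  // Only meaningful when isValid() is true.
  std::int64_t value() const { return Value; }

  // Invalid is sticky: adding an invalid cost makes the sum invalid.
  SketchCost &operator+=(const SketchCost &RHS) {
    if (!RHS.isValid())
      St = State::Invalid;
    Value += RHS.Value;
    return *this;
  }

  friend SketchCost operator+(SketchCost LHS, const SketchCost &RHS) {
    LHS += RHS;
    return LHS;
  }

  // Any valid cost orders before (is cheaper than) any invalid cost.
  friend bool operator<(const SketchCost &L, const SketchCost &R) {
    if (L.isValid() != R.isValid())
      return L.isValid();
    return L.Value < R.Value;
  }

private:
  std::int64_t Value;
  State St;
};
```

A caller comparing two candidate transformations with `operator<` would then never prefer one whose cost is invalid, without needing a sentinel integer value.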
2013 Oct 25
0
[LLVMdev] Is there pass to break down <4 x float> to scalars
...ot it right, this may not be necessary, or it may even be harmful.
>
> Say you decide that <4 x i32> vectors should be left alone, so that your
> pass only scalarises the others. But when the vectorizer passes again (to
> try and use CPU vector instructions), it might not match the scalarised
> version with the vector, and you end up with data movement between scalar
> and vector pipelines, which normally slows down CPUs (at least in ARM's
> case). Also, problematic cases like <5 x i32> could be better split into
> 3+2 pairs, rather than 4+1.
>
> If you scala...
2013 Nov 13
2
[LLVMdev] [PATCH] Add a Scalarize pass
Hi Richard,
Thanks for working on this. We should probably move this discussion to llvm-dev because it is not strictly related to the patch review anymore. The code below is not representative of general C/C++ code. Usually only domain-specific languages (such as OpenCL) contain vector instructions. The LLVM pass manager configuration (pass manager builder) is designed for C/C++ compilers, not
2013 Nov 13
0
[LLVMdev] [PATCH] Add a Scalarize pass
...need to come up with an
> optimization pipe and works for most programs that we care about. I
> still think that scalarizing in SD is a reasonable solution for c/c++.
I don't understand the basis for the last statement though. Do you mean
that you think most cases produce better code if scalarised at the SD stage
rather than at the IR level? Could you give an example?
If the idea is to have a clean separation of concerns between the front end
and LLVM, then it seems like there are two obvious approaches:
(a) make it the front end's responsibility to only generate vector widths
tha...
2018 Jan 06
2
RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)
Amara,
>I support this direction
Thanks for the support.
>but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early bailouts in some form.
It's not like I have specific application code in
2014 Oct 24
2
[LLVMdev] Adding masked vector load and store intrinsics
...ust cast to it from whatever the deal pointer type is.
-Hal
>
>
> Also, given that the types of the vectors matter, it seems like we’re
> going to need TTI anyway whenever we want to generate one of these,
> or else we’ll end up generating an illegal version which has to be
> scalarised in the backend.
>
>
> Thanks,
> Pete
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
--
Hal Finkel
Assistant Computat...
2018 Jan 05
0
RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)
> On 5 Jan 2018, at 21:01, Saito, Hideki via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
> All,
>
> I'm trying to refactor LoopVectorize such that it has better conformance to VPlan vision going forward
> (http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now
> moved to VPlan.h, and I have a patch under review
2018 Jan 07
0
RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)
On 01/05/2018 06:28 PM, Saito, Hideki wrote:
> Amara,
>
>> I support this direction
> Thanks for the support.
>
>> but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early bailouts in