Displaying 20 results from an estimated 50000 matches similar to: "Fwd: How to use CostModel?"
2014 Jul 17
4
[LLVMdev] Using CostModel to estimate machine cycles of each instruction
There is CostModel.cpp since LLVM3, I am wondering if anyone can give me
an concrete example on how to use this pass to estimate cycles used in a
given IR file.
Thank you very much.
Don
2013 Jan 09
2
[LLVMdev] ARM vectorizer cost model
Hi Nadav,
I'm interested in knowing how you'll work up the ARM cost model and how
easy it'd be to split the work.
As far as I can see, LoopVectorizationCostModel is the class that does all
the work, with assistance from the target transform info.
Do you think that updating ARMTTI would be the best course of action now,
and inspect the differences in the CostModel later?
I also
2018 Jan 09
1
RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)
Thanks, Hal.
I plan to post a patch w/o HW Legality early bailout first. That should enable further discussion on where the initial very high cost for "illegal masked load/store/gather/scatter" should be coming from --- like should LoopVectorize provide it? Or should it be provided by TTI? I prefer the latter (TTI) but the first revision of the patch will intentionally do the former
2018 Jan 07
0
RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)
On 01/05/2018 06:28 PM, Saito, Hideki wrote:
> Amara,
>
>> I support this direction
> Thanks for the support.
>
>> but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early >bailouts in
2018 Jan 05
0
RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)
> On 5 Jan 2018, at 21:01, Saito, Hideki via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
> All,
>
> I'm trying to refactor LoopVectorize such that it has better conformance to VPlan vision going forward
> (http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now
> moved to VPlan.h, and I have a patch under review
2018 Jan 05
2
RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)
All,
I'm trying to refactor LoopVectorize such that it has better conformance to VPlan vision going forward
(http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now
moved to VPlan.h, and I have a patch under review to move LoopVectorizationPlanner class out of
LoopVectorize.cpp (https://reviews.llvm.org/D41420).
Next thing I'm working on is
2018 Jan 06
2
RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)
Amara,
>I support this direction
Thanks for the support.
>but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early >bailouts in some form.’
It's not like I have specific application code in
2013 Aug 15
4
[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops
Hi all,
I have investigated the 6X extra compile-time overhead when Polly compiles the simple nestedloop benchmark in LLVM-testsuite. (http://188.40.87.11:8000/db_default/v4/nts/31?compare_to=28&baseline=28). Preliminary results show that such compile-time overhead is resulted by the complicated polly-dependence analysis. However, the key seems to be the polly-prepare pass, which introduces
2015 Jan 14
6
[LLVMdev] Instruction Cost
Hi,
I'm looking for APIs that compute instruction costs, and noticed several of
them.
1. A series of APIs of TargetTransformInfo that compute the cost of
instructions of a particular type (e.g. getArithmeticInstrCost and
getShuffleCost)
2. TargetTransformInfo::getOperationCost
3. CostModel::getInstructionCost::getInstructionCost in
lib/Analysis/CostModel.cpp
Only the first one is used
2016 Oct 06
2
LoopVectorizer -- generating bad and unhandled shufflevector sequence
Hi,
I have experimented with enabling the LoopVectorizer for SystemZ. I have
come across a loop which, when vectorized, seems to have been poorly
generated. In short, there seems to be a completely unnecessary sequence
of shufflevector instructions, that doesn't get optimized away anywhere.
In other words, there is a shuffling so that leads back to the original
vector:
[0 1 2 3
2014 Jan 16
3
[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
On Wed, Jan 15, 2014 at 5:30 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Was the vectorizer successful in unrolling the loop in quantum_sigma_x? I
> wonder if 'size’ is typically high or low.
No. The vectorizer stated that it wasn't going to bother with the loop
because it wasn't profitable. Specifically:
LV: Checking a loop in "quantum_sigma_x"
LV: Found a
2013 Oct 27
3
[LLVMdev] Why is the loop vectorizer not working on my function?
Hi Frank,
On Oct 26, 2013, at 6:29 PM, Frank Winter <fwinter at jlab.org> wrote:
> I would need this to work when calling the vectorizer through
> the function pass manager. Unfortunately I am having the same
> problem there:
I am not sure which function pass manager you are referring here. I assume you create your own (you are not using opt but configure your own pass
2013 Oct 27
0
[LLVMdev] Why is the loop vectorizer not working on my function?
Hi Arnold,
thanks for the detailed setup. Still, I haven't figured out the right
thing to do.
I would need only the native target since all generated code will
execute on the JIT execution machine (right now, the old JIT interface).
There is no need for other targets.
Maybe it would be good to ask specific questions:
How do I get the triple for the native target?
How do I setup the
2013 Feb 04
6
[LLVMdev] Vectorizer using Instruction, not opcodes
On 4 February 2013 18:25, Arnold Schwaighofer <aschwaighofer at apple.com>wrote:
> For cases where this approach breaks really badly we could consider adding
> a specialized api or parameters (like the type of a user/use). But we
> should do so only as a last resort and backed by actual code that would
> benefit from doing so.
>
Very sensible, more or less what I had in
2013 Jan 09
0
[LLVMdev] ARM vectorizer cost model
Hi Renato,
> I'm interested in knowing how you'll work up the ARM cost model and how easy it'd be to split the work.
Yes, I am starting to work on the ARM cost model and I would appreciate any help in the form of: advice, performance measurements, patches, etc.
I tune the cost model by running the cost model analysis pass and I compare the output of the analysis to the output
2013 Oct 26
2
[LLVMdev] Why is the loop vectorizer not working on my function?
Hi Arnold,
adding '-debug-only=loop-vectorize' to the command gives:
LV: Checking a loop in "bar"
LV: Found a loop: L0
LV: Found an induction variable.
LV: Found an unidentified write ptr: %7 = load float** %6
LV: Found an unidentified read ptr: %10 = load float** %9
LV: Found an unidentified read ptr: %13 = load float** %12
LV: We need to do 2 pointer comparisons.
LV: We
2013 Oct 26
2
[LLVMdev] Why is the loop vectorizer not working on my function?
Hi Hal!
I am using the 'x86_64' target. Below the complete module dump and here
the command line:
opt -march=x64-64 -loop-vectorize -debug-only=loop-vectorize -S test.ll
Frank
; ModuleID = 'test.ll'
target datalayout =
"e-p:64:64:64-S128-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f16:16:16-f32:32:32-f64:64:64-f128:128:128-v64:64:64-v128:12
2015 Mar 12
3
[LLVMdev] Question about shouldMergeGEPs in InstructionCombining
I think it would make sense for (1) and (2). I am not sure if (3) is
feasible in instcombine. (I am not too familiar with LoopInfo)
For the Octasic's Opus platform, I modified shouldMergeGEPs in our fork to:
if (GEP.hasAllZeroIndices() && !Src.hasAllZeroIndices() &&
!Src.hasOneUse())
return false;
return Src.hasAllConstantIndices(); // was return false;
2014 Nov 04
3
[LLVMdev] supporting SAD in loop vectorizer
----- Original Message -----
> From: "Renato Golin" <renato.golin at linaro.org>
> To: "Dibyendu Das" <Dibyendu.Das at amd.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Tuesday, November 4, 2014 5:23:30 AM
> Subject: Re: [LLVMdev] supporting SAD in loop vectorizer
>
> On 4 November 2014 11:06, Das, Dibyendu <Dibyendu.Das at amd.com> wrote:
2013 Oct 26
3
[LLVMdev] Why is the loop vectorizer not working on my function?
----- Original Message -----
> >>> LV: The Widest type: 32 bits.
> >>> LV: The Widest register is: 32 bits.
>
> Yep, we don’t pick up the right TTI.
>
> Try -march=x86-64 (or leave it out) you already have this info in the
> triple.
>
> Then it should work (does for me with your example below).
That may depend on what CPU is picks by default; Frank,