thr3ads.net - similar to: "Fwd: How to use CostModel?"

Displaying 20 results from an estimated 50000 matches similar to: "Fwd: How to use CostModel?"

[LLVMdev] Using CostModel to estimate machine cycles of each instruction

2014 Jul 17

[LLVMdev] Using CostModel to estimate machine cycles of each instruction

There is CostModel.cpp since LLVM3, I am wondering if anyone can give me an concrete example on how to use this pass to estimate cycles used in a given IR file. Thank you very much. Don

[LLVMdev] ARM vectorizer cost model

2013 Jan 09

[LLVMdev] ARM vectorizer cost model

Hi Nadav, I'm interested in knowing how you'll work up the ARM cost model and how easy it'd be to split the work. As far as I can see, LoopVectorizationCostModel is the class that does all the work, with assistance from the target transform info. Do you think that updating ARMTTI would be the best course of action now, and inspect the differences in the CostModel later? I also

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

2018 Jan 09

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

Thanks, Hal. I plan to post a patch w/o HW Legality early bailout first. That should enable further discussion on where the initial very high cost for "illegal masked load/store/gather/scatter" should be coming from --- like should LoopVectorize provide it? Or should it be provided by TTI? I prefer the latter (TTI) but the first revision of the patch will intentionally do the former

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

2018 Jan 07

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

On 01/05/2018 06:28 PM, Saito, Hideki wrote: > Amara, > >> I support this direction > Thanks for the support. > >> but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early >bailouts in

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

2018 Jan 05

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

> On 5 Jan 2018, at 21:01, Saito, Hideki via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > All, > > I'm trying to refactor LoopVectorize such that it has better conformance to VPlan vision going forward > (http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now > moved to VPlan.h, and I have a patch under review

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

2018 Jan 05

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

All, I'm trying to refactor LoopVectorize such that it has better conformance to VPlan vision going forward (http://www.llvm.org/docs/Proposals/VectorizationPlan.html). All VP*Recipe class definitions are now moved to VPlan.h, and I have a patch under review to move LoopVectorizationPlanner class out of LoopVectorize.cpp (https://reviews.llvm.org/D41420). Next thing I'm working on is

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

2018 Jan 06

RFC: [LV] any objections in moving isLegalMasked* check from Legal to CostModel? (Cleaning up LoopVectorizationLegality)

Amara, >I support this direction Thanks for the support. >but are there actually any real world workloads where gather/scatter scalarisation would be worth it, on any micro-architecture? If we don’t have examples and the compile time cost is non-negligible then I think we’d still like to keep the early >bailouts in some form.’ It's not like I have specific application code in

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

2013 Aug 15

[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops

Hi all, I have investigated the 6X extra compile-time overhead when Polly compiles the simple nestedloop benchmark in LLVM-testsuite. (http://188.40.87.11:8000/db_default/v4/nts/31?compare_to=28&baseline=28). Preliminary results show that such compile-time overhead is resulted by the complicated polly-dependence analysis. However, the key seems to be the polly-prepare pass, which introduces

[LLVMdev] Instruction Cost

2015 Jan 14

[LLVMdev] Instruction Cost

Hi, I'm looking for APIs that compute instruction costs, and noticed several of them. 1. A series of APIs of TargetTransformInfo that compute the cost of instructions of a particular type (e.g. getArithmeticInstrCost and getShuffleCost) 2. TargetTransformInfo::getOperationCost 3. CostModel::getInstructionCost::getInstructionCost in lib/Analysis/CostModel.cpp Only the first one is used

LoopVectorizer -- generating bad and unhandled shufflevector sequence

2016 Oct 06

LoopVectorizer -- generating bad and unhandled shufflevector sequence

Hi, I have experimented with enabling the LoopVectorizer for SystemZ. I have come across a loop which, when vectorized, seems to have been poorly generated. In short, there seems to be a completely unnecessary sequence of shufflevector instructions, that doesn't get optimized away anywhere. In other words, there is a shuffling so that leads back to the original vector: [0 1 2 3

[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info

2014 Jan 16

[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info

On Wed, Jan 15, 2014 at 5:30 PM, Nadav Rotem <nrotem at apple.com> wrote: > Was the vectorizer successful in unrolling the loop in quantum_sigma_x? I > wonder if 'size’ is typically high or low. No. The vectorizer stated that it wasn't going to bother with the loop because it wasn't profitable. Specifically: LV: Checking a loop in "quantum_sigma_x" LV: Found a

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 27

[LLVMdev] Why is the loop vectorizer not working on my function?

Hi Frank, On Oct 26, 2013, at 6:29 PM, Frank Winter <fwinter at jlab.org> wrote: > I would need this to work when calling the vectorizer through > the function pass manager. Unfortunately I am having the same > problem there: I am not sure which function pass manager you are referring here. I assume you create your own (you are not using opt but configure your own pass

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 27

[LLVMdev] Why is the loop vectorizer not working on my function?

Hi Arnold, thanks for the detailed setup. Still, I haven't figured out the right thing to do. I would need only the native target since all generated code will execute on the JIT execution machine (right now, the old JIT interface). There is no need for other targets. Maybe it would be good to ask specific questions: How do I get the triple for the native target? How do I setup the

[LLVMdev] Vectorizer using Instruction, not opcodes

2013 Feb 04

[LLVMdev] Vectorizer using Instruction, not opcodes

On 4 February 2013 18:25, Arnold Schwaighofer <aschwaighofer at apple.com>wrote: > For cases where this approach breaks really badly we could consider adding > a specialized api or parameters (like the type of a user/use). But we > should do so only as a last resort and backed by actual code that would > benefit from doing so. > Very sensible, more or less what I had in

[LLVMdev] ARM vectorizer cost model

2013 Jan 09

[LLVMdev] ARM vectorizer cost model

Hi Renato, > I'm interested in knowing how you'll work up the ARM cost model and how easy it'd be to split the work. Yes, I am starting to work on the ARM cost model and I would appreciate any help in the form of: advice, performance measurements, patches, etc. I tune the cost model by running the cost model analysis pass and I compare the output of the analysis to the output

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

[LLVMdev] Why is the loop vectorizer not working on my function?

Hi Arnold, adding '-debug-only=loop-vectorize' to the command gives: LV: Checking a loop in "bar" LV: Found a loop: L0 LV: Found an induction variable. LV: Found an unidentified write ptr: %7 = load float** %6 LV: Found an unidentified read ptr: %10 = load float** %9 LV: Found an unidentified read ptr: %13 = load float** %12 LV: We need to do 2 pointer comparisons. LV: We

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

[LLVMdev] Why is the loop vectorizer not working on my function?

Hi Hal! I am using the 'x86_64' target. Below the complete module dump and here the command line: opt -march=x64-64 -loop-vectorize -debug-only=loop-vectorize -S test.ll Frank ; ModuleID = 'test.ll' target datalayout = "e-p:64:64:64-S128-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f16:16:16-f32:32:32-f64:64:64-f128:128:128-v64:64:64-v128:12

[LLVMdev] Question about shouldMergeGEPs in InstructionCombining

2015 Mar 12

[LLVMdev] Question about shouldMergeGEPs in InstructionCombining

I think it would make sense for (1) and (2). I am not sure if (3) is feasible in instcombine. (I am not too familiar with LoopInfo) For the Octasic's Opus platform, I modified shouldMergeGEPs in our fork to: if (GEP.hasAllZeroIndices() && !Src.hasAllZeroIndices() && !Src.hasOneUse()) return false; return Src.hasAllConstantIndices(); // was return false;

[LLVMdev] supporting SAD in loop vectorizer

2014 Nov 04

[LLVMdev] supporting SAD in loop vectorizer

----- Original Message ----- > From: "Renato Golin" <renato.golin at linaro.org> > To: "Dibyendu Das" <Dibyendu.Das at amd.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Tuesday, November 4, 2014 5:23:30 AM > Subject: Re: [LLVMdev] supporting SAD in loop vectorizer > > On 4 November 2014 11:06, Das, Dibyendu <Dibyendu.Das at amd.com> wrote:

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

[LLVMdev] Why is the loop vectorizer not working on my function?

----- Original Message ----- > >>> LV: The Widest type: 32 bits. > >>> LV: The Widest register is: 32 bits. > > Yep, we don’t pick up the right TTI. > > Try -march=x86-64 (or leave it out) you already have this info in the > triple. > > Then it should work (does for me with your example below). That may depend on what CPU is picks by default; Frank,

similar to: Fwd: How to use CostModel?