thr3ads.net - similar to: "[LLVMdev] LLVM Inliner"

2010 Nov 24

[LLVMdev] LLVM Inliner

On Tue, Nov 23, 2010 at 8:07 PM, Xinliang David Li <xinliangli at gmail.com> wrote: > Hi, I browsed the LLVM inliner implementation, and it seems there is room > for improvement. (I have not read it too carefully, so correct me if what I > observed is wrong). > First the good side of the inliner -- the function level summary and inline > cost estimation is more elaborate and

2010 Nov 24

[LLVMdev] LLVM Inliner

Xinliang David Li wrote: > Hi, I browsed the LLVM inliner implementation, and it seems there is > room for improvement. (I have not read it too carefully, so correct me > if what I observed is wrong). > > First the good side of the inliner -- the function level summary and > inline cost estimation is more elaborate and complete than gcc. For > instance, it considers callsite

2010 Nov 28

[LLVMdev] LLVM Inliner

On Nov 23, 2010, at 5:07 PM, Xinliang David Li wrote: > Hi, I browsed the LLVM inliner implementation, and it seems there is room for improvement. (I have not read it too carefully, so correct me if what I observed is wrong). > > First the good side of the inliner -- the function level summary and inline cost estimation is more elaborate and complete than gcc. For instance, it considers

2010 Nov 24

[LLVMdev] LLVM Inliner

On Wed, Nov 24, 2010 at 12:37 PM, Nick Lewycky <nicholas at mxc.ca> wrote: > Xinliang David Li wrote: > >> Hi, I browsed the LLVM inliner implementation, and it seems there is >> room for improvement. (I have not read it too carefully, so correct me >> if what I observed is wrong). >> >> First the good side of the inliner -- the function level summary

2010 Nov 29

[LLVMdev] LLVM Inliner

On Sun, Nov 28, 2010 at 2:37 PM, Chris Lattner <clattner at apple.com> wrote: > On Nov 23, 2010, at 5:07 PM, Xinliang David Li wrote: > > Hi, I browsed the LLVM inliner implementation, and it seems there is room > for improvement. (I have not read it too carefully, so correct me if what I > observed is wrong). > > > > First the good side of the inliner -- the

2010 Jun 08

[LLVMdev] the PartialSpecialization pass (was Re: Is there a "callback optimization"?)

Good evening, Kenneth. Thank you for applying my naive code (and rewriting it better) and for filing the issue as http://llvm.org/bugs/show_bug.cgi?id=7304. I checked r105528 this morning. I think the pass still needs to be cleaned up and rewritten. Here are my two proposals for enhancement. 1) Separate specialization (and rewriting of callsites) into another module. It would be better if the new module were

2015 Oct 22

RFC: Inlining report

RFC: Inlining Report Motivation Making good inlining choices while optimizing an application is often key to achieving optimal performance. While the compiler's default inlining heuristics sometimes provide great out-of-box results, optimal performance is sometimes achieved only after varying the settings of certain compiler options related to inlining or adding "always_inline" or

2017 Dec 13

RFC: Synthetic function entry counts

Functions in LLVM IR have a function_entry_count metadata that is attached in PGO compilation. By using the entry count together with the block frequency info, the compiler computes the profile count of call instructions based on which the hotness/coldness of callsites can be determined. Experiments have shown that using a higher threshold for hot callsites results in improved runtime performance
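As a rough illustration of the snippet above, the following sketch scales a function's entry count by a block's frequency relative to the entry block to estimate how often a call site executes. The function names, the scaling rule as written here, and the hot threshold are assumptions made for illustration; they are not taken from the thread or from LLVM's source.

```python
def callsite_count(entry_count: int, block_freq: int, entry_freq: int) -> int:
    """Estimate a call site's profile count (illustrative assumption).

    entry_count: the function's entry count (function_entry_count metadata).
    block_freq:  relative frequency of the block containing the call.
    entry_freq:  relative frequency of the function's entry block.
    """
    if entry_freq <= 0:
        raise ValueError("entry block frequency must be positive")
    # Scale the entry count by how often the block runs per function entry.
    return entry_count * block_freq // entry_freq


def is_hot_callsite(count: int, hot_threshold: int) -> bool:
    """Classify a call site as hot; hot_threshold is a hypothetical cutoff."""
    return count >= hot_threshold


# Hypothetical numbers: a call inside a loop body that runs 3x per entry.
count = callsite_count(entry_count=1000, block_freq=24, entry_freq=8)
hot = is_hot_callsite(count, hot_threshold=2000)
```

With these made-up inputs the call site is estimated to run three times per function entry, so it would be treated as hot under the hypothetical cutoff.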

2017 Dec 15

RFC: Synthetic function entry counts

On Fri, Dec 15, 2017 at 12:22 AM, Sean Silva <chisophugis at gmail.com> wrote: > IIUC, this proposal is just saying that we should infer a static profile > for entry counts just like we do for branch probabilities. In the case of > entry counts, we do not hide that information behind an analysis like BPI, > so currently just annotating synthetic PGO entry counts is a simple >

2017 Dec 15

RFC: Synthetic function entry counts

On Fri, Dec 15, 2017 at 11:13 AM, Sean Silva <chisophugis at gmail.com> wrote: > > > On Fri, Dec 15, 2017 at 10:22 AM, Easwaran Raman <eraman at google.com> > wrote: > >> >> >> On Fri, Dec 15, 2017 at 12:22 AM, Sean Silva <chisophugis at gmail.com> >> wrote: >> >>> IIUC, this proposal is just saying that we should infer a

2017 Dec 15

RFC: Synthetic function entry counts

On Fri, Dec 15, 2017 at 11:56 AM, Sean Silva <chisophugis at gmail.com> wrote: > > > On Fri, Dec 15, 2017 at 11:27 AM, Xinliang David Li <davidxl at google.com> > wrote: > >> >> >> On Fri, Dec 15, 2017 at 11:13 AM, Sean Silva <chisophugis at gmail.com> >> wrote: >> >>> >>> >>> On Fri, Dec 15, 2017 at 10:22

2015 Jul 30

[LLVMdev] RFC: Callee speedup estimation in inline cost analysis

TLDR - The proposal below is intended to allow inlining of larger callees when such inlining is expected to reduce the dynamic instruction count. Proposal ------------- LLVM inlines a function if the size growth (in the given context) is less than a threshold. The threshold is increased based on certain characteristics of the called function (inline keyword and the fraction of vector

2012 Dec 06

[LLVMdev] [RFC] "noclone" function attribute

Hi Michael, After some head-scratching and discussion with our tame Khronos member, I agree with you. It comes down to the interpretation of the ambiguous spec. It refers to "the barrier", implying there is some sort of equivalence relation over barriers. The question is, what is that equivalence relation? In your example code: > void f(int foo) { > if (foo) > b();

2017 Aug 07

[RFC][InlineCost] Modeling JumpThreading (or similar) in inline cost model

Hi, Coincidentally I've been working to optimize this same case last week. I was struggling a bit to determine where to put this functionality and eventually went for the pragmatic approach of creating an experimental pass. Probably not the eventual solution, but it may provide some useful input to the discussion here. Basically, I experimented with a 'pre-inlining-transform' pass

2012 Dec 04

[LLVMdev] [RFC] "noclone" function attribute

Hi all + llvm-commits, After the discussion below, please find attached my patch to add a new "noduplicate" function attribute. I've modified CodeMetrics and LoopInfo, which covers most cases, but JumpThreading and InlineCost don't use CodeMetrics yet, so they required changing manually. Cheers, James On Mon, 2012-12-03 at 23:46 +0000, Chris Lattner wrote: > On Dec 3,

2012 Dec 07

[LLVMdev] [RFC] "noclone" function attribute

Sounds good to me. I'm not sure the solution for transitivity is optimal, but it's a good compromise. -----Original Message----- From: James Molloy [mailto:James.Molloy at arm.com] Sent: Thursday, December 06, 2012 13:05 To: Kuperstein, Michael M Cc: Chris Lattner; llvm-commits; Nadav Rotem; llvmdev at cs.uiuc.edu Subject: RE: [LLVMdev] [RFC] "noclone" function attribute Hi

2017 Oct 03

PGO information at LTO/thinLTO link step

On Tue, Oct 3, 2017 at 1:46 PM, Teresa Johnson via llvm-dev < llvm-dev at lists.llvm.org> wrote: > > > On Tue, Oct 3, 2017 at 1:38 PM, Graham Yiu <gyiu at ca.ibm.com> wrote: > >> Hi Teresa, >> >> Actually, enabling the new pass manager manually seems to have solved >> this issue, so this problem is only valid for the old pass manager. >> >

2017 Oct 03

PGO information at LTO/thinLTO link step

Thanks Easwaran. This is what we've observed as well, where the old PM inliner was only looking at hot/cold callee information, which has significantly smaller boosts/penalties compared to callsite information. Teresa, do you know if there is some documentation/video/presentation on how PGO information is represented in LLVM and what information is passed via the IR? I'm finding some

2015 Jul 31

[LLVMdev] RFC: Callee speedup estimation in inline cost analysis

Just nitpicking: 1) DI(F) should include a component that estimates the epilogue/prologue cost (frameSetupCost), which InlinedDF does not have. 2) The speedup should include the callsite cost associated with 'C' (call instr, argument passing): Speedup(F,C) = (DI(F) + CallCost(C) - InlinedDF(F,C))/DI(F). Otherwise the proposal looks reasonable to me. David On Thu, Jul 30, 2015 at 2:25 PM,
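The speedup formula quoted in this reply can be worked through with a small sketch. The symbol names DI, CallCost and InlinedDF come from the message itself; the Python function, its parameter names, and the numbers are hypothetical and only show how the terms combine.

```python
def speedup(di_f: float, call_cost_c: float, inlined_df_fc: float) -> float:
    """Speedup(F,C) = (DI(F) + CallCost(C) - InlinedDF(F,C)) / DI(F).

    di_f:          dynamic instruction estimate of callee F when not inlined
                   (per the reply, including prologue/epilogue cost).
    call_cost_c:   cost of call site C itself (call instr, argument passing).
    inlined_df_fc: dynamic instruction estimate of F after inlining at C.
    """
    if di_f <= 0:
        raise ValueError("DI(F) must be positive")
    return (di_f + call_cost_c - inlined_df_fc) / di_f


# Hypothetical numbers, chosen only to show the arithmetic:
# (100 + 5 - 60) / 100 = 0.45
example = speedup(di_f=100.0, call_cost_c=5.0, inlined_df_fc=60.0)
```

Adding CallCost(C) to the numerator means that even a callee whose body does not shrink after inlining still shows a small positive speedup from removing the call overhead.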

2017 Oct 03

PGO information at LTO/thinLTO link step

Hi Teresa, Actually, enabling the new pass manager manually seems to have solved this issue, so this problem is only valid for the old pass manager. Thanks, Graham Yiu LLVM Compiler Development IBM Toronto Software Lab Office: (905) 413-4077 C2-707/8200/Markham Email: gyiu at ca.ibm.com From: Teresa Johnson <tejohnson at google.com> To: Graham Yiu <gyiu at ca.ibm.com> Cc:
