thr3ads.net - similar to: "[LLVMdev] Path forward on profile guided inlining?"

Displaying 20 results from an estimated 50000 matches similar to: "[LLVMdev] Path forward on profile guided inlining?"

2017 Dec 13

RFC: Synthetic function entry counts

Functions in LLVM IR carry function_entry_count metadata that is attached in PGO compilation. Using the entry count together with the block frequency info, the compiler computes the profile count of each call instruction, from which the hotness or coldness of a callsite can be determined. Experiments have shown that using a higher threshold for hot callsites results in improved runtime performance
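The relationship described in this snippet can be illustrated with a small sketch. It is not LLVM code: the function names, the integer rounding, and the threshold values are assumptions made for illustration only. The idea is that a callsite's count is the function entry count scaled by how often the callsite's block runs relative to the entry block.

```python
def callsite_profile_count(entry_count: int, block_freq: int, entry_freq: int) -> int:
    """Scale the function entry count by the block's frequency relative to the entry block.

    All three inputs are hypothetical stand-ins for what a PGO-enabled compiler
    would read from entry-count metadata and block frequency info.
    """
    if entry_freq <= 0:
        raise ValueError("entry_freq must be positive")
    # Integer arithmetic with truncation; a real compiler may round or saturate differently.
    return entry_count * block_freq // entry_freq


def classify_callsite(count: int, hot_threshold: int, cold_threshold: int) -> str:
    """Label a callsite using caller-supplied (hypothetical) thresholds."""
    if count >= hot_threshold:
        return "hot"
    if count <= cold_threshold:
        return "cold"
    return "neutral"


if __name__ == "__main__":
    # A block that runs twice per function entry, in a function entered 1000 times.
    count = callsite_profile_count(entry_count=1000, block_freq=16, entry_freq=8)
    print(count, classify_callsite(count, hot_threshold=1500, cold_threshold=10))
```

With these made-up inputs the callsite count is 2000, which the sketch labels as hot. Raising hot_threshold narrows the set of callsites that qualify as hot.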

2015 Dec 11

[LLVMdev] Path forward on profile guided inlining?

On Thu, Dec 10, 2015 at 4:51 PM, Philip Reames <listmail at philipreames.com> wrote:
>
> On 12/10/2015 04:29 PM, Xinliang David Li wrote:
>>
>> On Thu, Dec 10, 2015 at 4:00 PM, Philip Reames
>> <listmail at philipreames.com> wrote:
>>>
>>> Given I didn't get any response to my original query, I chose not to
>>> invest

2015 Dec 07

[LLVMdev] Path forward on profile guided inlining?

(Resending after removing llvmdev at cs.uiuc.edu and using llvm-dev at lists.llvm.org)
On Mon, Dec 7, 2015 at 3:08 PM, Easwaran Raman <eraman at google.com> wrote:
> Hi Philip,
>
> Is there any update on this? I've been sending patches to get rid of the
> callee hotness based inline hints from the frontend and move the logic to
> the inliner. The next step is to use

2015 Dec 11

[LLVMdev] Path forward on profile guided inlining?

On Thu, Dec 10, 2015 at 4:00 PM, Philip Reames <listmail at philipreames.com> wrote:
> Given I didn't get any response to my original query, I chose not to invest
> time in this at the time. I am unlikely to get time for this in the near
> future.
>
> On 12/07/2015 03:13 PM, Easwaran Raman wrote:
>
> (Resending after removing llvmdev at cs.uiuc.edu and using

2017 Dec 15

RFC: Synthetic function entry counts

On Fri, Dec 15, 2017 at 12:22 AM, Sean Silva <chisophugis at gmail.com> wrote:
> IIUC, this proposal is just saying that we should infer a static profile
> for entry counts just like we do for branch probabilities. In the case of
> entry counts, we do not hide that information behind an analysis like BPI,
> so currently just annotating synthetic PGO entry counts is a simple

2017 Dec 15

RFC: Synthetic function entry counts

On Fri, Dec 15, 2017 at 11:13 AM, Sean Silva <chisophugis at gmail.com> wrote:
>
> On Fri, Dec 15, 2017 at 10:22 AM, Easwaran Raman <eraman at google.com>
> wrote:
>
>> On Fri, Dec 15, 2017 at 12:22 AM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>> IIUC, this proposal is just saying that we should infer a

2017 Dec 15

RFC: Synthetic function entry counts

On Fri, Dec 15, 2017 at 11:56 AM, Sean Silva <chisophugis at gmail.com> wrote:
>
> On Fri, Dec 15, 2017 at 11:27 AM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>> On Fri, Dec 15, 2017 at 11:13 AM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>> On Fri, Dec 15, 2017 at 10:22

2017 Oct 03

PGO information at LTO/thinLTO link step

Thanks Easwaran. This is what we've observed as well, where the old PM inliner was only looking at hot/cold callee information, which has significantly smaller boosts/penalties compared to callsite information. Teresa, do you know if there is some documentation/video/presentation on how PGO information is represented in LLVM and what information is passed via the IR? I'm finding some

2015 Aug 21

[RFC] AlwaysInline codegen

Hi, There is a problem with the handling of alwaysinline functions in Clang: they are not always inlined. AFAIK, this may only happen when the caller is in the dead code, but then we don't always successfully remove all dead code. Because of this, we may end up emitting an undefined reference for an "inline __attribute__((always_inline))" function. Libc++ relies on the compiler

2006 Mar 15

[LLVMdev] Inline hints for *compiler clients*

On Mar 15, 2006, at 11:15 AM, Chris Lattner wrote:
> On Wed, 15 Mar 2006, Vikram S. Adve wrote:
>>> Why can't the compiler pass just call InlineFunction(CallSite) on
>>> the callsite it wants inlined? The only way that can fail is if
>>> LLVM cannot ever inline the call (e.g. it uses varargs).
>
>> In some cases, that would be fine. But in other

2014 Jan 26

[LLVMdev] MCJIT versus getLazyBitcodeModule?

Hi Gael, I tried converting to your approach but I had some issues making sure that all symbols accessed by the jit modules have entries in the dynamic symbol table. To be specific, my current approach is to use MCJIT (using an objectcache) to JIT the runtime module and then let MCJIT handle linking any references from the jit'd modules; I just experimented with what I think you're doing,

2015 Jul 30

[LLVMdev] RFC: Callee speedup estimation in inline cost analysis

TLDR - The proposal below is intended to allow inlining of larger callees when such inlining is expected to reduce the dynamic instruction count.

Proposal
-------------
LLVM inlines a function if the size growth (in the given context) is less than a threshold. The threshold is increased based on certain characteristics of the called function (inline keyword and the fraction of vector
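The threshold rule summarized in this snippet can be sketched as follows. This is an illustrative model, not the LLVM inline cost analysis: the Callee fields, the base threshold, the bonus values, and the vector cutoff are all hypothetical, and the snippet is truncated before it finishes describing the vector-related characteristic.

```python
from dataclasses import dataclass


@dataclass
class Callee:
    size_growth: int          # estimated size growth if inlined at this callsite
    has_inline_keyword: bool  # callee was declared with the inline keyword
    vector_fraction: float    # assumed: fraction of vector instructions in the callee


def inline_threshold(callee: Callee,
                     base: int = 225,
                     inline_keyword_bonus: int = 100,
                     vector_bonus: int = 50,
                     vector_cutoff: float = 0.1) -> int:
    """Start from a base threshold and raise it for favorable callee traits.

    Every numeric default here is a made-up placeholder.
    """
    threshold = base
    if callee.has_inline_keyword:
        threshold += inline_keyword_bonus
    if callee.vector_fraction > vector_cutoff:
        threshold += vector_bonus
    return threshold


def should_inline(callee: Callee) -> bool:
    """Inline only when the size growth is strictly below the threshold."""
    return callee.size_growth < inline_threshold(callee)


if __name__ == "__main__":
    plain = Callee(size_growth=250, has_inline_keyword=False, vector_fraction=0.0)
    hinted = Callee(size_growth=250, has_inline_keyword=True, vector_fraction=0.0)
    print(should_inline(plain), should_inline(hinted))
```

In this model the same callee is rejected without the inline keyword and accepted with it, because only the threshold changes. The proposal in the thread goes further by letting expected dynamic instruction savings justify larger callees.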

2015 Jun 01

[LLVMdev] RFC: liveoncall parameter attribute

TLDR - I have a runtime which expects to be able to inspect certain arguments to a function even if those arguments aren't used within the callee itself. DeadArgumentElimination doesn't respect this today. I want to add an attribute that records an argument to a call as live even if the value is known to be not used in the callee.

My use case
-----------------
What my runtime is doing

2006 Mar 15

[LLVMdev] Inline hints for *compiler clients*

Vikram S. Adve wrote: Hmmm. It seems the discussion has grown a little bit larger than I had intended. :) Basically what I think would be useful is an option to the inliner that gives it a list of functions to skip when inlining. My argument for this is that we have several transformations now that search for calls to specific functions; if those functions are inlined, the transform pass

2009 May 26

[LLVMdev] Wondering how best to run inlining on a single function.

In Unladen Swallow we (intend to) compile each function as we determine it's hot. To "compile" a function means to translate it from CPython bytecode to LLVM IR, optimize the IR using a FunctionPassManager, and JIT the IR to machine code. We'd like to include inlining among our optimizations. Currently the Inliner is a CallGraphSCCPass, which can only be run by the

2015 Aug 21

[cfe-dev] [RFC] AlwaysInline codegen

On Thu, Aug 20, 2015 at 7:17 PM, John McCall <rjmccall at apple.com> wrote:
>
> On Aug 20, 2015, at 5:19 PM, Evgenii Stepanov via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
> > Hi,
> >
> > There is a problem with the handling of alwaysinline functions in
> > Clang: they are not always inlined. AFAIK, this may only happen when
> > the caller is

2015 Oct 22

RFC: Inlining report

RFC: Inlining Report

Motivation

Making good inlining choices while optimizing an application is often key to achieving optimal performance. While the compiler's default inlining heuristics sometimes provide great out-of-box results, optimal performance is sometimes achieved only after varying the settings of certain compiler options related to inlining or adding "always_inline" or

2009 May 27

[LLVMdev] Wondering how best to run inlining on a single function.

On May 26, 2009, at 3:15 PM, Jeffrey Yasskin wrote:
> In Unladen Swallow we (intend to) compile each function as we
> determine it's hot. To "compile" a function means to translate it from
> CPython bytecode to LLVM IR, optimize the IR using a
> FunctionPassManager, and JIT the IR to machine code. We'd like to
> include inlining among our optimizations. Currently

2010 Nov 24

[LLVMdev] LLVM Inliner

Hi, I browsed the LLVM inliner implementation, and it seems there is room for improvement. (I have not read it too carefully, so correct me if what I observed is wrong). First the good side of the inliner -- the function level summary and inline cost estimation are more elaborate and complete than gcc's. For instance, it considers callsite arguments and the effects of optimization enabled by

2012 Nov 09

[LLVMdev] Inlining bitcast functions...

I've got a call instruction:

  call void bitcast (void (%4 addrspace(1)*, <2 x i32>, <4 x float>)* @_Z12write_imagefPU3AS110_image2d_tDv2_iDv4_f to void (%9 addrspace(1)*, <2 x i32>, <4 x float>)*)(%9 addrspace(1)* %dstimg, <2 x i32> %28, <4 x float> %26) nounwind

%4 and %9 are both (stripped) opaque structs. InlineFunction() does not inline this because
