Displaying 20 results from an estimated 6000 matches similar to: "PGO information at LTO/thinLTO link step"
2017 Oct 03
2
PGO information at LTO/thinLTO link step
Hi Teresa,
Actually, enabling the new pass manager manually seems to have solved this
issue, so this problem is only valid for the old pass manager.
Thanks,
Graham Yiu
LLVM Compiler Development
IBM Toronto Software Lab
Office: (905) 413-4077 C2-707/8200/Markham
Email: gyiu at ca.ibm.com
From: Teresa Johnson <tejohnson at google.com>
To: Graham Yiu <gyiu at ca.ibm.com>
Cc:
2017 Oct 03
2
PGO information at LTO/thinLTO link step
On Tue, Oct 3, 2017 at 1:46 PM, Teresa Johnson via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
>
> On Tue, Oct 3, 2017 at 1:38 PM, Graham Yiu <gyiu at ca.ibm.com> wrote:
>
>> Hi Teresa,
>>
>> Actually, enabling the new pass manager manually seems to have solved
>> this issue, so this problem is only valid for the old pass manager.
>>
>
2017 Oct 03
2
PGO information at LTO/thinLTO link step
Thanks Easwaran. This is what we've observed as well, where the old PM
inliner was only looking at hot/cold callee information, which has
significantly smaller boosts/penalties compared to callsite information.
Teresa, do you know if there is some documentation/video/presentation on
how PGO information is represented in LLVM and what information is passed
via the IR? I'm finding some
2017 Oct 03
5
General question about enabling partial inlining
Hi Graham,
Thanks for sharing this. Are you planning on enabling the pass only on PGO?
Even in non-PGO, I noticed some performance gains when we are aggressive in
partially inlining the early return part, especially when the callee spills
CSRs in the entry block. At a high level, I have two questions:
1. What is the main obstacle that prevents the pass from being enabled
by default?
2.
2017 Nov 10
5
[RFC] Enable Partial Inliner by default
Hi Graham,
Thank you for offering help. I am trying to create a reproducer. The problem is that the crashes happen whilst LTO is used. One thing I am sure about is that the IR is broken at compile time.
Thanks,
Evgeny
From: Graham Yiu <gyiu at ca.ibm.com>
Date: Friday, 10 November 2017 at 16:09
To: Evgeny Astigeevich <Evgeny.Astigeevich at arm.com>
Cc: "junbuml at codeaurora.org"
2017 Nov 02
13
[RFC] Enable Partial Inliner by default
Forgot to add that all experiments were done with '-O3 -m64
-fexperimental-new-pass-manager'.
Graham Yiu
LLVM Compiler Development
IBM Toronto Software Lab
Office: (905) 413-4077 C2-707/8200/Markham
Email: gyiu at ca.ibm.com
From: Graham Yiu/Toronto/IBM
To: llvm-dev at lists.llvm.org
Cc: junbuml at codeaurora.org, xinliangli at gmail.com
Date: 11/02/2017 05:26 PM
Subject: [RFC]
2017 Aug 24
3
[RFC] Enhance Partial Inliner by using a general outlining scheme for cold blocks
Hi David,
The only reason I can see to use the 'pattern matching' part as a fall-back
is in case we cannot inline what I'm assuming would be a much bigger
hot-path-only cloned function for whatever reason. What I'm assuming here
is that after cold-region outlining, we may still have a large portion of
the original function body to attempt to inline, whereas the pattern
2017 Sep 13
2
General question about enabling partial inlining
Hi,
I noticed some performance gains in some spec benchmarks without
significant code size bloat when aggressively performing partial
inlining, especially when the original callee spills CSRs in the entry
block. I guess the partial inlining is not enabled mainly due to the
code size. Is there any other issue that prevents the pass from being
enabled? Do we have any plan or any ongoing work
2017 Oct 03
2
New Pass Manager with flto[=thin] not enabled (??)
On Tue, Oct 3, 2017 at 12:08 PM, Davide Italiano <davide at freebsd.org> wrote:
> On Tue, Oct 3, 2017 at 11:57 AM, Graham Yiu via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Hello,
>>
>> I recently noticed that the new pass manager was not enabled at regular/thin
>> LTO link step even if '-fexperimental-new-pass-manager' was specified in the
2017 Aug 29
3
[RFC] Enhance Partial Inliner by using a general outlining scheme for cold blocks
I second the point that a way to outline specific function regions
independently of the partial inliner sounds very useful. I am not sure,
however, if we would want a mode within the partial inliner or something
completely independent.
As a general question, does anybody have a clear idea of what the
constraints are on the regions CodeExtractor is currently able to handle?
Going through the code, it
2017 Aug 15
3
[RFC] Enhance Partial Inliner by using a general outlining scheme for cold blocks
Hi Jessica,
Thanks for the feedback.
I believe the existing partial inliner pass does use some common utilities
like the code extractor to do outlining. Is that what you're referring to?
I don't recall seeing any other passes that do outlining other than the
machine outliner, but I may have missed something.
I briefly looked at River's RFC and it seems he's mainly utilizing
2017 Nov 10
0
[RFC] Enable Partial Inliner by default
Hi Evgeny,
I just realized that if these are compile-time errors I can help
investigate on my end. Do you have something I can use to reproduce?
Cheers,
Graham Yiu
LLVM Compiler Development
IBM Toronto Software Lab
Office: (905) 413-4077 C2-707/8200/Markham
Email: gyiu at ca.ibm.com
From: Graham Yiu/Toronto/IBM
To: Evgeny Astigeevich <Evgeny.Astigeevich at arm.com>
Cc:
2017 Aug 15
8
[RFC] Enhance Partial Inliner by using a general outlining scheme for cold blocks
Hello,
My team and I are looking to do some enhancements in the partial inliner
in opt. Would appreciate any feedback that folks might have.
# Partial Inlining in LLVM opt
## Summary
### Background
Currently, the partial inliner searches the first few blocks of the callee
and looks for a branch to the return block (i.e., early return). If found,
it attempts to outline the rest of the
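
The early-return search described in this snippet can be illustrated with a toy model. This is only a sketch under stated assumptions, not the partial inliner's actual code: the CFG is a hypothetical dict from block name to successor names, and `find_early_return`, `split_for_partial_inlining`, and `max_blocks` are invented names for illustration.

```python
def find_early_return(cfg, entry, ret, max_blocks=3):
    """Walk the first few blocks from `entry` and return the leading
    chain of blocks ending in one that branches to `ret`, or None."""
    chain = []
    block = entry
    while block != ret and len(chain) < max_blocks:
        chain.append(block)
        succs = cfg[block]
        if ret in succs:
            return chain          # found a branch to the return block
        if len(succs) != 1:
            return None           # toy search: give up on other branches
        block = succs[0]
    return None


def split_for_partial_inlining(cfg, entry, ret, max_blocks=3):
    """Return (blocks kept for inlining, blocks to outline), or None."""
    guard = find_early_return(cfg, entry, ret, max_blocks)
    if guard is None:
        return None
    outlined = sorted(b for b in cfg if b not in guard and b != ret)
    return guard, outlined


# Hypothetical callee: `if (!p) return; work(); more();`
cfg = {
    "entry": ["ret", "work"],
    "work": ["more"],
    "more": ["ret"],
    "ret": [],
}
result = split_for_partial_inlining(cfg, "entry", "ret")
```

In this model the `entry` block (the guard) stays behind as the part that gets inlined into callers, while `work` and `more` would be moved into a separate outlined function.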
2017 Aug 15
2
[RFC] Enhance Partial Inliner by using a general outlining scheme for cold blocks
On Tue, Aug 15, 2017 at 4:14 PM, River Riddle via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hey Graham,
> I worked on pretty much this exact thing last year. I did something
> similar to what you described, I traversed the CFG and built potentially
> profitable regions from any given valid start node. At that point there
> were several road blocks that prevented it
2017 Aug 24
1
[RFC] Enhance Partial Inliner by using a general outlining scheme for cold blocks
Hi David,
So I've begun doing some implementation work on the outlining portion of the
code. Currently, I got the partial inliner to outline cold regions (single
entry, single exit) of the code, based solely on the existence of
ProfileSummaryInfo (i.e., profiling data). However, I have some concerns about
how this will co-exist with the existing code that peels early returns.
The control flow looks
2017 Oct 03
2
New Pass Manager with flto[=thin] not enabled (??)
Hello,
I recently noticed that the new pass manager was not enabled at
regular/thin LTO link step even if '-fexperimental-new-pass-manager' was
specified in the compile step and link step commands. Upon closer
inspection, it seems there's no real way to invoke the new pass manager
path ('runNewPMPasses' in lib/LTO/LTOBackend.cpp) during link step.
'Conf.UseNewPM' is
2016 Sep 02
3
[ThinLTO] Importing based on PGO data
On Fri, Sep 2, 2016 at 3:30 PM, Xinliang David Li <davidxl at google.com>
wrote:
> On Fri, Sep 2, 2016 at 3:16 PM, Piotr Padlewski
> <piotr.padlewski at gmail.com> wrote:
> >
> >
> > 2016-09-02 15:04 GMT-07:00 Xinliang David Li <davidxl at google.com>:
> >>
> >> On Fri, Sep 2, 2016 at 2:58 PM, Piotr Padlewski
> >>
2015 Mar 24
2
[LLVMdev] RFC - Improvements to PGO profile support
> On Mar 24, 2015, at 12:08 PM, Xinliang David Li <davidxl at google.com> wrote:
>
> On Tue, Mar 24, 2015 at 10:54 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>>
>>> On Mar 24, 2015, at 10:53 AM, Diego Novillo <dnovillo at google.com> wrote:
>>>
>>> On Tue, Mar 24, 2015 at 1:48 PM, Bob Wilson <bob.wilson at apple.com> wrote:
2016 Sep 02
2
[ThinLTO] Importing based on PGO data
The profile summary is saved in the global metadata AFAIK. If we want to
calculate if something is hot/cold while choosing functions for importing,
we would either need to read the whole Module (which we clearly don't want
to do) or duplicate this information in the summary, so we could get it
without reading the Module.
2016-09-02 15:49 GMT-07:00 Mehdi Amini <mehdi.amini at apple.com>:
>
2017 Dec 13
5
RFC: Synthetic function entry counts
Functions in LLVM IR have a function_entry_count metadata that is attached
in PGO compilation. By using the entry count together with the block
frequency info, the compiler computes the profile count of call
instructions based on which the hotness/coldness of callsites can be
determined. Experiments have shown that using a higher threshold for hot
callsites results in improved runtime performance
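
As a rough illustration of the computation this snippet describes, the sketch below scales a function's entry count by the ratio of a call's block frequency to the entry block's frequency, then compares the result against thresholds. The function names, the integer arithmetic, and the threshold values are assumptions made for the example; they are not taken from the RFC or from LLVM's source.

```python
def callsite_count(entry_count, block_freq, entry_freq):
    """Estimate how often a call instruction executes: the function's
    entry count scaled by (frequency of the call's block / frequency
    of the entry block). Integer math is an assumption of this sketch."""
    return entry_count * block_freq // entry_freq


def classify(count, hot_threshold, cold_threshold):
    """Label a callsite hot, cold, or neither (thresholds hypothetical)."""
    if count >= hot_threshold:
        return "hot"
    if count <= cold_threshold:
        return "cold"
    return "normal"


# A function entered 1000 times whose call sits in a loop block that is
# three times as frequent as the entry block (24 vs. 8).
count = callsite_count(entry_count=1000, block_freq=24, entry_freq=8)
label = classify(count, hot_threshold=2000, cold_threshold=10)
```

Raising `hot_threshold` in this model shrinks the set of callsites labelled hot, which is the knob the snippet refers to when it mentions using a higher threshold for hot callsites.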