Displaying 20 results from an estimated 7000 matches similar to: "[RFC] Propeller: A frame work for Post Link Optimizations"
2019 Sep 26
2
[RFC] Propeller: A frame work for Post Link Optimizations
On Wed, Sep 25, 2019 at 5:02 PM Eli Friedman via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> My biggest question about this architecture is about when propeller runs
> basic block reordering within a function. It seems like a lot of the
complexity comes from using the proposed -fbasicblock-sections to generate
> mangled ELF, and then re-parsing the mangled ELF as a
2020 Feb 28
5
A Propeller link (similar to a Thin Link as used by ThinLTO)?
I met with the Propeller team today (we work for the same company but it
was my first time meeting two members on the team:) ).
One thing I have been reassured of:
* There is no general disassembly work. General
disassembly work would assuredly frighten off developers. (Inherently
unreliable, memory usage heavy and difficult to deal with CFI, debug
information, etc)
Minimal amount of plumbing work
2019 Oct 22
2
[RFC] Propeller: A frame work for Post Link Optimizations
We are going to be at the llvm-dev meeting the next two days. We will get
back to you after that.
Sri
On Mon, Oct 21, 2019 at 10:07 PM Maksim Panchenko <maks at fb.com> wrote:
> Hi Sri,
>
> Thank you for replying to our feedback. 7 out of 12 high-level concerns have
> been answered; 2 of them are fully addressed. The rest are being tracked at the
2019 Oct 14
2
[RFC] Propeller: A frame work for Post Link Optimizations
Hello,
I wanted to consolidate all the discussions and our final thoughts on the
concerns raised. I have attached a document consolidating it.
BOLT’s performance gains inspired this work and we believe BOLT
is a great piece of engineering. However, there are build environments
where scalability is critical and memory limits per process are tight:
* Debug Fission,
2019 Oct 11
2
[RFC] Propeller: A frame work for Post Link Optimizations
Is there large value from deferring the block ordering to link time? That
is, does the block layout algorithm need to consider global layout issues
when deciding which blocks to put together and which to relegate to the
far-away part of the code?
Or, could the Propeller-optimized compile step instead split each function
into only 2 pieces -- one containing an "optimally-ordered" set of
2019 Oct 17
2
[RFC] Propeller: A frame work for Post Link Optimizations
Hello Maksim,
On Wed, Oct 16, 2019 at 3:52 PM Maksim Panchenko <maks at fb.com> wrote:
> Hi Sri,
>
> I want to clarify one thing before sending a detailed reply: did you
> evaluate BOLT on Clang built with basic block sections? In the makefile
> you reference, there are two versions: a “vanilla” and a default built
> with function sections.
2019 Oct 18
3
[RFC] Propeller: A frame work for Post Link Optimizations
Hello Maksim,
On Fri, Oct 18, 2019 at 10:57 AM Maksim Panchenko <maks at fb.com> wrote:
> Cool. The new numbers look good. If you run BOLT with jemalloc library
> preloaded, you will likely get a runtime closer to 1 minute. We’ve noticed
> that compared to the default malloc, it improves the multithreaded
> performance and brings down memory usage
2019 Oct 08
2
[RFC] Propeller: A frame work for Post Link Optimizations
Some more information about the relaxation pass whose effectiveness
and convergence guarantees were listed as a concern:
TL;DR: Our relaxation pass is similar to what LLVM’s MCAssembler does
but with a caveat for efficiency. Our experimental results show it is
efficient and convergence is guaranteed.
Our relaxation pass is very similar to what MCAssembler does as it
needs to solve the same
2019 Oct 07
2
[RFC] Propeller: A frame work for Post Link Optimizations
We would also like to clarify the misconceptions around CFI instructions:
There are two things that need to be clarified here:
1) Extra CFI FDE entries for basic blocks do not mean more dynamic
instructions are executed. In fact, they do not increase at all. Krys
talked about this earlier.
2) We deduplicate common static CFI instructions in the FDE
and move them to the CIE. Hence,
2019 Oct 02
4
[RFC] Propeller: A frame work for Post Link Optimizations
I'm a bit confused by this subthread -- doesn't BOLT have the exact same
CFI bloat issue? From my cursory reading of the Propeller doc, the CFI
duplication is _necessary_ to represent discontiguous functions, not
anything particular to the way Propeller happens to generate those
discontiguous functions.
And emitting discontiguous functions is a fundamental goal of this, right?
On Wed,
2019 Sep 26
2
[RFC] Propeller: A frame work for Post Link Optimizations
On Thu, Sep 26, 2019 at 12:39 PM Eli Friedman <efriedma at quicinc.com> wrote:
>
>
>
> From: Xinliang David Li <xinliangli at gmail.com>
> Sent: Wednesday, September 25, 2019 5:58 PM
> To: Eli Friedman <efriedma at quicinc.com>
> Cc: Sriraman Tallam <tmsriram at google.com>; llvm-dev <llvm-dev at lists.llvm.org>
> Subject: [EXT] Re: [llvm-dev]
2019 Sep 27
5
[RFC] Propeller: A frame work for Post Link Optimizations
On Thu, Sep 26, 2019 at 5:13 PM Eli Friedman <efriedma at quicinc.com> wrote:
>
> > -----Original Message-----
> > From: Sriraman Tallam <tmsriram at google.com>
> > Sent: Thursday, September 26, 2019 3:24 PM
> > To: Eli Friedman <efriedma at quicinc.com>
> > Cc: Xinliang David Li <xinliangli at gmail.com>; llvm-dev <llvm-dev at
2019 Oct 02
2
[RFC] Propeller: A frame work for Post Link Optimizations
On Wed, Oct 2, 2019 at 8:41 PM Maksim Panchenko via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> *Pessimization/overhead for stack unwinding used by system-wide profilers
> and
> for exception handling*
>
> Larger CFI programs put an extra burden on unwinding at runtime as more CFI
> (and thus native) instructions have to be executed. This will cause more
> overhead
2019 Sep 30
2
[RFC] Propeller: A frame work for Post Link Optimizations
On Mon, Sep 30, 2019 at 12:26 PM Eric Christopher <echristo at gmail.com>
wrote:
> On Sat, Sep 28, 2019 at 8:25 AM Sriraman Tallam <tmsriram at google.com>
> wrote:
> >
> >
> >
> > On Fri, Sep 27, 2019 at 10:36 PM Eric Christopher <echristo at gmail.com>
> wrote:
> >>
> >> On Fri, Sep 27, 2019 at 2:08 PM Sriraman Tallam via
2019 Sep 30
2
[RFC] Propeller: A frame work for Post Link Optimizations
On Mon, Sep 30, 2019 at 1:27 PM Eric Christopher <echristo at gmail.com> wrote:
> On Mon, Sep 30, 2019 at 12:31 PM Sriraman Tallam <tmsriram at google.com>
> wrote:
> >
> >
> >
> > On Mon, Sep 30, 2019 at 12:26 PM Eric Christopher <echristo at gmail.com>
> wrote:
> >>
> >> On Sat, Sep 28, 2019 at 8:25 AM Sriraman Tallam
2019 Sep 27
3
[RFC] Propeller: A frame work for Post Link Optimizations
On Thu, Sep 26, 2019 at 6:16 PM Eli Friedman <efriedma at quicinc.com> wrote:
>
> > -----Original Message-----
> > From: Sriraman Tallam <tmsriram at google.com>
> > Sent: Thursday, September 26, 2019 5:31 PM
> > To: Eli Friedman <efriedma at quicinc.com>
> > Cc: Xinliang David Li <xinliangli at gmail.com>; llvm-dev <llvm-dev at
2019 Sep 28
2
[RFC] Propeller: A frame work for Post Link Optimizations
On Fri, Sep 27, 2019 at 10:36 PM Eric Christopher <echristo at gmail.com>
wrote:
> On Fri, Sep 27, 2019 at 2:08 PM Sriraman Tallam via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > On Fri, Sep 27, 2019 at 1:16 PM Eli Friedman <efriedma at quicinc.com>
> wrote:
> > >
> > > > -----Original Message-----
> > > > From:
2019 Sep 30
2
[RFC] Propeller: A frame work for Post Link Optimizations
I guess Eric means full program optimization/cross module optimization
using MIR. This is in theory workable in full LTO style, but not in
ThinLTO style which works on summary data. As we have discussed,
eliminating monolithic style optimization is the key design goal.
This was also briefly discussed in one of the previous replies I sent.
There are other benefits of doing this in linker such
2019 Sep 27
3
[RFC] Propeller: A frame work for Post Link Optimizations
On Fri, Sep 27, 2019 at 1:16 PM Eli Friedman <efriedma at quicinc.com> wrote:
>
> > -----Original Message-----
> > From: Sriraman Tallam <tmsriram at google.com>
> > Sent: Friday, September 27, 2019 9:43 AM
> > To: Eli Friedman <efriedma at quicinc.com>
> > Cc: Xinliang David Li <xinliangli at gmail.com>; llvm-dev <llvm-dev at
2020 Aug 05
10
[RFC] Machine Function Splitter - Split out cold blocks from machine functions using profile data
Greetings,
We present “Machine Function Splitter”, a codegen optimization pass which
splits functions into hot and cold parts. This pass leverages the basic
block sections feature recently introduced in LLVM from the Propeller
project. The pass targets functions with profile coverage, identifies cold
blocks and moves them to a separate section. The linker groups all cold
blocks across functions