thr3ads.net - llvm dev - [llvm-dev] RFC [ThinLTO]: Promoting more aggressively in order to reduce incremental link time and allow sharing between linkage units [May 2016]

Peter Collingbourne via llvm-dev

2016-May-04 05:01 UTC

[llvm-dev] RFC [ThinLTO]: Promoting more aggressively in order to reduce incremental link time and allow sharing between linkage units

On Tue, May 3, 2016 at 9:01 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> On Apr 6, 2016, at 4:41 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>
>> Hi all,
>>
>> I'd like to propose changes to how we do promotion of global values in
>> ThinLTO. The goal here is to make it possible to pre-compile parts of the
>> translation unit to native code at compile time. For example, if we know
>> that:
>>
>> 1) A function is a leaf function, so it will never import any other
>> functions, and
>> 2) The function's instruction count falls above a threshold specified at
>> compile time, so it will never be imported.
>> or
>> 3) The compile-time threshold is zero, so there is no possibility of
>> functions being imported (What's the utility of this? Consider a program
>> transformation that requires whole-program information, such as CFI. During
>> development, the import threshold may be set to zero in order to minimize
>> the incremental link time while still providing the same CFI enforcement
>> that would be used in production builds of the application.)
>>
>> then the function's body will not be affected by link-time decisions, and
>> we might as well produce its object code at compile time. This will also
>> allow the object code to be shared between linkage units (this should
>> hopefully help solve a major scalability problem for Chromium, as that
>> project contains a large number of test binaries based on common libraries).
>>
>> This can be done with a change to the intermediate object file format. We
>> can represent object files as native code containing statically compiled
>> functions and global data in the .text, .data, .rodata (etc.) sections,
>> with an .llvmbc section (or, I suppose, "__LLVM, __bitcode" when targeting
>> Mach-O) containing bitcode for functions to be compiled at link time.
>
> I was wondering why can't the "precompiled" function be embedded in the IR
> instead of the bitcode embedded in the object file?
> The codegen would still emit a single object file out of this IR file that
> contains the code for the IR and the precompiled function.
>
> It seems to me that this way the scheme would work with any existing
> LTO implementation.
>
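The eligibility rule in the quoted RFC, "(1 and 2) or 3", can be written as a small predicate. This is only an illustrative sketch: `FunctionInfo`, `can_precompile`, and the field names are hypothetical and do not correspond to any LLVM data structure or API.

```python
from dataclasses import dataclass


@dataclass
class FunctionInfo:
    """Hypothetical per-function summary (an assumption, not an LLVM type)."""
    name: str
    instruction_count: int
    callees: tuple = ()  # names of the functions this function calls


def can_precompile(fn: FunctionInfo, import_threshold: int) -> bool:
    """Return True if link-time importing cannot affect fn's body."""
    # Condition 3: a zero threshold means nothing is ever imported.
    if import_threshold == 0:
        return True
    # Condition 1: a leaf function has no callees to import into it.
    is_leaf = len(fn.callees) == 0
    # Condition 2: the function is too large to be imported elsewhere.
    too_big_to_import = fn.instruction_count > import_threshold
    return is_leaf and too_big_to_import
```

With a threshold of 100, a 500-instruction leaf function qualifies, while a 50-instruction leaf (small enough to be imported) or any function with callees does not.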
You'd still have the same problem. No matter whether you put the native
object inside the IR file or vice versa, you still have a file containing a
native object and some IR. That's the scenario that I found that the gold
plugin interface wouldn't support.

Supporting IR embedded in a native object section inside a linker should be
pretty trivial, if you control the linker. My prototype implementation in
lld is about 10 lines of code.
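
To illustrate what a linker has to do with such a file, here is a rough sketch that models an object file as a mapping from section name to contents and separates the embedded bitcode from the native sections. It is not the lld prototype mentioned above; `split_object` and the dictionary model are assumptions made for illustration. The only fact relied on is that raw LLVM bitcode begins with the bytes 'B', 'C', 0xC0, 0xDE.

```python
BITCODE_MAGIC = b"BC\xc0\xde"  # first four bytes of a raw LLVM bitcode file


def split_object(sections):
    """Split {section name: bytes} into (native sections, bitcode or None).

    Native sections would be linked as usual; the bitcode, if present,
    would be handed to the LTO code generator at link time.
    """
    native = {}
    bitcode = None
    for name, data in sections.items():
        if name == ".llvmbc":
            # Reject a section that does not look like bitcode.
            if not data.startswith(BITCODE_MAGIC):
                raise ValueError(".llvmbc section does not contain bitcode")
            bitcode = data
        else:
            native[name] = data
    return native, bitcode
```

An object with no `.llvmbc` section comes back unchanged with `None` for the bitcode, so plain native objects need no special handling.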

Peter

> --
> Mehdi
>
>> In order to make this work, we need to make sure that references from
>> link-time compiled functions to statically compiled functions work
>> correctly in the case where the statically compiled function has internal
>> linkage. We can do this by promoting every global value with internal
>> linkage, using a hash of the external names (as I mentioned in [1]).
>>
>> I imagine that for some linkers, it may not be possible to deal with this
>> scheme. For example, I did some investigation last year and discovered that
>> I could not use the gold plugin interface to load a native object file if
>> we had already claimed it as an IR file. I wouldn't be surprised to learn
>> that ld64 has similar problems.
>>
>> In cases where we completely control the linker (e.g. lld), we can easily
>> support this scheme, as the linker can directly do whatever it wants. But
>> for linkers that cannot support this, I suggest that we promote
>> consistently under ThinLTO rather than having different promotion schemes
>> for different linkers, in order to reduce overall complexity.
>>
>> Thanks for your feedback!
>>
>> Thanks,
>> --
>> Peter
>>
>> [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098062.html
>
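
The promotion step described in the quoted RFC (renaming every internal-linkage global using a hash of the module's external names) might look roughly like the sketch below. The hash function (SHA-256, truncated), the `.promoted.` suffix format, and the function names are all assumptions chosen for illustration; the thread does not specify them.

```python
import hashlib


def module_id(external_names):
    """Derive a stable identifier from a module's external symbol names."""
    digest = hashlib.sha256()
    # Sort so the result does not depend on symbol order in the module.
    for name in sorted(external_names):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator so ["ab", "c"] differs from ["a", "bc"]
    return digest.hexdigest()[:16]


def promote_internals(external_names, internal_names):
    """Map each internal-linkage name to a promoted, module-unique name."""
    suffix = module_id(external_names)
    return {name: f"{name}.promoted.{suffix}" for name in internal_names}
```

Because the suffix depends only on the module's external names, the compile-time and link-time halves of the same module compute the same promoted names independently, which is what lets link-time compiled code refer to statically compiled internal functions.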

-- 
-- 
Peter

Mehdi Amini via llvm-dev

2016-May-04 05:04 UTC


> On May 3, 2016, at 10:01 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> 
> You'd still have the same problem. No matter whether you put the native
> object inside the IR file or vice versa, you still have a file containing a
> native object and some IR. That's the scenario that I found that the gold
> plugin interface wouldn't support.

It is not clear to me why this is a problem for gold. It does not need to
know that the IR file contains some native precompiled code; it only needs to
know that this is an "LLVM file" that will be passed to LLVM for LTO, and
that it will get a single object file in return.
Can you elaborate on why the linker needs to know beforehand and
differentiate?

-- 
Mehdi

Peter Collingbourne via llvm-dev

2016-May-04 05:25 UTC


On Tue, May 3, 2016 at 10:04 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> It is not clear to me why it is a problem for gold: it does not need to
> know that the IR file contains some native precompiled code: it only needs
> to know that this is an "LLVM file", that will be passed to LLVM for LTO
> and it will get a single object file in return.
> Can you elaborate why the linker needs to know beforehand and differentiate?
>
(There wouldn't just be one object file, there would be N native objects
and 1 (or N if ThinLTO) combined LTO objects.)

In principle, it doesn't need to know. In practice, I found that in my
prototype I couldn't persuade gold to accept what I was doing without
giving undefined symbol errors.

I suppose I could have debugged it further, but I couldn't justify spending
more time on it, since the projects I care about are interested in
switching to lld for other reasons.

Peter



-- 
-- 
Peter
