thr3ads.net - llvm dev - [llvm-dev] [RFC] LLVM Busybox Proposal [Jun 2021]

Petr Hosek via llvm-dev

2021-Jun-23 05:19 UTC

[llvm-dev] [RFC] LLVM Busybox Proposal

I guess this depends on a particular implementation of the distributed
build system. In the case of Goma, we only supply the compiler binary which
was invoked as the command (that binary links glibc as a shared library but
we assume that one is supplied by the host system), all other files like
headers are passed together with the compiler invocation as inputs. If we
used dynamic linking, Goma would need to figure out what other shared
libraries need to be sent to the server. It's certainly doable but it's an
extra complexity we would like to avoid.

On Tue, Jun 22, 2021 at 10:09 PM David Blaikie <dblaikie at gmail.com>
wrote:
> On Tue, Jun 22, 2021 at 10:00 PM Petr Hosek via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> From our perspective as a toolchain vendor, even if using shared
>> libraries could get us closer to static linking in terms of performance,
>> we'd still prefer static linking for the ease of distribution. Dealing with
>> a single statically linked executable is much easier than dealing
>> with multiple shared libraries. This is especially important in distributed
>> compilation environments like Goma.
>>
>
> What makes it especially complicated for distributed compilation
> environments? (I'd expect a toolchain contains so many files that whether
> it's one binary, or a binary and a handful of shared libraries wouldn't
> change the general implementation complexity of a distributed build system?)
>
>
>>
>> When comparing performance between static and dynamic linking, I'd also
>> recommend doing a comparison between binaries built with PGO+LTO. Plain -O3
>> leaves a lot of performance on the table and as far as I'm aware, most
>> toolchain vendors use PGO+LTO.
>>
>> On Tue, Jun 22, 2021 at 5:00 PM Fangrui Song via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> On 2021-06-22, Leonard Chan via llvm-dev wrote:
>>> >Small update: I have a WIP prototype of the tool at
>>> >https://reviews.llvm.org/D104686. The prototype only includes llvm-objcopy
>>> >and llvm-objdump packed together, but we're seeing size benefits from
>>> >busyboxing those two compared against having two separate tools. (More
>>> >details in the prototype's description.) I don't plan on landing this as-is
>>> >anytime soon and there are still some things I'd like to improve/change and
>>> >get feedback on.
>>> >
>>> >To answer some replies:
>>> >
>>> >- Ideally, we could start off with an incremental approach and not package
>>> >large tools like clang/lld off the bat. The llvm-* tools seem like a good
>>> >place to start since they're generally a bunch of relatively small binaries
>>> >that all share a subset of functions in libLLVM, but don't necessarily use
>>> >all of libLLVM, so statically linking them together (with --gc-sections)
>>> >can help dedup a lot of shared components vs having separate statically
>>> >compiled tools. In my measurements, the busybox tool containing
>>> >llvm-objcopy+objdump is negligibly larger than llvm-objdump on its own (a
>>> >couple KB difference), indicating a lot of shared code between objdump and
>>> >objcopy.
>>> >
>>> >- Will Dietz's multiplexing tool looks like a good place to start from. The
>>> >only concern I can see though is mostly the amount of work needed to update
>>> >it to LLVM 13.
>>> >
>>> >- We don't have plans for Windows support now, but it's not off the table.
>>> >(Been mostly focusing on *nix for now). Depending on overall traction for
>>> >this idea, we could approach incrementally and add support for different
>>> >platforms over time.
>>>
>>> -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on -DLLVM_TARGETS_TO_BUILD=X86 (custom1)
>>> vs
>>> -DLLVM_TARGETS_TO_BUILD=X86 (custom2)
>>>
>>>
>>> # This is the lower bound for any multiplexing approach. clang is the largest executable.
>>> % stat -c %s /tmp/out/custom2/bin/clang-13
>>> 102900408
>>>
>>> I have built clang, lld and a bunch of ELF binary utilities.
>>>
>>> % stat -c %s /tmp/out/custom1/lib/libLLVM-13git.so /tmp/out/custom1/lib/libclang-cpp.so.13git /tmp/out/custom1/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}} | awk '{s+=$1}END{print s}'
>>> 138896544
>>>
>>> % stat -c %s /tmp/out/custom2/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}} | awk '{s+=$1}END{print s}'
>>> 209054440
>>>
>>>
>>> The -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on build is doing
>>> a really good job.
>>>
>>> A multiplexing approach can squeeze some bytes from 138896544 toward 102900408,
>>> but how much can it do?
>>>
>>>
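The three byte counts reported above imply two differences that are worth spelling out. What follows is a minimal C++ sketch of that arithmetic only; the constant names are invented for illustration, and the numbers are copied from the stat output quoted above.

```cpp
#include <cstdint>

// Byte counts copied from the measurements quoted above.
constexpr std::int64_t kStaticTotal = 209054440; // custom2: every tool linked statically
constexpr std::int64_t kDylibTotal  = 138896544; // custom1: tools plus libLLVM and libclang-cpp
constexpr std::int64_t kClangStatic = 102900408; // custom2 clang-13 alone, the stated lower bound

// What the shared-library build already saves over separate static tools.
constexpr std::int64_t kSavedByDylib = kStaticTotal - kDylibTotal;

// The most a multiplexed binary could still recover relative to the
// shared-library build, if it shrank all the way down to the lower bound.
constexpr std::int64_t kRemainingHeadroom = kDylibTotal - kClangStatic;

// Checked at compile time, so the figures cannot silently drift.
static_assert(kSavedByDylib == 70157896, "209054440 - 138896544");
static_assert(kRemainingHeadroom == 35996136, "138896544 - 102900408");
```

Under these numbers the shared-library build is about 70 MB smaller than the static one, and at most about 36 MB is left for a multiplexing approach to recover.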
>>> >- I'm starting to think the `cl::opt` to `OptTable` issue might be
>>> >orthogonal to the busybox implementation. The tool essentially dispatches
>>> >to different "main" functions in different tools, but as long as we don't
>>> >do anything within busybox after exiting that tool's main, then the global
>>> >state issues we weren't sure of with `cl::opt` might not be of any concern
>>> >now. It may be an issue down the line if, let's say, the tool flags moved
>>> >from being "owned" by the tools themselves to instead being "owned" by
>>> >busybox, and then we'd have to merge similarly-named flags together. In
>>> >that case, migrating these tools to use `OptTable` may be necessary since
>>> >(I think) `OptTable` should handle this. This may be a tedious task, but
>>> >this is just to say that busybox won't need to be immediately blocked on it.
>>>
>>> Such improvement is useful even if we don't do multiplexing.
>>> I switched llvm-symbolizer. thakis switched llvm-objdump.
>>> I can look at some binary utilities.
>>>
>>> >- I haven't seen any issues with colliding symbols when linking (although
>>> >I've only merged two tools for now). I suspect that with small-ish llvm-*
>>> >tools, the bulk of their code is shared from libLLVM, and they have their
>>> >own distinct logic built on top of it, which could mean a low chance of
>>> >conflicting internal ABIs.
>>> >
>>> >On Mon, Jun 21, 2021 at 10:54 AM Leonard Chan <leonardchan at google.com>
>>> >wrote:
>>> >
>>> >> Hello all,
>>> >>
>>> >> When building LLVM tools, including Clang and lld, it's currently possible
>>> >> to use either static or shared linking for LLVM libraries. The latter can
>>> >> significantly reduce the size of the toolchain since we aren't duplicating
>>> >> the same code in every binary, but the dynamic relocations can affect
>>> >> performance. The former doesn't affect performance but significantly
>>> >> increases the size of our toolchain.
>>> >>
>>> >> We would like to implement support for a third approach which we call,
>>> >> for lack of a better term, the "busybox" feature, where everything is
>>> >> compiled into a single binary which then dispatches into an appropriate
>>> >> tool depending on the first command. This approach can significantly
>>> >> reduce the size by deduplicating all of the shared code without affecting
>>> >> the performance.
>>> >>
>>> >> In terms of implementation, the build would produce a single binary called
>>> >> `llvm` and the first command would identify the tool. For example, instead
>>> >> of invoking `llvm-nm` you'd invoke `llvm nm`. Ideally we would also support
>>> >> creation of an `llvm-nm` symlink which redirects to `llvm` for backwards
>>> >> compatibility.
>>> >> This functionality would ideally be implemented as an option in the CMake
>>> >> build that toolchain vendors can opt into.
>>> >>
>>> >> The implementation would have to replace the `main` function of each tool
>>> >> with a regular entrypoint function which is registered into a tool registry.
>>> >> This could be wrapped in a macro for convenience. When the "busybox"
>>> >> feature is disabled, the macro would expand to a `main` function as before
>>> >> and redirect to the entrypoint function. When the "busybox" feature is
>>> >> enabled, it would register the entrypoint function into the registry, which
>>> >> would be responsible for the dispatching based on the tool name. Ideally,
>>> >> toolchain maintainers would also be able to control which tools they
>>> >> add to the "busybox" binary via CMake build options, so toolchains will
>>> >> only include the tools they use.
>>> >>
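The entrypoint-plus-registry scheme described above can be sketched in plain C++. This is only an illustration under stated assumptions, not the prototype in D104686: `ToolEntry`, `toolRegistry`, `ToolRegistration`, `TOY_TOOL_MAIN`, `TOY_BUSYBOX` and `dispatch` are all hypothetical names, `TOY_BUSYBOX` stands in for the proposed CMake option, and the two "tools" are stubs that return marker values.

```cpp
#include <cassert>
#include <map>
#include <string>

// Signature shared by every tool entrypoint (hypothetical name).
using ToolEntry = int (*)(int argc, char **argv);

// Process-wide registry mapping a tool name to its entrypoint. The
// function-local static sidesteps static-initialization-order problems.
inline std::map<std::string, ToolEntry> &toolRegistry() {
  static std::map<std::string, ToolEntry> Registry;
  return Registry;
}

// Constructing one of these at namespace scope registers a tool before
// any dispatching code runs.
struct ToolRegistration {
  ToolRegistration(const char *Name, ToolEntry Entry) {
    toolRegistry()[Name] = Entry;
  }
};

// TOY_BUSYBOX is an assumption standing in for the proposed build option.
#ifndef TOY_BUSYBOX
#define TOY_BUSYBOX 1
#endif

#if TOY_BUSYBOX
// Busybox mode: register the entrypoint under the tool name.
#define TOY_TOOL_MAIN(Name, Entry) static ToolRegistration Registration_##Entry(Name, Entry);
#else
// Standalone mode: emit a conventional main that forwards to the entrypoint.
// Only one tool per binary may use the macro in this mode.
#define TOY_TOOL_MAIN(Name, Entry) int main(int argc, char **argv) { return Entry(argc, argv); }
#endif

// Stand-ins for two tools; the return values are arbitrary markers.
static int nm_main(int, char **) { return 10; }
static int objdump_main(int, char **) { return 20; }
TOY_TOOL_MAIN("nm", nm_main)
TOY_TOOL_MAIN("objdump", objdump_main)

// What the busybox binary's own main would forward to: `llvm nm ...` runs nm.
int dispatch(int argc, char **argv) {
  assert(argv != nullptr);
  if (argc < 2)
    return 1; // no tool named on the command line
  auto It = toolRegistry().find(argv[1]);
  if (It == toolRegistry().end())
    return 1; // unknown tool
  // Shift the arguments so the tool sees its own name as argv[0].
  return It->second(argc - 1, argv + 1);
}
```

In busybox mode the single `main` of the combined binary would simply `return dispatch(argc, argv);`. Restricting which tools get linked in, as the paragraph above suggests, would then be a build-system decision rather than something this sketch models.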
>>> >> One implementation detail we think will be an issue is merging arguments
>>> >> in individual tools that use `cl::opt`. `cl::opt` works by maintaining a
>>> >> global state of flags, but we aren’t confident of what the resulting
>>> >> behavior will be when merging them together in the dispatching `main`. What
>>> >> we would like to avoid is having flags used by one specific tool available
>>> >> on other tools. To address this issue, we would like to migrate all tools
>>> >> to use `OptTable`, which doesn't have this issue and has been the general
>>> >> direction most tools have already been moving in.
>>> >>
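The concern about global flag state can be illustrated with a toy model. The sketch below deliberately does not use the real `cl::opt` API; `ToyOpt`, `globalFlags`, `flagIsVisible` and the flag names are hypothetical, and the only behavior modeled is that constructing an option object adds its name to a process-wide table.

```cpp
#include <cassert>
#include <set>
#include <string>

// Process-wide table of registered flag names (toy stand-in, not LLVM code).
std::set<std::string> &globalFlags() {
  static std::set<std::string> Flags;
  return Flags;
}

// Toy option: merely constructing it registers the flag globally.
struct ToyOpt {
  explicit ToyOpt(const char *Name) {
    // Two tools defining the same flag name would collide here.
    bool Inserted = globalFlags().insert(Name).second;
    assert(Inserted && "flag registered twice");
    (void)Inserted;
  }
};

// Options that would normally live in two different tools' source files.
static ToyOpt AlphaOnly("alpha-only-flag"); // belongs to a hypothetical "tool alpha"
static ToyOpt BetaOnly("beta-only-flag");   // belongs to a hypothetical "tool beta"

// Once both tools are linked into one binary, a parser that consults the
// global table sees both flags, regardless of which tool was invoked.
bool flagIsVisible(const std::string &Name) {
  return globalFlags().count(Name) != 0;
}
```

In this model, "tool alpha" would accept `beta-only-flag` simply because both objects were constructed in the same process, which is the cross-tool leakage the paragraph above wants to avoid.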
>>> >> A second issue would be resolving symlinks. For example, llvm-objcopy will
>>> >> check argv[0] and behave as llvm-strip (i.e. use the right flags +
>>> >> configuration) if it is called via a symlink that “looks like” a strip
>>> >> tool, but for all other cases it will run under the default objcopy mode.
>>> >> The “looks like” function is usually an `Is` function copied in multiple
>>> >> tools that is essentially a substring check: so symlinks like `llvm-strip`,
>>> >> `strip.exe`, and `gnu-llvm-strip-10` all result in using the strip “mode”
>>> >> while all other names use the objcopy mode. To replicate the same behavior,
>>> >> we will need to take great care in making sure symlinks to the busybox tool
>>> >> dispatch correctly to the appropriate llvm tool, which might mean exposing
>>> >> and merging these `Is` functions.
>>> >>
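The argv[0] check described above can be approximated as follows. This is a sketch, not the code in the LLVM tree: `stem`, `looksLike` and `selectMode` are hypothetical names, and the rule that the match must be followed by a non-alphanumeric character (or the end of the name) is an assumption added so that unrelated names containing "strip" do not match.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// File name without its directory and without a trailing extension.
std::string stem(const std::string &Path) {
  std::size_t Slash = Path.find_last_of("/\\");
  std::string Base = (Slash == std::string::npos) ? Path : Path.substr(Slash + 1);
  std::size_t Dot = Base.rfind('.');
  if (Dot != std::string::npos && Dot != 0)
    Base.erase(Dot);
  return Base;
}

// Case-insensitive check: does the invoked name contain Tool, followed by
// either the end of the name or a non-alphanumeric character?
// Tool is expected to be lowercase.
bool looksLike(const std::string &Argv0, const std::string &Tool) {
  assert(!Tool.empty());
  std::string S = stem(Argv0);
  std::transform(S.begin(), S.end(), S.begin(), [](unsigned char C) {
    return static_cast<char>(std::tolower(C));
  });
  std::size_t I = S.rfind(Tool);
  if (I == std::string::npos)
    return false;
  std::size_t End = I + Tool.size();
  return End == S.size() || !std::isalnum(static_cast<unsigned char>(S[End]));
}

// Pick the behavior from the name the binary was invoked under.
std::string selectMode(const std::string &Argv0) {
  return looksLike(Argv0, "strip") ? "strip" : "objcopy";
}
```

With this sketch, `llvm-strip`, `strip.exe` and `gnu-llvm-strip-10` select the strip mode and `llvm-objcopy` falls back to objcopy. A busybox dispatcher would need one shared version of such a check so that symlink names resolve to the same tool and mode as they do today.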
>>> >> Some open questions:
>>> >> - People's initial thoughts/opinions?
>>> >> - Are there existing tools in LLVM that already do this?
>>> >> - Other implementation details/global states that we would also need to
>>> >> account for?
>>> >>
>>> >> - Leonard
>>> >>
>>>

David Blaikie via llvm-dev

2021-Jun-23 05:55 UTC

On Tue, Jun 22, 2021 at 10:20 PM Petr Hosek <phosek at google.com> wrote:
> I guess this depends on a particular implementation of the distributed
> build system. In the case of Goma, we only supply the compiler binary which
> was invoked as the command (that binary links glibc as a shared library but
> we assume that one is supplied by the host system), all other files like
> headers are passed together with the compiler invocation as inputs. If we
> used dynamic linking, Goma would need to figure out what other shared
> libraries need to be sent to the server. It's certainly doable but it's an
> extra complexity we would like to avoid.
>
Curious/fair enough - good to know!


Fāng-ruì Sòng via llvm-dev

2021-Jun-23 06:08 UTC

On Tue, Jun 22, 2021 at 10:20 PM Petr Hosek <phosek at google.com> wrote:
>
> I guess this depends on a particular implementation of the distributed
> build system. In the case of Goma, we only supply the compiler binary which
> was invoked as the command (that binary links glibc as a shared library but
> we assume that one is supplied by the host system), all other files like
> headers are passed together with the compiler invocation as inputs. If we
> used dynamic linking, Goma would need to figure out what other shared
> libraries need to be sent to the server. It's certainly doable but it's an
> extra complexity we would like to avoid.

For non-clang executables, -DLLVM_LINK_LLVM_DYLIB=on just adds one
more DT_NEEDED entry.
That DT_NEEDED entry can be resolved through a $ORIGIN-based DT_RUNPATH.
Can Goma detect the libraries shipped with the tools?
I ask because I feel this could be an artificial limitation which
could be straightforwardly addressed in Goma.
A toolchain executable using an accompanying shared object is not rare
(thinking of plugins).

Multiplexing LLVM tools is one alternative but I am a bit concerned
with the extra complexity and the new configuration the build system
needs to support.

https://lists.llvm.org/pipermail/llvm-dev/2021-June/151338.html
mentioned another approach which doesn't require intrusive
modification to the tools.

As for PGO+LTO, you can apply them to libLLVM-13git.so as well.


-- 
宋方睿
