thr3ads.net - llvm dev - [llvm-dev] [RFC] LLVM Busybox Proposal [Jun 2021]

Petr Hosek via llvm-dev

2021-Jun-23 05:00 UTC

[llvm-dev] [RFC] LLVM Busybox Proposal

From our perspective as a toolchain vendor, even if using shared libraries
could get us closer to static linking in terms of performance, we'd still
prefer static linking for the ease of distribution. Dealing with a single
statically linked executable is much easier than dealing with multiple
shared libraries. This is especially important in distributed compilation
environments like Goma.

When comparing performance between static and dynamic linking, I'd also
recommend doing a comparison between binaries built with PGO+LTO. Plain -O3
leaves a lot of performance on the table and as far as I'm aware, most
toolchain vendors use PGO+LTO.

On Tue, Jun 22, 2021 at 5:00 PM Fangrui Song via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On 2021-06-22, Leonard Chan via llvm-dev wrote:
> >Small update: I have a WIP prototype of the tool at
> >https://reviews.llvm.org/D104686. The prototype only includes llvm-objcopy
> >and llvm-objdump packed together, but we're seeing size benefits from
> >busyboxing those two compared against having two separate tools. (More
> >details in the prototype's description.) I don't plan on landing this as-is
> >anytime soon and there are still some things I'd like to improve/change and
> >get feedback on.
> >
> >To answer some replies:
> >
> >- Ideally, we could start off with an incremental approach and not package
> >large tools like clang/lld off the bat. The llvm-* tools seem like a good
> >place to start since they're generally a bunch of relatively small binaries
> >that all share a subset of functions in libLLVM, but don't necessarily use
> >all of libLLVM, so statically linking them together (with --gc-sections)
> >can help dedup a lot of shared components vs having separate statically
> >compiled tools. In my measurements, the busybox tool containing
> >llvm-objcopy+objdump is negligibly larger than llvm-objdump on its own (a
> >couple KB difference) indicating a lot of shared code between objdump and
> >objcopy.
> >
> >- Will Dietz's multiplexing tool looks like a good place to start from. The
> >only concern I can see though is mostly the amount of work needed to update
> >it to LLVM 13.
> >
> >- We don't have plans for Windows support now, but it's not off the table.
> >(Been mostly focusing on *nix for now). Depending on overall traction for
> >this idea, we could approach incrementally and add support for different
> >platforms over time.
>
> -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on
> -DLLVM_TARGETS_TO_BUILD=X86 (custom1)
> vs
> -DLLVM_TARGETS_TO_BUILD=X86 (custom2)
>
>
> # This is the lower bound for any multiplexing approach. clang is the largest executable.
> % stat -c %s /tmp/out/custom2/bin/clang-13
> 102900408
>
> I have built clang, lld and a bunch of ELF binary utilities.
>
> % stat -c %s /tmp/out/custom1/lib/libLLVM-13git.so \
>     /tmp/out/custom1/lib/libclang-cpp.so.13git \
>     /tmp/out/custom1/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}} \
>     | awk '{s+=$1}END{print s}'
> 138896544
>
> % stat -c %s \
>     /tmp/out/custom2/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}} \
>     | awk '{s+=$1}END{print s}'
> 209054440
>
>
> The -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on build is doing a
> really good job.
>
> A multiplexing approach can squeeze some bytes from 138896544 toward
> 102900408, but how much can it do?
>
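To make the numbers above easier to compare, here is a small arithmetic sketch. It only restates the three byte counts quoted in the measurements; the constant and function names are made up for illustration, and it takes the "clang alone is the lower bound" premise from the message as given rather than proving it.

```cpp
#include <cassert>
#include <cstdint>

// Byte counts copied from the measurements quoted above.
constexpr std::int64_t StaticTotal = 209054440; // custom2: all tools, static
constexpr std::int64_t DylibTotal = 138896544;  // custom1: tools + two .so files
constexpr std::int64_t ClangAlone = 102900408;  // custom2: clang-13 only

// Bytes saved by the shared-library build relative to the static build.
constexpr std::int64_t savedByDylib() { return StaticTotal - DylibTotal; }

// Upper limit on what multiplexing could still save relative to the
// shared-library build, assuming clang alone is the floor.
constexpr std::int64_t maxExtraSaving() { return DylibTotal - ClangAlone; }

// Helper (hypothetical name) to express a byte count as a percentage.
double percentOf(std::int64_t Part, std::int64_t Whole) {
  assert(Whole > 0 && "whole must be positive");
  return 100.0 * static_cast<double>(Part) / static_cast<double>(Whole);
}
```

With these figures, `savedByDylib()` is 70157896 bytes (about 33.6% of the static total), and `maxExtraSaving()` is 35996136 bytes (about 25.9% of the shared-library total), which is the most a multiplexed binary could still recover under that premise.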
>
> >- I'm starting to think the `cl::opt` to `OptTable` issue might be
> >orthogonal to the busybox implementation. The tool essentially dispatches
> >to different "main" functions in different tools, but as long as we don't
> >do anything within busybox after exiting that tool's main, then the global
> >state issues we weren't sure of with `cl::opt` might not be of any concern
> >now. It may be an issue down the line if, let's say, the tool flags moved
> >from being "owned" by the tools themselves to instead being "owned" by
> >busybox, and then we'd have to merge similarly-named flags together. In
> >that case, migrating these tools to use `OptTable` may be necessary since
> >(I think) `OptTable` should handle this. This may be a tedious task, but
> >this is just to say that busybox won't need to be immediately blocked on
> >it.
>
> Such improvement is useful even if we don't do multiplexing.
> I switched llvm-symbolizer. thakis switched llvm-objdump.
> I can look at some binary utilities.
>
> >- I haven't seen any issues with colliding symbols when linking (although
> >I've only merged two tools for now). I suspect that with small-ish llvm-*
> >tools, the bulk of their code is shared from libLLVM, and they have their
> >own distinct logic built on top of it, which could mean a low chance of
> >conflicting internal ABIs.
> >
> >On Mon, Jun 21, 2021 at 10:54 AM Leonard Chan <leonardchan at google.com>
> >wrote:
> >
> >> Hello all,
> >>
> >> When building LLVM tools, including Clang and lld, it's currently
> >> possible to use either static or shared linking for LLVM libraries. The
> >> latter can significantly reduce the size of the toolchain since we aren't
> >> duplicating the same code in every binary, but the dynamic relocations can
> >> affect performance. The former doesn't affect performance but
> >> significantly increases the size of our toolchain.
> >>
> >> We would like to implement support for a third approach, which we call,
> >> for lack of a better term, the "busybox" feature, where everything is
> >> compiled into a single binary which then dispatches into an appropriate
> >> tool depending on the first command. This approach can significantly
> >> reduce the size by deduplicating all of the shared code without affecting
> >> the performance.
> >>
> >> In terms of implementation, the build would produce a single binary
> >> called `llvm` and the first command would identify the tool. For example,
> >> instead of invoking `llvm-nm` you'd invoke `llvm nm`. Ideally we would
> >> also support creation of an `llvm-nm` symlink which redirects to `llvm`
> >> for backwards compatibility.
> >> This functionality would ideally be implemented as an option in the CMake
> >> build that toolchain vendors can opt into.
> >>
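As an illustration of the two invocation styles described above (`llvm nm` and an `llvm-nm` symlink), the following is a minimal sketch of how a dispatcher might work out which tool was requested. The names `baseName`, `Invocation`, and `resolveInvocation` are hypothetical and not taken from the prototype; the rule of stripping an `llvm-` prefix is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: drop any directory components from a path.
std::string baseName(const std::string &Path) {
  std::size_t Slash = Path.find_last_of("/\\");
  return Slash == std::string::npos ? Path : Path.substr(Slash + 1);
}

// The tool that was requested plus the arguments meant for that tool.
struct Invocation {
  std::string Tool;              // empty if no tool could be determined
  std::vector<std::string> Args; // arguments after the tool name
};

Invocation resolveInvocation(const std::vector<std::string> &Argv) {
  Invocation Inv;
  if (Argv.empty())
    return Inv;
  std::string Name = baseName(Argv[0]);
  std::size_t First = 1;
  if (Name == "llvm") {
    // `llvm nm a.o`: the first command names the tool.
    if (Argv.size() < 2)
      return Inv;
    Inv.Tool = Argv[1];
    First = 2;
  } else if (Name.rfind("llvm-", 0) == 0) {
    // `llvm-nm a.o` via symlink: assumed rule, strip the "llvm-" prefix.
    Inv.Tool = Name.substr(5);
  } else {
    Inv.Tool = Name;
  }
  assert(First <= Argv.size());
  Inv.Args.assign(Argv.begin() + First, Argv.end());
  return Inv;
}
```

Under these assumptions, `{"/usr/bin/llvm", "nm", "a.o"}` and `{"llvm-nm", "a.o"}` both resolve to the tool `nm` with the single argument `a.o`.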
> >> The implementation would have to replace the `main` function of each
> >> tool with a regular entrypoint function which is registered into a tool
> >> registry. This could be wrapped in a macro for convenience. When the
> >> "busybox" feature is disabled, the macro would expand to a `main` function
> >> as before and redirect to the entrypoint function. When the "busybox"
> >> feature is enabled, it would register the entrypoint function into the
> >> registry, which would be responsible for the dispatching based on the
> >> tool name. Ideally, toolchain maintainers would also be able to control
> >> which tools they add to the "busybox" binary via CMake build options, so
> >> toolchains will only include the tools they use.
> >>
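The macro-plus-registry idea in the paragraph above could look roughly like the sketch below. This is not the code from the prototype: `BUSYBOX_SKETCH_ENABLED`, `TOOL_ENTRYPOINT`, `ToolRegistration`, `toolRegistry`, and `dispatchTool` are all invented names, and the two placeholder tools return distinct codes only so the dispatch is observable.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>

// Assumption: a real build would set this from a CMake option.
#define BUSYBOX_SKETCH_ENABLED 1

using ToolEntry = int (*)(int Argc, char **Argv);

// A function-local static avoids static-initialization-order problems
// when registrations live in different translation units.
inline std::map<std::string, ToolEntry> &toolRegistry() {
  static std::map<std::string, ToolEntry> Registry;
  return Registry;
}

// Constructing one of these at namespace scope registers a tool.
struct ToolRegistration {
  ToolRegistration(const char *Name, ToolEntry Entry) {
    assert(Entry && "entry point must not be null");
    toolRegistry()[Name] = Entry;
  }
};

#if BUSYBOX_SKETCH_ENABLED
// Busybox mode: register the entry point under the tool name.
#define TOOL_ENTRYPOINT(Name, Entry)                                           \
  static ToolRegistration Entry##Registration(Name, Entry);
#else
// Standalone mode: one tool per binary, so the macro emits `main`.
#define TOOL_ENTRYPOINT(Name, Entry)                                           \
  int main(int Argc, char **Argv) { return Entry(Argc, Argv); }
#endif

// Placeholder tools standing in for the real entry points.
static int nmEntry(int, char **) { return 10; }
static int objdumpEntry(int, char **) { return 20; }
TOOL_ENTRYPOINT("nm", nmEntry)
TOOL_ENTRYPOINT("objdump", objdumpEntry)

// Look up the tool by name and run it; unknown names yield exit code 1.
int dispatchTool(const std::string &Name, int Argc, char **Argv) {
  auto It = toolRegistry().find(Name);
  if (It == toolRegistry().end()) {
    std::fprintf(stderr, "unknown tool: %s\n", Name.c_str());
    return 1;
  }
  return It->second(Argc, Argv);
}
```

The busybox `main` would then call something like `dispatchTool` with the resolved tool name and return its result directly, which matches the later observation in the thread that nothing needs to happen after the tool's own main exits.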
> >> One implementation detail we think will be an issue is merging arguments
> >> in individual tools that use `cl::opt`. `cl::opt` works by maintaining a
> >> global state of flags, but we aren’t confident of what the resulting
> >> behavior will be when merging them together in the dispatching `main`.
> >> What we would like to avoid is having flags used by one specific tool
> >> available on other tools. To address this issue, we would like to migrate
> >> all tools to use `OptTable`, which doesn't have this issue and is the
> >> general direction most tools have already been moving in.
> >>
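The global-state concern can be illustrated with a toy model. The sketch below deliberately does not use the real `cl::opt` API; `GlobalFlag`, `flagOwners`, and `collidingFlags` are hypothetical stand-ins, and the flag names are only examples. It shows the general pattern of flags that register themselves in a process-wide table from static constructors, so that linking two tools into one binary puts every flag into the same table.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical process-wide table: flag name -> tools that declared it.
std::map<std::string, std::vector<std::string>> &flagOwners() {
  static std::map<std::string, std::vector<std::string>> Owners;
  return Owners;
}

// Toy stand-in for a self-registering global option (not cl::opt).
struct GlobalFlag {
  GlobalFlag(const std::string &Tool, const std::string &Name) {
    assert(!Name.empty() && "flag name must not be empty");
    flagOwners()[Name].push_back(Tool);
  }
};

// Once both tools are linked into one binary, all of these constructors
// run at startup regardless of which tool is eventually dispatched.
static GlobalFlag ObjcopyStripAll("objcopy", "strip-all");
static GlobalFlag ObjcopyVersion("objcopy", "version");
static GlobalFlag ObjdumpDisassemble("objdump", "disassemble");
static GlobalFlag ObjdumpVersion("objdump", "version");

// Flags declared by more than one tool would have to be merged somehow.
std::vector<std::string> collidingFlags() {
  std::vector<std::string> Result;
  for (const auto &Entry : flagOwners())
    if (Entry.second.size() > 1)
      Result.push_back(Entry.first);
  return Result;
}
```

In this model `strip-all` is present in the table even when the binary runs as objdump, and `version` is declared twice; those are the two effects the paragraph above wants to avoid.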
> >> A second issue would be resolving symlinks. For example, llvm-objcopy
> >> will check argv[0] and behave as llvm-strip (i.e. use the right flags +
> >> configuration) if it is called via a symlink that “looks like” a strip
> >> tool, but for all other cases it will run under the default objcopy mode.
> >> The “looks like” function is usually an `Is` function copied in multiple
> >> tools that is essentially a substring check: so symlinks like
> >> `llvm-strip`, `strip.exe`, and `gnu-llvm-strip-10` all result in using the
> >> strip “mode” while all other names use the objcopy mode. To replicate the
> >> same behavior, we will need to take great care in making sure symlinks to
> >> the busybox tool dispatch correctly to the appropriate llvm tool, which
> >> might mean exposing and merging these `Is` functions.
> >>
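A "looks like" check of the kind described above might be sketched as follows. This is written against `std::string` rather than LLVM's own string types and is not copied from any tool; the function name `looksLikeTool`, the case-insensitive match, and the rule that the match must be followed by the end of the name or a non-alphanumeric character are assumptions chosen so that the three example names from the message are accepted.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if the program name `Stem` appears to refer to `Tool`.
// `Tool` is expected to be lowercase, e.g. "strip".
bool looksLikeTool(std::string Stem, const std::string &Tool) {
  assert(!Tool.empty() && "tool name must not be empty");
  // Compare case-insensitively by lowercasing the program name.
  std::transform(Stem.begin(), Stem.end(), Stem.begin(), [](unsigned char C) {
    return static_cast<char>(std::tolower(C));
  });
  // Use the last occurrence of the tool name in the program name.
  std::size_t Pos = Stem.rfind(Tool);
  if (Pos == std::string::npos)
    return false;
  // Accept only if the match ends the name or is followed by a
  // separator such as '.' or '-', so "stripper" is not treated as strip.
  std::size_t End = Pos + Tool.size();
  return End == Stem.size() ||
         !std::isalnum(static_cast<unsigned char>(Stem[End]));
}
```

With this rule, `llvm-strip`, `strip.exe`, and `gnu-llvm-strip-10` all select the strip mode, while `llvm-objcopy` does not. A busybox dispatcher would need to apply the same kind of test to symlink names before choosing an entry point.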
> >> Some open questions:
> >> - People's initial thoughts/opinions?
> >> - Are there existing tools in LLVM that already do this?
> >> - Other implementation details/global states that we would also need to
> >> account for?
> >>
> >> - Leonard
> >>
>
> >_______________________________________________
> >LLVM Developers mailing list
> >llvm-dev at lists.llvm.org
> >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

David Blaikie via llvm-dev

2021-Jun-23 05:09 UTC

[llvm-dev] [RFC] LLVM Busybox Proposal

On Tue, Jun 22, 2021 at 10:00 PM Petr Hosek via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> From our perspective as a toolchain vendor, even if using shared libraries
> could get us closer to static linking in terms of performance, we'd still
> prefer static linking for the ease of distribution. Dealing with a single
> statically linked executable is much easier than dealing with multiple
> shared libraries. This is especially important in distributed compilation
> environments like Goma.
>
What makes it especially complicated for distributed compilation
environments? (I'd expect a toolchain contains so many files that whether
it's one binary, or a binary and a handful of shared libraries wouldn't
change the general implementation complexity of a distributed build system?)

