thr3ads.net - llvm dev - [llvm-dev] [RFC] LLVM Busybox Proposal [Sep 2021]

Chris Bieneman via llvm-dev

2021-Sep-16 22:40 UTC

[llvm-dev] [RFC] LLVM Busybox Proposal

Hi all,

Apologies for reviving a long-aged thread here. I hadn't followed this
thread when it came up, and independently started playing with the same basic
idea. I have a prototype implementation on my GitHub[1], which creates an
llvm-driver that can execute clang, dsymutil, llvm-ar, llvm-cxxfilt,
llvm-dwarfdump, and llvm-objcopy.

The pattern can be applied fairly simply to additional tools (with one _huge_
caveat that I'll go into below). For most tools to be built into the
multicall binary, the only required changes are adding the `GENERATE_DRIVER`
option to the `add_llvm_tool` CMake call and changing the `main` function to be
prefixed by the tool's name (e.g. llvm_objcopy_main, clang_main, etc).

As an example, the full diffs for llvm-objcopy are:

```
diff --git a/llvm/tools/llvm-objcopy/CMakeLists.txt b/llvm/tools/llvm-objcopy/CMakeLists.txt
index d14d2135f5db..644dec79bc50 100644
--- a/llvm/tools/llvm-objcopy/CMakeLists.txt
+++ b/llvm/tools/llvm-objcopy/CMakeLists.txt
@@ -43,6 +43,7 @@ add_llvm_tool(llvm-objcopy
   ObjcopyOptsTableGen
   InstallNameToolOptsTableGen
   StripOptsTableGen
+  GENERATE_DRIVER
   )
 
 add_llvm_tool_symlink(llvm-install-name-tool llvm-objcopy)
diff --git a/llvm/tools/llvm-objcopy/llvm-objcopy.cpp b/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
index ad166487eb78..bd5556f225b2 100644
--- a/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
+++ b/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
@@ -401,7 +401,7 @@ static Error executeObjcopy(ConfigManager &ConfigMgr) {
   return Error::success();
 }
 
-int main(int argc, char **argv) {
+int llvm_objcopy_main(int argc, char **argv) {
   InitLLVM X(argc, argv);
   ToolName = argv[0];
```

With some clever CMake goop, any tool that opts into being part of the merged
driver gets a generated template main function, and much of the other
boilerplate code is generated too. As implemented, each tool is built both as
its own standalone binary _and_ into the llvm-driver tool. If this is a
desirable route, part of the patch to make this "real" would be adding an option
to disable building the standalone tool and instead generate a symlink from the
tool to llvm-driver.
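
For illustration, here is a minimal sketch of how such a multicall entry point
might dispatch. It is not the generated code from the prototype. The names
`driver_main`, `ToolEntry`, and `Tools` are hypothetical, and the two tool entry
points are stubs that return distinct codes so the dispatch is observable. The
sketch assumes the driver first checks the name it was invoked under (the
symlink case) and then falls back to treating the first argument as the tool
name.

```cpp
#include <cassert>
#include <string>

// Stubs standing in for the renamed per-tool `main` functions.
// The return values are arbitrary markers, not real tool behavior.
int llvm_objcopy_main(int, char **) { return 10; }
int llvm_ar_main(int, char **) { return 20; }

// Hypothetical table mapping a tool name to its entry point.
struct ToolEntry {
  const char *Name;
  int (*Main)(int, char **);
};

static const ToolEntry Tools[] = {
    {"llvm-objcopy", llvm_objcopy_main},
    {"llvm-ar", llvm_ar_main},
};

// Strip any directory components from an invocation path.
static std::string baseName(const std::string &Path) {
  std::string::size_type Slash = Path.find_last_of("/\\");
  return Slash == std::string::npos ? Path : Path.substr(Slash + 1);
}

// Exact-name lookup; returns nullptr for unknown tools.
static const ToolEntry *findTool(const std::string &Name) {
  for (const ToolEntry &T : Tools)
    if (Name == T.Name)
      return &T;
  return nullptr;
}

// Hypothetical driver entry point.
int driver_main(int argc, char **argv) {
  if (argc < 1)
    return 1;
  // Case 1: invoked through a symlink named after a tool.
  if (const ToolEntry *T = findTool(baseName(argv[0])))
    return T->Main(argc, argv);
  // Case 2: invoked as `driver <tool> args...`; accept "ar" or "llvm-ar".
  if (argc > 1) {
    std::string Sub = argv[1];
    const ToolEntry *T = findTool(Sub);
    if (!T)
      T = findTool("llvm-" + Sub);
    if (T)
      return T->Main(argc - 1, argv + 1); // the tool sees its own name as argv[0]
  }
  return 1; // unknown tool
}
```

A real driver would call `driver_main` from `main`. Note that the exact-name
lookup here is stricter than the substring matching some tools apply to
`argv[0]`, so a name such as `gnu-llvm-strip-10` would need extra handling.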

This implementation does require CMake 3.12 or later, since CMake 3.12 allows
linkage dependencies for object libraries, which the implementation depends on.

The _huge_ caveat is that cl::opt haunts all things I do in LLVM. I tried adding
clang-tidy to the tools, and it will build fine, but crashes on launch because
of duplicate command line options being registered (d'oh!). cl::opt's
continued reliance on globals means that it is ill-suited for the construction
of a mega-llvm-driver.
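
The failure mode can be sketched without LLVM at all. The following is a
simplified model, not the real `cl::opt` implementation: `OptionRegistry` and
the two `registerTool*Options` functions are hypothetical stand-ins. It assumes
only what is described above, namely that options register themselves by name
in shared state and that a duplicate name is an error.

```cpp
#include <cassert>
#include <set>
#include <string>

// Stand-in for a process-wide option table (hypothetical; not LLVM's cl::opt).
class OptionRegistry {
  std::set<std::string> Names;

public:
  // Returns false if the name is already registered.
  bool add(const std::string &Name) { return Names.insert(Name).second; }
};

// Hypothetical option sets for two tools that both define "verbose".
bool registerToolAOptions(OptionRegistry &R) {
  bool Ok = R.add("verbose");
  Ok = R.add("output") && Ok;
  return Ok;
}

bool registerToolBOptions(OptionRegistry &R) {
  bool Ok = R.add("verbose"); // collides if tool A registered first
  Ok = R.add("checks") && Ok;
  return Ok;
}
```

With one registry per binary, both registrations succeed. With a single shared
registry, which is the situation inside a merged driver, the second
registration reports a collision.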

This is something that has come up time and time again in many different
contexts, but we've never really had the community effort behind resolving
it.

Using OptTable has been suggested, but one of the common complaints is that
OptTable for tool options is unwieldy and overly complicated for small simple
tools. Also, there isn't a good way to handle options that are buried inside
libraries, and many of the cl::opt options are.

Many years ago I initiated lengthy discussions on llvm-dev and at llvm socials
about an alternate approach [2], but it was a half measure at best. One of the
challenges that it didn't solve was the need for registering command line
options for all the debugging values buried in the passes. I don't mean to
derail this effort by sending us all down a rabbit hole, but I also think that
for a real, tractable llvm Busybox/multicall binary solution, we really need to
do something about cl::opt.

-Chris

[1] https://github.com/llvm-beanz/llvm-project/commit/59befc0116b982b6905822e5c5109bb6a4d397b0
[2] https://lists.llvm.org/pipermail/llvm-dev/2014-August/075855.html

> On Jun 23, 2021, at 6:32 PM, Mehdi AMINI via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> On Wed, Jun 23, 2021 at 3:52 PM Fāng-ruì Sòng <maskray at google.com> wrote:
> On Wed, Jun 23, 2021 at 3:43 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
> >
> > On Tue, Jun 22, 2021 at 11:09 PM Fāng-ruì Sòng via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >>
> >> On Tue, Jun 22, 2021 at 10:20 PM Petr Hosek <phosek at google.com> wrote:
> >> >
> >> > I guess this depends on a particular implementation of the
> >> > distributed build system. In the case of Goma, we only supply the
> >> > compiler binary which was invoked as the command (that binary links
> >> > glibc as a shared library but we assume that one is supplied by the
> >> > host system), all other files like headers are passed together with
> >> > the compiler invocation as inputs. If we used dynamic linking, Goma
> >> > would need to figure out what other shared libraries need to be sent
> >> > to the server. It's certainly doable but it's an extra complexity we
> >> > would like to avoid.
> >>
> >> For non-clang executables, -DLLVM_LINK_LLVM_DYLIB=on just adds one
> >> more DT_NEEDED.
> >> The DT_NEEDED entry can use a $ORIGIN based DT_RUNPATH. Can Goma
> >> detect the libraries shipped with the tools?
> >> I asked because I feel this could be an artificial limitation which
> >> could be straightforwardly addressed in Goma.
> >> A toolchain executable using a accompanying shared object is not rare
> >> (thinking of plugins).
> >>
> >> Multiplexing LLVM tools is one alternative but I am a bit concerned
> >> with the extra complexity and the new configuration the build system
> >> needs to support.
> >>
> >> https://lists.llvm.org/pipermail/llvm-dev/2021-June/151338.html
> >> mentioned another approach which doesn't require intrusive
> >> modification to the tools.
> >>
> >> As for PGO+LTO, you can apply them to libLLVM-13git.so as well.
> >
> >
> > Some thoughts if we're getting into PGO+LTO territory, I feel that both
> > methods presented here will be at a disadvantage compared to building
> > clang and lld into their own binaries.
> > For example I remember that on Mac an important optimization for clang
> > builds was to order the functions in the binary roughly in the order in
> > which they are first encountered during execution, assuming the same
> > behavior for lld you can see the conflicting optimization goal... You can
> > also think about how libSupport may be differently "hot" on a clang PGO
> > profile compared to lld and would result in different optimization.
> 
> If PGO+LTO is desired, the executables can be split this way, assuming
> the performance of
> llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}
> doesn't matter.
> 
> * clang (libLLVM*.a)
> * lld + llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}
>   (libLLVM-13git.so)
> 
> > LTO also benefits from "internalizing", basically building a static
> > binary where only `main` is exported and everything else becomes an
> > internal linkage is the best case: pointer escaping, global analysis, etc
> > all become more powerful. Optimizing a shared library kind of makes every
> > symbol public, and I suspect the busybox approach may be better on this
> > aspect (you get back to a single public main, but it can reach much more
> > code though).
> 
> With --version-script we can internalize shared object symbols as
> well. For example, this has been used to facilitate whole-program
> devirtualization (https://reviews.llvm.org/D98686).
> With https://lists.llvm.org/pipermail/llvm-dev/2021-June/151338.html
> we can get a list of roots which need to be exported.
> A thin executable plus a -fvisibility-inlines-hidden +
> -Bsymbolic-functions shared object is almost identical to a PIE.
> 
> You can get closer to it but note that:
> 
> - You have some non-trivial and non-standard build setup and scripts to
>   workaround the problem (finding roots, etc.), the busybox solution is much
>   more "clean" from this point of view if one can structure it in "normal"
>   C++.
> - How does it work on non-ELF platforms?
> - It still isn't equivalent: you're still having a large surface API
>   exported by the .so which limits what the optimizer can do (alias
>   analysis, etc.). You won't be able to inject context from the callers
>   there, or inline across the libLLVM.so boundary.
> 
> -- 
> Mehdi
> 
> 
> 
>  
> 
> >
> >>
> >>
> >> > On Tue, Jun 22, 2021 at 10:09 PM David Blaikie <dblaikie at gmail.com>
> >> > wrote:
> >> >>
> >> >> On Tue, Jun 22, 2021 at 10:00 PM Petr Hosek via llvm-dev
> >> >> <llvm-dev at lists.llvm.org> wrote:
> >> >>>
> >> >>> From our perspective as a toolchain vendor, even if using shared
> >> >>> libraries could get us closer to static linking in terms of
> >> >>> performance, we'd still prefer static linking for the ease of
> >> >>> distribution. Dealing with a single statically linked executable is
> >> >>> much easier than dealing with multiple shared libraries. This is
> >> >>> especially important in distributed compilation environments like
> >> >>> Goma.
> >> >>
> >> >> What makes it especially complicated for distributed compilation
> >> >> environments? (I'd expect a toolchain contains so many files that
> >> >> whether it's one binary, or a binary and a handful of shared libraries
> >> >> wouldn't change the general implementation complexity of a distributed
> >> >> build system?)
> >> >>
> >> >>>
> >> >>> When comparing performance between static and dynamic linking, I'd
> >> >>> also recommend doing a comparison between binaries built with
> >> >>> PGO+LTO. Plain -O3 leaves a lot of performance on the table and as far
> >> >>> as I'm aware, most toolchain vendors use PGO+LTO.
> >> >>>
> >> >>> On Tue, Jun 22, 2021 at 5:00 PM Fangrui Song via llvm-dev
> >> >>> <llvm-dev at lists.llvm.org> wrote:
> >> >>>>
> >> >>>> On 2021-06-22, Leonard Chan via llvm-dev wrote:
> >> >>>> >Small update: I have a WIP prototype of the tool at
> >> >>>> >https://reviews.llvm.org/D104686. The prototype only includes
> >> >>>> >llvm-objcopy and llvm-objdump packed together, but we're seeing size
> >> >>>> >benefits from busyboxing those two compared against having two
> >> >>>> >separate tools. (More details in the prototype's description.) I
> >> >>>> >don't plan on landing this as-is anytime soon and there's still some
> >> >>>> >things I'd like to improve/change and get feedback on.
> >> >>>> >
> >> >>>> >To answer some replies:
> >> >>>> >
> >> >>>> >- Ideally, we could start off with an incremental approach and not
> >> >>>> >package large tools like clang/lld off the bat. The llvm-* tools seem
> >> >>>> >like a good place to start since they're generally a bunch of
> >> >>>> >relatively small binaries that all share a subset of functions in
> >> >>>> >libLLVM, but don't necessarily use all of libLLVM, so statically
> >> >>>> >linking them together (with --gc-sections) can help dedup a lot of
> >> >>>> >shared components vs having separate statically compiled tools. In my
> >> >>>> >measurements, the busybox tool containing llvm-objcopy+objdump is
> >> >>>> >negligibly larger than llvm-objdump on its own (a couple KB
> >> >>>> >difference) indicating a lot of shared code between objdump and
> >> >>>> >objcopy.
> >> >>>> >
> >> >>>> >- Will Dietz's multiplexing tool looks like a good place to start
> >> >>>> >from. The only concern I can see though is mostly the amount of work
> >> >>>> >needed to update it to LLVM 13.
> >> >>>> >
> >> >>>> >- We don't have plans for windows support now, but it's not off the
> >> >>>> >table. (Been mostly focusing on *nix for now). Depending on overall
> >> >>>> >traction for this idea, we could approach incrementally and add
> >> >>>> >support for different platforms over time.
> >> >>>>
> >> >>>> -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on -DLLVM_TARGETS_TO_BUILD=X86 (custom1)
> >> >>>> vs
> >> >>>> -DLLVM_TARGETS_TO_BUILD=X86 (custom2)
> >> >>>>
> >> >>>>
> >> >>>> # This is the lower bound for any multiplexing approach. clang is the largest executable.
> >> >>>> % stat -c %s /tmp/out/custom2/bin/clang-13
> >> >>>> 102900408
> >> >>>>
> >> >>>> I have built clang, lld and a bunch of ELF binary utilities.
> >> >>>>
> >> >>>> % stat -c %s /tmp/out/custom1/lib/libLLVM-13git.so /tmp/out/custom1/lib/libclang-cpp.so.13git /tmp/out/custom1/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}} | awk '{s+=$1}END{print s}'
> >> >>>> 138896544
> >> >>>>
> >> >>>> % stat -c %s /tmp/out/custom2/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}} | awk '{s+=$1}END{print s}'
> >> >>>> 209054440
> >> >>>>
> >> >>>>
> >> >>>> The -DLLVM_LINK_LLVM_DYLIB=on -DCLANG_LINK_CLANG_DYLIB=on build is
> >> >>>> doing a really good job.
> >> >>>>
> >> >>>> A multiplexing approach can squeeze some bytes from 138896544 toward
> >> >>>> 102900408, but how much can it do?
> >> >>>>
> >> >>>>
> >> >>>> >- I'm starting to think the `cl::opt` to `OptTable` issue might be
> >> >>>> >orthogonal to the busybox implementation. The tool essentially
> >> >>>> >dispatches to different "main" functions in different tools, but as
> >> >>>> >long as we don't do anything within busybox after exiting that tool's
> >> >>>> >main, then the global state issues we weren't sure of with `cl::opt`
> >> >>>> >might not be of any concern now. It may be an issue down the line if,
> >> >>>> >let's say, the tool flags moved from being "owned" by the tools
> >> >>>> >themselves to instead being "owned" by busybox, and then we'd have to
> >> >>>> >merge similarly-named flags together. In that case, migrating these
> >> >>>> >tools to use `OptTable` may be necessary since (I think) `OptTable`
> >> >>>> >should handle this. This may be a tedious task, but this is just to
> >> >>>> >say that busybox won't need to be immediately blocked on it.
> >> >>>>
> >> >>>> Such improvement is useful even if we don't do multiplexing.
> >> >>>> I switched llvm-symbolizer. thakis switched llvm-objdump.
> >> >>>> I can look at some binary utilities.
> >> >>>>
> >> >>>> >- I haven't seen any issues with colliding symbols when linking
> >> >>>> >(although I've only merged two tools for now). I suspect that with
> >> >>>> >small-ish llvm-* tools, the bulk of their code is shared from
> >> >>>> >libLLVM, and they have their own distinct logic built on top of it,
> >> >>>> >which could mean a low chance of conflicting internal ABIs.
> >> >>>> >
> >> >>>> >On Mon, Jun 21, 2021 at 10:54 AM Leonard Chan
> >> >>>> ><leonardchan at google.com> wrote:
> >> >>>> >
> >> >>>> >> Hello all,
> >> >>>> >>
> >> >>>> >> When building LLVM tools, including Clang and lld, it's currently
> >> >>>> >> possible to use either static or shared linking for LLVM
> >> >>>> >> libraries. The latter can significantly reduce the size of the
> >> >>>> >> toolchain since we aren't duplicating the same code in every
> >> >>>> >> binary, but the dynamic relocations can affect performance. The
> >> >>>> >> former doesn't affect performance but significantly increases the
> >> >>>> >> size of our toolchain.
> >> >>>> >>
> >> >>>> >> We would like to implement a support for a third approach which we
> >> >>>> >> call, for a lack of better term, "busybox" feature, where
> >> >>>> >> everything is compiled into a single binary which then dispatches
> >> >>>> >> into an appropriate tool depending on the first command. This
> >> >>>> >> approach can significantly reduce the size by deduplicating all of
> >> >>>> >> the shared code without affecting the performance.
> >> >>>> >>
> >> >>>> >> In terms of implementation, the build would produce a single
> >> >>>> >> binary called `llvm` and the first command would identify the
> >> >>>> >> tool. For example, instead of invoking `llvm-nm` you'd invoke
> >> >>>> >> `llvm nm`. Ideally we would also support creation of `llvm-nm`
> >> >>>> >> symlink which redirects to `llvm` for backwards compatibility.
> >> >>>> >> This functionality would ideally be implemented as an option in
> >> >>>> >> the CMake build that toolchain vendors can opt into.
> >> >>>> >>
> >> >>>> >> The implementation would have to replace `main` function of each
> >> >>>> >> tool with an entrypoint regular function which is registered into
> >> >>>> >> a tool registry. This could be wrapped in a macro for convenience.
> >> >>>> >> When the "busybox" feature is disabled, the macro would expand to
> >> >>>> >> a `main` function as before and redirect to the entrypoint
> >> >>>> >> function. When the "busybox" feature is enabled, it would register
> >> >>>> >> the entrypoint function into the registry, which would be
> >> >>>> >> responsible for the dispatching based on the tool name. Ideally,
> >> >>>> >> toolchain maintainers would also be able to control which tools
> >> >>>> >> they could add to the "busybox" binary via CMake build options, so
> >> >>>> >> toolchains will only include the tools they use.
> >> >>>> >>
> >> >>>> >> One implementation detail we think will be an issue is merging
> >> >>>> >> arguments in individual tools that use `cl::opt`. `cl::opt` works
> >> >>>> >> by maintaining a global state of flags, but we aren’t confident of
> >> >>>> >> what the resulting behavior will be when merging them together in
> >> >>>> >> the dispatching `main`. What we would like to avoid is having
> >> >>>> >> flags used by one specific tool available on other tools. To
> >> >>>> >> address this issue, we would like to migrate all tools to use
> >> >>>> >> `OptTable` which doesn't have this issue and has been the general
> >> >>>> >> direction most tools have been already moving into.
> >> >>>> >>
> >> >>>> >> A second issue would be resolving symlinks. For example,
> >> >>>> >> llvm-objcopy will check argv[0] and behave as llvm-strip (ie. use
> >> >>>> >> the right flags + configuration) if it is called via a symlink
> >> >>>> >> that “looks like” a strip tool, but for all other cases it will
> >> >>>> >> run under the default objcopy mode. The “looks like” function is
> >> >>>> >> usually an `Is` function copied in multiple tools that is
> >> >>>> >> essentially a substring check: so symlinks like `llvm-strip`,
> >> >>>> >> strip.exe, and `gnu-llvm-strip-10` all result in using the strip
> >> >>>> >> “mode” while all other names use the objcopy mode. To replicate
> >> >>>> >> the same behavior, we will need to take great care in making sure
> >> >>>> >> symlinks to the busybox tool dispatch correctly to the appropriate
> >> >>>> >> llvm tool, which might mean exposing and merging these `Is`
> >> >>>> >> functions.
> >> >>>> >>
> >> >>>> >> Some open questions:
> >> >>>> >> - People's initial thoughts/opinions?
> >> >>>> >> - Are there existing tools in LLVM that already do this?
> >> >>>> >> - Other implementation details/global states that we would also
> >> >>>> >> need to account for?
> >> >>>> >>
> >> >>>> >> - Leonard
> >> >>>> >>
> >> >>>>
> >> >>>> >_______________________________________________
> >> >>>> >LLVM Developers mailing list
> >> >>>> >llvm-dev at lists.llvm.org
> >> >>>> >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>
> >> --
> >> 宋方睿
>
> --
> 宋方睿

Fāng-ruì Sòng via llvm-dev

2021-Sep-16 23:14 UTC

[llvm-dev] [RFC] LLVM Busybox Proposal

On Thu, Sep 16, 2021 at 3:40 PM Chris Bieneman via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> Hi all,
>
> Apologies for reviving a long-aged thread here. I hadn't followed this
> thread when it came up, and independently started playing with the same
> basic idea. I have a prototype implementation on my GitHub[1], which creates
> an llvm-driver that can execute clang, dsymutil, llvm-ar, llvm-cxxfilt,
> llvm-dwarfdump, and llvm-objcopy.
>
> The pattern can be applied fairly simply to additional tools (with one
> _huge_ caveat that I'll go into below). For most tools to be built into the
> multicall binary, the only required changes are adding the `GENERATE_DRIVER`
> option to the `add_llvm_tool` CMake call and changing the `main` function to
> be prefixed by the tool's name (e.g. llvm_objcopy_main, clang_main, etc).
>
> As an example, the full diffs for llvm-objcopy are:
>
> ```
> diff --git a/llvm/tools/llvm-objcopy/CMakeLists.txt b/llvm/tools/llvm-objcopy/CMakeLists.txt
> index d14d2135f5db..644dec79bc50 100644
> --- a/llvm/tools/llvm-objcopy/CMakeLists.txt
> +++ b/llvm/tools/llvm-objcopy/CMakeLists.txt
> @@ -43,6 +43,7 @@ add_llvm_tool(llvm-objcopy
>    ObjcopyOptsTableGen
>    InstallNameToolOptsTableGen
>    StripOptsTableGen
> +  GENERATE_DRIVER
>    )
>
>  add_llvm_tool_symlink(llvm-install-name-tool llvm-objcopy)
> diff --git a/llvm/tools/llvm-objcopy/llvm-objcopy.cpp b/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
> index ad166487eb78..bd5556f225b2 100644
> --- a/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
> +++ b/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
> @@ -401,7 +401,7 @@ static Error executeObjcopy(ConfigManager &ConfigMgr) {
>    return Error::success();
>  }
>
> -int main(int argc, char **argv) {
> +int llvm_objcopy_main(int argc, char **argv) {
>    InitLLVM X(argc, argv);
>    ToolName = argv[0];
> ```
>
> With some clever CMake goop, any tool that opts into being part of the
> merged driver gets a generated template main function, and much of the other
> boilerplate code is generated too. As implemented, each tool is built both
> as its own standalone binary _and_ into the llvm-driver tool. If this is a
> desirable route, part of the patch to make this "real" would be adding an
> option to disable building the standalone tool and instead generate a
> symlink from the tool to llvm-driver.
>
> This implementation does require CMake 3.12 or later, since CMake 3.12
> allows linkage dependencies for object libraries, which the implementation
> depends on.
>
> The _huge_ caveat is that cl::opt haunts all things I do in LLVM. I tried
> adding clang-tidy to the tools, and it will build fine, but crashes on launch
> because of duplicate command line options being registered (d'oh!).
> cl::opt's continued reliance on globals means that it is ill-suited for the
> construction of a mega-llvm-driver.
>
> This is something that has come up time and time again in many different
> contexts, but we've never really had the community effort behind resolving
> it.

Sadly agree :(

> Using OptTable has been suggested, but one of the common complaints is that
> OptTable for tool options is unwieldy and overly complicated for small simple
> tools. Also, there isn't a good way to handle options that are buried inside
> libraries, and many of the cl::opt options are.

Short options were a big problem of these user-facing binary utilities.
On the good side, most binary utilities are ready for the
crunchgen/busybox/multiplexer proposal now.
I have switched llvm-readobj (https://reviews.llvm.org/D105532),
llvm-size (https://reviews.llvm.org/D105598),
llvm-cxxfilt (https://reviews.llvm.org/D105605),
llvm-nm (https://reviews.llvm.org/D105330), and
llvm-strings (https://reviews.llvm.org/D104889)
(and, in the previous release, llvm-symbolizer).
The review descriptions say more on why OptTable is better than cl::opt for
user-facing options.

> Many years ago I initiated lengthy discussions on llvm-dev and at llvm
> socials about an alternate approach [2], but it was a half measure at best.
> One of the challenges that it didn't solve was the need for registering
> command line options for all the debugging values buried in the passes. I
> don't mean to derail this effort by sending us all down a rabbit hole, but I
> also think that for a real, tractable llvm Busybox/multicall binary solution,
> we really need to do something about cl::opt.

In mlir, mlir/lib/Support/Timing.cpp uses this style:

namespace {
struct DefaultTimingManagerOptions {
  llvm::cl::opt<bool> timing{"mlir-timing",
                             llvm::cl::desc("Display execution times"),
                             llvm::cl::init(false)};
  llvm::cl::opt<DisplayMode> displayMode{
      "mlir-timing-display", llvm::cl::desc("Display method for timing data"),
      llvm::cl::init(DisplayMode::Tree),
      llvm::cl::values(
          clEnumValN(DisplayMode::List, "list",
                     "display the results in a list sorted by total time"),
          clEnumValN(DisplayMode::Tree, "tree",
                     "display the results ina with a nested tree view"))};
};
} // end anonymous namespace

static llvm::ManagedStatic<DefaultTimingManagerOptions> options;

void mlir::registerDefaultTimingManagerCLOptions() {
  // Make sure that the options struct has been constructed.
  *options;
}

Still cl::opt, but if register*CLOptions functions are well
controlled, the global option name space problem should be fine.
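
The idea behind that style can be modeled without LLVM. The sketch below is an
assumption-laden stand-in, not MLIR's or LLVM's code: `Option`,
`registeredNames`, `TimingOptions`, and `registerTimingOptions` are hypothetical
names, and a function-local static plays the role that `ManagedStatic` plays in
the excerpt above. It shows that options grouped in a struct are registered
only when a tool explicitly asks for them, rather than during static
initialization of every linked-in library.

```cpp
#include <cassert>
#include <set>
#include <string>

// Stand-in for the process-wide option table (hypothetical, not LLVM's).
std::set<std::string> &registeredNames() {
  static std::set<std::string> Names;
  return Names;
}

// Stand-in for cl::opt: constructing one registers its name globally.
struct Option {
  explicit Option(const std::string &Name) {
    bool Inserted = registeredNames().insert(Name).second;
    assert(Inserted && "duplicate option name");
    (void)Inserted;
  }
};

// Options live in a struct instead of being file-scope globals, so nothing
// is registered until the struct is constructed.
struct TimingOptions {
  Option Timing{"timing"};
  Option DisplayMode{"timing-display"};
};

// Only tools that call this function get the options.
void registerTimingOptions() {
  static TimingOptions Options; // constructed once, on the first call
  (void)Options;
}
```

In a merged driver, each tool's entry point would call only the `register*`
functions it needs, so two tools with clashing option names do not collide
unless both sets are registered in the same run.
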
> -Chris
>
> [1]
https://github.com/llvm-beanz/llvm-project/commit/59befc0116b982b6905822e5c5109bb6a4d397b0
> [2] https://lists.llvm.org/pipermail/llvm-dev/2014-August/075855.html
Depending on overall traction for
>> >> >>>> >this idea, we could approach
incrementally and add support for different
>> >> >>>> >platforms over time.
>> >> >>>>
>> >> >>>> -DLLVM_LINK_LLVM_DYLIB=on
-DCLANG_LINK_CLANG_DYLIB=on -DLLVM_TARGETS_TO_BUILD=X86 (custom1)
>> >> >>>> vs
>> >> >>>> -DLLVM_TARGETS_TO_BUILD=X86 (custom2)
>> >> >>>>
>> >> >>>>
>> >> >>>> # This is the lower bound for any
multiplexing approach. clang is the largest executable.
>> >> >>>> % stat -c %s /tmp/out/custom2/bin/clang-13
>> >> >>>> 102900408
>> >> >>>>
>> >> >>>> I have built clang, lld and a bunch of ELF
binary utilities.
>> >> >>>>
>> >> >>>> % stat -c %s
/tmp/out/custom1/lib/libLLVM-13git.so /tmp/out/custom1/lib/libclang-cpp.so.13git
/tmp/out/custom1/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}}
| awk '{s+=$1}END{print s}'
>> >> >>>> 138896544
>> >> >>>>
>> >> >>>> % stat -c %s
/tmp/out/custom2/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}}
| awk '{s+=$1}END{print s}'
>> >> >>>> 209054440
>> >> >>>>
>> >> >>>>
>> >> >>>> The -DLLVM_LINK_LLVM_DYLIB=on
-DCLANG_LINK_CLANG_DYLIB=on build is doing a really good job.
>> >> >>>>
>> >> >>>> A multiplexing approach can squeeze some
bytes from 138896544 toward 102900408,
>> >> >>>> but how much can it do?
>> >> >>>>
>> >> >>>>
>> >> >>>> >- I'm starting to think the `cl::opt`
to `OptTable` issue might be
>> >> >>>> >orthogonal to the busybox implementation.
The tool essentially dispatches
>> >> >>>> >to different "main" functions
in different tools, but as long as we don't
>> >> >>>> >do anything within busybox after exiting
that tool's main, then the global
>> >> >>>> >state issues we weren't sure of with
`cl::opt` might not be of any concern
>> >> >>>> >now. It may be an issue down the line if,
let's say, the tool flags moved
>> >> >>>> >from being "owned" by the tools
themselves to instead being "owned" by
>> >> >>>> >busybox, and then we'd have to merge
similarly-named flags together. In
>> >> >>>> >that case, migrating these tools to use
`OptTable` may be necessary since
>> >> >>>> >(I think) `OptTable` should handle this.
This may be a tedious task, but
>> >> >>>> >this is just to say that busybox
won't need to be immediately blocked on it.
>> >> >>>>
>> >> >>>> Such improvement is useful even if we
don't do multiplexing.
>> >> >>>> I switched llvm-symbolizer. thakis switched
llvm-objdump.
>> >> >>>> I can look at some binary utilities.
>> >> >>>>
>> >> >>>> >- I haven't seen any issues with
colliding symbols when linking (although
>> >> >>>> >I've only merged two tools for now).
I suspect that with small-ish llvm-*
>> >> >>>> >tools, the bulk of their code is shared
from libLLVM, and they have their
>> >> >>>> >own distinct logic built on top of it,
which could mean a low chance of
>> >> >>>> >conflicting internal ABIs.
>> >> >>>> >
>> >> >>>> >On Mon, Jun 21, 2021 at 10:54 AM Leonard
Chan <leonardchan at google.com>
>> >> >>>> >wrote:
>> >> >>>> >
>> >> >>>> >> Hello all,
>> >> >>>> >>
>> >> >>>> >> When building LLVM tools, including Clang and lld, it's currently
>> >> >>>> >> possible to use either static or shared linking for LLVM libraries.
>> >> >>>> >> The latter can significantly reduce the size of the toolchain since
>> >> >>>> >> we aren't duplicating the same code in every binary, but the dynamic
>> >> >>>> >> relocations can affect performance. The former doesn't affect
>> >> >>>> >> performance but significantly increases the size of our toolchain.
>> >> >>>> >>
>> >> >>>> >> We would like to implement support for a third approach which we
>> >> >>>> >> call, for lack of a better term, the "busybox" feature, where
>> >> >>>> >> everything is compiled into a single binary which then dispatches
>> >> >>>> >> into an appropriate tool depending on the first command. This
>> >> >>>> >> approach can significantly reduce the size by deduplicating all of
>> >> >>>> >> the shared code without affecting the performance.
>> >> >>>> >>
>> >> >>>> >> In terms of implementation, the build would produce a single binary
>> >> >>>> >> called `llvm` and the first command would identify the tool. For
>> >> >>>> >> example, instead of invoking `llvm-nm` you'd invoke `llvm nm`.
>> >> >>>> >> Ideally we would also support creation of an `llvm-nm` symlink which
>> >> >>>> >> redirects to `llvm` for backwards compatibility.
>> >> >>>> >> This functionality would ideally be implemented as an option in the
>> >> >>>> >> CMake build that toolchain vendors can opt into.
>> >> >>>> >>
>> >> >>>> >> The implementation would have to replace the `main` function of each
>> >> >>>> >> tool with a regular entrypoint function which is registered into a
>> >> >>>> >> tool registry. This could be wrapped in a macro for convenience.
>> >> >>>> >> When the "busybox" feature is disabled, the macro would expand to a
>> >> >>>> >> `main` function as before and redirect to the entrypoint function.
>> >> >>>> >> When the "busybox" feature is enabled, it would register the
>> >> >>>> >> entrypoint function into the registry, which would be responsible
>> >> >>>> >> for the dispatching based on the tool name. Ideally, toolchain
>> >> >>>> >> maintainers would also be able to control which tools they could add
>> >> >>>> >> to the "busybox" binary via CMake build options, so toolchains will
>> >> >>>> >> only include the tools they use.
>> >> >>>> >>
>> >> >>>> >> One implementation detail we think will be an issue is merging
>> >> >>>> >> arguments in individual tools that use `cl::opt`. `cl::opt` works by
>> >> >>>> >> maintaining a global state of flags, but we aren’t confident of what
>> >> >>>> >> the resulting behavior will be when merging them together in the
>> >> >>>> >> dispatching `main`. What we would like to avoid is having flags used
>> >> >>>> >> by one specific tool available on other tools. To address this
>> >> >>>> >> issue, we would like to migrate all tools to use `OptTable` which
>> >> >>>> >> doesn't have this issue and has been the general direction most
>> >> >>>> >> tools have already been moving in.
>> >> >>>> >>
>> >> >>>> >> A second issue would be resolving symlinks. For example,
>> >> >>>> >> llvm-objcopy will check argv[0] and behave as llvm-strip (i.e. use
>> >> >>>> >> the right flags + configuration) if it is called via a symlink that
>> >> >>>> >> “looks like” a strip tool, but for all other cases it will run under
>> >> >>>> >> the default objcopy mode. The “looks like” function is usually an
>> >> >>>> >> `Is` function copied in multiple tools that is essentially a
>> >> >>>> >> substring check: so symlinks like `llvm-strip`, strip.exe, and
>> >> >>>> >> `gnu-llvm-strip-10` all result in using the strip “mode” while all
>> >> >>>> >> other names use the objcopy mode. To replicate the same behavior, we
>> >> >>>> >> will need to take great care in making sure symlinks to the busybox
>> >> >>>> >> tool dispatch correctly to the appropriate llvm tool, which might
>> >> >>>> >> mean exposing and merging these `Is` functions.
>> >> >>>> >>
>> >> >>>> >> Some open questions:
>> >> >>>> >> - People's initial thoughts/opinions?
>> >> >>>> >> - Are there existing tools in LLVM that already do this?
>> >> >>>> >> - Other implementation details/global states that we would also need
>> >> >>>> >>   to account for?
>> >> >>>> >>
>> >> >>>> >> - Leonard
>> >> >>>> >>
>> >> >>>>
>> >> >>>>
>> >>
>> >>
>> >>
>> >> --
>> >> 宋方睿
>>
>>
>>
>> --
>> 宋方睿
>


-- 
宋方睿
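
The dispatch scheme described in the RFC quoted above (tool chosen from the `argv[0]` symlink name, or from the first argument as in `llvm nm`) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `ToolEntry` table, `findTool`, `baseName`, and `dispatch` are hypothetical names, the tool functions are stubs that return distinct codes, and the `<tool>_main` naming merely follows the renaming shown in the llvm-objcopy diff. It is not the code from either prototype.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Signature shared by every tool entry point, following the
// `<tool>_main(int, char **)` renaming shown in the llvm-objcopy diff.
using ToolMain = int (*)(int, char **);

// Hypothetical stand-ins for real tools: each returns a distinct code so the
// result of the dispatch is observable.
int llvm_objcopy_main(int, char **) { return 10; }
int llvm_strip_main(int, char **) { return 20; }
int llvm_nm_main(int, char **) { return 30; }

struct ToolEntry {
  const char *Stem; // substring that identifies the tool
  ToolMain Main;
};

// Hypothetical registry; a real build would presumably generate this list.
static const ToolEntry Tools[] = {
    {"objcopy", llvm_objcopy_main},
    {"strip", llvm_strip_main},
    {"nm", llvm_nm_main},
};

// Drop directory components: "/usr/bin/llvm-nm" -> "llvm-nm".
std::string baseName(const std::string &Path) {
  std::size_t Slash = Path.find_last_of("/\\");
  return Slash == std::string::npos ? Path : Path.substr(Slash + 1);
}

// Substring match, so "llvm-strip", "strip.exe" and "gnu-llvm-strip-10" all
// select the strip entry. Returns nullptr when no stem occurs in Name.
const ToolEntry *findTool(const std::string &Name) {
  for (const ToolEntry &T : Tools)
    if (Name.find(T.Stem) != std::string::npos)
      return &T;
  return nullptr;
}

int dispatch(int argc, char **argv) {
  assert(argc > 0 && argv && "expects a conventional argv");
  // Symlink form: the tool is named by argv[0].
  if (const ToolEntry *T = findTool(baseName(argv[0])))
    return T->Main(argc, argv);
  // Subcommand form (`llvm nm ...`): shift so the tool sees its own name as
  // argv[0].
  if (argc > 1)
    if (const ToolEntry *T = findTool(argv[1]))
      return T->Main(argc - 1, argv + 1);
  return 1; // no tool matched
}
```

A driver's `main` would simply return `dispatch(argc, argv)`. Substring matching is deliberately loose here to mirror the "looks like" check described in the RFC; a stricter driver might match whole name components instead, so that short stems cannot match unrelated names by accident.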

Leonard Chan via llvm-dev

2021-Sep-16 23:19 UTC


[llvm-dev] [RFC] LLVM Busybox Proposal

Thanks for sharing your prototype! Glad to see that other people are on
board with this idea. For an incremental approach, it seems that Fangrui
has migrated many llvm tools to use OptTable, so it shouldn't be a blocker
for those at least. Do you also happen to be landing any of your code
sometime soon? We have an intern who will be picking up this work and we
should probably coordinate to make sure no work is duplicated.
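
As a rough illustration of why per-tool option tables sidestep the duplicate-registration crash reported in this thread, here is a small model of the two registration styles. It is a deliberately simplified sketch, not LLVM's `cl::opt` or `OptTable` code: `registerGlobally`, `registerPerTool`, and the option names used with them are all hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using OptionNames = std::vector<std::string>;

// Model of a process-wide registry: every tool linked into the binary adds
// its options to the same set at startup. Returns false on the first
// duplicate name, standing in for the crash on launch described in the thread.
bool registerGlobally(const std::vector<OptionNames> &Tools) {
  std::set<std::string> Global;
  for (const OptionNames &Tool : Tools)
    for (const std::string &Name : Tool)
      if (!Global.insert(Name).second)
        return false;
  return true;
}

// Model of per-tool tables: each tool's names live in their own set, so a
// name only has to be unique within a single tool.
bool registerPerTool(const std::vector<OptionNames> &Tools) {
  for (const OptionNames &Tool : Tools) {
    std::set<std::string> Local;
    for (const std::string &Name : Tool)
      if (!Local.insert(Name).second)
        return false;
  }
  return true;
}
```

With two tools that both define an option called `output`, the global model fails as soon as the second tool registers, while the per-tool model accepts both. That is the sense in which tools already migrated to their own tables are not blocked on this problem.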

On Thu, Sep 16, 2021 at 3:40 PM Chris Bieneman via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> Apologies for reviving a long-aged thread here. I hadn't followed this
> thread when it came up, and independently started playing with the same
> basic idea. I have a prototype implementation on my GitHub[1], which
> creates an llvm-driver that can execute clang, dsymutil, llvm-ar,
> llvm-cxxfilt, llvm-dwarfdump, and llvm-objcopy.
>
> The pattern can be applied fairly simply to additional tools (with one
> _huge_ caveat that I'll go into below). For most tools to be built into the
> multicall binary the only required changes are adding the `GENERATE_DRIVER`
> option to the `add_llvm_tool` CMake call, and changing the `main` function
> to be prefixed by the tool's name (e.g. llvm_objcopy_main, clang_main).
>
> As an example, the full diffs for llvm-objcopy are:
>
> ```
> diff --git a/llvm/tools/llvm-objcopy/CMakeLists.txt b/llvm/tools/llvm-objcopy/CMakeLists.txt
> index d14d2135f5db..644dec79bc50 100644
> --- a/llvm/tools/llvm-objcopy/CMakeLists.txt
> +++ b/llvm/tools/llvm-objcopy/CMakeLists.txt
> @@ -43,6 +43,7 @@ add_llvm_tool(llvm-objcopy
>    ObjcopyOptsTableGen
>    InstallNameToolOptsTableGen
>    StripOptsTableGen
> +  GENERATE_DRIVER
>    )
>
>  add_llvm_tool_symlink(llvm-install-name-tool llvm-objcopy)
> diff --git a/llvm/tools/llvm-objcopy/llvm-objcopy.cpp b/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
> index ad166487eb78..bd5556f225b2 100644
> --- a/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
> +++ b/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
> @@ -401,7 +401,7 @@ static Error executeObjcopy(ConfigManager &ConfigMgr) {
>    return Error::success();
>  }
>
> -int main(int argc, char **argv) {
> +int llvm_objcopy_main(int argc, char **argv) {
>    InitLLVM X(argc, argv);
>    ToolName = argv[0];
> ```
>
> With some clever CMake goop, any tool that opts into being part of the
> merged driver gets a generated template main function, and much of the
> other boilerplate code is generated too. As implemented, the tools all get
> built into their own tools _and_ the llvm-driver tool. If this is a
> desirable route, part of the patch to make this "real" would be adding an
> option to disable building the tool and instead generate a symlink from the
> tool to llvm-driver.
>
> This implementation does require CMake 3.12 or later, since CMake 3.12
> allows linkage dependencies for object libraries, which the implementation
> depends on.
>
> The _huge_ caveat is that cl::opt haunts all things I do in LLVM. I tried
> adding clang-tidy to the tools, and it will build fine, but crashes on
> launch because of duplicate command line options being registered (d'oh!).
> cl::opt's continued reliance on globals means that it is ill-suited for the
> construction of a mega-llvm-driver.
>
> This is something that has come up time and time again in many different
> contexts, but we've never really had the community effort behind resolving
> it.
>
> Using OptTable has been suggested, but one of the common complaints is
> that OptTable for tool options is unwieldy and overly complicated for small
> simple tools. Also, there isn't a good way to handle options that are
> buried inside libraries, and many of the cl::opt options are.
>
> Many years ago I initiated lengthy discussions on llvm-dev and at llvm
> socials about an alternate approach [2], but it was a half measure at best.
> One of the challenges that it didn't solve was the need for registering
> command line options for all the debugging values buried in the passes. I
> don't mean to derail this effort by sending us all down a rabbit hole, but
> I also think that for a real tractable solution to an llvm
> Busybox/multicall binary solution, we really need to do something about
> cl::opt.
>
> -Chris
>
> [1] https://github.com/llvm-beanz/llvm-project/commit/59befc0116b982b6905822e5c5109bb6a4d397b0
> [2] https://lists.llvm.org/pipermail/llvm-dev/2014-August/075855.html
>
> On Jun 23, 2021, at 6:32 PM, Mehdi AMINI via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
> On Wed, Jun 23, 2021 at 3:52 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>
>> On Wed, Jun 23, 2021 at 3:43 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>> >
>> >
>> >
>> > On Tue, Jun 22, 2021 at 11:09 PM Fāng-ruì Sòng via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> >>
>> >> On Tue, Jun 22, 2021 at 10:20 PM Petr Hosek <phosek at google.com> wrote:
>> >> >
>> >> > I guess this depends on a particular implementation of the distributed
>> >> > build system. In the case of Goma, we only supply the compiler binary
>> >> > which was invoked as the command (that binary links glibc as a shared
>> >> > library but we assume that one is supplied by the host system), all
>> >> > other files like headers are passed together with the compiler
>> >> > invocation as inputs. If we used dynamic linking, Goma would need to
>> >> > figure out what other shared libraries need to be sent to the server.
>> >> > It's certainly doable but it's an extra complexity we would like to
>> >> > avoid.
>> >>
>> >> For non-clang executables, -DLLVM_LINK_LLVM_DYLIB=on just adds one
>> >> more DT_NEEDED.
>> >> The DT_NEEDED entry can use a $ORIGIN based DT_RUNPATH. Can Goma
>> >> detect the libraries shipped with the tools?
>> >> I asked because I feel this could be an artificial limitation which
>> >> could be straightforwardly addressed in Goma.
>> >> A toolchain executable using an accompanying shared object is not rare
>> >> (thinking of plugins).
>> >>
>> >> Multiplexing LLVM tools is one alternative but I am a bit concerned
>> >> with the extra complexity and the new configuration the build system
>> >> needs to support.
>> >>
>> >> https://lists.llvm.org/pipermail/llvm-dev/2021-June/151338.html
>> >> mentioned another approach which doesn't require intrusive
>> >> modification to the tools.
>> >>
>> >> As for PGO+LTO, you can apply them to libLLVM-13git.so as well.
>> >
>> >
>> > Some thoughts if we're getting into PGO+LTO territory, I feel that
>> > both methods presented here will be at a disadvantage compared to
>> > building clang and lld into their own binaries.
>> > For example I remember that on Mac an important optimization for clang
>> > builds was to order the functions in the binary roughly in the order in
>> > which they are first encountered during execution, assuming the same
>> > behavior for lld you can see the conflicting optimization goal... You
>> > can also think about how libSupport may be differently "hot" on a clang
>> > PGO profile compared to lld and would result in different optimization.
>>
>> If PGO+LTO is desired, the executables can be split this way, assuming
>> the performance of
>> llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}
>> doesn't matter.
>>
>> * clang (libLLVM*.a)
>> * lld +
>>   llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}
>>   (libLLVM-13git.so)
>>
>> > LTO also benefits from "internalizing", basically building a static
>> > binary where only `main` is exported and everything else becomes an
>> > internal linkage is the best case: pointer escaping, global analysis, etc
>> > all become more powerful. Optimizing a shared library kind of makes every
>> > symbol public, and I suspect the busybox approach may be better on this
>> > aspect (you get back to a single public main, but it can reach much more
>> > code though).
>>
>> With --version-script we can internalize shared object symbols as
>> well. For example, this has been used to facilitate whole-program
>> devirtualization (https://reviews.llvm.org/D98686).
>> With https://lists.llvm.org/pipermail/llvm-dev/2021-June/151338.html
>> we can get a list of roots which need to be exported.
>> A thin executable plus a -fvisibility-inlines-hidden +
>> -Bsymbolic-functions shared object is almost identical to a PIE.
>>
>
> You can get closer to it but note that:
>
> - You have some non-trivial and non-standard build setup and scripts
>   to workaround the problem (finding roots, etc.), the busybox solution is
>   much more "clean" from this point of view if one can structure it in
>   "normal" C++.
> - How does it work on non-ELF platforms?
> - It still isn't equivalent: you're still having a large surface API
>   exported by the .so which limits what the optimizer can do (alias analysis,
>   etc.). You won't be able to inject context from the callers there, or
>   inline across the libLLVM.so boundary.
>
> --
> Mehdi
>
>
>
>
>
>>
>> >
>> >>
>> >>
>> >> > On Tue, Jun 22, 2021 at 10:09 PM David Blaikie
<dblaikie at gmail.com>
>> wrote:
>> >> >>
>> >> >> On Tue, Jun 22, 2021 at 10:00 PM Petr Hosek via
llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> >> >>>
>> >> >>> From our perspective as a toolchain vendor, even
if using shared
>> libraries could get us closer to static linking in terms of
performance,
>> we'd still prefer static linking for the ease of distribution.
Dealing with
>> a single statically linked executable is much easier than dealing with
>> multiple shared libraries. This is especially important in distributed
>> compilation environments like Goma.
>> >> >>
>> >> >>
>> >> >> What makes it especially complicated for distributed
compilation
>> environments? (I'd expect a toolchain contains so many files that
whether
>> it's one binary, or a binary and a handful of shared libraries
wouldn't
>> change the general implementation complexity of a distributed build
system?)
>> >> >>
>> >> >>>
>> >> >>>
>> >> >>> When comparing performance between static and
dynamic linking, I'd
>> also recommend doing a comparison between binaries built with PGO+LTO.
>> Plain -O3 leaves a lot of performance on the table and as far as
I'm aware,
>> most toolchain vendors use PGO+LTO.
>> >> >>>
>> >> >>> On Tue, Jun 22, 2021 at 5:00 PM Fangrui Song via
llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> >> >>>>
>> >> >>>> On 2021-06-22, Leonard Chan via llvm-dev
wrote:
>> >> >>>> >Small update: I have a WIP prototype of
the tool at
>> >> >>>> >https://reviews.llvm.org/D104686. The
prototype only includes
>> llvm-objcopy
>> >> >>>> >and llvm-objdump packed together, but
we're seeing size benefits
>> from
>> >> >>>> >busyboxing those two compared against
having two separate tools.
>> (More
>> >> >>>> >details in the prototype's
description.) I don't plan on landing
>> this as-is
>> >> >>>> >anytime soon and there's still some
things I'd like to
>> improve/change and
>> >> >>>> >get feedback on.
>> >> >>>> >
>> >> >>>> >To answer some replies:
>> >> >>>> >
>> >> >>>> >- Ideally, we could start off with an
incremental approach and
>> not package
>> >> >>>> >large tools like clang/lld off the bat.
The llvm-* tools seem
>> like a good
>> >> >>>> >place to start since they're
generally a bunch of relatively
>> small binaries
>> >> >>>> >that all share a subset of functions in
libLLVM, but don't
>> necessarily use
>> >> >>>> >all of libLLVM, so statically linking
them together (with
>> --gc-sections)
>> >> >>>> >can help dedup a lot of shared components
vs having separate
>> statically
>> >> >>>> >compiled tools. In my measurements, the
busybox tool containing
>> >> >>>> >llvm-objcopy+objdump is negligibly larger
than llvm-objdump on
>> its own (a
>> >> >>>> >couple KB difference) indicating a lot of
shared code between
>> objdump and
>> >> >>>> >objcopy.
>> >> >>>> >
>> >> >>>> >- Will Dietz's multiplexing tool
looks like a good place to
>> start from. The
>> >> >>>> >only concern I can see though is mostly
the amount of work
>> needed to update
>> >> >>>> >it to LLVM 13.
>> >> >>>> >
>> >> >>>> >- We don't have plans for windows
support now, but it's not off
>> the table.
>> >> >>>> >(Been mostly focusing on *nix for now).
Depending on overall
>> traction for
>> >> >>>> >this idea, we could approach
incrementally and add support for
>> different
>> >> >>>> >platforms over time.
>> >> >>>>
>> >> >>>> -DLLVM_LINK_LLVM_DYLIB=on
-DCLANG_LINK_CLANG_DYLIB=on
>> -DLLVM_TARGETS_TO_BUILD=X86 (custom1)
>> >> >>>> vs
>> >> >>>> -DLLVM_TARGETS_TO_BUILD=X86 (custom2)
>> >> >>>>
>> >> >>>>
>> >> >>>> # This is the lower bound for any
multiplexing approach. clang is
>> the largest executable.
>> >> >>>> % stat -c %s /tmp/out/custom2/bin/clang-13
>> >> >>>> 102900408
>> >> >>>>
>> >> >>>> I have built clang, lld and a bunch of ELF
binary utilities.
>> >> >>>>
>> >> >>>> % stat -c %s
/tmp/out/custom1/lib/libLLVM-13git.so
>> /tmp/out/custom1/lib/libclang-cpp.so.13git
>>
/tmp/out/custom1/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}}
>> | awk '{s+=$1}END{print s}'
>> >> >>>> 138896544
>> >> >>>>
>> >> >>>> % stat -c %s
>>
/tmp/out/custom2/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}}
>> | awk '{s+=$1}END{print s}'
>> >> >>>> 209054440
>> >> >>>>
>> >> >>>>
>> >> >>>> The -DLLVM_LINK_LLVM_DYLIB=on
-DCLANG_LINK_CLANG_DYLIB=on build
>> is doing a really good job.
>> >> >>>>
>> >> >>>> A multiplexing approach can squeeze some
bytes from 138896544
>> toward 102900408,
>> >> >>>> but how much can it do?
>> >> >>>>
>> >> >>>>
>> >> >>>> >- I'm starting to think the `cl::opt`
to `OptTable` issue might
>> be
>> >> >>>> >orthogonal to the busybox implementation.
The tool essentially
>> dispatches
>> >> >>>> >to different "main" functions
in different tools, but as long as
>> we don't
>> >> >>>> >do anything within busybox after exiting
that tool's main, then
>> the global
>> >> >>>> >state issues we weren't sure of with
`cl::opt` might not be of
>> any concern
>> >> >>>> >now. It may be an issue down the line if,
let's say, the tool
>> flags moved
>> >> >>>> >from being "owned" by the tools
themselves to instead being
>> "owned" by
>> >> >>>> >busybox, and then we'd have to merge
similarly-named flags
>> together. In
>> >> >>>> >that case, migrating these tools to use
`OptTable` may be
>> necessary since
>> >> >>>> >(I think) `OptTable` should handle this.
This may be a tedious
>> task, but
>> >> >>>> >this is just to say that busybox
won't need to be immediately
>> blocked on it.
>> >> >>>>
>> >> >>>> Such improvement is useful even if we
don't do multiplexing.
>> >> >>>> I switched llvm-symbolizer. thakis switched
llvm-objdump.
>> >> >>>> I can look at some binary utilities.
>> >> >>>>
>> >> >>>> >- I haven't seen any issues with
colliding symbols when linking
>> (although
>> >> >>>> >I've only merged two tools for now).
I suspect that with
>> small-ish llvm-*
>> >> >>>> >tools, the bulk of their code is shared
from libLLVM, and they
>> have their
>> >> >>>> >own distinct logic built on top of it,
which could mean a low
>> chance of
>> >> >>>> >conflicting internal ABIs.
>> >> >>>> >
>> >> >>>> >On Mon, Jun 21, 2021 at 10:54 AM Leonard
Chan <
>> leonardchan at google.com>
>> >> >>>> >wrote:
>> >> >>>> >
>> >> >>>> >> Hello all,
>> >> >>>> >>
>> >> >>>> >> When building LLVM tools, including
Clang and lld, it's
>> currently possible
>> >> >>>> >> to use either static or shared
linking for LLVM libraries. The
>> latter can
>> >> >>>> >> significantly reduce the size of the
toolchain since we aren't
>> duplicating
>> >> >>>> >> the same code in every binary, but
the dynamic relocations can
>> affect
>> >> >>>> >> performance. The former doesn't
affect performance but
>> significantly
>> >> >>>> >> increases the size of our toolchain.
>> >> >>>> >>
>> >> >>>> >> We would like to implement a support
for a third approach
>> which we call,
>> >> >>>> >> for a lack of better term,
"busybox" feature, where everything
>> is compiled
>> >> >>>> >> into a single binary which then
dispatches into an appropriate
>> tool
>> >> >>>> >> depending on the first command. This
approach can
>> significantly reduce the
>> >> >>>> >> size by deduplicating all of the
shared code without affecting
>> the
>> >> >>>> >> performance.
>> >> >>>> >>
>> >> >>>> >> In terms of implementation, the
build would produce a single
>> binary called
>> >> >>>> >> `llvm` and the first command would
identify the tool. For
>> example, instead
>> >> >>>> >> of invoking `llvm-nm` you'd
invoke `llvm nm`. Ideally we would
>> also support
>> >> >>>> >> creation of `llvm-nm` symlink which
redirects to `llvm` for
>> backwards
>> >> >>>> >> compatibility.
>> >> >>>> >> This functionality would ideally be
implemented as an option
>> in the CMake
>> >> >>>> >> build that toolchain vendors can opt
into.
>> >> >>>> >>
>> >> >>>> >> The implementation would have to
replace `main` function of
>> each tool with
>> >> >>>> >> an entrypoint regular function which
is registered into a tool
>> registry.
>> >> >>>> >> This could be wrapped in a macro for
convenience. When the
>> "busybox"
>> >> >>>> >> feature is disabled, the macro would
expand to a `main`
>> function as before
>> >> >>>> >> and redirect to the entrypoint
function. When the "busybox"
>> feature is
>> >> >>>> >> enabled, it would register the
entrypoint function into the
>> registry, which
>> >> >>>> >> would be responsible for the
dispatching based on the tool
>> name. Ideally,
>> >> >>>> >> toolchain maintainers would also be
able to control which
>> tools they could
>> >> >>>> >> add to the "busybox"
binary via CMake build options, so
>> toolchains will
>> >> >>>> >> only include the tools they use.
>> >> >>>> >>
>> >> >>>> >> One implementation detail we think
will be an issue is merging
>> arguments
>> >> >>>> >> in individual tools that use
`cl::opt`. `cl::opt` works by
>> maintaining a
>> >> >>>> >> global state of flags, but we aren’t
confident of what the
>> resulting
>> >> >>>> >> behavior will be when merging them
together in the dispatching
>> `main`. What
>> >> >>>> >> we would like to avoid is having
flags used by one specific
>> tool available
>> >> >>>> >> on other tools. To address this
issue, we would like to
>> migrate all tools
>> >> >>>> >> to use `OptTable` which doesn't
have this issue and has been
>> the general
>> >> >>>> >> direction most tools have been
already moving into.
>> >> >>>> >>
>> >> >>>> >> A second issue would be resolving
symlinks. For example,
>> llvm-objcopy will
>> >> >>>> >> check argv[0] and behave as
llvm-strip (ie. use the right
>> flags +
>> >> >>>> >> configuration) if it is called via a
symlink that “looks like”
>> a strip
>> >> >>>> >> tool, but for all other cases it
will run under the default
>> objcopy mode.
>> >> >>>> >> The “looks like” function is usually
an `Is` function copied
>> in multiple
>> >> >>>> >> tools that is essentially a
substring check: so symlinks like
>> `llvm-strip`,
>> >> >>>> >> strip.exe, and `gnu-llvm-strip-10`
all result in using the
>> strip “mode”
>> >> >>>> >> while all other names use the
objcopy mode. To replicate the
>> same behavior,
>> >> >>>> >> we will need to take great care in
making sure symlinks to the
>> busybox tool
>> >> >>>> >> dispatch correctly to the
appropriate llvm tool, which might
>> mean exposing
>> >> >>>> >> and merging these `Is` functions.
>> >> >>>> >>
>> >> >>>> >> Some open questions:
>> >> >>>> >> - People's initial
thoughts/opinions?
>> >> >>>> >> - Are there existing tools in LLVM
that already do this?
>> >> >>>> >> - Other implementation
details/global states that we would
>> also need to
>> >> >>>> >> account for?
>> >> >>>> >>
>> >> >>>> >> - Leonard
>> >> >>>> >>
>> >> >>>>
>> >> >>>>
>> >>
>> >>
>> >>
>> >> --
>> >> 宋方睿
>>
>>
>>
>> --
>> 宋方睿
>>
