I have a few bits of cleanup and fleshing out that I’d like to do for the
llvm-driver tool in my patch. I can work on that tonight and tomorrow, and
probably post a patch for review by Monday.
My general approach would be to add llvm-driver as a target that is excluded
from the `all` target but is always configured in the build.
Subsequent patches would add support for making it replace the tool builds with
symlinks, and ensuring compatibility with important build system functionality
like `LLVM_DISTRIBUTION_COMPONENTS`.
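
For illustration, here is roughly how a multicall binary can dispatch when the tools are symlinks to it. This is a minimal sketch and not the code from the patch: `demo_objcopy_main`, `demo_ar_main`, `toolStem`, and `dispatch` are hypothetical names, the table is hand-written rather than generated by CMake, and the `llvm-driver <tool>` fallback is an assumption borrowed from the original proposal.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-ins for renamed tool entry points (the patch uses
// names such as llvm_objcopy_main). They return fixed codes for the demo.
int demo_objcopy_main(int, char **) { return 10; }
int demo_ar_main(int, char **) { return 20; }

struct ToolEntry {
  const char *Name;
  int (*Main)(int, char **);
};

// Hand-written table; a real build would presumably generate this.
static const ToolEntry Tools[] = {
    {"llvm-objcopy", demo_objcopy_main},
    {"llvm-ar", demo_ar_main},
};

// Strip leading directories and a trailing ".exe" from an argv[0]-style path.
std::string toolStem(const std::string &Path) {
  std::size_t Slash = Path.find_last_of("/\\");
  std::string Stem =
      Slash == std::string::npos ? Path : Path.substr(Slash + 1);
  const std::string Exe = ".exe";
  if (Stem.size() > Exe.size() &&
      Stem.compare(Stem.size() - Exe.size(), Exe.size(), Exe) == 0)
    Stem.erase(Stem.size() - Exe.size());
  return Stem;
}

// Pick a tool from the name the binary was invoked as (the symlink case),
// otherwise from the first argument (the "llvm-driver <tool>" case).
int dispatch(int argc, char **argv) {
  if (argc < 1)
    return 1;
  std::string Stem = toolStem(argv[0]);
  for (const ToolEntry &T : Tools)
    if (Stem == T.Name)
      return T.Main(argc, argv);
  if (argc > 1) {
    std::string Sub = argv[1];
    for (const ToolEntry &T : Tools)
      if (Sub == T.Name || "llvm-" + Sub == T.Name)
        return T.Main(argc - 1, argv + 1); // drop the driver's own name
  }
  return 1; // unknown tool
}
```

The driver's real `main` would then just `return dispatch(argc, argv);`. The sketch matches names exactly, so it does not reproduce the substring-style matching that lets llvm-objcopy act as llvm-strip.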
I’ll start working on the patch cleanup, and if that approach sounds reasonable
we can move on from there.
-Chris
> On Sep 16, 2021, at 6:19 PM, Leonard Chan <leonardchan at google.com>
wrote:
>
>
> Thanks for sharing your prototype! Glad to see that other people are on
board with this idea. For an incremental approach, it seems that Fangrui has
migrated many llvm tools to use OptTable, so it shouldn't be a blocker for
those at least. Do you also happen to be landing any of your code sometime soon?
We have an intern who will be picking up this work and we should probably
coordinate to make sure no work is duplicated.
>
>> On Thu, Sep 16, 2021 at 3:40 PM Chris Bieneman via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>> Hi all,
>>
>> Apologies for reviving a long-aged thread here. I hadn't followed
this thread when it came up, and independently started playing with the same
basic idea. I have a prototype implementation on my GitHub[1], which creates an
llvm-driver that can execute clang, dsymutil, llvm-ar, llvm-cxxfilt,
llvm-dwarfdump, and llvm-objcopy.
>>
>> The pattern can be applied fairly simply to additional tools (with one _huge_ caveat that I'll go into below). For most tools to be built into the multicall binary, the only required changes are adding the `GENERATE_DRIVER` option to the `add_llvm_tool` CMake call, and changing the `main` function to be prefixed by the tool's name (e.g. llvm_objcopy_main, clang_main, etc).
>>
>> As an example, the full diffs for llvm-objcopy are:
>>
>> ```
>> diff --git a/llvm/tools/llvm-objcopy/CMakeLists.txt b/llvm/tools/llvm-objcopy/CMakeLists.txt
>> index d14d2135f5db..644dec79bc50 100644
>> --- a/llvm/tools/llvm-objcopy/CMakeLists.txt
>> +++ b/llvm/tools/llvm-objcopy/CMakeLists.txt
>> @@ -43,6 +43,7 @@ add_llvm_tool(llvm-objcopy
>> ObjcopyOptsTableGen
>> InstallNameToolOptsTableGen
>> StripOptsTableGen
>> + GENERATE_DRIVER
>> )
>>
>> add_llvm_tool_symlink(llvm-install-name-tool llvm-objcopy)
>> diff --git a/llvm/tools/llvm-objcopy/llvm-objcopy.cpp b/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
>> index ad166487eb78..bd5556f225b2 100644
>> --- a/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
>> +++ b/llvm/tools/llvm-objcopy/llvm-objcopy.cpp
>> @@ -401,7 +401,7 @@ static Error executeObjcopy(ConfigManager &ConfigMgr) {
>> return Error::success();
>> }
>>
>> -int main(int argc, char **argv) {
>> +int llvm_objcopy_main(int argc, char **argv) {
>> InitLLVM X(argc, argv);
>> ToolName = argv[0];
>> ```
>>
>> With some clever CMake goop, any tool that opts into being part of the merged driver gets a generated template main function, and much of the other boilerplate code is generated too. As implemented, the tools all get built into their own tools _and_ the llvm-driver tool. If this is a desirable route, part of the patch to make this "real" would be adding an option to disable building the tool and instead generate a symlink from the tool to llvm-driver.
>>
>> This implementation does require CMake 3.12 or later, since CMake 3.12 allows linkage dependencies for object libraries, which the implementation depends on.
>>
>> The _huge_ caveat is that cl::opt haunts all things I do in LLVM. I
tried adding clang-tidy to the tools, and it will build fine, but crashes on
launch because of duplicate command line options being registered (d'oh!).
cl::opt's continued reliance on globals means that it is ill-suited for the
construction of a mega-llvm-driver.
>>
>> This is something that has come up time and time again in many
different contexts, but we've never really had the community effort behind
resolving it.
>>
>> Using OptTable has been suggested, but one of the common complaints is
that OptTable for tool options is unwieldy and overly complicated for small
simple tools. Also, there isn't a good way to handle options that are buried
inside libraries, and many of the cl::opt options are.
>>
>> Many years ago I initiated lengthy discussions on llvm-dev and at llvm
socials about an alternate approach [2], but it was a half measure at best. One
of the challenges that it didn't solve was the need for registering command
line options for all the debugging values buried in the passes. I don't mean
to derail this effort by sending us all down a rabbit hole, but I also think
that for a real tractable solution to an llvm Busybox/multicall binary solution,
we really need to do something about cl::opt.
>>
>> -Chris
>>
>> [1]
https://github.com/llvm-beanz/llvm-project/commit/59befc0116b982b6905822e5c5109bb6a4d397b0
>> [2] https://lists.llvm.org/pipermail/llvm-dev/2014-August/075855.html
>>
>>> On Jun 23, 2021, at 6:32 PM, Mehdi AMINI via llvm-dev <llvm-dev
at lists.llvm.org> wrote:
>>>
>>>
>>>
>>> On Wed, Jun 23, 2021 at 3:52 PM Fāng-ruì Sòng <maskray at
google.com> wrote:
>>>> On Wed, Jun 23, 2021 at 3:43 PM Mehdi AMINI <joker.eph at
gmail.com> wrote:
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Jun 22, 2021 at 11:09 PM Fāng-ruì Sòng via
llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> >>
>>>> >> On Tue, Jun 22, 2021 at 10:20 PM Petr Hosek <phosek
at google.com> wrote:
>>>> >> >
>>>> >> > I guess this depends on a particular
implementation of the distributed build system. In the case of Goma, we only
supply the compiler binary which was invoked as the command (that binary links
glibc as a shared library but we assume that one is supplied by the host
system), all other files like headers are passed together with the compiler
invocation as inputs. If we used dynamic linking, Goma would need to figure out
what other shared libraries need to be sent to the server. It's certainly
doable but it's an extra complexity we would like to avoid.
>>>> >>
>>>> >> For non-clang executables, -DLLVM_LINK_LLVM_DYLIB=on
just adds one
>>>> >> more DT_NEEDED.
>>>> >> The DT_NEEDED entry can use a $ORIGIN based
DT_RUNPATH. Can Goma
>>>> >> detect the libraries shipped with the tools?
>>>> >> I asked because I feel this could be an artificial
limitation which
>>>> >> could be straightforwardly addressed in Goma.
>>>> >> A toolchain executable using an accompanying shared object is not rare
>>>> >> (thinking of plugins).
>>>> >>
>>>> >> Multiplexing LLVM tools is one alternative but I am a
bit concerned
>>>> >> with the extra complexity and the new configuration
the build system
>>>> >> needs to support.
>>>> >>
>>>> >>
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151338.html
>>>> >> mentioned another approach which doesn't require
intrusive
>>>> >> modification to the tools.
>>>> >>
>>>> >> As for PGO+LTO, you can apply them to libLLVM-13git.so
as well.
>>>> >
>>>> >
>>>> > Some thoughts if we're getting into PGO+LTO territory,
I feel that both methods presented here will be at a disadvantage compared to
building clang and lld into their own binaries.
>>>> > For example I remember that on Mac an important
optimization for clang builds was to order the functions in the binary roughly
in the order in which they are first encountered during execution, assuming the
same behavior for lld you can see the conflicting optimization goal... You can
also think about how libSupport may be differently "hot" on a clang
PGO profile compared to lld and would result in different optimization.
>>>>
>>>> If PGO+LTO is desired, the executables can be split this way,
assuming
>>>> the performance of
>>>>
llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}
>>>> doesn't matter.
>>>>
>>>> * clang (libLLVM*.a)
>>>> * lld +
llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}
>>>> (libLLVM-13git.so)
>>>>
>>>> > LTO also benefits from "internalizing": building a static binary where only `main` is exported and everything else gets internal linkage is the best case, since pointer escaping, global analysis, etc. all become more powerful. Optimizing a shared library kind of makes every symbol public, and I suspect the busybox approach may be better in this respect (you get back to a single public main, though it can reach much more code).
>>>>
>>>> With --version-script we can internalize shared object symbols
as
>>>> well. For example, this has been used to facilitate
whole-program
>>>> devirtualization (https://reviews.llvm.org/D98686).
>>>> With
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151338.html
>>>> we can get a list of roots which need to be exported.
>>>> A thin executable plus a -fvisibility-inlines-hidden +
>>>> -Bsymbolic-functions shared object is almost identical to a
PIE.
>>>
>>> You can get closer to it but note that:
>>>
>>> - You have some non-trivial and non-standard build setup and scripts to work around the problem (finding roots, etc.); the busybox solution is much more "clean" from this point of view if one can structure it in "normal" C++.
>>> - How does it work on non-ELF platforms?
>>> - It still isn't equivalent: you're still having a large
surface API exported by the .so which limits what the optimizer can do (alias
analysis, etc.). You won't be able to inject context from the callers there,
or inline across the libLLVM.so boundary.
>>>
>>> --
>>> Mehdi
>>>
>>>
>>>
>>>
>>>>
>>>> >
>>>> >>
>>>> >>
>>>> >> > On Tue, Jun 22, 2021 at 10:09 PM David Blaikie
<dblaikie at gmail.com> wrote:
>>>> >> >>
>>>> >> >> On Tue, Jun 22, 2021 at 10:00 PM Petr Hosek
via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> >> >>>
>>>> >> >>> From our perspective as a toolchain
vendor, even if using shared libraries could get us closer to static linking in
terms of performance, we'd still prefer static linking for the ease of
distribution. Dealing with a single statically linked executable is much easier
than dealing with multiple shared libraries. This is especially important in
distributed compilation environments like Goma.
>>>> >> >>
>>>> >> >>
>>>> >> >> What makes it especially complicated for
distributed compilation environments? (I'd expect a toolchain contains so
many files that whether it's one binary, or a binary and a handful of shared
libraries wouldn't change the general implementation complexity of a
distributed build system?)
>>>> >> >>
>>>> >> >>>
>>>> >> >>>
>>>> >> >>> When comparing performance between static
and dynamic linking, I'd also recommend doing a comparison between binaries
built with PGO+LTO. Plain -O3 leaves a lot of performance on the table and as
far as I'm aware, most toolchain vendors use PGO+LTO.
>>>> >> >>>
>>>> >> >>> On Tue, Jun 22, 2021 at 5:00 PM Fangrui
Song via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> >> >>>>
>>>> >> >>>> On 2021-06-22, Leonard Chan via
llvm-dev wrote:
>>>> >> >>>> >Small update: I have a WIP
prototype of the tool at
>>>> >> >>>> >https://reviews.llvm.org/D104686.
The prototype only includes llvm-objcopy
>>>> >> >>>> >and llvm-objdump packed together,
but we're seeing size benefits from
>>>> >> >>>> >busyboxing those two compared
against having two separate tools. (More
>>>> >> >>>> >details in the prototype's
description.) I don't plan on landing this as-is
>>>> >> >>>> >anytime soon and there's
still some things I'd like to improve/change and
>>>> >> >>>> >get feedback on.
>>>> >> >>>> >
>>>> >> >>>> >To answer some replies:
>>>> >> >>>> >
>>>> >> >>>> >- Ideally, we could start off
with an incremental approach and not package
>>>> >> >>>> >large tools like clang/lld off
the bat. The llvm-* tools seem like a good
>>>> >> >>>> >place to start since they're
generally a bunch of relatively small binaries
>>>> >> >>>> >that all share a subset of
functions in libLLVM, but don't necessarily use
>>>> >> >>>> >all of libLLVM, so statically
linking them together (with --gc-sections)
>>>> >> >>>> >can help dedup a lot of shared
components vs having separate statically
>>>> >> >>>> >compiled tools. In my
measurements, the busybox tool containing
>>>> >> >>>> >llvm-objcopy+objdump is
negligibly larger than llvm-objdump on its own (a
>>>> >> >>>> >couple KB difference) indicating
a lot of shared code between objdump and
>>>> >> >>>> >objcopy.
>>>> >> >>>> >
>>>> >> >>>> >- Will Dietz's multiplexing
tool looks like a good place to start from. The
>>>> >> >>>> >only concern I can see though is
mostly the amount of work needed to update
>>>> >> >>>> >it to LLVM 13.
>>>> >> >>>> >
>>>> >> >>>> >- We don't have plans for
windows support now, but it's not off the table.
>>>> >> >>>> >(Been mostly focusing on *nix for
now). Depending on overall traction for
>>>> >> >>>> >this idea, we could approach
incrementally and add support for different
>>>> >> >>>> >platforms over time.
>>>> >> >>>>
>>>> >> >>>> -DLLVM_LINK_LLVM_DYLIB=on
-DCLANG_LINK_CLANG_DYLIB=on -DLLVM_TARGETS_TO_BUILD=X86 (custom1)
>>>> >> >>>> vs
>>>> >> >>>> -DLLVM_TARGETS_TO_BUILD=X86 (custom2)
>>>> >> >>>>
>>>> >> >>>>
>>>> >> >>>> # This is the lower bound for any
multiplexing approach. clang is the largest executable.
>>>> >> >>>> % stat -c %s
/tmp/out/custom2/bin/clang-13
>>>> >> >>>> 102900408
>>>> >> >>>>
>>>> >> >>>> I have built clang, lld and a bunch
of ELF binary utilities.
>>>> >> >>>>
>>>> >> >>>> % stat -c %s
/tmp/out/custom1/lib/libLLVM-13git.so /tmp/out/custom1/lib/libclang-cpp.so.13git
/tmp/out/custom1/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}}
| awk '{s+=$1}END{print s}'
>>>> >> >>>> 138896544
>>>> >> >>>>
>>>> >> >>>> % stat -c %s
/tmp/out/custom2/bin/{clang-13,lld,llvm-{ar,cov,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}}
| awk '{s+=$1}END{print s}'
>>>> >> >>>> 209054440
>>>> >> >>>>
>>>> >> >>>>
>>>> >> >>>> The -DLLVM_LINK_LLVM_DYLIB=on
-DCLANG_LINK_CLANG_DYLIB=on build is doing a really good job.
>>>> >> >>>>
>>>> >> >>>> A multiplexing approach can squeeze
some bytes from 138896544 toward 102900408,
>>>> >> >>>> but how much can it do?
>>>> >> >>>>
>>>> >> >>>>
>>>> >> >>>> >- I'm starting to think the
`cl::opt` to `OptTable` issue might be
>>>> >> >>>> >orthogonal to the busybox
implementation. The tool essentially dispatches
>>>> >> >>>> >to different "main"
functions in different tools, but as long as we don't
>>>> >> >>>> >do anything within busybox after
exiting that tool's main, then the global
>>>> >> >>>> >state issues we weren't sure
of with `cl::opt` might not be of any concern
>>>> >> >>>> >now. It may be an issue down the
line if, let's say, the tool flags moved
>>>> >> >>>> >from being "owned" by
the tools themselves to instead being "owned" by
>>>> >> >>>> >busybox, and then we'd have
to merge similarly-named flags together. In
>>>> >> >>>> >that case, migrating these tools
to use `OptTable` may be necessary since
>>>> >> >>>> >(I think) `OptTable` should
handle this. This may be a tedious task, but
>>>> >> >>>> >this is just to say that busybox
won't need to be immediately blocked on it.
>>>> >> >>>>
>>>> >> >>>> Such improvement is useful even if we
don't do multiplexing.
>>>> >> >>>> I switched llvm-symbolizer. thakis
switched llvm-objdump.
>>>> >> >>>> I can look at some binary utilities.
>>>> >> >>>>
>>>> >> >>>> >- I haven't seen any issues
with colliding symbols when linking (although
>>>> >> >>>> >I've only merged two tools
for now). I suspect that with small-ish llvm-*
>>>> >> >>>> >tools, the bulk of their code is
shared from libLLVM, and they have their
>>>> >> >>>> >own distinct logic built on top
of it, which could mean a low chance of
>>>> >> >>>> >conflicting internal ABIs.
>>>> >> >>>> >
>>>> >> >>>> >On Mon, Jun 21, 2021 at 10:54 AM
Leonard Chan <leonardchan at google.com>
>>>> >> >>>> >wrote:
>>>> >> >>>> >
>>>> >> >>>> >> Hello all,
>>>> >> >>>> >>
>>>> >> >>>> >> When building LLVM tools,
including Clang and lld, it's currently possible
>>>> >> >>>> >> to use either static or
shared linking for LLVM libraries. The latter can
>>>> >> >>>> >> significantly reduce the
size of the toolchain since we aren't duplicating
>>>> >> >>>> >> the same code in every
binary, but the dynamic relocations can affect
>>>> >> >>>> >> performance. The former
doesn't affect performance but significantly
>>>> >> >>>> >> increases the size of our
toolchain.
>>>> >> >>>> >>
>>>> >> >>>> >> We would like to implement support for a third approach which we call,
>>>> >> >>>> >> for lack of a better term, the "busybox" feature, where everything is compiled
>>>> >> >>>> >> into a single binary which
then dispatches into an appropriate tool
>>>> >> >>>> >> depending on the first
command. This approach can significantly reduce the
>>>> >> >>>> >> size by deduplicating all of
the shared code without affecting the
>>>> >> >>>> >> performance.
>>>> >> >>>> >>
>>>> >> >>>> >> In terms of implementation,
the build would produce a single binary called
>>>> >> >>>> >> `llvm` and the first command
would identify the tool. For example, instead
>>>> >> >>>> >> of invoking `llvm-nm`
you'd invoke `llvm nm`. Ideally we would also support
>>>> >> >>>> >> creation of `llvm-nm`
symlink which redirects to `llvm` for backwards
>>>> >> >>>> >> compatibility.
>>>> >> >>>> >> This functionality would
ideally be implemented as an option in the CMake
>>>> >> >>>> >> build that toolchain vendors
can opt into.
>>>> >> >>>> >>
>>>> >> >>>> >> The implementation would have to replace the `main` function of each tool with
>>>> >> >>>> >> a regular entrypoint function which is registered into a tool registry.
>>>> >> >>>> >> This could be wrapped in a
macro for convenience. When the "busybox"
>>>> >> >>>> >> feature is disabled, the
macro would expand to a `main` function as before
>>>> >> >>>> >> and redirect to the
entrypoint function. When the "busybox" feature is
>>>> >> >>>> >> enabled, it would register
the entrypoint function into the registry, which
>>>> >> >>>> >> would be responsible for the
dispatching based on the tool name. Ideally,
>>>> >> >>>> >> toolchain maintainers would
also be able to control which tools they could
>>>> >> >>>> >> add to the
"busybox" binary via CMake build options, so toolchains will
>>>> >> >>>> >> only include the tools they
use.
>>>> >> >>>> >>
>>>> >> >>>> >> One implementation detail we
think will be an issue is merging arguments
>>>> >> >>>> >> in individual tools that use
`cl::opt`. `cl::opt` works by maintaining a
>>>> >> >>>> >> global state of flags, but
we aren’t confident of what the resulting
>>>> >> >>>> >> behavior will be when
merging them together in the dispatching `main`. What
>>>> >> >>>> >> we would like to avoid is
having flags used by one specific tool available
>>>> >> >>>> >> on other tools. To address
this issue, we would like to migrate all tools
>>>> >> >>>> >> to use `OptTable`, which doesn't have this issue and has been the general
>>>> >> >>>> >> direction most tools have already been moving in.
>>>> >> >>>> >>
>>>> >> >>>> >> A second issue would be
resolving symlinks. For example, llvm-objcopy will
>>>> >> >>>> >> check argv[0] and behave as
llvm-strip (ie. use the right flags +
>>>> >> >>>> >> configuration) if it is
called via a symlink that “looks like” a strip
>>>> >> >>>> >> tool, but for all other
cases it will run under the default objcopy mode.
>>>> >> >>>> >> The “looks like” function is
usually an `Is` function copied in multiple
>>>> >> >>>> >> tools that is essentially a
substring check: so symlinks like `llvm-strip`,
>>>> >> >>>> >> strip.exe, and
`gnu-llvm-strip-10` all result in using the strip “mode”
>>>> >> >>>> >> while all other names use
the objcopy mode. To replicate the same behavior,
>>>> >> >>>> >> we will need to take great
care in making sure symlinks to the busybox tool
>>>> >> >>>> >> dispatch correctly to the
appropriate llvm tool, which might mean exposing
>>>> >> >>>> >> and merging these `Is`
functions.
>>>> >> >>>> >>
>>>> >> >>>> >> Some open questions:
>>>> >> >>>> >> - People's initial
thoughts/opinions?
>>>> >> >>>> >> - Are there existing tools
in LLVM that already do this?
>>>> >> >>>> >> - Other implementation
details/global states that we would also need to
>>>> >> >>>> >> account for?
>>>> >> >>>> >>
>>>> >> >>>> >> - Leonard
>>>> >> >>>> >>
>>>> >> >>>>
>>>> >> >>>>
>_______________________________________________
>>>> >> >>>> >LLVM Developers mailing list
>>>> >> >>>> >llvm-dev at lists.llvm.org
>>>> >> >>>>
>https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> >> >>>>
>>>> >> >>>>
>>>> >> >>>
>>>> >> >>>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> 宋方睿
>>>>
>>>>
>>>>
>>>> --
>>>> 宋方睿
>>