thr3ads.net - llvm dev - [llvm-dev] RFC: Revisiting LLD-as-a-library design [Jun 2021]


David Blaikie via llvm-dev

2021-Jun-12 17:24 UTC

[llvm-dev] RFC: Revisiting LLD-as-a-library design

Is this a JIT use case? Perhaps ORC would be applicable there.

Or is the intent to make on-disk linked shared libraries so they can be
cached over multiple executions/etc, perhaps?

On Sat, Jun 12, 2021 at 10:09 AM Erik McClure via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I use LLVM to compile WebAssembly to native code. The primary use-case for
> this is compiling WASM plugins for games - this is what Microsoft Flight
> Simulator 2020 uses it for. Using the system linker is not an option on
> Windows, which does not ship link.exe by default, making LLD a mandatory
> requirement if you are using LLVM in any kind of end-user plugin scenario,
> as the average user has not installed Visual Studio.
>
> This puts users of LLVM's library capabilities on Windows in an awkward
> position, because in order to use LLVM as a library when compiling a
> plugin, one must use LLD, which cannot be used as a library. My current
> solution is to use LLD as a library anyway and maintain a fork of LLVM with
> the various global cleanup bugs patched (most of which have now made it
> into stable), along with a helper function that allows me to use LLD to
> read out the symbols of a given shared library (which is used to perform
> link-time validation of WebAssembly modules, because LLD makes it difficult
> to access any errors that happen).
>
> If LLD wanted to become an actual library, I think it would need a better
> method of reporting errors than simply an stdout and stderr stream,
> although I don't know what this would look like. It would also be nice for
> it to expose the different link stages like LLVM does so that the
> application has a bit more control over what's going on. However, I don't
> really have any concrete ideas about what LLD should look like as a
> library, only that I would like it to stop crashing when I attempt to use
> it as one.
>
> --
> Sincerely, Erik McClure
>
>
> On Fri, Jun 11, 2021 at 8:20 PM Michael Spencer <bigcheesegs at gmail.com>
> wrote:
>
>> Adding Erik (not subscribed) who has previously had issues with LLD not
>> being a library to provide some additional use cases.
>>
>> - Michael Spencer
>>
>>
>> On Thu, Jun 10, 2021 at 12:15 PM Reid Kleckner via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> Hey all,
>>>
>>> Long ago, the LLD project contributors decided that they weren't going
>>> to design LLD as a library, which stands in opposition to the way that the
>>> rest of LLVM strives to be a reusable library. Part of the reasoning was
>>> that, at the time, LLD wasn't done yet, and the top priority was to finish
>>> making LLD a fast, useful, usable product. If sacrificing reusability
>>> helped LLD achieve its project goals, the contributors at the time felt
>>> that was the right tradeoff, and that carried the day.
>>>
>>> However, it is now ${YEAR} 2021, and I think we ought to reconsider this
>>> design decision. LLD was a great success: it works, it is fast, it is
>>> simple, many users have adopted it, it has many ports
>>> (COFF/ELF/mingw/wasm/new MachO). Today, we have actual users who want to
>>> run the linker as a library, and they aren't satisfied with the option of
>>> launching a child process. Some users are interested in process reuse as a
>>> performance optimization, some are including the linker in the frontend.
>>> Who knows. I try not to pre-judge any of these efforts, I think we should
>>> do what we can to enable experimentation.
>>>
>>> So, concretely, what could change? The main points of reusability are:
>>> - Fatal errors and warnings exit the process without returning control
>>>   to the caller
>>> - Conflicts over global variables between threads
>>>
>>> Error recovery is the big imposition here. To avoid a giant rewrite of
>>> all error handling code in LLD, I think we should *avoid* returning failure
>>> via the llvm::Error class or std::error_code. We should instead use an
>>> approach more like clang, where diagnostics are delivered to a diagnostic
>>> consumer on the side. The success of the link is determined by whether any
>>> errors were reported. Functions may return a simple success boolean in
>>> cases where higher level functions need to exit early. This has worked
>>> reasonably well for clang. The main failure mode here is that we miss an
>>> error check, and crash or report useless follow-on errors after an error
>>> that would normally have been fatal.
>>>
>>> Another motivation for all of this is increasing the use of parallelism
>>> in LLD. Emitting errors in parallel from threads and then exiting the
>>> process is risky business. A new diagnostic context or consumer could make
>>> this more reliable. MLIR has this issue as well, and I believe they use
>>> this pattern. They use some kind of thread shard index to order the
>>> diagnostics, LLD could do the same.
>>>
>>> Finally, we'd work to eliminate globals. I think this is mainly a small
>>> matter of programming (SMOP) and doesn't need much discussion, although the
>>> `make` template presents interesting challenges.
>>>
>>> Thoughts? Tomatoes? Flowers? I apologize for the lack of context links
>>> to the original discussions. It takes more time than I have to dig those up.
>>>
>>> Reid
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
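
For readers following the thread, the error-handling approach in Reid's RFC above can be sketched roughly as follows. This is a minimal illustration only, not LLD's or clang's actual API: every name in it (`DiagnosticConsumer`, `CollectingConsumer`, `LinkContext`, `loadInput`, `runLink`) is hypothetical. It shows the shape the RFC describes: diagnostics are delivered to a consumer on the side, functions return a plain boolean so callers can stop early, and the link counts as successful only if no errors were reported.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; none of these are LLD or clang APIs.
enum class Severity { Warning, Error };

struct Diagnostic {
  Severity severity;
  std::string message;
};

// Receives diagnostics "on the side" instead of printing and exiting.
class DiagnosticConsumer {
public:
  virtual ~DiagnosticConsumer() = default;
  virtual void handle(const Diagnostic &diag) = 0;
};

// A consumer that stores what it is given so the embedding application
// can inspect the diagnostics after the link returns.
class CollectingConsumer : public DiagnosticConsumer {
public:
  void handle(const Diagnostic &diag) override { diags.push_back(diag); }
  std::vector<Diagnostic> diags;
};

// Per-link state: counts errors so the overall result can be derived from it.
class LinkContext {
public:
  explicit LinkContext(DiagnosticConsumer &consumer) : consumer(consumer) {}

  void warn(std::string msg) {
    consumer.handle({Severity::Warning, std::move(msg)});
  }
  void error(std::string msg) {
    assert(!msg.empty() && "a diagnostic needs a message");
    ++errorCount;
    consumer.handle({Severity::Error, std::move(msg)});
  }
  unsigned errors() const { return errorCount; }

private:
  DiagnosticConsumer &consumer;
  unsigned errorCount = 0;
};

// Returns false so the caller can stop early; the details go to the consumer.
bool loadInput(LinkContext &ctx, const std::string &path) {
  if (path.empty()) {
    ctx.error("empty input path");
    return false;
  }
  return true;
}

// The link succeeds only if no errors were reported along the way.
bool runLink(LinkContext &ctx, const std::vector<std::string> &inputs) {
  for (const std::string &path : inputs)
    if (!loadInput(ctx, path))
      break; // control returns to the caller instead of exiting the process
  return ctx.errors() == 0;
}
```

An embedding application would construct a `CollectingConsumer`, pass it to a `LinkContext`, call `runLink`, and then read `diags` rather than scraping stdout and stderr. The failure mode mentioned in the RFC is visible here too: if a caller ignored the return value of `loadInput`, the loop would carry on after an error that used to be fatal.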

Erik McClure via llvm-dev

2021-Jun-12 17:54 UTC


[llvm-dev] RFC: Revisiting LLD-as-a-library design

The point of using LLVM for compiling WASM is to take advantage of
ahead-of-time optimizations that could cause hitches in a JIT. For example,
it integrates Polly to try to recover vectorization optimizations. The
resulting DLL can then be cached and loaded instantly on every subsequent
playthrough, without any possibility of hitching. Microsoft Flight
Simulator 2020 also ships pre-compiled plugin DLLs on Xbox, which does not
allow JITing code, but because these are compiled on developer machines the
linker problem doesn't really apply in that situation. If they wanted to
JIT WebAssembly, there are plenty of JIT runtimes to do that.

Regardless, I think it's kind of silly to say that instead of using a
perfectly functional linker that LLVM has, someone should JIT the code.
LLVM is a compiler backend - it should support using its own linker the
same way people use LLVM, and if LLVM can be used as a library, then LLD
should be usable as a library. Furthermore, there is no technical reason
for LLD to not be a library. It's already almost all the way there; the
maintainers simply don't bother testing to see when they forget to clean up
one of the global caches.
--
Sincerely, Erik McClure
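
The global-cache cleanup problem mentioned above, and the `make` template the RFC singles out, can be illustrated with a small sketch. It assumes that the difficulty comes from objects being allocated out of process-wide storage, so that state from one link survives into the next unless someone remembers to reset it. The names here (`LinkState`, `Symbol`, `linkOnce`) are hypothetical and this is not LLD's actual allocator; it only shows one possible direction, where allocation goes through a per-link object whose destructor frees everything.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; not LLD's actual memory or context types.
class LinkState {
public:
  // Allocates a T whose lifetime is tied to this link, not to the process.
  template <typename T, typename... Args> T *make(Args &&...args) {
    auto owned = std::make_shared<T>(std::forward<Args>(args)...);
    T *raw = owned.get();
    // shared_ptr<void> remembers the correct deleter for T.
    objects.push_back(std::move(owned));
    return raw;
  }
  std::size_t liveObjects() const { return objects.size(); }

private:
  std::vector<std::shared_ptr<void>> objects;
};

struct Symbol {
  explicit Symbol(std::string n) : name(std::move(n)) {}
  std::string name;
};

// Each call builds and tears down its own state, so nothing allocated
// during one link is still around when the next one starts.
std::size_t linkOnce(const std::vector<std::string> &names) {
  LinkState state;
  for (const std::string &n : names)
    state.make<Symbol>(n);
  return state.liveObjects();
}
```

Calling `linkOnce` twice in one process gives each run a fresh `LinkState`, which is the property a library user needs. The cost is that every allocation site has to be handed the state object, which is presumably part of why the RFC calls the `make` template an interesting challenge.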


