David Blaikie via llvm-dev
2021-Jun-12 18:41 UTC
[llvm-dev] RFC: Revisiting LLD-as-a-library design
On Sat, Jun 12, 2021 at 10:54 AM Erik McClure <erik2003 at gmail.com> wrote:

> The point of using LLVM for compiling WASM is to take advantage of ahead-of-time optimizations that could cause hitches in a JIT.

Curious what sort of hitches you're referring to - but probably far enough off-topic from this thread. (Certainly the goal of the ORC JIT is to be as robust as AOT compilation, providing the same semantics, etc.)

> For example, it integrates Polly to try to recover vectorization optimizations. The resulting DLL can then be cached and loaded instantly on every subsequent playthrough,

Fair enough - that's what I was curious about, or whether there were some other circumstances/motivations for using LLD as a library (e.g. perhaps some bugs/missing features of ORC that could be addressed, or a lack of documentation/visibility/etc.).

> without any possibility of hitching. Microsoft Flight Simulator 2020 also ships pre-compiled plugin DLLs on Xbox, which does not allow JITing code, but because these are compiled on developer machines the linker problem doesn't really apply in that situation. If they wanted to JIT WebAssembly, there are plenty of JIT runtimes to do that.
>
> Regardless, I think it's kind of silly to say that instead of using a perfectly functional linker that LLVM has, someone should JIT the code.

I didn't mean to suggest that - but it sounded pretty close to a JIT-like use case, and I was curious whether there was some non-fundamental blocker that led to the use of LLD rather than ORC for what appeared to be a JIT-like use case.

> LLVM is a compiler backend - it should support using its own linker the same way people use LLVM, and if LLVM can be used as a library, then LLD should be usable as a library. Furthermore, there is no technical reason for LLD not to be a library. It's already almost all the way there; the maintainers simply don't bother testing to see when they forget to clean up one of the global caches.

I don't disagree that LLD, like the rest of LLVM, would benefit from having a library-centric design.

- Dave

> --
> Sincerely, Erik McClure
>
> On Sat, Jun 12, 2021 at 10:24 AM David Blaikie <dblaikie at gmail.com> wrote:
>
>> Is this a JIT use case? Perhaps ORC would be applicable there.
>>
>> Or is the intent to make on-disk linked shared libraries so they can be cached over multiple executions/etc., perhaps?
>>
>> On Sat, Jun 12, 2021 at 10:09 AM Erik McClure via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>>> I use LLVM to compile WebAssembly to native code. The primary use case for this is compiling WASM plugins for games - this is what Microsoft Flight Simulator 2020 uses it for. Using the system linker is not an option on Windows, which does not ship link.exe by default, making LLD a mandatory requirement if you are using LLVM in any kind of end-user plugin scenario, as the average user has not installed Visual Studio.
>>>
>>> This puts users of LLVM's library capabilities on Windows in an awkward position, because in order to use LLVM as a library when compiling a plugin, one must use LLD, which cannot be used as a library. My current solution is to use LLD as a library anyway and maintain a fork of LLVM with the various global cleanup bugs patched (most of which have now made it into stable), along with a helper function that allows me to use LLD to read out the symbols of a given shared library (which is used to perform link-time validation of WebAssembly modules, because LLD makes it difficult to access any errors that happen).
>>>
>>> If LLD wanted to become an actual library, I think it would need a better method of reporting errors than simply a stdout and stderr stream, although I don't know what this would look like. It would also be nice for it to expose the different link stages like LLVM does, so that the application has a bit more control over what's going on. However, I don't really have any concrete ideas about what LLD should look like as a library, only that I would like it to stop crashing when I attempt to use it as one.
>>>
>>> --
>>> Sincerely, Erik McClure
>>>
>>> On Fri, Jun 11, 2021 at 8:20 PM Michael Spencer <bigcheesegs at gmail.com> wrote:
>>>
>>>> Adding Erik (not subscribed), who has previously had issues with LLD not being a library, to provide some additional use cases.
>>>>
>>>> - Michael Spencer
>>>>
>>>> On Thu, Jun 10, 2021 at 12:15 PM Reid Kleckner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hey all,
>>>>>
>>>>> Long ago, the LLD project contributors decided that they weren't going to design LLD as a library, which stands in opposition to the way that the rest of LLVM strives to be a reusable library. Part of the reasoning was that, at the time, LLD wasn't done yet, and the top priority was to finish making LLD a fast, useful, usable product. If sacrificing reusability helped LLD achieve its project goals, the contributors at the time felt that was the right tradeoff, and that carried the day.
>>>>>
>>>>> However, it is now ${YEAR} 2021, and I think we ought to reconsider this design decision. LLD was a great success: it works, it is fast, it is simple, many users have adopted it, and it has many ports (COFF/ELF/MinGW/wasm/new Mach-O). Today, we have actual users who want to run the linker as a library, and they aren't satisfied with the option of launching a child process. Some users are interested in process reuse as a performance optimization, some are including the linker in the frontend. Who knows. I try not to pre-judge any of these efforts; I think we should do what we can to enable experimentation.
>>>>>
>>>>> So, concretely, what could change? The main points of reusability are:
>>>>> - Fatal errors and warnings exit the process without returning control to the caller
>>>>> - Conflicts over global variables between threads
>>>>>
>>>>> Error recovery is the big imposition here. To avoid a giant rewrite of all error handling code in LLD, I think we should *avoid* returning failure via the llvm::Error class or std::error_code. We should instead use an approach more like clang, where diagnostics are delivered to a diagnostic consumer on the side. The success of the link is determined by whether any errors were reported. Functions may return a simple success boolean in cases where higher-level functions need to exit early. This has worked reasonably well for clang. The main failure mode here is that we miss an error check, and crash or report useless follow-on errors after an error that would normally have been fatal.
>>>>>
>>>>> Another motivation for all of this is increasing the use of parallelism in LLD. Emitting errors in parallel from threads and then exiting the process is risky business. A new diagnostic context or consumer could make this more reliable. MLIR has this issue as well, and I believe they use this pattern. They use some kind of thread shard index to order the diagnostics; LLD could do the same.
>>>>>
>>>>> Finally, we'd work to eliminate globals. I think this is mainly a small matter of programming (SMOP) and doesn't need much discussion, although the `make` template presents interesting challenges.
>>>>>
>>>>> Thoughts? Tomatoes? Flowers? I apologize for the lack of context links to the original discussions. It takes more time than I have to dig those up.
>>>>>
>>>>> Reid
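For concreteness, the in-process use Erik describes boils down to calling one of LLD's per-format driver entry points declared in lld/Common/Driver.h. The sketch below is illustrative only: the helper name and the wasm-ld flags are made up for the example, and the exact shape of link() has changed between LLVM releases (this form roughly matches the LLVM 12 era), so check the header for the version you build against.

    // Minimal sketch of driving LLD in-process, roughly as shipped around
    // LLVM 12. The declaration lives in lld/Common/Driver.h; the exact
    // signature varies by release, so treat this as illustrative.
    #include "lld/Common/Driver.h"
    #include "llvm/Support/raw_ostream.h"
    #include <string>
    #include <vector>

    // Links a WebAssembly module into a shared object on disk. Returns true
    // on success; diagnostics are only available as text captured from the
    // streams, which is the error-reporting limitation discussed here.
    bool linkWasmPlugin(const std::vector<std::string> &inputs,
                        const std::string &output, std::string &diagnostics) {
      std::vector<const char *> args = {"wasm-ld", "--no-entry",
                                        "--export-all", "-o", output.c_str()};
      for (const std::string &in : inputs)
        args.push_back(in.c_str());

      llvm::raw_string_ostream diagOS(diagnostics);
      // canExitEarly=false asks LLD not to call exit() on fatal errors, but
      // the driver still leaves global state behind, so calling this
      // repeatedly (or from several threads) in one process is what tends to
      // crash today.
      bool ok = lld::wasm::link(args, /*canExitEarly=*/false, diagOS, diagOS);
      diagOS.flush();
      return ok;
    }

Even when this succeeds, errors arrive only as flat text on the captured streams, which is exactly the reporting gap Erik and Reid are pointing at.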
Neil Henning via llvm-dev
2021-Jun-14 08:45 UTC
[llvm-dev] RFC: Revisiting LLD-as-a-library design
+1 on this proposal from Reid (thanks for bringing this to the list!)

I'll drop a brain dump of why we (Unity) would like this proposal: we're running into the same problems that Andrew noted for Zig above. At present we are using LLD to do, effectively, an on-disk JIT:

- this gets us debugging support for 'free' (write a PDB to disk next to the DLL, et voilà, you can debug)
- it lets us cache binaries that don't change across source file changes to reduce overall build times

We've seen cases where we call LLD as a subprocess 2000 times with this approach. While I noted on the Windows/COFF call that we are taking steps to try and mitigate the number of calls to LLD, we'll still likely be at around 100 DLLs built and thus around 100 calls to LLD. The cost of spawning all these LLD subprocesses is a good 15% of our build pipeline.

As an experiment I tried running LLD-as-a-library and serialized the accesses to the linker, and it was only a little bit slower than having N threads launching N instances of LLD. That gives me good hope that having an actual thread-safe way to run LLD will substantially reduce our build times, meaning happy users.

-Neil.

--
Neil Henning
Senior Software Engineer Compiler
unity.com
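As a rough illustration of the serialized-access experiment Neil describes, the sketch below takes a global lock around every in-process link so only one runs at a time. The wrapper name and the lld-link argument list are hypothetical, and the entry-point signature again follows the LLVM 12-era lld/Common/Driver.h declaration, which has changed in later releases.

    // Illustration of "serialize all in-process links": a global mutex
    // guarantees only one link runs at a time, since LLD's drivers share
    // mutable global state. Not an existing LLD API; names are made up.
    #include "lld/Common/Driver.h"
    #include "llvm/ADT/ArrayRef.h"
    #include "llvm/Support/raw_ostream.h"
    #include <mutex>
    #include <string>
    #include <vector>

    static std::mutex LldMutex; // LLD is not thread-safe; one link at a time.

    // Produces a DLL (and, with /debug, a PDB next to it) for the on-disk
    // cache. args[0] is the driver name, e.g.
    // {"lld-link", "/dll", "/debug", "/out:plugin.dll", "plugin.obj", ...}.
    bool linkDllSerialized(const std::vector<const char *> &lldLinkArgs,
                           std::string &diagnostics) {
      std::lock_guard<std::mutex> Lock(LldMutex);
      llvm::raw_string_ostream DiagOS(diagnostics);
      bool Ok =
          lld::coff::link(lldLinkArgs, /*canExitEarly=*/false, DiagOS, DiagOS);
      DiagOS.flush();
      return Ok;
    }

Per Neil's measurement, even this fully serialized arrangement was only slightly slower than launching N parallel LLD processes, which is why a genuinely thread-safe library entry point looks so attractive.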
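To make the direction of Reid's proposal more concrete, here is a purely hypothetical sketch of a clang-style diagnostic setup for LLD. None of these types exist in LLD today; the point is only to show diagnostics flowing to a consumer on the side, link success defined as "no errors were reported", and a shard index available to order diagnostics emitted from worker threads.

    // Hypothetical sketch of the side-channel diagnostics the RFC proposes;
    // not an existing LLD interface.
    #include <atomic>
    #include <cstdint>
    #include <string>

    enum class DiagSeverity { Note, Warning, Error };

    struct Diagnostic {
      DiagSeverity Severity;
      std::string Message;
      uint32_t ThreadShard; // deterministic ordering of parallel diagnostics
    };

    class DiagnosticConsumer {
    public:
      virtual ~DiagnosticConsumer() = default;
      virtual void handle(const Diagnostic &D) = 0;
    };

    // What a per-link context might carry instead of today's globals.
    class LinkContext {
    public:
      explicit LinkContext(DiagnosticConsumer &C) : Consumer(C) {}

      void error(const std::string &Msg, uint32_t Shard = 0) {
        ++ErrorCount;
        Consumer.handle({DiagSeverity::Error, Msg, Shard});
      }
      void warn(const std::string &Msg, uint32_t Shard = 0) {
        Consumer.handle({DiagSeverity::Warning, Msg, Shard});
      }

      // "The success of the link is determined by whether any errors were
      // reported"; link phases can return bool so callers bail out early.
      bool succeeded() const { return ErrorCount.load() == 0; }

    private:
      DiagnosticConsumer &Consumer;
      std::atomic<unsigned> ErrorCount{0};
    };

In this shape, fatal errors no longer need to exit the process: the driver records the error through the context, unwinds via ordinary boolean returns, and the embedding application decides what to do with the collected diagnostics.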