Saleem Abdulrasool via llvm-dev
2021-Sep-09 01:21 UTC
[llvm-dev] Supporting LLVM_BUILD_LLVM_DYLIB on Windows
On Wed, Sep 8, 2021 at 4:53 PM Tom Stellard <tstellar at redhat.com> wrote:> On 9/8/21 3:52 PM, Saleem Abdulrasool via llvm-dev wrote: > > Hello llvm-dev, > > > > In general, I am in favor of this change and think it would be a good > improvement. A few comments below: > > > One of the current limitations on LLVM on Windows is that you cannot use > LLVM_BUILD_LLVM_DYLIB: > https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16 > < > https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16> > I am interested in trying to see if we can lift this limitation. There > are others in the community that also seem to be interested in seeing LLVM > being possible to use as a DLL on Windows and the topic does come up on the > mailing lists every so often. > > > > When you build a distribution of a LLVM based toolchain currently, the > result on Windows is ~2GiB for a trimmed down toolset. This is largely due > to the static linking used for all the tools. I would like to be able to > use the shared LLVM build for building a toolset on Windows. > > > > Unlike Unix platforms, the default on Windows is that all symbols are > treated as `dso_local` (that is `-fvisibility-default=hidden`). Symbols > which are meant to participate in dynamic linking are to be attributed as > `__declspec(dllexport)` in the module and `__declspec(dllimport)` external > to the module. This is similar to Unix platforms where > `__attribute__((__visibility__(...)))` controls the same type of behaviour > with `-fvisibility-default=hidden`. > > > > For the case of distributions, it would remain valuable to minimize the > number of shared objects to reduce the files that require to be shipped but > also to minimize the number of cross-module calls which are not entirely > free (i.e. PLT+GOT or IAT costs). At the same time, the number of possible > labels which can be exposed from a single module on Windows is limited to > 64K. Experience from MSys2 indicates that LLVM with all the backends is > likely to exceed this count (with a subset of targets, the number already > is close to 60K). This means that it may be that we would need two > libraries on Windows. > > > > The backends only need to export a handful of symbols, so I don't think > enabling more targets will have that big of impact on the number > of exported symbols. >That is positive at least.> > With the LLVM community being diverse, people often build on different > platforms with different configurations, and I am concerned that adding > more differences in how we build libraries complicates how maintainable > LLVM is. I would suggest that we actually change the behavior of the Unix > builds to match that of Windows by building with > `-fvisibility-default=hidden`. Although this is a change, it is not > without value. By explicitly marking the interfaces which are vended by a > library and making everything else internal, it does enable some potential > optimization options for the compiler and linker (to be clear, I am not > suggesting that this will have a guaranteed benefit, just that it can > potentially enable additional opportunities for optimizations and size > reductions). This should incidentally help static linking. > > > > In order to achieve this, we would need to have a module specific > annotation to indicate what symbols are meant to be used outside of the > module when built in a shared configuration. The same annotation would > apply to all targets and is expected to be applied uniformly. This of > course has a cost associated with it: the public interfaces would need to > be decorated appropriately. However, by having the same behaviour on all > the platforms, developers would not be impacted by the platform differences > in their day-to-day development. The only time that developers would need > to be aware of this is when they are working on the module boundary, that > is, changes which do not change the API surface of LLVM would not need to > consider the annotations. > > > > Concretely, what I believe is required to enable building with > LLVM_BUILD_LLVM_DYLIB on Windows is: > > - introduce module specific decoration (e.g. LLVM_SUPPORT_ABI, ...) to > mark public interfaces of shared library modules > > - decorate all the public interfaces of the shared library modules with > the new decoration > > - switching the builds to use `-fvisibility-default=hidden` by default > > > > My recommendation would be to start with a generic decoration (e.g. > LLVM_EXPORTED_API) > and decorate all the currently exported symbols (which currently is all of > them except > for the lib/Target symbols on Linux). This way you could switch to > -fvisibility-default=hidden > and it would have no impact on library users. I know it's tedious work, > but you can split it up > and recruit others (like me) to help make the changes. >I really think that making this generic is only going to cause more churn. In the case we do run into the limit, we are going to have to split the flag. Making the flag less generic now will be easier to extend later but doesn't cost much now. The application is going to be the same, it is merely the spelling that changes. Since the flag impacts how the library can be called, it seems that ABI might be more appropriate (and Fangrui seemed to agree with that point on the differential). I don't care too much about the spelling being ABI or API though. I'm less worried about the tedium (especially since I am experimenting with the possibility of trying to automate at least a large portion of the change).> Once you have that working we can start trimming down the ABI and > determine whether or > not we actually need to have two libraries or not. If the full build with > a few targets > enabled is already under the limit, I think there is a good chance we can > get > a single library to work. > > The advantage of this approach is that you can make a lot of progress in a > very > non-invasive way. >I think that the invasive part is more the annotation than anything else. I don't expect there will be many places where we will truly have to contend with library structuring.> > I believe that these can be done mostly independently and staged in the > order specified. Until the last phase, it would have no actual impact on > the builds. However, by staging it, we could allow others to experiment > with the option while it is under development, and allows for an easier > path for switching the builds over. > > > > Although this would enable LLVM_BUILD_LLVM_DYLIB on Windows, give us > better uniformity between Windows and non-Windows platforms, potentially > enable additional optimization benefits, improve binary sizes for a > distribution of the toolchain (though less on Linux where distributors are > already using the build configuration ignoring the official suggestions in > the LLVM guides), and help with runtime costs of the toolchain (by making > the core of the tools a shared library, the backing pages can now be shared > across multiple instances), it is not entirely without downsides. The > primary downsides that I see are: > > - it becomes less enticing to support both LLVM_BUILD_LLVM_DYLIB and > BUILD_SHARED_LIBS: while technically possible, interfaces will need to be > decorated for both forms of the build > > I don't think we should make any changes to the BUILD_SHARED_LIBS build. > This is intended for developer convenience only and not recommend for > end users or distros[1]. >If you build with hidden visibility, then BUILD_SHARED_LIBS would generate libraries which do not export any interfaces, which would be a problem. In order to support that, at least on Windows, you absolutely will need to annotate all of the APIs on a per module basis. I suppose that one could then pass additional flags to counteract the defaults to make BUILD_SHARED_LIBS work. I am merely trying to point out that having a large set of libraries with a large API surface and having the flexibility to build them in a multitude of ways has an associated cost and that this will increase that.> Supporting BUILD_SHARED_LIBS will require annotating more of the API > and create more work both initially and also in ongoing maintenance, > and I don't think it is worth it.-Tom> > > - LLVM_DYLIB_COMPONENTS becomes less tractable: in theory it is possible > to apply enough CPP magic to determine where a symbol is homed, but > allowing a symbol to be homed in a shared or static library is > significantly more complex > > - BUILD_SHARED_LIBS becomes more expensive to maintain: the decoration > is per-module, which requires that we would need to decorate the symbols of > each module with module specific annotations as well > > > > One argument that people make for BUILD_SHARED_LIBS is that it reduces > the overall time build-test cycle. With the combination of lld, DWARF > Fission, and LLVM_BUILD_LLVM_DYLIB, I believe that most of the benefits > still can be had. The cost of linking all the tools is amortized across > the link of a single library, which while not as small as the a singular > library, is offset by the following: > > - The LLVM_BUILD_LLVM_DYLIB would not require the re-linking of all the > libraries for each tool. > > - DWARF Fission would avoid the need to relink all of the DWARF > information. > > - lld is faster than the gold and bfd linkers > > > > Header changes would still ripple through the system as before, > requiring rebuilding the transitive closure. Source file changes do not > have the same impact of course. > > > > For those would like a more concrete example of what a change like this > may shape up into: https://reviews.llvm.org/D109192 < > https://reviews.llvm.org/D109192> contains `LLVMSupportExports.h` which > has the expected structure for declaring the decoration macros with the > rest of the change primarily being focused on applying the decoration. > Please ignore the CMake changes as they are there to ensure that the CI > validates this without changing the configuration and not intended to be > part of the final version of the change. > > > > [1] https://llvm.org/docs/CMake.html#llvm-related-variables > > > > -- > > Saleem Abdulrasool > > compnerd (at) compnerd (dot) org > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > >-- Saleem Abdulrasool compnerd (at) compnerd (dot) org -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210908/e17d9421/attachment.html>
Tom Stellard via llvm-dev
2021-Sep-09 03:43 UTC
[llvm-dev] Supporting LLVM_BUILD_LLVM_DYLIB on Windows
On 9/8/21 6:21 PM, Saleem Abdulrasool wrote:> On Wed, Sep 8, 2021 at 4:53 PM Tom Stellard <tstellar at redhat.com <mailto:tstellar at redhat.com>> wrote: > > On 9/8/21 3:52 PM, Saleem Abdulrasool via llvm-dev wrote: > > Hello llvm-dev, > > > > In general, I am in favor of this change and think it would be a good > improvement. A few comments below: > > > One of the current limitations on LLVM on Windows is that you cannot use LLVM_BUILD_LLVM_DYLIB: https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16 <https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16> <https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16 <https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-shlib/CMakeLists.txt#L14-L16>> I am interested in trying to see if we can lift this limitation. There are others in the community that also seem to be interested in seeing LLVM being possible to use as a DLL on Windows and the topic does come up on the mailing lists every so often. > > > > When you build a distribution of a LLVM based toolchain currently, the result on Windows is ~2GiB for a trimmed down toolset. This is largely due to the static linking used for all the tools. I would like to be able to use the shared LLVM build for building a toolset on Windows. > > > > Unlike Unix platforms, the default on Windows is that all symbols are treated as `dso_local` (that is `-fvisibility-default=hidden`). Symbols which are meant to participate in dynamic linking are to be attributed as `__declspec(dllexport)` in the module and `__declspec(dllimport)` external to the module. This is similar to Unix platforms where `__attribute__((__visibility__(...)))` controls the same type of behaviour with `-fvisibility-default=hidden`. > > > > For the case of distributions, it would remain valuable to minimize the number of shared objects to reduce the files that require to be shipped but also to minimize the number of cross-module calls which are not entirely free (i.e. PLT+GOT or IAT costs). At the same time, the number of possible labels which can be exposed from a single module on Windows is limited to 64K. Experience from MSys2 indicates that LLVM with all the backends is likely to exceed this count (with a subset of targets, the number already is close to 60K). This means that it may be that we would need two libraries on Windows. > > > > The backends only need to export a handful of symbols, so I don't think > enabling more targets will have that big of impact on the number > of exported symbols. > > > That is positive at least. > > > With the LLVM community being diverse, people often build on different platforms with different configurations, and I am concerned that adding more differences in how we build libraries complicates how maintainable LLVM is. I would suggest that we actually change the behavior of the Unix builds to match that of Windows by building with `-fvisibility-default=hidden`. Although this is a change, it is not without value. By explicitly marking the interfaces which are vended by a library and making everything else internal, it does enable some potential optimization options for the compiler and linker (to be clear, I am not suggesting that this will have a guaranteed benefit, just that it can potentially enable additional opportunities for optimizations and size reductions). This should incidentally help static linking. > > > > In order to achieve this, we would need to have a module specific annotation to indicate what symbols are meant to be used outside of the module when built in a shared configuration. The same annotation would apply to all targets and is expected to be applied uniformly. This of course has a cost associated with it: the public interfaces would need to be decorated appropriately. However, by having the same behaviour on all the platforms, developers would not be impacted by the platform differences in their day-to-day development. The only time that developers would need to be aware of this is when they are working on the module boundary, that is, changes which do not change the API surface of LLVM would not need to consider the annotations. > > > > Concretely, what I believe is required to enable building with LLVM_BUILD_LLVM_DYLIB on Windows is: > > - introduce module specific decoration (e.g. LLVM_SUPPORT_ABI, ...) to mark public interfaces of shared library modules > > - decorate all the public interfaces of the shared library modules with the new decoration > > - switching the builds to use `-fvisibility-default=hidden` by default > > > > My recommendation would be to start with a generic decoration (e.g. LLVM_EXPORTED_API) > and decorate all the currently exported symbols (which currently is all of them except > for the lib/Target symbols on Linux). This way you could switch to -fvisibility-default=hidden > and it would have no impact on library users. I know it's tedious work, but you can split it up > and recruit others (like me) to help make the changes. > > > I really think that making this generic is only going to cause more churn. In the case we do run into the limit, we are going to have to split the flag. Making the flag less generic now will be easier to extend later but doesn't cost much now. The application is going to be the same, it is merely the spelling that changes. Since the flag impacts how the library can be called, it seems that ABI might be more appropriate (and Fangrui seemed to agree with that point on the differential). I don't care too much about the spelling being ABI or API though. >OK, that makes sense.> I'm less worried about the tedium (especially since I am experimenting with the possibility of trying to automate at least a large portion of the change). > > Once you have that working we can start trimming down the ABI and determine whether or > not we actually need to have two libraries or not. If the full build with a few targets > enabled is already under the limit, I think there is a good chance we can get > a single library to work. > > The advantage of this approach is that you can make a lot of progress in a very > non-invasive way. > > > I think that the invasive part is more the annotation than anything else. I don't expect there will be many places where we will truly have to contend with library structuring. > > > I believe that these can be done mostly independently and staged in the order specified. Until the last phase, it would have no actual impact on the builds. However, by staging it, we could allow others to experiment with the option while it is under development, and allows for an easier path for switching the builds over. > > > > Although this would enable LLVM_BUILD_LLVM_DYLIB on Windows, give us better uniformity between Windows and non-Windows platforms, potentially enable additional optimization benefits, improve binary sizes for a distribution of the toolchain (though less on Linux where distributors are already using the build configuration ignoring the official suggestions in the LLVM guides), and help with runtime costs of the toolchain (by making the core of the tools a shared library, the backing pages can now be shared across multiple instances), it is not entirely without downsides. The primary downsides that I see are: > > - it becomes less enticing to support both LLVM_BUILD_LLVM_DYLIB and BUILD_SHARED_LIBS: while technically possible, interfaces will need to be decorated for both forms of the build > > I don't think we should make any changes to the BUILD_SHARED_LIBS build. > This is intended for developer convenience only and not recommend for > end users or distros[1]. > > > If you build with hidden visibility, then BUILD_SHARED_LIBS would generate libraries which do not export any interfaces, which would be a problem. In order to support that, at least on Windows, you absolutely will need to annotate all of the APIs on a per module basis. I suppose that one could then pass additional flags to counteract the defaults to make BUILD_SHARED_LIBS work. I am merely trying to point out that having a large set of libraries with a large API surface and having the flexibility to build them in a multitude of ways has an associated cost and that this will increase that. >Do you want to support a BUILD_SHARED_LIBS build on Windows? If yes, then I agree you need to annotate the API. All I'm saying is that if you don't care about BUILD_SHARED_LIBS, then don't worry about trying to make the build work with -fvisibility=hidden. Just keep -fvisibility=default for the BUILD_SHARED_LIBS. -Tom> Supporting BUILD_SHARED_LIBS will require annotating more of the API > and create more work both initially and also in ongoing maintenance, > and I don't think it is worth it. > > -Tom > > > - LLVM_DYLIB_COMPONENTS becomes less tractable: in theory it is possible to apply enough CPP magic to determine where a symbol is homed, but allowing a symbol to be homed in a shared or static library is significantly more complex > > - BUILD_SHARED_LIBS becomes more expensive to maintain: the decoration is per-module, which requires that we would need to decorate the symbols of each module with module specific annotations as well > > > > One argument that people make for BUILD_SHARED_LIBS is that it reduces the overall time build-test cycle. With the combination of lld, DWARF Fission, and LLVM_BUILD_LLVM_DYLIB, I believe that most of the benefits still can be had. The cost of linking all the tools is amortized across the link of a single library, which while not as small as the a singular library, is offset by the following: > > - The LLVM_BUILD_LLVM_DYLIB would not require the re-linking of all the libraries for each tool. > > - DWARF Fission would avoid the need to relink all of the DWARF information. > > - lld is faster than the gold and bfd linkers > > > > Header changes would still ripple through the system as before, requiring rebuilding the transitive closure. Source file changes do not have the same impact of course. > > > > For those would like a more concrete example of what a change like this may shape up into: https://reviews.llvm.org/D109192 <https://reviews.llvm.org/D109192> <https://reviews.llvm.org/D109192 <https://reviews.llvm.org/D109192>> contains `LLVMSupportExports.h` which has the expected structure for declaring the decoration macros with the rest of the change primarily being focused on applying the decoration. Please ignore the CMake changes as they are there to ensure that the CI validates this without changing the configuration and not intended to be part of the final version of the change. > > > > [1] https://llvm.org/docs/CMake.html#llvm-related-variables <https://llvm.org/docs/CMake.html#llvm-related-variables> > > > > -- > > Saleem Abdulrasool > > compnerd (at) compnerd (dot) org > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> > > > > > > -- > Saleem Abdulrasool > compnerd (at) compnerd (dot) org