Hello LLVM Developers,

Within Google, we have a growing range of needs that existing libc implementations don't quite address. This is pushing us to start working on a new libc implementation.

Informal conversations with others within the LLVM community have told us that a libc in LLVM is actually a broader need, and we are increasingly consolidating our toolchains around LLVM. Hence, we wanted to see if the LLVM project would be interested in us developing this upstream as part of the project.

To be very clear: we don't expect our needs to exactly match everyone else's -- part of our impetus is to simplify things wherever we can, and that may not quite match what others want in a libc. That said, we do believe that the effort will still be directly beneficial and usable for the broader LLVM community, and may serve as a starting point for others in the community to flesh out an increasingly complete set of libc functionality.

We are still in the early stages, but we do have some high-level goals and guiding principles for the initial scope we are interested in pursuing:

1. The project should mesh with the "as a library" philosophy of the LLVM project: even though "the C Standard Library" is nominally "a library," most implementations are, in practice, quite monolithic.
2. The libc should support static non-PIE and static-PIE linking. This means providing the CRT (the C runtime) and a PIE loader for static non-PIE and static-PIE linked executables.
3. If there is a specification, we should follow it. The scope that we need includes most of the C Standard Library; POSIX additions; and some necessary, system-specific extensions. This does not mean we should (or can) follow the entire specification -- there will be some parts which simply aren't worth implementing, and some parts which cannot be safely used in modern coding practice.
4. Vendor extensions must be considered very carefully, and only admitted when necessary. Similar to Clang and libc++, it does seem inevitable that we will need to provide some level of compatibility with other vendors' extensions.
5. The project should be an exemplar of developing with LLVM tooling. Two examples are fuzz testing from the start and sanitizer-supported testing.

There are also a few areas which we do not intend to invest in at this point:

1. Implementing dynamic loading and linking support.
2. Support for more architectures (we'll start with just x86-64 for simplicity).

For these areas, the community is of course free to contribute. Our hope is that preserving the "as a library" design philosophy will make such extensions easy, and allow retaining the simplicity when these features aren't needed.

We intend to build the new libc in a gradual manner. To begin with, the new libc will be a layer sitting between the application and the system libc. Eventually, when the implementation is sufficiently complete, it will be able to replace the system libc at least for some use cases and contexts.

So, what do you think about incorporating this new libc under the LLVM project?

Thank you,
Siva Chandra and the rest of the Google LLVM contributors
Disclaimer: I work at Google, so don't take my +1 as an independent vote forward.

We would like to use this on Fuchsia, and I am particularly interested in creating a dynamic linking library for ELF with Roland McGrath's guidance. We spoke internally about creating a library for writing dynamic linkers, and I don't see why this can't be upstreamed.

On Fuchsia we critically need support for AArch64. What do you expect to be architecture dependent? I struggled to think of where the architecture, and not the operating system, was the issue.

On Mon, Jun 24, 2019 at 3:23 PM Siva Chandra via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hello LLVM Developers,
> [...]
What do you expect the support for Windows to be? Certainly, I don't expect you to provide Windows support personally if you don't need it, but given that LLVM supports Windows, it should at least be done in such a way that the design lends itself to interested parties contributing Windows support.

Currently clang-cl has several dependencies on having a Visual Studio installation present on your machine, and one of these is to provide an implementation of the CRT (i.e. libc). So having a libc implementation which supports Windows and is compatible with MSVCRT would be useful for people using clang on Windows as well.

On Mon, Jun 24, 2019 at 3:38 PM Jake Ehrlich via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> [...]
On 6/24/19 5:23 PM, Siva Chandra via llvm-dev wrote:
> Hello LLVM Developers,
>
> Within Google, we have a growing range of needs that existing libc implementations don't quite address. This is pushing us to start working on a new libc implementation.
>
> Informal conversations with others within the LLVM community has told us that a libc in LLVM is actually a broader need,

+1 - This has also been my experience: Many people over many years have expressed a desire to have a libc as part of the LLVM project. It is currently a large gap in our LLVM toolchain offering. Moreover, from the standpoint of my organization, an LLVM libc could provide benefits on both production platforms and research/experimental hardware.

> and we are increasingly consolidating our toolchains around LLVM. Hence, we wanted to see if the LLVM project would be interested in us developing this upstream as part of the project.
>
> [...]
>
> 5. The project should be an exemplar of developing with LLVM tooling. Two examples are fuzz testing from the start, and sanitizer-supported testing.

Great.

> There are also few areas which we do not intend to invest in at this point:
>
> 1. Implement dynamic loading and linking support.

It will be useful to have a design document that describes the kind of system and capabilities that you're targeting, and then we can discuss how the libc might have a modular design that can be adapted for other use cases. I mention modularity because, for example, we have accelerator hardware and various kinds of low-variability/embedded environments where many, but not all, POSIX/libc capabilities make sense.

> 2. Support for more architectures (we'll start with just x86-64 for simplicity).
>
> [...]
>
> So, what do you think about incorporating this new libc under the LLVM project?

This is something that I'd like to see.

-Hal

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
Hey Siva,

HardenedBSD is a derivative of FreeBSD that aims to perform a clean-room reimplementation of the publicly-documented bits of the grsecurity patchset. We're extremely interested in LLVM's CFI to fill the gap of PaX's/grsecurity's patented/GPLv3'd excellent RAP implementation. We've made measurable and tangible progress in researching and integrating Cross-DSO CFI (even producing a pre-alpha Call-For-Testing of Cross-DSO CFI in HardenedBSD base).

One hard problem I need to solve is tight integration of the sanitizer library into both our libc and our RTLD while also attempting to keep diffs minimal with our upstream FreeBSD. Having a libc that was sanitizer-centric (or, at least, sanitizer-aware) and could serve as a drop-in replacement for our libc would be a major win and would even enable quicker development of novel security technologies in the future.

On Mon, Jun 24, 2019 at 03:23:20PM -0700, Siva Chandra via llvm-dev wrote:
> [...]
>
> 2. The libc should support static non-PIE and static-PIE linking. This
> means, providing the CRT (the C runtime) and a PIE loader for static
> non-PIE and static-PIE linked executables.

Having a portable, permissively-licensed CSU/CRT that supports static PIE would be a very welcome project, especially if HardenedBSD could make use of it.

> [...]
>
> There are also few areas which we do not intend to invest in at this point:
>
> 1. Implement dynamic loading and linking support.

That is correct. Implementing a runtime linker (RTLD) is orthogonal. However, it seems to be the next logical (and welcome!) step. Not within scope of a libc implementation, though.

> [...]
>
> So, what do you think about incorporating this new libc under the LLVM
> project?

Even if the new libc isn't merged into LLVM, it would be very interesting to collaborate on. I would hope that Google would remain interested in keeping it open source, and perhaps maintained in a fashion that multiple OS vendors can adopt.

Thanks,

--
Shawn Webb
Cofounder / Security Engineer
HardenedBSD
<disclaimer: I work at Google, though not on anything related to this project>

On Jun 24, 2019, at 3:23 PM, Siva Chandra via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> We are still in the early stages, but we do have some high-level goals and guiding principles of the initial scope we are interested in pursuing:
>
> The project should mesh with the "as a library" philosophy of the LLVM project: even though "the C Standard Library" is nominally "a library," most implementations are, in practice, quite monolithic.

This is awesome. I’d really love to see a corpus of functionality built as a set of libraries that can be sliced and remixed in different ways per the needs of different use-cases.

> For these areas, the community is of course free to contribute. Our hope is that, preserving the "as a library" design philosophy will make such extensions easy, and allow retaining the simplicity when these features aren't needed.

Fantastic!

> We intend to build the new libc in a gradual manner. To begin with, the new libc will be a layer sitting between the application and the system libc. Eventually, when the implementation is sufficiently complete, it will be able to replace the system libc at least for some use cases and contexts.
>
> So, what do you think about incorporating this new libc under the LLVM project?

I would love to see this, and I think it would fill a significant missing piece in the LLVM ecosystem.

-Chris
I’m not totally sold on the idea of having it be a layer between the system libc and the application. I think this is likely to create a split between Windows and non-Windows platforms that will be difficult to overcome. It also seems like it brings its own set of difficulties. Where can you make a separation in libc such that you’re guaranteed that the two pieces do not share any state, especially given that not everyone is going to be using the same libc? Have you considered just starting with a blank slate?

On Mon, Jun 24, 2019 at 5:33 PM Chris Lattner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> [...]
On Mon, 24 Jun 2019 at 23:23, Siva Chandra via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hello LLVM Developers,
>
> Within Google, we have a growing range of needs that existing libc implementations don't quite address. This is pushing us to start working on a new libc implementation.

Are you able to share what some of these needs are? My reason for asking is to see if there is a particular niche where existing libc designs are not working, or if there is an approach that will handle many use cases better than existing libc implementations.

> [...]
>
> To be very clear: we don't expect our needs to exactly match everyone else's -- part of our impetus is to simplify things wherever we can, and that may not quite match what others want in a libc. That said, we do believe that the effort will still be directly beneficial and usable for the broader LLVM community, and may serve as a starting point for others in the community to flesh out an increasingly complete set of libc functionality.

I'm definitely interested in hearing more. Assembling an LLVM-based toolchain when there isn't an obvious native platform C library that can be used could in theory benefit greatly from something like this. As you point out, this might not be in your set of needs though.

> The project should mesh with the "as a library" philosophy of the LLVM project: even though "the C Standard Library" is nominally "a library," most implementations are, in practice, quite monolithic.

There can be good reasons for designs to be monolithic though, for example https://wiki.musl-libc.org/design-concepts.html. I'm not enough of a C-library expert to say that this is always true, but it does at least highlight that there is a risk that a toolkit suitable for many libraries becomes too cumbersome to use in practice.

> The libc should support static non-PIE and static-PIE linking. This means, providing the CRT (the C runtime) and a PIE loader for static non-PIE and static-PIE linked executables.

Interesting. I've seen a static-PIE loader embedded into an image so that it could relocate itself. As all the dependencies were statically linked, there were only simple relative relocations to resolve. Are you thinking of something along those lines, or an external loader program?

> If there is a specification, we should follow it. The scope that we need includes most of the C Standard Library; POSIX additions; and some necessary, system-specific extensions. This does not mean we should (or can) follow the entire specification -- there will be some parts which simply aren't worth implementing, and some parts which cannot be safely used in modern coding practice.

I'm interested in what sort of platform the libc could run on and what would need to be provided externally. In particular, I'm interested in whether a platform OS is required. I'm also interested in where the boundaries of the libc are; for example, I'm thinking of something like the separation of newlib and libgloss here.

> [...]
>
> There are also few areas which we do not intend to invest in at this point:
>
> Implement dynamic loading and linking support.
>
> Support for more architectures (we'll start with just x86-64 for simplicity).

I strongly recommend you choose at least one other architecture and build cross-platform support in from the beginning. I suspect that trying to put this in retroactively will put huge stress on the design and the supporting infrastructure such as the build system. There is also a danger of baking design decisions favouring one architecture into the system; 32-bit vs 64-bit support is one obvious case. I'm thinking that this is one area where the community could contribute.

> [...]
>
> We intend to build the new libc in a gradual manner. To begin with, the new libc will be a layer sitting between the application and the system libc. Eventually, when the implementation is sufficiently complete, it will be able to replace the system libc at least for some use cases and contexts.

I'm interested to see which system libc and existing platforms you intend to support. Does this go as low as embedded systems where the platform is more like a board support package, or is this purely a libc for platforms?

> So, what do you think about incorporating this new libc under the LLVM project?

Personally, I think that if it can satisfy the needs of a sufficiently broad segment of the community then I'm in favour. I'm looking forward to seeing more.

Peter
On Tue, Jun 25, 2019 at 03:24:04AM +0000, Siva Chandra via llvm-dev wrote:
> [...]
> So, what do you think about incorporating this new libc under the
> LLVM project?

Since I have a little experience in this area, I'd like to chime in on it. :-) TL;DR I think it's a really, REALLY bad idea.

First, writing and maintaining a correct, compatible, high-quality libc is a monumental task.
The amount of code needed is not all that large, but the subtleties of how it behaves, the difficulties of implementing various interfaces that have no capacity to fail or report failure, and the astronomical "compatibility surface" of interfacing with all C and C++ software ever written as well as a large amount of software written in other languages whose runtimes "pass through" the behavior of libc to the applications they host, all contribute to the scale of work, and of knowledge/expertise, involved in making something of even decent quality. (As an aside, note that I love to see hobby libc projects even if they have major problems, but that's totally different from proposing something that lots of people will end up stuck using.)

Second, corporate development teams are uniquely qualified to utterly botch a libc, yet still push it into widespread use, and the cost is painful compatibility hacks in all applications. Apple did this with their fork of BSD libc code. Google has done it once already with their fork of musl in Fuchsia -- a project which I contributed significant amounts of free labor to in terms of tracking down folks for license clarification their lawyers wanted, only to have them never bother to ask me why technical things were done the way they were before making random useless and broken changes in their fork. A corporate-led project does not have to answer to the community, and will leave whatever bugs they introduce in place for the sake of bug-compatibility with their own software rather than fixing them.

Third, there is tremendous value in non-monoculture of libc implementations, or implementations of any important library interfaces or language runtimes. Likewise there's tremendous value in non-monoculture of tooling (compilers, linkers, etc.).
Avoiding monoculture preserves the motivation for consensus-based standards processes rather than single-party control (see also: Chrome and what it's done to the web) and the motivation for people writing software to write to the standards rather than to a particular implementation. A big part of making that possible is clear delineation of roles between parts of the toolchain and runtime, with well-defined interface boundaries. Some folks have told me that I should press LLVM to make musl the "LLVM libc" instead of whatever Google wants to do, but that misses the point: there *shouldn't be* an "LLVM libc", or any one library implementation that's "first class" for use with LLVM while others are only "second class".

So, in summary:

Point 1 is why making a libc for real-world use is not to be taken lightly.

Point 2 is why, if it is done, it shouldn't be a Google project.

Point 3 is why there should not be an "LLVM libc".

Hope this is all helpful.

Regards,

Rich
On Tue, Jun 25, 2019 at 2:34 PM Rich Felker via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Third, there is tremendous value in non-monoculture of libc
> implementations, or implementations of any important library
> interfaces or language runtimes. Likewise there's tremendous value in
> non-monoculture of tooling (compilers, linkers, etc.). Avoiding
> monoculture preserves the motivation for consensus-based standards
> processes rather than single-party control (see also: Chrome and what
> it's done to the web) and the motivation for people writing software
> to write to the standards rather than to a particular implementation.
> A big part of making that possible is clear delineation of roles
> between parts of the toolchain and runtime, with well-defined
> interface boundaries. Some folks have told me that I should press LLVM
> to make musl the "LLVM libc" instead of whatever Google wants to do,
> but that misses the point: there *shouldn't be* a "LLVM libc", or any
> one library implementation that's "first class" for use with LLVM
> while others are only "second class".

Doesn't having additional libc implementations to choose from contribute *to* the ideal of not having a monoculture? Also, I didn't read the proposal as segregating the world into first class and second class libc implementations. For example, libc++ currently works fine with non-LLVM-based toolchains, and libstdc++ currently works fine with LLVM-based toolchains. Do you see libc as fundamentally different in this regard?

Regarding your second point, if Google were to write a libc implementation and then upstream it in bulk, I would agree with you. But doing it in the open appears to solve the exact problem you are concerned about, which is that corporate interests will lead to lasting design decisions that aren't in the best interest of the general public.
By doing it in the open, such problems can be addressed before the code is ever committed.
I'm gonna let the folks working on this respond to technical points, but some meta points about discussion on this list...

On Tue, Jun 25, 2019 at 2:33 PM Rich Felker via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Since I have a little experience in this area, I'd like to chime in on
> it. :-) TL;DR I think it's a really, REALLY bad idea.

In case there is any confusion, I'm really glad you're participating in the discussion here because of this background.

> Second, corporate development teams are uniquely qualified to utterly
> botch a libc, yet still push it into widespread use, and the cost is
> painful compatibility hacks in all applications. Apple did this with
> their fork of BSD libc code. Google has done it once already with
> their fork of musl in Fuchsia

Let's keep this focused on technical issues and LLVM issues. None of the above (or the text in this paragraph I've snipped out) has anything to do with those, and I don't think the LLVM list is the right place to discuss that.

LLVM has a long and effective history of both individuals and corporations working effectively together in the open as part of the project. I don't think this project poses any risk there, much like Zach points out in his reply. Google is specifically discussing this early and trying to participate in the open process of the LLVM community from the outset. =]

Also, I'd suggest using more specific technical language than "botch" and "hacks" to make the discussion more productive.

With that, I'll wander off and let you all dig into the real issues here.

-Chandler
On Mon, Jun 24, 2019 at 3:37 PM Jake Ehrlich <jakehehrlich at google.com> wrote:
> disclaimer: I work at Google so don't take my +1 as an independent vote
> forward.
>
> We would like to use this on Fuchsia and I am particularly interested in
> creating a dynamic linking library for ELF with Roland McGrath's guidance.
> We spoke about creating a library for writing dynamic linkers internally
> and I don't see why this can't be upstreamed.

If dynamic linking support is added in an "as a library" fashion, so that it can easily be excluded if not required without affecting the rest of the system, I do not see any problems adding it.

> On Fuchsia we critically need support for AArch64; What do you expect to
> be architecture dependent? I struggled to think of where the architecture
> and not the operating system was the issue.

I think syscalls are an example of something architecture specific. Items like program startup and the PIE loader are, I think, operating system and executable format specific. Just for my knowledge, why is answering these questions at a general level important?
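To make the syscall point concrete, here is a minimal sketch of a zero-argument raw syscall on Linux. The instruction and the register that carries the syscall number differ between x86-64 and AArch64, while the caller-facing function stays the same. The `demo_` names are hypothetical, and the fallback branch simply defers to the host libc's `syscall()` so the sketch still builds elsewhere.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue a syscall that takes no arguments. Only this function is
 * architecture specific; callers above it are not. */
static long demo_syscall0(long nr)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    /* x86-64: number in rax, `syscall` clobbers rcx and r11. */
    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"(nr)
                     : "rcx", "r11", "memory");
    return ret;
#elif defined(__linux__) && defined(__aarch64__)
    /* AArch64: number in x8, result in x0, trap with `svc #0`. */
    register long x8 __asm__("x8") = nr;
    register long x0 __asm__("x0");
    __asm__ volatile("svc #0" : "=r"(x0) : "r"(x8) : "memory");
    return x0;
#else
    /* Assumption: anywhere else, lean on the system libc. */
    return syscall(nr);
#endif
}

/* Architecture-independent wrapper built on the primitive above. */
long demo_getpid(void)
{
    return demo_syscall0(SYS_getpid);
}
```

The syscall number itself (`SYS_getpid`) also varies per architecture, which is why it comes from a header here rather than being written as a literal.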
On Mon, Jun 24, 2019 at 3:45 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:
> On 6/24/19 5:23 PM, Siva Chandra via llvm-dev wrote:
> > Informal conversations with others within the LLVM community has told us
> > that a libc in LLVM is actually a broader need,
>
> +1 - This has also been my experience: Many people over many years have
> expressed a desire to have a libc as part of the LLVM project. It is
> currently a large gap in our LLVM toolchain offering. Moreover, from the
> standpoint of my organization, an LLVM libc could provide benefits on both
> production platforms and research/experimental hardware.
>
> > [...]
> > The project should be an exemplar of developing with LLVM tooling. Two
> > examples are fuzz testing from the start, and sanitizer-supported testing.
>
> Great.
>
> > There are also few areas which we do not intend to invest in at this point:
> >
> > 1. Implement dynamic loading and linking support.
>
> It will be useful to have a design document that describes the kind of
> system and capabilities that you're targeting, and then we can discuss how
> the libc might have a modular design that can be adapted for other use
> cases. I mention modularity because, for example, we have accelerator
> hardware and various kinds of low-variability/embedded environments where
> many, but not all, POSIX/libc capabilities make sense.

I am of the opinion that modularity should be as fine-grained as possible. For example, one should be able to pick and package individual functions into a libc as suitable for their platform. That said, I am open to other ideas you might have about modularity. I am also open to being convinced that function-level granularity is overkill.

> > 2. Support for more architectures (we'll start with just x86-64 for
> > simplicity).
> > [...]
> > So, what do you think about incorporating this new libc under the LLVM
> > project?
>
> This is something that I'd like to see.
>
> -Hal
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
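One way to picture function-level granularity is a build in which every entrypoint has its own switch. The sketch below is purely illustrative: the `DEMO_ENABLE_*` macros and `demo_` functions are invented for this example and do not describe how the proposed libc is or will be configured. It uses the host `strcmp` only to keep the lookup short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-function switches; a real build system would set
 * these per target. Here the target is assumed to have no stdio. */
#define DEMO_ENABLE_STRLEN 1
#define DEMO_ENABLE_MEMSET 1
#define DEMO_ENABLE_PUTS   0

#if DEMO_ENABLE_STRLEN
size_t demo_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
}
#endif

#if DEMO_ENABLE_MEMSET
void *demo_memset(void *dst, int value, size_t n)
{
    unsigned char *p = dst;
    while (n-- > 0)
        *p++ = (unsigned char)value;
    return dst;
}
#endif

/* Names of the entrypoints packaged into this particular build. */
static const char *const demo_entrypoints[] = {
#if DEMO_ENABLE_STRLEN
    "strlen",
#endif
#if DEMO_ENABLE_MEMSET
    "memset",
#endif
#if DEMO_ENABLE_PUTS
    "puts",
#endif
    NULL
};

/* Returns 1 if `name` was packaged into this build, 0 otherwise. */
int demo_has_entrypoint(const char *name)
{
    for (size_t i = 0; demo_entrypoints[i] != NULL; ++i)
        if (strcmp(demo_entrypoints[i], name) == 0)
            return 1;
    return 0;
}
```

The cost of this style is that every function must avoid hidden dependencies on functions that might be switched off, which is part of what the monolithic-design argument earlier in the thread is about.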
I agree with your point 1.

With regards to point 2, I think there's a difference between Fuchsia, which Google controls (where every check-in is authored by a Fuchsia eng and reviewed by another Fuchsia eng), and LLVM, which Google doesn't control. There's also a difference between Google in general, and the Fuchsia project, which I'd summarize as simply: Google is not a monoculture. Case in point: Jake, who works at Google, immediately countered Siva's suggestion that "Support for more architectures" is not something Google intends to invest in, by pointing out his need for AArch64 support. I work for Google too, and I personally need RISC-V support.

(Separately, I'm sorry to hear about your experience with Fuchsia's musl fork... though I've not worked on Fuchsia and have no knowledge of that situation and therefore won't say anything more about it.)

With regards to point 3, I agree with your points; in particular, I agree that it's important for there to be a variety of libc implementations. But it seems to me that while GNU has both gcc and glibc, gcc doesn't require the use of glibc, and I would anticipate that clang would never require llvmlibc. I would anticipate that a user would continue to have their choice of compiler, their choice of STL implementation, their choice of libc implementation. To the extent that there would be a "library implementation that's first-class for use with LLVM", I think there already is: glibc. But it would be better if there were two first-class implementations.

On Tue, Jun 25, 2019 at 2:34 PM Rich Felker via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> [...]
> So, in summary:
>
> Point 1 is why making a libc for real-world use is not to be taken
> lightly.
>
> Point 2 is why, if it is done, it shouldn't be a Google project.
>
> Point 3 is why there should not be an "LLVM libc".
On Tue, Jun 25, 2019 at 12:12 PM Rich Felker <dalias at libc.org> wrote:
> First, writing and maintaining a correct, compatible, high-quality
> libc is a monumental task.
>
> Point 1 is why making a libc for real-world use is not to be taken
> lightly.

We totally understand the magnitude of this undertaking :)

> Point 2 is why, if it is done, it shouldn't be a Google project.

The very point of my first email in this thread was to ask if this can be made part of the LLVM project, developed and maintained by the LLVM community.

> Point 3 is why there should not be an "LLVM libc".

If there can be a C++ standard library and runtime implementation as part of the LLVM project, I do not see a reason why there cannot be a libc implementation as part of the LLVM project.
On 6/24/19 6:23 PM, Siva Chandra via llvm-dev wrote:
> Within Google, we have a growing range of needs that existing libc
> implementations don't quite address.
>
> To be very clear: we don't expect our needs to exactly match everyone
> else's -- part of our impetus is to simplify things wherever we can, and
> that may not quite match what others want in a libc.
>
> There are also few areas which we do not intend to invest in at this point:
>
> Implement dynamic loading and linking support.
> Support for more architectures (we'll start with just x86-64 for
> simplicity).
>
> So, what do you think about incorporating this new libc under the LLVM
> project?

The null hypothesis is to not add a project to LLVM. In order to add a project, it should be justified. What are the justifications here? I've quoted the snippets above where it is made clear that Google's needs do *not* line up with the needs of the community. But the proposal failed to mention what the actual needs of Google are.

So what are they?

The current list of C ABI environments which LLVM recognizes is:

none
gnu
gnuabin32
gnuabi64
gnueabi
gnueabihf
gnux32
code16
eabi
eabihf
android
musl
musleabi
musleabihf
msvc
itanium
cygnus
coreclr
simulator

Would this proposed libc be adding a new C ABI environment to this list, or maintaining API/ABI compatibility with one or more of these?

Finally, I'm only aware of 2 operating systems where the libc is not an integral part of the system, which are Linux and Windows. For example on macOS, FreeBSD, OpenBSD, and DragonFlyBSD, the libc is guaranteed to be available, and must be dynamically linked, because this is the stable syscall ABI. So it would only make sense for an LLVM libc to be for Linux and Windows. It seems reasonable to assume that Google is only interested in Linux. In this case I have to re-iterate my original question: what are the needs that are not being met by existing Linux libcs, such as musl?

Regards,
Andrew
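For readers unfamiliar with the term, the "environment" in Andrew's list is the optional fourth component of a target triple of the form arch-vendor-os-environment, as in `x86_64-unknown-linux-musl`. The helper below is a deliberately naive sketch with a made-up name (`demo_triple_environment`). It only splits on dashes and does none of the normalization that LLVM's own triple handling performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the environment component of a four-part
 * triple, or NULL if the triple has fewer than four components.
 * Assumes the input is already in arch-vendor-os-environment form. */
const char *demo_triple_environment(const char *triple)
{
    const char *p = triple;
    for (int dashes = 0; dashes < 3; ++dashes) {
        p = strchr(p, '-');
        if (p == NULL)
            return NULL;
        ++p;  /* step past the dash */
    }
    return *p != '\0' ? p : NULL;
}
```

With this reading, Andrew's question is whether the new libc would show up as a new value in that last position or would aim to be interchangeable with an existing one such as `gnu` or `musl`.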
On Jun 26, 2019, at 9:02 AM, Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> The null hypothesis is to not add a project to LLVM. In order to add a
> project, it should be justified. What are the justifications here? I've
> quoted the snippets above where it is made clear that Google's needs do
> *not* line up with the needs of the community. But the proposal failed
> to mention what the actual needs of Google are.
>
> So what are they?

I really have nothing to do with this project, and no insight on the thoughts behind it, but I think you and several other people on this thread have missed a significant issue: the thread is conflating whether it is a good idea to "create yet another libc" with whether it is a good idea to "contribute that code to LLVM".

I don’t think arguing whether or not someone should build a project is on-topic for this list. Given that they appear motivated to build it, the question is whether this fits into the LLVM umbrella.

With my LLVM hat on (I also work for Google, but am unaffiliated and uninvolved with this proposal), it appears clearly beneficial for LLVM to have a libc if it were done well. That said, clang shouldn’t/couldn't *require* one specific libc, just like we don’t require libc++ as the standard library. We want LLVM components to be mixable and matchable.

I appreciate the comments on this thread that are throwing in ideas for how to make the project better, how to ensure it grows into a successful and widely useful component of LLVM, etc. I for one think that this could be very useful for people building custom micro targets, and being able to build custom configs of a libc without (e.g.) stdio or libm would be a nice way to shed weight.

-Chris
> On Jun 24, 2019, at 3:23 PM, Siva Chandra via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hello LLVM Developers, > > Within Google, we have a growing range of needs that existing libc implementations don't quite address. This is pushing us to start working on a new libc implementation. > > Informal conversations with others within the LLVM community has told us that a libc in LLVM is actually a broader need, and we are increasingly consolidating our toolchains around LLVM. Hence, we wanted to see if the LLVM project would be interested in us developing this upstream as part of the project. > > To be very clear: we don't expect our needs to exactly match everyone else's -- part of our impetus is to simplify things wherever we can, and that may not quite match what others want in a libc. That said, we do believe that the effort will still be directly beneficial and usable for the broader LLVM community, and may serve as a starting point for others in the community to flesh out an increasingly complete set of libc functionality. > > We are still in the early stages, but we do have some high-level goals and guiding principles of the initial scope we are interested in pursuing: > > The project should mesh with the "as a library" philosophy of the LLVM project: even though "the C Standard Library" is nominally "a library," most implementations are, in practice, quite monolithic. > The libc should support static non-PIE and static-PIE linking. This means, providing the CRT (the C runtime) and a PIE loader for static non-PIE and static-PIE linked executables. > If there is a specification, we should follow it. The scope that we need includes most of the C Standard Library; POSIX additions; and some necessary, system-specific extensions. 
This does not mean we should (or can) follow the entire specification -- there will be some parts which simply aren't worth implementing, and some parts which cannot be safely used in modern coding practice.I’d love to hear what you have in mind with point 3 above, and see it expanded. libc++ implements C++11 and subsequent standards, and that makes me wonder: Which standards would this libc implement? Would you implement upcoming C standards, and how would you manage “experimental” features (API changes, ABI changes, etc)? What parts of the standard wouldn’t you follow, why, how would the LLVM community determine this? Which parts aren’t worth implementing? Which parts cannot be safely used in modern coding practice? How would you remedy what’s perceived as “the bad parts”? I’d love it if the C Standards Committee, WG14, got renewed involvement through this project. Is that an explicit goal? Who will join WG14 in this effort? What part of C do you see this project help improve over time?> Vendor extensions must be considered very carefully, and only admitted when necessary. Similar to Clang and libc++, it does seem inevitable that we will need to provide some level of compatibility with other vendors' extensions. > The project should be an exemplar of developing with LLVM tooling. Two examples are fuzz testing from the start, and sanitizer-supported testing.How do you intend to test this C library? Fuzzing and all that is nice, but just straight conformance testing is what I’d like to hear about.> There are also few areas which we do not intend to invest in at this point: > > Implement dynamic loading and linking support. > Support for more architectures (we'll start with just x86-64 for simplicity). > > For these areas, the community is of course free to contribute. Our hope is that, preserving the "as a library" design philosophy will make such extensions easy, and allow retaining the simplicity when these features aren't needed. 
> We intend to build the new libc in a gradual manner. To begin with, the new libc will be a layer sitting between the application and the system libc. Eventually, when the implementation is sufficiently complete, it will be able to replace the system libc at least for some use cases and contexts.
>
> So, what do you think about incorporating this new libc under the LLVM project?
>
> Thank you,
> Siva Chandra and the rest of the Google LLVM contributors

Somewhat off-topic… this last line is unfortunate. It would be great not to sign as “the rest of the Google LLVM contributors” when subsequent responses show that many Google LLVM contributors aren’t co-signing this proposal (even if they’re interested!). A statement of scope and purpose within your organization would have been more helpful; as written, it sounds like all of Google is in agreement… which never happens 🙂

- JF (and not the rest of the Apple LLVM contributors 😉)
On Tue, Jun 25, 2019 at 2:53 AM Peter Smith <peter.smith at linaro.org> wrote:> > On Mon, 24 Jun 2019 at 23:23, Siva Chandra via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > > Hello LLVM Developers, > > > > > > Within Google, we have a growing range of needs that existing libc implementations don't quite address. This is pushing us to start working on a new libc implementation. > > > > Are you able to share what some of these needs are? My reason for > asking is to see if there is a particular niche where existing libc > designs are not working, or if there is an approach that will handle > many use cases better than existing libc implementations.There have been a lot of questions about our reasons for opting to build a new libc and why an existing libc implementation does not meet our needs. I will try to address these questions in a general fashion in this email. I will answer individual concerns separately. Before I start, I also want to apologize if I am being late to answer, or appearing to be ignoring some of the emails. I am not trying to ignore or avoid any one or any question - it is just that I need time to process your questions and compose meaningful answers. So, we have a bunch of reasons for a new libc and why we prefer it to be a part of the LLVM project: 1. Static linking without the complexity of dynamic linking - Most libc implementations end up being complicated because they support dynamic loading/linking. This is not bad by itself, but we want to be able to take out dynamic linking capability where possible and get the benefits of the much simpler system. We believe that building everything in a “as a library fashion” would facilitate this. 2. As somebody else has pointed out in the list, we want to have a libc with as much fine grained modularity as possible. This not only helps one to pick and choose what they want, but also makes it easy to adapt to different build systems. 
Moreover, such a modular system will also facilitate deploying chunks of functionality during the transition from another libc to this new libc. 3. Sanitizer supported testing and fuzz testing from the start - Doing this from the start will impact few design choices non-trivially. For example, sanitizers need that a target be rebuilt with sanitizer specific specialized options. We want to develop the new libc in such a fashion that it will work with these specialized options as well. 4. ABI independent implementation as far as possible - There will be places where it would not be possible to implement in an ABI independent fashion. However, wherever possible, we want to use normal source code so that compiler-based changes to the ABI are easy. Our reasons for ABI independent implementations fall into two categories: a) Long term changes to the ABI for security like SCADS, and for performance tuning like caller/callee register ratios to better match software and hardware. b) Rapid deployment of specific ABI changes as part of security mitigation strategies such as those for Spectre. For example, speculative load hardening would have vastly benefitted from being able to change the calling convention. 5. Avoid assembly language as far as possible - Again, there will be places where one cannot avoid assembly level implementations. But, wherever possible, we want to avoid assembly level implementations. There are a few reasons here as well: a) We want to leverage the compiler for performance wherever possible, and as part of the LLVM project, fix compiler bugs rather than use assembly. b) Enable sanitizers and coverage-based fuzzing to work well across the implementation of libc. c) Allow deploying compiler-based security mitigations such as those we needed for Spectre. 6. 
Having the support of the LLVM community, project, and infrastructure - From access to the broad platform expertise in the community to the strong license and project structure, we think the project will be significantly more successful as part of LLVM than elsewhere. All this does not mean we want to implement everything from scratch. If someone has implementations for parts of the libc ready, and would like to contribute to this project under the LLVM license, we will certainly welcome it.
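As a rough illustration of points 3 and 5 above, here is a minimal sketch of a byte-search routine written in plain C rather than assembly. The name `sketch_memchr` is hypothetical and is not taken from any actual libc proposal. The point is only that a routine written this way can be rebuilt with compiler instrumentation (for example with clang's `-fsanitize=address`), whereas a hand-written assembly version would not be instrumented by the compiler.

```c
#include <stddef.h>

/* Hypothetical plain-C byte search, in the spirit of memchr.
 * Because it is ordinary C, the compiler can instrument every load
 * when the file is rebuilt with sanitizer or coverage options. */
static void *sketch_memchr(const void *buf, int ch, size_t n) {
    const unsigned char *p = (const unsigned char *)buf;
    const unsigned char target = (unsigned char)ch;

    for (size_t i = 0; i < n; ++i) {
        if (p[i] == target) {
            /* Cast away const to match the conventional memchr signature. */
            return (void *)(p + i);
        }
    }
    return NULL; /* byte not found in the first n bytes */
}
```

Whether the generated code is competitive with a tuned assembly routine depends on the compiler, which is the trade-off point 5a accepts.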
On Wed, Jun 26, 2019 at 9:02 AM Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> It seems reasonable to assume that Google is only interested in Linux. In this case I have to re-iterate my original question, what are the needs that are not being met by existing Linux libcs, such as musl?

First of all, let me make this clear: musl is great, and I have used it personally to learn about how things work. We also evaluated the option of adopting musl and modifying it to suit our needs (I listed our rough goals here: http://lists.llvm.org/pipermail/llvm-dev/2019-June/133360.html). However, considering the disruptive nature of some of the changes we want to make (like ABI independence, sanitizer friendliness, modularity, avoiding complexity from dynamic linking where possible, etc.), the effort seemed comparable to building from the ground up. Building from the ground up also offered an opportunity to structure this as part of the overall LLVM project, which has lots of advantages of its own.

This does not mean we want to re-implement everything. If community members already have parts of libc implementations ready and want to contribute them to LLVM, that would be great. That could absolutely include parts of musl if the authors are interested in contributing them to LLVM, but some research indicated this wasn't likely (happy to be corrected if wrong though). We would just need to figure out how to add pieces incrementally, and support the goals outlined above.
[ I have worked on FreeBSD libc, so a few clarifications here: ] On 26/06/2019 17:02, Andrew Kelley via llvm-dev wrote:> Finally, I'm only aware of 2 operating systems where the libc is not an > integral part of the system, which is Linux and Windows. For example on > macOS, FreeBSD, OpenBSD, and DragonFlyBSD, the libc is guaranteed to be > available, and must be dynamically linked, because this is the stable > syscall ABI.Solaris and macOS (kind-of) belong on this list, but FreeBSD does not and I don't believe other BSDs do, though the situation is somewhat more complex. On FreeBSD, the system call ABI is stable and there are compat layers that allow foreign or legacy system call interfaces to be exposed to userspace processes (e.g. a FreeBSD 7 system call table on FreeBSD 12, or a Linux system call table on any FreeBSD. The Capsicum sandbox mode is also implemented in part by pivoting the system call layer: once you call cap_enter, some system calls are simply not exposed to you at all). There is even CloudABI, which uses a mostly musl-derived libc and a Capsicum-derived system call table. This is used for statically linked applications with a custom launcher that gives strong security guarantees. That said, the relationship between FreeBSD's libc, libthr (pthreads) and rtld are quite complex, as are their interactions with the kernel. Supporting dlopening libthr turned out to be incredibly hard to support in practice, but even without that, there is some complexity from the fact that libc must allow libthr to preempt a number of its symbols (and must provide implementations of things like pthread_mutex for programs that do not start threads). In the 5.x time frame, we did support two different pthreads implementations. This was, in hindsight, an absolutely terrible idea and not something that I'd ever recommend anyone do ever again. 
On macOS, libSystem is actually the public interface to the kernel, so you can bring along your own libc if you want to; you just have to dynamically link to libSystem to get access to system calls (or you do what Go did, try to make them without going via libSystem, and watch every single program written in your language die when the kernel's gettimeofday interface changes...). This, however, makes it difficult, if not effectively impossible, to bring your own dyld replacement to macOS, because it must be able to load libSystem without making any system calls...

> So it would only make sense for an LLVM libc to be for Linux and Windows. It seems reasonable to assume that Google is only interested in Linux. In this case I have to re-iterate my original question, what are the needs that are not being met by existing Linux libcs, such as musl?

I am also unconvinced that it is possible to design a clean platform abstraction layer for libc that would work over even Linux and FreeBSD without imposing significant penalties for one or the other. If you add Windows into the mix, then it gets a lot harder. POSIX's decision to use int, rather than a pointer type, for file descriptors and to make specific guarantees about reuse order (rather than just providing dup2 as a moderately sane interface) means that userspace code will need to implement the file descriptor table. Do we build higher-level layering on top of file descriptors or do we support Windows HANDLEs natively for internal usage and use fds only for public APIs?

The idea of an LLVM libc has been proposed a few times and generally the pushback has been that it doesn't make sense because libc is so intimately tied to the host kernel that it's very hard to consider it as a portable component.

David
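To make the file descriptor point concrete, here is a minimal sketch of what a userspace descriptor table might look like on a platform whose kernel hands out opaque handles. Everything in it is an assumption for illustration: `native_handle`, `fd_table`, the fixed table size, and the function names are hypothetical and do not come from any existing libc. The only behaviour taken from POSIX is that a newly allocated descriptor is the lowest-numbered one not currently open.

```c
#include <stddef.h>

/* Hypothetical stand-in for an opaque OS handle (e.g. a Windows HANDLE). */
typedef void *native_handle;

#define FD_TABLE_SIZE 64 /* arbitrary size chosen for the sketch */

struct fd_table {
    native_handle slots[FD_TABLE_SIZE]; /* NULL marks a free slot */
};

/* Allocate the lowest-numbered free descriptor for handle h.
 * Returns the descriptor, or -1 if h is NULL or the table is full. */
static int fd_alloc(struct fd_table *t, native_handle h) {
    if (h == NULL) {
        return -1;
    }
    for (int fd = 0; fd < FD_TABLE_SIZE; ++fd) {
        if (t->slots[fd] == NULL) {
            t->slots[fd] = h;
            return fd;
        }
    }
    return -1;
}

/* Translate a descriptor back to its handle, or NULL if it is not open. */
static native_handle fd_lookup(const struct fd_table *t, int fd) {
    if (fd < 0 || fd >= FD_TABLE_SIZE) {
        return NULL;
    }
    return t->slots[fd];
}

/* Free a descriptor so that a later fd_alloc can reuse its number. */
static int fd_release(struct fd_table *t, int fd) {
    if (fd < 0 || fd >= FD_TABLE_SIZE || t->slots[fd] == NULL) {
        return -1;
    }
    t->slots[fd] = NULL;
    return 0;
}
```

A real table would also need locking, growth, and per-descriptor flags; the sketch only shows why the integer namespace has to live in userspace when the kernel does not provide it.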
On 25.06.2019 21:12, Rich Felker via llvm-dev wrote:
> Since I have a little experience in this area, I'd like to chime in on it. :-) TL;DR I think it's a really, REALLY bad idea.

As a contributor to NetBSD libc, I don't see any benefits in the proposal. The mentioned motivations, like static linking and static PIE, are supported natively out of the box. Tighter sanitizer integration? NetBSD supports in-libc UBSan and whole-distribution sanitization (ASan, UBSan, TSan, MSan, in various degrees of completeness). Licensing issues? The implementation is (L)GPL-free...

NetBSD libc is an integral and inseparable part of the NetBSD distribution. We share the same code with the kernel, userland utilities, bootloader, and rump kernels.

Every kernel has its own specific syscall ABI layer, and thus parts like libpthread need to be implemented largely for each OS separately. Even every BSD is totally different here. Portable libdl? Not really, as we must support dynamic loader specifics on a per-OS basis. Furthermore, downstream OSs like NetBSD need downstream-specific behavior in the toolchain that is tightly integrated into the loader, libc, and kernel.

Reimplementing libc would be a tremendous amount of work for literally no gain.
On Mon, Jun 24, 2019 at 3:23 PM Siva Chandra via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hello LLVM Developers, > > Within Google, we have a growing range of needs that existing libc > implementations don't quite address. This is pushing us to start working on > a new libc implementation. > > Informal conversations with others within the LLVM community has told us > that a libc in LLVM is actually a broader need, and we are increasingly > consolidating our toolchains around LLVM. Hence, we wanted to see if the > LLVM project would be interested in us developing this upstream as part of > the project. > > To be very clear: we don't expect our needs to exactly match everyone > else's -- part of our impetus is to simplify things wherever we can, and > that may not quite match what others want in a libc. That said, we do > believe that the effort will still be directly beneficial and usable for > the broader LLVM community, and may serve as a starting point for others in > the community to flesh out an increasingly complete set of libc > functionality. > > We are still in the early stages, but we do have some high-level goals and > guiding principles of the initial scope we are interested in pursuing: > > > 1. > > The project should mesh with the "as a library" philosophy of the LLVM > project: even though "the C Standard Library" is nominally "a library," > most implementations are, in practice, quite monolithic. > 2. > > The libc should support static non-PIE and static-PIE linking. This > means, providing the CRT (the C runtime) and a PIE loader for static > non-PIE and static-PIE linked executables. > 3. > > If there is a specification, we should follow it. The scope that we > need includes most of the C Standard Library; POSIX additions; and some > necessary, system-specific extensions. 
This does not mean we should (or > can) follow the entire specification -- there will be some parts which > simply aren't worth implementing, and some parts which cannot be safely > used in modern coding practice. > >I don’t think that POSIX additions should be part of the core library. Not all interesting targets are POSIX: e.g. Windows. I think that POSIX should be a separate standalone library piece as you mention that dynamic loading should be downthread. I think that the only pieces that should be available in the core should be the C11 core specification. What parts of the C standard do you consider as not being worth implementing? If you are looking to implement “extensions” which replace the modern coding practices, does that mean that the surface really should be the MSVCRT implementation then? Because it does deprecate the “unsafe” routines in favour of safe versions (suffixed with `_s`). Additionally, you could always just implement the C standard annex and use those instead.> > 1. > > Vendor extensions must be considered very carefully, and only admitted > when necessary. Similar to Clang and libc++, it does seem inevitable that > we will need to provide some level of compatibility with other vendors' > extensions. > >How would this work for reasonable bodies of code which are built on Linux? e.g. Chrome does have Linux specific paths and I would be surprised if Chrome does not depend on any GNU behaviours.> > 1. > > The project should be an exemplar of developing with LLVM tooling. Two > examples are fuzz testing from the start, and sanitizer-supported testing. > > > There are also few areas which we do not intend to invest in at this point: > > > 1. > > Implement dynamic loading and linking support. > >If this is done as a “library” layer, then so should POSIX and the C99/C11 annexes.> > 1. > > Support for more architectures (we'll start with just x86-64 for > simplicity). 
I think that AArch64 is pretty core these days and leaving that out is pretty restrictive. At this point Windows AArch64 is an interesting target. With Linux AArch64 and Windows AArch64 becoming more mainstream, it seems like a poor design tradeoff to limit the target to Linux x86_64.

> For these areas, the community is of course free to contribute. Our hope is that, preserving the "as a library" design philosophy will make such extensions easy, and allow retaining the simplicity when these features aren't needed.
>
> We intend to build the new libc in a gradual manner. To begin with, the new libc will be a layer sitting between the application and the system libc. Eventually, when the implementation is sufficiently complete, it will be able to replace the system libc at least for some use cases and contexts.

This is really tricky and finicky to implement (I have done something like this in the past). On ELF you can interpose symbols, but on PE/COFF with two-level namespace binding, this needs to be statically resolved. Would the approach mean that symbols are interposed at compile time to ensure that they are fully redirected? How will you manage cross-domain memory once a malloc implementation is included into the library? What happens with threading? The general libc implementation would require that full threading is under its control - consider cases like the initial-exec (IE) model for TLS. This requires the loader to be aware of the modules and the full spacing. Another example where this starts to break down is with faulty - it was just a library layer that implemented compressed memory mapped library loading because a previous libc implementation - bionic - suffered from extensive issues including the inability to load more than a handful of modules. This is far from the only limitation of the bionic libc implementation, but this doesn’t seem like the appropriate forum for discussing the previous libc implementation attempts.
One other point of interest is how the loader integration would work. With glibc, the loader effectively embeds a copy of libc for itself, and has to dig through the kernel handoff (the auxiliary vector, auxv) to get the loader location. What happens with multiple object file formats? PE/COFF does not load the same way as ELF, and that difference may ripple through the rest of the library. The libc integration is needed for the resolution of symbols as well as for TLS.

> So, what do you think about incorporating this new libc under the LLVM project?

As stated, I really feel that this is far too specialised to certain use cases that are pertinent to Google. I think that this needs to be broadened to allow a general purpose libc much as libc++ is a general C++ implementation. I think that the project has a different set of requirements and seems like it would be extremely interesting to see how it would develop over time. This could really be an interesting choice for a certain type of project but as described feels like it is best explored outside of the umbrella of LLVM.

> Thank you,
> Siva Chandra and the rest of the Google LLVM contributors

-- Saleem Abdulrasool compnerd (at) compnerd (dot) org
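For readers unfamiliar with the kernel handoff mentioned above, the following is a small sketch of reading a few auxiliary vector entries. It assumes Linux with a C library that provides `getauxval` in `<sys/auxv.h>` (glibc and musl both do); the struct and function names are hypothetical. `AT_BASE` is the entry that holds the base address of the program interpreter, which is typically zero in a statically linked executable.

```c
#include <sys/auxv.h> /* getauxval and the AT_* constants; Linux-specific */

/* Hypothetical summary of a few auxiliary vector entries. */
struct auxv_summary {
    unsigned long interp_base; /* AT_BASE: base address of the interpreter */
    unsigned long phdr;        /* AT_PHDR: address of the program headers */
    unsigned long page_size;   /* AT_PAGESZ: system page size in bytes */
};

/* Collect the entries; getauxval returns 0 for an entry that is absent. */
static struct auxv_summary read_auxv_summary(void) {
    struct auxv_summary s;
    s.interp_base = getauxval(AT_BASE);
    s.phdr = getauxval(AT_PHDR);
    s.page_size = getauxval(AT_PAGESZ);
    return s;
}
```

A loader or a freestanding CRT cannot call `getauxval` itself and would instead walk the vector that the kernel places after the environment pointers on the initial stack; the sketch only shows which information is being retrieved.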
> On Jun 27, 2019, at 2:53 PM, Saleem Abdulrasool via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> > So, what do you think about incorporating this new libc under the LLVM project?
>
> As stated, I really feel that this is far too specialised to certain use cases that are pertinent to Google. I think that this needs to be broadened to allow a general purpose libc much as libc++ is a general C++ implementation. I think that the project has a different set of requirements and seems like it would be extremely interesting to see how it would develop over time. This could really be an interesting choice for a certain type of project but as described feels like it is best explored outside of the umbrella of LLVM.

I don't have a strong stake in this decision, but Saleem's commentary matches my thoughts on the topic. Maybe some of this is related to messaging: would the proposed project be *an* LLVM libc or *the* LLVM libc? There is already at least one instance within the LLVM umbrella where a subproject designed and built to a particular set of constraints became *the* LLVM solution, and ended up disincentivizing investment from contributors whose priorities didn't match those constraints. Staking the blessed-by-LLVM slot for a piece of the toolchain is not free.

To turn the question around, why should *this* libc (assuming it will be built whether or not LLVM accepts it) be *the* LLVM libc?

--Owen
On Wed, Jun 26, 2019 at 10:27 AM JF Bastien <jfbastien at apple.com> wrote:>> 3. If there is a specification, we should follow it. The scope that we need includes most of the C Standard Library; POSIX additions; and some necessary, system-specific extensions. This does not mean we should (or can) follow the entire specification -- there will be some parts which simply aren't worth implementing, and some parts which cannot be safely used in modern coding practice.> I’d love to hear what you have in mind with point 3 above, and see it expanded. libc++ implements C++11 and subsequent standards, and that makes me wonder: > > Which standards would this libc implement?We need parts of the C standard library, parts of the POSIX extensions, and also the linux headers. The community is of course free to widen the surface as needed.> Would you implement upcoming C standards, and how would you manage “experimental” features (API changes, ABI changes, etc)?We will probably take this up on an as-needed basis.> What parts of the standard wouldn’t you follow, why, how would the LLVM community determine this?I would think what we (the "we" here is for the developer community and not my company or my team) communicate would depend on how the project evolves. For example, at the very beginning, we will probably only say "large parts of the standards A, B, C are still unimplemented." When the implemented surface becomes large enough, we might start explicitly listing the unimplemented parts. There might be parts which require qualification with version numbers.> Which parts aren’t worth implementing? > Which parts cannot be safely used in modern coding practice? How would you remedy what’s perceived as “the bad parts”?At a certain level, what is worth and what is safe/unsafe is a subjective matter. 
So, instead of listing my opinions here, let me say this: If we build sufficient modularity into the libc, one will be able to pick and choose what they want, and omit what they do not want.> I’d love it if the C Standards Committee, WG14, got renewed involvement through this project. Is that an explicit goal? Who will join WG14 in this effort? > What part of C do you see this project help improve over time?The answer to this question also depends on how the project and the community around it evolves.> How do you intend to test this C library? Fuzzing and all that is nice, but just straight conformance testing is what I’d like to hear about.What kind of testing we want to do depends on what exactly is getting tested. But in general, we want to do conformance tests for sure. We also want to do some amount of differential testing between this new libc and an existing, battle tested libc. Depending on what is getting tested, we also want to be able to test against the test suite of an existing libc.
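The differential testing mentioned above could look roughly like the following sketch, which compares a candidate routine against the system libc over a set of inputs. `candidate_strlen` is a hypothetical stand-in for a function from the new libc, and `diff_test_strlen` is an illustrative harness name; neither is from the proposal.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical candidate implementation under test. */
static size_t candidate_strlen(const char *s) {
    const char *p = s;
    while (*p != '\0') {
        ++p;
    }
    return (size_t)(p - s);
}

/* Run the candidate and the system strlen on every input and
 * return the number of inputs on which the two disagree. */
static int diff_test_strlen(const char *const *inputs, size_t count) {
    int mismatches = 0;
    for (size_t i = 0; i < count; ++i) {
        if (candidate_strlen(inputs[i]) != strlen(inputs[i])) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

The same harness shape works with fuzzer-generated inputs in place of a fixed array, with the existing libc acting as the oracle. Conformance tests still have to be written against the specification itself, since the reference libc may share a bug with the candidate.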