> On 6/25/19 7:22 PM, Zachary Turner via llvm-dev wrote:
> > I foresee problems with this on both Windows and non-Windows. A
> > typical libc implementation has a lot of internal state that is shared
> > across API boundaries in a way that is considered an implementation
> > detail. So making assumptions about which state is shared and which
> > isn't is going to be a problem.

+1 for what Hal Finkel has said below about switching from redirectors
to implementations: There will be certain groups of functions which
will have to be switched all together. We will not be able to do it
one function at a time for such groups.

> > How do you guarantee that if you implement method A and forward method
> > B, that B will behave the same as it would have if you had forwarded A
> > also? It might not even work at all. Where can you safely draw this
> > boundary?

Are you talking about a scenario wherein implementation of B in the
system libc calls its A? If yes, most libc implementations do a good
job of using internal names in such scenarios. That is, B would call A
with an internal name. This ensures that B from the system libc calls
A also from the system libc and not the redirector/forwarder.

> > Users can set errno for example, and in many cases they must set errno
> > to 0 before invoking a call if they want to reliably detect an error.
> > So let's say they set errno to 0, then call a method which our libc
> > implementation decides to forward. What do we do? We could propagate
> > errno on every single call, but my point is that there are going to be
> > a ton of subtle issues that arise from this approach that are hard to
> > foresee, precisely because the implementation details of a libc
> > implementation are supposed to be just that - implementation details.

Dealing with errno in particular is probably not as nasty as it seems.
The standard allows errno to be a macro.
Hence, for the transitory phase, implementations and redirectors in our
libc can make use of the errno from the system libc. Something like this:

$> cat llvm-errno.cpp
#include <errno.h> // This is the system-libc header file

int *__llvm_errno() {
  return &errno;
}

$> cat errno.h # This is the llvm libc's errno.h
int *__llvm_errno();

#define errno (*__llvm_errno())

On Tue, Jun 25, 2019 at 6:20 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:
> You certainly can't mix-and-match on a per-function level, in general. I
> suspect that there are some subsystems that can be substituted. Using
> open from one libc and close from another seems problematic. Using open
> and close from one libc and qsort from another is probably fine. And, as
> you point out, the library might need to be configurable to use an
> externally-provided errno.
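
The errno indirection described above can be tried out in a single file. What follows is only a minimal sketch under stated assumptions: the names `sketch_errno_location`, `SKETCH_ERRNO`, `sketch_strtol` and `overflows_long` are hypothetical and do not come from the thread, and a differently named macro stands in for the new libc's `errno` so that the file does not redefine a standard macro. Giving the accessor C linkage is also an assumption made here.

```cpp
#include <cerrno>   // system libc's errno
#include <cstdlib>  // std::strtol

// Accessor compiled against the system libc header: it hands out the
// address of the system libc's (thread-local) errno.
// Hypothetical name; C linkage is an assumption of this sketch.
extern "C" int *sketch_errno_location() { return &errno; }

// Stand-in for what the new libc's errno.h would define as `errno`.
#define SKETCH_ERRNO (*sketch_errno_location())

// A redirector in the sense used in this thread: it only forwards to the
// system libc, which reports failures through the system errno.
long sketch_strtol(const char *text, char **end, int base) {
  return std::strtol(text, end, base);
}

// Caller-side pattern: clear errno, make the call, then inspect errno.
bool overflows_long(const char *text) {
  SKETCH_ERRNO = 0;
  (void)sketch_strtol(text, nullptr, 10);
  return SKETCH_ERRNO == ERANGE;
}
```

Because the macro and the forwarded call both resolve to the same object, a caller that zeroes errno before a forwarded call sees the value the system libc stored, with no copying on each call. For example, `overflows_long("999999999999999999999999999999")` reports an overflow while `overflows_long("42")` does not.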
Errno is an easy example, but perhaps not the best specifically because the standard dictates its behavior. But an implementation may have implicit assumptions as well.

I guess let me make this concrete: can you propose a specific separation that you have in mind?

Keep in mind that even if A doesn’t depend on B, that doesn’t mean that A and B can be separated. You mentioned that open() and close() would obviously have to be done at the same time, but it’s much worse than this: The *entire transitive closure* of open() and close() must be done at the same time, and my hypothesis is that this is going to a) be much larger than you expect, and b) be different with different underlying libc implementations.

Then there are more immediate issues. On Windows specifically, I’m not even sure it’s going to be physically possible to link in two copies of the CRT and have one forward to the other. If it is possible, it’s very non-obvious how to make it work and will likely require a ton of additional machinery.

On Wed, Jun 26, 2019 at 9:44 PM Siva Chandra <sivachandra at google.com> wrote:
> > On 6/25/19 7:22 PM, Zachary Turner via llvm-dev wrote:
> > > I foresee problems with this on both Windows and non-Windows. A
> > > typical libc implementation has a lot of internal state that is shared
> > > across API boundaries in a way that is considered an implementation
> > > detail. So making assumptions about which state is shared and which
> > > isn't is going to be a problem.
>
> +1 for what Hal Finkel has said below about switching from redirectors
> to implementations: There will be certain groups of functions which
> will have to be switched all together. We will not be able to do it
> one function at a time for such groups.
>
> > > How do you guarantee that if you implement method A and forward method
> > > B, that B will behave the same as it would have if you had forwarded A
> > > also? It might not even work at all. Where can you safely draw this
> > > boundary?
>
> Are you talking about a scenario wherein implementation of B in the
> system libc calls its A? If yes, most libc implementations do a good
> job of using internal names in such scenarios. That is, B would call A
> with an internal name. This ensures that B from the system libc calls
> A also from the system libc and not the redirector/forwarder.
>
> > > Users can set errno for example, and in many cases they must set errno
> > > to 0 before invoking a call if they want to reliably detect an error.
> > > So let's say they set errno to 0, then call a method which our libc
> > > implementation decides to forward. What do we do? We could propagate
> > > errno on every single call, but my point is that there are going to be
> > > a ton of subtle issues that arise from this approach that are hard to
> > > foresee, precisely because the implementation details of a libc
> > > implementation are supposed to be just that - implementation details.
>
> Dealing with errno in particular is probably not as nasty as it seems.
> The standard allows errno to be a macro. Hence, for the transitory
> phase, implementations and redirectors in our libc can make use of the
> errno from the system libc. Something like this:
>
> $> cat llvm-errno.cpp
> #include <errno.h> // This is the system-libc header file
>
> int *__llvm_errno() {
>   return &errno;
> }
>
> $> cat errno.h # This is the llvm libc's errno.h
> int *__llvm_errno();
>
> #define errno (*__llvm_errno())
>
> On Tue, Jun 25, 2019 at 6:20 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:
> > You certainly can't mix-and-match on a per-function level, in general. I
> > suspect that there are some subsystems that can be substituted. Using
> > open from one libc and close from another seems problematic. Using open
> > and close from one libc and qsort from another is probably fine. And, as
> > you point out, the library might need to be configurable to use an
> > externally-provided errno.
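
The transitive-closure concern raised in the message above can be illustrated with a toy model. This is a rough sketch, not a description of any real libc: the coupling pairs in `example_couplings` are invented for illustration, and `switch_group` and `Coupling` are hypothetical names. Couplings are treated as symmetric, which is how the sketch captures the point that A and B may be inseparable even when A does not call B, for instance because both touch the same internal state.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// One coupling: two symbols (functions or pieces of internal state)
// that must come from the same libc.
using Coupling = std::pair<std::string, std::string>;

// Returns every symbol that has to be switched together with `roots`,
// i.e. everything reachable from the roots over symmetric couplings.
std::set<std::string> switch_group(const std::vector<Coupling> &couplings,
                                   const std::vector<std::string> &roots) {
  // Build an undirected adjacency list.
  std::map<std::string, std::vector<std::string>> adjacent;
  for (const Coupling &c : couplings) {
    adjacent[c.first].push_back(c.second);
    adjacent[c.second].push_back(c.first);
  }

  // Breadth-first search from all roots at once.
  std::set<std::string> group(roots.begin(), roots.end());
  std::queue<std::string> pending;
  for (const std::string &root : roots) pending.push(root);
  while (!pending.empty()) {
    const std::string current = pending.front();
    pending.pop();
    const auto found = adjacent.find(current);
    if (found == adjacent.end()) continue;  // symbol has no couplings
    for (const std::string &next : found->second) {
      if (group.insert(next).second) pending.push(next);
    }
  }
  return group;
}

// Invented example data: NOT taken from glibc, musl, or any CRT.
std::vector<Coupling> example_couplings() {
  return {
      {"open", "fd_table"},       {"close", "fd_table"},
      {"read", "fd_table"},       {"fopen", "open"},
      {"fopen", "stdio_buffers"}, {"printf", "stdio_buffers"},
  };
}
```

With this made-up data, starting from `open` and `close` pulls in `read`, `fopen`, `printf` and both pieces of shared state, while `qsort` stays on its own. A different set of couplings, as a different underlying libc would have, yields a different group.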
On Thu, Jun 27, 2019 at 9:06 AM Zachary Turner <zturner at roblox.com> wrote:
> I guess let me make this concrete: can you propose a specific separation that you have in mind?
>
> Keep in mind that even if A doesn’t depend on B, that doesn’t mean that A and B can be separated. You mentioned that open() and close() would obviously have to be done at the same time, but it’s much worse than this: The *entire transitive closure* of open() and close() must be done at the same time, and my hypothesis is that this is going to a) be much larger than you expect, and b) be different with different underlying libc implementations.

Let me change the direction here a little bit. Let's say, for Windows,
you can develop the new libc starting from a clean slate without having
to worry about the redirectors/forwarders. Is that a good enough place
for you to start?

What I am getting at is this: redirectors are probably an
implementation detail at this point. We think they will allow us to
develop and phase in this libc in a gradual manner. But, if they end up
being a problem on other platforms, we will build them in such a way
that they stay Linux-specific implementation details. If other
platforms can benefit from them, they are of course free to adopt them.

> Then there are more immediate issues. On Windows specifically, I’m not even sure it’s going to be physically possible to link in two copies of the CRT and have one forward to the other. If it is possible, it’s very non-obvious how to make it work and will likely require a ton of additional machinery.

No, I do not think we want to mix up CRTs on any platform. At the
least, it will be disruptive to the compiler drivers. Our goal is to
build a CRT which supports statically linked executables on Linux. We
do not intend to mix this new CRT with the CRT from the system libc.
The new CRT might only be useful after a non-trivial part of the libc
has been built. Until then, we have to use the CRT from the system
libc.
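
One way to picture redirectors staying a Linux-specific detail is a platform-gated entry point. The sketch below is purely illustrative and assumes a layout the thread does not specify: the namespace `sketch_libc`, the function `abs_entry` and the flag `kUsesRedirector` are hypothetical names, and `abs` was picked only because it is small.

```cpp
#include <cstdlib>  // std::abs(int) from the system libc

namespace sketch_libc {

#if defined(__linux__)
// Transitional path: a redirector that forwards to the system libc.
inline int abs_entry(int x) { return std::abs(x); }
constexpr bool kUsesRedirector = true;
#else
// Clean-slate path for platforms that opt out of redirectors.
inline int abs_entry(int x) { return x < 0 ? -x : x; }
constexpr bool kUsesRedirector = false;
#endif

}  // namespace sketch_libc
```

Callers use `sketch_libc::abs_entry` on every platform; whether a redirector sits behind it is decided at build time and is not visible in the interface.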