thr3ads.net - llvm dev - [llvm-dev] A libc in LLVM [Jun 2019]

Zachary Turner via llvm-dev

2019-Jun-26 00:22 UTC

[llvm-dev] A libc in LLVM

I foresee problems with this on both Windows and non-Windows.  A
typical libc implementation has a lot of internal state that is shared
across API boundaries in a way that is considered an implementation
detail.  So making assumptions about which state is shared and which
isn't is going to be a problem.

How do you guarantee that if you implement method A and forward method
B, that B will behave the same as it would have if you had forwarded A
also?  It might not even work at all.  Where can you safely draw this
boundary?

Users can set errno for example, and in many cases they must set errno
to 0 before invoking a call if they want to reliably detect an error.
So let's say they set errno to 0, then call a method which our libc
implementation decides to forward.  What do we do?  We could propagate
errno on every single call, but my point is that there are going to be
a ton of subtle issues that arise from this approach that are hard to
foresee, precisely because the implementation details of a libc
implementation are supposed to be just that - implementation details.
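
For readers unfamiliar with the errno idiom mentioned above, here is a minimal sketch in C. The helper name parse_long and its signature are hypothetical and exist only for illustration. It relies on the standard behaviour of strtol, which returns LONG_MAX on overflow, so the caller can only tell overflow from a legitimate result by clearing errno first and checking it afterwards.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): parses a base-10 long and
   reports success or failure through *ok. */
long parse_long(const char *text, int *ok) {
  char *end = NULL;

  errno = 0; /* clear any stale error left by an earlier call */
  long value = strtol(text, &end, 10);

  /* LONG_MAX and LONG_MIN are legal results, so only errno can signal
     overflow. The other two checks reject empty and trailing input. */
  *ok = (errno == 0 && end != text && *end == '\0');
  return value;
}
```

If strtol were forwarded to a system libc that keeps its own errno, the store of 0 and the later read would have to reach the same object as the one strtol writes, which is the concern raised here.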

On Tue, Jun 25, 2019 at 5:01 PM Siva Chandra <sivachandra at google.com> wrote:
>
> On Tue, Jun 25, 2019 at 4:32 PM Zachary Turner <zturner at roblox.com> wrote:
>>
>> The main concern I have is that Windows is so different from
>> everything else that there is a high likelihood of decisions being
>> baked in early on that make things very difficult for people to come
>> along later and contribute a Windows implementation.  This happened
>> with sanitizers for example (lack of support for weak functions on
>> Windows), LLDB (posix api calls scattered throughout the codebase),
>> and I worry with libc it will be even more difficult to correctly
>> design the abstraction because we have to deal with executable file
>> format, syscalls, operating system loaders, and various linkage
>> models.
>>
>> The most immediate thing I think we will run into is that you
>> mentioned wanting this to take shape as something that sits in between
>> system libc and application.  Given that Windows' libc and other
>> versions of libc are so different, I expect this to lead to some
>> interesting problems.
>>
>> Can you elaborate more on how you envision this working with llvm libc
>> in between application and system libc?
>
>
> A typical application uses a large number of pieces from a libc. But, it is
> not practical to have everything implemented and ready in a new libc from day
> one. So for that phase, when the new libc is still being built, we want the
> unimplemented parts of the new libc to essentially redirect to the system libc.
> This brings two benefits:
>
> 1. We can build the new libc in a gradual manner.
> 2. Applications stay operational while gaining the benefits of the new
>    implementations.
>
> Do you foresee any problems with this approach on Windows?

Finkel, Hal J. via llvm-dev

2019-Jun-26 01:20 UTC

[llvm-dev] A libc in LLVM

On 6/25/19 7:22 PM, Zachary Turner via llvm-dev wrote:
> I foresee problems with this on both Windows and non-Windows.  A
> typical libc implementation has a lot of internal state that is shared
> across API boundaries in a way that is considered an implementation
> detail.  So making assumptions about which state is shared and which
> isn't is going to be a problem.
>
> How do you guarantee that if you implement method A and forward method
> B, that B will behave the same as it would have if you had forwarded A
> also?  It might not even work at all.  Where can you safely draw this
> boundary?
>
> Users can set errno for example, and in many cases they must set errno
> to 0 before invoking a call if they want to reliably detect an error.
> So let's say they set errno to 0, then call a method which our libc
> implementation decides to forward.  What do we do?  We could propagate
> errno on every single call, but my point is that there are going to be
> a ton of subtle issues that arise from this approach that are hard to
> foresee, precisely because the implementation details of a libc
> implementation are supposed to be just that - implementation details.

You certainly can't mix-and-match on a per-function level, in general. I 
suspect that there are some subsystems that can be substituted. Using 
open from one libc and close from another seems problematic. Using open 
and close from one libc and qsort from another is probably fine. And, as 
you point out, the library might need to be configurable to use an 
externally-provided errno.
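
As an illustration of why a routine like qsort is a plausible candidate for substitution, here is a minimal sketch in C. The names sketch_qsort and compare_ints are hypothetical, and the algorithm is a plain insertion sort standing in for a real implementation. The point is only that the function reads and writes nothing beyond its arguments: no file descriptor table, no stream buffers, no errno.

```c
#include <stddef.h>

/* Hypothetical stand-in for a replacement qsort. It uses the same
   signature as the standard function but is a simple insertion sort.
   It keeps no static state and never touches errno. */
void sketch_qsort(void *base, size_t count, size_t size,
                  int (*compare)(const void *, const void *)) {
  unsigned char *bytes = (unsigned char *)base;

  for (size_t i = 1; i < count; ++i) {
    /* Move element i left until its left neighbour is not greater. */
    for (size_t j = i; j > 0; --j) {
      unsigned char *left = bytes + (j - 1) * size;
      unsigned char *right = bytes + j * size;
      if (compare(left, right) <= 0)
        break;
      /* Swap the two elements byte by byte. */
      for (size_t k = 0; k < size; ++k) {
        unsigned char tmp = left[k];
        left[k] = right[k];
        right[k] = tmp;
      }
    }
  }
}

/* Comparison callback for int elements, as a caller would supply. */
int compare_ints(const void *a, const void *b) {
  int x = *(const int *)a;
  int y = *(const int *)b;
  return (x > y) - (x < y);
}
```

By contrast, open and close share the process's descriptor bookkeeping, and that kind of hidden coupling does not show up in a function signature.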

  -Hal

>
> On Tue, Jun 25, 2019 at 5:01 PM Siva Chandra <sivachandra at google.com> wrote:
>> On Tue, Jun 25, 2019 at 4:32 PM Zachary Turner <zturner at roblox.com> wrote:
>>> The main concern I have is that Windows is so different from
>>> everything else that there is a high likelihood of decisions being
>>> baked in early on that make things very difficult for people to come
>>> along later and contribute a Windows implementation.  This happened
>>> with sanitizers for example (lack of support for weak functions on
>>> Windows), LLDB (posix api calls scattered throughout the codebase),
>>> and I worry with libc it will be even more difficult to correctly
>>> design the abstraction because we have to deal with executable file
>>> format, syscalls, operating system loaders, and various linkage
>>> models.
>>>
>>> The most immediate thing I think we will run into is that you
>>> mentioned wanting this to take shape as something that sits in between
>>> system libc and application.  Given that Windows' libc and other
>>> versions of libc are so different, I expect this to lead to some
>>> interesting problems.
>>>
>>> Can you elaborate more on how you envision this working with llvm libc
>>> in between application and system libc?
>>
>> A typical application uses a large number of pieces from a libc. But, it is
>> not practical to have everything implemented and ready in a new libc from
>> day one. So for that phase, when the new libc is still being built, we want the
>> unimplemented parts of the new libc to essentially redirect to the system libc.
>> This brings two benefits:
>>
>> 1. We can build the new libc in a gradual manner.
>> 2. Applications stay operational while gaining the benefits of the new
>>    implementations.
>>
>> Do you foresee any problems with this approach on Windows?
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Siva Chandra via llvm-dev

2019-Jun-27 04:44 UTC

[llvm-dev] A libc in LLVM

> On 6/25/19 7:22 PM, Zachary Turner via llvm-dev wrote:
> > I foresee problems with this on both Windows and non-Windows.  A
> > typical libc implementation has a lot of internal state that is shared
> > across API boundaries in a way that is considered an implementation
> > detail.  So making assumptions about which state is shared and which
> > isn't is going to be a problem.
+1 for what Hal Finkel has said below about switching from redirectors
to implementations: There will be certain groups of functions which
will have to be switched all together. We will not be able to do it
one function at a time for such groups.
> > How do you guarantee that if you implement method A and forward method
> > B, that B will behave the same as it would have if you had forwarded A
> > also?  It might not even work at all.  Where can you safely draw this
> > boundary?
Are you talking about a scenario wherein the implementation of B in the
system libc calls its A? If yes, most libc implementations do a good
job of using internal names in such scenarios. That is, B would call A
with an internal name. This ensures that B from the system libc calls
A also from the system libc and not the redirector/forwarder.
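
The internal-name convention described above can be sketched in C as follows. All names here (internal_strlen, sketch_strlen, sketch_ends_with) are hypothetical, and a plain static function stands in for whatever mechanism a real libc uses to keep its internal symbol private. The public entry point A is a thin wrapper, and B calls the internal name rather than the public one.

```c
#include <stddef.h>

/* Internal name: only code inside this "library" can reach it, so no
   outside definition of the public symbol can take its place. */
static size_t internal_strlen(const char *s) {
  size_t n = 0;
  while (s[n] != '\0')
    ++n;
  return n;
}

/* Public entry point "A": a thin wrapper around the internal name. */
size_t sketch_strlen(const char *s) {
  return internal_strlen(s);
}

/* Public entry point "B": it needs a string length, and it calls the
   internal name rather than the public sketch_strlen. */
int sketch_ends_with(const char *s, const char *suffix) {
  size_t n = internal_strlen(s);
  size_t m = internal_strlen(suffix);
  if (m > n)
    return 0;
  const char *tail = s + (n - m);
  for (size_t i = 0; i < m; ++i) {
    if (tail[i] != suffix[i])
      return 0;
  }
  return 1;
}
```

With this layout, replacing the public sketch_strlen with a redirector or a new implementation does not change what sketch_ends_with does.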
> > Users can set errno for example, and in many cases they must set errno
> > to 0 before invoking a call if they want to reliably detect an error.
> > So let's say they set errno to 0, then call a method which our libc
> > implementation decides to forward.  What do we do?  We could propagate
> > errno on every single call, but my point is that there are going to be
> > a ton of subtle issues that arise from this approach that are hard to
> > foresee, precisely because the implementation details of a libc
> > implementation are supposed to be just that - implementation details.
Dealing with errno in particular is probably not as nasty as it seems.
The standard allows errno to be a macro. Hence, for the transitory
phase, implementations and redirectors in our libc can make use of the
errno from the system libc. Something like this:

$> cat llvm-errno.cpp
#include <errno.h>  // This is the system-libc header file

int *__llvm_errno() {
  return &errno;
}

$> cat errno.h  # This is the llvm libc's errno.h
int *__llvm_errno();

#define errno (*__llvm_errno())
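
To show how a redirector would interact with this arrangement, here is a single-file sketch in C. The names sketch_errno_location, SKETCH_ERRNO and sketch_strtol are hypothetical. They are used instead of redefining errno or using the reserved __llvm_errno name, so that the example compiles against an ordinary system libc. It assumes only that the system strtol sets errno to ERANGE on overflow, as the C standard requires.

```c
#include <errno.h>
#include <stdlib.h>

/* Accessor that hands out the address of the system libc's errno. */
int *sketch_errno_location(void) {
  return &errno;
}

/* What the new libc's errno.h would expose, under a different macro
   name so this file can coexist with the system header. */
#define SKETCH_ERRNO (*sketch_errno_location())

/* Hypothetical redirector: an unimplemented function that simply
   forwards to the system libc. */
long sketch_strtol(const char *text, char **end, int base) {
  return strtol(text, end, base);
}
```

Because SKETCH_ERRNO and the system errno name the same object, a caller can write `SKETCH_ERRNO = 0;`, call the redirector, and then read the error code that the system libc stored, with no copying step in between.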

On Tue, Jun 25, 2019 at 6:20 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:
> You certainly can't mix-and-match on a per-function level, in general. I
> suspect that there are some subsystems that can be substituted. Using
> open from one libc and close from another seems problematic. Using open
> and close from one libc and qsort from another is probably fine. And, as
> you point out, the library might need to be configurable to use an
> externally-provided errno.
