thr3ads.net - Libguestfs - [Libguestfs] [libnbd PATCH v3 09/29] lib/utils: introduce async-signal-safe execvpe() [Feb 2023]

Daniel P. Berrangé

2023-Feb-21 18:04 UTC

[Libguestfs] [libnbd PATCH v3 09/29] lib/utils: introduce async-signal-safe execvpe()

On Tue, Feb 21, 2023 at 06:53:39PM +0100, Laszlo Ersek wrote:
>
> More in general, this lesson tells me that POSIX is effectively
> irrelevant -- which is quite sad in itself; the bigger problem however
> is that *nothing replaces it*. If the one formal standard we have for
> portability does not reflect reality closely enough, and we need to rely
> on personal experience with various platforms, then we're back to where
> we were *before* POSIX. That is, having to check several separate
> documentation sets, and testing each API on every relevant platform in
> *each project* where the API is used. The idea is "ignore POSIX, care
> about Linux / modern systems only", but then it turns out those modern
> systems *do* differ sufficiently that extracting a common programming
> base *would* be useful. It's just that POSIX is not that common base;
> more precisely, there is no formalized, explicit common base. I guess
> "whatever passes CI" is the common base. That's... terrible, and it
> makes me seriously question if I want to program userspace in C at all.

FWIW, I wouldn't say that POSIX is irrelevant in general. If you
are trying to maximise portability it is worth paying attention to.

Rather I'd say that maintainers of projects may be opinionated about
which platforms they wish to support, to eliminate the burden of caring
about platforms which have few if any users in the modern world.

In libvirt and QEMU context we set explicit platform support targets:

  https://libvirt.org/platforms.html
  https://www.qemu.org/docs/master/about/build-platforms.html

which effectively limit us to only care about actively developed
OS from the last ~4 years, and even then only fairly mainstream
stuff. We don't care about a hobbyist/toy UNIX OS. The burden is
on other OS to attain compatibility with mainstream modern OS,
not for apps to adapt to obscure feature-poor platforms.

With this attitude, we don't care about compliance with countless
obsolete vendor UNIX platforms, and thus many of the edge cases
that POSIX worries about can be ignored. This frees the project
maintainers' time to focus on work that benefits a broader set of
users.

From this, libvirt/QEMU could both explicitly decide to not care
about any C compilers other than Clang/GCC.  Vendor compilers and
most especially MSVC are out of scope. Clang/GCC are able to support
any of the OS platforms we target. This frees us from caring about
ISO C standards, letting us use GNU extensions.
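
To illustrate what relying on a GNU extension can look like, here is a
minimal sketch using the `cleanup` variable attribute, which GCC and
Clang both accept but ISO C does not define. It is not taken from the
libvirt or QEMU code bases; the names `free_charp`, `AUTOFREE`,
`cleanup_calls` and `copy_length` are hypothetical and exist only for
this example.

```c
#include <stdlib.h>
#include <string.h>

/* Counts how often the cleanup handler ran; for illustration only. */
static int cleanup_calls;

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void
free_charp (char **p)
{
  free (*p);
  cleanup_calls++;
}

/* GNU extension (GCC/Clang): run free_charp() when the variable goes
 * out of scope. Hypothetical macro name. */
#define AUTOFREE __attribute__ ((cleanup (free_charp)))

/* Copies the string into a heap buffer and returns the copy's length.
 * There is no explicit free(): the attribute handles every return path. */
static size_t
copy_length (const char *s)
{
  size_t n = strlen (s);
  AUTOFREE char *copy = malloc (n + 1);

  if (copy == NULL)
    return 0;               /* handler still runs; free (NULL) is a no-op */
  memcpy (copy, s, n + 1);
  return strlen (copy);     /* handler runs after the value is computed */
}
```

A compiler without this extension would reject the attribute, which is
the trade-off described above: less code to write and maintain, in
exchange for a narrower set of supported compilers.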

AFAIK, libnbd/nbdkit haven't made a statement about what platforms
they aim to target. In my response I'm more or less assuming though
that you would only care about similar modern platforms to QEMU/libvirt,
and thus POSIX conformance would not be needed in all areas. Maybe
libnbd/nbdkit want to be more explicit about what they target as
platforms to make the portability requirements clear to contributors?

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Laszlo Ersek

2023-Feb-21 22:59 UTC

On 2/21/23 19:04, Daniel P. Berrangé wrote:
> AFAIK, libnbd/nbdkit haven't made a statement about what platforms
> they aim to target. In my response I'm more or less assuming though
> that you would only care about similar modern platforms to QEMU/libvirt,
> and thus POSIX conformance would not be needed in all areas. Maybe
> libnbd/nbdkit want to be more explicit about what they target as
> platforms to make the portability requirements clear to contributors?

libnbd's README.md requires

* Linux, FreeBSD or OpenBSD.
  Other OSes may also work but we have only tested these three.
* GCC or Clang
* GNU make
* bash
* [...]

nbdkit's README.md requires

* Linux, macOS, Windows, FreeBSD, OpenBSD or Haiku
* GCC or Clang
* bash
* GNU make
* [...]

To me, anything beyond Linux on those OS lists is entirely untestable
*locally*, hence my reliance on POSIX. CI is a horrible way (compared to
a published technical standard) to figure out whether each individual
interface works as needed everywhere, even just across this small set of
OSes. Having to look at multiple OS manual pages is just slightly less
horrible (and I consider those less trustworthy than POSIX; see again
the conflict between the linux man pages and the glibc documentation
from GNU). The POSIX people have done *huge work* to save us that effort.

Sticking with POSIX might make us work more (as in, write technically
superfluous code), but I've always felt fewer nasty surprises are
waiting to ambush us that way.

I don't think we have documentation that describes the broadest
intersection of these OSes specifically. (We don't even have
conflict-free documentation just for Linux!)

Laszlo

Laszlo Ersek

2023-Feb-22 08:48 UTC

Sorry about replying for the second time. After having slept on it (not
much, but some), some thoughts are emerging / being distilled about my
own attitude.

On 2/21/23 19:04, Daniel P. Berrangé wrote:
> On Tue, Feb 21, 2023 at 06:53:39PM +0100, Laszlo Ersek wrote:
>>
>> More in general, this lesson tells me that POSIX is effectively
>> irrelevant -- which is quite sad in itself; the bigger problem however
>> is that *nothing replaces it*. If the one formal standard we have for
>> portability does not reflect reality closely enough, and we need to rely
>> on personal experience with various platforms, then we're back to where
>> we were *before* POSIX. That is, having to check several separate
>> documentation sets, and testing each API on every relevant platform in
>> *each project* where the API is used. The idea is "ignore POSIX, care
>> about Linux / modern systems only", but then it turns out those modern
>> systems *do* differ sufficiently that extracting a common programming
>> base *would* be useful. It's just that POSIX is not that common base;
>> more precisely, there is no formalized, explicit common base. I guess
>> "whatever passes CI" is the common base. That's... terrible, and it
>> makes me seriously question if I want to program userspace in C at all.
> 
> FWIW, I wouldn't say that POSIX is irrelevant in general. If you
> are trying to maximise portability it is worth paying attention to.
> 
> Rather I'd say that maintainers of projects may be opinionated about
> which platforms they wish to support, to eliminate the burden of caring
> about platforms which have few if any users in the modern world.
> 
> In libvirt and QEMU context we set explicit platform support targets:
> 
>   https://libvirt.org/platforms.html
>   https://www.qemu.org/docs/master/about/build-platforms.html
> 
> which effectively limit us to only care about actively developed
> OS from the last ~4 years, and even then only fairly mainstream
> stuff. We don't care about a hobbyist/toy UNIX OS. The burden is
> on other OS to attain compatibility with mainstream modern OS,
> not for apps to adapt to obscure feature-poor platforms.
> 
> With this attitude, we don't care about compliance with countless
> obsolete vendor UNIX platforms, and thus many of the edge cases
> that POSIX worries about can be ignored. This frees the project
> maintainers' time to focus on work that benefits a broader set of
> users.
> 
> From this, libvirt/QEMU could both explicitly decide to not care
> about any C compilers other than Clang/GCC.  Vendor compilers and
> most especially MSVC are out of scope. Clang/GCC are able to support
> any of the OS platforms we target. This frees us from caring about
> ISO C standards, letting us use GNU extensions.

The attitude you describe above and my attitude are largely driven by
the same goal: target development as narrowly as possible.

Portability is essential in both cases; the big difference is in the
workflow chosen for achieving portability.

Approach #1: A number of OSes and a number of tools (compilers etc) are
hand-picked, based on "practical" factors. This set of components taken
as a whole does not have uniform, central documentation. Therefore,
development is driven by (a) continuously consulting multiple -- often
conflicting -- sets of documentation, and by (b) trial-and-error. By
"trial-and-error" I mean that a "CI pass" is taken as strong evidence of
absence of bugs, including portability bugs. The workflow relies heavily
on CI to root out portability bugs.

The advantage of this approach is that it deduces -- with documentation
reconciliation, trial-and-error, and compiler / OS / libc source code
investigation -- such a "common denominator" that is fairly likely the
*greatest* common denominator. Therefore less/simpler code has to be
written and maintained for feature and bugfix delivery.

The disadvantage is that there is no single source of truth; the
workflow is centered on reconciling incomplete and/or conflicting
documentation sets, and "happens to work in CI" is taken as the final
argument. CI is costly in computing time (energy), developer time
(waiting, bad presentation of results), and money (minutes are
expensive), and locally testing all targeted platforms is a huge chore.
CI development/management in itself consumes immense human effort.

Approach #2: target a published technical standard, as a single source
of truth. Still employ CI, but not as a guiding tool, more like "just in
case". CI failures originating from portability issues are not expected
in general. CI success is not taken as the primary evidence of lack of
portability bugs.

The advantage of this approach is that developers can focus on a single
source of truth, for driving development -- POSIX. Patching up
portability problems may occasionally be necessary, but that should be
the exception.

The disadvantage of this approach is that POSIX, while arguably a common
denominator, is almost certainly not the *greatest* common denominator.
Therefore, more code needs to be written and maintained, plus recent
developments that "eventually" appear in all of the targeted platforms /
tools, are not consumable until they become centrally standardized.

So, here's the thing: at a personal level, I can entirely identify with
approach #2, and I'm unable to identify with approach #1, as the
development workflow that I am supposed to follow and practice. To me,
being torn to pieces between 3-4 conflicting documentation sets, and
writing code such that the *primary metric* be "let me see if this
passes CI -- let me throw code at the wall and see what sticks" is
unbearable. Having to submit several tweaks in succession and waiting
tens of minutes for CI to finish every time, rules out software
development as a profession for me. (CI remains relevant anyway, but not
for dictating or driving portability decisions.) Having to test out
interfaces manually that are supposed to be standard, for determining
and exploiting their *accidental* greatest common denominator, again
rules out software development for me as a profession.

Such work *is* valuable, but it's called standardization / standards
development, not software development. I don't mind participating in
standards development, but the *output* of that activity needs to be a
*central formal standard* that programmers can rely upon in the future,
not some implicit understanding that gets encoded in / dispersed over a
bunch of disparate applications and libraries -- such as "we can call
execvp() here because our particular fork() version lets us" -- that
merely happen to target the same arbitrary set of platforms.

QEMU actually gets this *quite* right, with "devel/style.rst". It still
doesn't say anything about fork()/execvp() though, for example.

On the same note, I honestly think that the conflict between the linux
manual pages and the GNU manual, regarding the child process
restrictions, is *unforgivable*.

Note that I'm not trying to assert an objective truth here. All I'm
saying is I'm personally incompatible with approach #1. To me, *how* I
work is generally more important than *what* I achieve for users. Under
that umbrella, the justification for introducing our own
async-signal-safe execvpe in this patch is simply the fact that the
official documentations (plural) available on Linux are *inconsistent*
about fork()+execvp(). The fact that it "happens" to work in practice is
just happenstance. If you will, call this my denial of practical reality.

So: if the libnbd project can tolerate my attitude (approach #2), then
I'd like to proceed with this series (full scope), with me addressing
the v3 review feedback in v4, and so on. If not, then I'll abandon the
series, and try to make myself useful with something else -- where my
basic stance, towards whatever documentation I read, need not be *distrust*.

Laszlo
