thr3ads.net - Libguestfs - [Libguestfs] [PATCH libnbd] generator: Pass LISTEN_FDNAMES=nbd with systemd socket activation [Feb 2023]

Laszlo Ersek

2023-Feb-05 08:10 UTC

[Libguestfs] [PATCH libnbd] generator: Pass LISTEN_FDNAMES=nbd with systemd socket activation

On 2/1/23 17:17, Laszlo Ersek wrote:
> On 1/31/23 18:19, Eric Blake wrote:
>> The alternative to relying on execvp() to scan PATH is to pre-scan
>> PATH ourselves before fork().  I wish there were a helper function in
>> glibc that would quickly return the absolute path that execvp() would
>> otherwise utilize.
>
> We use execvp() in another spot -- "generator/states-connect.c".
>
> I'll try to come up with some new nbd_internal_fork_safe_*() APIs, for
> preparing and then "running" execvp.
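
As an illustration of the "pre-scan PATH before fork()" idea quoted above, here is a minimal sketch. The name resolve_in_path() is hypothetical (it is neither a glibc nor a libnbd function), and it only approximates execvp(): it returns the first regular, executable candidate and does not reproduce execvp()'s handling of ENOEXEC or a missing PATH. The PATH string is taken as a parameter so that the helper itself never calls getenv().

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of libnbd or glibc.
 * Resolve FILE against the colon-separated PATH_LIST, roughly the way
 * execvp() would, and store the first executable regular file in OUT.
 * Returns 0 on success, -1 if no candidate fits or OUT is too small.
 */
static int
resolve_in_path (const char *path_list, const char *file,
                 char *out, size_t out_len)
{
  const char *p = path_list;
  struct stat st;
  int n;

  /* Like execvp(), do not search PATH when FILE contains a slash. */
  if (strchr (file, '/') != NULL) {
    n = snprintf (out, out_len, "%s", file);
    return (n >= 0 && (size_t) n < out_len) ? 0 : -1;
  }

  for (;;) {
    size_t dir_len = strcspn (p, ":");

    /* An empty PATH element stands for the current directory. */
    if (dir_len == 0)
      n = snprintf (out, out_len, "./%s", file);
    else
      n = snprintf (out, out_len, "%.*s/%s", (int) dir_len, p, file);

    if (n >= 0 && (size_t) n < out_len &&
        stat (out, &st) == 0 && S_ISREG (st.st_mode) &&
        access (out, X_OK) == 0)
      return 0;

    if (p[dir_len] == '\0')
      break;
    p += dir_len + 1;
  }
  return -1;
}
```

The parent would call this before fork() and the child would then use execv() on the resolved path, which avoids the PATH scan (and any allocation) between fork() and exec.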

There's a problem.

getenv() is not (generally) thread-safe -- even if we called
getenv("PATH") before forking.

  https://pubs.opengroup.org/onlinepubs/9699919799/functions/getenv.html

An application may call a libnbd interface from one of its threads, and
call getenv() -- or setenv(), or unsetenv(), or putenv() -- from another
one of its threads. Because there is no synchronization around
*existent* getenv() calls in libnbd (LOGNAME, HOME, LIBNBD_DEBUG), this
would lead to undefined behavior.

We can't robustly pre-fetch these environment variables in a library
constructor function, like errors_init(), for two reasons:

- Client code that manipulates these variables intentionally would
  break, even if single-threaded. Example: "tests/debug-environment.c".

- dlopen() and dlsym() are thread-safe functions (all POSIX functions
  are, unless explicitly noted otherwise:
  <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_407>).
  If an application were to dlopen() libnbd, it could expect dlopen() to
  be thread-safe, and so no libnbd constructor function within would be
  permitted to call thread-unsafe functions.

A robust solution would call for an explicit libnbd initialization
function (where libnbd would snapshot the relevant environment variables
into global variables, and the application would be told, by way of
documentation, that this would have to be done before any threads are
created). But that breaks compatibility with existent libnbd
applications.
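
A rough sketch of what such an explicit initialization function could look like follows. All names here (example_env_init() and the accessors) are hypothetical; libnbd has no such API. The variables are copied into globals once, and later lookups read only those copies.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical snapshot storage; not part of libnbd. */
static char *snap_home;
static char *snap_logname;
static char *snap_debug;

/* Return a malloc()ed copy of NAME's value, or NULL if NAME is unset
 * or the allocation fails (a real implementation would tell these apart).
 */
static char *
dup_env (const char *name)
{
  const char *value = getenv (name);
  char *copy = NULL;

  if (value != NULL) {
    size_t len = strlen (value) + 1;
    copy = malloc (len);
    if (copy != NULL)
      memcpy (copy, value, len);
  }
  return copy;
}

/* To be called once, before the application creates any threads. */
static void
example_env_init (void)
{
  snap_home = dup_env ("HOME");
  snap_logname = dup_env ("LOGNAME");
  snap_debug = dup_env ("LIBNBD_DEBUG");
}

/* Accessors read the snapshot only; they never touch the environment. */
static const char *example_env_home (void) { return snap_home; }
static const char *example_env_logname (void) { return snap_logname; }
static const char *example_env_debug (void) { return snap_debug; }
```

The cost is the one named above: later changes to the environment are no longer seen, so existing applications that set LIBNBD_DEBUG after loading the library would change behavior.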

Another robust solution would be for libnbd to:

- expose a global mutex called "env_mutex" (possibly wrapped into an
  nbd_ accessor API, such as "nbd_lock_environment" or something),

- acquire the mutex around all getenv() and similar calls,

- document for any client application that any environment accesses
  anywhere in the application must be protected by this lock. In
  particular, because getenv() may *modify* internal state, its return
  value must be copied before releasing the mutex.
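
The mutex scheme in the list above could be sketched as follows. The lock name "env_mutex" comes from the text; the wrapper functions are hypothetical and exist only for illustration. Note that the value is copied while the lock is still held.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical process-wide lock that libnbd would expose. */
static pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Return a malloc()ed copy of NAME's value, or NULL if NAME is unset or
 * allocation fails.  The copy is made before the lock is released because
 * the pointer returned by getenv() may not stay valid afterwards.
 */
static char *
locked_getenv_copy (const char *name)
{
  char *copy = NULL;
  const char *value;

  pthread_mutex_lock (&env_mutex);
  value = getenv (name);
  if (value != NULL) {
    size_t len = strlen (value) + 1;
    copy = malloc (len);
    if (copy != NULL)
      memcpy (copy, value, len);
  }
  pthread_mutex_unlock (&env_mutex);
  return copy;
}

/* Writers take the same lock, so readers also exclude readers. */
static int
locked_setenv (const char *name, const char *value)
{
  int r;

  pthread_mutex_lock (&env_mutex);
  r = setenv (name, value, 1);
  pthread_mutex_unlock (&env_mutex);
  return r;
}
```

This only helps if every environment access in the process goes through the wrappers; a single unguarded getenv() or setenv() elsewhere defeats it.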

This is per POSIX:

> The returned string pointer might be invalidated or [...] the string
> content might be overwritten by a subsequent call to getenv

Linux/glibc is a bit more helpful:
<https://man7.org/linux/man-pages/man3/getenv.3.html> calls getenv()
"MT-Safe env", and that is explained as follows
<https://man7.org/linux/man-pages/man7/attributes.7.html>:

>        env    Functions marked with env as an MT-Safety issue access
>               the environment with getenv(3) or similar, without any
>               guards to ensure safety in the presence of concurrent
>               modifications.
>
>               We do not mark these functions as MT-Unsafe, however,
>               because functions that modify the environment are all
>               marked with const:env and regarded as unsafe.  Being
>               unsafe, the latter are not to be called when multiple
>               threads are running or asynchronous signals are enabled,
>               and so the environment can be considered effectively
>               constant in these contexts, which makes the former safe.

IOW the argument on Linux/glibc is that "getenv() will not break
getenv(), because it does not modify anything".

It still needs to be protected from writes: setenv() is marked as
"MT-Unsafe const:env"
<https://man7.org/linux/man-pages/man3/setenv.3.html>, and see the
following excerpt about "const:env", again from attributes(7):
>        const  Functions marked with const as an MT-Safety issue non-
>               atomically modify internal objects that are better
>               regarded as constant, because a substantial portion of the
>               GNU C Library accesses them without synchronization.
>               Unlike race, which causes both readers and writers of
>               internal objects to be regarded as MT-Unsafe, this mark is
>               applied to writers only.  Writers remain MT-Unsafe to
>               call, but the then-mandatory constness of objects they
>               modify enables readers to be regarded as MT-Safe (as long
>               as no other reasons for them to be unsafe remain), since
>               the lack of synchronization is not a problem when the
>               objects are effectively constant.
>
>               The identifier that follows the const mark will appear by
>               itself as a safety note in readers.  Programs that wish to
>               work around this safety issue, so as to call writers, may
>               use a non-recursive read-write lock associated with the
>               identifier, and guard all calls to functions marked with
>               const followed by the identifier with a write lock, and
>               all calls to functions marked with the identifier by
>               itself with a read lock.

I don't like it:

- it requires the same amount of work in the source code (both libnbd
  and client apps),

- readers excluding readers is safer per POSIX, where getenv() may be
  entirely MT-Unsafe,

- the environment is not a highly contended resource, so the performance
  impact of getenv() excluding getenv() is likely negligible.

Summary:
- ignore the whole thing?
- update the documentation, add locking APIs, implemented with a mutex?
- update the documentation, add locking APIs, implemented with an
  rwlock?

Please comment,
Laszlo

Richard W.M. Jones

2023-Feb-05 12:38 UTC

On Sun, Feb 05, 2023 at 09:10:29AM +0100, Laszlo Ersek wrote:
> - ignore the whole thing?

This one.

> - update the documentation, add locking APIs, implemented with a mutex?

I don't believe that would help since other libraries or the main
program might be calling setenv.  We cannot do anything just inside libnbd.

Rich.
> - update the documentation, add locking APIs, implemented with an
>   rwlock?
> 
> Please comment,
> Laszlo
-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

Eric Blake

2023-Feb-06 17:28 UTC

On Sun, Feb 05, 2023 at 09:10:29AM +0100, Laszlo Ersek wrote:
> On 2/1/23 17:17, Laszlo Ersek wrote:
> > On 1/31/23 18:19, Eric Blake wrote:
>
> >> The alternative to relying on execvp() to scan PATH is to pre-scan
> >> PATH ourselves before fork().  I wish there were a helper function in
> >> glibc that would quickly return the absolute path that execvp() would
> >> otherwise utilize.
> >
> > We use execvp() in another spot -- "generator/states-connect.c".
> >
> > I'll try to come up with some new nbd_internal_fork_safe_*() APIs, for
> > preparing and then "running" execvp.
> 
> There's a problem.
> 
> getenv() is not (generally) thread-safe -- even if we called
> getenv("PATH") before forking.
> 
>   https://pubs.opengroup.org/onlinepubs/9699919799/functions/getenv.html
> 
> An application may call a libnbd interface from one of its threads, and
> call getenv() -- or setenv(), or unsetenv(), or putenv() -- from another
> one of its threads. Because there is no synchronization around
> *existent* getenv() calls in libnbd (LOGNAME, HOME, LIBNBD_DEBUG), this
> would lead to undefined behavior.

You're right: In general, manipulating the environment in a
multi-threaded application is a risky proposition, as some libc
implementations can perform memory allocations to do things like
compact or (re-)sort the list of current variables even on a getenv()
action; although from the glibc documentation, it looks like the
better quality-of-implementation ones make getenv() non-modifying and
only putenv()/setenv()/unsetenv() can cause actual modifications to
the current list.  It doesn't help that POSIX allows the historical
practice where assignments to the global 'environ' cause libc to use
the resulting new environment correctly, even if the just-assigned
list is not sorted the way libc would have done had it been in charge
of all environment manipulations through the function interfaces.

> 
> We can't robustly pre-fetch these environment variables in a library
> constructor function, like errors_init(), for two reasons:
>
> - Client code that manipulates these variables intentionally would
>   break, even if single-threaded. Example: "tests/debug-environment.c".

Indeed - the fact that we have already documented the use of an
environment variable as a way to tweak debugging behavior of libnbd
from within a given process means that we don't want to go breaking
that behavior, in places where it is done safely.  Maybe we can
improve the documentation to remind users that using setenv()/putenv()
to add LIBNBD_DEBUG to the environment is best done at a point where
other threads in the process will not be accessing the environment,
but at most it would be just documentation and not something we can
add code to overcome.
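
A minimal sketch of that ordering on the application side follows. The worker below is only a stand-in for a thread that would call into libnbd; it reads the environment and never writes it. The function names are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a thread that would use libnbd: it only reads the
 * environment, which is safe while nobody modifies it concurrently.
 */
static void *
worker (void *arg)
{
  int *debug_on = arg;
  const char *value = getenv ("LIBNBD_DEBUG");

  *debug_on = value != NULL && strcmp (value, "1") == 0;
  return NULL;
}

/* Modify the environment first, while the process is still
 * single-threaded, and only then start threads.
 * Returns 1 if the worker saw LIBNBD_DEBUG=1, 0 if not, -1 on error.
 */
static int
run_with_debug (void)
{
  pthread_t thread;
  int debug_on = 0;

  if (setenv ("LIBNBD_DEBUG", "1", 1) == -1)
    return -1;
  if (pthread_create (&thread, NULL, worker, &debug_on) != 0)
    return -1;
  if (pthread_join (thread, NULL) != 0)
    return -1;
  return debug_on;
}
```

No environment writes happen after pthread_create(), so the reader thread never races with a writer.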
> 
> - dlopen() and dlsym() are thread-safe functions (all POSIX functions
>   are, unless explicitly noted otherwise
Yep, libraries in general should not modify the environment.  Reading
the environment is a different matter, though, and these days, it is
an acceptable (although probably under-documented) practice to assume
that as long as the main app doesn't modify the environment in one
thread while using a library in another, then the library can read
(but not modify) the environment.  This puts the burden of locking on
the application, without needing any additional interfaces into the
library.

...

>
> Linux/glibc is a bit more helpful:
> <https://man7.org/linux/man-pages/man3/getenv.3.html> calls getenv()
> "MT-Safe env", and that is explained as follows
> <https://man7.org/linux/man-pages/man7/attributes.7.html>:
> 
> >        env    Functions marked with env as an MT-Safety issue access
> >               the environment with getenv(3) or similar, without any
> >               guards to ensure safety in the presence of concurrent
> >               modifications.
> >
> >               We do not mark these functions as MT-Unsafe, however,
> >               because functions that modify the environment are all
> >               marked with const:env and regarded as unsafe.  Being
> >               unsafe, the latter are not to be called when multiple
> >               threads are running or asynchronous signals are enabled,
> >               and so the environment can be considered effectively
> >               constant in these contexts, which makes the former safe.
> 
> IOW the argument on Linux/glibc is that "getenv() will not break
> getenv(), because it does not modify anything".
Yes, and any good quality of implementation libc should do the same
(although I don't know for sure if the BSD-derived systems actually do
that).
> 
> It still needs to be protected from writes: setenv() is marked as
> "MT-Unsafe const:env"
> <https://man7.org/linux/man-pages/man3/setenv.3.html>, and see the
> following excerpt about "const:env", again from attributes(7):

But unless we write the environment in libnbd (which we should not be
doing), the burden of protecting writes is the application's job - the
application should not be calling into libnbd while also modifying the
environment, but libnbd doesn't need to do anything special to assist
an application in implementing that level of lockout.
> 
> I don't like it:
> 
> - it requires the same amount of work in the source code (both libnbd
>   and client apps),
> 
> - readers excluding readers is safer per POSIX, where getenv() may be
>   entirely MT-Unsafe,
> 
> - the environment is not a highly contended resource, so the performance
>   impact of getenv() excluding getenv() is likely negligible.
> 
> Summary:
> - ignore the whole thing?
This, or at most document that applications should not be modifying
the environment in parallel with calling libnbd functions.  That is,
I'm okay if we decide to be explicit that we rely on glibc's promise
of getenv() being MT-safe as long as the other 3 functions which
modify the environment are not used in parallel; but I'm also okay
with leaving that implicit in the .pod documentation where we can
refer people back to this thread in the archives if it ever actually
comes up in the future.
> - update the documentation, add locking APIs, implemented with a mutex?
> - update the documentation, add locking APIs, implemented with an
>   rwlock?
Too much work for no real gain.  Libnbd does not need to do either of
these, at least not without someone giving us good reason why they
can't implement proper locking themselves without libnbd's help.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
