Richard W.M. Jones
2023-May-08 21:33 UTC
[Libguestfs] [PATCH nbdkit] server: Add -D nbdkit.environ=1 to dump the environment
On Mon, May 08, 2023 at 08:54:48AM +0200, Laszlo Ersek wrote:> On 5/7/23 12:43, Richard W.M. Jones wrote: > > This is not secure so should not be used routinely. Also we do not > > attempt to filter environment variables, so even ones containing > > multiple lines or special characters are all sent to nbdkit_debug. > > > > The reason for adding this is to allow for debugging the new > > nbd_set_socket_activation_name(3) API added in libnbd 1.16. > > --- > > docs/nbdkit.pod | 10 ++++++++++ > > server/main.c | 28 ++++++++++++++++++++++++++++ > > 2 files changed, 38 insertions(+) > > > > diff --git a/docs/nbdkit.pod b/docs/nbdkit.pod > > index 83cf6a11c..22350cc08 100644 > > --- a/docs/nbdkit.pod > > +++ b/docs/nbdkit.pod > > @@ -619,6 +619,16 @@ are numerous and not usually very interesting. > > S<I<-D nbdkit.backend.controlpath=0>> suppresses the non-datapath > > commands (config, open, close, can_write, etc.) > > > > +=item B<-D nbdkit.environ=1> > > + > > +(not Windows) > > + > > +Print nbdkit's environment variables in the debug output at start up. > > +This is insecure because environment variables may contain both > > +sensitive and user-controlled information, so it should not be used > > +routinely. But it is useful for tracking down problems related to > > +environment variables. > > + > > =item B<-D nbdkit.tls.log=>N > > > > Enable TLS logging. C<N> can be in the range 0 (no logging) to 99. > > diff --git a/server/main.c b/server/main.c > > index 1df5d69ac..7b6adc3cc 100644 > > --- a/server/main.c > > +++ b/server/main.c > > @@ -210,6 +210,26 @@ dump_config (void) > > #endif > > } > > > > +#ifndef WIN32 > > + > > +/* -D nbdkit.environ=1 to dump the environment at start up. */ > > +NBDKIT_DLL_PUBLIC int nbdkit_debug_environ; > > + > > +#ifndef HAVE_ENVIRON_DECL > > +extern char **environ; > > +#endif > > + > > +static void > > +dump_environment (void) > > +{ > > + size_t i; > > + > > + for (i = 0; environ[i]; ++i) > > + nbdkit_debug ("%s", environ[i]); > > +} > > + > > +#endif /* !WIN32 */ > > + > > int > > main (int argc, char *argv[]) > > { > > @@ -662,6 +682,14 @@ main (int argc, char *argv[]) > > /* Check all debug flags were used, and free them. */ > > free_debug_flags (); > > > > +#ifndef WIN32 > > + /* Dump the environment if asked. This is the earliest we can do it > > + * because it uses a debug flag. > > + */ > > + if (nbdkit_debug_environ && verbose) > > + dump_environment (); > > +#endif > > + > > if (help) { > > struct backend *b; > > > > I was surprised not to see any assignment to the new global variable, > but then I found "server/debug-flags.c"; that's some serious wizardry.With the side effect, as it turns out, that it's very hard to implement for language plugins. Rust is fine. OCaml can be done by adding a C object to the link. Other languages, not really :-(> Side thought: should we not make both "nbdkit_debug_environ" (and > friends), and the target of the "sym" pointer in "apply_debug_flags", > volatile? Aliasing doesn't get much more obscure than this, and I wonder > if this has the potential to blow up later, when compilers (and > processors) get even more clever. > > Of course POSIX could always say "this is just required to work", which > I guess would be fine, but currently it doesn't seem to say it.I guess Eric will know for sure, but isn't it generally true that all global variables can be modified by any function call? My model is that they can modified at any time by any called function; even through a pointer so not really visible to the compiler.> Anyway: > > Acked-by: Laszlo Ersek <lersek at redhat.com> > > Thanks for writing the patch! > LaszloRich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
Eric Blake
2023-May-08 21:52 UTC
[Libguestfs] [PATCH nbdkit] server: Add -D nbdkit.environ=1 to dump the environment
On Mon, May 08, 2023 at 10:33:25PM +0100, Richard W.M. Jones wrote:> > > > > > > I was surprised not to see any assignment to the new global variable, > > but then I found "server/debug-flags.c"; that's some serious wizardry. > > With the side effect, as it turns out, that it's very hard to > implement for language plugins. Rust is fine. OCaml can be done by > adding a C object to the link. Other languages, not really :-(And not today's problem to solve.> > > Side thought: should we not make both "nbdkit_debug_environ" (and > > friends), and the target of the "sym" pointer in "apply_debug_flags", > > volatile? Aliasing doesn't get much more obscure than this, and I wonder > > if this has the potential to blow up later, when compilers (and > > processors) get even more clever. > > > > Of course POSIX could always say "this is just required to work", which > > I guess would be fine, but currently it doesn't seem to say it. > > I guess Eric will know for sure, but isn't it generally true that all > global variables can be modified by any function call? My model is > that they can modified at any time by any called function; even > through a pointer so not really visible to the compiler.Marking a variable as volatile will force the compiler to reload it on each access, which can slow down loops that read the debug variable. But in general, these are not variables we are frequently changing (we do it at most once, during startup, based on the environment); and a compiler can't completely eliminate global variables unless it is doing some heavy LTO. I see no reason why we would want to mark it volatile, because the only thing that would guarantee is that the compiler no longer hoists repeated reads of the globals outside of loops, even though our loops won't be changing the debug state. I suppose something like LTO could provide a compiler an opportunity to say that no .o linked into the final image modifies the variable directly, which may hurt nbdkit.* debug variables in the main executable, but I don't see how the compiler could ever optimize away access to globals in .so plugins (since the very nature of .so's is that they will be dynamically run by code that the compiler didn't see). This is all waaaaay outside of the bounds of what POSIX says, so at best we are just relying on "does it work for how we are using it" - but since we are only using it for debugging, where it is doing what we want, I'm not too worried about trying to make this comply with some unwritten spec. Sometimes, looking the other way when there is dark magic present is the prudent action. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Laszlo Ersek
2023-May-09 06:56 UTC
[Libguestfs] [PATCH nbdkit] server: Add -D nbdkit.environ=1 to dump the environment
On 5/8/23 23:33, Richard W.M. Jones wrote:> On Mon, May 08, 2023 at 08:54:48AM +0200, Laszlo Ersek wrote: >> On 5/7/23 12:43, Richard W.M. Jones wrote: >>> This is not secure so should not be used routinely. Also we do not >>> attempt to filter environment variables, so even ones containing >>> multiple lines or special characters are all sent to nbdkit_debug. >>> >>> The reason for adding this is to allow for debugging the new >>> nbd_set_socket_activation_name(3) API added in libnbd 1.16. >>> --- >>> docs/nbdkit.pod | 10 ++++++++++ >>> server/main.c | 28 ++++++++++++++++++++++++++++ >>> 2 files changed, 38 insertions(+) >>> >>> diff --git a/docs/nbdkit.pod b/docs/nbdkit.pod >>> index 83cf6a11c..22350cc08 100644 >>> --- a/docs/nbdkit.pod >>> +++ b/docs/nbdkit.pod >>> @@ -619,6 +619,16 @@ are numerous and not usually very interesting. >>> S<I<-D nbdkit.backend.controlpath=0>> suppresses the non-datapath >>> commands (config, open, close, can_write, etc.) >>> >>> +=item B<-D nbdkit.environ=1> >>> + >>> +(not Windows) >>> + >>> +Print nbdkit's environment variables in the debug output at start up. >>> +This is insecure because environment variables may contain both >>> +sensitive and user-controlled information, so it should not be used >>> +routinely. But it is useful for tracking down problems related to >>> +environment variables. >>> + >>> =item B<-D nbdkit.tls.log=>N >>> >>> Enable TLS logging. C<N> can be in the range 0 (no logging) to 99. >>> diff --git a/server/main.c b/server/main.c >>> index 1df5d69ac..7b6adc3cc 100644 >>> --- a/server/main.c >>> +++ b/server/main.c >>> @@ -210,6 +210,26 @@ dump_config (void) >>> #endif >>> } >>> >>> +#ifndef WIN32 >>> + >>> +/* -D nbdkit.environ=1 to dump the environment at start up. */ >>> +NBDKIT_DLL_PUBLIC int nbdkit_debug_environ; >>> + >>> +#ifndef HAVE_ENVIRON_DECL >>> +extern char **environ; >>> +#endif >>> + >>> +static void >>> +dump_environment (void) >>> +{ >>> + size_t i; >>> + >>> + for (i = 0; environ[i]; ++i) >>> + nbdkit_debug ("%s", environ[i]); >>> +} >>> + >>> +#endif /* !WIN32 */ >>> + >>> int >>> main (int argc, char *argv[]) >>> { >>> @@ -662,6 +682,14 @@ main (int argc, char *argv[]) >>> /* Check all debug flags were used, and free them. */ >>> free_debug_flags (); >>> >>> +#ifndef WIN32 >>> + /* Dump the environment if asked. This is the earliest we can do it >>> + * because it uses a debug flag. >>> + */ >>> + if (nbdkit_debug_environ && verbose) >>> + dump_environment (); >>> +#endif >>> + >>> if (help) { >>> struct backend *b; >>> >> >> I was surprised not to see any assignment to the new global variable, >> but then I found "server/debug-flags.c"; that's some serious wizardry. > > With the side effect, as it turns out, that it's very hard to > implement for language plugins. Rust is fine. OCaml can be done by > adding a C object to the link. Other languages, not really :-(Hm. I didn't even realize it was being done like this for plugins' sake. I assumed it was there to avoid boilerplate code. I figure an alternative would be for -D to stash the flags in a dictionary (or list) in the main program, and then each plugin / filter could access that data structure (or even get a reference to it in (one of) the entry point function(s)). IIRC, if you build the main program with -rdynamic, then dynamically loaded objects can use the main program for resolving their symbols... Indeed, nbdkit already seems to use -rdynamic; at least "configure.ac" checks for it.> >> Side thought: should we not make both "nbdkit_debug_environ" (and >> friends), and the target of the "sym" pointer in "apply_debug_flags", >> volatile? Aliasing doesn't get much more obscure than this, and I wonder >> if this has the potential to blow up later, when compilers (and >> processors) get even more clever. >> >> Of course POSIX could always say "this is just required to work", which >> I guess would be fine, but currently it doesn't seem to say it. > > I guess Eric will know for sure, but isn't it generally true that all > global variables can be modified by any function call? My model is > that they can modified at any time by any called function; even > through a pointer so not really visible to the compiler.Well to express it with some hand-waving, modifying globals through pointers are of course fine, but it's about "pointer provenance". The dlsym() use here isn't much different from reading a pointer from a FILE* with fscanf("%p"), casting the result to int*, then de-referencing the result. Collecting the addresses of global variables in a list and passing the list to another function (even in a dl-opened object) is different. Just because the object is a global variable, concerns are not immmediately removed. For example, per POSIX, if you get a signal delivered (truly) asynchronously, you can only mostly assign a value to a "volatile sig_atomic_t" global variable in the signal catching function. (I'm not being exact here, see the spec for details.) The spec doesn't explain (as far as I understand), but the "sig_atomic_t" type is required so that no matter at what instruciton boundary the signal is delivered, the variable can never be modified halfway, and the "volatile" is there for the same visibility concern that I'm raising here. Compilers have been getting more aggressive in exploiting the "strict aliasing" rules; that was all I wanted to raise. I'll comment some more under Eric's response. Thanks! Laszlo> >> Anyway: >> >> Acked-by: Laszlo Ersek <lersek at redhat.com> >> >> Thanks for writing the patch! >> Laszlo > > Rich. >