Eric Blake
2020-Feb-18 21:31 UTC
Re: [Libguestfs] alternatives for hooking dlopen() without LD_LIBRARY_PATH or LD_AUDIT?
On 2/17/20 9:12 AM, Florian Weimer wrote:
> * Eric Blake:
>
>> I'm just now noticing that 'man ld' reports that you may pass '--audit
>> LIB' during linking to add a DT_AUDIT dependency on a library
>> implementing the audit interface, which sounds like it might be an
>> alternative to LD_AUDIT for getting a library with la_objsearch() to
>> actually do something (but doesn't obviate the need for la_objsearch()
>> to be in a separate library, rather than part of the main executable,
>> unless a library can be reused as its own audit library...).
>
> DT_AUDIT support has yet to be implemented in glibc:
>
> <https://sourceware.org/bugzilla/show_bug.cgi?id=24943>
> <https://sourceware.org/ml/libc-alpha/2019-08/msg00705.html>
>
> If you go on record saying that you need this, maybe someone will review
> the patch. Sorry. 8-(

Another followup: nbdkit-vddk-plugin.so is now using a re-exec setup [1],
for several reasons:

1. All our other ideas tried so far (such as a dlopen() interposition in
   the main nbdkit binary) touched more files and required more exported
   APIs; we managed to get re-exec working with changes limited to just
   the plugin (other than a minor change to nbdkit to quit messing with
   argv[] so that /proc/self/cmdline would stay stable - but that did not
   require a new API).

2. It turns out that overriding dlopen() was insufficient to work around
   VDDK's setup [2]. It _did_ solve the initial dlopen() performed by
   VDDK, but that library in turn had DT_NEEDED entries for bare library
   names, which a dlopen() override does NOT affect but which
   la_objsearch() should.

3. For nbdkit, we want to minimize the number of binary files shipped;
   the re-exec solution works with just the nbdkit binary and
   nbdkit-vddk-plugin.so. Any solution that requires a third file to be
   shipped (be that a shared library providing dlopen, or an LD_AUDIT
   library, or otherwise) is less palatable than the 2-binary approach
   that re-exec gives us.
[1] https://github.com/libguestfs/nbdkit/commit/0c7ac4e655b
[2] https://www.redhat.com/archives/libguestfs/2020-February/msg00184.html

So with that said, here's a question I just thought of:

If your patch for glibc support for DT_AUDIT is incorporated, is it
possible to mark a shared library as its own audit library via DT_AUDIT?
That is, if nbdkit-vddk-plugin.so can provide entry points for _both_
the nbdkit interface (which satisfies dlopen() from the nbdkit binary)
and la_version/la_objsearch() (which satisfy the requirements for use
from the audit code in ld.so), _and_ during the compilation of
nbdkit-vddk-plugin.so, we marked the library as its own DT_AUDIT entry,
would the mere act of dlopen("nbdkit-vddk-plugin.so") from nbdkit be
sufficient to trigger audit actions such as la_objsearch() on all
subsequent shared loads (whether by dlopen() or DT_NEEDED) performed by
nbdkit-vddk-plugin.so and its descendant loaded libraries?

Because if so, we would have a use case where a single binary, set up to
act as its own audit library, might be sufficient to hook the shared
object search path without needing any environment variable
modification, a process re-exec, or a third shipped binary - in which
case that would indeed be a nicer solution than the current re-exec
solution we committed today (of course, nbdkit would not be able to rely
on that solution except on systems with new enough glibc to support
DT_AUDIT).

I guess even without DT_AUDIT support, I could at least answer the
question of whether a single .so can satisfy both the dlopen() and
LD_AUDIT interfaces at once by setting LD_AUDIT (where the only
remaining gap is figuring out when glibc can let DT_AUDIT have the same
effect).
During my earlier attempts to get a working dlopen() override, I didn't
consider any solution that required setting LD_AUDIT, but now that I
have proved that a dlopen() override alone was not enough for the case
at hand, having to re-exec to set LD_AUDIT is no worse than having to
re-exec to set LD_LIBRARY_PATH as a fallback on systems where glibc does
not support DT_AUDIT.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
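For readers unfamiliar with the audit interface discussed in this message, the following is a minimal sketch of an audit library that provides la_version() and la_objsearch(), the two entry points named above. It is not the nbdkit code. VENDOR_LIBDIR and redirect_bare_name() are hypothetical names invented for illustration, and the strdup() of the result is a conservative assumption, since the thread does not say how long the returned string must stay valid.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical install location of the vendor libraries (an assumption). */
#define VENDOR_LIBDIR "/opt/vendor/lib64"

/* If name is a bare library name (contains no '/'), write "dir/name" into
 * out and return 1.  Return 0 for names that already contain a path, or
 * if the result would not fit in out. */
int redirect_bare_name(const char *dir, const char *name,
                       char *out, size_t outlen)
{
    int n;

    assert(dir != NULL && name != NULL && out != NULL);
    if (strchr(name, '/') != NULL)
        return 0;
    n = snprintf(out, outlen, "%s/%s", dir, name);
    return n > 0 && (size_t)n < outlen;
}

/* Called once by the dynamic linker; returning the offered version
 * accepts whatever audit interface version the linker supports. */
unsigned int la_version(unsigned int version)
{
    return version;
}

/* Called before the linker searches for a library.  Only the original
 * request (LA_SER_ORIG) for a bare name is redirected, and only when the
 * file really exists in VENDOR_LIBDIR; everything else is left alone. */
char *la_objsearch(const char *name, uintptr_t *cookie, unsigned int flag)
{
    char path[4096];
    char *copy;

    (void)cookie;
    if (flag != LA_SER_ORIG ||
        !redirect_bare_name(VENDOR_LIBDIR, name, path, sizeof path) ||
        access(path, R_OK) != 0)
        return (char *)name;

    /* Deliberately never freed: keeps the string valid for the linker. */
    copy = strdup(path);
    return copy != NULL ? copy : (char *)name;
}
```

Built as a shared object (for example with gcc -shared -fPIC), a library like this would be activated by setting LD_AUDIT to its path before the program starts, which is exactly the environment dependency the question above is trying to avoid.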
Florian Weimer
2020-Feb-21 12:19 UTC
Re: [Libguestfs] alternatives for hooking dlopen() without LD_LIBRARY_PATH or LD_AUDIT?
* Eric Blake:

> So with that said, here's a question I just thought of:
>
> If your patch for glibc support for DT_AUDIT is incorporated, is it
> possible to mark a shared library as its own audit library via
> DT_AUDIT? That is, if nbdkit-vddk-plugin.so can provide entry points
> for _both_ the nbdkit interface (which satisfies dlopen() from the
> nbdkit binary) and la_version/la_objsearch() (which satisfy the
> requirements for use from the audit code in ld.so), _and_ during the
> compilation of nbdkit-vddk-plugin.so, we marked the library as its own
> DT_AUDIT entry, would the mere act of dlopen("nbdkit-vddk-plugin.so")
> from nbdkit be sufficient to trigger audit actions such as
> la_objsearch() on all subsequent shared loads (whether by dlopen() or
> DT_NEEDED) performed by nbdkit-vddk-plugin.so and its descendant
> loaded libraries?

So you want to dlopen nbdkit-vddk-plugin.so and launch a new auditor
even if the process so far hasn't used auditing? And the main program
(which links against a library which eventually makes this dlopen call)
would not know anything about the existence of this specific plugin and
auditing?

This isn't currently supported. It's not just that the glibc
implementation cannot do it. The audit API (as sketched in <link.h>) is
not a good fit for late loading where you have never observed open
events. It pretty much assumes that auditors are loaded magically
*before* program start, so that they can observe all open calls and set
up their own data structures along the way.

I think what confuses me is that you keep talking about a single binary,
but clearly there is this separate vddk DSO, and there is talk of
plugins. So it seems to me that multiple files are involved already?

Thanks,
Florian
Eric Blake
2020-Feb-21 13:26 UTC
Re: [Libguestfs] alternatives for hooking dlopen() without LD_LIBRARY_PATH or LD_AUDIT?
On 2/21/20 6:19 AM, Florian Weimer wrote:
> * Eric Blake:
>
>> So with that said, here's a question I just thought of:
>>
>> If your patch for glibc support for DT_AUDIT is incorporated, is it
>> possible to mark a shared library as its own audit library via
>> DT_AUDIT? That is, if nbdkit-vddk-plugin.so can provide entry points
>> for _both_ the nbdkit interface (which satisfies dlopen() from the
>> nbdkit binary) and la_version/la_objsearch() (which satisfy the
>> requirements for use from the audit code in ld.so), _and_ during the
>> compilation of nbdkit-vddk-plugin.so, we marked the library as its own
>> DT_AUDIT entry, would the mere act of dlopen("nbdkit-vddk-plugin.so")
>> from nbdkit be sufficient to trigger audit actions such as
>> la_objsearch() on all subsequent shared loads (whether by dlopen() or
>> DT_NEEDED) performed by nbdkit-vddk-plugin.so and its descendant
>> loaded libraries?
>
> So you want to dlopen nbdkit-vddk-plugin.so and launch a new auditor
> even if the process so far hasn't used auditing? And the main program
> (which links against a library which eventually makes this dlopen call)
> would not know anything about the existence of this specific plugin and
> auditing?

Yes, you interpreted my question correctly.

>
> This isn't currently supported. It's not just that the glibc
> implementation cannot do it. The audit API (as sketched in <link.h>) is
> not a good fit for late loading where you have never observed open
> events. It pretty much assumes that auditors are loaded magically
> *before* program start, so that they can observe all open calls and set
> up their own data structures along the way.

The concern is not about nbdkit loading nbdkit-vddk-plugin.so, but about
nbdkit-vddk-plugin.so doing subsequent loads of libvixDiskLib.so and its
bare dependencies on libstdc++.so and such, which were incorrectly built
without DT_RUNPATH. We can't rewrite libvixDiskLib.so because it is
proprietary, so the best we can do is hook the loading environment
(either by la_objsearch or by re-exec with LD_LIBRARY_PATH).

>
> I think what confuses me is that you keep talking about a single binary,
> but clearly there is this separate vddk DSO, and there is talk of
> plugins. So it seems to me that multiple files are involved already?

You are correct that there are multiple files involved:

The nbdkit project currently has 2 relevant files: 'nbdkit' and
'nbdkit-vddk-plugin.so' (and various other plugins, but those are not
relevant to the VDDK use case).

The VDDK project from VMware has multiple files: libvixDiskLib.so
(primary interface), which dlopen()s libdiskLibPlugin.so, which in turn
has DT_NEEDED on libstdc++.so and several other recompiled system
libraries. 'find vmware-vix-disklib-distrib/lib64/ -type f | wc' shows
23 libraries total, but the end user installs it as a single tarball
from VMware.

We can't change what VDDK ships, but we want to avoid making the nbdkit
portion grow from 2 files to 3, as every additional file required
beyond what VMware ships is that much more burden for a user who chooses
nbdkit for accessing their VMware disks.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2020-Feb-21 14:51 UTC
Re: [Libguestfs] alternatives for hooking dlopen() without LD_LIBRARY_PATH or LD_AUDIT?
On Fri, Feb 21, 2020 at 01:19:34PM +0100, Florian Weimer wrote:
> I think what confuses me is that you keep talking about a single binary,
> but clearly there is this separate vddk DSO, and there is talk of
> plugins. So it seems to me that multiple files are involved already?

nbdkit is a standalone binary that happens to be able to load plugins
from a well-known path, e.g. nbdkit-vddk-plugin.so. nbdkit knows the
path for plugins, and there's a wrapper allowing it to get local plugins
even when it's still in the build directory. Adding another file would
mean another path (or overloading the meaning of the plugin path) and
just makes the whole thing more fragile and complex.

Having said all that, what would also solve this is either an API for
updating LD_LIBRARY_PATH after the program has started; or making
setenv ("LD_LIBRARY_PATH",...) DTRT*; or some kind of dlopen() variant
which takes a library path as an extra parameter.

Rich.

* “Why does setenv ("LD_LIBRARY_PATH") not work?” has several
stackoverflow answers. Apparently even the JDK has to work around this
by re-execing.
https://www.google.com/search?q=setenv+%22LD_LIBRARY_PATH%22+site:stackoverflow.com

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
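The re-exec workaround mentioned in this thread can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the nbdkit commit: the helper names prepend_path(), path_contains() and reexec_with_libdir() are hypothetical, the sketch takes argv as a parameter where the thread says the plugin recovers it from /proc/self/cmdline, and /proc/self/exe is Linux-specific. The exec is needed because the dynamic linker reads LD_LIBRARY_PATH at process startup, so a later setenv() only affects processes started afterwards.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write "dir" or "dir:old" into out.  Returns 0 on success, -1 if the
 * result does not fit. */
int prepend_path(const char *dir, const char *old, char *out, size_t outlen)
{
    int n;

    assert(dir != NULL && out != NULL);
    if (old != NULL && old[0] != '\0')
        n = snprintf(out, outlen, "%s:%s", dir, old);
    else
        n = snprintf(out, outlen, "%s", dir);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Return 1 if dir is one of the colon-separated components of path. */
int path_contains(const char *path, const char *dir)
{
    size_t len = strlen(dir);

    while (path != NULL && *path != '\0') {
        const char *end = strchr(path, ':');
        size_t n = end != NULL ? (size_t)(end - path) : strlen(path);

        if (n == len && strncmp(path, dir, len) == 0)
            return 1;
        path = end != NULL ? end + 1 : NULL;
    }
    return 0;
}

/* Re-exec the running program with dir prepended to LD_LIBRARY_PATH.
 * Returns 0 if dir is already present (so the re-exec happens only
 * once), -1 on failure, and does not return if the exec succeeds. */
int reexec_with_libdir(const char *dir, char **argv)
{
    char buf[4096];
    const char *old = getenv("LD_LIBRARY_PATH");

    if (path_contains(old, dir))
        return 0;
    if (prepend_path(dir, old, buf, sizeof buf) == -1)
        return -1;
    if (setenv("LD_LIBRARY_PATH", buf, 1) == -1)
        return -1;
    execv("/proc/self/exe", argv);
    return -1;
}
```

A caller would invoke reexec_with_libdir() with the VDDK library directory before the first dlopen() of the vendor library. The path_contains() check is what keeps the process from re-executing itself in a loop.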