I'm trying to test a trivial patch to set FD_CLOEXEC on the flintlock lockfd when using F_OFD_SETLK and am running into a build failure even in an unpatched state. This is on a Debian jessie amd64 system building the jessie-backports version. The stable version (1.2.19-1+deb8u1) works and builds fine using the same commands, however I also encountered a similar build problem in a sid chroot with 1.2.23-1 $ debian/rules build && fakeroot debian/rules binary : ...ends with the following: dh_install --sourcedir=debian/tmp --fail-missing dh_install: libxapian22 missing files (usr/lib/*/libxapian.so.*), aborting debian/rules:171: recipe for target 'install-stamp' failed make: *** [install-stamp] Error 20 There's a couple of possibly-relevant lines near the top: libtool: warning: 'libxapian.la' has not been installed in '/usr/lib' libtool: install: /usr/bin/install -c bin/.libs/xapian-check /tmp/xapian-core-1. 2.22/debian/tmp/usr/bin/xapian-check libtool: warning: 'libxapian.la' has not been installed in '/usr/lib' libtool: install: /usr/bin/install -c bin/.libs/xapian-compact /tmp/xapian-core- 1.2.22/debian/tmp/usr/bin/xapian-compact libtool: warning: 'libxapian.la' has not been installed in '/usr/lib' libtool: install: /usr/bin/install -c bin/.libs/xapian-inspect /tmp/xapian-core- 1.2.22/debian/tmp/usr/bin/xapian-inspect libtool: warning: 'libxapian.la' has not been installed in '/usr/lib' Not sure what's wrong, yet... I'm wondering of anybody else has seen this. Full log available at: https://80x24.org/spew/20160806024934.10039-1-e at 80x24.org/ (HTML) or as gzipped mboxrd https://80x24.org/spew/20160806024934.10039-1-e at 80x24.org/t.mbox.gz
Eric Wong <e at 80x24.org> wrote:> I'm trying to test a trivial patch to set FD_CLOEXEC on the > flintlock lockfd when using F_OFD_SETLKFwiw, this is the patch I was originally going to test. (but now I see maybe my F_SETFD might only need to be called on F_OFD_SETLK success) Description at bottom. --- a/backends/flint_lock.cc +++ b/backends/flint_lock.cc @@ -108,6 +108,10 @@ FlintLock::lock(bool exclusive, string & explanation) { return ((errno == EMFILE || errno == ENFILE) ? FDLIMIT : UNKNOWN); } +#ifdef FD_CLOEXEC + (void)fcntl(lockfd, F_SETFD, FD_CLOEXEC); +#endif + #ifdef F_OFD_SETLK // F_OFD_SETLK has exactly the semantics we want, so use it if it's // available. Support was added in Linux 3.15, and there's work on --- The problem I'm attempting to solve comes from wanting doing an inplace full reindex of a giant git mail repo[1] while not stalling small incremental updates which happen as mail flows in. So in the reindexer, I start "git log" once for the entire history (which feeds data for git cat-file --batch, which feeds Xapian); and commit+close the Xapian::WritableDatabase every 100 entries. This allows more time-critical processes doing incremental updates to have a chance to index its data sooner rather than waiting 20-30 minutes for the full reindex to finish. The "git log" process is started once, after opening the Xapian DB (I need to open the Xapian DB to get the last-indexed-commit from metadata); but it seems the lack of close-on-exec for the flintlock FD with F_OFD_* usage causes the lock to remain held after closing the Xapian DB... [1] this is an archive of the git mailing list since 2005: https://public-inbox.org/git/20160710004813.GA20210 at dcvr.yhbt.net/T/
On Sat, Aug 06, 2016 at 03:43:48AM +0000, Eric Wong wrote:> Eric Wong <e at 80x24.org> wrote: > > I'm trying to test a trivial patch to set FD_CLOEXEC on the > > flintlock lockfd when using F_OFD_SETLK > > Fwiw, this is the patch I was originally going to test. > (but now I see maybe my F_SETFD might only need to be > called on F_OFD_SETLK success) Description at bottom. > > --- a/backends/flint_lock.cc > +++ b/backends/flint_lock.cc > @@ -108,6 +108,10 @@ FlintLock::lock(bool exclusive, string & explanation) { > return ((errno == EMFILE || errno == ENFILE) ? FDLIMIT : UNKNOWN); > } > > +#ifdef FD_CLOEXEC > + (void)fcntl(lockfd, F_SETFD, FD_CLOEXEC); > +#endif > +This is fixed in master and 1.4.0, which set CLOEXEC for all relevant FDs. Those changes weren't backported as they were rather pervasive and we'd had no user reports of issues due to this, but since the CLOEXEC changes we now we lock in process on a modern Linux kernel and that change got backported - as you've noticed that can cause more major issues without CLOEXEC, and I think we ought to backport CLOEXEC for the lock file to 1.2.x. I'd suggest comparing backends/flint_lock.cc with the version on git master - the CLOEXEC-related changes are easy to see, and a bit more rigorous than your patch (CLOEXEC is applied when the lock file is opened, at least on platforms which support that, which means there's no window in which fork()+exec() can occur in another thread while the FD is unprotected). Arguably we ought to avoid using OFD locks if we don't have CLOEXEC (then we use a child process to hold the lock, which also means that fork()+exec() doesn't end up with a locked FD open). Currently only Linux supports OFD locks, and has CLOEXEC so right now it's not an issue. OFD locks have been proposed for POSIX standardisation, but since CLOEXEC is in POSIX.1-2008, I suspect any platform which adds support for OFD locks would actually support CLOEXEC anyway. Cheers, Olly