On Sat, Aug 06, 2016 at 03:43:48AM +0000, Eric Wong wrote:> Eric Wong <e at 80x24.org> wrote: > > I'm trying to test a trivial patch to set FD_CLOEXEC on the > > flintlock lockfd when using F_OFD_SETLK > > Fwiw, this is the patch I was originally going to test. > (but now I see maybe my F_SETFD might only need to be > called on F_OFD_SETLK success) Description at bottom. > > --- a/backends/flint_lock.cc > +++ b/backends/flint_lock.cc > @@ -108,6 +108,10 @@ FlintLock::lock(bool exclusive, string & explanation) { > return ((errno == EMFILE || errno == ENFILE) ? FDLIMIT : UNKNOWN); > } > > +#ifdef FD_CLOEXEC > + (void)fcntl(lockfd, F_SETFD, FD_CLOEXEC); > +#endif > +This is fixed in master and 1.4.0, which set CLOEXEC for all relevant FDs. Those changes weren't backported as they were rather pervasive and we'd had no user reports of issues due to this, but since the CLOEXEC changes we now we lock in process on a modern Linux kernel and that change got backported - as you've noticed that can cause more major issues without CLOEXEC, and I think we ought to backport CLOEXEC for the lock file to 1.2.x. I'd suggest comparing backends/flint_lock.cc with the version on git master - the CLOEXEC-related changes are easy to see, and a bit more rigorous than your patch (CLOEXEC is applied when the lock file is opened, at least on platforms which support that, which means there's no window in which fork()+exec() can occur in another thread while the FD is unprotected). Arguably we ought to avoid using OFD locks if we don't have CLOEXEC (then we use a child process to hold the lock, which also means that fork()+exec() doesn't end up with a locked FD open). Currently only Linux supports OFD locks, and has CLOEXEC so right now it's not an issue. OFD locks have been proposed for POSIX standardisation, but since CLOEXEC is in POSIX.1-2008, I suspect any platform which adds support for OFD locks would actually support CLOEXEC anyway. Cheers, Olly
Olly Betts <olly at survex.com> wrote:> On Sat, Aug 06, 2016 at 03:43:48AM +0000, Eric Wong wrote: > > Eric Wong <e at 80x24.org> wrote: > > > I'm trying to test a trivial patch to set FD_CLOEXEC on the > > > flintlock lockfd when using F_OFD_SETLK > > > > Fwiw, this is the patch I was originally going to test. > > (but now I see maybe my F_SETFD might only need to be > > called on F_OFD_SETLK success) Description at bottom. > > > > --- a/backends/flint_lock.cc > > +++ b/backends/flint_lock.cc > > @@ -108,6 +108,10 @@ FlintLock::lock(bool exclusive, string & explanation) { > > return ((errno == EMFILE || errno == ENFILE) ? FDLIMIT : UNKNOWN); > > } > > > > +#ifdef FD_CLOEXEC > > + (void)fcntl(lockfd, F_SETFD, FD_CLOEXEC); > > +#endif > > + > > This is fixed in master and 1.4.0, which set CLOEXEC for all relevant > FDs. Those changes weren't backported as they were rather pervasive > and we'd had no user reports of issues due to this, but since the > CLOEXEC changes we now we lock in process on a modern Linux kernel and > that change got backported - as you've noticed that can cause more major > issues without CLOEXEC, and I think we ought to backport CLOEXEC for the > lock file to 1.2.x.Thanks for the response. Is this the svn/1.2 branch in the git repo? Currently at: commit 76e09ea726d9666462567b5839899fc53f53cebe ("Fix typo in POD docs for Search::Xapian") Did you get a chance to look into the build failure? No worries if not, I've been busy, too; I'm just happy that https://public-inbox.org/git/ (git mailing list archives with Xapian) seems to be holding up :)> I'd suggest comparing backends/flint_lock.cc with the version on git > master - the CLOEXEC-related changes are easy to see, and a bit more > rigorous than your patch (CLOEXEC is applied when the lock file is > opened, at least on platforms which support that, which means there's > no window in which fork()+exec() can occur in another thread while > the FD is unprotected). > > Arguably we ought to avoid using OFD locks if we don't have CLOEXEC > (then we use a child process to hold the lock, which also means that > fork()+exec() doesn't end up with a locked FD open). Currently > only Linux supports OFD locks, and has CLOEXEC so right now it's > not an issue. OFD locks have been proposed for POSIX standardisation, > but since CLOEXEC is in POSIX.1-2008, I suspect any platform which adds > support for OFD locks would actually support CLOEXEC anyway.Yes, I doubt anybody would implement OFD before CLOEXEC :)
On Tue, Aug 16, 2016 at 11:40:55PM +0000, Eric Wong wrote:> Olly Betts <olly at survex.com> wrote: > > On Sat, Aug 06, 2016 at 03:43:48AM +0000, Eric Wong wrote: > > > Eric Wong <e at 80x24.org> wrote: > > > +#ifdef FD_CLOEXEC > > > + (void)fcntl(lockfd, F_SETFD, FD_CLOEXEC); > > > +#endif > > > > This is fixed in master and 1.4.0, which set CLOEXEC for all relevant > > FDs. Those changes weren't backported as they were rather pervasive > > and we'd had no user reports of issues due to this, but since the > > CLOEXEC changes we now we lock in process on a modern Linux kernel and > > that change got backported - as you've noticed that can cause more major > > issues without CLOEXEC, and I think we ought to backport CLOEXEC for the > > lock file to 1.2.x. > > Thanks for the response. Is this the svn/1.2 branch in > the git repo? Currently at: > > commit 76e09ea726d9666462567b5839899fc53f53cebe > ("Fix typo in POD docs for Search::Xapian")Yes - the name is a historical remnant at this point, but I'm slightly reluctant to tidy that up as it's referenced in various scripts, buildbot configuration, etc. I've backported the relevant CLOEXEC handling to there now. Not sure when the next 1.2.x will happen, but probably 1.4.1 is first.> Did you get a chance to look into the build failure?The build failure isn't something that rings a bell, and those packages built OK for me in a clean jessie/sid chroot (as appropriate) using sbuild before I uploaded them. I haven't try to reproduce as it seemed unlikely that would have changed, and I have a bit of todo pile already. Cheers, Olly