On Sun, 21 Feb 2016 22:33:22 +0000, Olly Betts <olly at survex.com> wrote:
> On Sun, Feb 21, 2016 at 02:15:25PM +0100, Eric J wrote:
> > I discovered, while trying to set up Tcl bindings for Notmuch
> > (https://notmuchmail.org/), which uses Xapian, that flintlock was not
> > being locked (I had lost updates).
>
> It seems to work for me, testing with this:
>
> package require Tcl 8
> package require xapian 1.0.0
> xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
> xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN

eric at bruno [ ~ ]$ cat /proc/version
Linux version 3.13.300 (root at bruno) (gcc version 4.8.2 (GCC) ) #2 SMP Tue Sep 16 21:01:43 BST 2014
eric at bruno [ ~ ]$ tclsh
% info patchlevel
8.6.1
% package require Tcl 8
8.6.1
% package require xapian 1.0.0
1.2.18
% xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
_e0c4b00000000000_p_Xapian__WritableDatabase
% xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN
_f0d3b00000000000_p_Xapian__WritableDatabase
%

At which point

eric at bruno [ ~ ]$ lsof tmp.db/flintlock
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
cat     13543 eric    5w   REG    8,9        0 930437 tmp.db/flintlock
cat     13552 eric    9w   REG    8,9        0 930437 tmp.db/flintlock

I blame the execl because I stepped through my copy of the lock code in
gdb while watching lsof: it showed 5w on the open, still 5w on the fork,
5ww on the fcntl, and 5w again on the execl.

> I wonder if the problem is unrelated to locking, but instead it's that
> the Tcl database doesn't get explicitly destroyed in your script, so
> that the C++ object doesn't either, and the changes don't get committed.
>
> I would try calling close() on the WritableDatabase object before your
> script exits.

I did have a close in my script, and then added a destroy. There is no
way of telling whether it was the first-to-close or the last-to-close
whose updates were being lost.

> There's some discussion of this in the Tcl bindings docs (section
> "Destructors"):
>
> https://xapian.org/docs/bindings/tcl8/

Thanx,
Eric
-- 
ms fnd in a lbry

On Mon, Feb 22, 2016 at 12:26:27PM +0100, Eric wrote:
> On Sun, 21 Feb 2016 22:33:22 +0000, Olly Betts <olly at survex.com> wrote:
> > On Sun, Feb 21, 2016 at 02:15:25PM +0100, Eric J wrote:
> > > I discovered, while trying to set up Tcl bindings for Notmuch
> > > (https://notmuchmail.org/), which uses Xapian, that flintlock was not
> > > being locked (I had lost updates).
> >
> > It seems to work for me, testing with this:
> >
> > package require Tcl 8
> > package require xapian 1.0.0
> > xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
> > xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN
> >
>
> eric at bruno [ ~ ]$ cat /proc/version
> Linux version 3.13.300 (root at bruno) (gcc version 4.8.2 (GCC) ) #2 SMP
> Tue Sep 16 21:01:43 BST 2014
> eric at bruno [ ~ ]$ tclsh
> % info patchlevel
> 8.6.1
> % package require Tcl 8
> 8.6.1
> % package require xapian 1.0.0
> 1.2.18

I've tested with 1.2.18 and can't reproduce this with that version
either (is that also the version of xapian-core you're running? The
1.2.18 above is the bindings version I think).

> % xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
> _e0c4b00000000000_p_Xapian__WritableDatabase
> % xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN
> _f0d3b00000000000_p_Xapian__WritableDatabase
> %
>
> At which point
>
> eric at bruno [ ~ ]$ lsof tmp.db/flintlock
> COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
> cat     13543 eric    5w   REG    8,9        0 930437 tmp.db/flintlock
> cat     13552 eric    9w   REG    8,9        0 930437 tmp.db/flintlock
>
> Blaming the execl is due to stepping though my copy of the lock code in
> gdb, and seeing, in lsof, 5w on the open, still 5w on the fork, 5ww on
> the fcntl, and 5w again on the execl.

Odd, as you said elsewhere, execl() shouldn't drop the lock. It would
be good to get to the bottom of this, as unreliable locking is a bad
thing to have.

What FS are you running this on?

Is use of Tcl actually a factor here, or can you reproduce it with
just C++ code?

E.g. using the "simpleindex" example from the xapian-core sources:

examples/simpleindex tmp.db &
examples/simpleindex tmp.db

More recent Xapian versions will try to use the new OFD locks and avoid
the need to fork() and execl(), so will presumably avoid whatever is
going on here. But the OFD locks were added in Linux 3.15, so your
kernel isn't quite new enough.

Cheers,
Olly

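The open / fork / fcntl / execl sequence discussed in this thread can be
sketched roughly as follows. This is an illustration in Python, not
Xapian's C++ lock code: the helper names hold_lock_in_child and try_lock
are hypothetical, and the only behaviour relied on is the standard one
that a POSIX fcntl() lock belongs to the process that took it, survives
an exec as long as the descriptor stays open, and goes away when that
process closes the file or exits.

```python
import fcntl
import os
import tempfile


def hold_lock_in_child(path):
    """Fork a child that locks `path` with fcntl, then execs `cat`.

    Returns (pid, keepalive_fd) on success, or (None, None) if the lock was
    already held. Closing keepalive_fd gives `cat` EOF, so it exits and the
    lock is released. (Hypothetical helper, for illustration only.)
    """
    status_r, status_w = os.pipe()  # child -> parent: did we get the lock?
    stdin_r, stdin_w = os.pipe()    # parent -> child: keeps `cat` blocked
    pid = os.fork()
    if pid == 0:
        try:
            os.close(status_r)
            os.close(stdin_w)
            fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o666)
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                os.write(status_w, b"n")
                os._exit(1)
            os.write(status_w, b"y")
            # Python opens descriptors close-on-exec. The lock descriptor
            # has to survive the exec, otherwise closing it drops the lock.
            os.set_inheritable(fd, True)
            os.dup2(stdin_r, 0)
            os.execvp("cat", ["cat"])
        finally:
            os._exit(127)  # only reached if something above failed
    os.close(status_w)
    os.close(stdin_r)
    got_lock = os.read(status_r, 1) == b"y"
    os.close(status_r)
    if not got_lock:
        os.close(stdin_w)
        os.waitpid(pid, 0)
        return None, None
    return pid, stdin_w


def try_lock(path):
    """Return True if this process could take the lock right now."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o666)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)  # closing the descriptor also releases our own lock


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), "flintlock")
    pid, keepalive = hold_lock_in_child(lock_path)
    print("lock free while child holds it:", try_lock(lock_path))
    os.close(keepalive)
    os.waitpid(pid, 0)
    print("lock free after child exits:", try_lock(lock_path))
```

If the descriptor were left close-on-exec in the child, the exec itself
would close it and the lock would vanish at that step. That is one way
to get the "5ww before, 5w after" pattern in a sketch like this one. It
is not a diagnosis of what happens inside tclsh.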
On Wed, 24 Feb 2016 03:17:35 +0000, Olly Betts <olly at survex.com> wrote:
>On Mon, Feb 22, 2016 at 12:26:27PM +0100, Eric wrote:
>> On Sun, 21 Feb 2016 22:33:22 +0000, Olly Betts <olly at survex.com> wrote:
>>> On Sun, Feb 21, 2016 at 02:15:25PM +0100, Eric J wrote:
>>>> I discovered, while trying to set up Tcl bindings for Notmuch
>>>> (https://notmuchmail.org/), which uses Xapian, that flintlock was not
>>>> being locked (I had lost updates).
>>>
>>> It seems to work for me, testing with this:
>>>
>>> package require Tcl 8
>>> package require xapian 1.0.0
>>> xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
>>> xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN
>>
>>
>> eric at bruno [ ~ ]$ cat /proc/version
>> Linux version 3.13.300 (root at bruno) (gcc version 4.8.2 (GCC) ) #2 SMP
>> Tue Sep 16 21:01:43 BST 2014
>> eric at bruno [ ~ ]$ tclsh
>> % info patchlevel
>> 8.6.1
>> % package require Tcl 8
>> 8.6.1
>> % package require xapian 1.0.0
>> 1.2.18
>
> I've tested with 1.2.18 and can't reproduce this with that version
> either (is that also the version of xapian-core you're running? The
> 1.2.18 above is the bindings version I think).
>
>> % xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
>> _e0c4b00000000000_p_Xapian__WritableDatabase
>> % xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN
>> _f0d3b00000000000_p_Xapian__WritableDatabase
>> %
>>
>> At which point
>>
>> eric at bruno [ ~ ]$ lsof tmp.db/flintlock
>> COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
>> cat     13543 eric    5w   REG    8,9        0 930437 tmp.db/flintlock
>> cat     13552 eric    9w   REG    8,9        0 930437 tmp.db/flintlock
>>
>> Blaming the execl is due to stepping though my copy of the lock code in
>> gdb, and seeing, in lsof, 5w on the open, still 5w on the fork, 5ww on
>> the fcntl, and 5w again on the execl.
>
> Odd, as you said elsewhere, execl() shouldn't drop the lock. It would
> be good to get to the bottom of this, as unreliable locking is a bad
> thing to have.
>
> What FS are you running this on?

ext4

> Is use of Tcl actually a factor here, or can you reproduce it with
> just C++ code?
>
> E.g. using the "simpleindex" example from the xapian-core sources:
>
> examples/simpleindex tmp.db &
> examples/simpleindex tmp.db

lfs at bruno [ /usr/src/sources-deptj/xapian-core-1.2.18 ]$ examples/simpleindex tmp.db &
[1] 26157
lfs at bruno [ /usr/src/sources-deptj/xapian-core-1.2.18 ]$ examples/simpleindex tmp.db
DatabaseLockError: Unable to get write lock on tmp.db: already locked
[1]+  Stopped                 examples/simpleindex tmp.db

so it is presumably not anything to do with the FS or the OS. I am
hoping that the right Tcl person (whoever that is) may pick something
up in an strace.

> More recent Xapian versions will try to use the new OFD locks and avoid
> the need to fork() and execl(), so will presumably avoid whatever is
> going on here. But the OFD locks were added in Linux 3.15, so your
> kernel isn't quite new enough.

Yes, I saw that, and it is good, but my chances of moving up soon are
not good. And I would like to get to the bottom of this anyway.

Thanx,
Eric
-- 
ms fnd in a lbry

On Thu, 10 Mar 2016 22:59:55 +0000 (UTC), Eric Lindblad <GeirfuglApS at yahoo.com> wrote:
> cf: http://permalink.gmane.org/gmane.comp.search.xapian.general/9965
>
>> Eric J <eric <at> deptj.eu> wrote:
>
> ...
>
>> Earlier 8.5.x are presumably the same as 8.5.18.
>
> If someone might post one or more code samples
> (incl. instructions for compiling, if relevant)
> and a manner of checking the following:
>
> 1) "database locks" with Tcl bindings aren't functioning
> 2) "database locks" with Tcl bindings function correctly
>
> if I find the time to test on GNU/Linux (32 bit) tcl8.5.11
> the specified xapian-core version number (1.2.18), I will
> post the results.

That would be useful information, thank you.

No need to compile anything since the problem is reproducible with the
Xapian Tcl bindings alone. All that is necessary is to install the same
version of xapian-core and xapian-bindings (with tcl of course), and
install the required version of Tcl (Tk not needed).

Terminal session 1:

$ tclsh
% info patchlevel
# to check the version
% package require Thread
# expect to get "can't find package"
# if you get a version number, the lock will probably not
# function correctly for 8.5.x (x<19) or 8.6.x (x<5)
% package require xapian 1.0.0
# expect to see "1.2.18" (or whatever)
% xapian::WritableDatabase db "tmp.db" $xapian::DB_CREATE_OR_OPEN
# get something like "_e004f80100000000_p_Xapian__WritableDatabase"

Terminal session 2: (with the same working directory)

$ tclsh
% info patchlevel
% package require Thread
% package require xapian 1.0.0
% xapian::WritableDatabase db2 "tmp.db" $xapian::DB_CREATE_OR_OPEN
# should get "Unable to get write lock on tmp.db: already locked"
# if locks are functioning correctly
# will get something like "_705b900100000000_p_Xapian__WritableDatabase"
# if the locks are not functioning correctly

It is worth having a third terminal session with

$ lsof -r 5 tmp.db/flintlock

to see how many times the file is open.

If locks are functioning correctly, you will end up with something like:

COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
cat     13543 eric    5ww  REG    8,9        0 930437 tmp.db/flintlock

(what matters is having only one line and the "ww")

If locks are not functioning correctly, you will end up with something like:

COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
cat     13549 eric    5w   REG    8,9        0 930437 tmp.db/flintlock
cat     13552 eric    9w   REG    8,9        0 930437 tmp.db/flintlock

(two lines and only a single "w" in each)

Eric
-- 
ms fnd in a lbry
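
For anyone who would rather not eyeball the lsof output, the "one line
and ww" check could be automated along these lines. This is only a
sketch in Python: the function names parse_lsof and lock_looks_healthy
are made up for illustration, and it assumes the column layout shown in
this thread, where the FD field is the descriptor number followed by an
access-mode character and then an optional lock character (so "5ww" is
descriptor 5, open for writing, with a write lock).

```python
import re

# FD field: descriptor number, optional access mode, optional lock character.
FD_RE = re.compile(r"^(\d+)([rwu-]?)([NrRwWuUxX]?)$")


def parse_lsof(text):
    """Return one dict per data row of lsof output (header row skipped)."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] == "COMMAND":
            continue
        match = FD_RE.match(fields[3])
        if match is None:
            continue  # e.g. "cwd" or "txt" entries, which are not numbered fds
        number, mode, lock = match.groups()
        rows.append({
            "command": fields[0],
            "pid": int(fields[1]),
            "fd": int(number),
            "mode": mode,
            "lock": lock,  # empty string when no lock is reported
        })
    return rows


def lock_looks_healthy(text):
    """True when exactly one process has the file open and holds a write lock."""
    rows = parse_lsof(text)
    return len(rows) == 1 and rows[0]["lock"] in ("w", "W")


GOOD = """
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
cat     13543 eric    5ww  REG    8,9        0 930437 tmp.db/flintlock
"""

BAD = """
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
cat     13549 eric    5w   REG    8,9        0 930437 tmp.db/flintlock
cat     13552 eric    9w   REG    8,9        0 930437 tmp.db/flintlock
"""

if __name__ == "__main__":
    print("good sample healthy:", lock_looks_healthy(GOOD))
    print("bad sample healthy:", lock_looks_healthy(BAD))
```

The two sample strings are the outputs quoted above, so the sketch can
be tried without lsof or Xapian installed. Feeding it live output from
the third terminal session would work the same way, provided the local
lsof prints the same columns.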