Hi all,
After much grief today, we'd like to support the 'TODO' note in
source/lib/util_tdb.c :-
/* TODO: If we time out waiting for a lock, it might
* be nice to use F_GETLK to get the pid of the
* process currently holding the lock and print that
* as part of the debugging message. -- mbp */
It could have saved us a wholesale restart of Samba if we could have
more easily identified the process that was causing us to get loads and
loads of:-
[2007/06/15 13:02:20, 0, pid=5430] tdb/tdbutil.c:tdb_log(783)
  tdb(/usr/local/samba/private/secrets.tdb): tdb_lock failed on list 2 ltype=2 (Interrupted system call)
[2007/06/15 13:02:20, 0, pid=5430] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal(82)
  tdb_chainlock_with_timeout_internal: alarm (10) timed out for key replay cache mutex in tdb /usr/local/samba/private/secrets.tdb
We believe we did find it, but only by sending a SIGTERM to all 'smbd'
processes and being left with 'smbd' processes that didn't die (SIGKILL
did remove them).
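For illustration, here is a minimal sketch of what the TODO describes: asking the kernel, via fcntl(F_GETLK), which process holds a lock that conflicts with the one we failed to get. The helper name lock_holder_pid and its arguments are hypothetical and not part of Samba or tdb; the offset and length of the byte range tdb locks for a given chain are assumed to be known by the caller.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Samba or tdb.
 * Asks the kernel whether a lock of the given type (F_RDLCK or F_WRLCK)
 * on the byte range [offset, offset + len) would be blocked.
 * Returns the pid of a process holding a conflicting lock,
 * 0 if nothing conflicts, or (pid_t)-1 if fcntl() itself fails.
 */
pid_t lock_holder_pid(int fd, off_t offset, off_t len, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = len;

    /* F_GETLK only tests; it never acquires the lock. */
    if (fcntl(fd, F_GETLK, &fl) == -1) {
        return (pid_t)-1;
    }

    /* The kernel sets l_type to F_UNLCK when the lock could be placed. */
    if (fl.l_type == F_UNLCK) {
        return 0;
    }

    /* Otherwise fl describes one conflicting lock, including its owner. */
    return fl.l_pid;
}
```

A timeout handler could call this right after the alarm fires and add the returned pid to the debug message. Note that F_GETLK reports only one conflicting lock, and the holder may have changed by the time the message is printed.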
Mac
Assistant Systems Administrator @nibsc.ac.uk
mac@nibsc.ac.uk
Work: +44 1707 641565 Everything else: +44 7956 237670 (anytime)
On Fri, Jun 15, 2007 at 03:19:08PM +0100, Mac wrote:
> After much grief today, we'd like to support the 'TODO' note in
> source/lib/util_tdb.c :-
>
> /* TODO: If we time out waiting for a lock, it might
> * be nice to use F_GETLK to get the pid of the
> * process currently holding the lock and print that
> * as part of the debugging message. -- mbp */

Not sure if you have Linux, but if you do, running "cat /proc/locks" in
that situation would also have helped you.

Volker
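As a rough sketch of the /proc/locks approach on Linux: each line of that file carries the lock class, mode, access type, owning pid and a major:minor:inode triple, so the holder of a lock on a given file can be found by matching the file's inode. The function names below are hypothetical, and for brevity the match is on inode number only, ignoring the device, which is an assumption that could give false matches across filesystems.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: parse one /proc/locks line such as
 *   "1: POSIX  ADVISORY  WRITE 5430 08:01:123456 0 EOF"
 * Returns 1 and fills *pid and *inode on success, 0 otherwise.
 * Lines describing blocked waiters ("1: -> POSIX ...") do not match
 * this layout and are deliberately skipped.
 */
int parse_proc_locks_line(const char *line, long *pid, unsigned long *inode)
{
    char cls[16], mode[16], access[16];
    unsigned int major_dev, minor_dev;
    int n;

    n = sscanf(line, "%*d: %15s %15s %15s %ld %x:%x:%lu",
               cls, mode, access, pid, &major_dev, &minor_dev, inode);
    return n == 7;
}

/*
 * Hypothetical helper: print every /proc/locks entry whose inode matches
 * the file at path. Returns the number of matches, or -1 on error.
 */
int print_lock_holders(const char *path)
{
    struct stat st;
    char line[256];
    FILE *fp;
    int found = 0;

    if (stat(path, &st) == -1) {
        return -1;
    }
    fp = fopen("/proc/locks", "r");
    if (fp == NULL) {
        return -1;
    }
    while (fgets(line, sizeof(line), fp) != NULL) {
        long pid;
        unsigned long inode;

        /* Assumption: inode alone identifies the file (device is ignored). */
        if (parse_proc_locks_line(line, &pid, &inode) &&
            inode == (unsigned long)st.st_ino) {
            printf("pid %ld holds a lock on %s: %s", pid, path, line);
            found++;
        }
    }
    fclose(fp);
    return found;
}
```

Calling print_lock_holders("/usr/local/samba/private/secrets.tdb") while the timeouts are being logged would list the candidate processes without having to signal every smbd.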
On Fri, Jun 15, 2007 at 03:19:08PM +0100, Mac wrote:
> Hi all,
>
> After much grief today, we'd like to support the 'TODO' note in
> source/lib/util_tdb.c :-
>
> /* TODO: If we time out waiting for a lock, it might
> * be nice to use F_GETLK to get the pid of the
> * process currently holding the lock and print that
> * as part of the debugging message. -- mbp */
>
> It could have saved us a whole-sale restart of Samba if we could have
> more easily identified the process that was causing us to get loads and
> loads of:-
>
> [2007/06/15 13:02:20, 0, pid=5430] tdb/tdbutil.c:tdb_log(783)
>   tdb(/usr/local/samba/private/secrets.tdb): tdb_lock failed on list 2 ltype=2 (Interrupted system call)
> [2007/06/15 13:02:20, 0, pid=5430] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal(82)
>   tdb_chainlock_with_timeout_internal: alarm (10) timed out for key replay cache mutex in tdb /usr/local/samba/private/secrets.tdb
>
> We believe we did find it, but only by sending a SIGTERM to all 'smbd'
> processes and being left with 'smbd' processes that didn't die. (SIGKILL
> did remove them).

If you get an smbd in that state, can you attach to it with gdb and get
a stack backtrace so we can see where it was?

Thanks,

Jeremy.