Hi all,

After much grief today, we'd like to support the 'TODO' note in
source/lib/util_tdb.c :-

        /* TODO: If we time out waiting for a lock, it might
         * be nice to use F_GETLK to get the pid of the
         * process currently holding the lock and print that
         * as part of the debugging message. -- mbp */

It could have saved us a wholesale restart of Samba if we could have
more easily identified the process that was causing us to get loads and
loads of:-

[2007/06/15 13:02:20, 0, pid=5430] tdb/tdbutil.c:tdb_log(783)
  tdb(/usr/local/samba/private/secrets.tdb): tdb_lock failed on list 2 ltype=2 (Interrupted system call)
[2007/06/15 13:02:20, 0, pid=5430] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal(82)
  tdb_chainlock_with_timeout_internal: alarm (10) timed out for key replay cache mutex in tdb /usr/local/samba/private/secrets.tdb

We believe we did find it, but only by sending a SIGTERM to all 'smbd'
processes and being left with the 'smbd' processes that didn't die.
(SIGKILL did remove them.)

Mac
Assistant Systems Administrator @nibsc.ac.uk
mac@nibsc.ac.uk
Work: +44 1707 641565
Everything else: +44 7956 237670 (anytime)
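
For reference, here is a rough sketch of what the TODO is asking for. This is not Samba or tdb code: lock_holder_pid() and format_lock_holder() are hypothetical helper names, and the caller is assumed to already know the file descriptor and the byte offset of the lock it timed out on. Only fcntl(F_GETLK) and struct flock are standard POSIX.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of Samba or tdb): ask the kernel which
 * process holds a lock that would block a lock of type `ltype`
 * (F_RDLCK or F_WRLCK) on the single byte at `offset` in `fd`.
 *
 * Returns the holder's pid, 0 if nothing would block, or -1 on error.
 */
static pid_t lock_holder_pid(int fd, off_t offset, short ltype)
{
    struct flock fl;

    fl.l_type = ltype;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = 1;
    fl.l_pid = 0;

    /* F_GETLK only tests the lock; it never acquires it. */
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return (pid_t)-1;

    /* The kernel sets l_type to F_UNLCK when the lock could be granted. */
    if (fl.l_type == F_UNLCK)
        return 0;

    /* Otherwise fl describes one conflicting lock, including its owner. */
    return fl.l_pid;
}

/*
 * Hypothetical helper: build the extra text a timeout message could
 * carry. The wording is an assumption, not an existing Samba log line.
 */
static void format_lock_holder(char *buf, size_t len, int fd, off_t offset)
{
    pid_t pid = lock_holder_pid(fd, offset, F_WRLCK);

    if (pid > 0)
        snprintf(buf, len, "lock currently held by pid %ld", (long)pid);
    else if (pid == 0)
        snprintf(buf, len, "no conflicting lock found");
    else
        snprintf(buf, len, "F_GETLK failed");
}
```

Some caveats: F_GETLK reports at most one conflicting lock even if several processes hold one, it never reports locks held by the calling process itself, and the holder may have released the lock between the timeout and the query, so the answer is a hint rather than proof.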

On Fri, Jun 15, 2007 at 03:19:08PM +0100, Mac wrote:
> After much grief today, we'd like to support the 'TODO' note in
> source/lib/util_tdb.c :-
>
>         /* TODO: If we time out waiting for a lock, it might
>          * be nice to use F_GETLK to get the pid of the
>          * process currently holding the lock and print that
>          * as part of the debugging message. -- mbp */

Not sure if you are running Linux, but if you are, a "cat /proc/locks"
in that situation would also have helped you.

Volker

On Fri, Jun 15, 2007 at 03:19:08PM +0100, Mac wrote:
> Hi all,
>
> After much grief today, we'd like to support the 'TODO' note in
> source/lib/util_tdb.c :-
>
>         /* TODO: If we time out waiting for a lock, it might
>          * be nice to use F_GETLK to get the pid of the
>          * process currently holding the lock and print that
>          * as part of the debugging message. -- mbp */
>
> It could have saved us a wholesale restart of Samba if we could have
> more easily identified the process that was causing us to get loads and
> loads of:-
>
> [2007/06/15 13:02:20, 0, pid=5430] tdb/tdbutil.c:tdb_log(783)
>   tdb(/usr/local/samba/private/secrets.tdb): tdb_lock failed on list 2 ltype=2 (Interrupted system call)
> [2007/06/15 13:02:20, 0, pid=5430] tdb/tdbutil.c:tdb_chainlock_with_timeout_internal(82)
>   tdb_chainlock_with_timeout_internal: alarm (10) timed out for key replay cache mutex in tdb /usr/local/samba/private/secrets.tdb
>
> We believe we did find it, but only by sending a SIGTERM to all 'smbd'
> processes and being left with the 'smbd' processes that didn't die.
> (SIGKILL did remove them.)

If you get an smbd in that state, can you attach to it with gdb and
get a stack backtrace so we can see where it was?

Thanks,

Jeremy.