Joe Murphy
2007-Jun-14 22:28 UTC
[Samba] Re: Intermittent "internal error: signal 11" with 3.0.24
Hi all

Follow-up to this post, as it didn't seem to go through the first time.

We have a common problem occurring with our Samba setups. We run 3 (identical) processing environments that each contain a Samba host sharing approx. 0.5 TB of data to 6 Wintel machines. Normally these hosts operate fine, though we intermittently (every 2-3 days) experience a Samba panic, as per below. Eventually this will kill the host and require a restart to clear.

Our issues appear to have started when we upgraded our Samba to 3.0.24. We began recording panic messages in syslog following the upgrade. Going up to 3.0.25a is an option, though we'd like to do it with a bit of an idea that it will fix things.

We've tried a range of things to reproduce and isolate the issue (load testing, re-processing batches that experienced problems), so far without success. Any suggestions appreciated.

sambaserver:~ # gdb /usr/sbin/smbd 1778
...
(gdb) bt
#0 0xffffe410 in ?? ()
#1 0x00000001 in ?? ()
#2 0x00000000 in ?? ()
#3 0xbfffcfd8 in ?? ()
#4 0x402b36e3 in __waitpid_nocancel () from /lib/tls/libc.so.6
#5 0x4025ef58 in do_system () from /lib/tls/libc.so.6
#6 0x402268dd in system () from /lib/tls/libpthread.so.0
#7 0x0822b612 in smb_panic (why=0x0) at lib/util.c:1608
#8 0x08219b3f in fault_report (sig=-512) at lib/fault.c:47
#9 0x08219b50 in sig_fault (sig=-512) at lib/fault.c:70
#10 <signal handler called>
#11 0x40292d43 in strlen () from /lib/tls/libc.so.6
#12 0x40268242 in vfprintf () from /lib/tls/libc.so.6
#13 0x40285e76 in vsnprintf () from /lib/tls/libc.so.6
#14 0x08219956 in dbgtext (format_str=0x36373020 <Address 0x36373020 out of bounds>) at lib/debug.c:1011
#15 0x0825b360 in oplock_timeout_handler (te=0x844d838, now=0xbfffdfc0, private_data=0x844e498) at smbd/oplock.c:351
#16 0x08242d7d in run_events () at lib/events.c:102
#17 0x080f2801 in receive_message_or_smb (buffer=0x40433008 "", buffer_len=131137, timeout=60000) at smbd/process.c:457
#18 0x080f4122 in smbd_process () at smbd/process.c:1649
#19 0x082beea9 in main (argc=909586464, argv=0xbfffe334) at smbd/server.c:1024

This relates to the following panic message recorded in syslog:

Jun 14 15:36:56 sambaserver smbd[1778]: [2007/06/14 15:36:56, 0] printing/print_cups.c:cups_cache_reload(85)
Jun 14 15:36:56 sambaserver smbd[1778]: Unable to connect to CUPS server localhost - Connection refused
Jun 14 15:36:56 sambaserver smbd[1778]: [2007/06/14 15:36:56, 0] printing/print_cups.c:cups_cache_reload(85)
Jun 14 15:36:56 sambaserver smbd[1778]: Unable to connect to CUPS server localhost - Connection refused
Jun 14 15:38:02 sambaserver smbd[1778]: [2007/06/14 15:38:02, 0] tdb/tdbutil.c:tdb_log(783)
Jun 14 15:38:02 sambaserver smbd[1778]: tdb(/var/lib/samba/locking.tdb): tdb_lock failed on list 2872 ltype=1 (Resource deadlock avoided)
Jun 14 15:38:02 sambaserver smbd[1778]: [2007/06/14 15:38:02, 0] smbd/close.c:close_remove_share_mode(164)
Jun 14 15:38:02 sambaserver smbd[1778]: close_remove_share_mode: Could not get share mode lock for file Templates/TI.dmsft
Jun 14 15:38:03 sambaserver smbd[1778]: [2007/06/14 15:38:03, 0] tdb/tdbutil.c:tdb_log(783)
Jun 14 15:38:03 sambaserver smbd[1778]: tdb(/var/lib/samba/locking.tdb): tdb_lock failed on list 1324 ltype=1 (Resource deadlock avoided)
Jun 14 15:38:03 sambaserver smbd[1778]: [2007/06/14 15:38:03, 0] smbd/close.c:close_remove_share_mode(164)
Jun 14 15:38:03 sambaserver smbd[1778]: close_remove_share_mode: Could not get share mode lock for file Templates/TI.dmsft
Jun 14 15:38:32 sambaserver smbd[1778]: [2007/06/14 15:38:32, 0] smbd/oplock.c:oplock_timeout_handler(351)
Jun 14 15:38:32 sambaserver smbd[1778]: [2007/06/14 15:38:32, 0] lib/fault.c:fault_report(41)
Jun 14 15:38:32 sambaserver smbd[1778]: ==============================================================
Jun 14 15:38:32 sambaserver smbd[1778]: [2007/06/14 15:38:32, 0] lib/fault.c:fault_report(42)
Jun 14 15:38:32 sambaserver smbd[1778]: INTERNAL ERROR: Signal 11 in pid 1778 (3.0.24-SerNet-SuSE)
Jun 14 15:38:32 sambaserver smbd[1778]: Please read the Trouble-Shooting section of the Samba3-HOWTO
Jun 14 15:38:32 sambaserver smbd[1778]: [2007/06/14 15:38:32, 0] lib/fault.c:fault_report(44)
Jun 14 15:38:32 sambaserver smbd[1778]:
Jun 14 15:38:32 sambaserver smbd[1778]: From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
Jun 14 15:38:32 sambaserver smbd[1778]: [2007/06/14 15:38:32, 0] lib/fault.c:fault_report(45)
Jun 14 15:38:32 sambaserver smbd[1778]: ==============================================================
Jun 14 15:38:32 sambaserver smbd[1778]: [2007/06/14 15:38:32, 0] lib/util.c:smb_panic(1599)
Jun 14 15:38:32 sambaserver smbd[1778]: PANIC (pid 1778): internal error
Jun 14 15:38:32 sambaserver smbd[1778]: [2007/06/14 15:38:32, 0] lib/util.c:log_stack_trace(1706)
Jun 14 15:38:32 sambaserver smbd[1778]: BACKTRACE: 14 stack frames:
Jun 14 15:38:32 sambaserver smbd[1778]: #0 /usr/sbin/smbd(log_stack_trace+0x22) [0x822b6fb]
Jun 14 15:38:32 sambaserver smbd[1778]: #1 /usr/sbin/smbd(smb_panic+0x6f) [0x822b59a]
Jun 14 15:38:32 sambaserver smbd[1778]: #2 /usr/sbin/smbd [0x8219b3f]
Jun 14 15:38:32 sambaserver smbd[1778]: #3 /usr/sbin/smbd [0x8219b50]
Jun 14 15:38:32 sambaserver smbd[1778]: #4 [0xffffe420]
Jun 14 15:38:32 sambaserver smbd[1778]: #5 /lib/tls/libc.so.6(vsnprintf+0xb6) [0x40285e76]
Jun 14 15:38:32 sambaserver smbd[1778]: #6 /usr/sbin/smbd(dbgtext+0x2e) [0x8219956]
Jun 14 15:38:32 sambaserver smbd[1778]: #7 /usr/sbin/smbd [0x825b360]
Jun 14 15:38:32 sambaserver smbd[1778]: #8 /usr/sbin/smbd(run_events+0x15f) [0x8242d7d]
Jun 14 15:38:32 sambaserver smbd[1778]: #9 /usr/sbin/smbd [0x80f2801]
Jun 14 15:38:32 sambaserver smbd[1778]: #10 /usr/sbin/smbd(smbd_process+0x10e) [0x80f4122]
Jun 14 15:38:32 sambaserver smbd[1778]: #11 /usr/sbin/smbd(main+0x946) [0x82beea9]
Jun 14 15:38:32 sambaserver smbd[1778]: #12 /lib/tls/libc.so.6(__libc_start_main+0xd0) [0x40240210]
Jun 14 15:38:32 sambaserver smbd[1778]: #13 /usr/sbin/smbd [0x808ceb1]
Jun 14 15:38:32 sambaserver smbd[1778]: [2007/06/14 15:38:32, 0] lib/util.c:smb_panic(1607)
Jun 14 15:38:32 sambaserver smbd[1778]: smb_panic(): calling panic action [/bin/sleep 90000]

Versions
Distro: SLES9 (SP3)
Kernel: 2.6.5-7.97-bigsmp
Samba: Version 3.0.24-SerNet-SuSE

Samba config below (<ad_domain> changed, but orig followed sub.dom.tld format):

----------------------------
[global]
workgroup = <ad_domain>
domain master = no
local master = no
preferred master = no
os level = 0
username map = /etc/samba/smbusers
map to guest = Bad User
logon path = \\%L\profiles\.msprofile
logon home = \\%L\%U\.9xprofile
logon drive = P:
security = ads
realm = <ad_domain>
encrypt passwords = yes
idmap uid = 10000-20000
idmap gid = 10000-20000
template primary group = "Domain Users"
template shell = /bin/bash
winbind separator = +
winbind enum users = yes
winbind enum groups = yes
winbind use default domain = yes
password server = prod1.<ad_domain> prod2.<ad_domain> prod3.<ad_domain>
log level = 3
panic action = "/bin/sleep 90000"

[Data]
comment = Data directory
path = /data01/Data
read only = No
inherit permissions = Yes
directory mask = 0755
create mask = 0755
----------------------------

Much appreciated.

Joe Murphy
joe.murphy@clear.net.nz
Volker Lendecke
2007-Jun-15 05:54 UTC
[Samba] Re: Intermittent "internal error: signal 11" with 3.0.24
On Fri, Jun 15, 2007 at 10:13:34AM +1200, Joe Murphy wrote:
> tdb(/var/lib/samba/locking.tdb): tdb_lock failed on list
> 2872 ltype=1 (Resource deadlock avoided)

Urgs... This should never happen. What OS do you have? Can you send your smb.conf? What applications are running on your Samba host except Samba? NFS? None of these would probably directly cause this, I'm just asking for the environment...

Volker
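
Since the panics show up only every 2-3 days on three hosts, it may help to check whether each panic is preceded by tdb_lock failures from the same smbd process, as in the syslog excerpt quoted in this thread. The following is a minimal sketch, assuming the syslog line format shown in that excerpt. The function name `summarize`, the regular expressions, and the `SAMPLE` lines are illustrative only and are not part of Samba.

```python
import re
from collections import Counter

# Patterns modelled on the syslog lines quoted in this thread (an assumption:
# other syslog configurations may format the prefix differently).
PID_RE = re.compile(r"smbd\[(\d+)\]:")
TDB_LOCK_RE = re.compile(r"tdb\([^)]+\): tdb_lock failed on list \d+")
PANIC_RE = re.compile(r"INTERNAL ERROR: Signal \d+ in pid \d+")


def summarize(lines):
    """Return (lock_failures, panics): per-PID counts of matching lines."""
    lock_failures = Counter()
    panics = Counter()
    for line in lines:
        pid_match = PID_RE.search(line)
        if pid_match is None:
            continue  # not an smbd syslog line
        pid = int(pid_match.group(1))
        if TDB_LOCK_RE.search(line):
            lock_failures[pid] += 1
        if PANIC_RE.search(line):
            panics[pid] += 1
    return lock_failures, panics


# Three lines copied from the excerpt in this thread, used as sample input.
SAMPLE = [
    "Jun 14 15:38:02 sambaserver smbd[1778]: tdb(/var/lib/samba/locking.tdb): "
    "tdb_lock failed on list 2872 ltype=1 (Resource deadlock avoided)",
    "Jun 14 15:38:03 sambaserver smbd[1778]: tdb(/var/lib/samba/locking.tdb): "
    "tdb_lock failed on list 1324 ltype=1 (Resource deadlock avoided)",
    "Jun 14 15:38:32 sambaserver smbd[1778]: INTERNAL ERROR: Signal 11 in pid 1778 "
    "(3.0.24-SerNet-SuSE)",
]

if __name__ == "__main__":
    locks, panics = summarize(SAMPLE)
    for pid in sorted(set(locks) | set(panics)):
        print(f"pid {pid}: {locks[pid]} tdb_lock failures, {panics[pid]} panics")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize(open(path))` where `path` is wherever syslog is written on the host. A PID that appears in both counters would suggest the lock failures and the signal 11 are related, though this alone does not establish a cause.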