Kai Lanz
2008-Jun-30 23:28 UTC
[Samba] Smbd internal error and panic with nfs-mounted share
We're trying to share a directory that's mounted via NFS from another server; when we connect to that share, smbd panics and tries to dump core. This is samba-3.0.30 on Alpha/Tru64-4.0G. The directory we're trying to use is exported from a Linux box called sesfs (Centos 4.4) and mounted on the Alpha server via NFS as /sesfs/scratch.

I've appended below an edited summary from log.smbd at debuglevel 3 showing an attempted connection from a Mac called "redips". The process gets as far as call_trans2findfirst, then reports internal error signal 6 (abort) and panics. It says it's dumping core, but there's nothing written to the destination directory.

Here's the fstab entry that mounts the scratch directory:

    /exavol/scratch@sesfs /sesfs/scratch nfs rw,bg

And here's what the result looks like:

    pangea> ls -ld /sesfs/scratch
    drwxrwxrwt 1 root system 4096 Jun 10 17:41 /sesfs/scratch

Here's the share definition in smb.conf:

    [scratch]
        comment = sesfs scratch directory
        path = /sesfs/scratch
        create mode = 0644
        directory mode = 0755
        guest ok = no
        public = no
        writable = yes
        printable = no

Smbclient shows the share is available:

    pangea> smbclient -L pangea
    Domain=[PANGEA] OS=[Unix] Server=[Samba 3.0.30]

        Sharename       Type      Comment
        ---------       ----      -------
        scratch         Disk      sesfs scratch directory
        IPC$            IPC       IPC Service (Samba 3.0.30 File Server)
        lanz            Disk      Home Directories

There are two complications to all this:

1. The Linux "box", sesfs, that exports the scratch directory is actually an ExaStore NAS system, comprising a 2-node linux cluster connected to a big RAID. The NFS code may have been customized for that environment.

2. The samba daemon doesn't always crash when the connection to the scratch share is first made -- sometimes the connection seems to succeed, but then smbd dies when you first attempt to do a data transfer. In other words, the behavior is not 100% repeatable.
(If you want to see the unedited log file, I put it on the web at http://www.stanford.edu/~lanz/samba-log.full)

[2008/06/30 09:12:44, 2, pid=19041, effective(0, 0), real(0, 0)] smbd/reply.c:(323)
  netbios connect: name1=PANGEA name2=REDIPS
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] auth/auth.c:(223)
  check_ntlm_password: mapped user is: [PANGEA]\[LANZ]@[redips]
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] auth/auth.c:(269)
  check_ntlm_password: sam authentication for user [LANZ] succeeded
[2008/06/30 09:12:49, 2, pid=19041, effective(0, 0), real(0, 0)] auth/auth.c:(304)
  check_ntlm_password: authentication for user [LANZ] -> [LANZ] -> [lanz] succeeded
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] smbd/password.c:(354)
  Adding homes service for user 'lanz' using home directory: '/home/sysop/lanz'
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] param/loadparm.c:(2682)
  adding home's share [lanz] for user 'lanz' at '/home/sysop/lanz'
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] smbd/service.c:(805)
  Connect path is '/sesfs/scratch' for service [scratch]
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] smbd/vfs.c:(95)
  Initialising default vfs hooks
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] smbd/vfs.c:(128)
  Initialising custom vfs hooks from [/[Default VFS]/]
[2008/06/30 09:12:49, 1, pid=19041, effective(2104, 601), real(0, 0)] smbd/service.c:(1033)
  redips (171.64.171.122) connect to service scratch initially as user lanz (uid=2104, gid=601) (pid 19041)
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] smbd/reply.c:(573)
  tconX service=SCRATCH
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] smbd/process.c:(1069)
  Transaction 4 of length 74
[2008/06/30 09:12:49, 3, pid=19041, effective(0, 0), real(0, 0)] smbd/process.c:(927)
  switch message SMBtrans2 (pid 19041) conn 0x14018e050
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/trans2.c:(2285)
  call_trans2qfsinfo: level = 261
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(1069)
  Transaction 5 of length 92
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(927)
  switch message SMBntcreateX (pid 19041) conn 0x14018e050
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/dosmode.c:(142)
  unix_mode(.) returning 0755
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(1069)
  Transaction 6 of length 88
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(927)
  switch message SMBnttrans (pid 19041) conn 0x14018e050
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/nttrans.c:(2082)
  call_nt_transact_query_security_desc: file = ., info_wanted = 0x3
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] passdb/lookup_sid.c:(1009)
  fetch sid from uid cache 0 -> S-1-5-21-1975590968-77483778-2269577258-1000
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(1069)
  Transaction 7 of length 45
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(927)
  switch message SMBclose (pid 19041) conn 0x14018e050
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/reply.c:(3329)
  close directory fnum=4529
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(1069)
  Transaction 8 of length 74
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(927)
  switch message SMBtrans2 (pid 19041) conn 0x14018e050
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/trans2.c:(2285)
  call_trans2qfsinfo: level = 1
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(1069)
  Transaction 9 of length 130
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/process.c:(927)
  switch message SMBtrans2 (pid 19041) conn 0x14018e050
[2008/06/30 09:12:49, 3, pid=19041, effective(2104, 601), real(0, 0)] smbd/trans2.c:(1701)
  call_trans2findfirst: dirtype = 16, maxentries = 4, close_after_first=1, close_if_end = 2 requires_resume_key = 4 level = 0x104, max_data_bytes = 16644
[2008/06/30 09:12:49, 0, pid=19041, effective(2104, 601), real(0, 0)] lib/fault.c:(41)
  ==============================================================
[2008/06/30 09:12:49, 0, pid=19041, effective(2104, 601), real(0, 0)] lib/fault.c:(42)
  INTERNAL ERROR: Signal 6 in pid 19041 (3.0.30)
  Please read the Trouble-Shooting section of the Samba3-HOWTO
[2008/06/30 09:12:49, 0, pid=19041, effective(2104, 601), real(0, 0)] lib/fault.c:(44)
  From: http://www.samba.org/samba/docs/Samba3-HOWTO.pdf
[2008/06/30 09:12:49, 0, pid=19041, effective(2104, 601), real(0, 0)] lib/fault.c:(45)
  ==============================================================
[2008/06/30 09:12:49, 0, pid=19041, effective(2104, 601), real(0, 0)] lib/util.c:(1632)
  PANIC (pid 19041): internal error
[2008/06/30 09:12:49, 0, pid=19041, effective(2104, 601), real(0, 0)] lib/util.c:(1787)
  unable to produce a stack trace on this platform
[2008/06/30 09:12:49, 0, pid=19041, effective(0, 0), real(0, 0)] lib/fault.c:(181)
  dumping core in /usr/local/samba/var/cores/smbd

--
Kai Lanz
Stanford University School of Earth Sciences
I've got two subnets joined by an OpenVPN bridge. I used to have my PDC on the router 192.168.2.128, and the DMS 192.168.2.1 happily authenticated to it. Now, for security and other reasons, I have put my PDC behind a firewall. The PDC now lives at 192.168.1.3, and my router is still on 192.168.1.1 and 192.168.2.128.

In the router's iptables rules, I have added the following:

    iptables -t nat -A PREROUTING -p tcp --dport 137:139 -i tap0 -j DNAT --to 192.168.1.3
    iptables -t nat -A PREROUTING -p tcp --dport 445 -i tap0 -j DNAT --to 192.168.1.3
    iptables -t nat -A PREROUTING -p udp --dport 137:139 -i tap0 -j DNAT --to 192.168.1.3
    iptables -t nat -A PREROUTING -p udp --dport 445 -i tap0 -j DNAT --to 192.168.1.3

(tap0 is the 192.168.2.128 interface)

In the DMS's smb.conf, I have the following:

    [global]
        workgroup = CORP
        netbios name = FURNSRV
        server string = Furniture File Server
        security = domain
        password server = 192.168.1.3
        wins server = 192.168.1.3
        wins support = no
        wins proxy = no
        name resolve order = wins
        dns proxy = no
        local master = yes
        domain master = no
        preferred master = yes
        os level = 65
        socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192 SO_BROADCAST
        printing = cups
        printcap = cups
        remote browse sync = 192.168.1.3

When I start Samba on the DMS, I can do 'net join' just fine. I can ping the PDC. I can list shares on the PDC. I can't list shares on the client!

    root@honk:/etc/samba# smbclient -L localhost
    Password:
    session setup failed: NT_STATUS_NO_LOGON_SERVERS

I'm a little befuddled here. Is there something I've forgotten in iptables? Is something else missing? I'm not sure exactly what to debug. I have done tcpdump on the PDC and I can see requests and responses, but I'm not 100% clear what to look for.

I appreciate any help at all!

Thanks,
Misty
Jeremy Allison
2008-Jul-03 00:41 UTC
[Samba] Smbd internal error and panic with nfs-mounted share
On Mon, Jun 30, 2008 at 04:28:33PM -0700, Kai Lanz wrote:
> We're trying to share a directory that's mounted via NFS from another
> server; when we connect to that share, smbd panics and tries to dump
> core.

Can you add a "panic action" parameter to your smb.conf to get smbd to invoke

    /bin/sleep 9999999

Then when smbd crashes, attach to it via gdb and post a full backtrace please.

Thanks,
Jeremy.
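
For anyone following the thread, the request above might look roughly like the fragment below. This is only a sketch: placing the parameter in [global] is the usual choice, and the smbd binary path in the comments is an assumption inferred from the /usr/local/samba prefix seen in the log, not something stated in the thread. It also assumes gdb is installed on the Tru64 host.

```ini
[global]
    # Run this command when smbd panics. Per the request above, the long
    # sleep is meant to keep the failed smbd around so a debugger can
    # attach to it.
    panic action = /bin/sleep 9999999

# After the next crash, as root (binary path is an assumption; replace
# <pid> with the pid reported in the PANIC log line):
#   gdb /usr/local/samba/sbin/smbd <pid>
#   (gdb) bt full
```

Reload or restart smbd after editing smb.conf so the new setting takes effect, then reproduce the crash by connecting to the scratch share again.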