Alberto Di Fede
2007-Mar-27 21:37 UTC
[Samba] Samba hangs badly making the OS hang unrecoverably too
Hi everybody,

I'm going mad over a very serious issue (serious because it's happening on a production server) with a domain-joined Samba installation. This machine has worked perfectly for a year, and the last configuration tune-up and version upgrade (to 3.0.23c) went fine in the first days of January. Approximately two weeks ago we started to have a bad problem: smbd locks up and takes the whole machine with it.

I don't want to go into too many annoying details, but a hardware problem has been ruled out, there are no network communication issues and, most interestingly, we had the same issue on another identically configured server.

In my young sysadmin career (6 years now) I've never seen such a failure: a Linux machine that stops responding to console commands (even ps aux, lsof...), that never lets me look at the logs while it is hung (the console hangs waiting for the file to open), and that won't let me kill the hung processes. There is no CPU load, no unusual network activity and no resource exhaustion: it just hangs there, with no way to recover besides a hardware reset.

With all of this in mind, I tried to debug the problem in every way I could, looking for unusual messages in every log and launching every Samba-related daemon interactively with the debug level at 10. I obtained nothing useful: no write or read errors, and no communication problems of any kind with the domain. The nmbd and smbd processes simply stop working and hang waiting, and I really don't know what for. While it is hung I can still use already-open ssh sessions, as long as I don't try to list processes or open files.

I suppose it is something related to name resolution and winbind rid mappings, but from what I've gathered so far I have nothing concrete that even points at the problem. Any help or suggestion from you would be very welcome and appreciated.
Tomorrow I will set up another machine for the same purpose with openSUSE 10.2 and see what happens.

Thanks a lot in advance to everybody willing to help. I'll be happy to provide any further info you request.

Alberto

This is my smb.conf:

[global]
    workgroup = AGBSOFT
    realm = AGBSOFT.CH
    netbios name = FTP
    server string = FTP Server
    wins server = 10.100.0.2,10.100.0.4
    #client schannel = no
    idmap uid = 10000-200000000
    idmap gid = 10000-200000000
    idmap backend = rid:AGBSOFT=10000-200000000
    allow trusted domains = no
    winbind enum users = yes
    winbind enum groups = yes
    winbind use default domain = yes
    winbind nested groups = yes
    template homedir = /home/%D/%U
    template shell = /bin/bash
    load printers = no
    log file = /var/log/samba/%m.log
    #log level = 8
    max log size = 0
    security = ads
    #password server = agbsoft-nt1.agbsoft.ch
    encrypt passwords = yes
    socket options = IPTOS_LOWDELAY TCP_NODELAY
    #os level = 23
    domain master = no
    preferred master = no
    local master = no
    inherit acls = yes
    inherit permissions = yes
    map acl inherit = yes
    store dos attributes = yes
    acl compatibility = win2k
    #acl group control = yes
    map hidden = no
    map system = no
    map readonly = no
    nt acl support = yes
    ea support = yes
    winbind offline logon = true
    winbind refresh tickets = true
    dos filemode = yes

[FTPSpace]
    comment = "FTP"
    path = /ftp
    valid users = @"AGBSOFT\Domain Users"
    writable = yes

[Regressions]
    comment = "Regressions Files"
    path = /ftp/istap/Regressions
    valid users = @"AGBSOFT\Domain Users"
    writable = yes
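
Since ps aux and lsof themselves hang during the lockup, one thing that might still work from an already-open ssh session is reading only /proc/<pid>/stat for each process and listing those in uninterruptible sleep (state "D"). The following is a minimal sketch, not a tested diagnostic for this server. It assumes that reading the stat file does not block the way ps does, which is not confirmed here, and the function names parse_stat_line and find_blocked are made up for illustration.

```python
import os


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def find_blocked(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            # Only the stat file is read; cmdline and fd/ are deliberately
            # left alone (assumption: those are what make ps and lsof hang).
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry is unreadable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Calling find_blocked() with no arguments scans the live /proc. If smbd, nmbd or winbindd show up in the result during a hang, that would at least show which daemons are stuck inside the kernel, although not why.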