Kevin Bedford
2008-Jul-03 12:38 UTC
[Samba] Samba pdc with ldap & samba3-3.0.25c-34 on CentOS 5
Hi all,

I had been running a Samba PDC using the standard tdbsam backend on CentOS 4.5, supporting 20+ WinXP SP2 workstations and a Win2k3 Terminal Server. This setup ran fine for around 18 months, maybe more.

That system was built on ageing hardware that used to be a Win 2k server that died from OS corruption. Due to expanding client needs there was good reason to replace the old box. The new system is based on a Supermicro server board with an Intel Core2 Duo E6750 CPU, 8G RAM and four 500G HDDs as two software RAID mirrors. This machine was installed with CentOS 5, and Samba was updated to 3.0.25c instead of the supplied 3.0.25b, as the previous setup had been unable to change passwords from Win2k3 with any prior version. OpenLDAP and Samba were configured as a PDC, mostly from the HOWTO by Example from samba.org, and the user accounts were imported from the old tdb backend.

The intent was to migrate the domain; however, this did not go as expected. All workstations needed to be removed and rejoined to the new domain, and user profiles needed to be recreated.

Since the new server was installed, various users have experienced random file-locking issues when working on Word or Excel documents from the file shares, resulting in various Windows error messages, including the following:

"Word failed reading from this file(~WRL0925), please restore the network connection or replace the floppy disk and retry"

"Cannot delete filename; It is being used by another person or program."

"Quote Register 2008.xls is locked for editing by 'drana'" (in this case drana is the user trying to open the file who received the error.)

"The document was saved successfully, but Excel cannot re-open it because of a sharing violation, Please close the document and try to open it again."

I have several other Samba file servers on CentOS 4 & 5 machines, some of which are PDCs, but this is the first LDAP deployment.
None of the others have exhibited this issue.

My research so far has found plenty of other reports of this kind of issue, with no reports of a real solution. I have read some documentation that claims oplocks can cause this issue and have tried various config options to disable them. Other documents blame network hardware faults such as cabling or switches, and I have already replaced anything that appeared at all suspect. Also, just prior to its retirement the old server was being accessed using exactly the same cabling and switch infrastructure. There also does not seem to be any problem copying large amounts of data between servers using scp or rsync, so disk and network throughput do not seem to be an issue.

The smb.conf is as follows:

[global]
    unix charset = LOCALE
    workgroup = EARTHLIFT
    netbios name = EARTHLIFTSERVER
    server string = earthliftserver
    interfaces = eth0, lo
    bind interfaces only = Yes
    passdb backend = ldapsam:ldap://localhost
    #enable privileges = Yes
    #nt acl support = no
    username map = /etc/samba/smbusers
    log level = 1
    #syslog = 0
    log file = /var/log/samba/%m
    max log size = 50
    smb ports = 139
    name resolve order = wins bcast hosts
    socket options = TCP_NODELAY IPTOS_LOWDELAY SO_KEEPALIVE SO_RCVBUF=8192 SO_SNDBUF=8192
    time server = Yes
    printcap name = CUPS
    show add printer wizard = No
    add user script = /opt/IDEALX/sbin/smbldap-useradd -m "%u"
    delete user script = /opt/IDEALX/sbin/smbldap-userdel "%u"
    add group script = /opt/IDEALX/sbin/smbldap-groupadd -p "%g"
    delete group script = /opt/IDEALX/sbin/smbldap-groupdel "%g"
    add user to group script = /opt/IDEALX/sbin/smbldap-groupmod -m "%u" "%g"
    delete user from group script = /opt/IDEALX/sbin/smbldap-groupmod -x "%u" "%g"
    set primary group script = /opt/IDEALX/sbin/smbldap-usermod -g "%g" "%u"
    add machine script = /opt/IDEALX/sbin/smbldap-useradd -w "%u"
    logon script = %U.bat
    logon path = \\%L\profiles\%U
    logon drive = H:
    logon home = \\%L\%U
    domain logons = Yes
    preferred master = Yes
    local master = yes
    wins support = Yes
    ldap suffix = dc=earthlift,dc=local
    ldap machine suffix = ou=machines
    ldap user suffix = ou=People
    ldap group suffix = ou=group
    ldap idmap suffix = ou=Idmap
    ldap admin dn = cn=Manager,dc=earthlift,dc=local
    idmap backend = ldap:ldap://localhost
    idmap uid = 10000-20000
    idmap gid = 10000-20000
    #map acl inherit = Yes
    printing = cups
    hide files = /desktop.ini/Desktop.ini
    os level = 65
    level2 oplocks = no
    kernel oplocks = no
    oplocks = no
    #use sendfile = no
    #lock spin time = 15

[homes]
    comment = Home Directories
    valid users = %S
    read only = No
    browseable = No

[printers]
    comment = SMB Print Spool
    path = /var/spool/samba
    guest ok = Yes
    printable = Yes
    browseable = No

#[profdata]
    #comment = Profile Data Share
    #path = /home/users/profdata
    #read only = No
    #profile acls = Yes

[netlogon]
    comment = Network Logon Service
    path = /home/users/netlogon
    guest ok = yes
    write list = @ntadmins
    share modes = no

[Profiles]
    path = /home/users/profiles
    browseable = no
    guest ok = yes
    create mask = 0600
    directory mask = 0700
    writeable = yes
    oplocks = false
    level2 oplocks = false
    csc policy = disable
    veto oplock files = /prf*.tmp/;
    profile acls = yes

[public]
    writeable = yes
    public = yes
    path = /home/shares/public
    # force group = Domain Users
    create mode = 770
    directory mode = 770
    oplocks = yes
    level2 oplocks = yes
    # veto oplock files = /*.mdb/*.xls/*.wdb/;

[batch]
    writeable = yes
    public = yes
    path = /home/shares/batch
    # force group = Domain Users
    create mode = 770
    directory mode = 770
    oplocks = yes
    level2 oplocks = yes
    veto oplock files = /*.mdb/*.xls/*.wdb/;

[faxes]
    writeable = yes
    public = yes
    path = /home/shares/faxes
    # force group = Domain Users
    create mode = 770
    directory mode = 770
    oplocks = yes
    level2 oplocks = yes
    # veto oplock files = /*.mdb/*.xls/*.wdb/;

[scans]
    writeable = yes
    public = yes
    path = /home/shares/scans
    # force group = Domain Users
    create mode = 770
    directory mode = 770
    oplocks = yes
    level2 oplocks = yes
    # veto oplock files = /*.mdb/*.xls/*.wdb/;

[admin]
    writeable =yes
    guest ok = no
    path = /home/shares/admin
    # force group = Domain Users
    create mode = 770
    directory mode = 770
    oplocks = yes
    level2 oplocks = yes
    # veto oplock files = /*.mdb/*.xls/*.wdb/;

[excavate]
    writeable =yes
    guest ok = no
    path = /home/shares/excavate
    # force group = Domain Users
    create mode = 770
    directory mode = 770
    oplocks = false
    level2 oplocks = false
    # veto oplock files = /*.mdb/*.wdb/;

[piling]
    writeable =yes
    guest ok = no
    path = /home/shares/piling
    # force group = Domain Users
    create mode = 770
    directory mode = 770
    oplocks = false
    level2 oplocks = false
    # veto oplock files = /*.mdb/*.wdb/;

[ohs]
    writeable =yes
    guest ok = no
    path = /home/shares/ohs
    # force group = Domain Users
    create mode = 770
    directory mode = 770
    oplocks = yes
    level2 oplocks = yes
    # veto oplock files = /*.mdb/*.xls/*.wdb/;

[finance]
    writeable =yes
    guest ok = no
    path = /home/shares/finance
    # force group = finance
    create mode = 770
    directory mode = 770
    valid users = @finance
    oplocks = yes
    level2 oplocks = yes
    # veto oplock files = /*.mdb/*.xls/*.wdb/;

[ANZ]
    writeable =yes
    guest ok = no
    path = /home/shares/ANZ
    # force group = finance
    create mode = 770
    directory mode = 770
    browseable = no
    valid users = @finance
    oplocks = yes
    level2 oplocks = yes
    # veto oplock files = /*.mdb/*.xls/*.wdb/;

[payroll]
    writeable =yes
    guest ok = no
    path = /home/shares/payroll
    # force group = payroll
    create mode = 770
    directory mode = 770
    valid users = @payroll
    oplocks = yes
    level2 oplocks = yes
    # veto oplock files = /*.mdb/*.xls/*.wdb/;

The longer a user has a file open, the more likely they seem to be to hit the problem.

The following also occurs a lot in the log files, but more frequently than the user issues, so I didn't consider it the cause:

[2008/07/03 16:41:56, 0] lib/util_sock.c:read_data(534)
  read_data: read failure for 4 bytes to client 192.168.0.82. Error = No route to host
[2008/07/03 16:21:25, 0] lib/util_sock.c:read_data(534)
  read_data: read failure for 4 bytes to client 192.168.0.82. Error = Connection timed out

The only way to regain access to a file once the error occurs is to kill the related smb process on the server.

My next step was going to be to rebuild the old server again using CentOS 4, to eliminate the change in kernel/OS version, but retain the LDAP server so as not to have to rebuild the domain and profiles again.

If anyone can suggest a solution or a more appropriate course of action, please advise.

Best Regards
Kevin Bedford
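
P.S. The config above sets the oplock options to "no" in [global] but turns them back on in many shares, which makes it hard to see at a glance which shares actually have oplocks enabled. A minimal Python sketch for tabulating that follows. The function names `parse_smb_conf` and `effective_oplocks` are hypothetical, the parser is a simplified stand-in for Samba's own (no includes, no line continuations), and it assumes that a share-level value overrides [global] and that both options default to "yes" when unset. It only summarises the settings and is not a diagnosis of the locking errors.

```python
# Values treated as boolean true (assumed to match Samba's accepted spellings).
SAMBA_TRUE = {"yes", "true", "on", "1"}


def parse_smb_conf(text):
    """Parse smb.conf-style text into {section: {parameter: value}}.

    Simplified: skips blank lines and lines starting with '#' or ';',
    lower-cases section and parameter names, and collapses whitespace
    inside parameter names.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().lower()
            sections.setdefault(current, {})
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            sections[current][" ".join(key.lower().split())] = value.strip()
    return sections


def effective_oplocks(sections):
    """Return {share: {"oplocks": bool, "level2 oplocks": bool}}.

    Lookup order per share: share value, then [global], then "yes"
    (assumed default).
    """
    global_params = sections.get("global", {})
    report = {}
    for name, params in sections.items():
        if name == "global":
            continue
        row = {}
        for key in ("oplocks", "level2 oplocks"):
            value = params.get(key, global_params.get(key, "yes"))
            row[key] = value.lower() in SAMBA_TRUE
        report[name] = row
    return report


if __name__ == "__main__":
    # Condensed excerpt of the config posted above.
    sample = """
[global]
    level2 oplocks = no
    oplocks = no
[homes]
    read only = No
[public]
    oplocks = yes
    level2 oplocks = yes
[excavate]
    oplocks = false
    level2 oplocks = false
"""
    for share, row in effective_oplocks(parse_smb_conf(sample)).items():
        print(share, row)
```

Run against the full file (for example by reading /etc/samba/smb.conf into a string), this gives one row per share, so shares that inherit the global "no" can be told apart from shares that re-enable oplocks.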