Hi folks,

We have been experiencing problems with one of our Samba servers for a while now and need a little help. First off, we are running Fedora Core 3, kernel version 2.6.12-1, and Samba version 3.0.20b. Our hardware is a Dell PowerEdge 700 server with RAID 5. The problem I am about to describe also occurred on this server with Samba versions prior to 3.0.20b, and it has the same result every time.

To get to the point: this server will run fine for a period of time and then begin to build up smbd processes until eventually our users can no longer access shares. The Samba server just stops responding. It does not even respond to stop, start, or restart commands. A restart looks like it is restarting the service, but Samba stays in the same locked state with the shares still unavailable. Checking the service status afterwards shows that the stop, start, or restart did nothing to clear out the old processes or the locked files they had open. We end up rebooting the server to clear everything out. Right now we are reading through the documentation and list posts and waiting for this to happen again, in the hope of capturing an error in the log. When that happens I'll send more detail.

I've seen other posts about this, but none clearly states a resolution, so I know this is happening to others. It's getting increasingly frustrating, since we have been running Samba as the file server for our Windows clients for years and never had this much trouble with it. Our smb.conf file is pretty basic and is really the one we have run since we first started. Since we have not had a problem in the past, this makes me think an external application, possibly one of our databases, is causing the smbd processes to build out of control. We are open to any and all suggestions to help stop this from happening.

smb.conf file:

[global]
    workgroup = XXX
    server string = Samba Server
    security = DOMAIN
    log file = /var/log/samba/log
    max log size = 100000
    socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
    dns proxy = No
    ldap ssl = no
    log level = 5
    load printers = No
    kernel oplocks = No

[homes]
    comment = Home Directories
    read only = No
    browseable = No
    available = No

#[printers]
#    comment = All Printers
#    path = /var/spool/samba
#    printable = Yes
#    browseable = No

[public_app]
    comment = Public Data Repository
    path = /var/local/group_shares/public_data
    read only = No
    create mask = 02775
    force create mode = 02775
    directory mask = 02775
    force directory mode = 02775

[prod_control]
    comment = Production Control Repository
    path = /var/local/group_shares/prod_control
    read only = No
    create mask = 02775
    force create mode = 02775
    directory mask = 02775
    force directory mode = 02775

--
Matt Lung
Midwest Tool & Die, Corp.
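While waiting for the next occurrence, it may help to record how many smbd processes exist and what state they are in. The following is a minimal sketch, not something from the original post: the function names are made up for illustration, and it assumes a procps-style `ps` that accepts repeated `-o` options. It flags processes whose state starts with `D` (uninterruptible sleep), because such processes do not act on signals until the blocking kernel call returns. Whether that is what happens on this server is only a guess.

```python
import subprocess


def parse_ps(text, name="smbd"):
    """Parse 'pid stat comm' lines and keep only processes called `name`."""
    procs = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[2].strip() == name:
            procs.append((int(parts[0]), parts[1]))
    return procs


def summarize(procs):
    """Count the processes and list the PIDs in uninterruptible sleep (D)."""
    blocked = [pid for pid, stat in procs if stat.startswith("D")]
    return {"total": len(procs), "blocked": blocked}


def snapshot(runner=subprocess.run):
    """Run ps once and summarize the smbd processes it reports."""
    # A trailing '=' after each column name suppresses the header line.
    cmd = ["ps", "-e", "-o", "pid=", "-o", "stat=", "-o", "comm="]
    out = runner(cmd, capture_output=True, text=True, check=True).stdout
    return summarize(parse_ps(out))
```

Calling `snapshot()` from cron every few minutes and appending the result to a file would show when the count starts to climb, which narrows down the part of the level 5 Samba log worth reading.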
On Wed, 2005-12-07 at 09:50 -0500, Matt Lung wrote:
> To get to the point of the problem, this server will run fine for a
> period of time and then begin to build up SMBD processes until
> eventually our users can no longer access shares. The Samba server just
> stops responding. It does not even respond to STOP, START, or RESTART
> commands. Doing a RESTART on samba will look like it is restarting the
> service, but Samba will still be in the same locked state with shares
> still not available. Doing a status on the service then reveals that
> the STOP, START, or RESTART did nothing to clear out the old processes
> or the locked files it thought it previously had opened. We end up just
> rebooting the server to clear everything out. Right now we are just
> reading through all the documentation, posts, and waiting for this to
> happen again to hopefully capture some error in the log. When that
> happens I'll send more detail.

Instead of immediately restarting it, you could attach strace to the spinning process and tell us where it dies.

Meanwhile, I suggest you check the integrity of your tdb files (killing with -9 may lead to corrupted tdbs, and on some rare occasions I've seen our code spinning on corrupted files). To check whether a tdb is OK, you can run tdbbackup on it (no need to stop Samba for that) and see if the backup succeeds. If it reports an error, the tdb is corrupted. If it is a temporary database, it is best to remove it and restart Samba; if it is a persistent one, plan adequate measures.

Simo.

--
Simo Sorce - idra@samba.org
Samba Team - http://www.samba.org
Italian Site - http://samba.xsec.it
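Both suggestions above can be sketched in Python. This is only an illustration under stated assumptions, not a procedure from the Samba Team: the function names and the trace file location under /tmp are hypothetical, and the directory holding the tdb files differs between distributions and builds, so it is passed in as a parameter. The sketch relies on `strace -p` to attach to a running PID and on plain `tdbbackup <file>`, which writes `<file>.bak` and is expected to exit non-zero when the copy fails.

```python
import subprocess
from pathlib import Path


def strace_command(pid, out_dir="/tmp"):
    """Build an strace command line that attaches to an already running PID.

    -f follows child processes, -tt adds timestamps, -o writes the trace
    to a file. The output path is an arbitrary choice for this sketch.
    """
    return ["strace", "-f", "-tt", "-p", str(pid),
            "-o", f"{out_dir}/smbd-{pid}.strace"]


def check_tdb(path, runner=subprocess.run):
    """Return True if tdbbackup could back up the given tdb file."""
    result = runner(["tdbbackup", str(path)], capture_output=True, text=True)
    return result.returncode == 0


def check_all(tdb_dir, runner=subprocess.run):
    """Run check_tdb on every *.tdb file directly inside tdb_dir."""
    return {p.name: check_tdb(p, runner)
            for p in sorted(Path(tdb_dir).glob("*.tdb"))}
```

For example, `subprocess.run(strace_command(pid))` with the PID of a stuck smbd would write a trace to attach to a follow-up mail, and `check_all(...)` pointed at the local tdb directory returns a mapping of file name to True or False. Any file reported as False is a candidate for the remove-or-plan decision described above.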
Brian_Pickering@selinc.com
2005-Dec-08 18:19 UTC
[Samba] Hanging SMBD processes - Samba CRASHING
We've had similar troubles with Samba 3.x on our ClearCase VOB server running RHEL3. Our fix was to go back to the old 2.2.12, and we haven't had a problem since. Unfortunately, I was never able to devote enough time to tracking down the problem fully. I had hoped that upgrading to RHEL4 with a 2.6 kernel would help, but your experience doesn't bode well for that.

--------------------------------------------
Brian Pickering
System Administrator - Information Services
Schweitzer Engineering Laboratories, Inc.
Email - Brian_Pickering@selinc.com
Telephone - 509-332-1890 x1212