In the past few months we have seen a couple of our production servers running samba-3.0.14a-1 crash. What seems to happen is that the smbd PIDs grow and grow until an out-of-memory error occurs and the smbd process just hangs. A simple restart does not even clear out the PIDs. I end up rebooting just to clear it all out and start fresh. It has probably happened at least once a month for the past 3 months.

We are running Fedora Core 3 with kernel-2.6.11-1.14. We have turned off all unneeded services and have it running only what we need. The main services this server provides are PostgreSQL and SMB shares. It also does an hourly rsync of data to another server, which momentarily hogs all the CPU and memory for a few minutes each hour. The server is set up as a domain member authenticating to our PDC via LDAP.

One thing we continually see in our logwatch reports is that kernel leases are broken by samba PIDs. I don't know if that has something to do with it or not. I do have kernel oplocks turned on.

Hopefully someone who is having, or has had, similar problems can lend a helping hand. This is really starting to get annoying since it is so random. Any advice would be appreciated.

Thanks

--
Matt Lung
Midwest Tool & Die, Corp.
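For anyone trying to catch this before the out-of-memory point, a minimal sketch of a check that could be run periodically is shown here. It counts the processes named smbd and adds up their resident memory by reading the per-process `status` files under `/proc`. The function name `smbd_memory` and the `proc_root` parameter are made up for illustration and are not part of Samba; the sketch only assumes the usual Linux `Name:` and `VmRSS:` fields.

```python
import os


def smbd_memory(proc_root="/proc", name="smbd"):
    """Return (process_count, total_rss_kb) for processes called `name`.

    Hypothetical helper: reads <proc_root>/<pid>/status and relies on the
    standard Linux "Name:" and "VmRSS:" fields.
    """
    count = 0
    total_kb = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        status_path = os.path.join(proc_root, entry, "status")
        try:
            with open(status_path) as fh:
                fields = dict(line.split(":", 1) for line in fh if ":" in line)
        except OSError:
            continue  # process went away between listing and opening
        if fields.get("Name", "").strip() != name:
            continue
        count += 1
        # VmRSS can be missing (for example for zombies); count that as 0 kB.
        total_kb += int(fields.get("VmRSS", "0 kB").split()[0])
    return count, total_kb
```

Calling `smbd_memory()` from an hourly cron job and logging the two numbers would show whether it is the number of smbd processes, their memory, or both that climbs before a hang.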
On Wed, 29 Jun 2005 09:00:01 -0500 Matt Lung <matt.lung@midwest-tool.com> wrote:

> In the past few months we have seen a couple of our production servers
> crash that are running samba-3.0.14a-1. What seems to happen is the
> smbd pid's seem to grow and grow until an out of memory error occurs and
> the smbd process just hangs. A simple restart does not even clear out
> the pid's. I end up rebooting to just clear it all out and start
> fresh. It probably has happened at least once a month for the past 3
> months.

It happens here on FC3 x86_64, samba 3.0.15pre, every day; after a week there are more than 1000 smbd processes. The first 'service smb stop' does not kill all of them, but the 2nd does. I see in the log files:

*** glibc detected *** free(): invalid pointer: 0x00002aaaab090580 ***

Regards,
Nerijus
Nerijus Baliunas wrote:

> On Wed, 29 Jun 2005 13:49:18 -0500 Matt Lung <matt.lung@midwest-tool.com> wrote:
>
>> hmm... even a 2nd restart does not help for me. we don't have quite
>> that many processes stuck, but it is sure getting annoying though. Have
>> you had any luck trying to fix it?
>
> No, but I said 2nd stop, not restart.

sure.. thanks for sharing your experience though. At least maybe others having the same problem will see this and say something, or one of the team members may have some input on this.

--
Matt Lung
Midwest Tool & Die, Corp.
I've posted a couple of times about smbd processes that crash at random. It happened again last night. I was able to restart samba via SWAT and get it back up and going.

I've noticed a couple of new things going on. Even after I got it to restart, there are still a bunch of PIDs associated with smbd that get listed when doing an smb status command. These are PIDs that are not even associated with active connections anymore. They seem to be eternally stuck until a reboot.

I have also noticed that on our PDC, the IP address of this member server is listed in the active connections. It is the only IP address in the active connections list, and the only server listed there. That really doesn't make sense to me. It is almost acting like a client instead of a domain member server.

Right now I'm just trying to look for answers as to why this may be happening. Any suggestions, thoughts, etc.?

Thanks

--
Matt Lung
Midwest Tool & Die, Corp.
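One way to pin down which smbd PIDs are the stuck ones is to compare the full list of smbd PIDs against the PIDs that Samba itself reports as having a session. The sketch below is only an illustration under stated assumptions: it assumes the session list is the text printed by `smbstatus -p`, and that each session line in that text starts with a numeric PID. The function names `session_pids` and `unattached_pids` are hypothetical, and the layout should be checked against the real output on the affected server.

```python
def session_pids(smbstatus_text):
    """Collect PIDs from session-list text.

    Assumption: every session line begins with a numeric PID, and header
    or separator lines do not begin with a number.
    """
    pids = set()
    for line in smbstatus_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].isdigit():
            pids.add(int(tokens[0]))
    return pids


def unattached_pids(smbd_pids, smbstatus_text):
    """Return, sorted, the smbd PIDs that have no entry in the session list."""
    return sorted(set(smbd_pids) - session_pids(smbstatus_text))
```

The parent smbd that listens for new connections has no session of its own, so it will normally show up in the result; any other PID that stays in the list across several runs is a candidate for one of the stuck processes.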