thr3ads.net - samba - HELP: Connections dropping whilst processes increasing. [Nov 1999]

Martin Rootes

1999-Nov-15 17:32 UTC

HELP: Connections dropping whilst processes increasing.

I'm running Samba 2.0.4b on an E450 running Solaris 7, which acts as a
disk-space server for students. Normally everything runs quite smoothly,
with around 800 to 900 connections. Every now and again, though, students
start seeing their connections dropped, and others cannot get a connection
when they log in. Looking at the system shows no obvious problems, except
for the error messages below in log.smb and the fact that the number of
smbd processes rises to unfeasible levels (I have seen as many as 5000).
Does anyone have any idea what may be causing this, or how to diagnose the
problem?

	Martin Rootes
	Systems Support
	Sheffield Hallam University

[1999/11/10 14:56:08, 0] smbd/oplock.c:request_oplock_break(996)
  request_oplock_break: no response received to oplock break request to pid 7118 on port 51376 for dev = 2580006, inode = 8385207
  for dev = 2580006, inode = 8385207, tv_sec = 382985c5, tv_usec = 798c7
[1999/11/10 14:56:08, 0] lib/util_sock.c:client_addr(889)
  getpeername failed. Error was Transport endpoint is not connected
[1999/11/10 14:56:08, 0] lib/util_sock.c:write_data(415)
  write_data: write failure. Error = Broken pipe
[1999/11/10 14:56:08, 0] lib/util_sock.c:write_socket(191)
  write_socket: Error writing 4 bytes to socket 7: ERRNO = Broken pipe
[1999/11/10 14:56:08, 0] lib/util_sock.c:send_smb(606)
  Error writing 4 bytes to client. -1. Exiting
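
One way to see which code paths dominate a large log.smb is to tally the header lines by source file and function. The following is only a minimal sketch in Python: the function name tally_log_sources is made up for illustration, and the regular expression assumes every entry header has exactly the layout shown in the excerpt above ([date time, level] file:function(line)).

```python
import re
from collections import Counter

# Assumed header layout, taken from the excerpt above:
#   [1999/11/10 14:56:08, 0] smbd/oplock.c:request_oplock_break(996)
HEADER = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}),\s*(\d+)\]\s+(\S+):(\w+)\((\d+)\)"
)


def tally_log_sources(text):
    """Count log entries per 'file:function' (hypothetical helper)."""
    counts = Counter()
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            # group(3) is the source file, group(4) the function name
            counts[match.group(3) + ":" + match.group(4)] += 1
    return counts


# The sample is the excerpt from this post, with wrapped lines rejoined.
SAMPLE = """\
[1999/11/10 14:56:08, 0] smbd/oplock.c:request_oplock_break(996)
  request_oplock_break: no response received to oplock break request to pid 7118 on port 51376 for dev = 2580006, inode = 8385207
  for dev = 2580006, inode = 8385207, tv_sec = 382985c5, tv_usec = 798c7
[1999/11/10 14:56:08, 0] lib/util_sock.c:client_addr(889)
  getpeername failed. Error was Transport endpoint is not connected
[1999/11/10 14:56:08, 0] lib/util_sock.c:write_data(415)
  write_data: write failure. Error = Broken pipe
[1999/11/10 14:56:08, 0] lib/util_sock.c:write_socket(191)
  write_socket: Error writing 4 bytes to socket 7: ERRNO = Broken pipe
[1999/11/10 14:56:08, 0] lib/util_sock.c:send_smb(606)
  Error writing 4 bytes to client. -1. Exiting
"""

if __name__ == "__main__":
    for source, count in tally_log_sources(SAMPLE).most_common():
        print(count, source)
```

Run over a whole day's log, a sudden jump in request_oplock_break entries relative to the others would at least show whether the oplock timeouts precede the process build-up or follow it.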

------------------------------------------------------------------------------
Martin Rootes - Senior Systems Programmer/Analyst, Sheffield Hallam University
Email :         M.Rootes@shu.ac.uk
------------------------------------------------------------------------------

Cliff Green

1999-Nov-20 19:18 UTC


HELP: Connections dropping whilst processes increasing.

Did anyone ever give an answer for this problem?  

We've been experiencing something very similar to Martin Rootes' problem,
on an HP9000 K-series server, with anywhere from hundreds to thousands of
extra, unkillable smbd processes.  The odd thing is, the system load goes
*very* high, but it doesn't seem to affect anything other than further
smbd services, including preventing successful logons.

It's odd - it only happens on that one server (we run Samba on five
production servers), and there are few differences between that host and
the others.  As you can imagine, I really need to determine if our problem
is that there's something wrong with Samba, or if this is due to either
the other processes on that server or something different about the
clients that predominantly use that server.

Very unfortunately, the only way to get rid of those hundreds to thousands
of extra processes is to restart the server.  An increasingly unacceptable
solution.

My management and the support staffer on that campus believe that Samba is
the problem, because it displays this behavior (difficulty logging in, and
enormous numbers of unkillable smbd processes).  I believe it's something
else, but need to prove it.
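
A record of how the smbd count grows over time, set against what else the server is doing, is one kind of evidence that could help here. The sketch below is Python and rests on several assumptions: the helper names count_commands and sample_smbd_count are made up, and it assumes a ps that accepts "-e -o comm" (on HP-UX that typically means running ps with the UNIX95 environment variable set).

```python
import subprocess
from collections import Counter


def count_commands(ps_output):
    """Count processes per command name from 'ps -e -o comm' style output."""
    counts = Counter()
    for line in ps_output.splitlines():
        name = line.strip()
        if not name or name == "COMMAND":  # skip blanks and the header line
            continue
        # Some systems print the full path; keep only the last component.
        counts[name.rsplit("/", 1)[-1]] += 1
    return counts


def sample_smbd_count():
    """Run ps once and return the current number of smbd processes.

    Assumes 'ps -e -o comm' is supported on this host.
    """
    result = subprocess.run(
        ["ps", "-e", "-o", "comm"],
        capture_output=True, text=True, check=True,
    )
    return count_commands(result.stdout)["smbd"]


if __name__ == "__main__":
    print("smbd processes:", sample_smbd_count())
```

Calling sample_smbd_count() from cron every minute and appending the result with a timestamp to a file would show whether the growth is sudden or gradual, and whether it lines up with anything else scheduled on that host.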

Let's see, the only configuration options were --prefix, --with-quotas,
and --with-mmap (which I guess we'll stop using Real Soon).

The logon script mounts the user's home directory, a shared directory,
sets the time, and some antiviral housekeeping.

Help!

Anything anyone's found or any insights will be helpful.

We require domain logons, and they've worked fine for a couple of years
now (from Samba 1.9.18 to now - 2.0.5a [don't suggest upgrading to 2.0.6 -
I've got a major problem there on a testbed system]), Win95 & plain
passwords, logon scripts generated by rootpreexec calling a perl script in
[netlogon].  I believe the campus with the problem has a persistent share
defined on the clients, and I know that's not the case for the other
campuses.  log.smb on that campus shows *no* entries for "connect to
service netlogon", but many "closed connection to service netlogon", which
should not be happening.
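
The connect/close imbalance described above can be checked mechanically. This is a rough Python sketch, not a tested tool: service_balance is a made-up name, and it matches only the two phrases quoted above, so it assumes those phrases appear verbatim in each relevant log line.

```python
def service_balance(log_text, service):
    """Return (connects, closes) for one service name (hypothetical helper).

    Matches only the two phrases quoted in the post; the rest of each
    log line is ignored.
    """
    opened = "connect to service " + service
    closed = "closed connection to service " + service
    connects = closes = 0
    for line in log_text.splitlines():
        if closed in line:
            closes += 1
        elif opened in line:
            connects += 1
    return connects, closes


if __name__ == "__main__":
    import sys

    # Usage: python balance.py /path/to/log.smb   (path is an assumption)
    with open(sys.argv[1], errors="replace") as handle:
        connects, closes = service_balance(handle.read(), "netlogon")
    print("connects:", connects, "closes:", closes)
```

Running the same check against log.smb on one of the healthy servers would show whether closes without matching connects are really unique to the problem campus.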

On the other hand, that server began running both Oracle and OpenView for
network monitoring and management a few months before these problems
started to appear.

I didn't want to shower you all with log details, my smb.conf file, or the
logonscript (of course, I'll provide info if it'll help) - but can
*anyone* provide some advice, insight, or <gasp> solutions?

c
-- 
Clifford Green               Internet -  green@umdnj.edu
Academic Computing Services     voice -     732-235-5250
UMDNJ-IST                         fax -     732-235-5252

Paulo Afonso Graner Fessel

1999-Nov-22 19:38 UTC


HELP: Connections dropping whilst processes increasing.

On Sat, 20 Nov 1999, Cliff Green (green@UMDNJ.EDU) wrote:
> Did anyone ever give an answer for this problem? 
	I'm getting EXACTLY the same problem here, with Red Hat 6.0 + custom
Linux Kernel 2.2.13 (with e2comp patch basically).
> We've been experiencing something very similar to Martin Rootes' problem,
> on an HP9000 K-series server, with anywhere from hundreds to thousands of 
> extra, unkillable smbd processes. The odd thing is, the system load goes 
> *very* high, but it doesn't seem to affect anything other than further 
> smbd services, including preventing successful logons. 
	Same thing here - I can do whatever I want, except use Samba
services. Telnet, httpd, LDAP, etc.: everything but Samba works OK.
In addition, the problem is easily reproducible on one station: it's only
necessary to open a Word document, make one or two modifications and save
it twice. The first save works OK; the second locks the machine and drives
the user's original smbd mad, making the server spawn two or three other
processes for the same user.
 > It's odd - it only happens on that one server (we run Samba on five 
> production servers), and there are few differences between that host and 
> the others. As you can imagine, I really need to determine if our problem 
> is that there's something wrong with Samba, or if this is due to either
> the other processes on that server or something different about the 
> clients that predominantly use that server.
	How many users use this specific server, and how do they use it? What
applications are involved on the client side (Word, Excel, xBASE apps...)
and on the server side (daemons?)?

 > Very unfortunately, the only way to get rid of those hundreds to thousands 
> of extra processes is to restart the server. An increasingly unacceptable 
> solution. 
	I repeat, it's EXACTLY the same way here. Unacceptability considerations
included. =:-0
> My management and the support staffer on that campus believe that Samba is 
> the problem, because it displays this behavior (difficulty logging in, and 
> enormous numbers of unkillable smbd processes). I believe it's something
> else, but need to prove it.
	Hmmm... How is this server connected to the stations that show the
problem? Here I think that our problem may be our switch (a 3Com SuperStack
1000 with OLD firmware and low-capacity buffers, since it's a workgroup
switch and not a backbone switch). I say this because I'm observing
*collisions* on ports that are dedicated to the *server* and *workstations*
(no hubs involved).
 > Let's see, the only configuration options were --prefix, --with-quotas,
> and --with-mmap (which I guess we'll stop using Real Soon). 
	If I'm not mistaken, mmap support is disabled by default in the current
(and not-so-current) versions of Samba, so I think it's not an issue
(unless you have enabled it explicitly).
 > The logon script mounts the user's home directory, a shared directory, 
> sets the time, and some antiviral housekeeping. 
	I don't have logon scripts here. I map the drives using Network
Neighborhood. Hmmm, we also have antiviral software running (McAfee
VirusScan); what's yours?
 > Help! 
> 
> Anything anyone's found or any insights will be helpful. 
	I've found "window frozen" problems and acknowledge-time problems ("acks
too long") between station and server. The first is a sign of buffer
exhaustion, so I'll be setting up a separate switchless network for us on a
separate interface on the server, in addition to the "usual" network
interface that will remain connected to the switch. I'll put one smbd
listening on each interface. Then if the smbd bound to the interface
connected to the switch locks up, I'll know that the switch is the problem.
> log.smb on that campus shows *no* entries for "connect to 
> service netlogon", but many "closed connection to service netlogon", which
> should not be happening. 
	I'm not sure, but it *seemed* to be happening here too - I'll check
out.
> On the other hand, that server began running both Oracle and OpenView for 
> network monitoring and management a few months before these problems 
> started to appear. 
	I don't think this is the problem, as I run neither of these here
and I have the problem too. BUT...

	...hmm, I'm running snmpd here (and actually it's basically useless),
and I think you are too, since I assume this server of yours is
SNMP-manageable. Or not?
> I didn't want to shower you all with log details, my smb.conf file, or the
> logonscript (of course, I'll provide info if it'll help) - but can 
> *anyone* provide some advice, insight, or <gasp> solutions? 
	I'm looking for solutions, also. Unfortunately, I still don't have any
concrete answers. But if you could check these points, it would be
interesting to see whether there are other similarities (besides the
problem itself).

	P.
 
 

-- 
"The one that doesn't run the risk doesn't snap"

(Millôr, "Lições de Inglês Audiovisual", Pasquim nº117)

don_mccall@hp.com

1999-Nov-22 20:20 UTC


HELP: Connections dropping whilst processes increasing.

Hi Cliff,
If you have dde on your HP-UX system you could try attaching to some of these
runaway smbd processes and getting a stack trace to see WHAT they are doing;
perhaps catch them forking and see what they were trying to do at the time.
The command syntax for dde would be:

dde -ui line -attach <PID of the smbd process> /...path to smbd/smbd

(-ui line puts it in line mode so you don't have to deal with the X Windows
GUI over a modem)

dde 'stops' the process and gives you a prompt, where you can type 'tb' to get
a traceback of the routines called to get to wherever the stop occurred.  You
can then type "go" to start the process running where it left off, press
<Ctrl-C> to interrupt it again, and do another 'tb' to get another stack, etc.,
to see if you are in a loop, and what you are executing so frantically...  May
help, may not...


I suppose you have already tried turning on a higher level of debugging to see 
if any useful debug statements are being generated during this runaway 
condition, but if not, that would be useful as well....

I haven't seen this behavior on my 11.0 system at this time, but it's a diag
system, and not very heavily loaded...

Hope this helps,
Don
