thr3ads.net - samba - Severe problem with Samba [Dec 2001]

Martin Rootes

2001-Dec-13 10:13 UTC

Severe problem with Samba

Dear All,

We are experiencing severe problems with Samba 2.2.0 (with quota support)
running on a dual-processor (400MHz) Sun E450 running Solaris 2.7. This is
used as a central file server for student disk space, accessed by approx.
1200 PCs running NT 4. Until recently we experienced what we assume to be
loading issues, with connections being slow during the middle of the day.
However, recently we have been encountering severe problems. Everything seems
fine until midday; then the number of smbd processes starts going up whilst
the number of connections (determined from smbstatus -b) drops, students with
connections start getting slow responses, no new connections are made, and
the load on the system skyrockets. Stopping and restarting Samba seems to
cure the problem, but it can recur. We are in a desperate panic at the
moment, as the students are all doing assignments and this is seriously
affecting their work. We have tried various tweaks to Samba (deadtime, change
notify timeout) and to the TCP stack, and have tripled system memory, all to
no avail. We also seem to have an issue with keepalives and tcp_nodelay,
neither of which seems to work at all. We see the following messages in the
log about them:

[2001/12/13 11:55:29, 0] lib/util_sock.c:set_socket_options(165)
  Failed to set socket option SO_KEEPALIVE (Error Invalid argument)
[2001/12/13 11:55:29, 0] lib/util_sock.c:set_socket_options(165)
  Failed to set socket option TCP_NODELAY (Error Invalid argument)
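
For readers unfamiliar with the two options named in these messages, the
following is a minimal Python sketch of what enabling SO_KEEPALIVE and
TCP_NODELAY on a connected TCP socket looks like. It is not Samba's code; the
helper names enable_keepalive_and_nodelay and demo are hypothetical, and the
loopback connection only stands in for a client session. On some platforms
setsockopt can fail with "Invalid argument" when the peer has already gone
away; whether that is what happened on this server is an assumption, not
something the thread establishes.

```python
import socket


def enable_keepalive_and_nodelay(sock: socket.socket) -> dict:
    """Try to enable both options; return {option_name: OSError} for failures."""
    errors = {}
    options = (
        ("SO_KEEPALIVE", socket.SOL_SOCKET, socket.SO_KEEPALIVE),
        ("TCP_NODELAY", socket.IPPROTO_TCP, socket.TCP_NODELAY),
    )
    for name, level, option in options:
        try:
            sock.setsockopt(level, option, 1)
        except OSError as exc:
            # A server would typically log this, much like the messages above.
            errors[name] = exc
    return errors


def demo() -> tuple:
    """Open a loopback TCP connection and set the options on the accepted side."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    try:
        errors = enable_keepalive_and_nodelay(server_side)
        keepalive = server_side.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
        nodelay = server_side.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
        return errors, keepalive, nodelay
    finally:
        for s in (server_side, client, listener):
            s.close()


if __name__ == "__main__":
    print(demo())
```

On a healthy connection the errors dictionary should come back empty and both
options should read back as non-zero.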

The following are a selection of messages appearing just before Samba was
stopped:

[2001/12/13 11:39:51, 0] lib/util_sock.c:write_socket(566)
  write_socket: Error writing 4 bytes to socket 12: ERRNO = Broken pipe
[2001/12/13 11:39:51, 0] lib/util_sock.c:send_smb(753)
  Error writing 4 bytes to client. -1. Exiting
[2001/12/13 11:40:29, 0] lib/util_sock.c:get_socket_addr(1084)
  getpeername failed. Error was Transport endpoint is not connected
[2001/12/13 11:40:30, 0] lib/util_sock.c:get_socket_addr(1084)
  getpeername failed. Error was Transport endpoint is not connected
[2001/12/13 11:40:30, 0] lib/util_sock.c:write_socket_data(542)
  write_socket_data: write failure. Error = Broken pipe
[2001/12/13 11:40:30, 0] lib/util_sock.c:write_socket(566)
  write_socket: Error writing 4 bytes to socket 12: ERRNO = Broken pipe
[2001/12/13 11:40:30, 0] lib/util_sock.c:send_smb(753)
  Error writing 4 bytes to client. -1. Exiting
[2001/12/13 11:40:30, 0] lib/util_sock.c:write_socket_data(542)
  write_socket_data: write failure. Error = Broken pipe
[2001/12/13 11:40:30, 0] lib/util_sock.c:write_socket(566)
  write_socket: Error writing 4 bytes to socket 12: ERRNO = Broken pipe
[2001/12/13 11:40:30, 0] lib/util_sock.c:send_smb(753)
  Error writing 4 bytes to client. -1. Exiting
[2001/12/13 11:40:33, 0] lib/util_sock.c:read_socket_data(479)
  read_socket_data: recv failure for 4. Error = Connection reset by peer
[2001/12/13 11:40:49, 0] smbd/server.c:open_sockets(251)
  open_sockets: accept: Software caused connection abort
[2001/12/13 11:40:53, 0] lib/util_sock.c:read_socket_data(479)
  read_socket_data: recv failure for 4. Error = Connection reset by peer
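
When a log fills with entries like these, a quick tally by reporting function
can show which failure dominates. The sketch below assumes only the two-line
layout visible in the excerpt above (a bracketed timestamp and debug level,
then file:function(line), followed by an indented message). The names HEADER
and tally_by_function are hypothetical, and the sample is copied from the
excerpt rather than read from a real log file.

```python
import re
from collections import Counter

# Header line layout as seen in the excerpt:
# [2001/12/13 11:39:51, 0] lib/util_sock.c:write_socket(566)
HEADER = re.compile(
    r"^\[(?P<stamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}), *(?P<level>\d+)\] "
    r"(?P<file>[^:]+):(?P<func>\w+)\((?P<line>\d+)\)$"
)


def tally_by_function(lines):
    """Count log entries per reporting function, skipping message lines."""
    counts = Counter()
    for line in lines:
        match = HEADER.match(line.strip())
        if match:
            counts[match.group("func")] += 1
    return counts


SAMPLE = """\
[2001/12/13 11:39:51, 0] lib/util_sock.c:write_socket(566)
  write_socket: Error writing 4 bytes to socket 12: ERRNO = Broken pipe
[2001/12/13 11:39:51, 0] lib/util_sock.c:send_smb(753)
  Error writing 4 bytes to client. -1. Exiting
[2001/12/13 11:40:29, 0] lib/util_sock.c:get_socket_addr(1084)
  getpeername failed. Error was Transport endpoint is not connected
[2001/12/13 11:40:33, 0] lib/util_sock.c:read_socket_data(479)
  read_socket_data: recv failure for 4. Error = Connection reset by peer
""".splitlines()

if __name__ == "__main__":
    for func, count in tally_by_function(SAMPLE).most_common():
        print(func, count)
```

To run it against a real log, pass an open file object instead of SAMPLE; the
function only iterates over lines.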

We think we may have loading problems; however, if so, the load doesn't seem
to be directly proportional to the number of connections. In fact there is a
significant rise in the load at, and for 10 - 15 mins past, the hour (this is
all day long, not just midday); we assume that this is because logging in
exacts a high load on the system. It's also possible that the midday problems
are caused by different patterns of working, as students will be logging in
for short periods to check e-mail before going to get lunch etc. Another
oddity we see is some Samba connections left running from the day before (or
sometimes longer), so we are wondering whether connections are not getting
killed properly, thereby adding to the load.
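
One rough way to check the "at, and for 10 - 15 mins past, the hour" pattern
is to bucket log timestamps by minutes past the hour. The sketch below is only
a proxy: it counts log entries, not system load, and real load figures (which
the thread does not include) would be needed to confirm anything. The function
name minutes_past_hour_histogram and the five-minute bin width are
assumptions; the sample timestamps are taken from the log excerpts in this
message.

```python
from collections import Counter
from datetime import datetime


def minutes_past_hour_histogram(stamps, bin_minutes=5):
    """Count timestamps per bin of minutes past the hour (0, 5, 10, ...)."""
    counts = Counter()
    for stamp in stamps:
        when = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
        # Round the minute down to the start of its bin.
        counts[(when.minute // bin_minutes) * bin_minutes] += 1
    return dict(sorted(counts.items()))


SAMPLE_STAMPS = [
    "2001/12/13 11:39:51",
    "2001/12/13 11:40:29",
    "2001/12/13 11:40:30",
    "2001/12/13 11:40:33",
    "2001/12/13 11:40:49",
    "2001/12/13 11:40:53",
    "2001/12/13 11:55:29",
]

if __name__ == "__main__":
    for start, count in minutes_past_hour_histogram(SAMPLE_STAMPS).items():
        print(f"{start:02d}-{start + 4:02d} min past the hour: {count}")
```

Run over a full day's log, a cluster in the 0 to 15 minute bins would be
consistent with the login-load theory, though it would not prove it.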

So, please, any pointers as to what the problem is would be very helpful. At
the moment we're struggling, I'm considering getting a less stressful job -
something like a fork lift truck driver in an explosives factory - and people
are starting to question whether we should replace the whole system with a
Novell based one!

	Thanks in advance

	Martin Rootes
	Systems Support


------------------------------------------------------------------------------
Martin Rootes - Senior Systems Programmer/Analyst, Sheffield Hallam University
Email :         M.J.Rootes@shu.ac.uk                      Phone: 0114 225 3828
------------------------------------------------------------------------------

Lonny Schwartz

2001-Dec-13 10:38 UTC

Severe problem with Samba

You may want to check out this article from Sysadmin Mag about Solaris
performance tuning; it seems like some of it may apply to your situation.

http://www.samag.com/documents/s=1323/sam0110e/0110e.htm

Also this site, which is referenced in the article:

http://www.sean.de/Solaris/tune.html

Cheers,

Lonny


Charles Marcus

2001-Dec-13 13:24 UTC

File-locking issues with Shared DB files fixed?

Can anyone confirm (or deny) that the problems with file-locking when
accessing a shared database (ACT, Access, FoxPro, etc) are fixed yet?  This
is the only thing preventing me from killing our last Win2K Server.

Thanks guys, for all the hard work - Samba rocks (other than this problem)!

Charles

Jeremy Allison

2001-Dec-13 15:18 UTC

Severe problem with Samba

On Thu, Dec 13, 2001 at 06:08:03PM +0000, Martin Rootes wrote:
> [...]
> We also seem to have an issue with keepalives and tcp_nodelay, neither of
> which seem to work at all, we see the following messages in the log about
> keepalives:-

We think we've solved these in the latest Samba 2.2.x CVS tree. Unfortunately
this isn't released as "stable" 2.2.3 code yet (getting close though). If
you'd like to test this, the CVS branch is SAMBA_2_2. It has been confirmed
to fix this problem on other Solaris and HPUX boxes.

Jeremy.

Martin Rootes

2001-Dec-18 07:04 UTC

Severe problem with Samba

Thanks Jeremy, I'll compile it up and test it out.

	Martin.

