thr3ads.net - samba - [Samba] Occasional problem with hanging SMB mounts. [Jul 2002]

Kris Kelley

2002-Jul-03 09:04 UTC

[Samba] Occasional problem with hanging SMB mounts.

Hello everyone.

I have two linux servers* that each have four directories mounted as
SMBFS shares from a Windows 2000 Server.  For the most part, this set-up
is working great, but there have been occasional hiccups.

Every so often, one of the servers, LINUX-ONE, logs a couple of
Samba-related errors.  The following is an example:

   Jun 30 06:30:45 mx-two kernel: smb_trans2_request: result=-104,
      setting invalid
   Jun 30 06:30:45 mx-two kernel: smb_retry: successful, new pid=553,
      generation=25

I believe this is due to the network connection used by the relevant
SMBFS mount shutting down because of inactivity, and then being
re-established when the mount is accessed.  I haven't been worrying
about these messages, but recently I have encountered bigger
issues.  Sometimes when one of these "disconnects" occurs, the
connection isn't reestablished when it should be.  When this
happens, the mount hangs, and all processes trying to access the mount
are blocked, resulting in a high load average.  Anywhere from two to
fifteen minutes after I've discovered the problem, it clears up on its
own, and I see messages like these in the syslog:

   Jul  1 13:52:57 mx-two kernel: smb_get_length: recv error = 110
   Jul  1 13:52:57 mx-two kernel: smb_trans2_request: result=-110,
      setting invalid
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/s failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/c failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/s failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/c failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/s failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/s failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/a failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/j failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/c failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/w failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/w failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_lookup: find archive/s failed,
      error=-5
   Jul  1 13:52:57 mx-two kernel: smb_retry: successful, new pid=553,
      generation=44

Why does this happen?  Is this a known issue with Samba 2.2.4?

Also, I have yet to see any of these events on the other linux server,
LINUX-TWO.  I believe this is because LINUX-TWO has processes running on
it that hit these mounts every five seconds, and so there is never an
opportunity for the underlying network connections to become inactive.
Does this make sense?  If so, all I have to do to work around the
problem on LINUX-ONE is set up a script that periodically pings the
mounts, perhaps running an "ls" command, every so often, correct?  If
that is true, I need to know what the inactivity time-out limit is, so
this script doesn't have to run more often than necessary.

If this is covered in documentation somewhere, feel free to point me in
that direction.  Otherwise, any help will be greatly appreciated.
Thanks!

---Kris Kelley

* Red Hat 7.1, kernel 2.4.9-34 (supplied by Red Hat), Samba 2.2.4
  (compiled from source with all defaults, except smbmount support was
  enabled)

Rashkae

2002-Jul-03 09:29 UTC

It looks to me as though Samba is having trouble finding the
W2K server in question when it wants to re-connect.  The default
   name resolve order = lmhosts host wins bcast
might be the cause of your problem, since host (normal unix DNS
lookup) is tried before a wins search or bcast.  If
you have a properly configured wins server used by all the network
nodes, consider placing wins before host.  Alternatively, add the
computer names of your W2K Servers to your /etc/hosts file for quick
host lookups.
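
As a rough sketch of the two options above (not tested against this
setup): the first fragment moves wins ahead of host in smb.conf, and the
second adds a hosts entry.  The address 192.168.1.10 and the name
w2kserver are made up for illustration; substitute your own.

```
# smb.conf: try WINS before the normal host (DNS) lookup
[global]
    name resolve order = lmhosts wins host bcast

# /etc/hosts: hypothetical address and name for the W2K server
192.168.1.10    w2kserver
```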

You can also change the behaviour of Windows 2000 via Registry to change
the connection timeout values or disable them altogether.  I don't
have those registry keys on hand.


In the likely event that I've completely misunderstood the problem,
I apologize in advance.

Urban Widmark

2002-Jul-03 13:03 UTC

On Wed, 3 Jul 2002, Kris Kelley wrote:
> Every so often, one of the servers, LINUX-ONE, logs a couple of
> Samba-related errors.  The following is an example:
> 
>    Jun 30 06:30:45 mx-two kernel: smb_trans2_request: result=-104,
>       setting invalid
>    Jun 30 06:30:45 mx-two kernel: smb_retry: successful, new pid=553,
>       generation=25
Those are not really errors ...

The first is saying that it detected that the tcp connection to the server
was gone when trying to send. This is normal; smb servers like to do that.
The second message is saying that smbmount reconnected to the server and 
everything is ok.
> issues.  Sometimes when one of these "disconnects" occurs, the
> connection isn't always reestablished when it should be.  When this
> happens, the mount hangs, and all processes trying to access the mount
> are blocked, resulting in a high load average.  Anywhere from two to
> fifteen minutes after I've discovered the problem, it clears up on its
> own, and I see messages like these in the syslog:
>    Jul  1 13:52:57 mx-two kernel: smb_get_length: recv error = 110
-110 is "Connection timed out" (/usr/include/asm/errno.h), which is
interesting. I don't recall having seen that from anyone. But I forget.
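
For anyone who wants to decode these numbers without digging through
errno.h, here is a small sketch using Python's standard errno module.
The helper name decode is just for illustration.  The kernel logs the
codes as negative values, so the sign is dropped first.  On Linux, 104
should come out as ECONNRESET, 110 as ETIMEDOUT and 5 as EIO; other
platforms may number them differently.

```python
import errno
import os


def decode(code):
    """Return (symbolic name, message) for a kernel-style negative errno."""
    n = abs(code)
    return errno.errorcode.get(n, "UNKNOWN"), os.strerror(n)


# The three codes that appear in the syslog excerpts in this thread.
for code in (-104, -110, -5):
    name, text = decode(code)
    print(f"{code}: {name} ({text})")
```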

The current smbfs version is completely single threaded on one mount and
while one process is sending (and receiving) no one else can do anything.
This is old code from 2.1.something (or 2.0?) when all of the kernel was
like that.

What has probably happened is that one request has attempted to send
something. It fails, but apparently the time it takes for a -110 failure
is a lot longer than for a -104. Because of the single thread issue nothing
happens while this is waiting so you get high load.

When the request finally fails all the queued up requests get through,
only to find the tcp socket closed (-5 = I/O error), until smbmount again
manages to reconnect.
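
To illustrate the effect (this is a userspace toy, not the smbfs code),
the sketch below uses one lock to stand in for the per-mount
serialization.  One "slow" request holds the lock, as a request waiting
on a timeout would, and every other request has to wait for about as
long even though it has no work of its own.  All names here are invented
for the example.

```python
import threading
import time

mount_lock = threading.Lock()  # stands in for "one request at a time per mount"
results = []


def request(name, duration_s):
    """Take the mount lock, 'work' for duration_s, record total elapsed time."""
    start = time.monotonic()
    with mount_lock:
        time.sleep(duration_s)
    results.append((name, time.monotonic() - start))


def simulate(slow_s=0.5, n_waiters=3):
    """Run one slow request plus n_waiters instant ones; return elapsed times."""
    results.clear()
    slow = threading.Thread(target=request, args=("slow", slow_s))
    slow.start()
    time.sleep(0.05)  # give the slow request time to take the lock first
    waiters = [
        threading.Thread(target=request, args=(f"w{i}", 0.0))
        for i in range(n_waiters)
    ]
    for t in waiters:
        t.start()
    for t in [slow] + waiters:
        t.join()
    return dict(results)


if __name__ == "__main__":
    for name, elapsed in sorted(simulate().items()):
        print(f"{name}: waited {elapsed:.2f}s")
```

The waiters do nothing themselves, yet each one is stuck for roughly the
duration of the slow request, which is what shows up as blocked
processes and a high load average.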

The long delay points to another problem with the current smbfs socket
code. It lets the network select the length of a timeout. Patches for this
exist for different 2.4 and 2.2 kernels that set the timeout for any
operation to 30 seconds (user configurable). I plan to get that into 2.4.20.
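
The idea of a fixed, caller-chosen timeout can be sketched in userspace
with an ordinary socket.  This is not the kernel patch, only an
illustration of the principle: the receive gives up after timeout_s
seconds instead of waiting for however long the network takes to report
a failure.  The function name and the None return value are choices made
for this example.

```python
import socket


def recv_with_deadline(sock, nbytes, timeout_s=30.0):
    """Receive up to nbytes, or return None if nothing arrives in timeout_s."""
    sock.settimeout(timeout_s)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # The caller decides what to do next, e.g. drop and reconnect.
        return None


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(recv_with_deadline(a, 16, timeout_s=0.2))  # nothing was sent yet
    b.sendall(b"hello")
    print(recv_with_deadline(a, 16, timeout_s=0.2))
    a.close()
    b.close()
```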

There is a more advanced version that should let people always interrupt
processes that are sleeping while accessing smbfs and not be single
threaded and thus faster with multiple accesses ... for 2.5, eventually.

> Why does this happen?  Is this a known issue with Samba 2.2.4?
Yes, but it is a known issue with the kernel smbfs code; it has nothing to
do with samba.

> Also, I have yet to see any of these events on the other linux server,
> LINUX-TWO.  I believe this is because LINUX-TWO has processes running on
> it that hit these mounts every five seconds, and so there is never an
> opportunity for the underlying network connections to become inactive.
> Does this make sense?  If so, all I have to do to work around the
> problem on LINUX-ONE is set up a script that periodically pings the
> mounts, perhaps running an "ls" command, every so often, correct?
Yes. There will eventually be similar code inside smbfs to do whatever it
needs to keep the connection up while mounted.
> If that is true, I need to know what the inactivity time-out limit is, so
> this script doesn't have to run more often than necessary.
It's a server side setting. I believe NT (and win2k?) uses:

HKLM\System\CurrentControlSet\Services\LanmanServer\Parameters\autodisconnect
-1 to 65535 minutes

I think the default is something like 10 minutes, so running the script
every 5 minutes sounds good.
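
A minimal sketch of such a keep-alive script, assuming the 5 minute
interval suggested above.  The mount point /mnt/archive is only a
placeholder, and the function names are invented for this example.
Listing the directory plays the role of the "ls".  Note that if the
mount is already hung, the listing will block just like any other
process touching it, so this only helps to keep a healthy connection
from going idle.

```python
import os
import time


def touch_mount(path):
    """List the directory once; return True if the mount answered."""
    try:
        os.listdir(path)
        return True
    except OSError:
        return False


def keep_alive(paths, interval_s=300, rounds=None):
    """Touch every path, sleep interval_s, repeat (forever if rounds is None)."""
    done = 0
    while rounds is None or done < rounds:
        for p in paths:
            if not touch_mount(p):
                print(f"warning: could not list {p}")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)


# Example use (placeholder path, runs until killed):
#     keep_alive(["/mnt/archive"])
```

Running it from cron every five minutes with rounds=1 would do the same
job without a long-lived process.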

/Urban
