thr3ads.net - samba - Samba can't keep NT shares mounted [Sep 1999]

Matthew Vanecek

1999-Sep-19 17:56 UTC

Samba can't keep NT shares mounted

I don't know who this problem belongs to, samba or Linux, but it's a
common one, judging by the responses in my INBOX to a previous post. 
Plus, it's a fairly serious one; many applications require smb shares to
be mounted continuously and without interruption.  It's been a recurring
problem which seems to have been pushed off to the side as not
important, even though it's a serious flaw.  Just where, I don't know
(i.e., kernel smbfs or samba/smbmount).

I know there must be *someone* who has a clue.  Various proposed causes
include password caching, the use of uid_t instead of __kernel_uid_t in
the smbumount code (what that has to do with keeping a share mounted,
I'm not sure), and blistering silence from those that write the code and
know it best.

Pasted below is the body of a previous unanswered post.  If any of the
experts could help, I and others would really appreciate it.


I'm running NT 4.0 WS SP5 and Samba 2.0.5a on Linux with a 2.2.12
kernel.  Whenever I mount an NT share from Linux, it times out after an
indeterminate period of time.  This has been a continuing problem; the
only workaround is to periodically do something that forces disk
activity on the NT box, bypassing the cache (e.g., ls > /dev/null
doesn't work, but df does).  This must be done on a regular basis,
every two or three minutes.
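
For anyone who wants to automate that workaround, here is a minimal
sketch in Python.  It assumes the statvfs() call (the same kind of
filesystem query df makes) is what keeps the connection alive; the
function names, the 120-second default and the /mnt/winnt path in the
usage note are my own placeholders, not anything from smbmount.

```python
import os
import time


def touch_mount(path):
    """Issue a statvfs() call against the mount point, as df would.

    Returns True on success and False if the call fails, for example
    with the Input/output error seen on a dead share.
    """
    try:
        os.statvfs(path)
        return True
    except OSError:
        return False


def keepalive(path, interval_seconds=120, rounds=None, sleep=time.sleep):
    """Poll the mount every interval_seconds.

    Stops after `rounds` polls (None means run until a poll fails).
    Returns True if every poll succeeded, False on the first failure.
    """
    done = 0
    while rounds is None or done < rounds:
        if not touch_mount(path):
            return False
        done += 1
        if rounds is None or done < rounds:
            sleep(interval_seconds)
    return True
```

Calling keepalive("/mnt/winnt") from a background job would poll the
share every two minutes.  Note that a statvfs() on a share that has
already died may hang for a while before failing, just as df does.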

Error messages:
me2v:reliant me2v$ ls winnt
ls: winnt: Input/output error

from /var/log/messages:
Sep 15 20:09:14 reliant kernel: smb_trans2_request: result=-32, setting invalid
Sep 15 20:10:25 reliant kernel: smb_retry: signal failed, error=-3

This was after only about 1 and 1/2 hours, give or take 15 minutes.

This has been a recurring problem since I moved to 2.0.3 from 1.9.18 way
back when, and plenty of other people have had it, also.  The typical
response is that it's a password caching problem, but the password
caching fixes, if any, haven't fixed the problem.

I would like to know 1) whether something in NT could be causing
this, and if so what it is and how to fix it, or 2) how to fix it once
and for all from the Linux side (besides not using samba, that is).  Or
is there an undocumented smb.conf option somewhere that would help?

I don't know if this is technically a Linux problem or if it's a samba
problem, since historically smbmount has not been officially part of
samba (although it's distributed and compiled with samba), so I wasn't
sure who to post to.  Hopefully, someone has found a fix, or at least
knows what the problem is...

Incidentally, I'm pretty sure it's not an NT service pack problem, since
this has been a recurring problem with no service packs, and with SP3-5.
-- 
Matthew Vanecek
Course of Study: unt.edu/bcis
Visit my Website at people.unt.edu/~mev0003
For answers type: perl -e 'print
$i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);'
*****************************************************************
For 93 million miles, there is nothing between the sun and my shadow
except me. I'm always getting in the way of something...

Matthew Vanecek

1999-Sep-21 14:05 UTC

Samba can't keep NT shares mounted

Steve Rhodes wrote:
> 
> Bad news guys,
> 
> Almost 24 hours and I can't get smbmount to fail!  I connected another
> Win98 machine to the subnet, but apparently, it's not enough to cause
> the smbmount problem to occur.
> 
> I am going to try flooding the network with netbios packets and see if that
> can induce failure.

FWIW, it never occurs here when the net is under load.  Generally, it's
just when it's idle, and smbmount dies for some reason.  I think that's
where we need to focus--when is smbmount dying, and why?  Is it a result
of something NT does?  /var/log/messages and /var/log/samba/* don't have
a clue as to why it dies.
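
To pin down when smbmount dies, one option is to poll for the process
and note the time it disappears.  The sketch below is only an
illustration: process_alive is a name I made up, and you would have to
look up the smbmount pid yourself (e.g. from ps).  It relies on the
standard trick of sending signal 0, which checks for the process
without actually delivering a signal.

```python
import os


def process_alive(pid):
    """Return True if a process with this pid exists, False otherwise.

    Signal 0 performs the existence and permission checks only; no
    signal is delivered to the target process.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: no such process.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

Run from a loop or from cron with a timestamp, this would at least show
whether the process is gone before the first Input/output error shows
up.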

Maybe it's a problem specific to NT? Or to NT Workstation?  That's what
I have, NT 4.0 WS SP5.  I don't have a Windows machine to test on (thank
God!! It's enough to have it at work!).

Anyhow, I took the smbmount from 2.0.5a and recompiled it.  Near the
top of the source, I uncommented the SMBFS_DEBUG portion.  Doesn't
really help, I don't think, but...

At the beginning of the day yesterday, I mounted a share with the
DEBUG-enabled smbmount, and I mounted another one with the normal
smbmount.  When I got home from work, the DEBUG-ed share was still
mounted and up (well, aside from pagefile.sys ;) ).  The normal smbmount
was out like a broken lamp.  df hung for a bit while it tried to probe
that mount point, then I got the infamous input/output error.  Here's
what happened when I did a df:

Sep 21 08:49:30 reliant kernel: smb_trans2_request: result=-32, setting invalid
Sep 21 08:49:31 reliant kernel: smb_retry: new pid=19481, generation=7
Sep 21 08:49:31 reliant kernel: smb_lookup: find //pagefile.sys failed, error=-2
6
Sep 21 08:49:34 reliant kernel: smb_retry: signal failed, error=-3
Sep 21 08:51:50 reliant kernel: smb_retry: signal failed, error=-3

The first part, up to and including the pagefile.sys message, is from
the smbmount with "#define SMBFS_DEBUG 1".  The last two are from the
regular smbmount, or from smbfs trying to awaken the dead normal
smbmount, I guess, is more accurate.
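
For reading these messages: the kernel reports failures as negative
errno values, and on Linux 32, 2 and 3 are EPIPE, ENOENT and ESRCH.
ESRCH means "No such process", which would fit the guess that the
normal smbmount is already dead when smbfs tries to signal it.  A small
sketch to decode such lines follows; the function name and regular
expression are my own, written only against the message formats shown
in this post.

```python
import errno
import os
import re

# Matches the "result=-32" and "error=-3" style fields in the log lines.
ERROR_FIELD = re.compile(r"(?:result|error)=-(\d+)")


def decode_kernel_errors(line):
    """Return (number, symbolic name, description) for each error field."""
    decoded = []
    for match in ERROR_FIELD.finditer(line):
        code = int(match.group(1))
        name = errno.errorcode.get(code, "UNKNOWN")
        decoded.append((code, name, os.strerror(code)))
    return decoded
```

Feeding it the smb_retry line gives the ESRCH entry, and the
smb_trans2_request line gives EPIPE (a broken connection).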

I haven't had time to decode what extra stuff gets done with
SMBFS_DEBUG. There's a bunch of "#ifndef SMBFS_DEBUG"s in there,
though...

I'll probably try to attach gdb to the regular smbmount tomorrow
morning (no time today), and see what happens.

-- 
Matthew Vanecek

Steve Rhodes

1999-Sep-22 13:17 UTC

Samba can't keep NT shares mounted

I read through a bunch of posts on the subject, and I have seen a number of 
theories that the problem might be caused by a period of inactivity, rather 
than heavy network load as I earlier speculated.  I thought an easy test 
would be to mount the NT box, then physically disconnect the smbmount 
client for a period of time, and see if the connection breaks.  Well, I 
left it off the network all night, and when I re-connected and checked it 
this morning, it was still running!

I also ran a quick test by running a ping flood over the network.  The 
collision light on the hub was blinking madly, and I still could not induce 
failure!

I am starting to think that it may have something to do with the 
configuration of the NT box.  I am running this one as a Primary Domain 
Controller.  I recall from my earlier experience that the machine in the 
troubled network was a Stand Alone configuration.  (Un)fortunately, that 
machine has been re-configured as a Linux box (DHCP problems).

Still trying,

Steve Rhodes


Steve Rhodes

1999-Sep-24 15:51 UTC

Samba can't keep NT shares mounted

I have been trying for some time to create this failure in a controlled 
environment.  I have just now been able to reproduce the failure, though 
not in quite as controlled a way as I would like, and certainly not with 
the version I would prefer, as I am running 2.0.1 in this example.  I 
have gone through a number of theories on this subject, which I thought 
would be useful to review.

Theory 1.)  The failure is caused by excessive network traffic

   This doesn't seem to be the case.  I put together a test setup in my lab 
and flooded the network with pings.  The storm was pretty impressive, but 
the connection held tight.

Theory 2.)  The failure is caused by a period of inactivity.

  This may indeed be a piece of the puzzle, and seems to be one of the more 
popular notions going around.  However, in the same laboratory setup 
mentioned above, I disconnected the client machine overnight, and it was 
still working properly upon re-connection the next day.

Theory 3.)  The failure is specific to a particular configuration of 
server.

  This is not the case, as I have received correspondence from a number of 
people with servers ranging from OS2 to NT as a PDC, all with the same 
problem.

Theory 4.)  The failure has something to do with the DEBUG option in the 
source code.

  The theory goes that smbmount works with the DEBUG option enabled, but 
fails once it is turned off, possibly because of something to do with 
attempting to write out error messages.  I haven't had the opportunity 
to observe this directly, but it is an interesting theory.

    ~~~~

Having said all that, I would like to relate the configuration under which 
I was able to observe the failure, and present YAT (Yet Another Theory).

The basic concept behind this configuration was to serve files kept on an 
NT web server through an apache server on Linux by smbmounting the NT drive 
in the apache html directory.  This was the original configuration in which 
I observed the problem earlier this year.  For maximum possibility of 
inducing failure I set it up as an smbmount on /mnt/test, then built a 
symbolic link to that from a /home/httpd/html/Test directory, which is in 
the apache document path.
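
As a rough illustration of the linking step only (the smbmount itself
is assumed to be in place already), something like the following
Python sketch would create the link.  The helper name is made up for
this example.

```python
import os


def link_share_into_docroot(mount_point, link_path):
    """Create link_path as a symbolic link pointing at mount_point.

    Returns True if the link was created and False if the same link
    already exists.  A link to a different target is reported as an
    error rather than silently replaced.
    """
    if os.path.islink(link_path):
        if os.readlink(link_path) == mount_point:
            return False
        raise FileExistsError(link_path + " points somewhere else")
    os.symlink(mount_point, link_path)
    return True


# With the paths from this setup, the call would be:
# link_share_into_docroot("/mnt/test", "/home/httpd/html/Test")
```

Apache also has to be allowed to follow symbolic links in that
directory for this to work.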

This way, somebody can connect to the apache server on the Linux box and 
view the html files which are kept on the NT box.  The underlying reason 
for this is that the NT files are updated by a daily batch process which 
runs every morning.  A rather complex system was built around the NT box, 
so it was impractical to re-build it on Linux.  We needed access to those 
files from the Linux box for security reasons, hence the smbmount.

At first, it looked like I was going to get the same result I had been 
experiencing throughout this process: the smbmount looked rock solid.  Many 
hours went by, and every time I checked the connection, it was still 
working.  However, this morning when I checked, it was broken.

This leads into my new theory.  I have seen a number of posts indicating 
that changing files on the smb server causes issues on the mounted smb 
client.  It wasn't entirely clear to me what those issues were; it seemed 
to be a lack of current data from the perspective of the client, or 
perhaps even a broken connection.  In any event, I am speculating that 
the update process on the NT box in my configuration above may be the 
trigger that induces the failure.  Most of the files are over-written 
during the update, and this may be a reason for the broken connection.

I will be continuing to pursue this issue and narrow down the 
variables associated with the failure.  Thanks to everyone that has sent in 
messages on this problem, and kudos to you if you have managed to read 
through this lengthy post.

Regards,

Steve Rhodes
