Jeremy,
Bit of bad news - we had another runaway SMBD. This time profiles were
NOT involved - I was simply copying up an installation of Office2000 and
then deleting it. It almost reached the end of the deletion when the
processor utilisation went up to 90%. The client connection then dropped
with a 'connection is no longer available' and there is now 99% processor
utilisation on the server. This process is now unkillable.
Do you think this is a kernel (2.2.19) issue or a Samba issue (2.2.2) ? You
said if a process becomes unkillable then it is a kernel issue - perhaps it
is the kernel not handling the SMBD correctly ?
I'll build another machine with 2.4 and see if I can replicate the problem
again.
Noel
-----Original Message-----
From: Noel Kelly
Sent: 19 November 2001 12:30
To: 'jra@samba.org'
Subject: RE: 2.2.2 runaway SMBD process
Jeremy,
I managed to recreate a runaway process. See the attached file for the logs
with a setting of 10 - I specified "log name = log.%m.%I" but it did not
seem to box things up nicely. However I was pretty much the only user so
all the logs are for my machine and username (nkelly).
After the problem occurred I killed off everything I could, which left this:
[root@belly var]# ps ax | grep smb
1302 ? S 0:42 /usr/local/samba/bin/smbd -D
1319 ? R 22:27 /usr/local/samba/bin/smbd -D
1324 ? S 0:00 /usr/local/samba/bin/smbd -D
1325 ? S 0:00 /usr/local/samba/bin/smbd -D
1339 ? S 0:00 /usr/local/samba/bin/smbd -D
1340 ? S 0:00 /usr/local/samba/bin/smbd -D
[root@belly var]#
so process 1319 is the one in question.
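[Editor's sketch, not part of the original message.] Picking the spinning process out of a listing like the one above can be scripted. The following is a minimal sketch, assuming `ps ax` style columns (PID, TTY, STAT, TIME, COMMAND); the function name `runnable_pids` is hypothetical, and the sample text is copied from the listing above.

```python
def runnable_pids(ps_output):
    """Return PIDs whose STAT column starts with 'R' (running/runnable).

    Assumes `ps ax` style columns: PID TTY STAT TIME COMMAND...
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip headers, shell prompts, and anything too short to be a process row.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        if fields[2].startswith("R"):
            pids.append(int(fields[0]))
    return pids


# Rows taken from the ps listing in the message above.
sample = """\
1302 ? S 0:42 /usr/local/samba/bin/smbd -D
1319 ? R 22:27 /usr/local/samba/bin/smbd -D
1324 ? S 0:00 /usr/local/samba/bin/smbd -D
"""

print(runnable_pids(sample))
```

A single 'R' sample only shows the process was runnable at that instant; the large accumulated TIME value (22:27 here) is what marks it as the runaway.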
The problems seem to occur with profiles being written up or down to the
server. I noted that on the previous occasion the read-only 'Outlook
Express.lnk' file was involved. This quite often causes problems with
profiles being saved on our Novell centred network. To recreate it I made
the Internet Explorer and Desktop shortcuts on the Quick Launch bar
read-only as well. Sure enough the smbd got lost and went ballistic at
login.
That is 'login' - I had logged out and logged in again so the read-only
links were written to the Samba share and it was on logging in again that
the problem occurred. Obviously you can track it through but I would have
thought that it was a straightforward read from the Samba share but perhaps
it is the attempted overwrite on the workstation which is failing and
causing the smbd to get lost and loop.
We have now decided to simply keep M$ with M$ and keep the profiles on the
Win2000 DC. Whatever problems they cause with profiles can be dealt with by
them. It will be interesting to see if we have any more of these runaway
processes.
Hope this is helpful and let me know if you need anything else,
Noel
-----Original Message-----
From: jra@samba.org [mailto:jra@samba.org]
Sent: 18 November 2001 20:29
To: Noel Kelly
Cc: 'jra@samba.org'; 'samba@lists.samba.org'
Subject: Re: 2.2.2 runaway SMBD process
On Sun, Nov 18, 2001 at 05:30:07PM -0000, Noel Kelly wrote:
> Jeremy,
>
> Below are the relevant parts of the log for the offending process (2493).
> Definitely looks like an oplock issue. Don't like the look of these lines:
>
> [2001/11/16 17:43:08, 0] lib/util.c:smb_panic(1055)
> PANIC: open_mode_check: Existant process 2493 left active oplock.
> [2001/11/16 17:43:09, 0] locking/locking.c:delete_fn(253)
> locking : delete_fn. LOGIC ERROR ! Entry for pid 2496 and it no longer
> exists !
If you can reproduce this at will I'd appreciate it if you
could get me a full level 10 debug log from start of smbd
to abort, with the logs split out by accessing client (use %m
in the log line). Thanks.
> And yes the process becomes entirely unkillable, not responding to even -9.
Ok. This is not a Samba bug, it's a problem with the
kernel.
A process never even *sees* a -9 signal, it's handled
completely in the kernel. If the process doesn't terminate
(is it stuck in the 'D' wait state ?) it means the kernel
is having problems, not smbd in this case.
As I recall you're using a Linux 2.2 kernel, so this is not
a kernel oplock bug.
Jeremy.
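
[Editor's sketch, not part of the original message.] Jeremy's question about the 'D' wait state can be answered on Linux by reading the state field of /proc/<pid>/stat. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names `parse_state` and `process_state` are hypothetical, and the sample stat line is invented for illustration rather than taken from the affected server.

```python
def parse_state(stat_line):
    """Extract the one-letter process state from a /proc/<pid>/stat line.

    The second field (the command name) is wrapped in parentheses and may
    itself contain spaces or ')', so split on the LAST ')' and take the
    first token that follows it.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_state(pid):
    """Read the state of a live process. Assumes Linux with /proc mounted."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_state(f.read())


# Invented sample line (truncated to the first few fields) for illustration.
print(parse_state("1319 (smbd) D 1 1319 1319 0"))
```

A result of 'D' means uninterruptible sleep, where the process will not act on a pending signal, including -9, until the kernel-level wait completes. A result of 'R' means the process is running or runnable.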