Amit_Bhutani@Dell.com
2003-Mar-13 19:40 UTC
[Samba] "Oplock" error messages with samba on Linux
TEST:
====
SAMBA STRESS OVER GIGABIT NETWORK: This is a simple
Read/Write/Compare/Delete kind of test over the network via Samba. Multiple
clients with multiple threads try to read/write/compare/delete (in a loop) a
finite sized file (64K block size) over samba for a prolonged period of time
in an effort to stress test the server.
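For readers who want to reproduce this kind of load, the loop described above could look roughly like the following. This is only a sketch: the original test ran from Windows clients under a Stress Test Controller that is not shown in the post, so the names `stress_worker` and `run_stress`, the one-file-per-worker layout, and the thread and iteration counts are all assumptions made for illustration.

```python
import os
import tempfile
import threading

BLOCK_SIZE = 64 * 1024  # 64K block size, as in the test description


def stress_worker(directory, worker_id, iterations, failures):
    """Write a file, read it back, compare, then delete it, in a loop."""
    # Assumption: each worker uses its own file; the real test layout is unknown.
    path = os.path.join(directory, f"worker_{worker_id}.dat")
    for i in range(iterations):
        payload = os.urandom(BLOCK_SIZE)
        try:
            with open(path, "wb") as f:
                f.write(payload)
            with open(path, "rb") as f:
                data = f.read()
            if data != payload:
                failures.append((worker_id, i, "compare mismatch"))
            os.remove(path)
        except OSError as exc:
            failures.append((worker_id, i, f"I/O error: {exc}"))


def run_stress(directory, threads=4, iterations=10):
    """Run several workers concurrently and return the list of failures."""
    failures = []  # list.append is thread-safe in CPython
    workers = [
        threading.Thread(
            target=stress_worker, args=(directory, n, iterations, failures)
        )
        for n in range(threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return failures


if __name__ == "__main__":
    # A temporary local directory stands in for the mounted Samba share.
    with tempfile.TemporaryDirectory() as share:
        print(run_stress(share))
```

Pointing `run_stress` at a directory on the mounted share instead of a temporary directory would exercise the server. An empty result means every write, read, compare and delete succeeded.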
SERVER SIDE CONFIG:
==================
ARCH: x86
OS: RHL AS 2.1
KERNEL: 2.4.9-e.3smp
SAMBA: 2.2.7-1.21as.i386.rpm (also tried RH AS2.1 stock version:
2.2.1a-4.i386.rpm)
NIC: 2 ONBOARD GIG BCM5703(COPPER)ON A STATIC GEC TRUNK
RAM: 512 MB
CPU: Dual 2.66GHz CPU with HT ON (tried several different speed CPU's: 2.2,
2.4, 3.06 etc)
SWAP: 2 GB
SCSI/RAID: Able to reproduce this on SCSI as well as RAID 0 (not tried other
RAID configs)
CLIENT SIDE CONFIG:
==================
SAMBA CLIENTS: Windows only (tried both NT4 and W2K). Please note that there
are *no* Linux clients trying to access the samba share. The clients are all
Windows and are all installed from one common image.
DESCRIPTION OF FAILURES:
=======================
The Stress Test Controller starts to show some Read/Write/Compare failures
anywhere from 30 minutes up to 24 hours into the test. These failures
continue to occur and eventually the server "locks up". I tried to enable
"nmi_watchdog" with a serial console, but have no OOPS capture yet. I also
tried raising the samba debug level (went up to 3), raising the printk level
to "7 4 1 7", enabling SysRq, etc. None of the information I have captured so
far gives me any definite theory on why this is happening. What I do notice
is that in almost all the failures, the samba logs have references to
"oplocks".
Does anybody have any ideas why this could be happening? Any ideas for what
I can do further to troubleshoot? I have tried various other things that I
have not posted with this message to try and keep this post readable. Posted
below are a sample smbd.log, <samba_client_name>.log and my smb.conf file
from one of the recent failures. Any suggestions would be helpful.
/var/log/smbd.log:
=================
[2003/03/12 04:02:03, 0] smbd/server.c:open_sockets(238)
Got SIGHUP
/var/log/samba/samba_client1.log:
================================
[2003/03/11 10:04:07, 0] passdb/pdb_smbpasswd.c:pdb_getsampwnam(1369)
unable to open passdb database.
[2003/03/12 03:33:49, 0] smbd/oplock.c:oplock_break(790)
oplock_break: end of file from client
oplock_break failed for file ZEUS05/ZEUS05_0.dat (dev = 811, inode 131074,
file_id = 5907).
[2003/03/12 03:33:49, 0] smbd/oplock.c:oplock_break(878)
oplock_break: client failure in break - shutting down this smbd.
[2003/03/12 03:34:15, 0] smbd/oplock.c:request_oplock_break(1008)
request_oplock_break: no response received to oplock break request to pid
2482 on port 32793 for dev = 811, inode = 131074, file_id = 5907
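When many per-client logs have to be compared, it can help to pull the oplock entries out mechanically. The following is a rough sketch, assuming each log entry starts with a one-line header of the form `[date time, level] file:function(line)` as in the excerpt above. The function names `parse_entries`, `oplock_events` and `count_by_file` are made up for this example and are not part of Samba.

```python
import re
from collections import Counter

# Header line, e.g. "[2003/03/12 03:33:49, 0] smbd/oplock.c:oplock_break(790)"
HEADER = re.compile(
    r"^\[(?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}), (?P<level>\d+)\] "
    r"(?P<source>\S+):(?P<func>\w+)\((?P<line>\d+)\)\s*$"
)
# The excerpt shows both "inode 131074" and "inode = 131074", so accept either.
FILE_IDS = re.compile(r"dev = (\w+), inode(?: =)? (\d+), file_id = (\d+)")


def parse_entries(text):
    """Group log lines into entries: one header plus its joined body text."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        m = HEADER.match(line)
        if m:
            entries.append({"time": m["time"], "func": m["func"], "body": ""})
        elif entries and line:
            entries[-1]["body"] = (entries[-1]["body"] + " " + line).strip()
    return entries


def oplock_events(text):
    """Return (time, function, (dev, inode, file_id) or None) per oplock entry."""
    events = []
    for entry in parse_entries(text):
        if "oplock" not in entry["func"]:
            continue
        ids = FILE_IDS.search(entry["body"])
        events.append((entry["time"], entry["func"], ids.groups() if ids else None))
    return events


def count_by_file(events):
    """Count oplock entries per (dev, inode, file_id) triple."""
    return Counter(ids for _, _, ids in events if ids is not None)
```

Feeding the text of a `<samba_client_name>.log` file to `oplock_events` and then to `count_by_file` shows which files are involved in most of the failed breaks, which may help when correlating failures across clients.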
/etc/samba/smb.conf:
===================
[global]
netbios name = box_400
server string = Samba Server
security = SHARE
encrypt passwords = Yes
log file = /var/log/samba/%m.log
max log size = 0
socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
dns proxy = No
printing = lprng
wins support = yes
[homes]
comment = Home Directories
writeable = Yes
browseable = No
[printers]
comment = All Printers
path = /var/spool/samba
printable = Yes
browseable = No
[share]
path = /stress
writeable = Yes
guest ok = Yes
[web]
path = /var/www/html
writeable = Yes
guest ok = Yes
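Since the failures all mention oplocks, one quick check is which oplock-related parameters a configuration like the one above sets explicitly. The sketch below reads an smb.conf with Python's `configparser`, on the assumption that the file is simple enough to be treated as INI syntax (real smb.conf files can use constructs this does not handle). The helper names are hypothetical; `oplocks`, `level2 oplocks` and `kernel oplocks` are the smb.conf parameter names being looked up.

```python
import configparser

OPLOCK_KEYS = ("oplocks", "level2 oplocks", "kernel oplocks")


def load_smb_conf(text):
    """Parse smb.conf text; interpolation is off because of macros like %m."""
    parser = configparser.ConfigParser(
        interpolation=None, delimiters=("=",), strict=False
    )
    # Strip indentation so indented options are not read as continuations.
    parser.read_string("\n".join(line.strip() for line in text.splitlines()))
    return parser


def oplock_settings(parser):
    """Map each section to its explicit oplock settings (None means unset)."""
    report = {}
    for section in parser.sections():
        report[section] = {
            key: parser.get(section, key, fallback=None) for key in OPLOCK_KEYS
        }
    return report


def socket_options(parser):
    """Split the global 'socket options' value into a name -> value dict."""
    raw = parser.get("global", "socket options", fallback="")
    options = {}
    for token in raw.split():
        name, _, value = token.partition("=")
        # Assumption: options that carry a value carry an integer.
        options[name] = int(value) if value else True
    return options
```

For the configuration in this post every oplock entry comes back as `None`, meaning none of these parameters is set explicitly and Samba's built-in defaults apply.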
- Amit Bhutani
jra@dp.samba.org
2003-Mar-13 19:45 UTC
[Samba] "Oplock" error messages with samba on Linux
On Thu, Mar 13, 2003 at 01:40:03PM -0600, Amit_Bhutani@Dell.com wrote:
> TEST:
> ====
> SAMBA STRESS OVER GIGABIT NETWORK: This is a simple
> Read/Write/Compare/Delete kind of test over the network via Samba. Multiple
> clients with multiple threads try to read/write/compare/delete(in a loop) a
> finite sized file (64K block size) over samba for a prolonged period of time
> in an effort to stress test the server.
>
> SERVER SIDE CONFIG:
> ==================
> ARCH: x86
> OS: RHL AS 2.1
> KERNEL: 2.4.9-e.3smp
> SAMBA: 2.2.7-1.21as.i386.rpm (also tried RH AS2.1 stock version:
> 2.2.1a-4.i386.rpm)
> NIC: 2 ONBOARD GIG BCM5703(COPPER)ON A STATIC GEC TRUNK
> RAM: 512 MB
> CPU: Dual 2.66GHz CPU with HT ON (tried several different speed CPU's: 2.2,
> 2.4, 3.06 etc)
> SWAP: 2 GB
> SCSI/RAID: Able to reproduce this on SCSI as well as RAID 0 (not tried other
> RAID configs)
>
> CLIENT SIDE CONFIG:
> ==================
> SAMBA CLIENTS: Windows only (tried both NT4 and W2K). Please note that there
> are *no* Linux clients trying to access the samba share. The clients are all
> Windows and are all installed from one common image.
>
> DESCRIPTION OF FAILURES:
> =======================
> The Stress Test Controller starts to show some Read/Write/Compare failures
> anywhere from a 30 mins up to a 24 hour period into the test. These failures
> continue to occur and eventually the server "locks up". Tried to enable
> "nmi_watchdog" with serial console. No OOPS capture yet. I also tried
> raising the samba debug level (went up to 3) , printk level raised to "7 4 1
> 7", SysRq enabled etc. None of the information I have captured so far gives
> me any definite theory on why this is happening. What I do notice is that in
> almost all the failures, the samba logs have references to "oplocks". Any
> body has any ideas why this could be happening ? Any ideas for what I can do
> further to troubleshoot. I have tried various other things that I have not
> posted with this message to try and keep this post readable. Posted below
> are a sample smbd.log, <samba_client_name>.log and my smb.conf file from one
> of the recent failures. Any suggestions would be helpful.

Server 'locks up' represents a Linux kernel bug. No other option.
Samba as a user level process should not be able to cause the
kernel to freeze.

On a 'perfect' network (no lost packets) oplock break failures
are due to bugs in Windows clients not responding to asynchronous
'break' messages sent back to them over a TCP stream.

What *exact* config are the Windows clients ? What service pack ?

Jeremy Allison,
Samba Team.
Amit_Bhutani@Dell.com
2003-Mar-13 22:00 UTC
[Samba] "Oplock" error messages with samba on Linux
>Server 'locks up' represents a Linux kernel bug. No other option.
>Samba as a user level process should not be able to cause the
>kernel to freeze.

I am almost convinced that I am chasing two different issues here. I am
inclined to say that the "lockup" part of the issue is due to the kernel,
the drivers or the hardware itself (I am not sure I would go as far as
blaming the Linux kernel 100% yet; I am regressing on some errata kernels).
Unfortunately I have not been able to narrow it down further than that. The
samba errors, however, are obviously due to something samba is interacting
with or to samba itself. Hence the post on the samba mailing list.

>On a 'perfect' network (no lost packets) oplock break failures
>are due to
>bugs in Windows clients not responding to asynchronous 'break' messages
>sent back to them over a TCP stream. What *exact* config are
>the Windows
>clients ? What service pack ?

Here is the detailed config of *one* of the most commonly failing racks. I
agree that this is a much older and slower rack, but do remember that the
same failure has been exhibited on much faster/newer racks with other OS
images as well (W2K).

MODEL: Dell OptiPlex G's
CPU: (1) x86 Intel Pentium 1 Family 5, Model 2, Stepping 12
MEMORY: (1) 32 Meg DIMM
HARD DRIVE: (1) 1 GB drive (IDE) with 513 MB free
OS: MS Windows NT 4.00.1381
NIC: Intel PRO/100+ Server Adapter (PILA84708B)
NIC DRIVER:
Name: 3100bnt.sys
Version: 5.00.66.0000
Desp: NDIS 4 driver

- Amit Bhutani