Amit_Bhutani@Dell.com
2003-Mar-13 19:40 UTC
[Samba] "Oplock" error messages with samba on Linux
TEST:
====
SAMBA STRESS OVER GIGABIT NETWORK: This is a simple Read/Write/Compare/Delete kind of test over the network via Samba. Multiple clients with multiple threads try to read/write/compare/delete (in a loop) a finite-sized file (64K block size) over Samba for a prolonged period of time in an effort to stress test the server.

SERVER SIDE CONFIG:
==================
ARCH: x86
OS: RHL AS 2.1
KERNEL: 2.4.9-e.3smp
SAMBA: 2.2.7-1.21as.i386.rpm (also tried RH AS2.1 stock version: 2.2.1a-4.i386.rpm)
NIC: 2 ONBOARD GIG BCM5703 (COPPER) ON A STATIC GEC TRUNK
RAM: 512 MB
CPU: Dual 2.66GHz CPU with HT ON (tried several different speed CPUs: 2.2, 2.4, 3.06 etc)
SWAP: 2 GB
SCSI/RAID: Able to reproduce this on SCSI as well as RAID 0 (not tried other RAID configs)

CLIENT SIDE CONFIG:
==================
SAMBA CLIENTS: Windows only (tried both NT4 and W2K). Please note that there are *no* Linux clients trying to access the Samba share. The clients are all Windows and are all installed from one common image.

DESCRIPTION OF FAILURES:
=======================
The Stress Test Controller starts to show some Read/Write/Compare failures anywhere from 30 minutes to 24 hours into the test. These failures continue to occur and eventually the server "locks up". I tried to enable "nmi_watchdog" with a serial console. No OOPS captured yet. I also tried raising the Samba debug level (went up to 3), raising the printk level to "7 4 1 7", enabling SysRq, etc. None of the information I have captured so far gives me any definite theory on why this is happening. What I do notice is that in almost all the failures, the Samba logs have references to "oplocks". Does anybody have any ideas why this could be happening? Any ideas for what I can do further to troubleshoot?

I have tried various other things that I have not posted with this message, to try and keep this post readable. Posted below are a sample smbd.log, <samba_client_name>.log and my smb.conf file from one of the recent failures. Any suggestions would be helpful.

/var/log/smbd.log:
=================
[2003/03/12 04:02:03, 0] smbd/server.c:open_sockets(238)
  Got SIGHUP

/var/log/samba/samba_client1.log:
================================
[2003/03/11 10:04:07, 0] passdb/pdb_smbpasswd.c:pdb_getsampwnam(1369)
  unable to open passdb database.
[2003/03/12 03:33:49, 0] smbd/oplock.c:oplock_break(790)
  oplock_break: end of file from client
  oplock_break failed for file ZEUS05/ZEUS05_0.dat (dev = 811, inode 131074, file_id = 5907).
[2003/03/12 03:33:49, 0] smbd/oplock.c:oplock_break(878)
  oplock_break: client failure in break - shutting down this smbd.
[2003/03/12 03:34:15, 0] smbd/oplock.c:request_oplock_break(1008)
  request_oplock_break: no response received to oplock break request to pid 2482 on port 32793 for dev = 811, inode = 131074, file_id = 5907

/etc/samba/smb.conf:
===================
[global]
    netbios name = box_400
    server string = Samba Server
    security = SHARE
    encrypt passwords = Yes
    log file = /var/log/samba/%m.log
    max log size = 0
    socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
    dns proxy = No
    printing = lprng
    wins support = yes

[homes]
    comment = Home Directories
    writeable = Yes
    browseable = No

[printers]
    comment = All Printers
    path = /var/spool/samba
    printable = Yes
    browseable = No

[share]
    path = /stress
    writeable = Yes
    guest ok = Yes

[web]
    path = /var/www/html
    writeable = Yes
    guest ok = Yes

- Amit Bhutani
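For readers who want to reproduce something similar, the read/write/compare/delete loop described in the TEST section could be sketched roughly as below. This is only an illustration, not the actual Stress Test Controller used in the post: the function names (`rwcd_once`, `worker`, `run_stress`), the per-thread file naming, and the thread and iteration counts are all assumptions. Only the 64K block size and the read/write/compare/delete cycle come from the post. Pointing `share_dir` at a mounted or mapped Samba share is left to the reader; the demo at the bottom uses a local temporary directory.

```python
import os
import tempfile
import threading

BLOCK_SIZE = 64 * 1024  # 64K block size, as mentioned in the post


def rwcd_once(path, payload):
    """Write payload to path, read it back, delete the file, report if it matched."""
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data == payload


def worker(share_dir, name, iterations, failures):
    """Run the cycle repeatedly on one file; record any error or mismatch."""
    payload = os.urandom(BLOCK_SIZE)
    path = os.path.join(share_dir, name + ".dat")  # hypothetical naming scheme
    for i in range(iterations):
        try:
            matched = rwcd_once(path, payload)
        except OSError as exc:
            failures.append((name, i, repr(exc)))
            continue
        if not matched:
            failures.append((name, i, "compare mismatch"))


def run_stress(share_dir, threads=4, iterations=100):
    """Start several worker threads against share_dir and return the failure list."""
    failures = []  # list.append is safe to call from multiple threads in CPython
    pool = [
        threading.Thread(
            target=worker,
            args=(share_dir, "thread_%d" % n, iterations, failures),
        )
        for n in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return failures


if __name__ == "__main__":
    # Local demo only; substitute the path of the mounted share to test Samba.
    with tempfile.TemporaryDirectory() as demo_dir:
        print(run_stress(demo_dir, threads=2, iterations=10))
```

Each thread works on its own file here, so the sketch does not force threads to contend for the same file; whether the original test did so is not stated in the post.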
jra@dp.samba.org
2003-Mar-13 19:45 UTC
[Samba] "Oplock" error messages with samba on Linux
On Thu, Mar 13, 2003 at 01:40:03PM -0600, Amit_Bhutani@Dell.com wrote:
> TEST:
> ====
> SAMBA STRESS OVER GIGABIT NETWORK: This is a simple
> Read/Write/Compare/Delete kind of test over the network via Samba. Multiple
> clients with multiple threads try to read/write/compare/delete(in a loop) a
> finite sized file (64K block size) over samba for a prolonged period of time
> in an effort to stress test the server.
>
> SERVER SIDE CONFIG:
> ==================
> ARCH: x86
> OS: RHL AS 2.1
> KERNEL: 2.4.9-e.3smp
> SAMBA: 2.2.7-1.21as.i386.rpm (also tried RH AS2.1 stock version:
> 2.2.1a-4.i386.rpm)
> NIC: 2 ONBOARD GIG BCM5703(COPPER)ON A STATIC GEC TRUNK
> RAM: 512 MB
> CPU: Dual 2.66GHz CPU with HT ON (tried several different speed CPU's: 2.2,
> 2.4, 3.06 etc)
> SWAP: 2 GB
> SCSI/RAID: Able to reproduce this on SCSI as well as RAID 0 (not tried other
> RAID configs)
>
> CLIENT SIDE CONFIG:
> ==================
> SAMBA CLIENTS: Windows only (tried both NT4 and W2K). Please note that there
> are *no* Linux clients trying to access the samba share. The clients are all
> Windows and are all installed from one common image.
>
> DESCRIPTION OF FAILURES:
> =======================
> The Stress Test Controller starts to show some Read/Write/Compare failures
> anywhere from a 30 mins up to a 24 hour period into the test. These failures
> continue to occur and eventually the server "locks up". Tried to enable
> "nmi_watchdog" with serial console. No OOPS capture yet. I also tried
> raising the samba debug level (went up to 3) , printk level raised to "7 4 1
> 7", SysRq enabled etc. None of the information I have captured so far gives
> me any definite theory on why this is happening. What I do notice is that in
> almost all the failures, the samba logs have references to "oplocks". Any
> body has any ideas why this could be happening ? Any ideas for what I can do
> further to troubleshoot. I have tried various other things that I have not
> posted with this message to try and keep this post readable. Posted below
> are a sample smbd.log, <samba_client_name>.log and my smb.conf file from one
> of the recent failures. Any suggestions would be helpful.

Server 'locks up' represents a Linux kernel bug. No other option. Samba as a user level process should not be able to cause the kernel to freeze.

On a 'perfect' network (no lost packets) oplock break failures are due to bugs in Windows clients not responding to asynchronous 'break' messages sent back to them over a TCP stream. What *exact* config are the Windows clients? What service pack?

Jeremy Allison,
Samba Team.
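One way to find out which clients are failing to answer break requests is to scan the per-client logs (the posted smb.conf sets `log file = /var/log/samba/%m.log`) for entries that originate in `smbd/oplock.c`. The Python sketch below is an assumption-laden illustration, not a Samba tool: the regular expression is inferred only from the header lines quoted in the original post, and the helper names `oplock_events` and `count_by_function` are made up for this example.

```python
import re
from collections import Counter

# Header shape inferred from the posted logs, e.g.
# [2003/03/12 03:33:49, 0] smbd/oplock.c:oplock_break(790)
HEADER = re.compile(
    r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}), (\d+)\] (\S+):(\w+)\((\d+)\)"
)


def oplock_events(text):
    """Yield (timestamp, function) for each log header emitted from smbd/oplock.c."""
    for match in HEADER.finditer(text):
        timestamp, _level, source, func, _line = match.groups()
        if source == "smbd/oplock.c":
            yield timestamp, func


def count_by_function(text):
    """Count oplock-related log entries per reporting function."""
    return Counter(func for _timestamp, func in oplock_events(text))


# Sample text copied from the samba_client1.log excerpt in the original post.
SAMPLE = """\
[2003/03/11 10:04:07, 0] passdb/pdb_smbpasswd.c:pdb_getsampwnam(1369)
  unable to open passdb database.
[2003/03/12 03:33:49, 0] smbd/oplock.c:oplock_break(790)
  oplock_break: end of file from client
[2003/03/12 03:33:49, 0] smbd/oplock.c:oplock_break(878)
  oplock_break: client failure in break - shutting down this smbd.
[2003/03/12 03:34:15, 0] smbd/oplock.c:request_oplock_break(1008)
  request_oplock_break: no response received to oplock break request
"""

if __name__ == "__main__":
    for func, count in sorted(count_by_function(SAMPLE).items()):
        print(func, count)
```

Running `count_by_function` over the contents of each client's log file and comparing the totals would show whether the failures cluster on particular machines, which is the kind of detail the question about exact client configuration is after.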
Amit_Bhutani@Dell.com
2003-Mar-13 22:00 UTC
[Samba] "Oplock" error messages with samba on Linux
>Server 'locks up' represents a Linux kernel bug. No other option.
>Samba as a user level process should not be able to cause the
>kernel to freeze.

I am almost convinced that I am chasing two different issues here. I am inclined to say that the "lockup" part of the issue is due to the kernel, the drivers or the hardware itself (I am not sure I would go as far as blaming the Linux kernel 100% yet; I am regressing on some errata kernels). Unfortunately, I have not been able to narrow it down further than that. The Samba errors, however, are obviously due to something Samba is interacting with, or to Samba itself. Hence the post on the Samba mailing list.

>On a 'perfect' network (no lost packets) oplock break failures are due to
>bugs in Windows clients not responding to asynchronous 'break' messages
>sent back to them over a TCP stream. What *exact* config are the Windows
>clients ? What service pack ?

Here is the detailed config of *one* of the most commonly failing racks. I agree that this is a much older and slower rack, but do remember that the same failure has been exhibited on much faster/newer racks with other OS images as well (W2K).

MODEL: Dell OptiPlex G's
CPU: (1) x86 Intel Pentium 1 Family 5, Model 2, Stepping 12
MEMORY: (1) 32 Meg DIMM
HARD DRIVE: (1) 1 GB drive (IDE) with 513 MB free
OS: MS Windows NT 4.00.1381
NIC: Intel PRO/100+ Server Adapter (PILA84708B)
NIC DRIVER:
    Name: 3100bnt.sys
    Version: 5.00.66.0000
    Description: NDIS 4 driver

- Amit Bhutani