We recently tried upgrading to the 2.0 beta series (just finished testing Beta 4) on our network. Everything works fine with 1.9.18pl10, but once we start 2.0 in its place we start getting the following in the log files:

[1998/12/17 05:00:12, 1] smbd/files.c:file_init(219)
  file_init: Information only: requested 10000 open files, 246 are available.
[1998/12/17 05:07:52, 1] smbd/service.c:make_connection(484)
  vhawsey (192.168.1.2) connect to service virus as user vhawsey (uid=503, gid=100) (pid 16637)
[1998/12/17 05:09:27, 1] smbd/service.c:make_connection(484)
  ost15 (192.168.2.142) connect to service tmobley as user tmobley (uid=742, gid=100) (pid 17398)
[1998/12/17 06:10:53, 1] smbd/service.c:make_connection(484)
  ost15 (192.168.2.142) connect to service virus as user tmobley (uid=742, gid=100) (pid 17398)
[1998/12/17 06:32:48, 1] smbd/service.c:make_connection(484)
  long (192.168.2.5) connect to service virus as user elong (uid=530, gid=100) (pid 24920)
[1998/12/17 06:33:20, 0] smbd/oplock.c:request_oplock_break(942)
  request_oplock_break: no response received to oplock break request to pid 17398 on port 1502 for dev = 301, inode = 620605
[1998/12/17 06:33:28, 0] smbd/oplock.c:oplock_break(734)
  oplock_break: receive_smb timed out after 30 seconds.
  oplock_break failed for file COMM.INF (dev = 301, inode = 620605).
[1998/12/17 06:33:28, 0] smbd/oplock.c:oplock_break(804)
  oplock_break: client failure in break - shutting down this smbd.
[1998/12/17 06:33:28, 1] smbd/service.c:close_cnum(510)
  ost15 (0.0.0.0) closed connection to service virus
[1998/12/17 06:33:28, 1] smbd/service.c:close_cnum(510)
  ost15 (0.0.0.0) closed connection to service tmobley
[1998/12/17 06:33:34, 1] smbd/service.c:make_connection(484)
  long (192.168.2.5) connect to service virus as user elong (uid=530, gid=100) (pid 25301)
[1998/12/17 06:33:52, 0] smbd/oplock.c:request_oplock_break(942)
  request_oplock_break: no response received to oplock break request to pid 17398 on port 1502 for dev = 301, inode = 620605
[1998/12/17 06:33:52, 1] smbd/service.c:close_cnum(510)
  long (0.0.0.0) closed connection to service virus
[1998/12/17 07:03:45, 1] smbd/service.c:make_connection(484)
  jordan (192.168.1.3) connect to service virus as user mcjordan (uid=832, gid=100) (pid 7254)
[1998/12/17 07:07:28, 1] smbd/service.c:make_connection(484)
  mcbride (192.168.1.12) connect to service virus as user umcbride (uid=532, gid=100) (pid 9067)
[1998/12/17 07:08:43, 1] smbd/service.c:make_connection(484)
  ssmith (192.168.1.9) connect to service virus as user ssmith (uid=872, gid=100) (pid 9676)
[1998/12/17 07:08:58, 0] lib/doscalls.c:dos_GetWd(405)
  Very strange, couldn't stat "."

What really bothers me is the section concerning the oplock_break problems. Everything churns along fine with no noticeable speed decrease until these messages start cropping up in the log files. We have roaming profiles and desktops, and once these messages start appearing in the log files, things screech to a halt, and most (if not all) of the computers will "halt". As you can see, the share that is causing the problem here is the virus share (the F-SECURE anti-virus communication directory).

I can reproduce this pretty much at will (I just have to reinstall 2.0). Are there any ideas about how to fix this? Is it a bug, or is it my configuration? It seems strange to me that 1.9.18pl10 works fine and 2.0 doesn't with the same smb.conf (minus the minor unrelated changes that we made for 2.0).

Any ideas? Thanks!

Bruce Tenison
btenison@dibbs.net
All the world's a stage, and we are merely players
Scott D. Webster
1998-Dec-19 02:00 UTC
oplock_break: client failure in break - shutting down
On Fri, 18 Dec 1998, Bruce Tenison wrote:

> What really bothers me is the section concerning the oplock_break
> problems. It seems that everything churns along fine with no noticeable
> speed decrease until these messages start cropping up in the log files.
> We have roaming profiles and desktops, and once these messages start
> appearing in the log files, things screech to a halt, and most (if not
> all) of the computers will "halt". As you can see the share that is
> causing the problem here is the virus share (F-SECURE anti-virus
> communication directory)

Are oplocks needed on that share? Have you tried turning them off? If you're simply running the F-SECURE program from there, you shouldn't need oplocks.

-- 
Scott D. Webster swebster@carroll.com
Etc Services Voice: 201.385.7113
Linux, UNIX, & TCP/IP Network Consulting Pager: 800.379.2402
Bruce,

On Sat, 19 Dec 1998 18:14:42 +1100, Bruce Tenison wrote:

>> Are oplocks needed on that share? Have you tried turning them
>> off? If you're simply running the F-SECURE program from there, you
>> shouldn't need oplocks.
>>
>I'm pretty sure that they're needed, since the clients don't actually run
>the software from there, but they use the share to check for updates to
>the AV files, and report back to the Administration station via the share
>on any located viruses. Works really nice, when it works ;)
>
>Maybe I'm misunderstanding the purpose of oplocks?!?!?

Yes. Opportunistic locks (oplocks) allow the client to aggressively cache file contents locally. With "oplocks = false" the clients simply don't cache file contents. It has nothing to do with record locking.

Hasta la vista,
Robert

-- 
---------------------------------------------------------------
Robert.Dahlem@frankfurt.netsurf.de
Radio Bornheim - 2:2461/332@fidonet +49-69-4930830 (ZyX, V34)
                 2:2461/326@fidonet +49-69-94414444 (ISDN X.75)
---------------------------------------------------------------
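For reference, the "oplocks = false" setting mentioned above can be applied to the one troublesome share rather than to the whole server. The fragment below is only a sketch: the share name comes from the logs earlier in the thread, while the path is a placeholder, since the thread never says where the share lives.

```
[virus]
    ; placeholder path: not given anywhere in this thread
    path = /path/to/virus
    ; grant no oplocks on this share, so clients do not cache its files locally
    oplocks = false
```

Limiting the change to one share should confine any loss of client-side caching to the files in that share.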
> You should be aware of the fact that indeed there is a problem with your
> configuration (oplocks should be breakable easily) and that running without
> oplocks may have serious negative performance impacts.

Well, it seems that many (if not all) of our 95/98 clients are having the problem of not succeeding with the break request. For example, my machine (tenison) requests:

[1998/12/18 10:36:21, 0] smbd/oplock.c:request_oplock_break(935)
  request_oplock_break: no response received to oplock break request to pid 14904 on port 1243 for dev = 301, inode = 620605

and obviously gets no response. So I go searching in the logs for this pid and inode number, and find this in the logs for another machine (ahooks):

[1998/12/18 10:36:29, 0] smbd/oplock.c:oplock_break(734)
  oplock_break: receive_smb timed out after 30 seconds.
  oplock_break failed for file COMM.INF (dev = 301, inode = 620605).
[1998/12/18 10:36:29, 0] smbd/oplock.c:oplock_break(804)
  oplock_break: client failure in break - shutting down this smbd.
[1998/12/18 10:36:29, 1] smbd/service.c:close_cnum(510)
  ahooks (0.0.0.0) closed connection to service virus

This seems to be the standard routine no matter which machine has the oplock, and whether the machine is a 95, 95b, 95SR1, or 98 client, as far as I can tell.

I've tried to up the log level on the pid that has the lock, but I never know which it will be (since it shuts down and closes the connection), so I guess we'll have to up the log level on all smbd processes until we get a good snapshot? Any suggestions as to the log level to set for smbd?

> I think you should try to break that problem down to a situation as simple as
> possible and provide some debug logs to the list so we can investigate on this.
> Maybe in the end we will find that its client related but the knowledge about this
> should be searchable in the archives.

Will work on it and see what I can get. After mulling it over last night, I came to the conclusion that we had a similar problem with 1.9.18pl10, but it wasn't as obvious a problem.

Thanks again for the help!

Bruce Tenison
btenison@dibbs.net
All the world's a stage, and we are merely players
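The manual search described in this message (take the dev and inode from a request_oplock_break entry, then look for the matching oplock_break failure in the other machines' logs) could be automated roughly as follows. This is a minimal sketch, not a Samba tool: the function name `correlate` and the dictionary layout are made up for illustration, and the regular expressions assume the message wording shown in the excerpts above.

```python
import re
from collections import defaultdict

# Patterns follow the log wording quoted in this thread; \s+ tolerates
# the line wrapping that mail clients introduce.
REQUEST_RE = re.compile(
    r"no response received to oplock break request to\s+pid\s+(\d+)"
    r"\s+on\s+port\s+(\d+)\s+for\s+dev = (\w+), inode = (\d+)"
)
FAILURE_RE = re.compile(
    r"oplock_break failed for file (.+?)\s+\(dev = (\w+), inode = (\d+)\)"
)


def correlate(logs):
    """Match unanswered break requests to failures on the same file.

    `logs` maps a log name (for example the client machine name) to the
    text of that log. Returns one dict per unanswered request.
    """
    # Index every reported failure by (dev, inode).
    failures = defaultdict(list)
    for name, text in logs.items():
        for m in FAILURE_RE.finditer(text):
            filename, dev, inode = m.groups()
            failures[(dev, inode)].append((name, filename))

    results = []
    for name, text in logs.items():
        for m in REQUEST_RE.finditer(text):
            pid, port, dev, inode = m.groups()
            results.append({
                "requester_log": name,      # log that reported the timeout
                "holder_pid": int(pid),     # smbd that held the oplock
                "port": int(port),
                "dev": dev,
                "inode": inode,
                # logs where the break on the same file failed
                "failures": failures.get((dev, inode), []),
            })
    return results


if __name__ == "__main__":
    # Sample text taken from the excerpts quoted in this thread.
    sample = {
        "tenison": "request_oplock_break: no response received to oplock "
                   "break request to pid 14904 on port 1243 for "
                   "dev = 301, inode = 620605",
        "ahooks": "oplock_break failed for file COMM.INF "
                  "(dev = 301, inode = 620605).",
    }
    for row in correlate(sample):
        print(row)
```

Feeding it the text of each per-machine log identifies which machine's smbd held the oplock without knowing in advance which pid to watch.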
Bruce,

On Mon, 21 Dec 1998 02:42:48 +1100, Bruce Tenison wrote:

>> You should be aware of the fact that indeed there is a problem with your
>> configuration (oplocks should be breakable easily) and that running without
>> oplocks may have serious negative performance impacts.

>Well, it seems that many (if not all) of our 95/98 clients are having the
>problem of not succeeding with the break request. For example, my machine
>(tenison) requests:
>[1998/12/18 10:36:21, 0] smbd/oplock.c:request_oplock_break(935)
> request_oplock_break: no response received to oplock break request to
>pid 14904 on
>port 1243 for dev = 301, inode = 620605
>
>and obviously gets no response. So I go searching in the logs for this pid
>and inode number and get in the logs for another machine (ahooks):
>[1998/12/18 10:36:29, 0] smbd/oplock.c:oplock_break(734)
> oplock_break: receive_smb timed out after 30 seconds.
> oplock_break failed for file COMM.INF (dev = 301, inode = 620605).
>[1998/12/18 10:36:29, 0] smbd/oplock.c:oplock_break(804)
> oplock_break: client failure in break - shutting down this smbd.
>[1998/12/18 10:36:29, 1] smbd/service.c:close_cnum(510)
> ahooks (0.0.0.0) closed connection to service virus

>This seems to be the standard routine for no matter which machine has the
>oplock, and whether or not the machine is a 95, 95b, 95SR1, or 98 client
>as far as I can tell.

>I've tried to up the log level on the pid that has the lock, but I never
>know which it will be (since it shuts down and closes the connection), so
>I guess we'll have to up the log level on all smbd processes until we get
>a good snapshot???
>Any suggestions as to the log level to put smbd?

Try the following in smb.conf. In your global section, add the line

    include = /some_path/smb.conf.%I

Now let's assume the IP address of your machine is 192.168.1.1. Then create a file /some_path/smb.conf.192.168.1.1 with the following content:

    debug level = 10
    log file = /some_path/log.%I

Restart the smbd for that machine: parse the output of smbstatus for your IP address, kill the associated process ID, and do a "DIR X:" from the command line, where X: is the drive letter of a mounted share. You should now have a growing log file at debug level 10 just for your machine.

Then start

    tcpdump host <server_IP> and host <client_IP>

and work until the error comes back. Be nice and use a patched tcpdump with the SMB extensions (you can get this from one of the Samba mirror sites), but it's not essential for the first step.

>I came to the conclusion that we had a similar problem with 1.9.18pl10, but it
>wasn't as obvious a problem.

Check it with the 2.0 beta code: that is where further development will go.

Hasta la vista,
Robert

-- 
---------------------------------------------------------------
Robert.Dahlem@frankfurt.netsurf.de
Radio Bornheim - 2:2461/332@fidonet +49-69-4930830 (ZyX, V34)
                 2:2461/326@fidonet +49-69-94414444 (ISDN X.75)
---------------------------------------------------------------
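Laid out as files, the per-client logging setup described in this reply might look like the sketch below. The /some_path prefix is the placeholder used in the message itself, and 192.168.1.1 is the example client address it assumes. %I is the Samba substitution for the client's IP address, so only that client picks up the extra settings.

```
; --- smb.conf ---
[global]
    ; pulls in a per-client file named after the connecting client's IP
    include = /some_path/smb.conf.%I

; --- /some_path/smb.conf.192.168.1.1 (example client address) ---
    debug level = 10
    log file = /some_path/log.%I
```

Since smbd processes for other clients find no matching include file, they should keep running at the normal log level. That avoids raising the log level on every smbd process.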