Georges Brom
2010-Feb-11 13:03 UTC
[Samba] smbd panic due to negative exclusive oplock count
Hi, I'm facing an issue with at least two Samba servers in production both showing important number of log entries like: smbd[27355]: PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open>=0 Looking at the source code, this panic is triggered when the number of exclusive oplocks for a given file becomes negative. Once triggered for one smbd process, this process can generate more than a million log entries per day, causing both perf and space issues on server end, and poor perf on client end. When dying, the smbd process will log this final entry: smbd[27355]: respond_to_all_remaining_local_messages: PANIC : we have -1 exclusive oplocks. This issue has been reproduced on both servers with versions 3.3.9 and 3.3.10 on RHEL 5.4 64bits, not tested with older ones. A level 10 log extract around a PANIC is like that: ... [2010/02/08 17:18:14, 5] smbd/uid.c:change_to_user(355) change_to_user uid=(0,6195) gid=(0,101) [2010/02/08 17:18:14, 3] smbd/reply.c:reply_close(4385) close fd=61 fnum=8800 (numopen=3) [2010/02/08 17:18:14, 6] smbd/close.c:set_close_write_time(450) close_write_time: Sun Feb 7 07:28:15 2106 [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100) Locking key 0BFD0000000000003E85 [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129) Allocated locked data 0x0x2b62c70b6450 [2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(519) parse_share_modes: delete_on_close: 0, owrt: Sun Jan 10 18:14:29 2010 CET, cwrt: Thu Jan 1 01:00:00 1970 CET, tok: 0, num_share_modes: 1 [2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(616) parse_share_modes: share_mode_entry[0]: pid = 5575, share_access = 0x3, private_options = 0x40, access_mask = 0x20089, mid = 0x0, type= 0x10, gen_id = 23508270, uid = 6195, flags = 0, file_id fd0b:5e853e [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_record_destr(42) Unlocking key 0BFD0000000000003E85 [2010/02/08 17:18:14, 0] smbd/oplock.c:release_file_oplock(149) PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open>=0 [2010/02/08 17:18:14, 10] locking/posix.c:get_windows_lock_ref_count(495) get_windows_lock_count for file XXXXXXXXXXX = 0 [2010/02/08 17:18:14, 10] locking/posix.c:delete_windows_lock_ref_count(521) delete_windows_lock_ref_count for file XXXXXXXXXXXX [2010/02/08 17:18:14, 2] smbd/close.c:close_normal_file(606) padcube closed file XXXXXXXXXXXX (numopen=2) NT_STATUS_OK [2010/02/08 17:18:14, 5] smbd/files.c:file_free(476) freed files structure 8800 (33 used) [2010/02/08 17:18:14, 5] lib/util.c:show_msg(645) ... This pattern looks the same for all PANICs, whatever the 'XXXXXXXXXXX' file the client is using. Any help would be greatly appreciated in order to better diagnose this issue and find a solution to it. Regards, Georges
Georges Brom
2010-Feb-13 11:19 UTC
[Samba] smbd panic due to negative exclusive oplock count
Hi, I'm facing an issue with at least two Samba servers both showing important number of log entries like: smbd[27355]: PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open>=0 Looking at the source code, this panic is triggered when the number of exclusive oplocks for a given file becomes negative. Once triggered for one smbd process, this process can generate more than a million log entries per day, causing both perf and space issues on server end, and poor perf on client end. When dying, the smbd process will log this final entry: smbd[27355]: respond_to_all_remaining_ local_messages: PANIC : we have -1 exclusive oplocks. This issue has been reproduced on both servers with versions 3.3.9 and 3.3.10 on RHEL 5.4 64bits, not tested with older ones. A level 10 log extract around a PANIC is like that: ... [2010/02/08 17:18:14, 5] smbd/uid.c:change_to_user(355) change_to_user uid=(0,6195) gid=(0,101) [2010/02/08 17:18:14, 3] smbd/reply.c:reply_close(4385) close fd=61 fnum=8800 (numopen=3) [2010/02/08 17:18:14, 6] smbd/close.c:set_close_write_time(450) close_write_time: Sun Feb 7 07:28:15 2106 [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100) Locking key 0BFD0000000000003E85 [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129) Allocated locked data 0x0x2b62c70b6450 [2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(519) parse_share_modes: delete_on_close: 0, owrt: Sun Jan 10 18:14:29 2010 CET, cwrt: Thu Jan 1 01:00:00 1970 CET, tok: 0, num_share_modes: 1 [2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(616) parse_share_modes: share_mode_entry[0]: pid = 5575, share_access = 0x3, private_options = 0x40, access_mask = 0x20089, mid = 0x0, type= 0x10, gen_id = 23508270, uid = 6195, flags = 0, file_id fd0b:5e853e [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_record_destr(42) Unlocking key 0BFD0000000000003E85 [2010/02/08 17:18:14, 0] smbd/oplock.c:release_file_oplock(149) PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open>=0 [2010/02/08 17:18:14, 10] locking/posix.c:get_windows_lock_ref_count(495) get_windows_lock_count for file XXXXXXXXXXX = 0 [2010/02/08 17:18:14, 10] locking/posix.c:delete_windows_lock_ref_count(521) delete_windows_lock_ref_count for file XXXXXXXXXXXX [2010/02/08 17:18:14, 2] smbd/close.c:close_normal_file(606) padcube closed file XXXXXXXXXXXX (numopen=2) NT_STATUS_OK [2010/02/08 17:18:14, 5] smbd/files.c:file_free(476) freed files structure 8800 (33 used) [2010/02/08 17:18:14, 5] lib/util.c:show_msg(645) ... This pattern looks the same for all PANICs, whatever the 'XXXXXXXXXXX' file the client is using. Any help would be greatly appreciated in order to better diagnose this issue and find a solution to it. Regards, Georges
Volker Lendecke
2010-Feb-13 15:45 UTC
[Samba] smbd panic due to negative exclusive oplock count
On Sat, Feb 13, 2010 at 12:19:41PM +0100, Georges Brom wrote:> This issue has been reproduced on both servers with versions 3.3.9 and > 3.3.10 on RHEL 5.4 64bits, not tested with older ones. > > > A level 10 log extract around a PANIC is like that:We need a *lot* more context to find this. Please open a bug at bugzilla.samba.org and upload a full debug level 10 log of smbd leading to this bug. Please note that the default max log size is far too small for this kind of debugging, please set it to 0 for this test. And also note that smbd logfiles due to their repetitive nature shrink a lot with bzip2 -9 :-) If you have privacy concerns, feel free to close this bug to be readable by the Samba-Team only, or alternatively send the logfile to me directly. If that is not enough, we need information how to reproduce the bug ourselves. Thanks, Volker -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: Digital signature URL: <http://lists.samba.org/pipermail/samba/attachments/20100213/baec5829/attachment.pgp>
Apparently Analagous Threads
- VFS ACL with SMB2
- Can't remember name of command to temporarily disable a share
- Confusion berween file and direcory
- The specified server can't perform the requested operation. Problem Accessing samba from windows 7 and vista
- Problem After Upgrade - NT_STATUS_FILE_IS_A_DIRECTORY