thr3ads.net - samba - [Samba] smbd panic due to negative exclusive oplock count [Feb 2010]

If this information is useful, please help other people find it:
Share via:

Georges Brom

2010-Feb-11 13:03 UTC

[Samba] smbd panic due to negative exclusive oplock count

Hi,

I'm facing an issue with at least two Samba servers in production both
showing important number of log entries like:

smbd[27355]:   PANIC: assert failed at smbd/oplock.c(149):
exclusive_oplocks_open>=0


Looking at the source code, this panic is triggered when the number of
exclusive oplocks for a given file becomes negative.


Once triggered for one smbd process, this process can generate more than a
million log entries per day, causing both perf and space issues on server
end, and poor perf on client end.

When dying, the smbd process will log this final entry:
smbd[27355]:   respond_to_all_remaining_local_messages: PANIC : we have -1
exclusive oplocks.


This issue has been reproduced on both servers with versions 3.3.9 and
3.3.10 on RHEL 5.4 64bits, not tested with older ones.


A level 10 log extract around a PANIC is like that:

...
[2010/02/08 17:18:14,  5] smbd/uid.c:change_to_user(355)
  change_to_user uid=(0,6195) gid=(0,101)
[2010/02/08 17:18:14,  3] smbd/reply.c:reply_close(4385)
  close fd=61 fnum=8800 (numopen=3)
[2010/02/08 17:18:14,  6] smbd/close.c:set_close_write_time(450)
  close_write_time: Sun Feb  7 07:28:15 2106
[2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100)
  Locking key 0BFD0000000000003E85
[2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129)
  Allocated locked data 0x0x2b62c70b6450
[2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(519)
  parse_share_modes: delete_on_close: 0, owrt: Sun Jan 10 18:14:29 2010 CET,
cwrt: Thu Jan  1 01:00:00 1970 CET, tok: 0, num_share_modes: 1
[2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(616)
  parse_share_modes: share_mode_entry[0]:  pid = 5575, share_access = 0x3,
private_options = 0x40, access_mask = 0x20089, mid = 0x0, type= 0x10, gen_id
= 23508270, uid = 6195, flags = 0, file_id fd0b:5e853e
[2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_record_destr(42)
  Unlocking key 0BFD0000000000003E85
[2010/02/08 17:18:14,  0] smbd/oplock.c:release_file_oplock(149)
  PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open>=0
[2010/02/08 17:18:14, 10] locking/posix.c:get_windows_lock_ref_count(495)
  get_windows_lock_count for file XXXXXXXXXXX = 0
[2010/02/08 17:18:14, 10] locking/posix.c:delete_windows_lock_ref_count(521)
  delete_windows_lock_ref_count for file XXXXXXXXXXXX
[2010/02/08 17:18:14,  2] smbd/close.c:close_normal_file(606)
  padcube closed file XXXXXXXXXXXX (numopen=2) NT_STATUS_OK
[2010/02/08 17:18:14,  5] smbd/files.c:file_free(476)
  freed files structure 8800 (33 used)
[2010/02/08 17:18:14,  5] lib/util.c:show_msg(645)
...

This pattern looks the same for all PANICs, whatever the 'XXXXXXXXXXX'
file
the client is using.


Any help would be greatly appreciated in order to better diagnose this issue
and find a solution to it.


Regards,
Georges

Georges Brom

2010-Feb-13 11:19 UTC

head link

[Samba] smbd panic due to negative exclusive oplock count

Hi,

I'm facing an issue with at least two Samba servers both showing important
number of log entries like:

smbd[27355]:   PANIC: assert failed at smbd/oplock.c(149):
exclusive_oplocks_open>=0


Looking at the source code, this panic is triggered when the number of
exclusive oplocks for a given file becomes negative.


Once triggered for one smbd process, this process can generate more than a
million log entries per day, causing both perf and space issues on server
end, and poor perf on client end.

When dying, the smbd process will log this final entry:
smbd[27355]:   respond_to_all_remaining_
local_messages: PANIC : we have -1 exclusive oplocks.


This issue has been reproduced on both servers with versions 3.3.9 and
3.3.10 on RHEL 5.4 64bits, not tested with older ones.


A level 10 log extract around a PANIC is like that:

...
[2010/02/08 17:18:14,  5] smbd/uid.c:change_to_user(355)
  change_to_user uid=(0,6195) gid=(0,101)
[2010/02/08 17:18:14,  3] smbd/reply.c:reply_close(4385)
  close fd=61 fnum=8800 (numopen=3)
[2010/02/08 17:18:14,  6] smbd/close.c:set_close_write_time(450)
  close_write_time: Sun Feb  7 07:28:15 2106
[2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100)
  Locking key 0BFD0000000000003E85
[2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129)
  Allocated locked data 0x0x2b62c70b6450
[2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(519)
  parse_share_modes: delete_on_close: 0, owrt: Sun Jan 10 18:14:29 2010 CET,
cwrt: Thu Jan  1 01:00:00 1970 CET, tok: 0, num_share_modes: 1
[2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(616)
  parse_share_modes: share_mode_entry[0]:  pid = 5575, share_access = 0x3,
private_options = 0x40, access_mask = 0x20089, mid = 0x0, type= 0x10, gen_id
= 23508270, uid = 6195, flags = 0, file_id fd0b:5e853e
[2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_record_destr(42)
  Unlocking key 0BFD0000000000003E85
[2010/02/08 17:18:14,  0] smbd/oplock.c:release_file_oplock(149)
  PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open>=0
[2010/02/08 17:18:14, 10] locking/posix.c:get_windows_lock_ref_count(495)
  get_windows_lock_count for file XXXXXXXXXXX = 0
[2010/02/08 17:18:14, 10] locking/posix.c:delete_windows_lock_ref_count(521)
  delete_windows_lock_ref_count for file XXXXXXXXXXXX
[2010/02/08 17:18:14,  2] smbd/close.c:close_normal_file(606)
  padcube closed file XXXXXXXXXXXX (numopen=2) NT_STATUS_OK
[2010/02/08 17:18:14,  5] smbd/files.c:file_free(476)
  freed files structure 8800 (33 used)
[2010/02/08 17:18:14,  5] lib/util.c:show_msg(645)
...

This pattern looks the same for all PANICs, whatever the 'XXXXXXXXXXX'
file
the client is using.


Any help would be greatly appreciated in order to better diagnose this issue
and find a solution to it.


Regards,
Georges

Volker Lendecke

2010-Feb-13 15:45 UTC

head link

[Samba] smbd panic due to negative exclusive oplock count

On Sat, Feb 13, 2010 at 12:19:41PM +0100, Georges Brom
wrote:> This issue has been reproduced on both servers with versions 3.3.9 and
> 3.3.10 on RHEL 5.4 64bits, not tested with older ones.
> 
> 
> A level 10 log extract around a PANIC is like that:
We need a *lot* more context to find this. Please open a bug
at bugzilla.samba.org and upload a full debug level 10 log
of smbd leading to this bug.

Please note that the default max log size is far too small
for this kind of debugging, please set it to 0 for this
test. And also note that smbd logfiles due to their
repetitive nature shrink a lot with bzip2 -9 :-)

If you have privacy concerns, feel free to close this bug to
be readable by the Samba-Team only, or alternatively send
the logfile to me directly. If that is not enough, we need
information how to reproduce the bug ourselves.

Thanks,

Volker
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: Digital signature
URL:
<http://lists.samba.org/pipermail/samba/attachments/20100213/baec5829/attachment.pgp>

Reasonably Related Threads

Search for more seemingly similar threads

samba - Feb 2010 - smbd panic due to negative exclusive oplock count

[Samba] smbd panic due to negative exclusive oplock count

[Samba] smbd panic due to negative exclusive oplock count

[Samba] smbd panic due to negative exclusive oplock count

Reasonably Related Threads