Good morning,
We have been having trouble browsing a ZFS share, especially in
the afternoon, and found all 8 cores at 100% with smtx above 100000
on each core in mpstat. We are running Solaris 5.11 with Samba
3.5.10, 48 GB of RAM and two 4-core Xeons. The file server is joined
in domain mode to Windows 2003 R2 SP2 with Services for Unix installed,
and we only have around 80 users (40-50 on the server at a time).
Here's an example of the mpstat in the morning:
CPU minf mjf xcal intr ithr  csw icsw migr  smtx srw syscl usr sys wt idl
  0    0   0   14 8910 8686 3612   12 1084 59403 316   921   1  65  0  35
  1    0   0    7 5638 5536 5817   16 1214 66263 309   117   0  69  0  31
  2    0   0    6 4646 4580 3404    6 1120 64217 324    74   0  62  0  38
  3    0   0    8 3112 3006 3712    8 1131 61692 340    54   0  61  0  39
  4    0   0   12  166   36 3102   14  973 51300 221    68   0  61  0  38
  5    0   0   15  206   25 3397   13  965 51468 238    62   0  61  0  39
  6    0   0   10  205   29 3254   11  958 52947 236   107   0  62  0  38
  7    0   0   19  124   30 3889   15  962 51948 230    44   0  62  0  38
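To put numbers on that sample, here is a minimal Python sketch that totals the smtx column and averages sys across the CPUs. It is only an illustration: the helper name parse_mpstat and the choice of Python are assumptions, not something we run on the server. It reads the figures as a flat stream of 16 values per CPU, so it does not matter whether the rows are wrapped.

```python
# Column names as printed in the mpstat header.
COLUMNS = ("CPU minf mjf xcal intr ithr csw icsw migr smtx "
           "srw syscl usr sys wt idl").split()

# The morning sample from this message, copied verbatim (wrapped rows and all).
SAMPLE = """
0 0 0 14 8910 8686 3612 12 1084 59403 316 921 1 65
0 35
1 0 0 7 5638 5536 5817 16 1214 66263 309 117 0 69
0 31
2 0 0 6 4646 4580 3404 6 1120 64217 324 74 0 62
0 38
3 0 0 8 3112 3006 3712 8 1131 61692 340 54 0 61
0 39
4 0 0 12 166 36 3102 14 973 51300 221 68 0 61
0 38
5 0 0 15 206 25 3397 13 965 51468 238 62 0 61
0 39
6 0 0 10 205 29 3254 11 958 52947 236 107 0 62
0 38
7 0 0 19 124 30 3889 15 962 51948 230 44 0 62
0 38
"""


def parse_mpstat(text):
    """Return one dict per CPU; hypothetical helper, tolerant of wrapped rows."""
    tokens = [int(t) for t in text.split()]
    width = len(COLUMNS)
    if len(tokens) % width:
        raise ValueError("token count is not a multiple of the column count")
    return [dict(zip(COLUMNS, tokens[i:i + width]))
            for i in range(0, len(tokens), width)]


rows = parse_mpstat(SAMPLE)
total_smtx = sum(r["smtx"] for r in rows)        # mutex spins, all CPUs
mean_sys = sum(r["sys"] for r in rows) / len(rows)  # average kernel time, percent
print(f"{len(rows)} CPUs, total smtx {total_smtx}, mean sys {mean_sys:.1f}%")
```

Even in the morning, every CPU is spending over 60% of its time in the kernel.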
We ran lockstat to see what was causing the high SMTX and found this:
root@dcivolume01:~# lockstat -D 10 sleep 2
Adaptive mutex spin: 706601 events in 2.060 seconds (342956 events/sec)
 Count indv cuml rcnt     nsec Lock               Caller
-------------------------------------------------------------------------------
402581  57%  57% 0.00    13105 0xffffff0d492ad678 kidmap_cache_lookup_uidbysid+0x51
265969  38%  95% 0.00    10882 0xffffff0d492ad678 kidmap_cache_lookup_gidbysid+0x51
 24747   4%  98% 0.00     1427 zone0+0x20         zone_getspecific+0x2b
  3660   1%  99% 0.00     1058 0xffffff0d54a5d590 rrw_enter_read+0x1b
  3260   0%  99% 0.00     1524 0xffffff0d54a5d670 zfs_zget+0x46
  2552   0%  99% 0.00      800 0xffffff0d54a5d590 rrw_exit+0x1d
   872   0% 100% 0.00    27319 0xffffff0d538a3330 zfs_zaccess_aces_check+0x77
   599   0% 100% 0.00      888 0xffffff0d55bedcb0 dnode_block_freed+0x6e
   479   0% 100% 0.00     1899 0xffffff0e32dfe168 dmu_zfetch_find+0x17b
   149   0% 100% 0.00      802 0xffffff0d34e79000 vn_rele+0x1e
-------------------------------------------------------------------------------
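One thing that stands out in this report is that the two kidmap callers spin on the same lock address. A small Python sketch (again only an illustration; the function name events_by_lock is made up for this example) groups the counts by lock rather than by caller to show how much of the contention sits on that one mutex:

```python
from collections import defaultdict

# (count, lock, caller) rows copied from the lockstat report in this message.
EVENTS = [
    (402581, "0xffffff0d492ad678", "kidmap_cache_lookup_uidbysid+0x51"),
    (265969, "0xffffff0d492ad678", "kidmap_cache_lookup_gidbysid+0x51"),
    (24747, "zone0+0x20", "zone_getspecific+0x2b"),
    (3660, "0xffffff0d54a5d590", "rrw_enter_read+0x1b"),
    (3260, "0xffffff0d54a5d670", "zfs_zget+0x46"),
    (2552, "0xffffff0d54a5d590", "rrw_exit+0x1d"),
    (872, "0xffffff0d538a3330", "zfs_zaccess_aces_check+0x77"),
    (599, "0xffffff0d55bedcb0", "dnode_block_freed+0x6e"),
    (479, "0xffffff0e32dfe168", "dmu_zfetch_find+0x17b"),
    (149, "0xffffff0d34e79000", "vn_rele+0x1e"),
]
TOTAL_EVENTS = 706601  # from the "Adaptive mutex spin" header line


def events_by_lock(events):
    """Sum spin counts per lock address (hypothetical helper)."""
    totals = defaultdict(int)
    for count, lock, _caller in events:
        totals[lock] += count
    return dict(totals)


totals = events_by_lock(EVENTS)
top_lock, top_count = max(totals.items(), key=lambda kv: kv[1])
share = top_count / TOTAL_EVENTS
print(f"{top_lock}: {top_count} spins ({share:.1%} of all events)")
```

So nearly all of the spinning in this 2-second sample is on a single mutex taken by the kernel idmap cache lookups, not on anything in ZFS itself.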
We can't figure out what's causing it. We even reduced the number of
groups used on the device, but that hasn't helped. It seems that the
longer the machine has been serving browsing requests, the worse it
gets, until on some days it locks up.
I tried flushing idmap and restarting both services (smb and idmap), but
it doesn't seem to help.
Any help would be greatly appreciated!
Thank you,
Matthias
MATTHIAS FOSTEL
TECHNOLOGY
DESIGN COLLECTIVE, INC.
ARCHITECTURE | PLANNING | INTERIORS
WWW.DESIGNCOLLECTIVE.COM <http://www.designcollective.com/>