CaptainTrips28
2023-Feb-08 01:07 UTC
[Samba] long delays with file enumeration & listing in large data storage environment
After joining a Samba server to the domain using either Winbind or
ADBridge (PBIS), enumerating files from Windows File Explorer for a share
takes roughly 9 seconds for a folder containing 30k files, a mix of large
(up to 15GB) and small (as low as a few KB) files. Initially, before we set
log level to "0" and removed the server max protocol setting in smb.conf
(which now defaults to SMB3_11), listing these files would take up to 45
seconds. So we have already drastically cut down this time with smb.conf
option corrections.
While ~9 seconds may seem trivial for enumerating 30k files, this is only a
test scenario with randomly generated junk files. In our production
scenario, we have folders frequently containing upwards of 100k and even
250k files, and requirements for folders that may contain up to 1 million
files of various sizes, with filenames in mixed upper and lowercase.
Assuming an average of around 200k files in the prod scenario, the
enumeration time would be around 1min10secs to list all files. This
enumeration delay is causing issues for our end users.
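As a rough check on that estimate, the sketch below scales the measured
9 seconds for 30k files to other folder sizes. It assumes the time grows
linearly with the file count, which is an assumption and not something
measured here. Under that assumption 200k files comes out at about 60
seconds, a little below the 1min10secs figure above, so real scaling may be
somewhat worse than linear.

```python
# Linear extrapolation of directory enumeration time from one measurement.
# The 9 s / 30,000 files figure comes from the test described above;
# linear growth with file count is an assumption, not a measured fact.

MEASURED_SECONDS = 9.0
MEASURED_FILES = 30_000


def estimate_seconds(file_count: int) -> float:
    """Scale the measured time linearly to another directory size."""
    per_file = MEASURED_SECONDS / MEASURED_FILES
    return per_file * file_count


if __name__ == "__main__":
    for count in (30_000, 100_000, 200_000, 250_000, 1_000_000):
        print(f"{count:>9,} files: ~{estimate_seconds(count):.0f} s")
```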
Prior to joining the Samba server to the domain, using just a local
SMB-enabled account, enumeration of this 30k-file share from Windows File
Explorer was instantaneous (a second or less). However, in production,
Windows AD users access the shares, which have AD owners and groups
assigned to the share folders and permissions based on their level of
access.
We have tried numerous config options and variations of options
enabled/disabled, and tried local storage mountpoints on the Samba server
vs. remote mountpoints via NFS and SSHFS. We've tried several hardware
setups: a physical HPE Gen11 DL380 system acting as the Samba server, a
virtual Samba server within vSphere, and moving hardware around on network
appliances to reduce hops and improve throughput. IPv6 has been disabled,
all ports are open as required, and there is no antivirus scan
interference. On the Windows side, SMB client settings have been tested
with various options, NetBIOS is confirmed disabled, etc. We've also tried
limiting the Samba server to one DC and ignoring all irrelevant domains. No
matter the test scenario, once domain joined, the lowest enumeration time
we can achieve is 9 seconds until the file listing is complete, and that
outcome is essentially the same in each of the testing scenarios above.
- Tried both RHEL 8.4 and 8.6 (FIPS disabled, SELinux permissive and
firewalld off)
- Samba versions 4.13.3 (RHEL 8.4) and 4.15.5 (RHEL 8.6) have both been
tried with identical outcome (all pulled from RH Satellite repos)
- We've tried with STIGs both applied and unapplied, as well as folder/file
encryption both on and off (no difference in performance).
- PBIS/ADBridge was tried with both versions 22.2.x and 22.3.x (latest)
- Testing share access from Windows Server 2019 Datacenter (Build 17763).
Our user VDA sessions would also be accessing the shares from the same.
- Production environment is enterprise scale; hundreds of users/thousands
of folders/millions of files
- Share enumeration speed is the same if accessing by either mapped drive,
file explorer // access, or symbolically linking to the share
Any suggestions/recommendations on how to reduce or eliminate enumeration
and listing times for these shares are certainly appreciated.
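One way to separate filesystem cost from SMB and domain-lookup cost is to
time the same enumeration locally on the server. The following is a minimal
sketch, assuming Python 3 is available there. The helper names and the
temporary test directory are illustrative only; to test a real share,
point time_enumeration() at the share path instead.

```python
import os
import tempfile
import time
from typing import Tuple


def make_test_files(directory: str, count: int) -> None:
    """Create `count` empty files named file_000000, file_000001, ..."""
    for i in range(count):
        with open(os.path.join(directory, f"file_{i:06d}"), "w"):
            pass


def time_enumeration(directory: str) -> Tuple[int, float]:
    """Return (entry count, seconds) for listing and stat-ing every entry."""
    start = time.perf_counter()
    entries = 0
    with os.scandir(directory) as it:
        for entry in it:
            entry.stat()  # size and timestamps, as a file browser would need
            entries += 1
    return entries, time.perf_counter() - start


if __name__ == "__main__":
    # 30,000 files matches the test folder size described above.
    with tempfile.TemporaryDirectory() as tmp:
        make_test_files(tmp, 30_000)
        n, seconds = time_enumeration(tmp)
        print(f"{n} entries in {seconds:.2f} s")
```

If the local run takes well under a second while the SMB listing takes 9
seconds, the time is probably being spent in smbd or in identity lookups
and not in the underlying storage.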
-------------------------------
CURRENT SMB.CONF (I've had to replace the domain/share/host details for
privacy):
-------------------------------
[global]
security = ADS
workgroup = DOMAIN
realm = DOMAIN
machine password timeout = 0
idmap config * : backend = tdb
idmap config * : range = 3000-7999
idmap config DOMAIN.COM:backend = ad
idmap config DOMAIN.COM:schema_mode = rfc2307
idmap config DOMAIN.COM:range = 1617000000-1617999999
idmap config DOMAIN.COM:unix_nss_info = yes
map acl inherit = yes
store dos attributes = yes
log level = 0
dns proxy = yes
hostname lookups = yes
kerberos method = system keytab
log file = /var/log/samba/log.%m
smb encrypt = yes
server signing = auto
client signing = auto
vfs objects = acl_xattr
nt acl support = yes
netbios name = testhost01
[OrgName-OrgGroup-OrgSubgroup-TEST]
path = /mountpoint/Development/OrgName/OrgGroup/OrgSubgroup/TEST
writable = yes
browsable = yes
read only = no
create mask = 0770
force create mode = 0770
directory mask = 0775
hide unreadable = yes
# force group = OrgGroupTest@domain.com
# valid users = +domain.com\"OrgGroupTest"
---------------------------------------------------------------------------
OTHER SMB.CONF OPTIONS TRIED IN VARIATION:
---------------------------------------------------------------------------
*(not currently applied as these options either made no difference or
decreased performance/enumeration):
(global options we've tried various combinations of. We are aware these
options are not recommended by modern kernel standards, but tried them
regardless)
# username map = /etc/samba/user.map
# server multi channel support = yes
# aio read size = 16384
# aio write size = 16384
# aio max threads = 100
# allocation roundup size = 1048576
# interfaces = "10.100.99.103;speed=10000000000,capability=RSS"
# winbind max domain connections = 10
# winbind expand groups = 1
# socket options = SO_RCVBUF=131072 SO_SNDBUF=131072 TCP_NODELAY
# min receivefile size = 16384
# use sendfile = true
# aio write behind = true
(share options we've tried with no performance improvement even when
testing lowercase files only)
# case sensitive = true
# default case = lower
# preserve case = no
# short preserve case = no
Jeremy Allison
2023-Feb-08 01:42 UTC
On Tue, Feb 07, 2023 at 08:07:46PM -0500, CaptainTrips28 via samba wrote:
> - Samba versions 4.13.3 (rhel 8.4) and 4.15.5 (rhel 8.6) have both been
> tried with identical outcome, all pulled from RH satellite repos)
> [...]
> Any suggestions/recommendations on how to reduce or eliminate enumeration
> and listing times for these shares is certainly appreciated.

You may be running into this (from the Samba 4.17.x release notes).

NEW FEATURES/CHANGES
===================

SMB Server performance improvements
-----------------------------------

The security improvements in recent releases (4.13, 4.14, 4.15, 4.16),
mainly as protection against symlink races, caused performance regressions
for meta data heavy workloads.

With 4.17 the situation improved a lot again:

- Pathnames given by a client are divided into dirname and basename. The
  amount of syscalls to validate dirnames is reduced to 2 syscalls (openat,
  close) per component. On modern Linux kernels (>= 5.6) smbd makes use of
  the openat2() syscall with RESOLVE_NO_SYMLINKS, in order to just use 2
  syscalls (openat2, close) for the whole dirname.

- Contended path based operations used to generate a lot of unsolicited
  wakeup events causing thundering herd problems, which lead to massive
  latencies for some clients. These events are now avoided in order to
  provide stable latencies and much higher throughput of open/close
  operations.

I would suggest you try with Samba 4.17.latest to see if you find an improvement.
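To make the syscall arithmetic in those release notes concrete, here is a
small sketch that counts dirname-validation syscalls for a client path
under the two schemes described: 2 per component without openat2(), and 2
for the whole dirname with it. The function names and the example path are
hypothetical, and returning 0 for a file in the share root is an assumption.
This only models the counts stated in the notes, it is not smbd's code.

```python
import posixpath


def dirname_components(client_path: str) -> int:
    """Count the directory components in front of the final basename."""
    dirname = posixpath.dirname(client_path.strip("/"))
    return len([part for part in dirname.split("/") if part])


def validation_syscalls(client_path: str, has_openat2: bool) -> int:
    """Syscalls to validate the dirname, as described in the 4.17 notes.

    Without openat2(): 2 syscalls (openat, close) per component.
    With openat2() and RESOLVE_NO_SYMLINKS: 2 syscalls for the whole dirname.
    """
    components = dirname_components(client_path)
    if components == 0:
        return 0  # assumption: nothing to validate for a file in the share root
    return 2 if has_openat2 else 2 * components


if __name__ == "__main__":
    path = "projects/2023/raw/file.dat"  # hypothetical client pathname
    print("per component:", validation_syscalls(path, has_openat2=False))
    print("with openat2: ", validation_syscalls(path, has_openat2=True))
```

For a path three directories deep this gives 6 syscalls versus 2, and the
gap widens with deeper trees.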
Rowland Penny
2023-Feb-08 08:48 UTC
On 08/02/2023 01:07, CaptainTrips28 via samba wrote:
> -------------------------------
> CURRENT SMB.CONF (I've had to replace the domain/share/host details for
> privacy):
> -------------------------------
> [global]
> security = ADS
> workgroup = DOMAIN
> realm = DOMAIN
> machine password timeout = 0
> idmap config * : backend = tdb
> idmap config * : range = 3000-7999
> idmap config DOMAIN.COM:backend = ad
> idmap config DOMAIN.COM:schema_mode = rfc2307
> idmap config DOMAIN.COM:range = 1617000000-1617999999
> idmap config DOMAIN.COM:unix_nss_info = yes

Can we rule out what may just be typos/bad sanitising? Your workgroup and
realm are both set to 'DOMAIN'. They can be the same, but it wouldn't be
best practice. The realm should be the DNS domain in uppercase, and the
workgroup (or, to give it its other name, the 'NetBIOS domain name') is
usually the left-hand portion of the DNS domain, again in uppercase. For
example, if the DNS domain is samdom.example.com, then the realm would be
SAMDOM.EXAMPLE.COM and the workgroup could be SAMDOM (though it doesn't
have to be; it can be anything, preferably one short word without dots).

There is a further complication in that you are using 'DOMAIN.COM' on the
'idmap config' lines. This should be whatever the 'workgroup' is set to,
and if that is what the 'workgroup' is set to, then I refer you back to my
comment about the dots in a workgroup name.

I would also remove the 'hostname lookups = yes' line.

Rowland
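The naming convention described in this reply can be sketched as a small
helper that derives a realm and a conventional workgroup from a DNS domain,
plus a check for the three points raised (identical names, dots in the
workgroup, and an 'idmap config' domain that differs from the workgroup).
The function names are hypothetical and the checks only encode the advice
above; this is not a Samba tool.

```python
from typing import List, Tuple


def suggest_names(dns_domain: str) -> Tuple[str, str]:
    """Return (realm, workgroup) following the convention described above."""
    realm = dns_domain.strip(".").upper()
    workgroup = realm.split(".")[0]  # left-hand portion of the DNS domain
    return realm, workgroup


def check_names(workgroup: str, realm: str, idmap_domain: str) -> List[str]:
    """List the points from the advice above that a configuration trips over."""
    warnings = []
    if workgroup == realm:
        warnings.append("workgroup and realm are identical")
    if "." in workgroup:
        warnings.append("workgroup contains dots")
    if idmap_domain != workgroup:
        warnings.append("idmap config domain differs from workgroup")
    return warnings


if __name__ == "__main__":
    print(suggest_names("samdom.example.com"))
    # Values as posted (sanitised) in the smb.conf above.
    for warning in check_names("DOMAIN", "DOMAIN", "DOMAIN.COM"):
        print("warning:", warning)
```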
Luke Barone
2023-Feb-08 12:48 UTC
hide unreadable = yes

I was under the impression this was a very time consuming option in domains.
Can you try setting this to no and trying again?
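To illustrate why the 'hide unreadable' suggestion above is plausible: an
option that hides unreadable files presumably has to evaluate access for
every directory entry, where a plain listing only reads names. The sketch
below is a local analogy using os.access(), not Samba's actual code path,
and the function names and the 10,000-file test directory are
illustrative only.

```python
import os
import tempfile
import time
from typing import Callable, List


def list_all(directory: str) -> List[str]:
    """List every entry name without any per-entry permission check."""
    with os.scandir(directory) as it:
        return [entry.name for entry in it]


def list_readable(directory: str) -> List[str]:
    """List only entries the current user can read (one check per entry)."""
    with os.scandir(directory) as it:
        return [e.name for e in it if os.access(e.path, os.R_OK)]


def timed(func: Callable[[str], List[str]], directory: str) -> float:
    """Return the seconds taken by one call of `func` on `directory`."""
    start = time.perf_counter()
    func(directory)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(10_000):
            with open(os.path.join(tmp, f"file_{i:05d}"), "w"):
                pass
        print(f"names only:       {timed(list_all, tmp):.3f} s")
        print(f"with access test: {timed(list_readable, tmp):.3f} s")
```

On a domain-joined server each such check may also involve ACL and
identity lookups, so the real difference could be larger than this local
comparison suggests.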