CaptainTrips28
2023-Feb-08 01:07 UTC
[Samba] long delays with file enumeration & listing in large data storage environment
After joining a Samba server to the domain using either Winbind or ADBridge (PBIS), enumerating files from Windows File Explorer for that share takes roughly nine (9) seconds for a folder containing 30k files, a mix of large files (up to 15 GB) and small files (as small as a few KB). Initially, before setting logging to "0" and disabling smb max server protocol in smb.conf (which now defaults to SMB3_11), listing these files would take up to 45 seconds. So we have already drastically cut down this time with smb.conf option corrections.

While ~9 seconds may seem trivial for enumerating 30k files, this is only a test scenario with randomly generated junk files. In our production scenario, folders frequently contain upwards of 100k to 250k files, and we have requirements for folders that may contain up to 1 million files of various sizes, with filenames in mixed upper and lower case. Assuming an average of around 200k files in production, the enumeration time would be around 1 min 10 s to list all files. This enumeration delay is causing issues for our end users.

Initially, prior to joining the Samba server to the domain, using just a local SMB-enabled account, enumeration of this 30k-file share from Windows File Explorer was instantaneous (a second or less). However, in production, Windows AD users access the shares, and the share folders have AD owners and groups assigned, with permissions based on their level of access.

We have tried numerous config options and combinations of options enabled/disabled, and tried local storage mountpoints on the Samba server vs. remote mountpoints via NFS and SSHFS. We've tried several hardware types: a physical HPE Gen11 DL380 system acting as the Samba server, a virtual Samba server within vSphere, and moving hardware around on network appliances to reduce hops and improve throughput. IPv6 has been disabled, all required ports are open, and there is no antivirus scan interference. On the Windows side, SMB client settings have been tested with various options, NetBIOS is confirmed disabled, etc. We've also tried limiting the Samba server to one DC and ignoring all irrelevant domains. No matter the test scenario, once domain joined, the absolute lowest enumeration time we can achieve is the nine (9) seconds until the file listing is complete, and that outcome is pretty universal across each of the above testing scenarios.

- Tried both RHEL 8.4 and 8.6 (FIPS disabled, SELinux permissive and firewalld off)

- Samba versions 4.13.3 (RHEL 8.4) and 4.15.5 (RHEL 8.6) have both been tried with identical outcome (all pulled from RH Satellite repos)

- We've tried with STIGs both applied and unapplied, as well as folder/file encryption both on and off (no difference in performance).

- PBIS/ADBridge was tried with both versions 22.2.x and 22.3.x (latest)

- Testing share access from Windows 2019 Server Datacenter (Build 17763). Our user VDA sessions would also be accessing the shares from the same.

- Production environment is enterprise scale; hundreds of users/thousands of folders/millions of files

- Share enumeration speed is the same whether accessing by mapped drive, file explorer // access, or symbolically linking to the share

Any suggestions/recommendations on how to reduce or eliminate enumeration and listing times for these shares are certainly appreciated.

-------------------------------
CURRENT SMB.CONF (I've had to replace the domain/share/host details for privacy):
-------------------------------

[global]
security = ADS
workgroup = DOMAIN
realm = DOMAIN
machine password timeout = 0
idmap config * : backend = tdb
idmap config * : range = 3000-7999
idmap config DOMAIN.COM:backend = ad
idmap config DOMAIN.COM:schema_mode = rfc2307
idmap config DOMAIN.COM:range = 1617000000-1617999999
idmap config DOMAIN.COM:unix_nss_info = yes
map acl inherit = yes
store dos attributes = yes
log level = 0
dns proxy = yes
hostname lookups = yes
kerberos method = system keytab
log file = /var/log/samba/log.%m
smb encrypt = yes
server signing = auto
client signing = auto
vfs objects = acl_xattr
nt acl support = yes
netbios name = testhost01

[OrgName-OrgGroup-OrgSubgroup-TEST]
path = /mountpoint/Development/OrgName/OrgGroup/OrgSubgroup/TEST
writable = yes
browsable = yes
read only = no
create mask = 0770
force create mode = 0770
directory mask = 0775
hide unreadable = yes
# force group = OrgGroupTest at domain.com
# valid users = +domain.com\"OrgGroupTest"

---------------------------------------------------------------------------
OTHER SMB.CONF OPTIONS TRIED IN VARIATION:
---------------------------------------------------------------------------

*(not currently applied as these options either made no difference or decreased performance/enumeration):

(global options we've tried various combinations of. We are aware these options are not recommended by modern kernel standards, but tried them regardless)

# username map = /etc/samba/user.map
# server multi channel support = yes
# aio read size = 16384
# aio write size = 16384
# aio max threads = 100
# allocation roundup size = 1048576
# interfaces = "10.100.99.103;speed=10000000000,capability=RSS"
# winbind max domain connections = 10
# winbind expand groups = 1
# socket options = SO_RCVBUF=131072 SO_SNDBUF=131072 TCP_NODELAY
# min receivefile size = 16384
# use sendfile = true
# aio read size = 16384
# aio write size = 16384
# aio write behind = true

(share options we've tried with no performance improvement even when testing lowercase files only)

# case sensitive = true
# default case = lower
# preserve case = no
# short preserve case = no
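One way to narrow down where the nine seconds go is to time the same listing locally on the Samba host, bypassing SMB and identity mapping entirely. The following is a minimal Python sketch under stated assumptions: `time_listing` is a hypothetical helper name, the path passed to it is whatever directory backs the share, and a per-entry `stat()` is only a rough stand-in for the metadata a client listing asks for.

```python
import os
import time
from typing import Tuple


def time_listing(path: str) -> Tuple[int, float]:
    """Enumerate `path` once, stat every entry, and return
    (entry_count, elapsed_seconds). Hypothetical helper for illustration."""
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # Do not follow symlinks; we only want per-entry metadata cost.
            entry.stat(follow_symlinks=False)
            count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed


if __name__ == "__main__":
    import sys

    # Usage: python3 time_listing.py /path/backing/the/share
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    n, secs = time_listing(target)
    print(f"{n} entries in {secs:.3f} s")
```

If the local run finishes in well under a second while the Windows client still needs several seconds, the remaining time is presumably spent in smbd, in per-file permission or ID-mapping work, or on the wire rather than in the filesystem. Page-cache effects can make a second run much faster than the first, so the numbers are only indicative.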
Jeremy Allison
2023-Feb-08 01:42 UTC
[Samba] long delays with file enumeration & listing in large data storage environment
On Tue, Feb 07, 2023 at 08:07:46PM -0500, CaptainTrips28 via samba wrote:
>
> - Samba versions 4.13.3 (rhel 8.4) and 4.15.5 (rhel 8.6) have both been
>   tried with identical outcome, all pulled from RH satellite repos)
>
> - We've tried with stigs both applied and unapplied, as well as folder/file
>   encryption both on and off (no difference in performance).
>
> - PBIS/ADBridge was tried with both versions 22.2.x and 22.3.x (latest)
>
> - Testing share access from Windows 2019 Server Datacenter (Build 17763).
>   Our user VDA sessions would also be accessing the shares from the same.
>
> - Production environment is enterprise scale; hundreds of users/thousands
>   of folders/millions of files
>
> - Share enumeration speed is the same if accessing by either mapped drive,
>   file explorer // access, or symbolically linking to the share
>
> Any suggestions/recommendations on how to reduce or eliminate enumeration
> and listing times for these shares is certainly appreciated.

You may be running into this (from the Samba 4.17.x release notes).

NEW FEATURES/CHANGES
====================

SMB Server performance improvements
-----------------------------------

The security improvements in recent releases (4.13, 4.14, 4.15, 4.16), mainly as protection against symlink races, caused performance regressions for metadata-heavy workloads.

With 4.17 the situation improved a lot again:

- Pathnames given by a client are divided into dirname and basename. The number of syscalls to validate dirnames is reduced to 2 syscalls (openat, close) per component. On modern Linux kernels (>= 5.6) smbd makes use of the openat2() syscall with RESOLVE_NO_SYMLINKS, in order to use just 2 syscalls (openat2, close) for the whole dirname.

- Contended path-based operations used to generate a lot of unsolicited wakeup events, causing thundering herd problems, which led to massive latencies for some clients. These events are now avoided in order to provide stable latencies and much higher throughput of open/close operations.

I would suggest you try with Samba 4.17.latest to see if you find an improvement.
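The openat2() fast path in the release notes quoted above applies only on Linux kernels 5.6 or later, while RHEL 8 kernels report a 4.18.x version. A minimal Python sketch for checking the reported version follows; `kernel_at_least` is a hypothetical helper name, and it assumes the release string starts with a `major.minor` prefix. Distribution kernels sometimes backport features, so a version comparison is only a rough indicator.

```python
import platform
import re


def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Return True if a kernel release string such as
    '4.18.0-372.el8.x86_64' is at least major.minor.
    Hypothetical helper for illustration only."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    found = (int(match.group(1)), int(match.group(2)))
    return found >= (major, minor)


if __name__ == "__main__":
    release = platform.release()
    # 5.6 is the threshold named in the Samba 4.17 release notes.
    print(release, "meets the 5.6 threshold:", kernel_at_least(release, 5, 6))
```

On a kernel below the threshold, the release notes still describe a reduction to two syscalls per path component, so trying 4.17 may remain worthwhile either way.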
Rowland Penny
2023-Feb-08 08:48 UTC
[Samba] long delays with file enumeration & listing in large data storage environment
On 08/02/2023 01:07, CaptainTrips28 via samba wrote:
>
> -------------------------------
> CURRENT SMB.CONF (I've had to replace the domain/share/host details for
> privacy):
> -------------------------------
>
> [global]
> security = ADS
> workgroup = DOMAIN
> realm = DOMAIN
> machine password timeout = 0
> idmap config * : backend = tdb
> idmap config * : range = 3000-7999
> idmap config DOMAIN.COM:backend = ad
> idmap config DOMAIN.COM:schema_mode = rfc2307
> idmap config DOMAIN.COM:range = 1617000000-1617999999
> idmap config DOMAIN.COM:unix_nss_info = yes
>

Can we rule out what may just be typos/bad sanitising?

Your workgroup and realm are both set to 'DOMAIN'. They shouldn't be the same; they can be, but it wouldn't be best practice. The realm should be the DNS domain in uppercase, and the workgroup (or, to give it its other name, 'NetBIOS domain name') is usually the lefthand portion of the DNS domain, again in uppercase. For example, if the DNS domain is samdom.example.com, then the realm would be SAMDOM.EXAMPLE.COM and the workgroup could be SAMDOM (though it doesn't have to be, it can be anything, preferably one short word without dots).

There is a further complication in that you are using 'DOMAIN.COM' on the 'idmap config' lines. This should be whatever the 'workgroup' is set to, and if that is what the 'workgroup' is set to, then I refer you back to my comment about the dots in a workgroup name.

I would also remove the 'hostname lookups = yes' line.

Rowland
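The naming convention described above can be sketched in a few lines of Python. This is an illustration only: `derive_names` and `global_lines` are hypothetical helper names, the workgroup is taken from the lefthand DNS label purely because that is the usual choice mentioned above, and a real domain's NetBIOS name may differ and should be checked against AD.

```python
from typing import List, Tuple


def derive_names(dns_domain: str) -> Tuple[str, str]:
    """Return (realm, workgroup) following the usual convention:
    realm = DNS domain in uppercase, workgroup = lefthand label in uppercase.
    Hypothetical helper; the actual NetBIOS domain name may be different."""
    realm = dns_domain.strip().rstrip(".").upper()
    workgroup = realm.split(".")[0]
    return realm, workgroup


def global_lines(dns_domain: str) -> List[str]:
    """Render the smb.conf lines that depend on these two names.
    The 'idmap config' line is keyed on the workgroup, not the DNS domain."""
    realm, workgroup = derive_names(dns_domain)
    return [
        f"workgroup = {workgroup}",
        f"realm = {realm}",
        f"idmap config {workgroup} : backend = ad",
    ]


if __name__ == "__main__":
    for line in global_lines("samdom.example.com"):
        print(line)
```

For the example domain samdom.example.com this gives a realm of SAMDOM.EXAMPLE.COM and a workgroup of SAMDOM, with the idmap line keyed on SAMDOM rather than on a dotted name.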
Luke Barone
2023-Feb-08 12:48 UTC
[Samba] long delays with file enumeration & listing in large data storage environment
hide unreadable = yes

I was under the impression this was a very time consuming option in domains. Can you try setting this to no and trying again?
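Two settings from the posted smb.conf have been questioned in this thread: 'hide unreadable = yes' and 'hostname lookups = yes'. A minimal Python sketch for spotting them in a config file is shown here. `find_flagged` and the `FLAGGED` set are hypothetical names, the parser is a simplified line scanner rather than Samba's own, and the set of accepted boolean spellings is an assumption; Samba's `testparm` remains the authoritative way to inspect the effective configuration.

```python
from typing import List, Optional, Tuple

# Options questioned in this thread; this list is an assumption for the
# sketch, not an official list of slow settings.
FLAGGED = {"hide unreadable", "hostname lookups"}
TRUE_VALUES = {"yes", "true", "1", "on"}


def find_flagged(conf_text: str) -> List[Tuple[Optional[str], str]]:
    """Return (section, option) pairs where a flagged option is enabled.
    Simplified scanner: skips comments, tracks [section] headers, and
    splits each remaining line on the first '='."""
    hits: List[Tuple[Optional[str], str]] = []
    section: Optional[str] = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'name = value' line
        key = " ".join(key.lower().split())  # normalise case and spacing
        if key in FLAGGED and value.strip().lower() in TRUE_VALUES:
            hits.append((section, key))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python3 find_flagged.py /etc/samba/smb.conf
    with open(sys.argv[1], encoding="utf-8") as fh:
        for sect, opt in find_flagged(fh.read()):
            print(f"[{sect}] {opt} is enabled")
```

Turning one flagged option off at a time and re-timing the listing keeps the effect of each change separable.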