awl1
2017-Aug-18 21:57 UTC
[Samba] Friendly Reminder: Would you please comment on my findings?
Ah, ok, "directory handle leases"... Ouch, I see... :-(

In this case, I will first repeat my test scenario with a Windows SMB2 server and report back here.

Based on the results of this exercise, you can then advise whether you still want to move this to samba-technical and raise this with Microsoft folks (who still might have a simple workaround "fix" to improve their SMB2 client performance with a Samba server that does not [yet] implement directory handle leases), or whether the only way to fix this (in the long run; I am perfectly aware that this is a hugely complex task) will be implementing those infamous "directory handle leases" in Samba... ;-)

Many thanks so far for the progress we made today - I'll call it a day now (close to midnight over here), and get back once I have the Wireshark recordings of the Windows SMB2 share server...

Best regards
Andreas

On 18.08.2017 at 23:42, Jeremy Allison wrote:
> On Fri, Aug 18, 2017 at 11:35:04PM +0200, awl1 wrote:
>> On 18.08.2017 at 23:17, Jeremy Allison via samba wrote:
>>> This might be hidden against Windows due to directory handle
>>> leases, which we don't yet support.
>> Are you saying that when I replace the Samba server by a Windows
>> SMB2 share server, I should see better performance? I can perfectly
>> test that out and record a Wireshark trace for this if you like...
> Yes, that would be my guess.
>
>> Layperson question: Would such a "directory handle lease" be
>> something like a cache for SMB2 Find responses?
>>
>> This would have to be a client-side cache in order to avoid sending
>> back all 1000 file names in the directory from server to client over
>> the network with every single response.
>>
>> Also, in case it indeed were a client-side cache, this cache would
>> also need to be silently concurrently updated with every Create
>> request/response cycle, because the number of files in the
>> server-side directory always grows by the one file just written
>> between one call to SMB2_FIND_ID_BOTH_DIRECTORY_INFO and the next...
> That's exactly what directory handle leases are. They're
> oplocks for directories.
Andrew Bartlett
2017-Aug-19 08:53 UTC
[Samba] Friendly Reminder: Would you please comment on my findings?
On Fri, 2017-08-18 at 23:57 +0200, awl1 wrote:
> Ah, ok, "directory handle leases"... Ouch, I see... :-(
>
> In this case, I will first repeat my test scenario with a Windows SMB2
> server and report back here.
>
> Based on the results of this exercise, you can then advise whether you
> still want to move this to samba-technical and raise this with Microsoft
> folks (who still might have a simple workaround "fix" to improve their
> SMB2 client performance with a Samba server that does not [yet]
> implement directory handle leases), or whether the only way to fix this
> (on the long run, I am perfectly aware that this is a hugely complex
> task) will be implementing those infamous "directory handle leases" in
> Samba... ;-)

I think it is time this thread moved to samba-technical, with those findings and a good summary. Not all developers follow this list closely, and this is clearly becoming a development question.

> Many thanks so far for the progress we made today - I'll call it a day
> for today now (close to midnight over here), and get back once I have
> the Wireshark recordings of the Windows SMB2 share server...

I know you really wish 'we' could take this over from here. The challenge is that the things you hope we could do are the most time-consuming bits, which is why this has been getting awkward.

Sorry,

Andrew Bartlett

--
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba
awl1
2017-Aug-24 13:37 UTC
[Samba] Windows SMB2 client doing excessive, inefficient SMB2 Find (and other) requests
Hello Jeremy, Andrew, All,

I can now indeed confirm that my test scenario runs MUCH faster when executed against a Windows 10 SMB2 server than against Samba (the test scenario takes ~ 18 seconds using a Windows SMB2 server as opposed to ~ 300 seconds when the server is Samba 4.6.5).

But, most interestingly, the reason for these performance gains is NOT - as assumed by Jeremy - that the number of client-server request/response cycles over the network goes down. The Wireshark pcapng file is even slightly larger, so it is NOT client-side caching on the Windows SMB2 client that does the trick. Instead, the Windows SMB2 server "only" responds much faster to the infamous (and unnecessary, poorly performing) SMB2_FIND_ID_BOTH_DIRECTORY_INFO requests with pattern "*", which probably means that the Windows SMB2 server does some caching here.

In order to write the ~ 1000 files in my old test scenario, the total packet capture size for an SMB1 client against either a Samba or a Windows SMB1 server is about 10 MB, while the total packet capture size for an SMB2 client against an SMB2 server (regardless of whether it is Samba or Windows!) is about 32 MB, which clearly points to the inefficient use of Find requests with the "*" pattern. (Unfortunately, the Windows SMB2 server still handles this 32 MB exchange in 18 seconds, while the Samba 4.6.5 server takes ~ 300 seconds.)

Nevertheless, comparing this with SMB1 behaviour (where the number of Find requests was equal to the number of files to be written, and none of the Find requests used a "*" pattern), this means that the whole scenario can (and probably should) be optimized by Microsoft for both cases as a low-hanging-fruit performance optimisation. Also, the way a Linux SMB2 client behaves (mount.cifs vers=2.x or 3.x) shows that it might even be possible to complete the write scenario without any Find requests (not sure about Windows' case-insensitive file system, though...).

Of course, in addition, in the long run, the Samba server should also be improved to behave similarly to the Windows SMB2 server and cache the results of Find calls (probably this means implementing the infamous "directory handle leases"), but IMO that's a second step...

Before I follow your advice to move this whole issue/topic to the samba-technical alias and start over there with a clean description of the scenario and the findings, I have two simple questions:

1) I plan to use a new, reproducible test scenario with 2000 random small files with a file length between 1 and 2048 bytes, created along the lines of the following:

    for i in $(seq -f "%04g" 1 2000) ; do
      length=`shuf -i 1-2048 -n 1`
      head -c $length < /dev/urandom > file${i}.rnd
    done

in order to make the test data non-confidential (unfortunately, my previous test files/packet traces were confidential). Do you agree that the above procedure is fine to create the test scenario?

2) What about confidential data from SMB/SMB2 sessions (i.e. Samba usernames/passwords)? What do I need to do to filter all information from Wireshark traces that points to users and passwords?

More specifically, would filtering all SMB2 "SessionSetup" request and response packets from Wireshark traces be sufficient to do so? What about machine names/IP addresses from my LAN? Are any other such IP addresses/machine names contained in any packets other than (obviously) those of my particular SMB2 client and Samba server (and the Windows SMB2 server used for comparison)?

Many thanks one more time for your feedback and replies to questions 1) and 2) above...

Best regards
Andreas
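
The test-data procedure from question 1) can be wrapped in a small function so that the file count and size range are explicit. This is only a sketch: the function name make_random_files and its parameters are assumptions made for illustration, and it relies on GNU seq, shuf and head just as the original loop does.

```shell
#!/bin/sh
# Sketch only: make_random_files is a hypothetical helper, not part of Samba.
# Creates COUNT files named fileNNNN.rnd in DIR, each filled with a random
# number of bytes (between 1 and MAXLEN) read from /dev/urandom.
make_random_files() {
    dir=$1
    count=${2:-2000}    # default matches the 2000 files proposed above
    maxlen=${3:-2048}   # default matches the 1-2048 byte range proposed above
    mkdir -p "$dir" || return 1
    for i in $(seq -f "%04g" 1 "$count"); do
        length=$(shuf -i "1-$maxlen" -n 1)
        head -c "$length" < /dev/urandom > "$dir/file${i}.rnd"
    done
}
```

Calling `make_random_files ./testdata` would produce the 2000-file set described above; the resulting directory can then be copied to the share from the SMB2 client while the capture is running.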
awl1
2017-Aug-31 11:49 UTC
[Samba] 2nd try: Windows SMB2 client doing excessive, inefficient SMB2 Find (and other) requests
Hi again, Jeremy, Andrew, All,

Sorry for being such a nuisance to you, but as I haven't received any replies so far, I would like to repeat, one more time, my request for help sent out last week (24 Aug).

This time, I was "only" asking you to take notice of my latest finding regarding my test case run against a Windows SMB2 server, and to answer two short and simple questions so that I can work with confidence on a reproducible, non-confidential test case and then address the issue on the samba-technical list, as you suggested before.

I truly hope that it will be easier to receive feedback and get into discussion mode on samba-technical. Let me restate that I am willing to help with everything in my power (test case setup, Wireshark testing, statistics, ...). The only thing I won't be able to do is fix the issue myself, as the main issue IMO still is a Microsoft issue in their SMB2 client (so we need the help of the Microsoft folks on samba-technical). Also, I clearly won't be able to implement "directory handle leases" in Samba by myself...

And BTW: In the unlikely case that the Samba team is not at all interested in addressing/fixing this detrimental SMB2 performance issue for a huge number of small files (neither in the Microsoft SMB2 client nor in Samba itself), please just let me know this as well - I will then stop bugging you any further...

Many thanks for your help and understanding! :-)

Best regards,
Andreas
Ralph Böhme
2017-Aug-31 13:43 UTC
[Samba] Windows SMB2 client doing excessive, inefficient SMB2 Find (and other) requests
Andreas,

On Thu, Aug 24, 2017 at 03:37:07PM +0200, awl1 via samba wrote:
> Before I follow your advice to move this whole issue/topic to the
> samba-technical alias and start over there with a clean description of the
> scenario and the findings, I have two simple questions:
>
> 1) I plan to use a new, reproducible test scenario with 2000 random small
> files with a file length between 1 and 2048 bytes, created along the lines
> of the following:
>
> for i in $(seq -f "%04g" 1 2000) ; do
>   length=`shuf -i 1-2048 -n 1`
>   head -c $length < /dev/urandom > file${i}.rnd
> done

This is overly complicated for this purpose, I guess; a simple touch file$i should do it.

> in order to make test data non-confidential (unfortunately, my previous test
> files/packet traces were confidential). Do you agree that the above
> procedure is fine to create the test scenario?
>
> 2) What about confidential data from SMB/SMB2 sessions (i.e. Samba
> usernames/passwords)? What do I need to do to filter all information from
> Wireshark traces that points to users and passwords?

Passwords are not sent over the wire; the user name is, as are IP addresses. Use a test user and set up VMs in a private test network if you care.

> More specifically, would filtering all SMB2 "SessionSetup" request and
> response packets from Wireshark traces be sufficient to do so? What about
> machine names/IP addresses from my LAN? Any other such IP addresses/machine
> names contained in any packets other than (obviously) those of my particular
> SMB2 client and Samba server (and Windows SMB2 server to compare)?

For the trace, it's okay if it starts with the reproducer, without the session setup.

If you can send a brief description of the issue alongside the traces to samba-technical, I can try to get this to the right people.

Cheerio!
-slow
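
The two suggestions in this reply (a plain touch reproducer, and a trace that leaves out the session setup) might look like the following sketch. Both function names and all file names are hypothetical. The tshark display filter smb2.cmd == 1 matches SMB2 Session Setup frames; whether dropping them removes every user or machine name from a capture is not established in this thread, so the filtered file should still be inspected before it is posted.

```shell
#!/bin/sh
# Sketch only: both helpers below are hypothetical names for illustration.

# Create COUNT empty files named file1..fileCOUNT in DIR, following the
# "touch file$i" suggestion.
make_empty_files() {
    dir=$1
    count=${2:-2000}
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        touch "$dir/file$i"
        i=$((i + 1))
    done
}

# Copy capture file $1 to $2 while leaving out frames that carry an SMB2
# Session Setup request or response (SMB2 command code 1).
# Requires tshark from the Wireshark distribution.
strip_session_setup() {
    tshark -r "$1" -Y 'not (smb2.cmd == 1)' -w "$2"
}
```

Starting the capture only after the share is already connected avoids the need for filtering at all, which is closer to what the reply describes.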