Cyril Servant
2020-May-18 08:14 UTC
Parallel transfers with sftp (call for testing / advice)
Hi Peter, and thank you for your advice, it's really appreciated.

> * A new thread queue infrastructure
>
> Please forget about using this pattern in the OpenSSH context. Others have
> mentioned that threading requires much more care, and while that is true I
> think a simple thread queue is doable, although I think there's an
> issue or two still overlooked in the proposed implementation (specifically,
> threads combined with child processes especially requires carefully dealing
> with signals in all threads, which the change doesn't seem to do, and the
> thread_loop() logic is not clear - the function will never return).

The thread exits in thread_real_loop(), but yeah, it may not be clear enough. Anyway, if we have to avoid using threads, we'll have to rework all of this.

> * Established behavior is changed, an error condition arbitrarily introduced
>
> The proposed change refuses file transfer if the local and remote files
> have different sizes. This has never been a requirement before, and should
> certainly not become one. It's also not necessary for what you want to do.
> If a user says to transfer a file then the behavior always was and should
> continue to be that whatever was there before on the recipient side is simply
> overwritten.

No, this error is only raised when a thread is about to write its part of the file, just after the main thread has created the sparse file. If the file doesn't have the expected size, it means something went wrong during the sparse file creation.

> * Don't add server workarounds to the client
>
> [...]
>
> * Ad-hoc name resolution and random server address choice

Those two features were added with our specific HPC clusters in mind. I think we can simply remove them and just focus on point-to-point operation.
For information, the unpatched sftp never resolves hostnames; it just lets the underlying ssh process do it.

> Another approach would be for the recipient to initially create the full
> size file before any data transfer, but that comes at the cost of a probably
> quite significant delay. (Maybe not though? Can you test?)

Well, without changing anything server-side, the only solution is to write something at the end of the file. In our tests this creates a sparse file, but indeed, it must be portable, and we have to test it on multiple platforms.

Once again, thanks for your advice. I've only answered a few points here, but as you said, portability is the main concern; then, if an alternative to threads has to be chosen, we'll have to think about it.

-- Cyril
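To make the idea above concrete, here is a minimal local sketch of "write something at the end of the file" followed by the per-chunk size check. It is not the code from the patch (which is C and goes through the SFTP protocol); the names preallocate() and write_chunk() are hypothetical, and it assumes a POSIX system where os.pwrite is available. Whether the resulting file is actually sparse depends on the filesystem, which is exactly the portability question raised above.

```python
import os
import tempfile


def preallocate(path: str, size: int) -> None:
    """Create `path` with length `size` by writing a single byte at the end."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        if size > 0:
            # Writing at offset size-1 extends the file to `size` bytes.
            # Whether the skipped range is stored as a hole is up to the
            # filesystem, so sparseness is not guaranteed everywhere.
            os.pwrite(fd, b"\0", size - 1)
    finally:
        os.close(fd)


def write_chunk(path: str, offset: int, data: bytes, expected_size: int) -> None:
    """Write one chunk at `offset`, refusing if the file has an unexpected size."""
    fd = os.open(path, os.O_WRONLY)
    try:
        actual = os.fstat(fd).st_size
        if actual != expected_size:
            # Mirrors the check described above: a wrong size here means
            # the preallocation step did not do what we expected.
            raise OSError(f"{path}: size is {actual}, expected {expected_size}")
        view = memoryview(data)
        while view:
            # pwrite may write fewer bytes than requested, so loop.
            written = os.pwrite(fd, view, offset)
            view = view[written:]
            offset += written
    finally:
        os.close(fd)


if __name__ == "__main__":
    size = 1024 * 1024
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "demo.bin")
        preallocate(target, size)
        write_chunk(target, 0, b"head", size)
        write_chunk(target, size - 4, b"tail", size)
        st = os.stat(target)
        # st_blocks is counted in 512-byte units on POSIX systems.
        print("size:", st.st_size, "allocated bytes:", st.st_blocks * 512)
```

Comparing the reported size with the allocated bytes on each target platform is one cheap way to see whether the file really ends up sparse there.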
David Newall
2020-May-18 12:45 UTC
Parallel transfers with sftp (call for testing / advice)
Hi Cyril,

I've lost track of who the original poster is, but you seem to be at least involved.

Per my previous email, I'm still curious to know what performance you get (using an unpatched sftp) when you use "-R 640".

Regards,
David
Cyril Servant
2020-May-18 13:38 UTC
Parallel transfers with sftp (call for testing / advice)
Hi David,

> I've lost track of who the original poster is, but you seem to be at least involved.
>
> Per my previous email, I'm still curious to know what performance you get (using an unpatched sftp) when you use "-R 640".

With the "-R" option, you're still bound to the power of a single CPU core for encryption / decryption. The proposed patch creates a new ssh connection per thread, so if you have at least 2 cores (on both the client and the server), you will be able to use up to twice the bandwidth with "-n 2".

I just ran some tests to be sure, and the "-R 640" option slows transfers down (a bit). I guess that's because the transfer speed is limited by CPU power and not by network bandwidth.

-- Cyril
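The reasoning above can be written down as a rough back-of-the-envelope model. This is only a sketch under simplifying assumptions: every connection keeps one core busy on each side, all cores encrypt at the same rate, and there is no other overhead. The function name and all numbers are hypothetical and are not measurements from this thread.

```python
def estimated_throughput(connections: int, per_core_rate: float,
                         link_rate: float, client_cores: int,
                         server_cores: int) -> float:
    """Rough upper bound on transfer rate, in the same unit as the inputs."""
    if connections < 1:
        raise ValueError("need at least one connection")
    # Each ssh connection is assumed to saturate one core on each side, so
    # extra connections only help while both sides still have idle cores.
    busy_cores = min(connections, client_cores, server_cores)
    # The network link caps whatever the CPUs could otherwise deliver.
    return min(busy_cores * per_core_rate, link_rate)


if __name__ == "__main__":
    # Hypothetical figures: 200 MB/s of encryption per core, a 1250 MB/s link.
    for n in (1, 2, 4, 8):
        rate = estimated_throughput(n, 200.0, 1250.0,
                                    client_cores=4, server_cores=4)
        print(f"connections={n}: at most {rate:.0f} MB/s")
```

In this model, a larger "-R" value does not change the result: it increases the number of outstanding requests on a single connection and does not add cores. That is consistent with "-R 640" not helping when the CPU is the limit.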