Cyril Servant
2020-May-05 10:55 UTC
Parallel transfers with sftp (call for testing / advice)
Peter Stuge wrote:
> Matthieu Hautreux wrote:
>> The change proposed by Cyril in sftp is a very pragmatic approach to
>> deal with parallelism at the file transfer level. It leverages the
>> already existing sftp protocol and its capability to write/read file
>> content at specified offsets. This enables to speed up sftp transfers
>> significantly by parallelizing the SSH channels used for large
>> transfers. This improvement is performed only by modifying the sftp
>> client, which is a very small modification compared to the openssh
>> codebase. The modification is not too complicated to review and validate
>> (I did it) and does not change the default behavior of the cli.
>
> I think you make a compelling argument. I admit that I haven't
> reviewed the patch, even though that is what matters the most.

If you want to review the code, here is a direct link to the patch:
https://github.com/openssh/openssh-portable/compare/V_8_2_P1...cea-hpc:V_8_2_P1_PSFTP1

> I guess that no one really minds ways to make SFTP scale, but ever since
> the patch was proposed I have been thinking that the parallel channel
> approach is likely to introduce a whole load of not very clean error
> conditions regarding reassembly, which need to be handled sensibly both
> within the sftp client and on the interface to outside/calling processes.
> Can you or Cyril say something about this?

Indeed, reassembly must be handled properly. Depending on the filesystem
you're writing to, the block size can vary, so having a "power of 2 bytes"
chunk is the starting point. Performance-wise, our storage experts asked us
to use chunks of 2MB or more. I chose the arbitrary value of 64MB because I
don't see any advantage in using smaller chunks: after all, we're talking
about using parallel transfers on high bandwidth links. Of course this value
can be discussed and easily changed (it's the base_chunk_size variable). To
be more precise, a chunk size is a multiple of base_chunk_size, so a 4GB file
transferred with 4 channels will be cut into four 1GB chunks.

The main source of errors during reassembly is if copy_buffer_len (the -B
option) is set to a value that is not a power of 2. This will lead to writes
sitting partially on 2 blocks, and will probably corrupt the file.
Simultaneously writing the start and the end of a block from 2 different NFS
clients is a really bad idea. That's why I issue a warning if -n > 0 and -B
is not a power of 2.

Concerning error handling within threads: if a thread encounters a blocking
error, the other threads will finish their current chunk copy, then stop
doing anything.

> And another thought - if the proposed patch and/or method indeed will not
> go anywhere, would it still be helpful for you if the sftp client would
> only expose the file offset functionality? That way, the complexity of
> reassembly and the associated error handling doesn't enter into OpenSSH.

There is no server side change in this patch, so I don't think we can talk
about complexity of reassembly. Once base_chunk_size is set, and a warning or
an error is raised if copy_buffer_len is not a power of 2, there is nothing
more to it than during a reput or a reget. Of course, exposing the file
offset functionality would help with creating new software on top of the sftp
client, but at the cost of simplicity. And if you do this, in the case of
"get", the new software will need to send a lot of "ls" commands in order to
discover the directory structure and the file sizes needed to correctly split
transfers. I feel like it's reinventing the wheel.

-- Cyril
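To make the chunking rules above concrete, here is a rough Python sketch of
the logic Cyril describes (base_chunk_size mirrors the patch's variable; the
helper names are hypothetical, and the patch itself is C):

    # Illustrative only: mirrors the chunking rules described above,
    # not the actual C code from the patch.
    BASE_CHUNK_SIZE = 64 * 1024 * 1024  # the patch's base_chunk_size (64MB)

    def is_power_of_two(n):
        """The check behind the -B (copy_buffer_len) warning."""
        return n > 0 and (n & (n - 1)) == 0

    def chunk_size(file_size, channels, base=BASE_CHUNK_SIZE):
        """Per-channel chunk size: always a multiple of base_chunk_size."""
        per_channel = -(-file_size // channels)  # ceil division
        return -(-per_channel // base) * base    # round up to a multiple of base

    assert chunk_size(4 << 30, 4) == 1 << 30  # 4GB over 4 channels -> four 1GB chunks
    assert is_power_of_two(32768)             # the default -B value is safe
    assert not is_power_of_two(30000)         # a value like this triggers the warning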
Ron Frederick
2020-May-05 14:42 UTC
Parallel transfers with sftp (call for testing / advice)
On May 5, 2020, at 3:55 AM, Cyril Servant <cyril.servant at gmail.com> wrote:
> [...]
>
> If you want to review the code, here is a direct link to the patch:
> https://github.com/openssh/openssh-portable/compare/V_8_2_P1...cea-hpc:V_8_2_P1_PSFTP1

I haven't reviewed the patch in detail, but one thing that jumped out at me
when looking at this was your choice to use multiple channels to implement
the parallelism. The SFTP protocol already allows multiple outstanding reads
or writes on a single channel, without waiting for the server's response. I
don't know if the current OpenSSH client takes advantage of that or not, but
it seems to me that you'd get the same benefit from that as you would from
opening multiple channels, with less setup cost (not having to open the
additional channels, and not having to do separate file open calls on each
channel). It might also make some other things like the progress bar support
easier, as all of the responses would be coming back on the same channel.

You can find an example of this form of parallelism (multiple read/write
requests outstanding on a single channel) in AsyncSSH's SFTP implementation.
By default, it will allow up to 128 parallel reads or writes of 16 KB each,
and you can adjust both the block size and max outstanding requests via
'block_size' and 'max_requests' arguments in the calls to methods such as
get(), put(), and copy() in the asyncssh.SFTPClient class
(https://asyncssh.readthedocs.io/en/latest/api.html#sftpclient).

Here's an example client:

    import asyncio, asyncssh, sys

    async def run_client():
        async with asyncssh.connect('localhost') as conn:
            async with conn.start_sftp_client() as sftp:
                await sftp.get('example.txt')

    try:
        asyncio.get_event_loop().run_until_complete(run_client())
    except (OSError, asyncssh.Error) as exc:
        sys.exit('SFTP operation failed: ' + str(exc))

The block_size and max_requests can be added as optional arguments to the
sftp.get() in this example, but parallelism will automatically be used for
any file larger than 16 KB, allowing for up to 2 MB of data to be outstanding
at once (128 requests of 16 KB each), without any threading and all over a
single SFTP channel.

This parallel I/O support in AsyncSSH also extends to clients making large
read() or write() calls on a remote file. If you call read() or write() with
a length of more than the block_size set when a file is opened, these reads
or writes will automatically be broken into multiple parallel requests.
This behavior can be disabled by setting block_size to 0 if you really need
to control the exact requests sent on the channel, but by default callers
will get the benefit of the extra speed.

--
Ron Frederick
ronf at timeheart.net
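To make the tuning knobs Ron mentions explicit, his example can be extended
like this (the values are arbitrary; block_size and max_requests are the
asyncssh.SFTPClient.get() arguments he names above):

    import asyncio, sys
    import asyncssh

    async def run_client():
        async with asyncssh.connect('localhost') as conn:
            async with conn.start_sftp_client() as sftp:
                # 32 KB blocks with up to 256 requests in flight:
                # as much as 8 MB of data outstanding at once
                await sftp.get('example.txt', block_size=32768, max_requests=256)

    try:
        asyncio.run(run_client())
    except (OSError, asyncssh.Error) as exc:
        sys.exit('SFTP operation failed: ' + str(exc))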
Ben Lindstrom
2020-May-05 15:04 UTC
Parallel transfers with sftp (call for testing / advice)
OpenSSH does:

     -R num_requests
             Specify how many requests may be outstanding at any one time.
             Increasing this may slightly improve file transfer speed but will
             increase memory usage.  The default is 64 outstanding requests.

As well as block size:

     -B buffer_size
             Specify the size of the buffer that sftp uses when transferring
             files.  Larger buffers require fewer round trips at the cost of
             higher memory consumption.  The default is 32768 bytes.

I suspect that for this to be taken seriously it will more than likely need
to have threads dropped in favor of a different method, as the OpenSSH team
has rejected threads multiple times because they cause additional complexity
and security issues.

Ben

Ron Frederick wrote on 5/5/20 9:42 AM:
> [...]
Cyril Servant
2020-May-05 16:03 UTC
Parallel transfers with sftp (call for testing / advice)
Ron Frederick wrote:
> On May 5, 2020, at 8:36 AM, Cyril Servant <cyril.servant at gmail.com> wrote:
>> Ron Frederick wrote:
>>> [...]
>>
>> As Matthieu said earlier, each ssh channel speed is bound to a CPU core
>> speed. The whole point here is to use multiple ssh channels in order to
>> increase transfer speed. And as Ben just said, there is already the -R
>> option in sftp. This option indeed helps, but the channel speed is still
>> bound to a CPU core speed.
>
> Thanks. It looks like OpenSSH's SFTP "-B" and "-R" are indeed similar to
> what I described.
>
> When you say "SSH channel" here, though, do you actually mean multiple data
> streams sent over a single TCP connection, or multiple independent TCP
> connections? I haven't tried to measure this, but I would have expected
> most of the cost of OpenSSH processing to be the decryption, which happens
> at the connection level, not any processing at the channel level. So, I'm a
> bit surprised that you'd be able to get much benefit from multiple cores
> when running multiple channels on a single TCP connection.

I say SSH channel, but I should talk about SSH connection. And as these are
different TCP connections, it allows simultaneous connection to multiple
servers. Of course, this is only useful if these servers share the same
network storage (NFS, Lustre...).

-- Cyril
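As an illustration of what Cyril describes (one file split across several
independent SSH/TCP connections, reassembled locally by writing at offsets),
here is a minimal sketch using AsyncSSH; the host and paths are hypothetical,
and host-key setup and error handling are omitted:

    import asyncio, os
    import asyncssh

    HOST = 'germany0'     # hypothetical login node from the thread's examples
    REMOTE = '/files/5g'  # hypothetical remote path
    LOCAL = '5g.bis'
    CONNS = 4             # number of parallel SSH (TCP) connections

    async def fetch_chunk(fd, offset, length):
        # One SSH connection per chunk, so decryption is spread across
        # CPU cores instead of being bound to a single one.
        async with asyncssh.connect(HOST) as conn:
            async with conn.start_sftp_client() as sftp:
                async with sftp.open(REMOTE, 'rb') as f:
                    await f.seek(offset)
                    remaining = length
                    while remaining > 0:
                        data = await f.read(min(remaining, 1 << 20))
                        if not data:
                            break
                        os.pwrite(fd, data, offset)  # write at the chunk's offset
                        offset += len(data)
                        remaining -= len(data)

    async def main():
        async with asyncssh.connect(HOST) as conn:
            async with conn.start_sftp_client() as sftp:
                size = (await sftp.stat(REMOTE)).size
        chunk = -(-size // CONNS)  # ceil division: bytes per connection
        fd = os.open(LOCAL, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
        try:
            os.ftruncate(fd, size)  # sparse preallocation; offset writes reassemble
            await asyncio.gather(*(fetch_chunk(fd, i * chunk,
                                               min(chunk, size - i * chunk))
                                   for i in range(CONNS) if i * chunk < size))
        finally:
            os.close(fd)

    asyncio.run(main())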
David Newall
2020-May-06 04:21 UTC
Parallel transfers with sftp (call for testing / advice)
Did anything happen after
https://daniel.haxx.se/blog/2010/12/08/making-sftp-transfers-fast/? I suspect
it did, because we do now allow multiple outstanding packets, as well as
specifying the buffer size.

Daniel explained the process that SFTP uses quite clearly, such that I'm not
sure why re-assembly is an issue. He explained that each transfer already
specifies the offset within the file. It seems reasonable that multiple
writers would just each write to the same file at their various different
offsets. It relies on the target supporting sparse files, but supercomputers
only ever run Linux ;-) which does do the right thing.

The original patch which we are discussing seemed more concerned about being
able to connect to multiple IP addresses, rather than multiple connections
between the same pair of machines. The issue, as I understand, is that the
supercomputer has slow NICs, so adding multiple NICs allows greater network
bandwidth. This, I think, is the problem to be solved; not re-assembly, just
sending to what appear to be multiple different hosts (i.e. IP addresses).

I was curious to know why a supercomputer would have issues receiving at high
bandwidth via a single NIC, while the sending machine has no such performance
issue; but that's an aside.
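David's point about offset writes can be seen in a few lines of Python (a
sketch, assuming a Unix system with sparse-file support; the file name is
arbitrary):

    import os

    # Two writers filling different halves of the same file; the kernel
    # does the "reassembly" as long as sparse files are supported.
    fd = os.open('reassembled.bin', os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    os.pwrite(fd, b'B' * 1024, 1024)  # second half arrives first: a hole is left
    os.pwrite(fd, b'A' * 1024, 0)     # first half fills the hole later
    os.close(fd)
    assert os.stat('reassembled.bin').st_size == 2048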
Matthieu Hautreux
2020-May-08 23:25 UTC
Parallel transfers with sftp (call for testing / advice)
On 06/05/2020 06:21, David Newall wrote:
> Did anything happen after
> https://daniel.haxx.se/blog/2010/12/08/making-sftp-transfers-fast/? I
> suspect it did, because we do now allow multiple outstanding packets, as
> well as specifying the buffer size.
>
> Daniel explained the process that SFTP uses quite clearly, such that I'm
> not sure why re-assembly is an issue. He explained that each transfer
> already specifies the offset within the file. It seems reasonable that
> multiple writers would just each write to the same file at their various
> different offsets. It relies on the target supporting sparse files, but
> supercomputers only ever run Linux ;-) which does do the right thing.

You are right, reassembly is not an issue, as long as you have sparse file
support, which is our case with Linux :)

> The original patch which we are discussing seemed more concerned about
> being able to connect to multiple IP addresses, rather than multiple
> connections between the same pair of machines. The issue, as I understand,
> is that the supercomputer has slow NICs, so adding multiple NICs allows
> greater network bandwidth. This, I think, is the problem to be solved; not
> re-assembly, just sending to what appear to be multiple different hosts
> (i.e. IP addresses).

No, the primary goal of the patch is to enable doing that between two
endpoints with one NIC per endpoint, the NIC being 10GE or faster. Here is an
example with roughly the same results for a single destination/IP:

# With the patched sftp and 1+10 parallel SSH connections
[me at france openssh-portable]$ ./sftp -n 10 germany0
Connected main channel to germany0 (1.2.3.96).
Connected channel 1 to germany0 (1.2.3.96).
Connected channel 2 to germany0 (1.2.3.96).
Connected channel 3 to germany0 (1.2.3.96).
Connected channel 4 to germany0 (1.2.3.96).
Connected channel 5 to germany0 (1.2.3.96).
Connected channel 6 to germany0 (1.2.3.96).
Connected channel 7 to germany0 (1.2.3.96).
Connected channel 8 to germany0 (1.2.3.96).
Connected channel 9 to germany0 (1.2.3.96).
Connected channel 10 to germany0 (1.2.3.96).
sftp> get 5g 5g.bis
Fetching /files/5g to 5g.bis
/files/5g                    100% 5120MB 706.7MB/s   00:07
sftp> put 5g.bis
Uploading 5g.bis to /files/5g.bis
5g.bis                       100% 5120MB 664.0MB/s   00:07
sftp>

# With the legacy sftp:
[me at france openssh-portable]$ sftp germany0
sftp> get 5g 5g.bis
Fetching /files/5g to 5g.bis
/p/scratch/chpsadm/files/5g  100% 5120MB  82.8MB/s   01:01
sftp> put 5g.bis
Uploading 5g.bis to /files/5g.bis
5g.bis                       100% 5120MB  67.0MB/s   01:16
sftp>

# With scp:
[me at france openssh-portable]$ scp 5g germany0:/files/5g.bis
5g                           100% 5120MB  83.1MB/s   01:01

# With rsync:
[me at france openssh-portable]$ rsync -v 5g germany0:/files/5g.bis
5g
sent 5,370,019,908 bytes  received 35 bytes  85,920,319.09 bytes/sec
total size is 5,368,709,120  speedup is 1.00

> I was curious to know why a supercomputer would have issues receiving at
> some high bandwidth via a single NIC, while the sending machine has no
> such performance issue; but that's an aside.

Supercomputers commonly offer multiple "login nodes" and a generic DNS entry
to connect to one of them randomly: the DNS entry is associated with multiple
IP addresses and the client (DNS resolver) selects one of them. Other DNS
entries may exist to address a particular login node, in case you want to go
to a particular place.
When used with Cyril's patched sftp, this logic means that you are targeting
multiple hosts automatically if you use the generic DNS entry (the first perf
results of Cyril). If you select a particular host DNS entry (like in this
example), then you will only contact that single host.

On supercomputers, files are commonly stored on distributed file systems like
NFS, Lustre, GPFS, ... In case your transfers target one of those types of
file systems, you can use multiple hosts as destinations without any issues.
You just need to ensure that the sftp sent/written blocks are properly sized
to avoid any overwriting of some targets by others because of their file
system client implementations and the asynchronism of the various page cache
flushes on the involved nodes. That is what is done in the patch: as
explained by Cyril in a previous message, the block size used for parallel
transfers was selected with that potential issue in mind.

Regards,
Matthieu
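The login-node round robin Matthieu describes is plain DNS; a quick Python
sketch of the resolver-side random pick (germany0 is the hypothetical host
from the example above):

    import random, socket

    # A generic login-node DNS entry may resolve to several addresses;
    # picking one at random spreads clients across the login nodes.
    infos = socket.getaddrinfo('germany0', 22, type=socket.SOCK_STREAM)
    addrs = sorted({info[4][0] for info in infos})
    print('login node addresses:', addrs)
    print('connecting to:', random.choice(addrs))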
Peter Stuge
2020-May-09 14:57 UTC
Parallel transfers with sftp (call for testing / advice)
Hi,

Cyril Servant wrote:
>> I think you make a compelling argument. I admit that I haven't
>> reviewed the patch, even though that is what matters the most.
>
> If you want to review the code, here is a direct link to the patch:
> https://github.com/openssh/openssh-portable/compare/V_8_2_P1...cea-hpc:V_8_2_P1_PSFTP1

Thanks for that, I've pulled your branch and reviewed locally.

I'm sorry, but I have to say that I think this has zero chance of going
upstream as-is, and it will probably be problematic even with further work.
I'll come back to why.

As-is there are way too many things going on in this proposal at once, and it
seems to make some assumptions which do hold true on Linux and probably other
systems but are not as portable as the rest of the codebase, and thus can't
work on some systems where OpenSSH currently works.

My review feedback follows. Note that I'm not an OpenSSH developer, but have
followed the project for a long time. Some feedback is specific, but most is
rather general. Your change viewed in isolation is not terrible, but I hope
to create an understanding for why the proposal is far too different from the
existing code to be able to go upstream, so that you may be able to avoid
similar situations in the future.

* A new thread queue infrastructure

Please forget about using this pattern in the OpenSSH context. Others have
mentioned that threading requires much more care, and while that is true I
think a simple thread queue is doable, although I think there's an issue or
two still overlooked in the proposed implementation (specifically, threads
combined with child processes especially require carefully dealing with
signals in all threads, which the change doesn't seem to do, and the
thread_loop() logic is not clear - the function will never return).

Anyway, the important point is that a thread queue is utterly different from
the architecture of the entire OpenSSH codebase, it is not nearly as
portable, and introducing such a different paradigm can not be justified for
gaining concurrency in the range of 10-63. There already exists concurrency
infrastructure which will scale fine into that range, based on a select()
call. If you do rework, please look at if/how you can reuse that and do
discuss what you find, in particular discuss beforehand if you find that you
would need to extend it.

* Established behavior is changed, an error condition arbitrarily introduced

The proposed change refuses file transfer if the local and remote files have
different sizes. This has never been a requirement before, and should
certainly not become one. It's also not necessary for what you want to do. If
a user says to transfer a file then the behavior always was and should
continue to be that whatever was there before on the recipient side is simply
overwritten.

* New flags are added to existing functions and new versions of some functions

If you ever find yourself creating a function called "real_$otherfunction"
you can be sure that there exists another, better approach.

The new flags and parameters are introduced on very many layers throughout
the code, in order to propagate higher level state down to the lowest level.
Sometimes that is unavoidable, but really do look hard for a simpler approach
first. You also introduce new global state. Maybe it would be possible to
combine the two. Ideally you'll find a way to change your code such that you
do not actually need to pass any new state so far down in a call stack.
Try hard to confine any changes within as narrow a call stack window as
possible, ideally only a single caller+callee level for each change. Does
that make sense? The existing concurrency infrastructure already solves this
problem btw.

* Special casing is added for the new functionality

In a couple of places new code is added if the new functionality is used, and
the previous code is kept verbatim if not. This approach is untenable in the
long run, and indicates that the existing code would need to change to also
support the new functionality, or sometimes even better: that the new code
needs to go somewhere else, into another layer in the call stack, so that
existing code and logic doesn't need to change.

* Don't add server workarounds to the client

fake_opendir(), reverse_recurse_stat() and wait_availability() do not seem to
belong in the client. The comment about fake_opendir() says:

+ * Open remote directory in order to refresh the file system cache. This is
+ * useful in the case of a parallel upload to a distributed file system (such as
+ * NFS or Lustre)

To me this needs to go into the server; it is the server's responsibility to
present fresh and correct information about the files it has. I don't believe
that this was done for that purpose, but it does look a bit like this
workaround was put in the client to be able to claim that the server
component doesn't need to change, when apparently it does. :\ Personally I
think this is rather pragmatism, and sparing yourself the wait for a changed
server to be in operation on all your relevant nodes.

* Ad-hoc name resolution and random server address choice

The sftp client is changed to always call getaddrinfo() on the server name
and pick a random address out of the result. This was not the case before,
otherwise the change would not have been necessary. Server name resolution is
another thing that is already done consistently in a particular way within
the codebase, but the proposed change ignores that and adds a one-off thing
to one specific code path. That is also untenable for the codebase in the
long run; a different approach is absolutely necessary.

* The thread queue uses "orders" while the existing codebase uses callbacks

This is again about reusing existing patterns in a codebase. Introducing a
completely new and different pattern is sometimes unavoidable, but then it
may make more sense to first refactor the existing codebase, to enable the
addition of some new functionality which would be made possible only by the
new pattern. I don't think this is the case here, however.

* Memory allocations without explanations and without obvious frees

In a few places it looks like new memory allocations are added without
corresponding frees, and there is no explanation e.g. in the commit message.
That makes me worry that leaks are introduced, which would be unfortunate,
since there are probably some folks with long-running client processes. And
the commit that claims to correct memory leaks is also not explained and in
fact looks like it could actually create another memory leak.

* Tests. Nice! :)

* The "optimized sparse file creation" is not obvious and not explained.

* scp ends up being affected by what is proposed as an sftp change. This is
another indicator that something may be changing in places where it
shouldn't.

* Form. This is the least significant problem, but it does still matter.
There is extra whitespace here and there, and there are some unnecessary
casts, e.g. in free() calls.
Please don't understand the above points as some kind of checklist of things
to fix; as I said, that's not my mandate. I just want to contribute
perspective, because I think that your use case is valid, and I appreciate
that you want to contribute to OpenSSH.

Now why this solution may be problematic even with further work:

Given that at least part of the significant mismatch between proposed code
and existing code is due to the nature of the problem, I think it may be a
bit tricky to find a solution contained completely within OpenSSH, at least
right away. With that said, I still think that it makes sense to somehow
allow OpenSSH to work in CPU-bound situations such as yours. The question is
how..

>> likely to introduce a whole load of not very clean error conditions
>> regarding reassembly, which need to be handled sensibly both within
>> the sftp client and on the interface to outside/calling processes.
>> Can you or Cyril say something about this?
>
> Indeed, reassembly must be handled properly...

I only understood in later emails that you actually don't handle reassembly
within OpenSSH, but offload that onto the kernel by using sparse files. I
think that's a wise choice. But it obviously only works given kernel and
filesystem support; I don't know if it's OK to require that. And what happens
without that support?

Another approach would be for the recipient to initially create the full size
file before any data transfer, but that comes at the cost of a probably quite
significant delay. (Maybe not though? Can you test?)

More importantly, this is significantly different from the current behavior;
a file transfer thus far creates or truncates the file on the recipient side
and appends received data, making the file grow, allowing e.g. humans to know
if the transfer was complete by only comparing sizes.

And there we've come full circle, arriving at transferring only part of a
file with sftp, the reassembly problem, and the topic of other tooling
already reliably handling things such as partial transfers, reassembly, and
more...

>> And another thought - if the proposed patch and/or method indeed will not
>> go anywhere, would it still be helpful for you if the sftp client would
>> only expose the file offset functionality? That way, the complexity of
>> reassembly and the associated error handling doesn't enter into OpenSSH.
>
> There is no server side change in this patch, so I don't think we can talk
> about complexity of reassembly.

Indeed the complexity I had in mind isn't present in your proposal, but OTOH
there is plenty of other complexity that I didn't anticipate. Heh. And while
there is no server change, I think there probably should be, instead of
adding that fake_opendir() business to the client.

> Of course, exposing the file offset functionality would help creating a new
> software on top of the sftp client, but at the cost of simplicity.

Your proposed change is also all but simple. I think an approach that
requires only a very small change has a much better chance to get upstream,
but it obviously also depends on what the patch looks like. I would encourage
you to give it a try however.
I guess it should not need to be a very large patch.

> And if you do this, in the case of "get", the new software will need to
> send a lot of "ls" commands in order to be aware of the directory
> structure, the file sizes in order to correctly split transfers

Right, recursive get could probably benefit from further support from the
sftp client, but start simple; what would the simple case look like, where a
user instructs the sftp program to transfer only a part of a file?

Thanks and kind regards

//Peter
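Peter's preallocation question ("Maybe not though? Can you test?") is easy to
probe; a rough Python sketch, assuming a Unix system (file name and size are
arbitrary):

    import os, time

    # How long does preallocating the full file size take on this filesystem?
    size = 5 * 1024 ** 3  # 5GB, like the transfer in Matthieu's example
    fd = os.open('prealloc.bin', os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    start = time.monotonic()
    os.ftruncate(fd, size)             # usually instant: creates a sparse file
    # os.posix_fallocate(fd, 0, size)  # forces real allocation where supported
    print('preallocation took %.3fs' % (time.monotonic() - start))
    os.close(fd)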
Cyril Servant
2020-May-18 08:14 UTC
Parallel transfers with sftp (call for testing / advice)
Hi Peter, and thank you for your advice, it's really appreciated.

> * A new thread queue infrastructure
>
> Please forget about using this pattern in the OpenSSH context. Others have
> mentioned that threading requires much more care, and while that is true I
> think a simple thread queue is doable, although I think there's an issue or
> two still overlooked in the proposed implementation (specifically, threads
> combined with child processes especially require carefully dealing with
> signals in all threads, which the change doesn't seem to do, and the
> thread_loop() logic is not clear - the function will never return).

The thread exits in thread_real_loop(), but yeah, it may not be clear enough.
Anyway, if we have to avoid using threads, we'll have to rework all this...

> * Established behavior is changed, an error condition arbitrarily introduced
>
> The proposed change refuses file transfer if the local and remote files
> have different sizes. This has never been a requirement before, and should
> certainly not become one. It's also not necessary for what you want to do.
> If a user says to transfer a file then the behavior always was and should
> continue to be that whatever was there before on the recipient side is
> simply overwritten.

No, this error is only raised if the thread wants to write a part of the file
just after the main thread created the sparse file. If the file doesn't have
the right size, this means there has been a problem during the sparse file
creation.

> * Don't add server workarounds to the client
>
> [...]
>
> * Ad-hoc name resolution and random server address choice

Those 2 functionalities were added with our specific HPC clusters in mind. I
think we can simply remove them and just focus on a point to point operation.
For information, the unpatched sftp never resolves hostnames, it just lets
the underlying ssh process do it.

> Another approach would be for the recipient to initially create the full
> size file before any data transfer, but that comes at the cost of a
> probably quite significant delay. (Maybe not though? Can you test?)

Well, without changing anything server-side, the only solution is to write
something at the end of the file. In our tests it creates a sparse file, but
indeed, this must be portable, and we have to test it on multiple
platforms...

Once again, thanks for your advice. I've only answered a few things here,
but as you said, portability is the main subject; then, if an alternative to
threads has to be chosen, we'll have to think about it.

-- Cyril
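On the portability testing Cyril mentions, a quick Python probe one might run
on each target platform to check whether an end-of-file write really produces
a sparse file (the file name and size are arbitrary; st_blocks is
Unix-specific):

    import os

    # Write only the last byte of a would-be 100MB file, then check
    # whether the filesystem actually allocated the skipped range.
    fd = os.open('probe.bin', os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    os.pwrite(fd, b'\0', 100 * 1024 * 1024 - 1)
    os.close(fd)
    st = os.stat('probe.bin')
    # st_blocks counts 512-byte blocks actually allocated on disk
    print('sparse' if st.st_blocks * 512 < st.st_size else 'fully allocated')
    os.remove('probe.bin')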