Howdy all,

I know development on SCP is discouraged, but since it's still in wide use I thought I would do some work that some of my users have been asking for and allow SCP to resume from a partial transfer.

When enabled, the resume function compares the source and target files. The source host and target host pass MD5 hashes back and forth in order to compare the hash of the file fragment on the target with the hash of the source file up to the length of that fragment. If these hashes match, the source seeks to the appropriate byte position and starts sending the rest of the file, which is then appended to the target.

Example:
An scp transfer of a 128,000 byte file terminates after transferring 64,000 bytes. The user then issues the scp command again with the -R (resume) option. The hash of the 64,000 byte fragment on the target is compared to the hash of the first 64,000 bytes of the source file. The hashes match, so the source seeks to the 64,001st byte and starts sending the remainder of the file. This is written to a temp file on the target, which is appended to the target file once the transfer completes.

If the hashes don't match, the entire source file is copied to the target and replaces the fragment.

In the event that the source and target files have the same size and the same hash, the file is skipped (no transfer of data). This ends up being handy if the user is doing a recursive transfer or using a wildcard.

Both sides have to be using a resume-enabled scp for this to work. Since scp uses the first scp in the user's path, I've added an option that lets users define which scp to use on the remote side. This can be set with '-s', e.g.
'scp -R -s /opt/openssh/bin/scp foo* you@host:~/foofiles/'

The code is available at https://github.com/rapier1/openssh-portable in the resume-scp branch. The patch file is ~1k lines long, so I didn't think posting it here would be the right move.
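
For anyone who wants the gist without reading the patch, the resume decision described above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the code in the branch: the names prefix_md5 and resume_copy are made up for the example, both files are local (the real thing exchanges the hashes between two hosts), and the remainder is appended directly rather than staged in a temp file first.

```python
import hashlib
import os
import tempfile


def prefix_md5(path, length, chunk_size=64 * 1024):
    """Return the MD5 hex digest of the first `length` bytes of `path`."""
    digest = hashlib.md5()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # file is shorter than `length`
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def resume_copy(source, target):
    """Copy `source` to `target`, resuming if `target` is a matching prefix.

    Hypothetical helper: returns "skipped", "resumed", or "full" to report
    which of the three cases applied.
    """
    src_size = os.path.getsize(source)
    dst_size = os.path.getsize(target) if os.path.exists(target) else 0

    # Does the fragment on the target match the same-length prefix of the source?
    matches = (
        0 < dst_size <= src_size
        and prefix_md5(target, dst_size) == prefix_md5(source, dst_size)
    )

    if matches and dst_size == src_size:
        return "skipped"  # same size and same hash: nothing to send

    with open(source, "rb") as src:
        if matches:
            src.seek(dst_size)  # 0-based offset of the first unsent byte
            mode, result = "ab", "resumed"
        else:
            mode, result = "wb", "full"  # mismatch: replace the fragment
        with open(target, mode) as dst:
            while True:
                chunk = src.read(64 * 1024)
                if not chunk:
                    break
                dst.write(chunk)
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "source.bin")
        target = os.path.join(tmp, "target.bin")
        data = os.urandom(128_000)
        with open(source, "wb") as f:
            f.write(data)
        with open(target, "wb") as f:
            f.write(data[:64_000])  # simulate the interrupted transfer
        print(resume_copy(source, target))  # resumed
        print(resume_copy(source, target))  # skipped
```

The two calls at the bottom replay the 128,000 byte example: the first finds a matching 64,000 byte fragment and sends only the remainder, the second finds identical files and sends nothing.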

Note: This is the first functional implementation of this, so there are a lot of debug statements and I am sure there are ways of improving it. I also know that users can just use rsync. That said, this was a deliverable and the users wanted it, so I did it. Anyway, I'm open to any and all suggestions, modifications, criticism and so forth.

Thanks,
Chris

rapier wrote:
> I'm open to any and all suggestions, modifications, criticism and so forth.

Do you think there are ways to accomplish the same or similar functionality for SFTP as well? Perhaps that could help acceptance of some SCP development.

//Peter

If you're looking for incremental copy capabilities, why not just use rsync, which already runs nicely over SSH?

On Apr 1, 2021, at 10:50 AM, rapier <rapier at psc.edu> wrote:
> Howdy all,
>
> I know development on SCP is discouraged but being that it's still in wide use I thought I would do some work some of my users have been asking for and allow SCP to resume from a partial transfer.
> [...]

-- 
Ron Frederick
ronf at timeheart.net

> On 01 Apr 2021, at 19:50, rapier <rapier at psc.edu> wrote:
>
> The code is available at https://github.com/rapier1/openssh-portable in the resume-scp branch. The patch file is ~1k lines long so I didn't think posting it here would be the right move.

Given the date this was released, I had to double check... I'm just surprised it didn't include a "--parallel=<number>" option to speed up transfers over high-latency links... much more needed for me, transferring data continuously between the northern and southern hemispheres ;(

On 4/1/21 1:50 PM, rapier wrote:
> Howdy all,
>
> I know development on SCP is discouraged but being that it's still in wide use
> I thought I would do some work some of my users have been asking for and allow
> SCP to resume from a partial transfer.

Would it be possible to instead reimplement SCP in terms of SFTP, and then add this feature to SFTP? My understanding is that such a reimplementation is something many people have wanted for quite a while. Of course, this might very well be out of scope for the project, which would be fine.

> When enabled the resume function will compare the source and target files. The
> source host and target host will pass MD5 hashes back and forth in order
> to compare the hash of the file fragment on the target and the file up to the
> length of the source file. If these hashes match the source will seek to the
> appropriate byte position and start sending the rest of the file. This will
> then be appended to the target.

I suggest using a better hash than MD5, which is considered broken. BLAKE2b is both faster and much more secure.

Sincerely,
Demi
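
As a rough illustration of the hash suggestion above: if the prefix hash were computed by a helper like the hypothetical prefix_digest below (Python for brevity, not the C in the patch), moving from MD5 to BLAKE2b would be a one-argument change. The function name and its parameters are assumptions made for this sketch. Both ends would of course have to agree on the algorithm, since the digests are compared across hosts.

```python
import hashlib


def prefix_digest(path, length, algorithm="blake2b", chunk_size=64 * 1024):
    """Hex digest of the first `length` bytes of `path` using `algorithm`.

    `algorithm` is any name accepted by hashlib.new(), e.g. "blake2b" or "md5".
    """
    digest = hashlib.new(algorithm)
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # file is shorter than `length`
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()
```

Calling prefix_digest(path, 64000) on each side and comparing the results would stand in for the MD5 exchange; passing "md5" as the third argument gives the original behaviour for comparison.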