Niels Andersen
2001-Nov-24 01:22 UTC
Find similarity between local and remote file before transferring
For two different reasons, I've been thinking about how to find out how much data would have to be transferred to update the local file, or how similar the local and remote files are.

Reason one:
I'm working on a frontend for rsync. I think it would be a nice feature if it could check how much of the existing file would be reused when updating from the remote file.

Reason two:
There are a lot of filesharing systems out there, each with its own advantages. But you can very rarely resume with a different download mechanism. So if you download a 650 MB ISO from somewhere and there's an error, you have to start all over. Or maybe you just need the last two megs, but there's a 20 hour queue.
If it were possible, I'd suggest that hosts of such systems also share the files with rsync. The rsync daemon should then check for similarity, and if there's more than e.g. 10% difference, it should deny the request and tell the user to use the primary download system, with queues, ratio, or whatever.

/Niels Andersen
Dave Dykstra
2001-Nov-27 03:32 UTC
Find similarity between local and remote file before transferring
On Fri, Nov 23, 2001 at 03:22:02PM +0100, Niels Andersen wrote:
> For two different reasons, I've been thinking about how to find out how much
> data would have to be transferred to update the local file, or how
> similar the local and remote files are.
>
> Reason one:
> I'm working on a frontend for rsync. I think it would be a nice feature if
> it could check how much of the existing file would be reused when
> updating from the remote file.
>
> Reason two:
> There are a lot of filesharing systems out there, each with its own
> advantages. But you can very rarely resume with a different download
> mechanism. So if you download a 650 MB ISO from somewhere and there's an
> error, you have to start all over. Or maybe you just need the last two
> megs, but there's a 20 hour queue.
> If it were possible, I'd suggest that hosts of such systems also share the
> files with rsync. The rsync daemon should then check for similarity, and if
> there's more than e.g. 10% difference, it should deny the request and tell
> the user to use the primary download system, with queues, ratio, or whatever.

Rsync currently doesn't do that. Would it suit your purposes if 'rsync -n' reported the statistics of what it would transfer if -n were not used, rather than what it reports now (which appears to be the number of bytes it actually transfers rather than what it would transfer without -n)? I think that's been asked for but nobody ever made a patch.

- Dave Dykstra
niels@myplace.dk
2001-Nov-28 20:07 UTC
Find similarity between local and remote file before transferring
> Rsync currently doesn't do that.

That explains why I couldn't find anything usable in the man page. :)

> Would it suit your purposes if 'rsync -n'
> reported the statistics of what it would transfer if -n were not used
> rather than what it reports now (which appears to be the number of
> bytes it actually transfers rather than what it would transfer
> without -n)?

Yes, that would be perfect for one of the purposes. But I think this functionality should be enabled by a separate option, so that rsync stays backwards compatible. I don't really care, but some might. :)

> I think that's been asked for but nobody ever made a patch.

Okay, then I'm not the only one. :)

I don't know exactly how the protocol works, but I imagine it's a simple patch: pretend to be downloading, but every time the client sees a part that needs to be downloaded, it tells the server that it already has that part and just notes its size. That must be pretty simple. I think... :)

My interest right now is only in connecting to an rsync daemon to download files. I don't know if that makes it easier. :)

I'd be very happy if someone would make this patch. It's for my rsync frontend, which I hope will help tell the world that rsync is a great way of downloading files and should be used more. :)

/Niels Andersen

(Sorry about the late answer, technical problems with my workstation...)
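
To make the idea discussed above concrete, here is a minimal Python sketch of how the reusable portion of a file could be estimated without transferring anything. It is not rsync's code or wire protocol: the function names (`estimate_reuse`, `allow_rsync`), the tiny block size, and the use of SHA-256 as the strong hash are all assumptions chosen for illustration, and the trailing partial block of the old file is simply ignored. It only shows the general technique of matching fixed-size blocks of the old file against a rolling window over the new file and adding up matched versus unmatched bytes.

```python
import hashlib

MOD = 1 << 16  # modulus for the weak rolling checksum (illustrative choice)


def weak_sum(block):
    """Weak checksum of a block as a pair (a, b) that can be rolled cheaply."""
    n = len(block)
    a = sum(block) % MOD
    b = sum((n - i) * x for i, x in enumerate(block)) % MOD
    return a, b


def estimate_reuse(old, new, block_size=4):
    """Return (reused_bytes, literal_bytes) for rebuilding `new` from `old`.

    Simplified sketch: only full blocks of `old` are indexed.
    """
    n = block_size
    # Index every full block of the old file: weak checksum -> strong hashes.
    index = {}
    for off in range(0, len(old) - n + 1, n):
        blk = old[off:off + n]
        index.setdefault(weak_sum(blk), set()).add(hashlib.sha256(blk).digest())

    reused = literal = pos = 0
    a, b = weak_sum(new[:n])
    while pos + n <= len(new):
        strong = index.get((a, b))
        if strong and hashlib.sha256(new[pos:pos + n]).digest() in strong:
            # Block already exists locally: count it and jump past it.
            reused += n
            pos += n
            if pos + n <= len(new):
                a, b = weak_sum(new[pos:pos + n])
        else:
            # No match: this byte would have to be sent; slide the window by one.
            literal += 1
            if pos + n < len(new):
                out_byte, in_byte = new[pos], new[pos + n]
                a = (a - out_byte + in_byte) % MOD
                b = (b - n * out_byte + a) % MOD
            pos += 1
    literal += len(new) - pos  # tail shorter than one block
    return reused, literal


def allow_rsync(reused, literal, max_diff=0.10):
    """Hypothetical daemon policy: allow only if at most `max_diff` differs."""
    total = reused + literal
    return total == 0 or literal / total <= max_diff


old = b"abcdefghijklmnop"
new = b"abcdXXefghijklmnop"  # two bytes inserted after the first block
reused, literal = estimate_reuse(old, new)
print(reused, literal, allow_rsync(reused, literal))
```

A frontend could show the two numbers to the user before starting a transfer, and a daemon-side check along the lines of `allow_rsync` would correspond to the 10% rule suggested in the first message. Real files would use a much larger block size than the 4 bytes used here.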