On Sun, 2008-01-27 at 10:39 +1100, Michael Ashley wrote:
> I'm using rsync to transfer large amounts (megabytes per day!) of
> data over an Iridium modem link (240 bytes per second) from
> Antarctica.
>
> One problem is that the Iridium link has a mean uptime of perhaps 30
> minutes.
>
> Implementing partial transfers is crucial, so I was using
>
> rsync -av --partial --partial-dir=.rsync myfiles user@host:
>
> The files are pre-compressed with bzip2.
>
> Question 1: the documentation isn't clear about the interaction
> between
> the partial and partial-dir switches. Any advice?
When both switches are given, --partial-dir overrides --partial.
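So on a command line like yours, --partial is redundant; if I remember the man
page correctly, --partial-dir also implies --partial, so something like

    rsync -av --partial-dir=.rsync myfiles user@host:

should behave the same as your original command.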
> The problem with the above command is that the receiving rsync
> processes seem to hang around for a long time, even after the link is
> cut.
>
> Question 2: is there a signal one can send to a receiving rsync to
> get it to write out its partial transfers to the
> partial-dir?
SIGINT should work.
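For example, if you can still reach the receiving machine, something along
these lines (untested; the pattern used to find the server process is a guess)
should make it save what it has into the partial-dir and exit:

    kill -INT $(pgrep -f 'rsync --server')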
> The next thing I tried was to add "--timeout 1000". This worked
> reasonably well, except that IO buffering makes rsync think that
> the network is dead even though the link is up, and data is trickling
> out at 2400 baud.
>
> So, I tried "--bwlimit=1". I really need "--bwlimit=0.24", but
> rsync won't allow floating point there.
>
> This still isn't very satisfactory, and I am still not maximising my
> use of the link.
>
> Question 3: what should I do?! Any other switches that are relevant
> to my situation? E.g., "--block-size" (what are the units
> here? the man page doesn't say).
--block-size is in bytes and controls the size of the blocks matched by
the delta-transfer algorithm. You probably can't do much better than
the default (approximately the square root of the file size).
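If you do want to experiment with it anyway, it is just another option on the
command line, e.g.

    rsync -av --block-size=2048 --partial-dir=.rsync myfiles user@host:

(2048 is only an illustrative value, not a recommendation).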
> Question 4: will "--partial" save every last byte that makes it
> through? Or does it truncate to the last "block" (which
> might have taken 20 minutes to come through on a slow link).
From reading the code, it looks like rsync will truncate to the last
"token", where a token is either a match with a block of the old
destination file or a chunk of literal data of length up to the sending
rsync's CHUNK_SIZE constant, by default 32KB. You could recompile the
sending rsync with a lower CHUNK_SIZE.
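If you go that route, the change is small; assuming your version defines the
constant in rsync.h the way the copies I've looked at do, it is roughly:

    # in rsync.h, change something like
    #   #define CHUNK_SIZE (32*1024)
    # to, say,
    #   #define CHUNK_SIZE (4*1024)
    ./configure && make

A smaller CHUNK_SIZE means more protocol overhead per byte of literal data, so
don't shrink it too far.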
> I guess moving to an rsync daemon rather than ssh transport might be a
> good idea. I would like to be able to remove all network buffering
> (not sure if this is possible, is the "--blocking-io" switch relevant
> here?), and have rsync realise that the network is dead if nothing
> comes through in, say, 60 seconds. Alternatively, I could use another
> mechanism to determine the link status (an occasional ping?) and then
> send a SIG_SOMETHING to rsync to get it to clean up nicely and be
> ready for the next connection.
On such a slow, unreliable link, I doubt you will be able to make rsync
work very well by just trimming the buffering and making it time out and
restart. I would suggest putting some kind of layer on top of the link
that would allow a set of rsync processes to block when the link goes
down and simply resume work when it comes back up. Someone who knows
more about networking than I do might have more ideas about how to
accomplish this or what else to try.
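That said, if you do want to try the ping-and-restart idea you describe, a very
rough, untested sketch (the host name and option values are just placeholders)
might look like:

    while :; do
        rsync -av --partial-dir=.rsync --timeout=120 myfiles user@host: && break
        # link is presumably down; wait for it to come back before resuming
        until ping -c 1 -W 10 host >/dev/null 2>&1; do sleep 60; done
    done

Each retry picks up from whatever made it into the partial-dir on the previous
attempt.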
Matt