Hello,

Recently, while using rsync to mirror a large (several-GB) archive on a regular basis, I ran into several problems and had some ideas about possible solutions. Could you please investigate, and consider implementing, the features described below in future rsync releases?

- When the checksumming stage of rsync runs slowly (roughly 3 minutes or more), which happens on machines with slower CPUs or with older HDDs that lack UDMA or support only UDMA33, one can often observe that the network connection to the master site shuts down and the mirroring fails (in subsequent mirroring attempts, when, e.g., the archive has already been transferred to about 90%). I think this happens because the bidirectionally open connection is simply reset by either the client or the server: rsync does not transfer anything while the checksumming runs (I might be wrong, but this is what I observed), and the TCP connection is reset because of the stall (I have no clue by what means, because I'm no TCP/IP expert, but I suspect it may just be TCP/IP itself).

  How about adding a feature to keep the checksums in a Berkeley-style database somewhere on the HDD, and on subsequent mirroring attempts just look up the checksums there, so that rsync does not need to checksum the whole target (already mirrored) file tree? Implementing this could take some time, but it would certainly improve rsync's responsiveness and ease its use on slow CPUs and HDDs.

- Make the output of error and status messages from rsync uniform, so that it can easily be parsed by scripts (it cannot right now - rsync 2.5.5).

- Perhaps, if the network connection between the rsync client and server stalls for some reason, implement something like a 'TCP keepalive' feature?

I know these are suggestions only; I have neither the resources nor the knowledge to implement them in rsync myself (though I do feel plagued by the problems described), so I'm sending these ideas to you in the hope that they will be useful and can be implemented in the future. Please let me know your opinion.

Thanks & regards,
Jan
Jan Rafaj [rafaj@cedric.vabo.cz] writes:

> How about adding a feature to keep the checksums in a berkeley-style
> database somewhere on the HDD separately, and with subsequent
> mirroring attempts, look to it just for the checksums, so that
> the rsync does not need to do checksumming of whole target
> (already mirrored) file tree ?

There's a chicken-and-egg issue with this - how do you know that the separately stored checksum accurately reflects the file it represents? Once they are stored separately, they can get out of sync. The natural way to verify the checksum would be to recompute it, but then you're back to square one.

I know there have been discussions about this sort of thing on the list in the past. For multiple similar distributions, the rsync+ work (recently incorporated into mainline rsync in experimental mode - the write-batch and read-batch options) helps remove repeated computation of the checksums and deltas, but it's not a generalized system for any random transfer.

I've wanted similar benefits because we use dialup to remote locations, and for databases of hundreds of MB or 1-2 GB we end up wasting a fair bit of phone time while both sides are just computing checksums. But I'm not sure of a good generalized solution. There may be platform-specific hacks (e.g., under NT, storing the computed checksum in a separate stream of the file, so it's guaranteed to be associated with the file), but I don't know of a portable way to link meta-information with filesystem files.
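One common way to soften the staleness problem described above is to make the cache self-invalidating: key each stored checksum on the file's size and mtime, and recompute whenever those disagree. This is only a minimal sketch of the idea - the function names and the use of MD5 in place of rsync's MD4 are illustrative assumptions, not anything rsync actually does, and trusting mtime is inherently weaker than a true checksum:

```python
import hashlib
import os

def cached_checksum(path, cache):
    """Return a whole-file checksum, recomputing only when size or
    mtime disagree with the cached entry (the staleness heuristic).
    `cache` is a plain dict; a real tool would persist it on disk."""
    st = os.stat(path)
    key = os.path.abspath(path)
    entry = cache.get(key)
    if entry and entry["size"] == st.st_size and entry["mtime"] == st.st_mtime:
        return entry["sum"]          # metadata unchanged: trust the cache
    h = hashlib.md5()                # rsync 2.x used MD4; MD5 stands in here
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    cache[key] = {"size": st.st_size, "mtime": st.st_mtime, "sum": h.hexdigest()}
    return cache[key]["sum"]
```

The second mirroring pass then only stats each file instead of reading it, which is exactly the saving Jan is after - at the cost of being fooled by any writer that preserves size and mtime.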
Note that if you aren't already, be sure to raise the default block size for large files - that can cut down significantly on both checksum computation time and the metadata transferred over the session, since fewer blocks need two checksums (weak + MD4) apiece.

> - make output of error & status messages from rsync uniformed,
>   so that it could be easily parsed by scripts (it is not right
>   now - rsync 2.5.5)

I know Martin has expressed some interest to the list in having something like this in the future as an option.

> - perhaps if the network connection between rsync client and server
>   stalls for some reason, implement something like 'tcp keepalive'
>   feature ?

I think rsync is pretty complicated at the network level already - it seems reasonable to me that rsync ought to be able to assume that the underlying network protocol stack will get the data to the other end, and/or give an error if something goes wrong, without needing a lot of babysitting. In all but the rsync-server cases, rsync doesn't control the network stream itself anyway (it just has a child process using ssh, rsh or anything else), so it becomes a question for that particular utility and not something rsync can do anything about. In the rsync server case, it already sets the TCP keepalive option at the socket level when it receives a connection.

If your network transport between systems is problematic, there's a limited amount rsync can do about it. And no, merely being idle on a session shouldn't terminate it, no matter how long rsync takes to compute checksums. So if that's happening to you, you might want to investigate your network connectivity. Or perhaps you're going through a NAT box or some sort of proxy that places a timeout on TCP sessions that you can increase?
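The block-size point above can be made concrete with a little arithmetic. Assuming roughly 20 bytes of checksum data per block (a 4-byte weak rolling checksum plus a 16-byte MD4 digest - the exact on-the-wire sizes may differ), the checksum metadata for a large file shrinks in direct proportion to the block size:

```python
import math

WEAK_SUM_BYTES = 4     # 32-bit rolling checksum
STRONG_SUM_BYTES = 16  # MD4 digest (full width assumed for illustration)

def checksum_metadata_bytes(file_size, block_size):
    """Approximate bytes of per-block checksum data the receiver must
    compute and transmit for one file (ignores protocol framing)."""
    blocks = math.ceil(file_size / block_size)
    return blocks * (WEAK_SUM_BYTES + STRONG_SUM_BYTES)

two_gb = 2 * 1024**3
for bs in (700, 16 * 1024, 64 * 1024):   # 700 was the old rsync default
    meta = checksum_metadata_bytes(two_gb, bs)
    print(f"block size {bs:>6}: ~{meta / 1024**2:.1f} MiB of checksum data")
```

For a 2 GiB file, going from the small default block to 64 KB blocks turns tens of megabytes of checksum traffic into well under a megabyte, which is why the advice matters so much on dialup.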
Upon failures, if you use --partial and a separate destination directory, you can keep retrying and slowly get the whole file across (that's how we do our backups), but you do still need to recompute checksums each time. It might be nice to see if rsync itself could have a retry mechanism that would reuse the checksum information it had computed previously. I have a feeling, given the structure of the code at this point, that doing so would be reasonably complicated.

The caveat to --partial is that once you have a partial file, even with --compare-dest, that partial file is all rsync considers for the remaining portion of the transfer. So originally, for our database backups, I was manually removing any partial copy that was smaller than some fraction of the previous copy I already had, since I'd lose less time rebuilding that fraction than losing access to the entire prior file. In response to that, I made another internal-use patch to rsync to "--partial-pad" any partial file with data from the original file on the destination system after an error. No guarantees it works as well, since I just took data from the original file past the size point of the partial copy, but in many cases (growing files) it's a big win. If anyone is interested, I could extract it and post it.

-- David

David Bolen                            E-mail: db3l@fitlinxx.com
FitLinxx, Inc.                         Phone: (203) 708-5192
860 Canal Street, Stamford, CT 06902   Fax: (203) 316-5150
tim.conway@philips.com
2002-Apr-19 16:42 UTC
Future RSYNC enhancement/improvement suggestions
The problem with cached checksums is that unless the filesystem driver regenerates them as the filesystem is modified, they're meaningless on a live filesystem. I ran into a similar problem with huge trees on slow NAS, and finally wrote my own system. It does no checksumming; instead it acts like rsync -W - if timestamp and size match, we're done - and sends everything in chunks: a list of non-directories to unlink, a list of directories to rmdir (in depth order, of course), and a gzipped tar, 8 MB at a time.

Tim Conway
tim.conway@philips.com
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack(nnnnnnnnnnnn, 19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), ".\n" '
"There are some who call me.... Tim?"
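Tim's quick-check shortcut - skip a file entirely when size and timestamp agree, as rsync does without --checksum - can be sketched as follows (the function name and the integer-second mtime comparison are my own illustrative choices):

```python
import os

def needs_transfer(src_stat, dst_path):
    """Quick check in the spirit of rsync -W / no --checksum: if the
    destination exists with identical size and mtime, assume it is
    already up to date and skip it."""
    try:
        dst = os.stat(dst_path)
    except FileNotFoundError:
        return True          # nothing there yet: must transfer
    return not (dst.st_size == src_stat.st_size
                and int(dst.st_mtime) == int(src_stat.st_mtime))
```

The trade-off is the same one Tim accepts: any change that preserves both size and timestamp goes undetected, which is why this only works when the trees are not being modified behind the tool's back.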
On Fri, Apr 19, 2002 at 12:23:06PM +0200, Jan Rafaj wrote:

> How about adding a feature to keep the checksums in a berkeley-style
> database somewhere on the HDD separately, and with subsequent
> mirroring attempts, look to it just for the checksums, so that
> the rsync does not need to do checksumming of whole target
> (already mirrored) file tree ?

The problem is that the generator works in the following steps:

1. For each block, both checksums are calculated and stored in a table.
2. The number of entries in the table is sent to the sender.
3. The contents of the table are sent to the sender.
4. The table is thrown away.

There is no real need to do this in 4 steps. It should be possible to change this without changing the protocol:

- The number of entries can be calculated from the block size and the size of the (flat) file, and sent to the sender up front.
- The rest can be done in a loop:
  * read a block
  * calculate the checksums for this block and fill a sum_struct
  * send this sum_struct to the sender

The code will become a little more complicated, but it will use less memory and may be a bit faster.

> - perhaps if the network connection between rsync client and server
>   stalls for some reason, implement something like 'tcp keepalive'
>   feature ?

Not a good idea - the line should always be busy.

cu, Stefan
--
Stefan Nehlsen | ParlaNet Administration | sn@parlanet.de | +49 431 988-1260
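Stefan's proposed loop - compute a block's checksum pair and ship it immediately, rather than buffering the whole table first - might look roughly like this in outline. Here zlib.adler32 stands in for rsync's rolling checksum and MD5 for MD4, and the 20-byte wire format is invented for illustration:

```python
import hashlib
import struct
import zlib

def stream_block_sums(path, block_size, out):
    """Emit each block's (weak, strong) checksum pair to `out` as soon
    as it is computed, keeping memory use at one block regardless of
    file size. `out` is any object with a write() method."""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            weak = zlib.adler32(block) & 0xFFFFFFFF
            strong = hashlib.md5(block).digest()
            out.write(struct.pack(">I", weak) + strong)  # send immediately
```

Since the block count is derivable from the file size and block size, the receiver never needs the full table in memory, and the line starts carrying data as soon as the first block has been read - which is exactly the idle-line fix discussed elsewhere in this thread.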
On 22 Apr 2002, Jan Rafaj <rafaj@cedric.vabo.cz> wrote:

> > > - perhaps if the network connection between rsync client and server
> > >   stalls for some reason, implement something like 'tcp keepalive'
> > >   feature ?

TCP connections don't time out anyhow. Possibly a dial-on-demand line or a firewall might drop the connection, but there should be enough traffic that this is not a problem.

> PS: 4th point - how about adding feature that would enable rsync
> to store the PID of the running process somewhere ? (like,
> I hate to 'ps ax | grep' for the rsync on a machine where
> other rsync instances might be running, controlled by other means
> than my script :)

For the daemon, you can use the "pid file" configuration option. For clients, you should just remember the pid when you create the process, e.g. by using the $! shell special parameter. There's no straightforward way to find out the pid of the remote child, but I'm not really convinced that's very important. If you're debugging rsync, it's fairly easy to do by peeking into /proc, using lsof, or some similar OS-dependent mechanism.

-- Martin
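Martin's client-side advice - remember the PID at process-creation time instead of grepping `ps` later - is what `$!` gives you in the shell. The same idea in a small Python wrapper (the helper name and pid-file path are hypothetical, not rsync features):

```python
import subprocess

def launch_and_record(cmd, pid_file):
    """Start a child process and store its PID in a file, so other
    scripts can find this particular instance without `ps ax | grep`."""
    proc = subprocess.Popen(cmd)
    with open(pid_file, "w") as f:
        f.write(str(proc.pid))
    return proc

# e.g. launch_and_record(["rsync", "-av", "src/", "host::module/"],
#                        "/tmp/my-rsync.pid")   # paths are placeholders
```

This only tracks the local client process, of course; as Martin notes, the remote child's PID is not readily available through rsync itself.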
Martin Pool [mbp@samba.org] writes:

> TCP connections don't timeout anyhow. Possibly a dial-on-demand line
> or a firewall might drop the connection, but there should be enough
> traffic that this is not a problem.

Unless you have quite large files, in which case there can be a lengthy period (particularly if the file is being accessed across a local network) while checksums are computed, during which there is no traffic at all. For a while (when we had slow drives and a 10BaseT network) checksum computation could take 20-30 minutes on a 500-600 MB database file with 4K blocks - and our long-distance dialup call was completely idle during that period.

At the time, I had planned to experiment with the sort of changes that Stefan's recent response to this thread suggested - transmitting the checksum information as it is computed, rather than building it all up before sending anything. As it turns out, we upgraded to a faster RAID setup and bumped the relevant machines to 100BaseT, and the time typically went down to somewhere between 5 and 10 minutes, so the priority of making the changes dropped. But I do still think it would be a useful adjustment to the data flow within rsync at some point. I can't remember just how major the surgery looked to get the transmission to occur at the point of computation, though.

-- David
(I wrote about long files taking 20-30 minutes to checksum with no network traffic.)

Jason Haar [Jason.Haar@trimble.co.nz] writes:

> ...But then you should have a dialup timeout of 1 hour set?

Oh, of course - I was mostly responding to Martin's comment about there being enough traffic present in general during an rsync session, since there are cases where you can have lengthy periods with no traffic at all. I could also see some NAT boxes holding a particular stream for far less than an hour by default, but I don't have a particular data point for that, so perhaps I'm just being too conservative.

> I think the problem is that you're morally upset that rsync spends so
> much time sending no network traffic. Quite understandable ;-)

Not sure about morally, but definitely financially :-)

> What about separating the tree into subtrees and rsyncing them? That
> means you go from:
>
> 1> dialup connection started [quick]
> 2> rsync generates checksums (no network traffic) [slow]
> 3> rsync transmits files

Perhaps you misunderstood - the checksum generation that was taking so long was at the *single-file* level. Rsync had already exchanged file lists and chosen the files to transfer; it was working on a single file, generating the block checksums on the receiver side to send over to the sender side. (As it turns out, the transfers in question were for a single directory normally comprising two files - a database file and its transaction log.)

The real rub was that after spending 20+ minutes with an idle line computing the checksums, it would then take another 30+ minutes to transmit the checksum information. So it was (and likely still is) a case where sending the data as it is computed would have been a major win. At least for slow connections, the checksum computation is unlikely to be the bottleneck versus network transmission, so leaving the network idle is wasted time that could be fully reclaimed.
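The scale of the problem David describes is easy to estimate. Assuming roughly 20 bytes of checksum data per block (4-byte weak plus 16-byte strong - an assumption, as is the effective modem rate), a back-of-the-envelope calculation for the 600 MB / 4K-block case over dialup:

```python
def dialup_minutes(file_size, block_size, bits_per_sec, per_block_bytes=20):
    """Rough time just to ship the block-checksum table over a modem;
    ignores protocol framing, compression, and line overhead."""
    blocks = -(-file_size // block_size)          # ceiling division
    return blocks * per_block_bytes * 8 / bits_per_sec / 60

# 600 MB database, 4 KB blocks, dialup at ~28.8 kbit/s effective
print(f"{dialup_minutes(600 * 1024**2, 4096, 28800):.0f} minutes")
```

Even at this optimistic rate, the checksum table alone ties up the line for on the order of a quarter of an hour, which is consistent with the 30+ minutes observed once real-world overhead and a slower effective rate are factored in.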
I may still look into that sort of change, but I just haven't had the cycles since the decrease in our checksum time - although this particular discussion has rather started me thinking about it again. I may review our current logs to see how much time is being wasted.

-- David
Martin Pool [mbp@sourcefrog.net] writes:

> I guess alternatively you could set the rsync timeout high, the
> line-drop timeout low, and make it dial on demand. That would let the
> line drop when rsync was really thinking hard, and it would come back
> up as necessary. Losing the ppp channel does not by itself interrupt
> any tcp sessions running across it, provided that you can recover the
> same ip address next time you connect.

That assumes an environment where dial-on-demand is feasible. Unfortunately, our particular setup is a direct PC-to-PC dial, and there's no IP involved (it's Windows<->Windows with NETBIOS/NETBEUI), so disconnecting would shut down the remote rsync. But it's an interesting thought for cases where it could be used. In general I'd expect it to be fairly fragile, though, unless you had complete control of the dial infrastructure or could otherwise ensure, as you note, identical IP address assignment.

I don't suppose anyone knows of any legacy reason why all the checksums are computed and stored in memory before transmission, do they? I don't think, at the time, I could find any real requirement in the code that it be done that way - the sequence was pretty much generate/send/free.

-- David
Martin Pool [mbp@sourcefrog.net] writes:

> No, I think you could avoid it, and also avoid the up-front traversal
> of the tree, and possibly even do this while retaining some degree of
> wire compatibility. It will be a fair bit of work.

Yeah, I was thinking in terms of bang for the buck - munging the file-list handling reaches into far more code and would likely be far more effort to change within the current rsync source than the checksum transmission. I think the checksum change would just be moving the equivalent of send_sums right into generate_sums, touching only the single generate.c module, with no noticeable difference on the wire or to other modules.

I did go back and look at our current transfers for the one task for which this could make the most difference. For the ~110 GB of data we synchronize each month (over V.34 dialup lines :-)), the "wasted" time with our current network/filesystem looks to be only about 7.5 hours of phone time in aggregate, which in turn is only about 1.6% of the ~480 hours used each month. So it's hard to worry extensively about that 1.6%.

-- David