thr3ads.net - rsync - rsync and debian -- summary of issues [Apr 2002]

Martin Pool

2002-Apr-11 00:17 UTC

rsync and debian -- summary of issues

There seems to be a thread about rsync and Debian packages every
couple of months.  I've written up a document which tries to cover all
of the questions and debates.  It's pretty informal, but hopefully
will be useful.

  http://rsync.samba.org/rsync-and-debian/

I'd appreciate comments.

-- 
Martin

Adam Heath

2002-Apr-11 16:03 UTC

On Thu, 11 Apr 2002, Martin Pool wrote:
> There seems to be a thread about rsync and Debian packages every
> couple of months.  I've written up a document which tries to cover all
> of the questions and debates.  It's pretty informal, but hopefully
> will be useful.
>
>   http://rsync.samba.org/rsync-and-debian/
>
> I'd appreciate comments.
Seems good.

It'd be nice if there were links to details about how the reverse rsync
algorithm is used, though.

Also, I've had this idea to use rdiff to generate the checksums on the
server.  If rdiff supported a reverse rsync algorithm (it currently
doesn't), then it would make integration with other tools simple (think
of a wrapper around wget and rdiff).
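
To make the "checksums on the server" idea concrete, here is a minimal sketch of what precomputing per-block checksums could look like. It is not rdiff's actual signature format: the block size, the function name `block_signatures`, and the choice of Adler-32 plus SHA-256 are assumptions made purely for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, not taken from rdiff


def block_signatures(data, block_size=BLOCK_SIZE):
    """Return one (offset, length, weak, strong) tuple per fixed-size block.

    The server could compute this once per file and publish it as a
    static file next to the package, so no per-client CPU is spent.
    """
    sigs = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        weak = zlib.adler32(block)                  # cheap checksum for a first filter
        strong = hashlib.sha256(block).hexdigest()  # confirms a weak match
        sigs.append((offset, len(block), weak, strong))
    return sigs
```

Because the output depends only on the file contents, it can be generated when the file is uploaded and then served like any other static file.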

Brian May

2002-Apr-11 18:00 UTC

On Thu, Apr 11, 2002 at 06:15:43PM +1000, Martin Pool wrote:
> There seems to be a thread about rsync and Debian packages every
> couple of months.  I've written up a document which tries to cover all
> of the questions and debates.  It's pretty informal, but hopefully
> will be useful.
> 
>   http://rsync.samba.org/rsync-and-debian/
> 
> I'd appreciate comments.
I think some more detail is required regarding rproxy.

Why is nobody actively developing it?

AFAIK, it solves all the problems regarding server load discussed for
rsync, doesn't it?
-- 
Brian May <bam@debian.org>

Jason Gunthorpe

2002-Apr-11 23:42 UTC

On Thu, 11 Apr 2002, Martin Pool wrote:
> I'd appreciate comments.
Hmm...

As you may know, I'm both the APT author and the administrator of the
top-level Debian mirrors and the associated mirror network. So:
> 3.2 rsync is too hard on servers
> If it is, then I think we should fix the problems, rather than
> invent a new system from scratch. I think the scalability problems
> are accidents of the current codebase, rather than anything inherent
> in the design.
It's true, I'm afraid. Currently on ftp.d.o:

nobody    8835 25.7  0.3 22120 1740 ?        RN   Apr10 525:24 rsync --daemon
nobody   22896  5.0  0.3 22828 1992 ?        SN   Apr11  21:20 rsync --daemon
nobody    3907  7.3  0.5 22336 2820 ?        RN   Apr11  15:30 rsync --daemon
nobody   10729 13.7  4.0 22308 20904 ?       RN   Apr11  13:10 rsync --daemon

The load average is currently > 7, all due to rsync. I'm not sure what
the one that has sucked up 500 minutes is actually doing, but I've come
to accept that as 'normal'. I expect some client has asked it to
recompute every checksum for the entire 30G of data and it's just
burning away processor power <sigh>.

We tend to allow only 10-15 simultaneous rsync connections because of
this.

Things are better now. In the past, with 2.2 kernels and somewhat slower
disks, rsync would not just suck up CPU power but would seriously hit
the drives as well. I think the improvements in inode/dentry caching in
2.4 and our new archive structure are largely responsible for making
that less noticeable.

IMHO as long as rsync continues to have a server-heavy design, its
ability to scale is going to be quite poor. Right now there are 91
people connected to ftp/http on ftp.d.o; if they were all using rsync,
I'm sure the poor server would be quite dead indeed.
> 3.1 Compressed files cannot be differenced
I recall seeing some work done to determine how much savings you could
expect if you used xdeltas of the uncompressed data. This would be the
best result you could expect from gzip --rsyncable. I recall the numbers
were disappointing: it was << 50% on average or some such. It would be
nice if someone could find that email or repeat the experiments.
> 3.5 Goswin Brederlow's proposal to use the reverse rsync algorithm over
> HTTP Range requests
Several years ago I suggested this in a conversation with you on one of
the rsync lists. Someone else was able to pull a reference from the IBM
patent database and claimed it was the particular patent that prohibits
the server-friendly reverse implementation.
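
For readers unfamiliar with the "reverse" scheme in 3.5, the following is a rough sketch of the client side, under stated assumptions: the server publishes a checksum per block of the new file, the client searches its own old copy for those blocks, and whatever it cannot find locally is requested with an HTTP Range header. The function names, the signature layout, and the checksum choices are hypothetical and are not taken from Goswin's proposal. For brevity the weak checksum is recomputed at every offset instead of being rolled.

```python
import hashlib
import zlib


def signatures(data, block_size):
    """Per-block (offset, length, weak, strong) list, as a server might publish."""
    out = []
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        out.append((off, len(block), zlib.adler32(block),
                    hashlib.sha256(block).digest()))
    return out


def plan_fetch(old, sigs, block_size):
    """Return (copies, ranges).

    copies maps an offset in the new file to an offset in the old file
    holding identical bytes; ranges lists the inclusive (start, end) byte
    ranges of the new file that must be downloaded.
    """
    wanted = {}
    for off, length, weak, strong in sigs:
        if length == block_size:  # a short tail block is simply downloaded
            wanted.setdefault(weak, []).append((off, strong))

    copies = {}
    for pos in range(len(old) - block_size + 1):
        window = old[pos:pos + block_size]
        for new_off, strong in wanted.get(zlib.adler32(window), ()):
            # Confirm the weak hit with the strong checksum before trusting it.
            if new_off not in copies and hashlib.sha256(window).digest() == strong:
                copies[new_off] = pos

    ranges = []
    for off, length, weak, strong in sigs:
        if off in copies:
            continue
        start, end = off, off + length - 1
        if ranges and ranges[-1][1] + 1 == start:
            ranges[-1] = (ranges[-1][0], end)  # merge adjacent missing blocks
        else:
            ranges.append((start, end))
    return copies, ranges


def range_header(ranges):
    """Format the missing ranges as the value of an HTTP Range header."""
    return "bytes=" + ",".join("%d-%d" % r for r in ranges)
```

All of the checksum matching happens on the client here; the server only has to serve a static checksum file and answer ordinary Range requests.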
> 3.7 rsync uses too much memory
This only really seems to be true for tree-mirroring, where the file
lists can be very big indeed.

Jason

Michael Salmon

2002-Apr-28 23:43 UTC

On Thursday, April 11, 2002 06:15:43 PM +1000 Martin Pool <mbp@samba.org> 
wrote:
+------
| There seems to be a thread about rsync and Debian packages every
| couple of months.  I've written up a document which tries to cover all
| of the questions and debates.  It's pretty informal, but hopefully
| will be useful.
|
|   http://rsync.samba.org/rsync-and-debian/
|
| I'd appreciate comments.
+-----X8

There is a patch that creates a file list that can be reused later and
hence avoids the file walk, which addresses the server load that you
complain about. IIRC it was intended to be used for server-push rather
than client-pull, but it could still be a start.
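
As a rough illustration of the cached-file-list idea (not the patch itself, whose format I don't have at hand), the sketch below walks a tree once, stores the result, and reloads it later without touching the directory tree again. The function names and the JSON layout are assumptions.

```python
import json
import os


def build_file_list(root):
    """Walk `root` once; return sorted [relative_path, size, mtime] entries."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            entries.append([os.path.relpath(full, root),
                            st.st_size, int(st.st_mtime)])
    entries.sort()
    return entries


def save_file_list(entries, cache_path):
    """Write the list to disk so later sessions can skip the walk."""
    with open(cache_path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh)


def load_file_list(cache_path):
    """Read a previously saved list; no stat() calls on the tree are needed."""
    with open(cache_path, encoding="utf-8") as fh:
        return json.load(fh)
```

The cache is only as fresh as the last walk, so it would have to be regenerated whenever the archive is updated, for example at the end of each mirror pulse.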

/Michael
--
This space intentionally left non-blank.
