thr3ads.net - rsync - Suggest Rsync Performance Improvements [Sep 2002]

Damon Atkins

2002-Sep-11 06:56 UTC

Suggest Rsync Performance Improvements

1. Use large 1 MB I/O: make all reads and writes to file
systems 1MB (setvbuf can do this without extra coding),
  e.g. an awk programme doing 8K I/O to read a 2GB file
took 16 min; a perl programme doing 1MB I/O took 16
seconds.
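
A minimal C sketch of the setvbuf idea in point 1, not taken from the rsync source. It gives a stdio stream a 1 MB buffer so that byte-at-a-time getc() calls are served from large underlying reads. The names count_bytes_buffered and BIG_BUF are made up for illustration, and the buffer is allocated explicitly because some C libraries ignore the size argument when setvbuf is passed a NULL buffer. Whether this helps at all depends on the platform, so measure before relying on it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define BIG_BUF (1024 * 1024)   /* 1 MB, the size suggested in the post */

/* Count the bytes in `path`, reading one character at a time through a
 * stdio stream whose buffer is `bufsize` bytes.
 * Returns the byte count, or -1 on any error. */
long count_bytes_buffered(const char *path, size_t bufsize)
{
    assert(bufsize > 0);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    char *buf = malloc(bufsize);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, buf, _IOFBF, bufsize) != 0) {
        fclose(fp);
        free(buf);
        return -1;
    }

    long total = 0;
    int c;
    while ((c = getc(fp)) != EOF)
        total++;

    int failed = ferror(fp);
    fclose(fp);   /* close first: the stream still owns buf until here */
    free(buf);
    return failed ? -1 : total;
}
```

Calling count_bytes_buffered(path, 8192) and count_bytes_buffered(path, BIG_BUF) on the same large file under timex or truss would show how the buffer size changes the number and size of read calls.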

2. When doing rsync -a /dir1/dir2 /dir3/dir4
   Do not use pipes, as they only read/write 5k at a
time; this is extremely slow (check it out with Solaris
truss).  Use sockets with large I/O or shared memory on
the same system.

To see the difference try
 timex dd if=xyz of=lll bs=5k
  vs
 timex dd if=xyz of=lll bs=1024k

3. When allocating memory which will be accessed all
the time use valloc() which is
memalign(sysconf(_SC_PAGESIZE),size)

It aligns memory to the page, so memory copies are
word/page aligned and therefore faster, i.e. 4 bytes are
copied at once instead of one byte at a time.

4. If you use Gigabit Ethernet, to get performance you
need to read/write 64Kbytes to the socket and let the
OS break it up into MTU sized packets, or even better
send 1Mbytes to the socket.
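
A hedged C sketch of what "hand the socket a large buffer" looks like in practice. It is not rsync code; write_all is a hypothetical helper name. It passes the whole buffer to write() in one call and only loops to resume after a short write or an interrupted call, leaving it to the OS to split the data into MTU sized packets. It works on any file descriptor, including a connected socket.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all `len` bytes of `data` to `fd`.
 * Each write() is offered everything that remains, so the kernel decides
 * how to segment it. Returns 0 on success, -1 on error (errno is set). */
int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;
    assert(p != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                 /* short write: advance and try again */
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would fill a 64 Kbyte or 1 Mbyte buffer and pass it in a single write_all() call instead of issuing many small writes.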

I would change the rsync code, but could not find any
into on the variables, so that I could safely change
the code, and known it still works.

Damon.

jw schultz

2002-Sep-11 07:55 UTC

Suggest Rsync Performance Improvements

On Wed, Sep 11, 2002 at 04:55:30PM +1000, Damon Atkins wrote:
>
> 1. Use large 1 MB I/O: make all reads and writes to file
> systems 1MB (setvbuf can do this without extra coding),
>   e.g. an awk programme doing 8K I/O to read a 2GB file
> took 16 min; a perl programme doing 1MB I/O took 16
> seconds.
Larger I/O generally improves performance.  Do you have a patch
with numbers to back it up on multiple platforms?  By the
way, I could just say that the difference is caused by
comparing awk with perl, but I won't.
> 2. When doing rsync -a /dir1/dir2 /dir3/dir4
>    Do not use pipes, as they only read/write 5k at a
> time; this is extremely slow (check it out with Solaris
> truss).  Use sockets with large I/O or shared memory on
> the same system.
> 
> To see the difference try
>  timex dd if=xyz of=lll bs=5k
>   vs
>  timex dd if=xyz of=lll bs=1024k
dd is not a test of pipe performance or IPC in general.
If you are going to talk about pipe performance, use pipe
benchmarks.  Pipes will outperform sockets on almost any
platform.  Shared memory will beat both.  However, the
overhead of using pipes is a small factor in rsync
performance, and pipes are an essential part of the
necessary IPC.  Or are you suggesting we rewrite ssh?

There are reasons some people call it slowlaris.  This is
one.  Other systems have much larger and faster pipes,
usually related to pagesize.  Have you tested on cygwin?
We have lots of cygwin users.
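
For anyone who wants to measure pipes rather than dd, here is a rough C sketch of a pipe benchmark. It is an illustration only, not part of rsync or its test suite, and pipe_transfer is a made-up name. A forked child writes a given number of bytes into a pipe in fixed-size chunks while the parent reads them back and times the whole transfer with a monotonic clock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Push `total` bytes through a pipe using `chunk`-sized writes and reads.
 * Returns the number of bytes the reader received, or -1 on error.
 * If `secs` is non-NULL it receives the elapsed wall-clock time. */
long long pipe_transfer(size_t total, size_t chunk, double *secs)
{
    assert(chunk > 0);

    int fds[2];
    char *buf = malloc(chunk);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', chunk);
    if (pipe(fds) != 0) {
        free(buf);
        return -1;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: the writer. */
        close(fds[0]);
        size_t left = total;
        while (left > 0) {
            size_t want = left < chunk ? left : chunk;
            ssize_t n = write(fds[1], buf, want);
            if (n <= 0)
                _exit(1);
            left -= (size_t)n;
        }
        close(fds[1]);
        _exit(0);
    }

    /* Parent: the reader. Reads until the writer closes its end. */
    close(fds[1]);
    long long got = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, chunk)) > 0)
        got += n;
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    free(buf);

    if (secs != NULL)
        *secs = (double)(t1.tv_sec - t0.tv_sec)
              + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (n < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return got;
}
```

Running it with the same total and chunk sizes of 5k, 64k and 1024k on each platform of interest gives numbers that actually describe pipes; the timing includes fork and wait overhead, so use a total large enough to make that negligible.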
> 3. When allocating memory which will be accessed all
> the time use valloc() which is
> memalign(sysconf(_SC_PAGESIZE),size)
> 
> It aligns memory to the page, so memory copies are
> word/page aligned and therefore faster, i.e. 4 bytes are
> copied at once instead of one byte at a time.
Wordsize alignment is all that is needed for that sort of
copying, and malloc guarantees that on most platforms.  I
believe that much of the I/O is unaligned, but where
data gets copied inside the kernel or libs those copies are
optimized even when unaligned.  There are advantages to larger
unit alignments (cacheline, TLB and page), however.  Note the
current use of realloc.  Before a micro-optimization like
this is done, the whole memory footprint will probably be
changed.
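
For reference, a small C sketch of page-aligned allocation. It uses posix_memalign, the POSIX interface that covers what valloc and memalign do; this substitution is mine, not something proposed in the thread. The helper names alloc_page_aligned and is_aligned are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Allocate `size` bytes aligned to the system page size.
 * Returns NULL on failure; release the memory with free(). */
void *alloc_page_aligned(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = NULL;

    if (page <= 0)
        return NULL;
    if (posix_memalign(&p, (size_t)page, size) != 0)
        return NULL;
    return p;
}

/* Report whether `p` is a multiple of `alignment` bytes. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment > 0);
    return ((uintptr_t)p % alignment) == 0;
}
```

Note that memory obtained this way cannot portably be grown with realloc while keeping its alignment, which matters given the current use of realloc mentioned above.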
> 4. If you use Gigabit Ethernet, to get performance you
> need to read/write 64Kbytes to the socket and let the
> OS break it up into MTU sized packets, or even better
> send 1Mbytes to the socket.
The network interface is irrelevant.  We wouldn't code
specifically for 1400Mb any more than we would code for
10Mb.  Generally what works at one interface speed works at
all.  And, very important to remember, many rsync users are
running over ssh on PPP and 56kb dialup or on internet VPNs.
These users really feel it when the network performance is
damaged. 
> I would change the rsync code, but could not find any
> into on the variables, so that I could safely change
> the code, and known it still works.
I'm afraid your last sentence doesn't parse.  If you meant
"info on the variables", look at their names and how they
are used.  The code is the implementation documentation.
There is a whole test suite that comes with the source for
regression testing.  Of course, if you can't tell from the
changes that you didn't break things, your change will meet
even more resistance.

---

Rsync is run on many different platforms and not all are
even related to UNIX.  What improves performance on one can
sometimes have negative effects on another.  Also make sure
that your optimizations will even build on a very wide range
of systems.

It is free software.  If you create a patch and show large
improvements in performance on diverse platforms without
making the code a mess, you are free to advocate its
introduction to mainline.

Generally your "suggestions" are good ones.  But they are
best advocated with a project in its early stages.  I doubt
you will get a very favorable reception from many projects
and the only way you will get acceptance is to present
patches with benchmarks.

Hopefully my gentle response will save you some flames.

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw@pegasys.ws

		Remember Cernan and Schmitt
