For anyone who'd like to check out my "rzync" [sic] test app, I've just
released a new version. For those who might not have time to look at
the code but could provide some feedback based on a rough description,
I've created the following simple web page:
http://www.clari.net/~wayne/new-protocol.html
Here's the tar file of the new release:
http://www.clari.net/~wayne/rzync-0.03.tar.gz
Changes in this version:
I've optimized the protocol to make the transferred-byte overhead
smaller; I've used an rsync-like file-list compression to make the
directory data smaller; I've gotten rid of some previous limitations
(such as the 4-byte file-size limit and the lack of reallocating various
buffers for really large file-count transfers); I've re-enabled the
"move" versions of the various get/put commands (which were disabled in
the last release); and I've fixed several bugs. The resulting program
seems to be working quite well in my limited testing.
The count of transferred bytes in the latest protocol is now below what
rsync sends for many commands -- both a start-from-scratch update and a
fully-up-to-date update are usually smaller, for instance. This is
mainly because my file-list data is smaller, but it's also because I
reduced the protocol overhead quite a bit. Transferred bytes for
partially-changed files are still bigger than rsync because librsync
creates unusually large delta sizes (though there's a patch that makes
it work much better, it's still not as good as rsync).
In my speed testing, one test was sending around 8.5 meg of data on a
local system, and while rsync took only .5 seconds, my rzync app took
around 2 seconds. A quick gprof run reveals that 98% of the runtime is
being spent in 2 librsync routines, so it looks like librsync needs to
be optimized a bit.
One potential next step might be optimizing rsync to make the
transferred file-list size a little smaller (e.g. making the transfer of
the "size" attribute only as long as needed to store the number would
save ~4-5 bytes per file entry on typical files).
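
To make the "only as long as needed" idea concrete, here is a minimal
sketch of one possible variable-length size encoding: a single length
byte followed by the fewest little-endian bytes that can hold the value.
The function names and the exact layout are assumptions for
illustration; this is not the wire format used by rsync or rZync.

```python
def encode_size(size):
    """Encode a non-negative size as a length byte plus minimal payload.

    Hypothetical layout: 1 byte giving the payload length, then the
    size itself in little-endian order using as few bytes as possible.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    nbytes = (size.bit_length() + 7) // 8  # 0 bytes are needed for size 0
    return bytes([nbytes]) + size.to_bytes(nbytes, "little")


def decode_size(buf, offset=0):
    """Return (size, next_offset) for an entry written by encode_size."""
    nbytes = buf[offset]
    start = offset + 1
    size = int.from_bytes(buf[start:start + nbytes], "little")
    return size, start + nbytes


if __name__ == "__main__":
    for size in (0, 1500, 70000, 5 * 2**30):
        encoded = encode_size(size)
        print(size, "->", len(encoded), "bytes on the wire")
```

With this layout a small file costs two or three bytes for its size
field, while a multi-gigabyte file still fits without a separate
"large file" escape.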
It looks like work needs to be done on making librsync more efficient.
Until I can get some better speed tests, I'm unsure if I should attempt
to make rsync talk my new protocol. Opinions welcomed.
..wayne..
On Fri, Jun 21, 2002 at 03:46:39AM -0700, Wayne Davison wrote:
> The count of transferred bytes in the latest protocol is now below what
> rsync sends for many commands -- both a start-from-scratch update or a
> fully-up-to-date update are usually smaller, for instance. This is
> mainly because my file-list data is smaller, but it's also because I
> reduced the protocol overhead quite a bit. Transferred bytes for
> partially-changed files are still bigger than rsync because librsync
> creates unusually large delta sizes (though there's a patch that makes
> it work much better, it's still not as good as rsync).

I believe that the remaining difference is that rsync does "context
compression" using zlib. I believe librsync does no compression at all
yet. Even if you zlib compress librsync's deltas, they will still be
bigger than rsync because of the "context" it uses... it compresses the
whole file, hits and misses, but only sends the compressed output for
the misses. This means the compressor is "primed" with data from the
hits.

I think that the best solution for this is to do what xdelta is planning
to do... toss zlib and include target references as well as source
references in the delta instruction stream; do the compression yourself.
One way to do this is to implement xdelta-style non-block aligned
matches against the target, building a rollsum hash-tree as you go
through it, and run it alongside the rsync block match algorithm.
However, this might not work well in practice...

> In my speed testing, one test was sending around 8.5 meg of data on a
> local system, and while rsync took only .5 seconds, my rzync app took
> around 2 seconds. A quick gprof run reveals that 98% of the runtime is
> being spent in 2 librsync routines, so it looks like librsync needs to
> be optimized a bit.
>
> One potential next steps might include optimizing rsync to make the
> transferred file-list size a little smaller (e.g. making the transfer of
> the "size" attribute only as long as needed to store the number would
> save ~4-5 bytes per file entry on typical files).
>
> It looks like work needs to be done on making librsync more efficient.

I'm going to get onto this after this weekend. I know what needs to be
done... I just need the time to do it.

-- 
----------------------------------------------------------------------
ABO: finger abo@minkirri.apana.org.au for more info, including pgp key
----------------------------------------------------------------------
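
The effect of a "primed" compressor described in the reply above can be
illustrated with a small sketch. It uses zlib's preset-dictionary
feature as a stand-in: the matched ("hit") data is handed to the
compressor as a dictionary, and only the unmatched ("miss") data is
actually compressed and sent. This is a swapped-in technique chosen for
brevity, not rsync's actual mechanism, and the function names and sample
data are assumptions.

```python
import zlib


def compress_literal(literal, context=b""):
    """Compress literal data, optionally priming zlib with context bytes."""
    if context:
        comp = zlib.compressobj(zdict=context)
    else:
        comp = zlib.compressobj()
    return comp.compress(literal) + comp.flush()


def decompress_literal(blob, context=b""):
    """Reverse compress_literal; the receiver must hold the same context."""
    if context:
        decomp = zlib.decompressobj(zdict=context)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    # Stand-in for blocks both sides already have (the "hits").
    hits = b"The quick brown fox jumps over the lazy dog.\n" * 16
    # Stand-in for new data that has to be sent (a "miss").
    miss = b"The quick brown fox jumps over the lazy cat.\n"

    plain = compress_literal(miss)
    primed = compress_literal(miss, hits)
    print("unprimed:", len(plain), "bytes; primed:", len(primed), "bytes")
    assert decompress_literal(primed, hits) == miss
```

When the miss resembles the surrounding hits, as it usually does in a
partially-changed file, the primed output is noticeably smaller because
the compressor can refer back to data the receiver already holds.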
Wayne Davison
2002-Jun-21 17:58 UTC
rZync 0.04 -- a faster next-generation protocol test app
FYI, I decided to release a new version of my next-generation protocol
test app because I created an optimized transfer mode when files are
being sent whole (it bypasses all calls to librsync). This makes my
"rZync" test app faster than rsync for sending whole files (rather than
4x slower, like it was). This is significant because it helps to assure
me that my single-process generator/receiver will be able to keep up
with rsync's dual process implementation. A full-file transfer appears
to be faster than rsync, even on a dual processor system. For instance,
this test was 775 files in 126 directories:
---------------------------------- rsync ----------------------------------
wrote 32920749 bytes read 12420 bytes 9409476.86 bytes/sec
total size is 32869747 speedup is 1.00
rsync -av foo /tmp 2.23s user 1.54s system 162% cpu 2.314 total
wrote 32920749 bytes read 12420 bytes 7318482.00 bytes/sec
total size is 32869747 speedup is 1.00
rsync -av foo /tmp 2.23s user 1.55s system 105% cpu 3.588 total
---------------------------------- rZync ----------------------------------
wrote 32900189 bytes (16813) read 5534 bytes (5534) 13162289.20 bytes/sec
total size is 32869700 speedup is 1.00
rs -av foo /tmp 0.34s user 0.56s system 39% cpu 2.274 total
wrote 32900064 bytes (16688) read 5534 bytes (5534) 13162239.20 bytes/sec
total size is 32869700 speedup is 1.00
rs -av foo /tmp 0.42s user 0.69s system 58% cpu 1.910 total
---------------------------------------------------------------------------
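
As a rough way to compare protocol overhead in the runs above, the
following sketch subtracts the reported total file size from the bytes
written and adds the bytes read. It assumes that all file data travels
in the "wrote" direction and that everything else is overhead; the
figures are copied from the output above, and the helper name is
hypothetical.

```python
# (label, bytes written, bytes read, total file size) from the runs above
RUNS = [
    ("rsync run 1", 32920749, 12420, 32869747),
    ("rsync run 2", 32920749, 12420, 32869747),
    ("rZync run 1", 32900189, 5534, 32869700),
    ("rZync run 2", 32900064, 5534, 32869700),
]


def overhead(wrote, read, total_size):
    """Bytes on the wire beyond the file data, under the assumption above."""
    return (wrote - total_size) + read


if __name__ == "__main__":
    for label, wrote, read, total_size in RUNS:
        print(f"{label}: {overhead(wrote, read, total_size)} bytes of overhead")
```

By this measure the rsync runs carry about 63 KB of non-file data and
the rZync runs about 36 KB, though the estimate is only as good as the
assumption behind it.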
I've also updated my new-protocol web page to explain what I'm trying to
accomplish (which some folks probably missed the first time around):
http://www.clari.net/~wayne/new-protocol.html
Here's the tar file of the new release:
http://www.clari.net/~wayne/rzync-0.04.tar.gz
For those that want to try this out, use the "rs" Perl script to control
rZync in an rsync-like manner (a temporary, test-mode situation), or
control it yourself by sending it commands on stdin.
..wayne..