Hi - I'm trying to diagnose a slow rsync transfer. I need to copy a lot
of fresh data between two systems over gigabit ethernet, going through
ssh. Both systems have large RAID arrays with pretty fast read and write
benchmarks (>100MB/s), and both are running Fedora 10 Linux. I only get
around 15MB/s between the two systems using:
rsync -raxSH --numeric-ids /indir sys2:/outdir
I've tried switching to rsh, but that doesn't help a great deal. In
simple data copy tests, however, I get close to maximum gigabit speed.
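For example, a test along these lines (destination host as above) takes
the source disks out of the picture and measures just ssh plus the wire:

dd if=/dev/zero bs=1M count=1000 | ssh sys2 'cat > /dev/null'

If that were also stuck around 15MB/s, a cheaper cipher (e.g. ssh -c
arcfour, if your OpenSSH build supports it) could be passed via rsync's
-e option, but given the rsh result I doubt encryption is the issue.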
Running strace on the receiving rsync process, I see it does a lot of
seeking between writes. It seems the sparse file support (-S) seeks over
even tiny runs of zero bytes. Wouldn't it make sense to require a minimum
run of ~1024 sparse bytes before doing a seek? I suspect all that seeking
adds quite a bit of overhead.
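One way to confirm this (an untested suggestion) would be to re-run the
same transfer without -S and compare throughput:

rsync -raxH --numeric-ids /indir sys2:/outdir

and to count syscalls on the receiver with strace's summary mode:

strace -c -f -p <pid of receiving rsync>

which should show the ratio of lseek calls to writes directly.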
Can file systems actually record such small sparse regions? I would
assume they track holes on a per-block basis (at least on ext2 and
friends), so it's not clear to me why rsync has SPARSE_WRITE_SIZE set to
1024 (rather than 4096), or why there isn't a minimum threshold of ~1024
zero bytes before seeking.
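A quick way to see the filesystem's hole granularity (illustrative, path
made up) is to write a single byte past a large seek and compare the
apparent size against the allocated size:

dd if=/dev/zero of=/tmp/holetest bs=1 count=1 seek=1048576
ls -l /tmp/holetest; du -k /tmp/holetest

On ext2/ext3 I'd expect ls to report ~1MB but du only a block or two of
real allocation, i.e. holes are rounded to whole filesystem blocks, which
is what makes a sub-block sparse write look pointless.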
I'm not certain this is the cause of my problem, but it seems like it
would be a sensible optimisation either way.
Jeremy
--
Jeremy Sanders <jeremy@jeremysanders.net>
http://www.jeremysanders.net/ Cambridge, UK
Public Key Server PGP Key ID: E1AAE053