Hi,
I've been trying to use rsync to back up the content of a large web
application on our Digital UNIX servers. I'm having some success, but the
main problem is the sheer number of files and subdirectories that must be
copied and maintained between the two servers (one is our standby system).
Believe it or not, there are in excess of 7.5 million files and directories
that must be synced between the systems each night. The window for
carrying this out can be up to 7-8 hours, and I've been experimenting by
running multiple rsyncs on different parts of the hierarchy. I did try
doing larger sections of the directory tree, but because (as far as I know)
rsync builds up a file list before doing any work, I eventually run into
resource issues in the kernel (even though I've bumped up all the in-core
stuff to around 2 GB for user processes).
I'm doing concurrent rsyncs of the form
rsync -aHWvu --delete --force /directory/wildcardAA* remotebox::data/directory/
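The idea of running several rsyncs over different parts of the hierarchy
could be scripted roughly as follows. This is only a sketch under stated
assumptions: the function name sync_prefixes, the MAX_JOBS limit and the
RSYNC override are illustrative and not part of rsync; the flags and the
destination are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper, not part of rsync: run one rsync per name prefix,
# in batches of at most MAX_JOBS concurrent transfers.
RSYNC=${RSYNC:-rsync}      # overridable, e.g. RSYNC=echo to preview the commands
MAX_JOBS=${MAX_JOBS:-4}    # assumed batch size; tune to the hardware

sync_prefixes() {
    src_root=$1
    dest=$2
    shift 2
    running=0
    for prefix in "$@"; do
        # The trailing * is left unquoted so the shell expands it for each prefix.
        $RSYNC -aHWvu --delete --force "$src_root"/"$prefix"* "$dest" &
        running=$((running + 1))
        if [ "$running" -ge "$MAX_JOBS" ]; then
            wait           # let the whole batch finish before starting the next
            running=0
        fi
    done
    wait                   # wait for the final, possibly partial, batch
}
```

A call might look like
sync_prefixes /directory remotebox::data/directory/ wildcardAA wildcardAB
where wildcardAB is a made-up second prefix. Note that the exit status of
each background rsync is discarded here, so failures would have to be
picked up from the output or from a wrapper around each transfer.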
I've been partially successful. It works intermittently. The main problems
I am seeing are:
1. Files that are added to the file list but are deleted by the web system
before they are sent. At least, I believe this is what is stopping some of
the runs from completing - see below:
send_files failed to open /some/path/to/file
11455: No such file or directory
rsync error: partial transfer (code 23) at main.c(578)
Can I force rsync to ignore this and carry on?
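One possible workaround, sketched here rather than offered as a definitive
answer, is to wrap rsync and treat exit code 23 (the partial transfer code
in the log above) as a warning. The wrapper name run_rsync_tolerant and the
RSYNC override are hypothetical. Newer rsync man pages also list code 24
for source files that vanished during the transfer; whether the release in
use here ever returns it is an assumption.

```shell
#!/bin/sh
RSYNC=${RSYNC:-rsync}      # overridable so the wrapper can be exercised without rsync

# Hypothetical wrapper: run rsync, downgrade "partial transfer" exit codes to a warning.
run_rsync_tolerant() {
    status=0
    "$RSYNC" "$@" || status=$?
    case $status in
        0)
            return 0 ;;
        23|24)
            # 23: partial transfer due to error; 24: vanished source files
            # (24 is documented in newer releases, assumed harmless if never seen).
            echo "warning: rsync exited with code $status, continuing" >&2
            return 0 ;;
        *)
            return "$status" ;;
    esac
}
```

Code 23 is also returned for other per-file errors, such as permission
problems, so ignoring it wholesale may hide real failures; the warning on
stderr is there so that those runs can still be reviewed afterwards.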
2. I'm also seeing the following, but I suspect this may be down to
resource problems on the remote system. I know it was running rather low
on swap, so I'm going to re-run the tests once I've changed the swapping
algorithm to over-commitment mode (so it does not pre-allocate memory so
greedily).
rsync: connection unexpectedly closed (9562065 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(150)
rsync: connection unexpectedly closed (28 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(150)
3. I've tried to set up logging on the remote system (which is set up to
receive rsync requests via port 873). I've added
log /path/to/log
directives to the rsyncd.conf file, but the log files never seem to get any
content (I've touched them to ensure they exist).
See this entry from rsyncd.conf:
[data]
comment= BIG SERVER
path=/path/to/data
use chroot = yes
read only= no
hosts allow=10.1.2.3
list=yes
uid=0
gid=0
exclude=quota.group quota.user
log /usr/local/etc/rsync_logfile_data.log
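The rsyncd.conf man page describes parameters as "name = value" lines and
names the logging parameter "log file" (in some releases it is documented
as a global option, placed before the first [module] section), so a line
of the form "log /path" may simply not be recognised. A small check along
the following lines could flag such entries; the function name
find_malformed_lines is hypothetical and the pattern is a rough
approximation of the file format, not rsync's own parser.

```shell
#!/bin/sh
# Hypothetical check: print, with line numbers, every line of an
# rsyncd.conf-style file that is not blank, a '#' comment, a [module]
# header, or a "name = value" parameter. Continuation lines ending in a
# backslash are not handled. grep exits with status 1 when nothing is flagged.
find_malformed_lines() {
    grep -n -v -E '^[[:space:]]*($|#|\[.*\][[:space:]]*$|[^=]+=)' "$1"
}
```

Run against the module above, this would be expected to report only the
final "log ..." line, which has no "=" in it.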
I'm hoping someone can advise on best practice for trying to do something
like this. I've tried to get rzync going
(http://www.clari.net/~wayne/new-protocol.html), but so far to no avail. It
apparently tries to address some of the problems inherent in moving LARGE
amounts of data. Hope I'm not talking heresy here :-)
Anyway, feedback and comments are welcome. Please send them directly to me
and I will post a summary of replies in a few days.
Many thanks.
-- _
/-\dam
-------------------------------------------------------------------------
FLESH: Adam Bentley, Systems/Networking/Usenet, Coventry University. UK
INET : A.Bentley@coventry.ac.uk
-------------------------------------------------------------------------
#include <std/disclaimer.h>
-------------------------------------------------------------------------