Hi all,
I have about 15 LDAP servers that will be using rsync over SSH to sync their
access logs to a centralized server for further processing. The estimated log
volume is around 12 GB per day in total. On the centralized server, the
directory the logs are synced to is an NFS mount backed by an enterprise-grade
NAS. So the data path looks like this:
LDAP servers --> rsync over SSH --> NFS mount on log repo server --> NAS device
I currently have 4 servers configured, and everything appeared to work fine at
first. I do not have any visibility into the LDAP servers, but I assume their
clocks are synced. The cron job runs at the same times on each of them: HH:00,
HH:10, HH:20, HH:30, HH:40 and HH:50.
I have just started to notice some 'sync sputtering'. Sometimes the latest
access log from all 4 servers has the same timestamp, e.g. 09:30, while at
other times I see something like this: it is 09:35 and I have 2 servers at
09:20 and 2 at 09:30, i.e. they did not all manage to sync during the last
round.
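To put numbers on this, the newest access log per source can be listed on the log repo server. A rough sketch, assuming each LDAP server's logs land in their own subdirectory (a hypothetical layout, not necessarily how the target is organized); `newest_log` is just an illustrative name:

```shell
# Print the most recently modified access log in one server's directory.
# $1: directory holding that server's synced logs (hypothetical layout).
newest_log() {
    ls -t "$1"/access* 2>/dev/null | head -n 1
}

# Hypothetical usage, one line per server directory under the sync target:
#   for dir in /path/to/sync/target/*/; do
#       ls -l "$(newest_log "$dir")"
#   done
```

Since rsync -a preserves modification times, the timestamps shown this way should reflect the source files, so a directory that lags by one round would stand out.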
How does rsync handle multiple simultaneous connections over SSH? I am guessing
that it is up to the log repo server to allocate 4 separate SSH sessions, and
that within each of these, rsync is used to sync the logs. Is that correct? If
so, then the only issue would be the log repo server requesting a lot of
information from the NAS device at the same time, possibly causing the
operation to fail. The actual rsync command looks like this:
/bin/rsync --timeout=300 --rsync-path=/usr/local/bin/rsync -avz \
    -e "/usr/local/bin/ssh -i <key>" \
    `find $LDAP_LOGS_LOCAL/access* -mtime -5 -type f` \
    flmuser@$PREPARSER_SERVER:$LDAP_LOGS_REMOTE
The timeout is set to 300 seconds, and the logs being checked and synced are at
most 100 MB in size (they rotate once they reach 100 MB).
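To tell whether the missed rounds are timeouts or something else, the cron job could record rsync's exit status after each run. A minimal sketch, assuming a small wrapper script around the command above; the function name `describe_rsync_exit` and the log file path are hypothetical, and only a few of the exit codes from the rsync man page are mapped:

```shell
# Map an rsync exit status to a short description.
# Only a subset of the codes listed in the rsync man page is covered here.
describe_rsync_exit() {
    case "$1" in
        0)  echo "success" ;;
        12) echo "error in rsync protocol data stream" ;;
        23) echo "partial transfer due to error" ;;
        24) echo "partial transfer due to vanished source files" ;;
        30) echo "timeout in data send/receive" ;;
        *)  echo "other error" ;;
    esac
}

# Hypothetical usage in the cron wrapper, right after the rsync command:
#   rc=$?
#   echo "$(date '+%Y-%m-%d %H:%M:%S') rsync exit $rc: $(describe_rsync_exit "$rc")" \
#       >> "$HOME/ldap_sync.log"
```

With something like this in place, a round that was cut off by --timeout=300 would show up as code 30, and a log that rotated mid-transfer would likely show up as code 24.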
Cheers,
François