We're thinking about putting rsync to use in our production environment. What we want to do is have a cron job running on a client that replicates files on the host every five minutes. We believe that some of the files will take longer than five minutes to complete.

From the limited testing I've done it looks like the first rsync session correctly identifies the file as a candidate for transferring. Five minutes later the second rsync process is started and it too thinks that the file is out of date (as it still is) and subsequently starts transferring the file again. Ideally, the second process should realise that the file, although being out of date, is already in the process of being synchronised, and therefore ignore it.

Does an option exist that handles situations like this (and I've simply overlooked it) or is my situation general enough to warrant a feature like the one above being implemented?

Thanks in advance,

Matthew Burgess

British Telecommunications plc
Registered office: 81 Newgate Street London EC1A 7AJ
Registered in England no. 1800000

This electronic message contains information from British Telecommunications plc which may be privileged or confidential. The information is intended to be for the use of the individual(s) or entity named above. If you are not the intended recipient be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited. If you have received this electronic message in error, please notify us by telephone or email (to the numbers or address above) immediately.
Hi,

We have written a few lines in a wrapper shell script to prevent this kind of trouble. The idea is very common in Unix:

0) Check with 'ps' whether there is an active PID with a corresponding log/rsync.<pid> file. If there is, terminate with a log message.
1) Write the PID to a certain ..log/rsync.<pid>
2) Do the rsync stuff
3) Remove my ...log/rsync.<pid>
4) Terminate

(It is sometimes wise to clean up ..log/rsync.<pid> files left behind by crashed runs, or to enhance step 0 so that it deletes them when no corresponding PID is active in the system.)

sh

-----Original Message-----
From: matthew.2.burgess@bt.com [mailto:matthew.2.burgess@bt.com]
Sent: 20 June 2002 14:51
To: rsync@lists.samba.org
Subject: cronning rsync

--
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html
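The numbered steps in the reply above can be sketched as a small wrapper function. This is only a minimal sketch under stated assumptions, not the poster's actual script: it uses a single PID file instead of per-PID log/rsync.<pid> files, it checks liveness with `kill -0` rather than parsing `ps` output (which assumes the earlier run belongs to the same user), and the function name `run_once` and the exit code 75 are made up for illustration.

```shell
#!/bin/sh
# Sketch of the PID-file steps. All names here are hypothetical.

# run_once PIDFILE COMMAND [ARGS...]
# Runs COMMAND unless the PID recorded in PIDFILE belongs to a live process.
run_once() {
    pidfile=$1
    shift

    # Step 0: is a previous run recorded, and is it still alive?
    if [ -f "$pidfile" ]; then
        oldpid=$(cat "$pidfile" 2>/dev/null)
        # Treat anything that is not a plain number as "no PID".
        case $oldpid in
            ''|*[!0-9]*) oldpid= ;;
        esac
        if [ -n "$oldpid" ] && kill -0 "$oldpid" 2>/dev/null; then
            echo "previous run (pid $oldpid) still active, skipping" >&2
            return 75
        fi
        # Stale file left behind by a crashed run: clean it up.
        rm -f "$pidfile"
    fi

    # Step 1: record our own PID.
    echo "$$" > "$pidfile"

    # Step 2: do the real work (rsync, in the thread's case).
    "$@"
    status=$?

    # Step 3: remove the PID file.
    rm -f "$pidfile"

    # Step 4: finish, passing on the command's exit status.
    return $status
}
```

A cron-driven script could then call something like `run_once /var/tmp/rsync-wrapper.pid rsync [opts] source destination`, where the path and arguments are placeholders. A short window remains between the check in step 0 and the write in step 1, so this reduces overlapping runs rather than ruling them out.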
On Thu, Jun 20, 2002 at 12:51:14PM +0100, matthew.2.burgess@bt.com wrote:
> We're thinking about putting rsync to use in our production environment.
> What we want to do is have a cron job running on a client that replicates
> files on the host every five minutes. We believe that some of the files
> will take longer than five minutes to complete.

Then you'll want to ensure that only one rsync instance is active at a time. This is best done with an external locking mechanism. My personal favorite is setlock, part of Dan Bernstein's daemontools package <http://cr.yp.to/daemontools.html>, and your setlock'd cron job might look like this:

0-55/5 * * * * setlock -n /tmp/.rsync.lock rsync <blah blah blah>

Your OS may have similar functionality already available, and other 3rd-party packages may also have rolled their own equivalents (I know both procmail and maildrop did) -- "man -k lock" to see for yourself.

- Adrian
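Where setlock is not installed, the same "only one instance at a time" idea can be approximated in plain sh. The following is a rough sketch, not a replacement for setlock: it relies on `mkdir` either creating the directory or failing because it already exists, and the function name `with_lock` and the lock path are assumptions made for this example.

```shell
#!/bin/sh
# Sketch of a non-blocking lock built on mkdir. Names are hypothetical.

# with_lock LOCKDIR COMMAND [ARGS...]
# Runs COMMAND only if LOCKDIR can be created; otherwise skips the run.
with_lock() {
    lockdir=$1
    shift

    # mkdir creates the directory or fails if it already exists, so only
    # one caller can succeed at a time.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "lock $lockdir is held, skipping this run" >&2
        return 1
    fi

    # We hold the lock: run the command and remember its exit status.
    "$@"
    status=$?

    # Release the lock so the next scheduled run can proceed.
    rmdir "$lockdir"
    return $status
}
```

One difference worth noting: if the script is killed before `rmdir` runs, the directory stays behind and blocks later runs until someone removes it, whereas a lock held on an open file goes away when the process exits. On Linux systems that ship util-linux, `flock -n lockfile command` is another option of the latter kind.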
Thanks for all your swift replies. I can't believe that I didn't even see the obvious solution of checking to see if rsync was running beforehand. I've essentially rolled up my call to rsync in a shell script as below:

---begin script---
#!/bin/sh
# is a previous rsync process still running?
ps -ef | grep 'rsync' | grep -v 'grep rsync' | grep -v 'rsync.sh' > /dev/null
if [ $? -eq 1 ]; then
    # rsync isn't running - let's launch it now
    rsync [opts] user@host::/module/* .
else
    echo "Rsync is still running...please wait and try again later"
fi
---end script---

Thanks for all of your suggestions,

Matt Burgess

-----Original Message-----
From: Adrian Ho [mailto:aho-sw-rsync@03s.net]
Sent: Thursday, June 20, 2002 14:19
To: rsync@lists.samba.org
Subject: Re: cronning rsync
Unless you can be certain that nobody else might run their own rsync, it's not quite ready. I'd suggest something more like:

++++++++++++++++++++++++++++++++++++
#!/bin/sh
pidfile=/var/tmp/ourproceduresrsync.pid
# get the content. will be blank if nonexistent... saves a stat.
oldpid=`cat "$pidfile" 2>/dev/null`
# verify that it's numeric (string comparison, so an empty result is simply false)
if [ "`expr "$oldpid" / "$oldpid" 2>/dev/null`" = 1 ]
then
    # see if it represents a running rsync... unlikely to randomly get another rsync on the same pid
    if ps -p $oldpid | grep rsync >/dev/null
    then
        # and if it's running, that's all we need to know.. maybe next time
        exit 0
    fi
fi
# fire off the rsync in the background
rsync -options source destination &
# save its pid for the next run, in case we're not done when he starts
echo $! > "$pidfile"
# wait for it to finish (this is kind of like "fg")
wait $!
# and get rid of the pid file
rm "$pidfile"
++++++++++++++++++++++++++++++++++++

Fix the ps for however your system works.

Now, there are still two vulnerabilities here. If it takes more than 5 minutes to cat the file, do the expr, do the ps and grep, call rsync, and echo the rsync pid into the pidfile, you could possibly get a race condition. If it takes that long, though, you've got bigger problems. Frankly, you should be safe all the way down to an every-minute run, though that would probably be wasteful.

I don't know if the setlock thing is completely immune to racing. It might be better.

Tim Conway
tim.conway@philips.com
303.682.4917 office, 3039210301 cell
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack(nnnnnnnnnnnn, 19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), ".\n" '
"There are some who call me.... Tim?"
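The race described above sits between reading the pid file and writing it. One way to narrow it, sketched here under assumptions rather than taken from the thread, is to create the pid file with the shell's noclobber option (`set -C`), so that the existence check and the creation are a single step. Unlike the script above, this version records the wrapper's own PID rather than rsync's, does no stale-file cleanup, and the names `claim_pidfile` and `run_claimed` are hypothetical.

```shell
#!/bin/sh
# Sketch: claim a pid file atomically with noclobber. Names are hypothetical.

# claim_pidfile PIDFILE
# Creates PIDFILE containing this shell's PID; fails if it already exists.
claim_pidfile() {
    # The subshell keeps "set -C" from leaking into the caller. With
    # noclobber on, ">" refuses to overwrite an existing file.
    ( set -C; echo "$$" > "$1" ) 2>/dev/null
}

# run_claimed PIDFILE COMMAND [ARGS...]
# Runs COMMAND only if the pid file could be claimed.
run_claimed() {
    pidfile=$1
    shift

    if ! claim_pidfile "$pidfile"; then
        holder=$(cat "$pidfile" 2>/dev/null)
        echo "pid file $pidfile exists (holder: $holder), skipping" >&2
        return 1
    fi

    # We own the pid file: run the command and keep its exit status.
    "$@"
    status=$?

    # Remove the pid file so the next scheduled run can claim it.
    rm -f "$pidfile"
    return $status
}
```

A pid file left behind by a crash still has to be dealt with separately, for example with the `ps -p` check from the script above. Removing a stale file and re-claiming it is not atomic by itself, which is one reason a kernel-level lock may be the more robust choice.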
matthew.2.burgess@bt.com
Sent by: rsync-admin@lists.samba.org
06/20/2002 08:03 AM
To: rsync@lists.samba.org
cc: (bcc: Tim Conway/LMT/SC/PHILIPS)
Subject: RE: cronning rsync