Axel Kittenberger
2008-Oct-16 14:50 UTC
Alternatives to programmatically calling the rsync binary a lot
Dear list,

I'd like to have your expert opinion on the following issue.

Out of a concrete need we developed an application that rsyncs any changes in a local directory structure to a remote system the moment they happen, using the Linux kernel's watch feature. This is in our opinion much more elegant than invoking rsync every x seconds/minutes from cron, or having to use a special filesystem (a FUSEd mirror or even a kernel-native one). The application is called lsyncd (live syncing daemon): http://code.google.com/p/lsyncd/

For simplicity we just exec() the system's installed rsync binary to sync a directory whenever a change happens in it. Now some users have complained that this strategy involves a lot of forking on a busy directory structure. We also have not yet worked out a mature way to handle the errors rsync might encounter (what to do on a network error, what to do on other errors, etc.).

Do you think it is feasible to go for a strategy other than working at arm's length by forking? I looked into librsync, childishly assuming it would be the very same thing rsync uses, but after a bit of reading it is evident that it is not. The big Beta tag also frightens me, and it says it is not wire compatible with rsync > 2, which does not look so cool. The other alternative would be to link directly against the rsync source files (the possibilities of the GNU world) and call its functions just like rsync's main() function would.

What do you think would be the smartest strategy to go for?

Kind regards,
Axel Kittenberger
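On the open question of what to do when a forked rsync fails, one possible starting point is to sort rsync's exit status into a few buckets and retry only the network-type ones. The sketch below is illustrative and is not lsyncd's code. The exit values (10, 12, 30, 35, 23, 24) are the ones listed in the rsync man page; the grouping into "retry", "partial" and "fatal", the function names, and the back-off policy are assumptions made for this example.

```python
import subprocess
import time

# Exit values as documented in the rsync man page. Which ones deserve a
# retry is a policy choice made for this sketch, not something rsync says.
RETRYABLE = {10, 12, 30, 35}  # socket I/O, protocol stream, two timeouts
PARTIAL = {23, 24}            # partial transfer (error / vanished files)


def classify(code):
    """Map an rsync exit status to a coarse verdict."""
    if code == 0:
        return "ok"
    if code in RETRYABLE:
        return "retry"
    if code in PARTIAL:
        return "partial"
    return "fatal"


def run_rsync(cmd, attempts=3, delay=5.0, run=subprocess.call, sleep=time.sleep):
    """Run cmd, retrying network-type failures with a growing delay.

    Returns (verdict, exit_code) for the last attempt. `run` and `sleep`
    are injectable so the retry logic can be exercised without rsync.
    """
    for attempt in range(1, attempts + 1):
        code = run(cmd)
        verdict = classify(code)
        if verdict != "retry" or attempt == attempts:
            return verdict, code
        sleep(delay * attempt)
```

A caller might treat "partial" as a reason to schedule the same directory again later (status 24 is common when files disappear mid-transfer on a busy tree) and treat "fatal" as a reason to log and stop.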
Marcelo Leal
2008-Oct-16 21:03 UTC
Alternatives to programmatically calling the rsync binary a lot
Does the solution need to be built around rsync? I think you should look at some kind of "low-level" replication, like DRBD or something similar.

What you have described is, I think, something very complex, because you can have many changes at almost the same time, many sync processes starting, and so on. I don't know how the "watch feature" and the sync daemon are integrated, but from what you described I really think a low-level solution would be safer.

Leal.
Axel Kittenberger
2008-Oct-16 21:31 UTC
Alternatives to programmatically calling the rsync binary a lot
Thanks for your comment!

But as you can see on the lsyncd project page, we have already compared it to DRBD (among other solutions). DRBD doesn't fit in many cases: in this use case we want a one-way sync only, not a two-way one, and DRBD is also a "heavyweight" solution that requires big changes to existing setups. watch + rsync, on the other hand, is "lightweight". I for one also needed the file ownerships on the remote server to differ from those on the source server, something else DRBD can't do. What I especially like about lsyncd (watch + rsync) is that it doesn't interfere with (as in slow down) local operations at all.

From the project's homepage:

"""""""""""""""
DRBD operates on block device level. This makes it useful for synchronizing systems that are under heavy load. Lsyncd on the other hand does not require you to change block devices and/or mount points, allows you to change uid/gid of the transferred files, separates the receiver through the one-way nature of rsync. However when using lsyncd a file change can possibly result in a full file transfer (at least for binary files) and is therefore unsuitable for databases. Also a directory rename will result in transferring the whole directory.
"""""""""""""""

Due to this one-wayness of syncing there is also, IMHO, nothing "insecure" about it. Only forking rsync operations all the time is not optimal.

And in my optimism I assumed rsync and librsync would relate to each other like curl and libcurl, only to be stumped by reality.
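If forking per change is the cost being complained about, one way to cut the number of rsync processes without leaving the watch + rsync model is to collect the directories touched during a short window and fork once per distinct subtree. The sketch below assumes each rsync run is recursive (`-a`) and that change notifications arrive as directory paths relative to the watched root; `coalesce`, `rsync_command` and `sync_burst` are hypothetical names, and this is not how lsyncd is known to work.

```python
import posixpath
import subprocess


def coalesce(changed_dirs):
    """Reduce a burst of changed directories to the minimal set of roots.

    Duplicates are dropped, and a directory is dropped when one of its
    ancestors is already scheduled, because a recursive rsync of the
    ancestor covers it.
    """
    roots = []
    # Sorting puts every ancestor before its descendants.
    for d in sorted({posixpath.normpath(p) for p in changed_dirs}):
        if not any(d.startswith(r + "/") for r in roots):
            roots.append(d)
    return roots


def rsync_command(src_root, dest, rel_dir):
    """Build the argument list for syncing one subtree (relative path)."""
    src = posixpath.join(src_root, rel_dir) + "/"
    dst = posixpath.join(dest, rel_dir) + "/"
    # -a (archive, recursive) and --delete are standard rsync options.
    return ["rsync", "-a", "--delete", src, dst]


def sync_burst(src_root, dest, changed_dirs, run=subprocess.call):
    """Fork one rsync per coalesced root instead of one per event."""
    return [run(rsync_command(src_root, dest, d)) for d in coalesce(changed_dirs)]
```

For example, a burst touching `www/img`, `www`, `www/img` again and `logs/app` would fork rsync twice (for `logs/app` and `www`) rather than four times. The trade-off is that syncing an ancestor recursively makes rsync scan more files than strictly changed.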
Matt McCutchen
2008-Oct-17 01:52 UTC
Alternatives to programmatically calling the rsync binary a lot
On Thu, 2008-10-16 at 13:38 +0200, Axel Kittenberger wrote:
> For simplicity we just exec()ed the systems installed rsync binary to
> invoke rsync for a directory when a change happened in it. Now some
> users complained that this strategy involves a lot of forking on a
> vivid directory structure.

> Now do you think it is feasible to go for another strategy than
> working at hands distance by forking?

> The other alternative would be to directly link with the rsync files
> (the possibilities of the GNU world), and call /use its according
> functions just like the rysnc main() function would do.

The rsync codebase is really designed for a single run. It would probably be possible to modify rsync to the point where you can call functions to process individual directories that need processing, but making everything work correctly and maintaining this derivative of rsync may be more effort than you would want to spend. Forking may be the most practical solution.

Another option would be to develop your own protocol for indicating changes at the file level and use librsync to delta-transfer individual files (similar to Unison's approach). This might let you integrate the file manipulation more tightly with the change notifications you get from inotify.

Matt
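To make the per-file delta-transfer idea more concrete, here is a toy version of the signature / delta / patch cycle that a library such as librsync provides. It deliberately does not use librsync's API: the function names, the fixed 4-byte block size, and the use of SHA-256 as the only block hash are assumptions chosen to keep the example short. Real rsync pairs a cheap rolling checksum with a stronger hash so the sliding-window search stays fast; this sketch simply rehashes at every offset.

```python
import hashlib

BLOCK = 4  # tiny block size so the example is easy to follow


def signature(old, block=BLOCK):
    """Receiver side: map the digest of each full block of `old` to its offset."""
    sig = {}
    for off in range(0, len(old) - block + 1, block):
        sig.setdefault(hashlib.sha256(old[off:off + block]).digest(), off)
    return sig


def delta(new, sig, block=BLOCK):
    """Sender side: describe `new` as ("copy", offset) and ("data", bytes) ops."""
    ops, literal, i = [], bytearray(), 0
    while i < len(new):
        chunk = new[i:i + block]
        off = sig.get(hashlib.sha256(chunk).digest()) if len(chunk) == block else None
        if off is None:
            # No known block starts here: ship this byte literally.
            literal.append(new[i])
            i += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal.clear()
            ops.append(("copy", off))
            i += block
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old, ops, block=BLOCK):
    """Receiver side: rebuild the new file from the old file plus the delta."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg:arg + block] if kind == "copy" else arg
    return bytes(out)
```

With `old = b"abcdefghijkl"` on the receiver and `new = b"abcdXXefghijkl"` on the sender, only the two inserted bytes travel as literal data; the rest is expressed as references to blocks the receiver already has. A custom change-notification protocol of the kind suggested above would wrap a cycle like this around each file that inotify reports as modified.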