I've been researching the state of 'file alteration monitoring' technology on Linux. Famd uses dnotify to inefficiently monitor a handful of directories. The replacement for dnotify is being worked on in the kernel and it's called inotify. If I understand it correctly and they get it finished, it would be an awesome addition to rsync. With it, you could run rsync to update a remote system (push mode) and keep it up2date. With inotify efficiently feeding 'file opened for write was just closed' notifications to rsync, it could efficiently and continuously mirror an active file system. With enough bandwidth, the time lag could be mere seconds.

My challenge is to mirror 300 gigs of half meg files onto three remote filesystems, afap. Most of the files are 'write and leave for two months' with some being 'write 4 times a day', so rsync should be perfect if it can be told right after the file is created. We are installing fiber to our building, so this might be the perfect combination to come up with a five second mirror performance.

Can rsync be set up with: here's the src dir, here's the remote dir, now here's a relative list of files I want you to sync? Perhaps being fed one at a time via a pipe. The pipe thing wouldn't be needed if rsync used inotify directly.

Right now I'm using a homebrew ftp utility over a T1 with hours of backlog. I'm excited about the prospects.

thanks
scottb
Scott wrote:
> of directories. The replacement for dnotify is being worked on in the
> kernel and it's called inotify. If I understand it correctly
> and they get it finished, it would be an awesome addition to rsync.

I can't speak for the people who work on rsync, but from the sound of this, it seems better suited to a new *rsync capable* daemon that listens for the inotify messages and synchronizes files to an rsync server using the rsync protocol. This could be achieved using the librsync modules (I think? I haven't checked to see if those are kept up to date with current rsync protocols) and would avoid having to create a listening daemon mode for rsync on top of what it already has. At least, I hope that's the method you were getting at - you wouldn't want to spawn a new call to rsync every time a file was modified - that would most likely murder your system!

This sounds like an awesome idea though - I for one would jump on this if it performed well. I had previously experimented with something called ssyncd, which was OK but had to constantly scan all the files it was monitoring. Its one problem was that should the part doing the monitoring lose access to the folder it was copying from, it would think the files were all gone, essentially start an "rm -rf *", and wipe all your sync'd data (that was fun times, fun times). I had poked at a few other alternatives, but none were good enough. If this inotify stuff actually works, it will be a godsend :)

That's my take on this :)

Eli.
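The failure described above (an unreachable source being mistaken for "every file was deleted") can be guarded against before any deletion is propagated. The following is only a sketch of one possible guard, not a description of how ssyncd worked: the marker file name `.sync-sentinel` and both function names are hypothetical, and it assumes the admin places the marker in the source root once.

```python
import os


def source_is_trustworthy(src_root, sentinel=".sync-sentinel"):
    """True only if the source tree is reachable AND still carries its marker.

    An unmounted or unreachable source usually looks like a missing or
    empty directory, so the marker file will be absent too."""
    return (os.path.isdir(src_root)
            and os.path.isfile(os.path.join(src_root, sentinel)))


def plan_deletions(src_root, mirrored_names, sentinel=".sync-sentinel"):
    """Return the mirrored names that may be deleted on the mirror.

    If the source cannot be trusted, nothing is scheduled for deletion."""
    if not source_is_trustworthy(src_root, sentinel):
        return []
    return sorted(name for name in mirrored_names
                  if not os.path.lexists(os.path.join(src_root, name)))
```

With this check in place, losing access to the source yields an empty deletion plan instead of a wiped mirror.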
On Wed, 2005-03-02 16:11:52 -0800, Scott Becker <scottb@bxwa.com> wrote in message <42265648.1000507@bxwa.com>:
> I've been researching the state of 'file alteration monitoring'
> technology on Linux. Famd uses dnotify to inefficiently monitor a handful
> of directories. The replacement for dnotify is being worked on in the
> kernel and it's called inotify. If I understand it correctly and they get
> it finished, it would be an awesome addition to rsync. With it, you
> could run rsync to update a remote system (push mode) and keep it
> up2date. With inotify efficiently feeding 'file opened for write was
> just closed' notifications to rsync, it could efficiently and
> continuously mirror an active file system. With enough bandwidth, the
> time lag could be mere seconds.

Again, this reminds me of a preload library I wrote for this purpose. It intercepted all file-manipulating library calls and replicated them to a number of sync'ed servers.

> My challenge is to mirror 300 gigs of half meg files on to three remote
> filesystems, afap. Most of the files are 'write and leave for two
> months' with some being 'write 4 times a day' so rsync should be perfect
> if it can be told right after the file is created. We are installing
> fiber to our building so this might be the perfect combination to come
> up with a five second mirror performance.

I'd definitely rewrite it, since the initial implementation is no longer available to me...

Regards, JBG

-- 
Jan-Benedict Glaw       jbglaw@lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 fuer einen Freien Staat voll Freier Bürger" | im Internet! | im Irak!     O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));
On Wednesday 02 March 2005 06:11 pm, Scott Becker wrote:
> I've been researching the state of 'file alteration monitoring'
> technology on Linux. Famd uses dnotify to inefficiently monitor a handful
> of directories. The replacement for dnotify is being worked on in the
> kernel and it's called inotify. If I understand it correctly and they get
> it finished, it would be an awesome addition to rsync. With it, you
> could run rsync to update a remote system (push mode) and keep it
> up2date. With inotify efficiently feeding 'file opened for write was
> just closed' notifications to rsync, it could efficiently and
> continuously mirror an active file system. With enough bandwidth, the
> time lag could be mere seconds.
>
> My challenge is to mirror 300 gigs of half meg files on to three remote
> filesystems, afap. Most of the files are 'write and leave for two
> months' with some being 'write 4 times a day' so rsync should be perfect
> if it can be told right after the file is created. We are installing
> fiber to our building so this might be the perfect combination to come
> up with a five second mirror performance.
[...]

Have you looked into the existing solutions for this problem, along the lines of network block devices used in a RAID-1? I used that setup probably 5 years ago with a reliable network link, in order to keep 4 machines in sync. The nbd driver in Linux is much improved since then...

Using the mdutils, I'm pretty sure that the software RAID subsystem (yeah, I'm still assuming Linux) would handle disconnected remote drives, though you might have to manually bring them back into sync after network disconnections (probably with a script to monitor /proc/mdstat).

I agree that rsync is really cool, and that something like this would be handy, but I'm inclined to agree with Eli that this might be better handled by having a daemon listen for changes and then send the changed file(s) up to the destination server(s), probably using the rsync protocol.

This one-way sync thing might be even better handled by the RAID/nbd solution - since that's really what nbd was designed for. :)

--Danny
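As a rough illustration of the "script to monitor /proc/mdstat" idea above, the sketch below picks out arrays that have lost a member. It assumes the usual mdstat layout, where each array's status line carries a `[expected/active] [UU]` pair and a missing member shows up as `_`; the function name `degraded_arrays` is made up for this example, and re-adding the dropped device is left to whatever RAID tooling is in use.

```python
import re

# "md0 : active raid1 nbd0[1] sda1[0]" starts the description of an array.
_ARRAY = re.compile(r"^(md\S*)\s*:")
# "104320 blocks [2/1] [U_]" gives expected/active members and per-member flags.
_STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of arrays that are missing at least one member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        array = _ARRAY.match(line)
        if array:
            current = array.group(1)
            continue
        status = _STATUS.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded
```

A monitoring loop would read `/proc/mdstat` periodically, pass its contents to `degraded_arrays`, and alert or trigger a resync when the list is non-empty.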
On Wed, Mar 02, 2005 at 04:11:52PM -0800, Scott Becker wrote:
> Can rsync be set up with: here's the src dir, here's the remote dir, now
> here's a relative list of files I want you to sync? Perhaps being fed
> one at a time via a pipe.

Rsync supports that, but not on an incremental basis -- i.e. it won't start the transfer until EOF on the pipe. A program designed to be more incremental in its protocol would be needed to handle a steady stream of update requests.

A while back, I wrote a test-bed for a new rsync protocol that I called rZync (a bad name, but it was only for testing) that has such an incremental protocol: it is possible to send it a steady stream of "put this file here" messages, and it uses librsync to update the files. Something like that, but quite a bit simpler, could be written that makes use of the librsync library and only supports the pushing of files from one system to the other over the open connection. It would need to re-open the connection on failure, and spool the names of changed files while the connection was down so that they could be sent when the connection was up again. It would be a pretty nice way to keep two drives in sync, but I don't see it being a part of rsync itself.

..wayne..
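The spooling behaviour described above can be sketched independently of the transport. This is not rZync and does not use librsync; `ChangeSpool` and `rsync_batch_command` are hypothetical names, and the only rsync feature relied on is the `--files-from=-` option, which reads the relative file list from standard input (and, as noted above, only starts transferring at EOF, so names are sent in batches).

```python
import collections


def rsync_batch_command(src_dir, dest):
    """argv for one rsync run that reads relative file names from stdin."""
    return ["rsync", "-a", "--files-from=-", src_dir, dest]


class ChangeSpool:
    """Remembers changed file names until a sender confirms delivery."""

    def __init__(self):
        self._pending = collections.OrderedDict()

    def add(self, relative_name):
        # A file that changes several times between flushes is queued once.
        self._pending[relative_name] = None

    def pending(self):
        return list(self._pending)

    def flush(self, send_batch):
        """Hand all pending names to send_batch(list_of_names).

        Returns True on success. If send_batch raises OSError (for example
        ConnectionError), the names stay spooled for the next attempt."""
        batch = self.pending()
        if not batch:
            return True
        try:
            send_batch(batch)
        except OSError:
            return False
        for name in batch:
            self._pending.pop(name, None)
        return True
```

A sender could, for instance, pipe the newline-joined batch into the command built by `rsync_batch_command` and raise `OSError` when rsync exits with an error; calling `flush` on a timer then gives the "send when the connection is up again" behaviour.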