On 12/14/2013, 04:00, lists at benjamindsmith.com wrote:
> We checked lsyncd out and it's most certainly a very interesting tool.
> I *will* be using it in the future!
>
> However, we found that it has some issues scaling up to really big file
> stores that we haven't seen (yet) with ZFS.
>
> For example, the first thing it has to do when it comes online is a
> full rsync of the watched file area. This makes sense; you need to do
> this to ensure integrity. But if you have a large file store, EG: many
> millions of files and dozens of TB then this first step can take days,
> even if the window of downtime is mere minutes due to a restart. Since
> we're already at this stage now (and growing rapidly!) we've decided to
> keep looking for something more elegant and ZFS appears to be almost an
> exact match. We have not tested the stability of lsyncd managing the
> many millions of inode write notifications in the meantime, but just
> trying to satisfy the write needs for two smaller customers (out of
> hundreds) with lsyncd led to crashes and the need to modify kernel
> parameters.
>
> As another example, lsyncd solves a (highly useful!) problem of
> replication, which is a distinctly different problem than backups.
> Replication is useful, for example as a read-only cache for remote
> application access, or for disaster recovery with near-real-time
> replication, but it's not a backup. If somebody deletes a file
> accidentally, you can't go to the replicated host and expect it to be
> there. And unless you are lsyncd'ing to a remote file system with its
> own snapshot capability, there isn't an easy way to version a backup
> short of running rsync (again) on the target to create hard links or
> something - itself a very slow, intensive process with very large
> filesystems. (days)
>
> I'll still be experimenting with lsyncd further to evaluate its real
> usefulness and performance compared to ZFS and report results. As said
> before, we'll know much more in another month or two once our next stage
> of roll out is complete.
>
> -Ben
Hi Ben,
Yes, the initial replication of a large filesystem is *very* time
consuming! But it makes sleeping at night much easier. I did have to
crank up the inotify kernel parameters by a significant amount.
I did the initial replication using rsync directly, rather than asking
lsyncd to do it. I notice that if I reboot the primary server, it takes
a while for the inotify tables to be rebuilt ... after that it's smooth
sailing.
If you want to prevent deletion of files from your replicated filesystem
(which I do), you can modify the rsync{} array in the lsyncd.lua file by
adding the line 'delete = false' to it. This has saved my butt a few
times when a user has accidentally deleted a file on the primary server.
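As a rough sketch of what that setting looks like (the paths and host name
below are made-up placeholders, and this assumes the lsyncd 2.1.x config
layout, where the documentation shows 'delete' as a key of the sync{} block
next to the rsync{} table; other versions may place it differently, so
check the manual for the version you run):

```lua
-- lsyncd.lua: minimal sketch only; source, target and host are hypothetical
sync {
    default.rsync,
    source = "/tank/data",                      -- hypothetical watched tree
    target = "replica.example.com:/tank/data",  -- hypothetical replica host
    delete = false,   -- do not remove files on the target when they are
                      -- deleted on the source
    rsync = {
        archive = true,  -- preserve permissions, times, symlinks, etc.
    },
}
```

With deletes turned off, the replica accumulates files that were removed
on the primary, so it will slowly drift away from being an exact mirror.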
I agree that filesystem replication isn't really a backup. For now
it's all I have available, but at least the replicated fs is on a
separate machine.
As a side note for anyone using a file server to host OS X Time
Machine backups, the 'delete' parameter in rsync{} must be set to 'true'
in order to prevent chaos should a user need to point their Mac at the
replicated filesystem (which should be a very rare event). I put all TM
backups in a separate ZFS sub-pool for this reason.
Chuck