thr3ads.net - CentOS - [CentOS] ZFS on Linux testing [Dec 2013]


Chuck Munro

2013-Dec-14 16:50 UTC

[CentOS] ZFS on Linux testing

On 12/14/2013, 04:00 , lists at benjamindsmith.com wrote:
> We checked lsyncd out and it's most certainly a very interesting tool.
> I *will* be using it in the future!
>
> However, we found that it has some issues scaling up to really big file
> stores that we haven't seen (yet) with ZFS.
>
> For example, the first thing it has to do when it comes online is a
> full rsync of the watched file area. This makes sense; you need to do
> this to ensure integrity. But if you have a large file store, e.g. many
> millions of files and dozens of TB, then this first step can take days,
> even if the window of downtime is mere minutes due to a restart. Since
> we're already at this stage now (and growing rapidly!) we've decided to
> keep looking for something more elegant and ZFS appears to be almost an
> exact match. We have not tested the stability of lsyncd managing the
> many millions of inode write notifications in the meantime, but just
> trying to satisfy the write needs for two smaller customers (out of
> hundreds) with lsyncd led to crashes and the need to modify kernel
> parameters.
>
> As another example, lsyncd solves a (highly useful!) problem of
> replication, which is a distinctly different problem than backups.
> Replication is useful, for example as a read-only cache for remote
> application access, or for disaster recovery with near-real-time
> replication, but it's not a backup. If somebody deletes a file
> accidentally, you can't go to the replicated host and expect it to be
> there. And unless you are lsyncd'ing to a remote file system with its
> own snapshot capability, there isn't an easy way to version a backup
> short of running rsync (again) on the target to create hard links or
> something - itself a very slow, intensive process with very large
> filesystems. (days)
>
> I'll still be experimenting with lsyncd further to evaluate its real
> usefulness and performance compared to ZFS and report results. As said
> before, we'll know much more in another month or two once our next
> stage of roll out is complete.
>
> -Ben
Hi Ben,

Yes, the initial replication of a large filesystem is *very* time 
consuming!  But it makes sleeping at night much easier.  I did have to 
crank up the inotify kernel parameters by a significant amount.
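
For anyone sizing those inotify limits: each watched directory consumes
one inotify watch, so the number of directories under the watched tree
gives a rough lower bound for fs.inotify.max_user_watches. The sketch
below is only an illustration of that arithmetic, not something from my
setup; the helper names and the 25% headroom factor are assumptions.

```python
import os
from pathlib import Path

# Kernel-exposed per-user limit on inotify watches (Linux only).
INOTIFY_LIMIT = Path("/proc/sys/fs/inotify/max_user_watches")


def count_directories(root):
    """Count root plus every directory beneath it (one watch each)."""
    total = 0
    # os.walk yields one tuple per directory and does not follow symlinks.
    for _dirpath, _dirnames, _filenames in os.walk(root):
        total += 1
    return total


def read_watch_limit(path=INOTIFY_LIMIT):
    """Return the current watch limit, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def watches_sufficient(needed, limit, headroom=1.25):
    """True if the limit covers the needed watches plus some headroom.

    The headroom factor is an arbitrary assumption for illustration.
    """
    return limit is not None and limit >= needed * headroom
```

Calling count_directories() on the replicated tree and comparing the
result with read_watch_limit() shows whether the limit needs raising
before lsyncd is started. Note that walking a tree with millions of
entries takes a while in itself.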

I did the initial replication using rsync directly, rather than asking 
lsyncd to do it.  I notice that if I reboot the primary server, it takes 
a while for the inotify tables to be rebuilt ... after that it's smooth 
sailing.

If you want to prevent deletion of files from your replicated filesystem 
(which I do), you can modify the rsync{} array in the lsyncd.lua file by 
adding the line 'delete = false' to it.  This has saved my butt a few 
times when a user has accidentally deleted a file on the primary server.
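
To illustrate what that setting changes, here is a small sketch that
builds a plain rsync command line with and without deletion. It assumes
that lsyncd's 'delete' option ends up controlling whether rsync is run
with its --delete flag; the function name, paths, and host name are
hypothetical, and this is not lsyncd's own code.

```python
import shlex


def rsync_command(source, target, delete=False):
    """Build an rsync argument list for mirroring source into target.

    With delete=False, files removed from the source are left in place
    on the target; with delete=True, rsync removes them there as well.
    """
    # A trailing slash makes rsync copy the directory's contents rather
    # than creating the directory itself inside the target.
    src = source.rstrip("/") + "/"
    cmd = ["rsync", "--archive"]
    if delete:
        cmd.append("--delete")
    cmd += [src, target]
    return cmd


def as_shell(cmd):
    """Render the argument list as a copy-and-paste shell command."""
    return shlex.join(cmd)
```

For example, as_shell(rsync_command("/srv/data", "replica:/srv/data"))
gives a command suitable for the initial seeding run, and passing
delete=True gives the strict-mirror variant.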

I agree that filesystem replication isn't really a backup, but for now 
it's all I have available; at least the replicated fs is on a 
separate machine.

As a side note for anyone using a file server for hosting OS-X Time 
Machine backups, the 'delete' parameter in rsync{} must be set to 'true' 
in order to prevent chaos should a user need to point their Mac at the 
replicated filesystem (which should be a very rare event).  I put all TM 
backups in a separate ZFS sub-pool for this reason.

Chuck

Lists

2013-Dec-18 01:59 UTC


[CentOS] ZFS on Linux testing

On 12/14/2013 08:50 AM, Chuck Munro wrote:
> Hi Ben,
>
> Yes, the initial replication of a large filesystem is *very* time
> consuming!  But it makes sleeping at night much easier.  I did have to
> crank up the inotify kernel parameters by a significant amount.
>
> I did the initial replication using rsync directly, rather than asking
> lsyncd to do it.  I notice that if I reboot the primary server, it takes
> a while for the inotify tables to be rebuilt ... after that it's smooth
> sailing.
I may be being presumptuous, and if so, I apologize in advance...

It sounds to me like you might consider a disk-to-disk backup solution. 
I could suggest dirvish, BackupPC, or our own home-rolled rsync-based 
solution that works rather well: http://www.effortlessis.com/backupbuddy/

Note that with these solutions you get multiple save points that are 
deduplicated with hardlinks so you can (usually) keep dozens of save 
points in perhaps 2x the disk space of a single copy. Also, because of 
this, you can go back a few days / weeks / whatever when somebody 
deletes a file. In our case, we make the backed up directories available 
via read-only ftp so that end users can recover their files.
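
The hard-link trick behind those save points can be sketched roughly as
follows. This is a simplified illustration of the general technique, not
how dirvish, BackupPC, or backupbuddy are actually implemented; the
function names are made up, "unchanged" is assumed to mean same size and
same whole-second mtime, and only regular files are handled.

```python
import os
import shutil


def _unchanged(current, old):
    """Treat a file as unchanged if its size and mtime (seconds) match."""
    a, b = os.stat(current), os.stat(old)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def snapshot(source, dest, previous=None):
    """Create a save point of source in dest.

    Files unchanged since the previous save point are hard-linked to the
    old copy, so they take no extra data space; everything else is copied.
    Returns a (linked, copied) tuple of file counts.
    """
    linked = copied = 0
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and _unchanged(src, old):
                os.link(old, dst)        # same inode, no new data blocks
                linked += 1
            else:
                shutil.copy2(src, dst)   # copy2 preserves the mtime
                copied += 1
    return linked, copied
```

Each save point then looks like a complete copy of the tree, and
expiring an old one is just a matter of removing its directory: data
blocks are freed only when the last hard link to them goes away. Hard
links require all save points to live on the same filesystem.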

I don't know if dirvish offers this, but backupbuddy also allows you to 
run pre and post backup shell scripts, which we use (for example) for 
off-site archiving to permanent storage since backup save points expire.

-Ben
> If you want to prevent deletion of files from your replicated filesystem
> (which I do), you can modify the rsync{} array in the lsyncd.lua file by
> adding the line 'delete = false' to it.  This has saved my butt a few
> times when a user has accidentally deleted a file on the primary server.
>
> I agree that filesystem replication isn't really a backup, but for now
> it's all I have available, but at least the replicated fs is on a
> separate machine.
>
> As a side note for anyone using a file server for hosting OS-X Time
> Machine backups, the 'delete' parameter in rsync{} must be set to 'true'
> in order to prevent chaos should a user need to point their Mac at the
> replicated filesystem (which should be a very rare event).  I put all TM
> backups in a separate ZFS sub-pool for this reason.
>
