Uwe Schuerkamp <uwe.schuerkamp at nionex.net> 2009-03-13 10:42:
> Hi folks,
>
> I was wondering what is a good backup strategy for ocfs2 based
> clusters.
>
>
> Background: We're running a cluster of 8 SLES 10 SP2 machines sharing
> a common SAN-based FS (/shared) which is about 350 GB in size at the
> moment. We've already taken care of the usual optimizations concerning
> mount options on the cluster nodes (noatime and so on), but our backup
> software (bacula 2.2.8) slows to a crawl when encountering directories
> in this filesystem that contain quite a few small files. Data rates
> usually average in the tens of MB/sec doing "normal" backups of local
> filesystems on remote machines in the same LAN, but with the ocfs2 fs
> bacula is hard pressed to stay above 1 MB/sec sustained throughput,
> which obviously isn't enough to back up 350 GB of data in a sensible
> timeframe.
>
> I've already tried disabling compression, rsync'ing to another server
> and so on, but so far nothing has helped with improving data rates.
>
> How would reducing the number of cluster nodes help with backups? Is
> there a "dirty read" option in ocfs2 that would allow reading the
> files without locking them first or something similar? I don't think
> bacula is the culprit as it easily manages larger backups in the same
> environment; even reading off SMB shares is orders of magnitude faster
> in this case, so my guess is I'm missing some non-obvious
> optimization that would improve ocfs2 cluster performance.
>
> Thanks in advance for any pointers & all the best,
>
>
> Uwe
This clearly may not work for all cases and I'm sure is totally
unsupported, but our SAN (Equallogic) has the ability to take RW
snapshots, which is where we do our backups from. There was a thread a
while back about the proper way to do this. Basically, after taking the
snapshot you need to fix up the filesystem in a couple of different ways
(fsck, relabel, new UUID, etc.) so that the machine can mount several of
these at once. If anyone's interested I can post these scripts.

Since there's only one machine handling the snapshots and it's outside
of the real ocfs2 cluster, while we're doing the fixups we also convert
the snapshot to a local (non-clustered) filesystem and finally remount
it read-only. This prevents all network locking from happening (since
it's unnecessary) while the backups happen. We're doing this with a 2 TB
mail volume (~700 GB of _many_ small files) and haven't noticed any
problems with it.
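For reference, the fixup boils down to something like the sketch below.
This is a rough outline only: the device path and mount point are
examples, and the exact tunefs.ocfs2 options vary between ocfs2-tools
versions (newer tools also have --cloned-volume, which resets UUID and
label in one step), so check your man pages before trying it.

    # /dev/sdx1 = the SAN snapshot as presented to the backup host (example)
    fsck.ocfs2 -y /dev/sdx1                     # repair the snapshot image
    tunefs.ocfs2 -U /dev/sdx1                   # generate a fresh UUID
    tunefs.ocfs2 -L backup_snap /dev/sdx1       # give it a new label
    tunefs.ocfs2 --fs-features=local /dev/sdx1  # mark it a local (non-cluster) fs
    mount -t ocfs2 -o ro /dev/sdx1 /mnt/backup  # mount read-only for the backup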
I think you could probably achieve something similar by taking the
number of active nodes in the cluster down to 1 during your backup
window, but that has its own problems to be concerned with. I think a
simple "umount /shared" on all but that one node would do it (see the
sketch below).
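Something along these lines from a management host would do the trick
(the hostnames are placeholders, and you'd want to be sure nothing on
those nodes still holds files open under /shared before unmounting):

    # unmount the shared fs everywhere except node1, which runs the backup
    for host in node2 node3 node4 node5 node6 node7 node8; do
        ssh root@$host umount /shared
    done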
Brian