Stas Oskin
2008-Dec-03 09:30 UTC
[Gluster-users] Recommended underlining disk storage environment
Hi.

What is the recommended underlying disk storage environment for GlusterFS?

* file-system - EXT3, XFS...
* technology - RAID0 (stripe) or LVM 1/2

Regards.
Keith Freedman
2008-Dec-03 10:40 UTC
[Gluster-users] Recommended underlining disk storage environment
I'm not sure there's an official recommendation. I use XFS with much success.

I think the choice of underlying filesystem depends highly on the types of data you'll be storing and how you'll be storing it. If it's primarily read data, then a filesystem with journaling capabilities may not provide much benefit. If you'll have lots of files in few directories, then a filesystem with better large-directory metrics would be ideal, etc. Gluster depends on the underlying filesystem and will work no matter what that filesystem is, provided it supports extended attributes.

I've found XFS works great for most purposes. If you're on Solaris, I'd recommend ZFS. Some people are fond of ReiserFS, but you could certainly use EXT3 with extended attributes enabled and most likely be just fine.

As for LVM: again, this really depends on what you want to do with the data. If you need to use multiple physical devices/partitions to present just one to gluster, you can do that and use LVM to manage the resizing of the single logical volume. Alternatively, you could use gluster's Unify translator to present one effective large/consolidated volume made up of multiple devices/partitions. In this scenario, you could potentially have multiple underlying configurations: you could Unify xfs, reiser, and ext3 filesystems into one gluster filesystem.

As for RAID: again, the faster and more appropriately configured the underlying system is for your data requirements, the better off you will be. If you're going to use gluster's AFR translator, then I'd not bother with hardware RAID/mirroring and would just use RAID0 stripes. However, if you have the money and can afford RAID0+1, that's always a huge benefit for read performance. Of course, if you're in a high-write environment, there's no real added value, so it's not worth doing.

This doesn't really answer your question, but hopefully it helps.

At 01:30 AM 12/3/2008, Stas Oskin wrote:
>Hi.
>
>What is the recommended underlying disk storage environment for GlusterFS?
>
>* file-system - EXT3, XFS...
>* technology - RAID0 (stripe) or LVM 1/2
>
>Regards.
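On the extended-attributes point above: XFS has them available by default, while an EXT3 brick is typically mounted with the user_xattr option turned on. A minimal sketch, using a made-up device and mount point rather than anything from this thread:

    # remount an existing ext3 backend directory with user extended attributes
    mount -o remount,user_xattr /export/brick1

    # or make it permanent in /etc/fstab (device and path are illustrative)
    /dev/sdb1  /export/brick1  ext3  defaults,user_xattr  0  2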
Stas Oskin
2008-Dec-03 12:00 UTC
[Gluster-users] Recommended underlining disk storage environment
Hi.

Thanks for your detailed answers. I'd like to clarify several points:

2008/12/3 Keith Freedman <freedman at freeformit.com>

> I'm not sure there's an official recommendation.
> I use XFS with much success.

Is XFS suitable for massive writing / occasional reading?

> I think the choice of underlying filesystem depends highly on the types of
> data you'll be storing and how you'll be storing it. If it's primarily read
> data, then a filesystem with journaling capabilities may not provide much
> benefit. If you'll have lots of files in few directories, then a filesystem
> with better large-directory metrics would be ideal, etc. Gluster depends on
> the underlying filesystem and will work no matter what that filesystem is,
> provided it supports extended attributes.

I'm going to store mostly large files (100+ MB), with massive writing and only occasional read operations.

> I've found XFS works great for most purposes. If you're on Solaris, I'd
> recommend ZFS. Some people are fond of ReiserFS, but you could certainly
> use EXT3 with extended attributes enabled and most likely be just fine.

I actually prefer to stay on Linux. How does XFS compare to EXT3 in the environment I described?

> As for LVM: again, this really depends on what you want to do with the data.
> If you need to use multiple physical devices/partitions to present just one
> to gluster, you can do that and use LVM to manage the resizing of the single
> logical volume.

This was the first idea I thought about, as I'm going to use 4 disks per server.

> Alternatively, you could use gluster's Unify translator to present one
> effective large/consolidated volume made up of multiple devices/partitions.

I think I read somewhere on this mailing list that there is a migration from Unify to DHT (whatever that means) in the coming GlusterFS 1.4. If Unify is the legacy approach, what is the relevant solution for 1.4 (DHT)?

> In this scenario, you could potentially have multiple underlying
> configurations: you could Unify xfs, reiser, and ext3 filesystems into one
> gluster filesystem.
>
> As for RAID: again, the faster and more appropriately configured the
> underlying system is for your data requirements, the better off you will be.
> If you're going to use gluster's AFR translator, then I'd not bother with
> hardware RAID/mirroring and would just use RAID0 stripes. However, if you
> have the money and can afford RAID0+1, that's always a huge benefit for read
> performance. Of course, if you're in a high-write environment, there's no
> real added value, so it's not worth doing.

A couple of points here:
1) Thanks to AFR, I actually don't need any fault-tolerant RAID (like mirroring), so it's only recommended in high-volume read environments, which is not the case here. Is this correct?
2) Isn't LVM (or GlusterFS's own solution) much better than RAID 0, in the sense that if one of the disks goes, the volume still continues to work? This in contrast to RAID, where the whole volume goes down?
3) Continuing 2, I think I actually meant JBOD - where you just connect all the drives and make them appear as a single device, rather than striping. If you could clarify the recommended approach, it would be great.

> This doesn't really answer your question, but hopefully it helps.

Thanks again for your help.

Regards.
Keith Freedman
2008-Dec-03 18:25 UTC
[Gluster-users] Recommended underlining disk storage environment
At 04:00 AM 12/3/2008, Stas Oskin wrote:
>Hi.
>
>Thanks for your detailed answers. I'd like to clarify several points:
>
>Is XFS suitable for massive writing / occasional reading?

XFS is more optimal than EXT3 or ReiserFS for write environments. Some useful information is here: http://www.ibm.com/developerworks/library/l-fs9.html
I'd pay close attention to the "Delayed allocation" section.

>I'm going to store mostly large files (100+ MB), with massive writing and
>only occasional read operations.
>
>I actually prefer to stay on Linux. How does XFS compare to EXT3 in the
>environment I described?

They're all Linux filesystems, so that's not the issue.

>This was the first idea I thought about, as I'm going to use 4 disks per
>server.
>
>I think I read somewhere on this mailing list that there is a migration from
>Unify to DHT (whatever that means) in the coming GlusterFS 1.4. If Unify is
>the legacy approach, what is the relevant solution for 1.4 (DHT)?

The approach is the same. I believe the concept is that there's a translator that groups multiple smaller filesystem pieces into a single representation. Gluster lets you do this through the filesystem, where LVM lets you do it through the block devices.

Personally, I'd go with LVM, since it's likely easier to manage in the long run and gives you more flexibility. You can grow your LVM volume and, if you go with XFS, dynamically resize your filesystem, and you won't have to make any changes to your gluster config.
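In practice that grow is a two-step operation: extend the logical volume first, then grow the filesystem sitting on it. A rough sketch, with a hypothetical volume group vg0 and mount point /export/bricks (neither name comes from this thread):

    # add a disk to the volume group if it has no free extents left
    pvcreate /dev/sde1
    vgextend vg0 /dev/sde1

    # extend the logical volume, then grow XFS online (no unmount needed)
    lvextend -L +100G /dev/vg0/bricks
    xfs_growfs /export/bricks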
>A couple of points here:
>1) Thanks to AFR, I actually don't need any fault-tolerant RAID (like
>mirroring), so it's only recommended in high-volume read environments, which
>is not the case here. Is this correct?

You can use AFR as your fault tolerance/mirror. However, be aware that this means your "mirroring" will be going at network speed. If you have no need for multiple servers with live replicated data, you'll be much better off, performance-wise, using hardware mirroring. However, if you want/need multiple servers serving identical data, then just use AFR and you can live without hardware mirroring.

I'm not sure how gluster/AFR will perform in a very-large-file, high-write environment. We'll have to see what the gluster devs say about it, but what I can say is this: in the event your AFR servers lose contact and later have to auto-heal, gluster will have to move the entire large file. As far as I know it doesn't have rsync-like capabilities wherein it would only move the modified bits of the file over the network; I believe it just copies over the whole thing, so if this happens a lot, it will bog things down significantly.

>2) Isn't LVM (or GlusterFS's own solution) much better than RAID 0, in the
>sense that if one of the disks goes, the volume still continues to work?
>This in contrast to RAID, where the whole volume goes down?

You're confused about what RAID means. Yes, with RAID0 (striping) there is no redundancy. RAID1 (mirroring) provides redundancy, and if one drive fails the volume still functions; you can do this with hardware or, I believe, with LVM. Then there's RAID0+1 (striping & mirroring), which provides the performance benefit of striping with the high availability of mirroring. So whether you use LVM for your RAID or a hardware RAID controller doesn't change anything: with RAID0 a single failure takes the volume down, with RAID1 you can withstand a drive failure.

>3) Continuing 2, I think I actually meant JBOD - where you just connect all
>the drives and make them appear as a single device, rather than striping.

Right, however this presents the same issues as striping but without the performance benefit of striping. Let's say you have AFR set up and you have a 4-disk striped or concatenated (JBOD) volume on each of 2 servers. If you have a single drive failure on one server, that entire filesystem becomes unavailable. When you repair your drive, you effectively have a blank, empty filesystem; gluster/AFR will notice this and start auto-healing the entire filesystem (as each directory and file is accessed), so in time you'll have copied the entire filesystem over the network. However, if you have a single server and you mirror your devices in a RAID1/0+1 config, then when you lose a drive your filesystem is still running; replace the drive and the RAID software fixes everything.

AFR is much more efficient in high-read environments, since you can distribute the load across multiple servers and specify a local read volume to ensure a particular client always uses the fastest server (which could be its own local brick, or a server on a LAN when you're using AFR across a WAN).

>If you could clarify the recommended approach, it would be great.

So here's a summary: IF you do NOT need more than one server serving the data (i.e., you're not going to replicate the data for DR purposes), I'd recommend you avoid AFR in gluster and instead configure RAID0+1 on your server.
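As a sketch of what that could look like with Linux software RAID: md's RAID10 level combines the striping and mirroring described above. The mdadm invocation and device names below are illustrative assumptions, not something prescribed in the thread:

    # build a 4-disk RAID10 array, then put XFS on it (device names are made up)
    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
    mkfs.xfs /dev/md0
    mount /dev/md0 /export/brick1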
You'd be better off using a hardware RAID controller with a large battery-backed cache, but you could use software RAID (like LVM).

If you had said you had a high-read environment, I'd have suggested 2 servers using AFR over a private high-speed network, since that reduces your points of failure; but given the high-write, large-file environment, AFR may become a bottleneck. Again: if you NEED server redundancy, then AFR is your best option, but if you don't need it, it will just slow things down.

>Thanks again for your help.
>
>Regards.
Keith Freedman
2008-Dec-06 00:36 UTC
[Gluster-users] Recommended underlining disk storage environment
At 04:15 PM 12/5/2008, Stas Oskin wrote:
>Hi.
>
>Thanks so much for your replies, they have given me a good head start.
>
>A few remaining questions:
>
>>you first expand the underlying block device with LVM, then you grow
>>your filesystem. Some filesystems support this, some don't.
>
>Isn't this usually reversed - first you grow the underlying file-system,
>then you increase the LVM size?

The filesystem cannot exceed the size of the device it's sitting on. If the block device or logical volume is 200GB, you can't expand the filesystem. So you first expand the volume/block device to 300GB, then grow the filesystem to 300GB, for example.

>>if you have 3 drives striped together and one filesystem on top of them,
>>then you will have a problem.
>>if you have 3 drives each with their own filesystem on top and you "unify"
>>that with gluster or something, then you can keep running but will lose
>>access to those files.
>
>Actually, this sounds like a good idea! By having all the drives unified via
>GlusterFS, this basically means any of them could be lost, but it won't
>influence the other drives on the same server.
>
>Have you ever tried such a setup?

Not with gluster. And there are performance advantages: with an LVM stripe, your data reads are distributed over multiple physical devices, whereas with Unify you'd be reading any individual file from only one spindle. However, this is the price we pay for availability, so I think it depends on your performance requirements. If you don't need blazing fast reads, then Unify will give you better availability.

>Also, I presume it would still be possible to have one of the disks function
>as the system disk? In the event it's lost, a simple restore of the root,
>boot and swap partitions to a new disk + AFR healing for the data should do
>the job. What do you think?

Any sub-directory can be the root of the gluster filesystem, so you could have this example:

/dev/sda1  /
/dev/sda2  /boot
/dev/sda3  /home
/dev/sdb   /home2
/dev/sdc1  /home3
/dev/sdc2  /junk

and then unify /tree1 with /home, /home2, /home3, /junk/stuff/home4, or something like that.

>>at some point you'll saturate something. You'll either saturate your disk
>>I/O or your network, most likely the network, so try and make sure that the
>>network you use for the AFR connections doesn't have anything else competing
>>for the bandwidth, and I think you'll be fine.
>
>This makes sense indeed.
>
>By the way, how do you manage all the bricks?
>Do you have some centralized way to add new bricks and update the config
>files for clients/servers?

My configuration is pretty simple. I have one brick on each server, using AFR between them. However, I believe they have a few features targeted for 1.5 which will allow dynamic reconfiguration, as well as a configuration editor/manager, which will simplify things. Once you're comfortable with the way the config files are parsed, you'll get the hang of it; but if you're going to re-configure your setup frequently, then it'll get inconvenient pretty quickly.

Keith
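For readers trying to picture "one brick on each server, using AFR between them": a client-side volume spec from that era looked roughly like the sketch below. The host and volume names are invented for illustration, and option spellings varied a bit between GlusterFS releases, so treat this as a sketch rather than a copy of Keith's actual config:

    volume remote1
      type protocol/client
      option transport-type tcp/client
      option remote-host server1          # hypothetical host name
      option remote-subvolume brick
    end-volume

    volume remote2
      type protocol/client
      option transport-type tcp/client
      option remote-host server2          # hypothetical host name
      option remote-subvolume brick
    end-volume

    # AFR keeps the two remote bricks as live replicas of each other
    volume afr0
      type cluster/afr
      subvolumes remote1 remote2
    end-volume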