Phil,

I think the real question you need to ask is why we are using GlusterFS at all and what happens when something fails. Normally GlusterFS is used to provide scalability, redundancy/recovery, and performance. For many applications performance will be the least of the worries, so we concentrate on scalability and redundancy/recovery. Scalability can be achieved no matter which way you configure your servers: using the distribute translator (DHT) you can unify all the servers into a single virtual storage space. The problem comes when you look at what happens when you have a machine or drive failure and need the redundancy/recovery capabilities of GlusterFS. By putting 36TB of storage on a single server and exposing it as a single volume (using either hardware or software RAID), you will have to replicate all of it to a replacement server after a failure. Replicating 36TB will take a lot of time and CPU cycles. If you keep things simple (JBOD), use AFR to replicate drives between servers, and use DHT to unify everything together, you only have to move 1.5TB/2TB when a drive fails. You will also note that you get to use 100% of your disk storage this way, instead of losing one drive per array with RAID5 or two drives with RAID6. Normally with RAID5/6 it is also imperative that you have a hot spare per array, which means you give up an additional drive per array. To make RAID5/6 work with no single point of failure you have to do something like RAID50/60 across two controllers, which gets expensive and much more difficult to manage and to grow. Implementing GlusterFS on more modest hardware makes all those "issues" go away. Just use GlusterFS to provide the RAID-like capabilities (via AFR and DHT); a minimal volfile sketch of that layout follows the list below.

Personally I doubt that I would set up my storage the way you describe. I probably would (and have) set it up with more, smaller servers: something like three times as many 2U servers with 8 x 2TB drives each (or even six times as many 1U servers with 4 x 2TB drives each), and forget the expensive RAID SATA controllers; they aren't necessary and are just a single point of failure that you can eliminate. In addition you will enjoy significant performance improvements because you have:

1) Many parallel paths to storage (36 x 1U or 18 x 2U vs. 6 x 5U servers). Gigabit Ethernet is fast, but it will still limit bandwidth to a single machine.

2) Write performance on RAID5/6 is never going to be as fast as JBOD.

3) Much more memory available for caching (36 x 8GB = 288GB or 18 x 8GB = 144GB, vs. maybe 6 x 16GB = 96GB).

4) Management of the storage is done in one place: GlusterFS. No messy RAID controller setups to document/remember.

5) You can expand in the future in a much more granular and controlled fashion. Add 2 machines (1 for replication) and you get 8TB (using 2TB drives) of storage. When you want to replace a machine, just set up a new one, fail the old one, and let GlusterFS rebuild it for you (AFR will do the heavy lifting). CPUs will get faster, and hard drives will get faster and bigger in the future, so make it easy to upgrade. A small number of BIG machines makes it a lot harder to do upgrades as new hardware becomes available.

6) Machine failures (motherboard, power supply, etc.) will affect much less of your storage network. Having a spare 1U machine around as a hot spare doesn't cost much (maybe $1,200). Having a spare 5U monster around does (probably close to $6,000).
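As a rough illustration of that AFR-plus-DHT layout, here is a minimal client volfile sketch in 2.x/3.0-era syntax. Everything in it is an assumption for illustration: the hostnames (server1/server2), the brick names, and the server-side volfiles that would export brick1 and brick2 are hypothetical, not anyone's actual configuration.

# Client volfile sketch (GlusterFS 2.x/3.0-era syntax, hypothetical names).
# One protocol/client block per remote brick.
volume server1-brick1
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick1
end-volume

volume server1-brick2
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume brick2
end-volume

volume server2-brick1
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick1
end-volume

volume server2-brick2
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume brick2
end-volume

# AFR: each mirror pairs one drive on server1 with its twin on server2,
# so a failed drive re-replicates only that drive's 1.5-2TB.
volume mirror1
  type cluster/replicate
  subvolumes server1-brick1 server2-brick1
end-volume

volume mirror2
  type cluster/replicate
  subvolumes server1-brick2 server2-brick2
end-volume

# DHT: unify all mirrored pairs into one namespace.  Growing the cluster
# means adding machines in pairs and appending their mirrors here.
volume unified
  type cluster/distribute
  subvolumes mirror1 mirror2
end-volume

Mounting against a volfile like this gives a single namespace in which each file lives on exactly one mirrored pair of drives.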
IMHO 36 x 1U or 18 x 2U servers shouldn't cost any more (and maybe less) than the big boxes you are looking to buy. They are commodity items. If you go the 1U route you don't need anything but a machine with memory and 4 hard drives (all server motherboards come with at least 4 SATA ports). By using 2TB drives, I think you would find that the cost would actually be less. By NOT using hardware RAID you can also avoid RAID-class hard drives, which cost about $100 more apiece than non-RAID hard drives. That change alone could save you 6 x 24 = 144 drives; 144 x $100 = $14,400! JBOD just doesn't need RAID-class hard drives, because you don't need the sophisticated firmware they provide. You will still want quality hard drives, but failures will have such a low impact that it is much less of a problem.

By using more, smaller machines you also eliminate the need for redundant power supplies (which would be a requirement in your large boxes, because a power supply would be a single point of failure for a large percentage of your storage system).

Hope the information helps.

Regards,
Larry Bates

------------------------------

> Message: 6
> Date: Thu, 17 Dec 2009 00:18:54 -0600
> From: phil cryer <phil at cryer.us>
> Subject: [Gluster-users] Recommended GlusterFS configuration for 6
>       node cluster
> To: "gluster-users at gluster.org" <gluster-users at gluster.org>
> Message-ID:
>       <3a3bc55a0912162218i4e3f326cr9956dd37132bfc19 at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> We're setting up 6 servers, each with 24 x 1.5TB drives; the systems
> will run Debian testing and Gluster 3.x. The SATA RAID card offers
> RAID5 and RAID6, and we're wondering what the optimum setup would be
> for this configuration. Do we RAID5 the disks and have GlusterFS use
> them that way, or do we keep them all 'raw' and have GlusterFS handle
> the replication (though not 2x as we would have with the RAID
> options)? Obviously a lot of ways to do this, just wondering what
> GlusterFS devs and other experienced users would recommend.
>
> Thanks
>
> P
Tejas N. Bhise
2009-Dec-17 17:23 UTC
[Gluster-users] Gluster-users Digest, Vol 20, Issue 22
Thanks, Larry, for the comprehensive information.

Phil, I hope that answers a lot of your questions. Feel free to ask more, we have a great community here.

Regards,
Tejas.
This is *very* helpful, thanks for taking the time, Larry! Looking forward to giving feedback once we have the cluster up.

P

On Thu, Dec 17, 2009 at 11:23 AM, Tejas N. Bhise <tejas at gluster.com> wrote:
> Thanks, Larry, for the comprehensive information.
>
> Phil, I hope that answers a lot of your questions. Feel free to ask more, we have a great community here.

--
http://philcryer.com
Larry & All, I would much rather rebuild a bad drive with a raid controller then have to wait for Gluster to do it. With a large number of files doing a ls -aglR can take weeks. Also you don't NEED enterprise drives with a raid controller, i use desktop 1.5tb Seagate drives which happy as a clam on a 3ware SAS card under a SAS expander. liam On Thu, Dec 17, 2009 at 8:17 AM, Larry Bates <larry.bates at vitalesafe.com> wrote:> Phi.l, > > I think the real question you need to ask has to do with why we are using > GlusterFS at all and what happens when something fails. ?Normally GlusterFS > is used to provide scalability, redundancy/recovery, and performance. ?For > many applications performance will be the least of the worries so we > concentrate on scalability and redundancy/recovery. ?Scalability can be > achieved no matter which way you configure your servers. ?Using distribute > translator (DHT) you can unify all the servers into a single virtual storage > space. ?The problem comes when you look at what happens when you have a > machine/drive failures and need the redundancy/recovery capabilities of > GlusterFS. ?By putting 36Tb of storage on a single server and exposing it as > a single volume (using either hardware or software RAID), you will have to > replicate that to a replacement server after a failure. ?Replicating 36Tb > will take a lot of time and CPU cycles. ?If you keep things simple (JBOD) > and use AFR to replicate drives between servers and use DHT to unify > everything together, now you only have to move 1.5Tb/2Tb when a drive fails. > ?You will also note that you get to use 100% of your disk storage this way > instead of wasting 1 drive per array with RAID5 or two drives with RAID6. > ?Normally with RAID5/6 it is also imperative that you have a hot spare per > array, which means you waste an additional driver per array. ?To make > RAID5/6 work with no single point of failure you have to do something like > RAID50/60 across two controllers which gets expensive and much more > difficult to manage and to grow. ?Implementing GlusterFS using more modest > hardware makes all those "issues" go away. ?Just use GlusterFS to provide > the RAID-like capabilities (via AFR and DHT). > > Personally I doubt that I would set up my storage the way you describe. ?I > probably would (and have) set it up with more smaller servers. ?Something > like three times as many 2U servers with 8x2Tb drives each (or even 6 times > as many 1U servers with 4x2Tb drives each) and forget the expensive RAID > SATA controllers, they aren't necessary and are just a single point of > failure that you can eliminate. ?In addition you will enjoy significant > performance improvements because you have: > > 1) Many parallel paths to storage (36x1U or 18x2U vs 6x5U servers). ?Gigabit > Ethernet is fast, but still will limit bandwidth to a single machine. > 2) Write performance on RAID5/6 is never going to be as fast as JBOD. > 3) You should have much more memory caching available (36x8Gb = 256Gb memory > or 18x8Gb memory = 128Gb vs maybe 6x16Gb = 96Gb) > 4) Management of the storage is done in one place..GlusterFS. ?No messy RAID > controller setups to document/remember. > 5) You can expand in the future in a much more granular and controlled > fashion. ?Add 2 machines (1 for replication) and you get 8Tb (using 2Tb > drives) of storage. ?When you want to replace a machine, just set up new > one, fail the old one, and let GlusterFS build the new one for you (AFR will > do the heavy lifting). 
?CPUs will get faster, hard drives will get faster > and bigger in the future, so make it easy to upgrade. ?A small number of BIG > machines makes it a lot harder to do upgrades as new hardware becomes > available. > 6) Machine failures (motherboard, power supply, etc.) will effect much less > of your storage network. ?Having a spare 1U machine around as a hot spare > doesn't cost much (maybe $1200). ?Having a spare 5U monster around does > (probably close to $6000). > > IMHO 36 x 1U or 18 x 2U servers shouldn't cost any more (and maybe less) > than the big boxes you are looking to buy. ?They are commodity items. ?If > you go the 1U route you don't need anything but a machine, with memory and 4 > hard drives (all server motherboards come with at least 4 SATA ports). ?By > using 2Tb drives, I think you would find that the cost would be actually > less. ?By NOT using hardware RAID you can also NOT use RAID-class hard > drives which cost about $100 each more than non-RAID hard drives. ?Just that > change alone could save you 6 x 24 = 144 x $100 = $14,400! ?JBOD just > doesn't need RAID-class hard drives because you don't need the sophisticated > firmware that the RAID-class hard drives provide. ?You still will want > quality hard drives, but failures will have such a low impact that it is > much less of a problem. > > By using more smaller machines you also eliminate the need for redundant > power supplies (which would be a requirement in your large boxes because it > would be a single point of failure on a large percentage of your storage > system). > > Hope the information helps. > > Regards, > Larry Bates > > > ------------------------------ >> >> Message: 6 >> Date: Thu, 17 Dec 2009 00:18:54 -0600 >> From: phil cryer <phil at cryer.us> >> Subject: [Gluster-users] Recommended GlusterFS configuration for 6 >> ? ? ? ?node ? ?cluster >> To: "gluster-users at gluster.org" <gluster-users at gluster.org> >> Message-ID: >> ? ? ? ?<3a3bc55a0912162218i4e3f326cr9956dd37132bfc19 at mail.gmail.com> >> Content-Type: text/plain; charset=UTF-8 >> >> We're setting up 6 servers, each with 24 x 1.5TB drives, the systems >> will run Debian testing and Gluster 3.x. ?The SATA RAID card offers >> RAID5 and RAID6, we're wondering what the optimum setup would be for >> this configuration. ?Do we RAID5 the disks, and have GlusterFS use >> them that way, or do we keep them all 'raw' and have GlusterFS handle >> the replication (though not 2x as we would have with the RAID >> options)? ?Obviously a lot of ways to do this, just wondering what >> GlusterFS devs and other experienced users would recommend. >> >> Thanks >> >> P >> > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > http://gluster.org/cgi-bin/mailman/listinfo/gluster-users >
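The recursive listing Liam mentions is how AFR self-heal was triggered in this era: a file is only repaired when a client looks it up, so after replacing a drive or server you walk the entire mount. A minimal sketch of the commonly documented traversal, assuming a hypothetical mount point of /mnt/glusterfs:

# Touch every file on the Gluster mount so AFR compares replicas and
# heals stale copies (2.x/3.0-era workflow; /mnt/glusterfs is assumed).
# On millions of files this is the days-long walk Liam describes.
find /mnt/glusterfs -noleaf -print0 | xargs -0 --no-run-if-empty stat > /dev/null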