Good morning,

my comment won't help you directly, but I thought I'd send it anyway...

Our first glusterfs setup had 3 servers with 4 disks = bricks (10 TB, JBOD) each. It ran fine in the beginning, but then 1 disk failed. The following heal took ~1 month, with bad performance (quite high IO). Shortly after the heal had finished, another disk failed -> same problems again. Not funny.

For our new system we decided to use 3 servers with 10 disks (10 TB) each, but now the 10 disks are in SW RAID 10 (well, we split the 10 disks into 2 SW RAID 10 arrays, each of them is a brick, so we have 2 gluster volumes). A lot of disk space is "wasted" with this type of SW RAID and a replica 3 setup, but we wanted to avoid the "healing takes a long time with bad performance" problem. Now mdadm takes care of replicating data, and glusterfs should always see "good" bricks.

The decision may also depend on what kind of data you have. Many small files, like tens of millions? Or fewer, but bigger files? I once watched a video (I think it was this one: https://www.youtube.com/watch?v=61HDVwttNYI). The recommendation there: RAID 6 or 10 for small files, for big files... well, it's already 2 years "old" ;-)

As I said, this won't help you directly. You have to identify what's most important for your scenario; as you said, high performance is not an issue - if that still holds when performance degrades somewhat after a disk failure, then fine. My experience so far: the bigger and slower the disks and the more data you have -> healing will hurt -> try to avoid it. If the disks are small and fast (SSDs), healing will be faster -> JBOD is an option.

hth,
Hubert

On Wed, Jun 5, 2019 at 11:33, Eduardo Mayoral <emayoral at arsys.es> wrote:
>
> Hi,
>
> I am looking into a new gluster deployment to replace an ancient one.
>
> For this deployment I will be using some repurposed servers I already
> have in stock. The disk specs are 12 * 3 TB SATA disks, no HW RAID
> controller. They also have some SSDs which it would be nice to leverage
> as cache or similar to improve performance, since they are already there.
> Advice on how to leverage the SSDs would be greatly appreciated.
>
> One of the design choices I have to make is using 3 nodes for a
> replica 3 with JBOD, or using 2 nodes with a replica 2 and SW RAID 6
> for the disks, maybe adding a 3rd node with a smaller amount of disk
> as a metadata node for the replica set. I would love to hear advice on
> the pros and cons of each setup from the gluster experts.
>
> The data will be accessed from 4 to 6 systems with native gluster, not
> sure if that makes any difference.
>
> The amount of data I have to store there is currently 20 TB, with
> moderate growth. iops are quite low, so high performance is not an
> issue. The data will fit in either of the two setups.
>
> Thanks in advance for your advice!
>
> --
> Eduardo Mayoral Jimeno
> Systems engineer, platform department. Arsys Internet.
> emayoral at arsys.es - +34 941 620 105 - ext 2153
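For readers trying to picture the layout Hubert describes, here is a minimal sketch of one way it could be built. The hostnames, device names and exact options are hypothetical, not taken from his setup; adjust to your own hardware.

    # On each of the 3 servers: build two SW RAID 10 arrays from the local
    # disks (here 10 disks split 5+5; mdadm's raid10 driver also accepts
    # odd device counts).
    mdadm --create /dev/md0 --level=10 --raid-devices=5 /dev/sd[b-f]
    mdadm --create /dev/md1 --level=10 --raid-devices=5 /dev/sd[g-k]
    mkfs.xfs -i size=512 /dev/md0 && mkfs.xfs -i size=512 /dev/md1
    mkdir -p /bricks/brick1 /bricks/brick2
    mount /dev/md0 /bricks/brick1 && mount /dev/md1 /bricks/brick2

    # From one node: one replica-3 volume per RAID array, so every brick
    # gluster sees is already redundant at the mdadm level.
    gluster volume create vol1 replica 3 \
        srv1:/bricks/brick1/data srv2:/bricks/brick1/data srv3:/bricks/brick1/data
    gluster volume create vol2 replica 3 \
        srv1:/bricks/brick2/data srv2:/bricks/brick2/data srv3:/bricks/brick2/data
    gluster volume start vol1 && gluster volume start vol2

With this pattern a failed disk is rebuilt by mdadm inside one node, so gluster itself never has to run a month-long heal against a replacement brick.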
Your comment actually helps me more than you think. One of the main doubts I have is whether to go for JBOD with replica 3 or SW RAID 6 with replica 2 + arbiter. Before reading your email I was leaning more towards JBOD, as the reconstruction of a moderately big RAID 6 with mdadm can be painful too. Now I see a rebuild is going to be painful either way...

For the record, the workload I am going to migrate is currently 18,314,445 MB and 34,752,784 inodes (which is not exactly the same as the number of files, but let's use it for a rough estimate), for an average file size of about 539 KB per file.

Thanks a lot for your time and insights!

On 6/6/19 8:53, Hu Bert wrote:
> [...]
--
Eduardo Mayoral Jimeno
Systems engineer, platform department. Arsys Internet.
emayoral at arsys.es - +34 941 620 105 - ext 2153
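The "replica 2 + arbiter" option Eduardo mentions is expressed in gluster as "replica 3 arbiter 1" (two data bricks plus one metadata-only brick). A minimal sketch with hypothetical hostnames and brick paths, plus a quick check of the average-file-size figure quoted above:

    # Sanity check of the average file size (assuming 1 MB = 1024 KB):
    # 18,314,445 MB * 1024 / 34,752,784 inodes ~= 539 KB per file
    echo "scale=1; 18314445 * 1024 / 34752784" | bc    # -> ~539.6

    # Two data nodes carry the RAID-6-backed bricks; the third node only
    # stores metadata (the arbiter brick), so it needs far less disk.
    gluster volume create vol0 replica 3 arbiter 1 \
        srv1:/bricks/raid6/data \
        srv2:/bricks/raid6/data \
        arb1:/bricks/arbiter/data
    gluster volume start vol0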
What if you have two fast 2 TB SSDs per server in hardware RAID 1, and 3 hosts in replica 3, with dual 10 Gb enterprise NICs? This would end up being a single 2 TB volume, correct? Seems like that would offer great speed and pretty decent survivability.

On Wed, Jun 5, 2019 at 11:54 PM Hu Bert <revirii at googlemail.com> wrote:
> [...]
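A quick back-of-the-envelope for that capacity question, assuming one 2-disk RAID 1 set per node and a pure replica 3 volume (no distribute):

    # Raw:       3 nodes x 2 SSDs x 2 TB                 = 12 TB
    # HW RAID 1: halves each node to a single 2 TB brick =  6 TB
    # Replica 3: one full copy per node, so usable space = one brick
    echo "usable TB: $(( 3 * 2 * 2 / 2 / 3 ))"   # -> usable TB: 2

So yes, that layout yields roughly a single 2 TB volume, with redundancy both inside each node (RAID 1) and across nodes (replica 3).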