thr3ads.net - Gluster users - [Gluster-users] Need help to design a data storage [Sep 2016]


Ashish Pandey

2016-Aug-09 17:57 UTC

[Gluster-users] Need help to design a data storage

> What about EC? Is redundant data spread across multiple servers? If not,
> multiple replicas would be placed on the same server. I can lose 2 bricks (2
> disks), but what if I lose the whole server with both bricks on it? And
> when a server fails, multiple bricks are affected...

----- 

Yes, redundant data is spread across multiple servers. In my example I mentioned 6
different nodes, each with one brick.
The point is that with 4+2 you can lose any 2 bricks, whether because of node
failure or brick failure.
1 - 6 bricks on 6 different nodes - any 2 nodes may go down - EC wins

However, if you have only 2 nodes with 3 bricks on each node, then yes, in this
case EC will fail even if only one node goes down, because that takes 3 bricks
down.
In this case replica 3 would win.
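
The two layouts above can be checked with a few lines of Python. This is only a minimal sketch: it models a single 4+2 disperse set as a list naming the node that hosts each brick, and the node names and both helper functions are made up for illustration, not part of any Gluster tool.

```python
from itertools import combinations

def survives(brick_nodes, redundancy, failed_nodes):
    """True if one disperse set is still readable after `failed_nodes` go down."""
    lost_bricks = sum(1 for node in brick_nodes if node in failed_nodes)
    return lost_bricks <= redundancy

def max_node_failures(brick_nodes, redundancy):
    """Largest k such that ANY k nodes can fail without losing the set."""
    nodes = sorted(set(brick_nodes))
    tolerated = 0
    for k in range(1, len(nodes) + 1):
        if all(survives(brick_nodes, redundancy, set(combo))
               for combo in combinations(nodes, k)):
            tolerated = k
        else:
            break
    return tolerated

# Hypothetical layouts for one 4+2 set (6 bricks, redundancy 2).
six_nodes = ["n1", "n2", "n3", "n4", "n5", "n6"]   # one brick per node
two_nodes = ["n1", "n1", "n1", "n2", "n2", "n2"]   # three bricks per node

print(max_node_failures(six_nodes, redundancy=2))  # 2
print(max_node_failures(two_nodes, redundancy=2))  # 0
```

With one brick per node, any 2 nodes can fail. With 3 bricks per node, a single node failure already exceeds the redundancy of 2.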


----- Original Message -----

From: "Gandalf Corvotempesta" <gandalf.corvotempesta at gmail.com>
To: "Ashish Pandey" <aspandey at redhat.com> 
Cc: gluster-users at gluster.org 
Sent: Tuesday, August 9, 2016 11:08:12 PM 
Subject: Re: [Gluster-users] Need help to design a data storage 



On 09 Aug 2016 19:20, "Ashish Pandey" <aspandey at redhat.com> wrote:
> 3 - EC with redundancy 2 that is 4+2
> The overall storage space you get is 4TB and any 2 bricks can be down at
> any point of time. So it is as good as replica 3 but providing more space.

Not really.
With replica 3 I can place the bricks on different servers, so that I can
lose multiple servers and not only multiple bricks.

What about EC? Is redundant data spread across multiple servers? If not,
multiple replicas would be placed on the same server. I can lose 2 bricks (2
disks), but what if I lose the whole server with both bricks on it? And
when a server fails, multiple bricks are affected...

Replica 3 is like a RAID 10 with 3 disks in each mirror (3 failed bricks in the
same replica set = data loss). EC is like RAID 6 (3 failed bricks in the whole
cluster = data loss). The former is safer than the latter but wastes a lot of
space.
_______________________________________________ 
Gluster-users mailing list 
Gluster-users at gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-users 


Gandalf Corvotempesta

2016-Aug-09 18:43 UTC

[Gluster-users] Need help to design a data storage

On 09 Aug 2016 19:57, "Ashish Pandey" <aspandey at redhat.com> wrote:
> Yes, redundant data spread across multiple servers. In my example I
> mentioned 6 different nodes each with one brick.
> Point is that for 4+2 you can lose any 2 bricks. It could be because of
> node failure or brick failure.
> 1 - 6 bricks on 6 different nodes - any 2 nodes may go down - EC win
>
> However if you have only 2 nodes and 3 bricks on each node, then yes in
> this case even if one node goes down, EC will fail because that will cause
> 3 bricks down.
> In this case replica 3 would win.

6 nodes with 1 brick each is an unrealistic case.
A much more common case is multiple nodes with multiple bricks, something like 9
nodes with 12 bricks each (for example, a 2U Supermicro server with 12
disks).

In this case, EC replicas could be placed on a single server.

And with 9*12 bricks you still have only 2 single disks (or one server, if both
are placed on the same hardware) as failure domains.
Yes, you'll get 9*(12-2) usable bricks and not (9*12)/3, but you definitely
risk data loss.

Just a question: with EC, which of these 3 is the right way to calculate the
number of usable bricks?

a)  (#servers*#bricks)-#replicas

Or

b) #servers*(#bricks - #replicas)

Or

c) (#servers-#replicas)*#bricks

In case A I'll use 2 disks as replica for the whole volume (exactly like a
RAID 6).

In case B I'll use 2 disks from each server as replica.

In case C I'll use 2 whole servers as replica (this is the most secure, as I
can lose 2 whole servers).

Xavier Hernandez

2016-Sep-01 11:17 UTC

[Gluster-users] Need help to design a data storage

Hi,

On 09/08/16 20:43, Gandalf Corvotempesta wrote:
> On 09 Aug 2016 19:57, "Ashish Pandey" <aspandey at redhat.com
> <mailto:aspandey at redhat.com>> wrote:
>> Yes, redundant data spread across multiple servers. In my example I
>> mentioned 6 different nodes each with one brick.
>> Point is that for 4+2 you can lose any 2 bricks. It could be because
>> of node failure or brick failure.
>> 1 - 6 bricks on 6 different nodes - any 2 nodes may go down - EC win
>>
>> However if you have only 2 nodes and 3 bricks on each node, then yes
>> in this case even if one node goes down, EC will fail because that will
>> cause 3 bricks down.
>> In this case replica 3 would win.
>
> 6 nodes with 1 brick each is an unrealistic case.
> A much more common case is multiple nodes with multiple bricks, something
> like 9 nodes with 12 bricks each (for example, a 2U Supermicro server
> with 12 disks).
>
> In this case, EC replicas could be placed on a single server.
Not really. The disperse sets, like the replica sets, are defined when 
the volume is created. You must make sure that every disperse set is 
made of bricks from different servers. If this condition is satisfied 
while creating the volume, there won't be two fragments of the same file 
on two bricks of the same server.
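
One way to satisfy this condition is to list the bricks round-robin across servers before creating the volume. The sketch below assumes that consecutive bricks in the creation order are grouped into one disperse set, which is my understanding of how `gluster volume create ... disperse N redundancy R` treats its brick list, so verify it against your Gluster version. The server names, the brick paths and the helper functions are hypothetical.

```python
def build_disperse_sets(servers, bricks_per_server, set_size):
    """Order bricks so that each run of `set_size` bricks spans distinct servers."""
    if set_size > len(servers):
        raise ValueError("a disperse set cannot be larger than the number of servers")
    total = len(servers) * bricks_per_server
    if total % set_size:
        raise ValueError("total brick count must be a multiple of the set size")
    # Round-robin over servers: any run of set_size consecutive entries
    # (set_size <= number of servers) touches set_size different servers.
    ordered = [
        f"{server}:/bricks/disk{disk}"  # hypothetical brick path
        for disk in range(1, bricks_per_server + 1)
        for server in servers
    ]
    return [ordered[i:i + set_size] for i in range(0, total, set_size)]

def servers_in(disperse_set):
    """Server name of every brick in a set."""
    return [brick.split(":", 1)[0] for brick in disperse_set]

# The 9 servers x 12 disks case from the thread, with 4+2 sets (size 6).
servers = [f"server{n}" for n in range(1, 10)]
sets = build_disperse_sets(servers, bricks_per_server=12, set_size=6)

# Check the condition: no set holds two bricks of the same server.
for ds in sets:
    assert len(set(servers_in(ds))) == len(ds)

print(len(sets), "disperse sets")
print(" ".join(sets[0]))  # brick list for the first set, in creation order
```

Flattening `sets` in order gives a brick list in which a whole-server failure removes at most one brick from any disperse set.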
>
> And with 9*12 bricks you still have 2 single disks (or one server if
> both are placed on the same hardware) as failure domains.
> Yes, you'll get 9*(12-2) usable bricks and not (9*12)/3 but you risk
> data loss for sure.
It's true that the probability of failure of a distributed-replicated
volume is lower than that of a distributed-dispersed one. However, if you are
considering big volumes with redundancy 2 or higher, replica gets
prohibitively expensive and wastes a lot of bandwidth.

You can reduce the probability of local disk failure by creating bricks on top
of RAID5 or RAID6 if you want. It will use up more disks, but far fewer than
a replica.
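
To make the trade-off concrete, here is a small brute-force comparison on 6 bricks: replica 3 (two replica sets of 3) against a single 4+2 disperse set. It is only a sketch under a simplifying assumption that every combination of failed bricks is equally likely. The brick numbering and the `loss_fraction` helper are made up for illustration.

```python
from itertools import combinations

def loss_fraction(sets, tolerated, failures):
    """Fraction of `failures`-brick combinations that lose data.

    Data is lost when any set has more than `tolerated` failed bricks.
    Assumes all failure combinations are equally likely.
    """
    bricks = [b for s in sets for b in s]
    combos = list(combinations(bricks, failures))
    lost = sum(
        1 for combo in combos
        if any(len(set(combo) & set(s)) > tolerated for s in sets)
    )
    return lost / len(combos)

replica3 = [[0, 1, 2], [3, 4, 5]]   # 2 usable bricks out of 6
disperse42 = [[0, 1, 2, 3, 4, 5]]   # 4 usable bricks out of 6

for failed in (2, 3):
    print(failed, "failed bricks:",
          "replica 3 =", loss_fraction(replica3, 2, failed),
          "disperse 4+2 =", loss_fraction(disperse42, 2, failed))
```

Both layouts survive any 2 failed bricks. With 3 failed bricks, replica 3 loses data in 2 of the 20 combinations (0.1), while 4+2 loses data in all of them (1.0), but 4+2 offers twice the usable space on the same disks.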
>
> Just a question:  with EC which is the right calc method between these 3:
>
> a)  (#servers*#bricks)-#replicas
>
> Or
>
> b) #servers*(#bricks - #replicas)
>
> Or
>
> c) (#servers-#replicas)*#bricks
>
> In case A I'll use 2 disks as replica for the whole volume (exactly like
> a RAID 6).
>
> In case B I'll use 2 disks from each server as replica.
>
> In case C I'll use 2 whole servers as replica (this is the most secure,
> as I can lose 2 whole servers).
In fact none of these is completely correct. The redundancy level is per 
disperse set, not for the whole volume.

S: number of servers
D: number of disks per server
N: Disperse set size
R: Disperse redundancy

Usable disks = S * D * (1 - R / N)
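
As a worked example, the formula can be applied to the 9 servers x 12 disks case and compared with the three guesses (a), (b) and (c). This is a minimal sketch: the function name is made up, and the set sizes 6 and 9 are example values chosen here, not taken from the thread.

```python
def usable_disks(servers, disks_per_server, set_size, redundancy):
    """Usable disks = S * D * (1 - R / N), computed with integers."""
    total = servers * disks_per_server
    if total % set_size:
        raise ValueError("S * D must be a multiple of the disperse set size N")
    if not 0 < redundancy < set_size:
        raise ValueError("redundancy R must be between 1 and N - 1")
    return total * (set_size - redundancy) // set_size

S, D, R = 9, 12, 2

guess_a = S * D - R      # (a) 2 disks for the whole volume
guess_b = S * (D - R)    # (b) 2 disks per server
guess_c = (S - R) * D    # (c) 2 whole servers

print("a:", guess_a, "b:", guess_b, "c:", guess_c)   # a: 106 b: 90 c: 84
print("N=6, R=2:", usable_disks(S, D, set_size=6, redundancy=R))  # 72
print("N=9, R=2:", usable_disks(S, D, set_size=9, redundancy=R))  # 84
```

With 4+2 sets (N = 6), 72 of the 108 disks are usable. Guess (c) matches the formula only in the special case where each disperse set spans all servers (N = S), because S * D * (1 - R / S) = (S - R) * D.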
