thr3ads.net - Gluster users - [Gluster-users] Remove an artificial limitation of disperse volume [Feb 2017]

Olivier Lambert

2017-Feb-07 13:16 UTC

[Gluster-users] Remove an artificial limitation of disperse volume

Hi everyone!

I'm currently working on implementing Gluster on XenServer/Xen Orchestra.

I want to expose some Gluster features (in the easiest possible way to
the user).

Therefore, I want to expose only "distributed/replicated" and
"disperse" mode. From what I understand, they are working differently.
Let's take a simple example.

Setup: 6x nodes with 1x 200GB disk each.

* Disperse with redundancy 2 (4+2): I can lose **any 2 of all my
disks**. Total usable space is 800GB. It's a kind of RAID6 (or RAIDZ2)
* Distributed/replicated with replica 2: I can lose 2 disks **BUT**
not on the same "mirror". Total usable space is 600GB. It's a kind of
RAID10
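
As a rough check of the two capacity figures above, here is a minimal Python sketch. The function names are made up for illustration and are not part of Gluster; the sketch assumes equally sized bricks and a single disperse set.

```python
def disperse_usable_gb(bricks, redundancy, brick_gb):
    """Usable space of one disperse set: only the data fragments count."""
    return (bricks - redundancy) * brick_gb


def replicated_usable_gb(bricks, replica, brick_gb):
    """Usable space of a distributed/replicated volume: one brick's worth per replica set."""
    if bricks % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return (bricks // replica) * brick_gb


# The 6-node example from this message: 6 bricks of 200GB each.
disperse_gb = disperse_usable_gb(6, 2, 200)      # 4+2 disperse
replicated_gb = replicated_usable_gb(6, 2, 200)  # replica 2
print(disperse_gb, replicated_gb)
```

With the numbers from the example this gives 800 and 600, matching the figures quoted in the list.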

So far, is it correct?

My main point is that behavior is very different (pairing disks in
distributed/replicated and "shared" parity in disperse).

Now, let's imagine something else. 4x nodes with 1x 200GB disk each.

Why not have disperse with redundancy 2? It would give the same
storage space as distributed/replicated, **BUT** in disperse I can
lose any 2 disks. In dist/rep, only if they are not on the same
"mirror".
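
To make the difference concrete, the sketch below enumerates every two-disk failure on 4 disks. The disk labels and the mirror pairing are assumptions for illustration, and the 2+2 disperse layout is hypothetical, since Gluster does not accept it.

```python
from itertools import combinations

DISKS = ["a", "b", "c", "d"]
MIRRORS = [{"a", "b"}, {"c", "d"}]  # assumed replica 2 pairing


def replicated_survives(failed):
    # Data is lost as soon as every disk of one mirror has failed.
    return not any(mirror <= failed for mirror in MIRRORS)


def disperse_survives(failed, redundancy=2):
    # A disperse set tolerates any failures up to its redundancy level.
    return len(failed) <= redundancy


pairs = [set(p) for p in combinations(DISKS, 2)]
replicated_ok = sum(replicated_survives(p) for p in pairs)
disperse_ok = sum(disperse_survives(p) for p in pairs)
print(f"replica 2: survives {replicated_ok} of {len(pairs)} two-disk failures")
print(f"2+2 disperse (hypothetical): survives {disperse_ok} of {len(pairs)}")
```

Of the 6 possible pairs, replica 2 loses data in the 2 cases where both failed disks belong to the same mirror, while the hypothetical 2+2 layout would survive all 6.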

So far, I can't create a disperse volume if the redundancy level is
50% or more of the number of bricks. I know that performance would be
better in dist/rep, but what if I prefer to have disperse anyway?
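
The limitation can be written as a small check. This is only a paraphrase of the documented rule (redundancy greater than 0, and more than 2 * redundancy bricks), not Gluster's actual code; the function name is hypothetical.

```python
def check_disperse_layout(bricks, redundancy):
    """Return None if the layout is accepted, else a reason string."""
    if redundancy <= 0:
        return "redundancy must be greater than 0"
    if bricks <= 2 * redundancy:
        return "bricks must be greater than 2 * redundancy"
    return None


for bricks, redundancy in [(6, 2), (4, 2), (3, 1)]:
    reason = check_disperse_layout(bricks, redundancy)
    print(bricks, redundancy, "ok" if reason is None else reason)
```

Under this rule the 4+2 layout on 6 bricks passes, while redundancy 2 on 4 bricks is rejected because 4 is not greater than 2 * 2.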

Conclusion: would it be possible to have a "force" flag during
disperse volume creation even if redundancy is 50% or higher?



Thanks!



Olivier.

Nag Pavan Chilakam

2017-Feb-07 13:51 UTC

You can always go for replica 3 (three copies of the data) to address
the need you describe.
EC volumes can be thought of as RAID for the purpose of understanding,
but don't treat it as an apples-to-apples comparison.
RAID 4/6 (mostly) relies on XOR'ing bits (so basic addition and
subtraction), but EC involves a more complex algorithm (Reed-Solomon).
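
For readers unfamiliar with the XOR approach mentioned above, here is a minimal sketch of single-parity recovery in the RAID 4/5 style. The block contents and the helper name are made up for illustration; it shows only the XOR idea, not how any RAID implementation or Gluster lays out data.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"ABCD", b"EFGH", b"IJKL"]  # three data blocks
parity = xor_blocks(data)           # one parity block

# Lose data[1]: XOR of the surviving blocks and the parity gives it back.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

A single XOR parity block can only repair one missing block, which is why tolerating two or more arbitrary failures needs something stronger, such as the Reed-Solomon coding used for EC.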



Jeff Darcy

2017-Feb-07 15:18 UTC

> So far, I can't create a disperse volume if the redundancy level is
> 50% or more of the number of bricks. I know that performance would be
> better in dist/rep, but what if I prefer to have disperse anyway?
>
> Conclusion: would it be possible to have a "force" flag during
> disperse volume creation even if redundancy is 50% or higher?
The problem is that the math behind erasure coding doesn't work for all
fragment counts and redundancy levels.  To get two-failure protection
you need more than four bricks.  If you had multiple disks in each
server you could get protection against multiple disk failures, but you
still wouldn't have protection against multiple server failures.  The
only thing your "force" flag could do is allow placement of multiple
fragments on a single physical disk, but then you wouldn't even have
protection against two disk failures.  If you want higher levels of
protection you need more disks, either to satisfy the mathematical
requirements of EC or to overcome the space inefficiency of replication.

Xavier Hernandez

2017-Feb-08 09:12 UTC

Hi Olivier,

sorry, didn't see the email earlier...

We've already talked about this in private, but to make things clearer
to everyone I'll answer here.

On 07/02/17 14:16, Olivier Lambert wrote:
> Hi everyone!
>
> I'm currently working on implementing Gluster on XenServer/Xen
> Orchestra.
>
> I want to expose some Gluster features (in the easiest possible way to
> the user).
>
> Therefore, I want to expose only "distributed/replicated" and
> "disperse" mode. From what I understand, they are working
> differently.
> Let's take a simple example.
>
> Setup: 6x nodes with 1x 200GB disk each.
>
> * Disperse with redundancy 2 (4+2): I can lose **any 2 of all my
> disks**. Total usable space is 800GB. It's a kind of RAID6 (or RAIDZ2)
> * Distributed/replicated with replica 2: I can lose 2 disks **BUT**
> not on the same "mirror". Total usable space is 600GB. It's a
> kind of RAID10
>
> So far, is it correct?
Yes, but sometimes you can gain some performance by splitting each disk 
into two bricks if the disks are not the bottleneck.
>
> My main point is that behavior is very different (pairing disks in
> distributed/replicated and "shared" parity in disperse).
>
> Now, let's imagine something else. 4x nodes with 1x 200GB disk each.
>
> Why not have disperse with redundancy 2? It would give the same
> storage space as distributed/replicated, **BUT** in disperse I can
> lose any 2 disks. In dist/rep, only if they are not on the same
> "mirror".
>
> So far, I can't create a disperse volume if the redundancy level is
> 50% or more of the number of bricks. I know that performance would be
> better in dist/rep, but what if I prefer to have disperse anyway?
>
> Conclusion: would it be possible to have a "force" flag during
> disperse volume creation even if redundancy is 50% or higher?
That's a design decision, made to avoid most split-brains and on the
reasoning that 50% redundancy is already achievable with replicate
(even if the conditions are not really the same).

The Reed-Solomon algorithm is able to create as many redundancy
fragments as there are data bricks, or even more (the only real
limitation is the Galois Field used). However, allowing this in
disperse raises a lot of complex scenarios that are both difficult to
solve and prone to failures or data corruption. So it was decided not
to support those configurations.
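
To illustrate the mathematical point only, here is a toy Reed-Solomon-style code with 2 data and 2 redundancy fragments. It is a sketch under stated assumptions: it works over the prime field GF(257) with Lagrange interpolation, which is not the Galois Field or the implementation disperse uses, all names are hypothetical, and it needs Python 3.8 or later for the modular inverse via `pow`.

```python
from itertools import combinations

P = 257  # prime modulus; an assumption for this sketch, not Gluster's field


def interpolate(points, x):
    """Lagrange-interpolate through `points` and evaluate at x, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                inverse = pow((xi - xj) % P, -1, P)  # modular inverse
                term = term * ((x - xj) % P) * inverse % P
        total = (total + term) % P
    return total


def encode(data, fragments):
    """Systematic code: fragments 0..k-1 are the data, the rest are redundancy."""
    points = list(enumerate(data))
    return [interpolate(points, x) for x in range(fragments)]


def decode(surviving, k):
    """Rebuild the k data symbols from any k (index, value) fragments."""
    return [interpolate(surviving[:k], x) for x in range(k)]


data = [72, 105]             # k = 2 data symbols, each below P
fragments = encode(data, 4)  # 2 data + 2 redundancy fragments

# Any 2 of the 4 fragments are enough to rebuild the data.
for kept in combinations(range(4), 2):
    survivors = [(i, fragments[i]) for i in kept]
    assert decode(survivors, 2) == data
```

So the coding itself copes with redundancy equal to the data count; the restriction comes from the operational scenarios described above, not from the arithmetic.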

Xavi