thr3ads.net - Gluster users - [Gluster-users] Need some clarifications about the disperse feature [Nov 2014]

Ayelet Shemesh

2014-Nov-25 13:41 UTC

[Gluster-users] Need some clarifications about the disperse feature

Hello Gluster experts,

I have been using gluster for a small cluster for a few years now and I
have a question regarding the new disperse feature, which is for me a much
anticipated addition.

*Suppose* I create a volume with a disperse set of 3, redundancy 1 (let's
call them A1, A2, A3) and then I add 3 more bricks to that volume (we'll
call them B1, B2, B3).

*First question* - which of the bricks will be the one carrying the
redundancy data?

*Second question* - If I have machines with faster disk - should I assign
them to the data or the redundancy bricks? What should I expect the load to
be on the redundancy machine in heavy read scenarios and in heavy write
scenarios?

*Third question* - *does this require reading the entire data* of A1, A2
and A3 by initiating a heal or another operation?

*4th question* (and most important for me) - I saw in the list that it is
now a Distributed-Dispersed volume. I understand I can now lose, for
example bricks A1 and B1 and still have my entire data intact. Is this also
correct for bricks from the same set, for example A1 and A2?
Or to put it in a more generic way - *does this create the exact same
dispersed volume as if I created it originally with A1, A2 A3 B1 B2 B3 and
a redundancy of 2?*


Many thanks for your work and for your help on this list,
Ayelet

Atin Mukherjee

2014-Nov-25 15:02 UTC

Xavi will be the better person to clear all your doubts on this feature,
however as per my understanding please see the response inline.

~Atin

On 11/25/2014 07:11 PM, Ayelet Shemesh wrote:
> Hello Gluster experts,
> 
> I have been using gluster for a small cluster for a few years now and I
> have a question regarding the new disperse feature, which is for me a much
> anticipated addition.
> 
> *Suppose* I create a volume with a disperse set of 3, redundancy 1 (let's
> call them A1, A2, A3) and then I add 3 more bricks to that volume (we'll
> call them B1, B2, B3).
> 
> *First question* - which of the bricks will be the one carrying the
> redundancy data?
The current implementation is *non-systematic*, which means there is no
dedicated parity/redundancy brick.
>
> *Second question* - If I have machines with faster disk - should I assign
> them to the data or the redundancy bricks? What should I expect the load to
> be on the redundancy machine in heavy read scenarios and in heavy write
> scenarios?
As mentioned above, a non-systematic implementation has no separate data
and redundancy bricks, so this kind of assignment is not possible.
>
> *Third question* - *does this require reading the entire data* of A1, A2
> and A3 by initiating a heal or another operation?
If the configuration is 2+1 as you mentioned, the whole set of data can be
recovered from any two of the three bricks. The algorithm reconstructs the
chunk of data that resides on a brick that is down.
>
> *4th question* (and most important for me) - I saw in the list that it is
> now a Distributed-Dispersed volume. I understand I can now lose, for
> example bricks A1 and B1 and still have my entire data intact. Is this also
> correct for bricks from the same set, for example A1 and A2?
> Or to put it in a more generic way - *does this create the exact same
> dispersed volume as if I created it originally with A1, A2 A3 B1 B2 B3 and
> a redundancy of 2?*
No. If you look at the volume info for this configuration, it will show
2 X (2+1), which means the quorum in every set is two, i.e. you need at
least two bricks running in each set.
>
> 
> Many thanks for your work and for your help on this list,
> Ayelet

Xavier Hernandez

2014-Nov-25 15:19 UTC

Hi Ayelet,

On 11/25/2014 02:41 PM, Ayelet Shemesh wrote:
> Hello Gluster experts,
>
> I have been using gluster for a small cluster for a few years now and I
> have a question regarding the new disperse feature, which is for me a
> much anticipated addition.
>
> *Suppose* I create a volume with a disperse set of 3, redundancy 1
> (let's call them A1, A2, A3) and then I add 3 more bricks to that
> volume (we'll call them B1, B2, B3).
>
> *First question* - which of the bricks will be the one carrying the
> redundancy data?
In the current implementation, there's no difference between data and
redundancy bricks. All bricks behave identically and none is more
important than another. In a configuration with 3 bricks and redundancy
1, you can lose any one brick and everything will continue working normally.
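
To make the "no dedicated redundancy brick" idea concrete, here is a toy sketch of a 2+1 non-systematic code. It is only an illustration under my own assumptions: the prime field GF(257), the evaluation points and the names `encode`/`decode` are hypothetical, and this is not the encoding the disperse translator actually uses. Every fragment mixes both data bytes, and any two fragments are enough to get the data back.

```python
# Toy 2+1 non-systematic erasure code over the prime field GF(257).
# Illustration only: this is NOT the encoding used by the disperse translator.
P = 257          # smallest prime above 255, so every byte value fits
XS = (1, 2, 3)   # one evaluation point per brick (think A1, A2, A3)


def encode(a, b):
    """Return three fragments; each one mixes both data bytes."""
    return [(a + b * x) % P for x in XS]


def decode(fragments):
    """Recover (a, b) from any two surviving fragments.

    `fragments` maps a brick index (0..2) to its fragment value.
    """
    (i, fi), (j, fj) = list(fragments.items())[:2]
    xi, xj = XS[i], XS[j]
    # Modular inverse via Fermat's little theorem (P is prime).
    inv = pow((xi - xj) % P, P - 2, P)
    b = ((fi - fj) * inv) % P
    a = (fi - b * xi) % P
    return a, b


frags = encode(72, 105)
for lost in range(3):
    # Drop one brick at a time and rebuild the data from the other two.
    surviving = {k: v for k, v in enumerate(frags) if k != lost}
    assert decode(surviving) == (72, 105)
```

Note that a fragment value can be 256 in this toy, so it does not fit in one byte; a real implementation would use a different field.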
>
> *Second question* - If I have machines with faster disk - should I
> assign them to the data or the redundancy bricks? What should I expect
> the load to be on the redundancy machine in heavy read scenarios and in
> heavy write scenarios?
As I said, there isn't a dedicated redundancy brick, so there's no 
benefit in assigning the fast disk to a specific brick.

Read requests only need to be processed on N - R bricks (N = total
number of bricks, R = redundancy). This means that in your
configuration, each read will be sent to 2 bricks. If all bricks are
alive and healthy, the disperse translator balances these reads among
all bricks, so each brick handles 2/3 of the read requests.

Write requests are processed by all bricks, so the load is the same on 
all of them.
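
The arithmetic above can be sketched as follows. The helper name `per_brick_load` is made up for this example, and it assumes all bricks are healthy and that reads are balanced evenly, as described above.

```python
from fractions import Fraction


def per_brick_load(n_bricks, redundancy):
    """Return (read_share, write_share) per brick for one disperse set.

    Assumes all bricks are healthy and reads are balanced evenly.
    """
    # Each read touches N - R of the N bricks.
    read_share = Fraction(n_bricks - redundancy, n_bricks)
    # Every write is processed by every brick.
    write_share = Fraction(1)
    return read_share, write_share


read_share, write_share = per_brick_load(3, 1)
print(f"each brick serves {read_share} of reads and {write_share} of writes")
```

For the 3-brick, redundancy-1 set this gives the 2/3 read share mentioned above, with every brick taking part in every write.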
>
> *Third question* - _does this require reading the entire data_ of A1, A2
> and A3 by initiating a heal or another operation?
>
Healing operates on a per-file basis. If only some files on A3 have been
damaged, only the corresponding data is read from A1 and A2, not the
entire contents of A1 and A2. To heal a file, however, the whole
contents of that file are read.
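
As a rough estimate of what that means in bytes, here is a small sketch. It rests on assumptions that are mine and not confirmed here: that each brick stores about size / (N - R) bytes per file, that a heal reads that fragment from N - R healthy bricks, and that padding can be ignored. The function name `heal_read_bytes` is hypothetical.

```python
def heal_read_bytes(damaged_file_sizes, n_bricks=3, redundancy=1):
    """Estimate bytes read from healthy bricks to heal the given files.

    Assumption: each brick holds size / (n_bricks - redundancy) bytes of
    a file, and a heal reads that fragment from n_bricks - redundancy
    healthy bricks. Stripe padding is ignored.
    """
    data_bricks = n_bricks - redundancy
    total = 0
    for size in damaged_file_sizes:
        fragment = -(-size // data_bricks)  # ceiling division
        total += fragment * data_bricks
    return total


# Only the damaged files count, not everything stored on A1 and A2.
print(heal_read_bytes([1000, 4096]))
```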
> *4th question* (and most important for me) - I saw in the list that it
> is now a Distributed-Dispersed volume. I understand I can now lose, for
> example bricks A1 and B1 and still have my entire data intact.
Correct
> Is this also correct for bricks from the same set, for example A1 and A2?
No, each disperse set is independent and has the same redundancy. It's
equivalent to a distributed-replicated volume: if you lose both bricks of
the same replica set, you will lose access to the data stored in that
replica set.
> Or to put it in a more generic way - _does this create the exact same
> dispersed volume as if I created it originally with A1, A2 A3 B1 B2 B3
> and a redundancy of 2?_
No. These are two different configurations. Both have the same effective
capacity, but the probability of failure in the second case is several
times lower than in the first one (you can lose *any* two bricks without
losing access to the data). However, it's more expensive to grow the
volume because you will need to add 6 new bricks at the same time, while
in the first case you only need to add 3.
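
The difference between the two layouts can be checked by brute force. The sketch below enumerates every way of losing two bricks and counts how many leave all data accessible. The function names are made up for this example, and the only rule it assumes is the one stated above: a disperse set stays accessible as long as it has lost no more bricks than its redundancy.

```python
from itertools import combinations


def usable_bricks(disperse_sets, redundancy):
    """Effective capacity, counted in bricks' worth of data."""
    return sum(len(s) - redundancy for s in disperse_sets)


def survives(lost, disperse_sets, redundancy):
    """True if no disperse set has lost more bricks than its redundancy."""
    return all(len(set(s) & set(lost)) <= redundancy for s in disperse_sets)


def surviving_combinations(disperse_sets, redundancy, failures):
    """Return (survivable, total) over all ways to lose `failures` bricks."""
    bricks = [b for s in disperse_sets for b in s]
    combos = list(combinations(bricks, failures))
    ok = sum(survives(c, disperse_sets, redundancy) for c in combos)
    return ok, len(combos)


two_sets = [("A1", "A2", "A3"), ("B1", "B2", "B3")]   # 2 x (2+1)
one_set = [("A1", "A2", "A3", "B1", "B2", "B3")]      # 1 x (4+2)

print(usable_bricks(two_sets, 1), surviving_combinations(two_sets, 1, 2))
print(usable_bricks(one_set, 2), surviving_combinations(one_set, 2, 2))
```

Both layouts give 4 bricks' worth of capacity, but only 9 of the 15 possible two-brick losses are survivable with two sets of 3, against all 15 with a single set of 6 and redundancy 2.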

Xavi
