COCHE Sébastien
2014-Apr-15 09:26 UTC
[Gluster-users] Glusterfs Rack-Zone Awareness feature...
Hi all,

I have a little question. I have read the GlusterFS documentation looking
for a way to manage replica placement. I want to be able to localize
replicas on nodes hosted in two datacenters (dual-building). CouchBase
provides the feature I'm looking for in GlusterFS: "Rack-Zone Awareness".

https://blog.couchbase.com/announcing-couchbase-server-25

"Rack-Zone Awareness - This feature will allow logical groupings of
Couchbase Server nodes (where each group is physically located on a rack or
an availability zone). Couchbase Server will automatically allocate replica
copies of data on servers that belong to a group different from where the
active data lives. This significantly increases reliability in case an
entire rack becomes unavailable. This is of particular importance for
customers running deployments in public clouds."

Do you know if GlusterFS provides a similar feature? If not, do you plan to
develop it in the near future?

Thanks in advance.

Sébastien Coché
> I have a little question. I have read the GlusterFS documentation looking
> for a way to manage replica placement. I want to be able to localize
> replicas on nodes hosted in two datacenters (dual-building). CouchBase
> provides the feature I'm looking for in GlusterFS: "Rack-Zone Awareness".
>
> https://blog.couchbase.com/announcing-couchbase-server-25
>
> "Rack-Zone Awareness - This feature will allow logical groupings of
> Couchbase Server nodes (where each group is physically located on a rack
> or an availability zone). Couchbase Server will automatically allocate
> replica copies of data on servers that belong to a group different from
> where the active data lives. This significantly increases reliability in
> case an entire rack becomes unavailable. This is of particular importance
> for customers running deployments in public clouds."
>
> Do you know if GlusterFS provides a similar feature?
> If not, do you plan to develop it in the near future?

There are two parts to the answer.

Rack-aware placement in general is part of the "data classification" feature
planned for the 3.6 release.

http://www.gluster.org/community/documentation/index.php/Features/data-classification

With this feature, files can be placed according to various policies using
any of several properties associated with objects or physical locations.
Rack-aware placement would use the physical location of a brick. Tiering
would use the performance properties of a brick and the access
time/frequency of an object. Multi-tenancy would use the tenant identity for
both bricks and objects. And so on. It's all essentially the same
infrastructure.

For replication decisions in particular, there needs to be another piece.
Right now, the way we use N bricks with a replication factor of R is to
define N/R replica sets, each containing R members. This is sub-optimal in
many ways. We can still compare the "value" or "fitness" of two replica sets
for storing a particular object, but our options are limited to the replica
sets as defined the last time bricks were added or removed. The differences
between one choice and another effectively get smoothed out, and the load
balancing after a failure is less than ideal. To do this right, we need to
use more (overlapping) combinations of bricks. Some of us have discussed
ways to do this without sacrificing the modularity of having distribution
and replication as two separate modules, but there's no defined plan or date
for that feature becoming available.

BTW, note that using *too many* combinations can also be a problem. Every
time an object is replicated across a certain set of storage locations, it
creates a coupling between those locations. Before long, all locations are
coupled together, so that *any* failure of R-1 locations anywhere in the
system will result in data loss or unavailability. Many systems, possibly
including Couchbase Server, have made this mistake and become *less*
reliable as a result. Emin Gün Sirer does a better job describing the
problem - and solutions - than I do, here:

http://hackingdistributed.com/2014/02/14/chainsets/
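To make the rack-aware placement idea a bit more concrete, here is a minimal
sketch in Python. This is my own illustration, not the data-classification
API or any GlusterFS code; the brick names and location tags are made up.
The idea is simply: given bricks tagged with a physical location, prefer the
replica set whose members span the most distinct racks or datacenters.

    import itertools

    # Hypothetical brick -> location map (datacenter or rack tag).
    bricks = {
        "server1:/brick1": "dc1",
        "server2:/brick1": "dc1",
        "server3:/brick1": "dc2",
        "server4:/brick1": "dc2",
    }

    def pick_replica_set(bricks, replica_count):
        # Choose the combination of bricks that covers the most
        # distinct locations.
        return max(itertools.combinations(bricks, replica_count),
                   key=lambda combo: len({bricks[b] for b in combo}))

    print(pick_replica_set(bricks, 2))
    # -> one brick from dc1 and one from dc2, so losing an entire
    #    building still leaves a copy of the data.

With two bricks per datacenter and replica 2, this always pairs a dc1 brick
with a dc2 brick, which is the "survive losing a building" property the
original question asks about.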
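And a rough back-of-the-envelope illustration of the "too many combinations"
point, again just a sketch with made-up numbers rather than anything
GlusterFS does: with 48 nodes and replica 3, fully random placement means
that (once enough objects exist) essentially every 3-node combination holds
some object, so any 3 simultaneous failures lose data; with the current N/R
disjoint replica sets, only 16 of the 17,296 possible 3-node failures are
fatal.

    from math import comb

    nodes = 48   # hypothetical cluster size
    r = 3        # replication factor

    all_failure_sets = comb(nodes, r)   # every possible set of r failed nodes
    disjoint_sets = nodes // r          # replica sets under the N/R scheme

    print(f"possible {r}-node failures: {all_failure_sets}")
    # With random placement, assume enough objects exist that every
    # combination of r nodes is some object's copyset, so any r failures
    # lose data.
    print(f"fatal with fully random placement: ~{all_failure_sets} (p ~ 1.0)")
    # With disjoint replica sets, only a failure that hits one exact set
    # destroys all copies of an object.
    print(f"fatal with {disjoint_sets} disjoint replica sets: "
          f"{disjoint_sets} (p ~ {disjoint_sets / all_failure_sets:.4f})")

The goal of overlapping-but-limited combinations is to land somewhere between
those two extremes: better load balancing after a failure than rigid N/R
sets, without coupling every location to every other one.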