Dave Sherohman
2018-Feb-27 08:10 UTC
[Gluster-users] Quorum in distributed-replicate volume
On Tue, Feb 27, 2018 at 12:00:29PM +0530, Karthik Subrahmanya wrote:
> I will try to explain how you can end up in split-brain even with cluster
> wide quorum:

Yep, the explanation made sense.  I hadn't considered the possibility of
alternating outages.  Thanks!

> > > It would be great if you can consider configuring an arbiter or
> > > replica 3 volume.
> >
> > I can.  My bricks are 2x850G and 4x11T, so I can repurpose the small
> > bricks as arbiters with minimal effect on capacity.  What would be the
> > sequence of commands needed to:
> >
> > 1) Move all data off of bricks 1 & 2
> > 2) Remove that replica from the cluster
> > 3) Re-add those two bricks as arbiters
> >
> > (And did I miss any additional steps?)
> >
> > Unfortunately, I've been running a few months already with the current
> > configuration and there are several virtual machines running off the
> > existing volume, so I'll need to reconfigure it online if possible.
>
> Without knowing the volume configuration it is difficult to suggest the
> configuration change, and since it is a live system you may end up in
> data unavailability or data loss.
> Can you give the output of "gluster volume info <volname>"
> and which brick is of what size.

Volume Name: palantir
Type: Distributed-Replicate
Volume ID: 48379a50-3210-41b4-9a77-ae143c8bcac0
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: saruman:/var/local/brick0/data
Brick2: gandalf:/var/local/brick0/data
Brick3: azathoth:/var/local/brick0/data
Brick4: yog-sothoth:/var/local/brick0/data
Brick5: cthulhu:/var/local/brick0/data
Brick6: mordiggian:/var/local/brick0/data
Options Reconfigured:
features.scrub: Inactive
features.bitrot: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
network.ping-timeout: 1013
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
features.shard: on
cluster.data-self-heal-algorithm: full
storage.owner-uid: 64055
storage.owner-gid: 64055

For brick sizes, saruman/gandalf have

$ df -h /var/local/brick0
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/gandalf-gluster  885G   55G  786G   7% /var/local/brick0

and the other four have

$ df -h /var/local/brick0
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1        11T  254G   11T   3% /var/local/brick0

--
Dave Sherohman
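For reference, the three steps asked about above map roughly onto the
standard remove-brick/add-brick CLI as sketched below.  This is only a
sketch: the arbiter brick paths are invented for illustration, and whether
the sequence is safe to run against a live volume is exactly the question
being put to the list.

Migrate data off the first replica pair and drop it from the volume
("start" rebalances the files onto the remaining subvolumes; "commit" only
once "status" reports the migration completed for both bricks):

$ gluster volume remove-brick palantir \
      saruman:/var/local/brick0/data gandalf:/var/local/brick0/data start
$ gluster volume remove-brick palantir \
      saruman:/var/local/brick0/data gandalf:/var/local/brick0/data status
$ gluster volume remove-brick palantir \
      saruman:/var/local/brick0/data gandalf:/var/local/brick0/data commit

Then re-add the freed bricks as arbiters, one per remaining replica pair
(fresh, empty brick directories assumed; reusing the old paths would mean
clearing the old data and gluster xattrs first):

$ gluster volume add-brick palantir replica 3 arbiter 1 \
      saruman:/var/local/brick0/arbiter gandalf:/var/local/brick0/arbiter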
Karthik Subrahmanya
2018-Feb-27 09:50 UTC
[Gluster-users] Quorum in distributed-replicate volume
On Tue, Feb 27, 2018 at 1:40 PM, Dave Sherohman <dave at sherohman.org> wrote:
> Volume Name: palantir
> Type: Distributed-Replicate
> Number of Bricks: 3 x 2 = 6
> [...]
> cluster.quorum-type: auto
> [...]
>
> For brick sizes, saruman/gandalf have
>
> $ df -h /var/local/brick0
> Filesystem                   Size  Used Avail Use% Mounted on
> /dev/mapper/gandalf-gluster  885G   55G  786G   7% /var/local/brick0
>
> and the other four have
>
> $ df -h /var/local/brick0
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdb1        11T  254G   11T   3% /var/local/brick0

If you want to use the first two bricks as arbiter, then you need to be
aware of the following things:
- Your distribution count will be decreased to 2.
- Your data on the first subvol i.e., replica subvol - 1 will be
  unavailable till it is copied to the other subvols after removing the
  bricks from the cluster.

Since arbiter bricks need not be of same size as the data bricks, if you
can configure three more arbiter bricks based on the guidelines in the
doc [1], you can do it live and you will have the distribution count also
unchanged.

One more thing from the volume info: only options that have been
reconfigured appear in that output, and since cluster.quorum-type is in
the list, it was set manually.
[1] http://docs.gluster.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/#arbiter-bricks-sizing

Regards,
Karthik
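Karthik's live option (three additional arbiter bricks, leaving the
distribution count at 3) comes down to a single add-brick call.  A minimal
sketch, assuming three hosts with some spare space (the hostnames and
paths below are placeholders, not taken from this thread):

$ gluster volume add-brick palantir replica 3 arbiter 1 \
      arbiter1:/var/local/arbiter/palantir \
      arbiter2:/var/local/arbiter/palantir \
      arbiter3:/var/local/arbiter/palantir

One arbiter brick is added per replica-2 subvolume, converting each of the
three pairs to replica 2 + arbiter 1.  The arbiters hold only file names
and metadata and are populated by self-heal afterwards; progress can be
watched with "gluster volume heal palantir info".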
Dave Sherohman
2018-Feb-27 10:48 UTC
[Gluster-users] Quorum in distributed-replicate volume
On Tue, Feb 27, 2018 at 03:20:25PM +0530, Karthik Subrahmanya wrote:
> If you want to use the first two bricks as arbiter, then you need to be
> aware of the following things:
> - Your distribution count will be decreased to 2.

What's the significance of this?  I'm trying to find documentation on
distribution counts in gluster, but my google-fu is failing me.

> - Your data on the first subvol i.e., replica subvol - 1 will be
>   unavailable till it is copied to the other subvols after removing the
>   bricks from the cluster.

Hmm, ok.  I was sure I had seen a reference at some point to a command for
migrating data off bricks to prepare them for removal.

Is there an easy way to get a list of all files which are present on a
given brick, then, so that I can see which data would be unavailable
during this transfer?

> Since arbiter bricks need not be of same size as the data bricks, if you
> can configure three more arbiter bricks based on the guidelines in the
> doc [1], you can do it live and you will have the distribution count also
> unchanged.

I can probably find one or more machines with a few hundred GB free which
could be allocated for arbiter bricks if it would be significantly simpler
and safer than repurposing the existing bricks (and I'm getting the
impression that it probably would be).

Does it particularly matter whether the arbiters are all on the same node
or on three separate nodes?

--
Dave Sherohman
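On the "which files are on a given brick" question: since each brick is an
ordinary local directory tree, one rough way to get the list is to walk the
brick directly on saruman or gandalf, skipping gluster's internal
directories.  This is just a sketch of the general idea, not something
recommended in the thread, and note that with sharding enabled anything
past the first shard of a file lives under .shard as GFID-named pieces
rather than under its real path:

$ find /var/local/brick0/data \
      -path '*/.glusterfs' -prune -o \
      -path '*/.shard' -prune -o \
      -type f -print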