Christopher P. Lindsey
2016-May-26 17:09 UTC
[Gluster-users] Remove brick on 3.7.11 distributed-disperse volume doesn't rebalance
Hi,
I have a distributed-disperse 7 x (2 + 1) volume that I want
to remove three bricks from:
Volume Name: glance
Type: Distributed-Disperse
Volume ID: 34a962cc-be73-480e-a9f7-8dbd9c7ca066
Status: Started
Number of Bricks: 7 x (2 + 1) = 21
Transport-type: tcp
Bricks:
Brick1: block8:/dpool/gluster/brick1/glance
Brick2: block15:/dpool/gluster/brick1/glance
Brick3: dxl3:/dpool/gluster/brick1/glance
Brick4: block1:/dpool/gluster/brick1/glance
Brick5: block9:/dpool/gluster/brick1/glance
Brick6: block16:/dpool/gluster/brick1/glance
Brick7: block2:/dpool/gluster/brick1/glance
Brick8: block10:/dpool/gluster/brick1/glance
Brick9: block20:/dpool/gluster/brick1/glance
Brick10: block3:/dpool/gluster/brick1/glance
Brick11: block11:/dpool/gluster/brick1/glance
Brick12: block21:/dpool/gluster/brick1/glance
Brick13: dxl1:/dpool/gluster/brick1/glance
Brick14: block12:/dpool/gluster/brick1/glance
Brick15: block17:/dpool/gluster/brick1/glance
Brick16: dxl2:/dpool/gluster/brick1/glance
Brick17: block13:/dpool/gluster/brick1/glance
Brick18: block22:/dpool/gluster/brick1/glance
Brick19: block14:/dpool/gluster/brick1/glance
Brick20: block18:/dpool/gluster/brick1/glance
Brick21: block23:/dpool/gluster/brick1/glance
Options Reconfigured:
cluster.min-free-disk: 200GB
performance.readdir-ahead: on
In this case, I've chosen dxl2, block13, and block22
(bricks 16-18 above, which together form a single disperse set).
# gluster volume remove-brick glance block22:/dpool/gluster/brick1/glance \
    block13:/dpool/gluster/brick1/glance dxl2:/dpool/gluster/brick1/glance start
# gluster volume remove-brick glance block22:/dpool/gluster/brick1/glance \
    block13:/dpool/gluster/brick1/glance dxl2:/dpool/gluster/brick1/glance status
Node      Rebalanced-files   size     scanned   failures   skipped   status      run time in h:m:s
--------  ----------------   ------   -------   --------   -------   ---------   -----------------
dxl2      0                  0Bytes   0         0          0         completed   0:0:1
block13   0                  0Bytes   0         0          0         completed   0:0:1
Before starting the removal, I created 4096 random files in a
directory on the volume. 565 of them are stored on that disperse set:
their fragments appear in /dpool/gluster/brick1/glance on each of
those three bricks and nowhere else.
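For reproducibility, this is roughly how I created the files and
counted what landed on one brick of the set (the /mnt/glance mount
point and the testdir name here are placeholders; adjust to taste):

# create 4096 small files with random content via a FUSE mount of the volume
for i in $(seq 1 4096); do
    dd if=/dev/urandom of=/mnt/glance/testdir/file$i bs=4k count=1 2>/dev/null
done
# count the fragments that landed on one brick of the chosen disperse set
ssh dxl2 'find /dpool/gluster/brick1/glance/testdir -type f | wc -l'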
After running the remove-brick, the files stayed on those bricks
and didn't appear anywhere else, despite the logs showing that
a rebalance completed:
[dxl2] # tail -100 /var/log/glusterfs/glance-rebalance.log
[2016-05-26 16:25:35.409904] I [MSGID: 109028] [dht-rebalance.c:3831:gf_defrag_status_get] 0-glance-dht: Rebalance is completed. Time taken is 1.00 secs
Attempts to manually rebalance failed because the remove-brick task
hadn't yet been committed.
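Specifically, this is what I tried; it was refused with a complaint
about the pending remove-brick task:

# gluster volume rebalance glance start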
If I commit the remove-brick, the volume switches to a 6 x (2 + 1)
configuration, which is what I wanted, but all 565 of the files that
were on the now-removed disperse set are no longer available anywhere.
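For completeness, the commit step and a quick check of the resulting
layout (the grep output shown is simply what volume info reports after
the commit):

# gluster volume remove-brick glance block22:/dpool/gluster/brick1/glance \
    block13:/dpool/gluster/brick1/glance dxl2:/dpool/gluster/brick1/glance commit
# gluster volume info glance | grep 'Number of Bricks'
Number of Bricks: 6 x (2 + 1) = 18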
Is this a bug? Or am I misunderstanding how to remove a disperse
set from a distributed-disperse volume? I'm running 3.7.11 on
CentOS 7.
Any guidance would be greatly appreciated.
Thanks,
Chris
Serkan Çoban
2016-May-26 19:29 UTC
[Gluster-users] Remove brick on 3.7.11 distributed-disperse volume doesn't rebalance
I don't know of any documentation that describes how to remove a disperse set from a distributed-disperse volume, so I assume you cannot do that :)