Christopher P. Lindsey
2016-May-26  17:09 UTC
[Gluster-users] Remove brick on 3.7.11 distributed-disperse volume doesn't rebalance
Hi,
I have a distributed-disperse 7 x (2 + 1) volume that I want
to remove three bricks from:
   Volume Name: glance
   Type: Distributed-Disperse
   Volume ID: 34a962cc-be73-480e-a9f7-8dbd9c7ca066
   Status: Started
   Number of Bricks: 7 x (2 + 1) = 21 
   Transport-type: tcp
   Bricks:
   Brick1: block8:/dpool/gluster/brick1/glance
   Brick2: block15:/dpool/gluster/brick1/glance
   Brick3: dxl3:/dpool/gluster/brick1/glance
   Brick4: block1:/dpool/gluster/brick1/glance
   Brick5: block9:/dpool/gluster/brick1/glance
   Brick6: block16:/dpool/gluster/brick1/glance
   Brick7: block2:/dpool/gluster/brick1/glance
   Brick8: block10:/dpool/gluster/brick1/glance
   Brick9: block20:/dpool/gluster/brick1/glance
   Brick10: block3:/dpool/gluster/brick1/glance
   Brick11: block11:/dpool/gluster/brick1/glance
   Brick12: block21:/dpool/gluster/brick1/glance
   Brick13: dxl1:/dpool/gluster/brick1/glance
   Brick14: block12:/dpool/gluster/brick1/glance
   Brick15: block17:/dpool/gluster/brick1/glance
   Brick16: dxl2:/dpool/gluster/brick1/glance
   Brick17: block13:/dpool/gluster/brick1/glance
   Brick18: block22:/dpool/gluster/brick1/glance
   Brick19: block14:/dpool/gluster/brick1/glance
   Brick20: block18:/dpool/gluster/brick1/glance
   Brick21: block23:/dpool/gluster/brick1/glance
   Options Reconfigured:
   cluster.min-free-disk: 200GB
   performance.readdir-ahead: on
In this case, I've chosen dxl2, block13, and block22
(which are part of the same disperse set).
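As a quick cross-check of that grouping: Gluster forms the subvolumes of a distributed-disperse volume from the bricks in the order they are listed, so with 2 + 1 = 3 bricks per set, Brick16 through Brick18 should land in the same set. The helper below is only an arithmetic sketch; `disperse_set_for_brick` is a hypothetical name, not a gluster command.

```shell
#!/bin/sh
# Hypothetical helper (not part of the gluster CLI): map a 1-based brick
# number from `gluster volume info` to the 1-based disperse set it is in.
# Assumes bricks are grouped into sets in the order they are listed.
disperse_set_for_brick() {
    brick=$1   # 1-based brick number, e.g. 16 for Brick16
    size=$2    # bricks per set (data + redundancy), 3 for a 2 + 1 volume
    echo $(( (brick - 1) / size + 1 ))
}

# Example: Brick16 (dxl2), Brick17 (block13) and Brick18 (block22)
# all map to set 6 of 7:
#   disperse_set_for_brick 16 3
#   disperse_set_for_brick 17 3
#   disperse_set_for_brick 18 3
```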
   # gluster volume remove-brick glance block22:/dpool/gluster/brick1/glance block13:/dpool/gluster/brick1/glance dxl2:/dpool/gluster/brick1/glance start
   # gluster volume remove-brick glance block22:/dpool/gluster/brick1/glance block13:/dpool/gluster/brick1/glance dxl2:/dpool/gluster/brick1/glance status
        Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in h:m:s
   ---------  ----------------  ------  -------  --------  -------  ---------  -----------------
        dxl2                 0  0Bytes        0         0        0  completed              0:0:1
     block13                 0  0Bytes        0         0        0  completed              0:0:1
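One way to catch this situation before committing might be to total the Rebalanced-files column of the status output and refuse to go further when it is zero. The sketch below assumes the column layout shown in the output above (node name first, Rebalanced-files second, rows following a dashed separator); `rebalanced_total` is a hypothetical name, and the layout may differ in other releases.

```shell
#!/bin/sh
# Hypothetical helper: read `gluster volume remove-brick ... status` output
# on stdin and print the sum of the Rebalanced-files column.
# Assumes the layout shown in this thread: data rows come after a dashed
# separator line, and the second field is the Rebalanced-files count.
rebalanced_total() {
    awk '
        /^[[:space:]]*-+/ { in_rows = 1; next }            # dashed separator
        in_rows && NF >= 2 && $2 ~ /^[0-9]+$/ { total += $2 }
        END { print total + 0 }                            # 0 if no rows seen
    '
}

# Example use (not run here, it needs a live cluster):
#   gluster volume remove-brick glance <bricks> status | rebalanced_total
```

A total of 0 on a set that is known to hold data, as in this case, would be a reason not to run the commit step.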
Before starting this, I created 4096 random files in a directory.
565 of them are found in /dpool/gluster/brick1/glance on each of those
bricks and nowhere else.
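For anyone wanting to repeat that check, a count along these lines can be run against a brick directory. It is a sketch only: `count_brick_files` is a hypothetical name, it skips the brick's internal `.glusterfs` directory, and on a disperse volume each entry it counts is a fragment of a file rather than the whole file.

```shell
#!/bin/sh
# Hypothetical helper: count regular files under a brick directory,
# ignoring Gluster's internal .glusterfs metadata tree.
# Pass the brick path without a trailing slash, otherwise the -path
# pattern below will not match and .glusterfs will be counted too.
count_brick_files() {
    brick_dir=$1
    find "$brick_dir" -path "$brick_dir/.glusterfs" -prune -o -type f -print \
        | wc -l | tr -d ' '
}

# Example use on one of the bricks being removed:
#   count_brick_files /dpool/gluster/brick1/glance
```

Running the same count on the bricks of the remaining sets before and after the remove-brick would show whether anything was actually migrated.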
After running the remove-brick the files stayed on those systems
and didn't appear anywhere else, despite the logs showing that
a rebalance occurred:
   [dxl2] # tail -100 /var/log/glusterfs/glance-rebalance.log
   [2016-05-26 16:25:35.409904] I [MSGID: 109028] [dht-rebalance.c:3831:gf_defrag_status_get] 0-glance-dht: Rebalance is completed. Time taken is 1.00 secs
Attempts to manually rebalance failed because the remove-brick task 
hadn't yet been committed.
If I commit the remove-brick, the system switches to a 6 x (2 + 1) 
configuration which is what I wanted, but all 565 of the files that 
were on the now-removed disperse set are no longer available anywhere.
Is this a bug?  Or am I misunderstanding how to remove a disperse
set from a distributed-disperse volume?  I'm running 3.7.11 on 
CentOS 7.
Any guidance would be greatly appreciated.
Thanks,
Chris
Serkan Çoban
2016-May-26  19:29 UTC
[Gluster-users] Remove brick on 3.7.11 distributed-disperse volume doesn't rebalance
I don't know any documentation that describes how to remove a disperse set from a distributed-disperse volume. So I assume you can not do that :)