Christopher P. Lindsey
2016-May-26 17:09 UTC
[Gluster-users] Remove brick on 3.7.11 distributed-disperse volume doesn't rebalance
Hi,

I have a distributed-disperse 7 x (2 + 1) volume that I want
to remove three bricks from:

Volume Name: glance
Type: Distributed-Disperse
Volume ID: 34a962cc-be73-480e-a9f7-8dbd9c7ca066
Status: Started
Number of Bricks: 7 x (2 + 1) = 21
Transport-type: tcp
Bricks:
Brick1: block8:/dpool/gluster/brick1/glance
Brick2: block15:/dpool/gluster/brick1/glance
Brick3: dxl3:/dpool/gluster/brick1/glance
Brick4: block1:/dpool/gluster/brick1/glance
Brick5: block9:/dpool/gluster/brick1/glance
Brick6: block16:/dpool/gluster/brick1/glance
Brick7: block2:/dpool/gluster/brick1/glance
Brick8: block10:/dpool/gluster/brick1/glance
Brick9: block20:/dpool/gluster/brick1/glance
Brick10: block3:/dpool/gluster/brick1/glance
Brick11: block11:/dpool/gluster/brick1/glance
Brick12: block21:/dpool/gluster/brick1/glance
Brick13: dxl1:/dpool/gluster/brick1/glance
Brick14: block12:/dpool/gluster/brick1/glance
Brick15: block17:/dpool/gluster/brick1/glance
Brick16: dxl2:/dpool/gluster/brick1/glance
Brick17: block13:/dpool/gluster/brick1/glance
Brick18: block22:/dpool/gluster/brick1/glance
Brick19: block14:/dpool/gluster/brick1/glance
Brick20: block18:/dpool/gluster/brick1/glance
Brick21: block23:/dpool/gluster/brick1/glance
Options Reconfigured:
cluster.min-free-disk: 200GB
performance.readdir-ahead: on

In this case, I've chosen dxl2, block13, and block22
(which are part of the same disperse set).
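As a rough illustration of why dxl2, block13, and block22 belong together, the sketch below maps a brick number from the listing to its disperse set. It assumes that bricks are grouped into sets in the order they are listed, three consecutive bricks per (2 + 1) set; the helper names `disperse_set` and `set_members` are hypothetical and not part of the gluster CLI.

```shell
#!/usr/bin/env bash
# Assumption: disperse sets are formed from consecutive bricks in the order
# shown in the volume listing, SET_SIZE bricks per set (3 for a 2 + 1 layout).

# disperse_set BRICK_NUMBER [SET_SIZE] -> prints the 1-based set index
disperse_set() {
    local brick=$1 size=${2:-3}
    echo $(( (brick - 1) / size + 1 ))
}

# set_members SET_INDEX [SET_SIZE] -> prints the brick numbers in that set
set_members() {
    local idx=$1 size=${2:-3} first i out=""
    first=$(( (idx - 1) * size + 1 ))
    for (( i = first; i < first + size; i++ )); do
        out+="${out:+ }$i"
    done
    echo "$out"
}

# Brick16 (dxl2), Brick17 (block13) and Brick18 (block22) from the listing
for b in 16 17 18; do
    echo "Brick$b -> set $(disperse_set "$b")"
done
echo "set 6 members: $(set_members 6)"
```

Under that ordering assumption, all three chosen bricks fall in the sixth of the seven sets, so removing them takes out one complete disperse set rather than parts of several.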
# gluster volume remove-brick glance block22:/dpool/gluster/brick1/glance block13:/dpool/gluster/brick1/glance dxl2:/dpool/gluster/brick1/glance start
# gluster volume remove-brick glance block22:/dpool/gluster/brick1/glance block13:/dpool/gluster/brick1/glance dxl2:/dpool/gluster/brick1/glance status
Node       Rebalanced-files         size      scanned     failures      skipped        status   run time in h:m:s
---------       -----------  -----------  -----------  -----------  -----------  ------------      --------------
dxl2                      0       0Bytes            0            0            0     completed               0:0:1
block13                   0       0Bytes            0            0            0     completed               0:0:1

Before starting this, I created 4096 random files in a directory.
565 of them are found in /dpool/gluster/brick1/glance on each of those
bricks and nowhere else.

After running the remove-brick the files stayed on those systems
and didn't appear anywhere else, despite the logs showing that
a rebalance occurred:

[dxl2] # tail -100 /var/log/glusterfs/glance-rebalance.log
[2016-05-26 16:25:35.409904] I [MSGID: 109028] [dht-rebalance.c:3831:gf_defrag_status_get] 0-glance-dht: Rebalance is completed. Time taken is 1.00 secs

Attempts to manually rebalance failed because the remove-brick task
hadn't yet been committed.

If I commit the remove-brick, the system switches to a 6 x (2 + 1)
configuration which is what I wanted, but all 565 of the files that
were on the now-removed disperse set are no longer available anywhere.

Is this a bug? Or am I misunderstanding how to remove a disperse
set from a distributed-disperse volume? I'm running 3.7.11 on
CentOS 7.

Any guidance would be greatly appreciated.

Thanks,

Chris
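The symptom described above (status "completed" with zero files migrated while the bricks still hold data) can be checked mechanically before committing. The following is a minimal sketch, assuming the status table has the column layout shown in this post; `total_rebalanced` and `check_migration` are hypothetical helper names, and the sample rows are copied from the output above rather than produced by a live cluster.

```shell
#!/usr/bin/env bash
# Assumption: each data row of the remove-brick status table looks like
#   <node> <rebalanced-files> <size> <scanned> <failures> <skipped> <status> <run time>
# as in the 3.7.11 output quoted in this thread.

# total_rebalanced: reads status output on stdin, prints the sum of the
# Rebalanced-files column. Header and separator rows are skipped because
# their second field is not a plain integer.
total_rebalanced() {
    awk 'NF >= 8 && $2 ~ /^[0-9]+$/ { sum += $2 } END { print sum + 0 }'
}

# check_migration FILES_ON_BRICKS MIGRATED: returns non-zero when the bricks
# being removed still hold files but nothing was migrated off them.
check_migration() {
    local files_on_bricks=$1 migrated=$2
    if (( files_on_bricks > 0 && migrated == 0 )); then
        echo "WARNING: $files_on_bricks files on bricks, 0 migrated" >&2
        return 1
    fi
}

# Sample rows taken from the status output in this post
migrated=$(total_rebalanced <<'EOF'
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
dxl2 0 0Bytes 0 0 0 completed 0:0:1
block13 0 0Bytes 0 0 0 completed 0:0:1
EOF
)
echo "rebalanced files: $migrated"
check_migration 565 "$migrated" || echo "do not commit the remove-brick yet"
```

With the figures from this post (565 files on the bricks, 0 rebalanced), the check fails, which matches the data loss seen after the commit.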
Serkan Çoban
2016-May-26 19:29 UTC
[Gluster-users] Remove brick on 3.7.11 distributed-disperse volume doesn't rebalance
I don't know any documentation that describes how to remove a disperse set from a distributed-disperse volume. So I assume you can not do that :)