Kyle Johnson
2016-Jul-05 17:28 UTC
[Gluster-users] Problem rebalancing a distributed volume
Hello everyone,

I am having trouble with a distributed volume. In short, the rebalance command does not seem to work for me: existing files are not migrated, and new files are not created on the new brick.

I am running glusterfs 3.7.6 on two servers:

1) FreeBSD 10.3-RELEASE (colossus2 - 192.168.110.1)
2) CentOS 6.7 (colossus - 192.168.110.2)

The bricks are ZFS-backed on both servers, and the network consists of two direct-connected cat6 cables on 10gig NICs. The NICs are bonded (lagg'd) together in mode 4 (LACP).

Here is what I am seeing:

[root@colossus ~]# gluster volume create fubar 192.168.110.2:/ftp/bricks/fubar
volume create: fubar: success: please start the volume to access data
[root@colossus ~]# gluster volume start fubar
volume start: fubar: success
[root@colossus ~]# mount -t glusterfs 192.168.110.2:/fubar /mnt/test
[root@colossus ~]# touch /mnt/test/file{1..100}
[root@colossus ~]# ls /mnt/test/ | wc -l
100
[root@colossus ~]# ls /ftp/bricks/fubar | wc -l
100

# So far, so good.

[root@colossus ~]# gluster volume add-brick fubar 192.168.110.1:/tank/bricks/fubar
volume add-brick: success

# For good measure, I'll do an explicit fix-layout first.

[root@colossus ~]# gluster volume rebalance fubar fix-layout start
volume rebalance: fubar: success: Rebalance on fubar has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 2da23238-dbe4-4759-97b2-08879db271e7

[root@colossus ~]# gluster volume rebalance fubar status
Node           Rebalanced-files  size    scanned  failures  skipped  status                run time in secs
-------------  ----------------  ------  -------  --------  -------  --------------------  ----------------
localhost      0                 0Bytes  0        0         0        fix-layout completed  0.00
192.168.110.1  0                 0Bytes  0        0         0        fix-layout completed  0.00
volume rebalance: fubar: success

# Now to do the actual rebalance.
[root@colossus ~]# gluster volume rebalance fubar start
volume rebalance: fubar: success: Rebalance on fubar has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 67160a67-01b2-4a51-9a11-114aa6269ee9

[root@colossus ~]# gluster volume rebalance fubar status
Node           Rebalanced-files  size    scanned  failures  skipped  status     run time in secs
-------------  ----------------  ------  -------  --------  -------  ---------  ----------------
localhost      0                 0Bytes  100      0         0        completed  0.00
192.168.110.1  0                 0Bytes  0        0         0        completed  0.00
volume rebalance: fubar: success
[root@colossus ~]# ls /mnt/test/ | wc -l
101
[root@colossus ~]# ls /ftp/bricks/fubar/ | wc -l
100

# As the output shows, 100 files were scanned, but none were moved.

# And for another test, I'll create 100 new post-fix-layout files

[root@colossus ~]# touch /mnt/test/file{101..200}
[root@colossus ~]# ls /ftp/bricks/fubar/ | wc -l
199

# And as you can see here, they were all created on the first server. The second server isn't touched at all.

Not sure if this is relevant, but if I create the volume with both bricks to begin with, files are properly distributed.

Thanks!
Kyle
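[Aside, for readers of the archive: the symptom above is consistent with the new brick never being assigned a hash range. DHT places each file on the brick whose layout range contains the hash of the file's name, so if one brick still owns the entire ring, every new file is created there. A toy illustration of that placement rule; crc32 stands in for Gluster's actual Davies-Meyer hash, and the brick names and ranges below are made up:]

```python
import zlib

# Toy model of DHT placement: each brick owns a slice of the 32-bit hash
# ring, and a file lands on the brick whose slice contains the hash of its
# name. crc32 is only a stand-in for Gluster's real hash function.
def pick_brick(name, layouts):
    h = zlib.crc32(name.encode()) & 0xFFFFFFFF
    for brick, (start, end) in layouts.items():
        if start <= h <= end:
            return brick
    return None  # hash falls in no brick's range (an anomalous layout)

# If add-brick succeeded but the layout was never extended, the old brick
# still owns the entire ring and the new brick owns an empty range:
stale = {"old-brick": (0x00000000, 0xFFFFFFFF), "new-brick": (0, -1)}
print({pick_brick("file%d" % i, stale) for i in range(101, 201)})  # {'old-brick'}
```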
Susant Palai
2016-Jul-07 05:43 UTC
[Gluster-users] Problem rebalancing a distributed volume
Hi,

Please pass on the rebalance log from the 1st server for more analysis; it can be found under /var/log/glusterfs/"$VOL-rebalance.log". We also need the current layout xattrs from both bricks, which can be extracted with the following command:

getfattr -m . -de hex <$BRICK_PATH>

Thanks,
Susant

----- Original Message -----
> From: "Kyle Johnson" <kjohnson at gnulnx.net>
> To: gluster-users at gluster.org
> Sent: Tuesday, 5 July, 2016 10:58:09 PM
> Subject: [Gluster-users] Problem rebalancing a distributed volume
>
> Hello everyone,
>
> I am having trouble with a distributed volume. In short, the rebalance
> command does not seem to work for me: existing files are not migrated, and
> new files are not created on the new brick.
> [...]
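[Aside, for readers of the archive: the trusted.glusterfs.dht value that the getfattr command above prints is, in the common encoding, four big-endian 32-bit words: a count, a type, and the start/end of the hash range assigned to that brick. A minimal decoder sketch, under that encoding assumption; the sample value below is hypothetical, not from this volume:]

```python
import struct

def decode_dht_layout(hex_value):
    """Decode a trusted.glusterfs.dht value as printed by getfattr -e hex.

    Assumes the common on-disk encoding: four big-endian 32-bit words
    (count, type, hash range start, hash range end).
    """
    raw = bytes.fromhex(hex_value[2:] if hex_value.startswith("0x") else hex_value)
    cnt, typ, start, end = struct.unpack(">IIII", raw)
    return {"count": cnt, "type": typ, "start": hex(start), "end": hex(end)}

# Hypothetical sample: a brick that owns the entire 32-bit hash range,
# as you would expect on the lone brick of a one-brick volume.
print(decode_dht_layout("0x00000001" "00000000" "00000000" "ffffffff"))
# -> {'count': 1, 'type': 0, 'start': '0x0', 'end': '0xffffffff'}
```

Comparing the decoded ranges from both bricks would show whether the fix-layout actually split the ring between them.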