Olav Peeters
2015-Jan-21 12:18 UTC
[Gluster-users] problems after gluster volume remove-brick
Hi,
two days ago I started a gluster volume remove-brick on a Distributed-Replicate volume, 21 x 2, spread over 3 nodes in total.

I wanted to remove 4 bricks per node, which are smaller than the others (on each node I have 7 x 2TB disks and 4 x 500GB disks).
I am still on gluster 3.5.2, and I was not aware that using disks of different sizes is only supported as of 3.6.x (am I correct?).

I started with 2 paired disks, like so:
gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 start

I followed the progress (which was very slow) with:
gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 status
After a day, node03:/export/brick8node03 showed "completed"; the other brick remained "in progress".

This morning several VMs with VDIs on the volume started showing disk errors, and a couple of glusterfs mounts returned a "disk is full" type of error, even though the volume is currently only about 41% filled with data.

Via df -h I saw that most of the 500GB disks were indeed 100% full, while others were nearly empty. Gluster seems to have gone a bit nuts while rebalancing the data.

I did a:
gluster volume remove-brick VOLNAME node03:/export/brick8node03 node02:/export/brick10node02 stop
and a:
gluster volume rebalance VOLNAME start

Progress is again very slow, and some of the disks/bricks which were at ca. 98% are now 100% full. The situation seems to be getting worse in some cases and slowly improving in others, e.g. another pair of bricks went from 100% to 97%.

There clearly has been some data corruption. Some VMs don't want to boot anymore, throwing disk errors.

How do I proceed?
Wait a very long time for the rebalance to complete and hope that the data corruption is automatically mended?
Upgrade to 3.6.x, hope that the issues (which might be related to me using bricks of different sizes) are resolved there, and risk another remove-brick operation?

Should I rather do a:
gluster volume rebalance VOLNAME migrate-data start

Should I have done a replace-brick instead of a remove-brick operation originally? I thought that replace-brick is becoming obsolete.

Thanks,
Olav
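As a rough sketch only: the remove-brick status output and the raw brick utilisation on each peer could be watched together with a small loop like the one below. The volume name, host names and brick mount points are placeholders (passwordless SSH between the peers is assumed); only the gluster and df commands already used above are involved.

#!/bin/bash
# Poll remove-brick progress and per-brick utilisation so the operation
# can be stopped before any brick filesystem hits 100%.
# VOLNAME, BRICKS and NODES are placeholders -- substitute your own.

VOLNAME="VOLNAME"
BRICKS="node03:/export/brick8node03 node02:/export/brick10node02"
NODES="node01 node02 node03"

while true; do
    date
    # Migration progress as gluster itself reports it
    gluster volume remove-brick "$VOLNAME" $BRICKS status

    # Raw utilisation of every brick filesystem on every peer
    for n in $NODES; do
        echo "--- $n ---"
        ssh "$n" 'df -h /export/brick*'
    done
    sleep 300
done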
Olav Peeters
2015-Jan-21 14:33 UTC
[Gluster-users] problems after gluster volume remove-brick
Adding to my previous mail: I find a couple of strange errors in the rebalance log (/var/log/glusterfs/sr_vol01-rebalance.log), e.g.:

[2015-01-21 10:00:32.123999] E [afr-self-heal-entry.c:1135:afr_sh_entry_impunge_newfile_cbk] 0-sr_vol01-replicate-11: creation of /some/file/on/the/volume.data on sr_vol01-client-23 failed (No space left on device)

Why is the rebalance seemingly not taking the space left on the available disks into account? This is the current situation on this particular node:

[root@gluster03 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   50G  2.4G   45G   5% /
tmpfs                         7.8G     0  7.8G   0% /dev/shm
/dev/sda1                     485M   95M  365M  21% /boot
/dev/sdb1                     1.9T  577G  1.3T  31% /export/brick1gfs03
/dev/sdc1                     1.9T  154G  1.7T   9% /export/brick2gfs03
/dev/sdd1                     1.9T  413G  1.5T  23% /export/brick3gfs03
/dev/sde1                     1.9T  1.5T  417G  78% /export/brick4gfs03
/dev/sdf1                     1.9T  1.6T  286G  85% /export/brick5gfs03
/dev/sdg1                     1.9T  1.4T  443G  77% /export/brick6gfs03
/dev/sdh1                     1.9T   33M  1.9T   1% /export/brick7gfs03
/dev/sdi1                     466G   62G  405G  14% /export/brick8gfs03
/dev/sdj1                     466G  166G  301G  36% /export/brick9gfs03
/dev/sdk1                     466G  466G   20K 100% /export/brick10gfs03
/dev/sdl1                     466G  450G   16G  97% /export/brick11gfs03
/dev/sdm1                     1.9T  206G  1.7T  12% /export/brick12gfs03
/dev/sdn1                     1.9T  306G  1.6T  17% /export/brick13gfs03
/dev/sdo1                     1.9T  107G  1.8T   6% /export/brick14gfs03
/dev/sdp1                     1.9T  252G  1.6T  14% /export/brick15gfs03

Why are brick10 and brick11 over-utilised when there is plenty of space on brick 6, 14, etc.? Anyone any idea?

Cheers,
Olav
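As an illustrative check only: how often the rebalance is hitting "No space left on device", and what free space glusterd itself reports per brick, could be inspected along these lines. The log path and volume name (sr_vol01) are taken from the message above; the rest assumes a standard gluster CLI on any of the peers.

#!/bin/bash
# Count ENOSPC failures in the rebalance log and list the free space
# glusterd reports for every brick in the volume.

VOLNAME=sr_vol01
LOG=/var/log/glusterfs/${VOLNAME}-rebalance.log

echo "ENOSPC failures logged so far:"
grep -c 'No space left on device' "$LOG"

# 'gluster volume status ... detail' prints a Disk Space Free line
# for each brick; run on any peer in the pool.
echo "Free space per brick as seen by glusterd:"
gluster volume status "$VOLNAME" detail | grep -E '^Brick|Disk Space Free'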