Mohamed Pakkeer
2015-May-27 09:26 UTC
[Gluster-users] Issue with Pro active self healing for Erasure coding
Hi Xavier,

Thanks for your reply. When can we expect the 3.7.1 release?

cheers
Backer

On Wed, May 27, 2015 at 1:22 PM, Xavier Hernandez <xhernandez at datalab.es> wrote:
> Hi,
>
> some Input/Output error issues have been identified and fixed. These fixes
> will be available on 3.7.1.
>
> Xavi
>
> On 05/26/2015 10:15 AM, Mohamed Pakkeer wrote:
>> Hi Glusterfs Experts,
>>
>> We are testing the glusterfs 3.7.0 tarball on our 10-node glusterfs cluster.
>> Each node has 36 drives; please find the volume info below:
>>
>> Volume Name: vaulttest5
>> Type: Distributed-Disperse
>> Volume ID: 68e082a6-9819-4885-856c-1510cd201bd9
>> Status: Started
>> Number of Bricks: 36 x (8 + 2) = 360
>> Transport-type: tcp
>> Bricks:
>> Brick1: 10.1.2.1:/media/disk1
>> Brick2: 10.1.2.2:/media/disk1
>> Brick3: 10.1.2.3:/media/disk1
>> Brick4: 10.1.2.4:/media/disk1
>> Brick5: 10.1.2.5:/media/disk1
>> Brick6: 10.1.2.6:/media/disk1
>> Brick7: 10.1.2.7:/media/disk1
>> Brick8: 10.1.2.8:/media/disk1
>> Brick9: 10.1.2.9:/media/disk1
>> Brick10: 10.1.2.10:/media/disk1
>> Brick11: 10.1.2.1:/media/disk2
>> Brick12: 10.1.2.2:/media/disk2
>> Brick13: 10.1.2.3:/media/disk2
>> Brick14: 10.1.2.4:/media/disk2
>> Brick15: 10.1.2.5:/media/disk2
>> Brick16: 10.1.2.6:/media/disk2
>> Brick17: 10.1.2.7:/media/disk2
>> Brick18: 10.1.2.8:/media/disk2
>> Brick19: 10.1.2.9:/media/disk2
>> Brick20: 10.1.2.10:/media/disk2
>> ...
>> ...
>> Brick351: 10.1.2.1:/media/disk36
>> Brick352: 10.1.2.2:/media/disk36
>> Brick353: 10.1.2.3:/media/disk36
>> Brick354: 10.1.2.4:/media/disk36
>> Brick355: 10.1.2.5:/media/disk36
>> Brick356: 10.1.2.6:/media/disk36
>> Brick357: 10.1.2.7:/media/disk36
>> Brick358: 10.1.2.8:/media/disk36
>> Brick359: 10.1.2.9:/media/disk36
>> Brick360: 10.1.2.10:/media/disk36
>> Options Reconfigured:
>> performance.readdir-ahead: on
>>
>> We did some performance testing and simulated proactive self-healing for
>> erasure coding. The disperse volume has been created across nodes.
>>
>> _*Description of problem*_
>>
>> I disconnected the *network of two nodes* and tried to write some video
>> files, and *glusterfs wrote the video files on the remaining 8 nodes
>> perfectly*. I tried to download an uploaded file and it was downloaded
>> perfectly. Then I re-enabled the network of the two nodes; the proactive
>> self-healing mechanism appeared to work and wrote the missing chunks of
>> data to the recently re-enabled nodes from the other 8 nodes. But when I
>> then tried to download the same file, it showed an Input/output error and
>> I couldn't download it. I think there is an issue in proactive self-healing.
>>
>> We also tried the simulation with a single-node network failure and faced
>> the same I/O error while downloading the file.
>>
>> _Error while downloading the file_
>>
>> root at master02:/home/admin# rsync -r --progress /mnt/gluster/file13_AN ./1/file13_AN-2
>> sending incremental file list
>> file13_AN
>>   3,342,355,597 100%    4.87MB/s    0:10:54 (xfr#1, to-chk=0/1)
>> rsync: read errors mapping "/mnt/gluster/file13_AN": Input/output error (5)
>> WARNING: file13_AN failed verification -- update discarded (will try again).
>>
>> root at master02:/home/admin# cp /mnt/gluster/file13_AN ./1/file13_AN-3
>> cp: error reading '/mnt/gluster/file13_AN': Input/output error
>> cp: failed to extend './1/file13_AN-3': Input/output error
>>
>> We can't tell whether the issue is with glusterfs 3.7.0 itself or with our
>> glusterfs configuration.
>>
>> Any help would be greatly appreciated
>>
>> --
>> Cheers
>> Backer
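
For reference, a volume with the layout described above could be created along these lines. This is only a sketch: the original create command does not appear in the thread, and the brick ordering (each group of ten consecutive bricks spanning all ten nodes) is assumed from the volume info.

  # Build the 360-brick list: disk1..disk36 on nodes 10.1.2.1..10.1.2.10,
  # ordered so that every group of 10 consecutive bricks spans all 10 nodes
  # and forms one 8+2 disperse subvolume.
  bricks=""
  for disk in $(seq 1 36); do
      for node in $(seq 1 10); do
          bricks="$bricks 10.1.2.$node:/media/disk$disk"
      done
  done

  # Create and start a 36 x (8 + 2) distributed-disperse volume.
  gluster volume create vaulttest5 disperse 10 redundancy 2 $bricks
  gluster volume start vaulttest5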
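When a disperse volume returns an Input/output error like this after healing, it can help to look at the state of the affected file on the individual bricks before drawing conclusions. A rough diagnostic sketch follows; the path assumes file13_AN sits at the root of the volume, and the trusted.ec.* attribute names are those used by the ec translator, worth checking against your exact version.

  # Check whether the self-heal daemon still reports pending entries
  # (assuming heal info is supported for disperse volumes on this build).
  gluster volume heal vaulttest5 info

  # On each node, dump the extended attributes of the file's fragment; on a
  # distributed-disperse volume the file lives on only one disk per node, the
  # subvolume it hashed to, so the glob matches a single brick. Mismatched
  # trusted.ec.version / trusted.ec.size values across bricks suggest the
  # fragments were left inconsistent by the heal.
  getfattr -d -m . -e hex /media/disk*/file13_AN

  # Fragment sizes should also roughly agree: with 8 data bricks per
  # subvolume, each fragment holds about 1/8 of the file's data.
  ls -l /media/disk*/file13_AN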
Xavier Hernandez
2015-May-27 10:01 UTC
[Gluster-users] Issue with Pro active self healing for Erasure coding
On 05/27/2015 11:26 AM, Mohamed Pakkeer wrote:
> Hi Xavier,
>
> Thanks for your reply. When can we expect the 3.7.1 release?

AFAIK a beta of 3.7.1 will be released very soon.
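
When re-testing this on 3.7.1, it may be worth making the pass/fail check explicit by recording a checksum of the file through the mount before the network cut and comparing it after the heal finishes. A sketch, reusing the mount point and file name from the earlier messages:

  # Before disconnecting the nodes: record a checksum of the file as read
  # through the glusterfs mount.
  md5sum /mnt/gluster/file13_AN | tee /root/file13_AN.md5

  # After the nodes are back and heal info shows no pending entries:
  # drop cached pages so the data is re-read from the bricks, then verify.
  echo 3 > /proc/sys/vm/drop_caches
  md5sum -c /root/file13_AN.md5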