Mohamed Pakkeer
2015-Jun-15 07:25 UTC
[Gluster-users] Issue with Pro active self healing for Erasure coding
Hi Xavier,

When can we expect the 3.7.2 release for fixing the I/O error which we discussed on this mail thread?

Thanks
Backer

On Wed, May 27, 2015 at 8:02 PM, Xavier Hernandez <xhernandez at datalab.es> wrote:
> Hi again,
>
> in today's gluster meeting [1] it has been decided that 3.7.1 will be
> released urgently to solve a bug in glusterd. All fixes planned for 3.7.1
> will be moved to 3.7.2, which will be released soon after.
>
> Xavi
>
> [1] http://meetbot.fedoraproject.org/gluster-meeting/2015-05-27/gluster-meeting.2015-05-27-12.01.html
>
> On 05/27/2015 12:01 PM, Xavier Hernandez wrote:
>> On 05/27/2015 11:26 AM, Mohamed Pakkeer wrote:
>>> Hi Xavier,
>>>
>>> Thanks for your reply. When can we expect the 3.7.1 release?
>>
>> AFAIK a beta of 3.7.1 will be released very soon.
>>
>>> cheers
>>> Backer
>>>
>>> On Wed, May 27, 2015 at 1:22 PM, Xavier Hernandez <xhernandez at datalab.es> wrote:
>>>
>>> Hi,
>>>
>>> some Input/Output error issues have been identified and fixed. These
>>> fixes will be available on 3.7.1.
>>>
>>> Xavi
>>>
>>> On 05/26/2015 10:15 AM, Mohamed Pakkeer wrote:
>>>
>>> Hi Glusterfs Experts,
>>>
>>> We are testing the glusterfs 3.7.0 tarball on our 10-node glusterfs
>>> cluster. Each node has 36 drives; please find the volume info below:
>>>
>>> Volume Name: vaulttest5
>>> Type: Distributed-Disperse
>>> Volume ID: 68e082a6-9819-4885-856c-1510cd201bd9
>>> Status: Started
>>> Number of Bricks: 36 x (8 + 2) = 360
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 10.1.2.1:/media/disk1
>>> Brick2: 10.1.2.2:/media/disk1
>>> Brick3: 10.1.2.3:/media/disk1
>>> Brick4: 10.1.2.4:/media/disk1
>>> Brick5: 10.1.2.5:/media/disk1
>>> Brick6: 10.1.2.6:/media/disk1
>>> Brick7: 10.1.2.7:/media/disk1
>>> Brick8: 10.1.2.8:/media/disk1
>>> Brick9: 10.1.2.9:/media/disk1
>>> Brick10: 10.1.2.10:/media/disk1
>>> Brick11: 10.1.2.1:/media/disk2
>>> Brick12: 10.1.2.2:/media/disk2
>>> Brick13: 10.1.2.3:/media/disk2
>>> Brick14: 10.1.2.4:/media/disk2
>>> Brick15: 10.1.2.5:/media/disk2
>>> Brick16: 10.1.2.6:/media/disk2
>>> Brick17: 10.1.2.7:/media/disk2
>>> Brick18: 10.1.2.8:/media/disk2
>>> Brick19: 10.1.2.9:/media/disk2
>>> Brick20: 10.1.2.10:/media/disk2
>>> ...
>>> ....
>>> Brick351: 10.1.2.1:/media/disk36
>>> Brick352: 10.1.2.2:/media/disk36
>>> Brick353: 10.1.2.3:/media/disk36
>>> Brick354: 10.1.2.4:/media/disk36
>>> Brick355: 10.1.2.5:/media/disk36
>>> Brick356: 10.1.2.6:/media/disk36
>>> Brick357: 10.1.2.7:/media/disk36
>>> Brick358: 10.1.2.8:/media/disk36
>>> Brick359: 10.1.2.9:/media/disk36
>>> Brick360: 10.1.2.10:/media/disk36
>>> Options Reconfigured:
>>> performance.readdir-ahead: on
>>>
>>> We did some performance testing and simulated proactive self-healing
>>> for erasure coding. The disperse volume has been created across nodes.
>>>
>>> Description of problem
>>>
>>> I disconnected the network of two nodes and wrote some video files;
>>> glusterfs wrote the video files to the remaining 8 nodes perfectly. I
>>> tried to download the uploaded file and it downloaded perfectly. Then I
>>> re-enabled the network of the two nodes, and the proactive self-healing
>>> mechanism worked and wrote the unavailable chunks of data to the
>>> recently re-enabled nodes from the other 8 nodes. But when I tried to
>>> download the same file again, it showed an Input/Output error and I
>>> couldn't download the file. I think there is an issue in proactive
>>> self-healing.
>>>
>>> We also tried the simulation with a single-node network failure and
>>> faced the same I/O error while downloading the file.
>>>
>>> Error while downloading the file:
>>>
>>> root at master02:/home/admin# rsync -r --progress /mnt/gluster/file13_AN ./1/file13_AN-2
>>> sending incremental file list
>>> file13_AN
>>> 3,342,355,597 100% 4.87MB/s 0:10:54 (xfr#1, to-chk=0/1)
>>> rsync: read errors mapping "/mnt/gluster/file13_AN": Input/output error (5)
>>> WARNING: file13_AN failed verification -- update discarded (will try again).
>>>
>>> root at master02:/home/admin# cp /mnt/gluster/file13_AN ./1/file13_AN-3
>>> cp: error reading '/mnt/gluster/file13_AN': Input/output error
>>> cp: failed to extend './1/file13_AN-3': Input/output error
>>>
>>> We can't tell whether the issue is in glusterfs 3.7.0 or in our
>>> glusterfs configuration.
>>>
>>> Any help would be greatly appreciated.
>>>
>>> --
>>> Cheers
>>> Backer
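[Editorial note: the checks below are a rough sketch of how the scenario quoted
above could be re-verified after the disconnected nodes come back, not the exact
commands used in the thread. The volume name (vaulttest5), mount point
(/mnt/gluster), brick paths (/media/diskN) and file name (file13_AN) are taken
from the message; heal-info behaviour and the disperse xattr names may vary
between 3.7.x builds.]

  # 1. Let the disperse heal queue drain before reading the file again
  #    (heal info for disperse volumes, if supported by your 3.7 build):
  gluster volume heal vaulttest5 info

  # 2. On a healed node and on a node that stayed online, compare the
  #    erasure-coding metadata (trusted.ec.*) of the fragment on whichever
  #    brick of the subvolume actually holds the file; a mismatch would
  #    point at an incomplete heal:
  getfattr -m . -d -e hex /media/disk1/file13_AN

  # 3. Verify the file can be read end-to-end through the mount:
  md5sum /mnt/gluster/file13_AN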
Xavier Hernandez
2015-Jun-15 07:56 UTC
[Gluster-users] Issue with Pro active self healing for Erasure coding
On 06/15/2015 09:25 AM, Mohamed Pakkeer wrote:
> Hi Xavier,
>
> When can we expect the 3.7.2 release for fixing the I/O error which we
> discussed on this mail thread?

As per the latest meeting, held last Wednesday [1], it will be released this week.

Xavi

[1] http://meetbot.fedoraproject.org/gluster-meeting/2015-06-10/gluster-meeting.2015-06-10-12.01.html
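[Editorial note: once 3.7.2 is released, a quick re-test of the failing read
might look like the sketch below. Package names and upgrade steps depend on the
distribution; the volume, mount point and file name are the ones from the
earlier message.]

  glusterfs --version                     # should report 3.7.2 on clients and servers
  gluster volume heal vaulttest5 info     # wait until no heal entries are pending
  md5sum /mnt/gluster/file13_AN           # the read should now complete without an I/O error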