----- Original Message -----
> From: "mabi" <mabi at protonmail.ch>
> To: "Ravishankar N" <ravishankar at redhat.com>
> Cc: "Ben Turner" <bturner at redhat.com>, "Gluster Users" <gluster-users at gluster.org>
> Sent: Sunday, August 27, 2017 3:15:33 PM
> Subject: Re: [Gluster-users] self-heal not working
>
> Thanks Ravi for your analysis. So as far as I understand nothing to worry about, but my question now would be: how do I get rid of this file from the heal info?

Correct me if I am wrong, but clearing this is just a matter of resetting the afr.dirty xattr? @Ravi - Is this correct?

-b

>
> > -------- Original Message --------
> > Subject: Re: [Gluster-users] self-heal not working
> > Local Time: August 27, 2017 3:45 PM
> > UTC Time: August 27, 2017 1:45 PM
> > From: ravishankar at redhat.com
> > To: mabi <mabi at protonmail.ch>
> > Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
> >
> > Yes, the shds did pick up the file for healing (I saw messages like " got entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards.
> >
> > Anyway, I reproduced it by manually setting the afr.dirty bit for a zero-byte file on all 3 bricks. Since there are no afr pending xattrs indicating good/bad copies and all files are zero bytes, the data self-heal algorithm just picks the file with the latest ctime as source. In your case that was the arbiter brick. In the code, there is a check to prevent data heals if the arbiter is the source. So heal was not happening and the entries were not removed from the heal-info output.
> >
> > Perhaps we should add a check in the code to just remove the entries from heal-info if the size is zero bytes on all bricks.
> >
> > -Ravi
> >
> > On 08/25/2017 06:33 PM, mabi wrote:
> >
> >> Hi Ravi,
> >>
> >> Did you get a chance to have a look at the log files I have attached in my last mail?
> >>
> >> Best,
> >> Mabi
> >>
> >>> -------- Original Message --------
> >>> Subject: Re: [Gluster-users] self-heal not working
> >>> Local Time: August 24, 2017 12:08 PM
> >>> UTC Time: August 24, 2017 10:08 AM
> >>> From: mabi at protonmail.ch
> >>> To: Ravishankar N [<ravishankar at redhat.com>](mailto:ravishankar at redhat.com)
> >>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)
> >>>
> >>> Thanks for confirming the command. I have now enabled the DEBUG client-log-level, run a heal and then attached the glustershd log files of all 3 nodes to this mail.
> >>>
> >>> The volume concerned is called myvol-pro; the other 3 volumes have no problem so far.
> >>>
> >>> Also note that in the meantime it looks like the file has been deleted by the user, and as such the heal info command does not show the file name anymore but just its GFID, which is:
> >>>
> >>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea
> >>>
> >>> Hope that helps for debugging this issue.
> >>>
> >>>> -------- Original Message --------
> >>>> Subject: Re: [Gluster-users] self-heal not working
> >>>> Local Time: August 24, 2017 5:58 AM
> >>>> UTC Time: August 24, 2017 3:58 AM
> >>>> From: ravishankar at redhat.com
> >>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch)
> >>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)
> >>>>
> >>>> Unlikely.
In your case only the afr.dirty is set, not the > >>>> afr.volname-client-xx xattr. > >>>> > >>>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is > >>>> right. > >>>> > >>>> On 08/23/2017 10:31 PM, mabi wrote: > >>>> > >>>>> I just saw the following bug which was fixed in 3.8.15: > >>>>> > >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 > >>>>> > >>>>> Is it possible that the problem I described in this post is related to > >>>>> that bug? > >>>>> > >>>>>> -------- Original Message -------- > >>>>>> Subject: Re: [Gluster-users] self-heal not working > >>>>>> Local Time: August 22, 2017 11:51 AM > >>>>>> UTC Time: August 22, 2017 9:51 AM > >>>>>> From: ravishankar at redhat.com > >>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster > >>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>> > >>>>>> On 08/22/2017 02:30 PM, mabi wrote: > >>>>>> > >>>>>>> Thanks for the additional hints, I have the following 2 questions > >>>>>>> first: > >>>>>>> > >>>>>>> - In order to launch the index heal is the following command correct: > >>>>>>> gluster volume heal myvolume > >>>>>> > >>>>>> Yes > >>>>>> > >>>>>>> - If I run a "volume start force" will it have any short disruptions > >>>>>>> on my clients which mount the volume through FUSE? If yes, how long? > >>>>>>> This is a production system that's why I am asking. > >>>>>> > >>>>>> No. You can actually create a test volume on your personal linux box > >>>>>> to try these kinds of things without needing multiple machines. This > >>>>>> is how we develop and test our patches :) > >>>>>> 'gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} > >>>>>> force` and so on. > >>>>>> > >>>>>> HTH, > >>>>>> Ravi > >>>>>> > >>>>>>>> -------- Original Message -------- > >>>>>>>> Subject: Re: [Gluster-users] self-heal not working > >>>>>>>> Local Time: August 22, 2017 6:26 AM > >>>>>>>> UTC Time: August 22, 2017 4:26 AM > >>>>>>>> From: ravishankar at redhat.com > >>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch), Ben > >>>>>>>> Turner [<bturner at redhat.com>](mailto:bturner at redhat.com) > >>>>>>>> Gluster Users > >>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>> > >>>>>>>> Explore the following: > >>>>>>>> > >>>>>>>> - Launch index heal and look at the glustershd logs of all bricks > >>>>>>>> for possible errors > >>>>>>>> > >>>>>>>> - See if the glustershd in each node is connected to all bricks. > >>>>>>>> > >>>>>>>> - If not try to restart shd by `volume start force` > >>>>>>>> > >>>>>>>> - Launch index heal again and try. > >>>>>>>> > >>>>>>>> - Try debugging the shd log by setting client-log-level to DEBUG > >>>>>>>> temporarily. 
> >>>>>>>> > >>>>>>>> On 08/22/2017 03:19 AM, mabi wrote: > >>>>>>>> > >>>>>>>>> Sure, it doesn't look like a split brain based on the output: > >>>>>>>>> > >>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick > >>>>>>>>> Status: Connected > >>>>>>>>> Number of entries in split-brain: 0 > >>>>>>>>> > >>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick > >>>>>>>>> Status: Connected > >>>>>>>>> Number of entries in split-brain: 0 > >>>>>>>>> > >>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick > >>>>>>>>> Status: Connected > >>>>>>>>> Number of entries in split-brain: 0 > >>>>>>>>> > >>>>>>>>>> -------- Original Message -------- > >>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working > >>>>>>>>>> Local Time: August 21, 2017 11:35 PM > >>>>>>>>>> UTC Time: August 21, 2017 9:35 PM > >>>>>>>>>> From: bturner at redhat.com > >>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>>>>>> Gluster Users > >>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>>>> > >>>>>>>>>> Can you also provide: > >>>>>>>>>> > >>>>>>>>>> gluster v heal <my vol> info split-brain > >>>>>>>>>> > >>>>>>>>>> If it is split brain just delete the incorrect file from the brick > >>>>>>>>>> and run heal again. I haven"t tried this with arbiter but I > >>>>>>>>>> assume the process is the same. > >>>>>>>>>> > >>>>>>>>>> -b > >>>>>>>>>> > >>>>>>>>>> ----- Original Message ----- > >>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>>>>>>> To: "Ben Turner" > >>>>>>>>>>> [<bturner at redhat.com>](mailto:bturner at redhat.com) > >>>>>>>>>>> Cc: "Gluster Users" > >>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>>>>> Sent: Monday, August 21, 2017 4:55:59 PM > >>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working > >>>>>>>>>>> > >>>>>>>>>>> Hi Ben, > >>>>>>>>>>> > >>>>>>>>>>> So it is really a 0 kBytes file everywhere (all nodes including > >>>>>>>>>>> the arbiter > >>>>>>>>>>> and from the client). > >>>>>>>>>>> Here below you will find the output you requested. Hopefully that > >>>>>>>>>>> will help > >>>>>>>>>>> to find out why this specific file is not healing... Let me know > >>>>>>>>>>> if you need > >>>>>>>>>>> any more information. Btw node3 is my arbiter node. > >>>>>>>>>>> > >>>>>>>>>>> NODE1: > >>>>>>>>>>> > >>>>>>>>>>> STAT: > >>>>>>>>>>> File: > >>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? > >>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file > >>>>>>>>>>> Device: 24h/36d Inode: 10033884 Links: 2 > >>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) > >>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 > >>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 > >>>>>>>>>>> Birth: - > >>>>>>>>>>> > >>>>>>>>>>> GETFATTR: > >>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA > >>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg=> >>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=> >>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo> >>>>>>>>>>> > >>>>>>>>>>> NODE2: > >>>>>>>>>>> > >>>>>>>>>>> STAT: > >>>>>>>>>>> File: > >>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? 
> >>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file > >>>>>>>>>>> Device: 26h/38d Inode: 10031330 Links: 2 > >>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) > >>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Modify: 2017-08-14 17:11:46.403704181 +0200 > >>>>>>>>>>> Change: 2017-08-14 17:11:46.403704181 +0200 > >>>>>>>>>>> Birth: - > >>>>>>>>>>> > >>>>>>>>>>> GETFATTR: > >>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA > >>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw=> >>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=> >>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE> >>>>>>>>>>> > >>>>>>>>>>> NODE3: > >>>>>>>>>>> STAT: > >>>>>>>>>>> File: > >>>>>>>>>>> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png > >>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 4096 regular empty file > >>>>>>>>>>> Device: ca11h/51729d Inode: 405208959 Links: 2 > >>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) > >>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Modify: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Change: 2017-08-14 17:11:46.604380051 +0200 > >>>>>>>>>>> Birth: - > >>>>>>>>>>> > >>>>>>>>>>> GETFATTR: > >>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA > >>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg=> >>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=> >>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4> >>>>>>>>>>> > >>>>>>>>>>> CLIENT GLUSTER MOUNT: > >>>>>>>>>>> STAT: > >>>>>>>>>>> File: > >>>>>>>>>>> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png" > >>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 131072 regular empty file > >>>>>>>>>>> Device: 1eh/30d Inode: 11897049013408443114 Links: 1 > >>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) > >>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 > >>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 > >>>>>>>>>>> Birth: - > >>>>>>>>>>> > >>>>>>>>>>> > -------- Original Message -------- > >>>>>>>>>>> > Subject: Re: [Gluster-users] self-heal not working > >>>>>>>>>>> > Local Time: August 21, 2017 9:34 PM > >>>>>>>>>>> > UTC Time: August 21, 2017 7:34 PM > >>>>>>>>>>> > From: bturner at redhat.com > >>>>>>>>>>> > To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>>>>>>> > Gluster Users > >>>>>>>>>>> > [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>>>>> > > >>>>>>>>>>> > ----- Original Message ----- > >>>>>>>>>>> >> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>>>>>>> >> To: "Gluster Users" > >>>>>>>>>>> >> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>>>>> >> Sent: Monday, August 21, 2017 9:28:24 AM > >>>>>>>>>>> >> Subject: [Gluster-users] self-heal not working > >>>>>>>>>>> >> > >>>>>>>>>>> >> Hi, > >>>>>>>>>>> >> > >>>>>>>>>>> >> I have a replicat 2 with arbiter GlusterFS 3.8.11 cluster and > >>>>>>>>>>> >> there is > >>>>>>>>>>> >> currently one file listed to be healed as you can see below > >>>>>>>>>>> >> but never gets > >>>>>>>>>>> >> healed by the self-heal daemon: > >>>>>>>>>>> >> > >>>>>>>>>>> >> Brick node1.domain.tld:/data/myvolume/brick > >>>>>>>>>>> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png > >>>>>>>>>>> >> Status: Connected > >>>>>>>>>>> >> Number of 
entries: 1 > >>>>>>>>>>> >> > >>>>>>>>>>> >> Brick node2.domain.tld:/data/myvolume/brick > >>>>>>>>>>> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png > >>>>>>>>>>> >> Status: Connected > >>>>>>>>>>> >> Number of entries: 1 > >>>>>>>>>>> >> > >>>>>>>>>>> >> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick > >>>>>>>>>>> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png > >>>>>>>>>>> >> Status: Connected > >>>>>>>>>>> >> Number of entries: 1 > >>>>>>>>>>> >> > >>>>>>>>>>> >> As once recommended on this mailing list I have mounted that > >>>>>>>>>>> >> glusterfs > >>>>>>>>>>> >> volume > >>>>>>>>>>> >> temporarily through fuse/glusterfs and ran a "stat" on that > >>>>>>>>>>> >> file which is > >>>>>>>>>>> >> listed above but nothing happened. > >>>>>>>>>>> >> > >>>>>>>>>>> >> The file itself is available on all 3 nodes/bricks but on the > >>>>>>>>>>> >> last node it > >>>>>>>>>>> >> has a different date. By the way this file is 0 kBytes big. Is > >>>>>>>>>>> >> that maybe > >>>>>>>>>>> >> the reason why the self-heal does not work? > >>>>>>>>>>> > > >>>>>>>>>>> > Is the file actually 0 bytes or is it just 0 bytes on the > >>>>>>>>>>> > arbiter(0 bytes > >>>>>>>>>>> > are expected on the arbiter, it just stores metadata)? Can you > >>>>>>>>>>> > send us the > >>>>>>>>>>> > output from stat on all 3 nodes: > >>>>>>>>>>> > > >>>>>>>>>>> > $ stat <file on back end brick> > >>>>>>>>>>> > $ getfattr -d -m - <file on back end brick> > >>>>>>>>>>> > $ stat <file from gluster mount> > >>>>>>>>>>> > > >>>>>>>>>>> > Lets see what things look like on the back end, it should tell > >>>>>>>>>>> > us why > >>>>>>>>>>> > healing is failing. > >>>>>>>>>>> > > >>>>>>>>>>> > -b > >>>>>>>>>>> > > >>>>>>>>>>> >> > >>>>>>>>>>> >> And how can I now make this file to heal? > >>>>>>>>>>> >> > >>>>>>>>>>> >> Thanks, > >>>>>>>>>>> >> Mabi > >>>>>>>>>>> >> > >>>>>>>>>>> >> > >>>>>>>>>>> >> > >>>>>>>>>>> >> > >>>>>>>>>>> >> _______________________________________________ > >>>>>>>>>>> >> Gluster-users mailing list > >>>>>>>>>>> >> Gluster-users at gluster.org > >>>>>>>>>>> >> http://lists.gluster.org/mailman/listinfo/gluster-users > >>>>>>>>> > >>>>>>>>> _______________________________________________ > >>>>>>>>> Gluster-users mailing list > >>>>>>>>> Gluster-users at gluster.org > >>>>>>>>> > >>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
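For reference, the trusted.afr.dirty value shown in the getfattr output above (0sAAAAAQAAAAAAAAAA) is getfattr's base64 encoding of 12 bytes that AFR treats as three big-endian 32-bit counters: data, metadata and entry pending operations. A non-destructive way to inspect it on a brick node might look like the sketch below; the file path is simply the one quoted in the thread and is only an example.

    # Decode the value quoted above (standard coreutils):
    echo AAAAAQAAAAAAAAAA | base64 -d | od -An -tx1
    #  00 00 00 01 00 00 00 00 00 00 00 00   -> data=1, metadata=0, entry=0 pending

    # Or read the xattr directly from the brick in hex:
    getfattr -n trusted.afr.dirty -e hex \
        /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
    # trusted.afr.dirty=0x000000010000000000000000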
On 08/28/2017 01:57 AM, Ben Turner wrote:
> ----- Original Message -----
>> From: "mabi" <mabi at protonmail.ch>
>> To: "Ravishankar N" <ravishankar at redhat.com>
>> Cc: "Ben Turner" <bturner at redhat.com>, "Gluster Users" <gluster-users at gluster.org>
>> Sent: Sunday, August 27, 2017 3:15:33 PM
>> Subject: Re: [Gluster-users] self-heal not working
>>
>> Thanks Ravi for your analysis. So as far as I understand nothing to worry about, but my question now would be: how do I get rid of this file from the heal info?
> Correct me if I am wrong, but clearing this is just a matter of resetting the afr.dirty xattr? @Ravi - Is this correct?

Yes, resetting the xattr and launching an index heal or running the heal-info command should serve as a workaround.
-Ravi

>
> -b
>
>>> -------- Original Message --------
>>> Subject: Re: [Gluster-users] self-heal not working
>>> Local Time: August 27, 2017 3:45 PM
>>> UTC Time: August 27, 2017 1:45 PM
>>> From: ravishankar at redhat.com
>>> To: mabi <mabi at protonmail.ch>
>>> Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
>>>
>>> Yes, the shds did pick up the file for healing (I saw messages like " got entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards.
>>>
>>> Anyway, I reproduced it by manually setting the afr.dirty bit for a zero-byte file on all 3 bricks. Since there are no afr pending xattrs indicating good/bad copies and all files are zero bytes, the data self-heal algorithm just picks the file with the latest ctime as source. In your case that was the arbiter brick. In the code, there is a check to prevent data heals if the arbiter is the source. So heal was not happening and the entries were not removed from the heal-info output.
>>>
>>> Perhaps we should add a check in the code to just remove the entries from heal-info if the size is zero bytes on all bricks.
>>>
>>> -Ravi
>>>
>>> On 08/25/2017 06:33 PM, mabi wrote:
>>>
>>>> Hi Ravi,
>>>>
>>>> Did you get a chance to have a look at the log files I have attached in my last mail?
>>>>
>>>> Best,
>>>> Mabi
>>>>
>>>>> -------- Original Message --------
>>>>> Subject: Re: [Gluster-users] self-heal not working
>>>>> Local Time: August 24, 2017 12:08 PM
>>>>> UTC Time: August 24, 2017 10:08 AM
>>>>> From: mabi at protonmail.ch
>>>>> To: Ravishankar N [<ravishankar at redhat.com>](mailto:ravishankar at redhat.com)
>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)
>>>>>
>>>>> Thanks for confirming the command. I have now enabled the DEBUG client-log-level, run a heal and then attached the glustershd log files of all 3 nodes to this mail.
>>>>>
>>>>> The volume concerned is called myvol-pro; the other 3 volumes have no problem so far.
>>>>>
>>>>> Also note that in the meantime it looks like the file has been deleted by the user, and as such the heal info command does not show the file name anymore but just its GFID, which is:
>>>>>
>>>>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea
>>>>>
>>>>> Hope that helps for debugging this issue.
>>>>> >>>>>> -------- Original Message -------- >>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>> Local Time: August 24, 2017 5:58 AM >>>>>> UTC Time: August 24, 2017 3:58 AM >>>>>> From: ravishankar at redhat.com >>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster >>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>> >>>>>> Unlikely. In your case only the afr.dirty is set, not the >>>>>> afr.volname-client-xx xattr. >>>>>> >>>>>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is >>>>>> right. >>>>>> >>>>>> On 08/23/2017 10:31 PM, mabi wrote: >>>>>> >>>>>>> I just saw the following bug which was fixed in 3.8.15: >>>>>>> >>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 >>>>>>> >>>>>>> Is it possible that the problem I described in this post is related to >>>>>>> that bug? >>>>>>> >>>>>>>> -------- Original Message -------- >>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>> Local Time: August 22, 2017 11:51 AM >>>>>>>> UTC Time: August 22, 2017 9:51 AM >>>>>>>> From: ravishankar at redhat.com >>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster >>>>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>> >>>>>>>> On 08/22/2017 02:30 PM, mabi wrote: >>>>>>>> >>>>>>>>> Thanks for the additional hints, I have the following 2 questions >>>>>>>>> first: >>>>>>>>> >>>>>>>>> - In order to launch the index heal is the following command correct: >>>>>>>>> gluster volume heal myvolume >>>>>>>> Yes >>>>>>>> >>>>>>>>> - If I run a "volume start force" will it have any short disruptions >>>>>>>>> on my clients which mount the volume through FUSE? If yes, how long? >>>>>>>>> This is a production system that's why I am asking. >>>>>>>> No. You can actually create a test volume on your personal linux box >>>>>>>> to try these kinds of things without needing multiple machines. This >>>>>>>> is how we develop and test our patches :) >>>>>>>> 'gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} >>>>>>>> force` and so on. >>>>>>>> >>>>>>>> HTH, >>>>>>>> Ravi >>>>>>>> >>>>>>>>>> -------- Original Message -------- >>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>> Local Time: August 22, 2017 6:26 AM >>>>>>>>>> UTC Time: August 22, 2017 4:26 AM >>>>>>>>>> From: ravishankar at redhat.com >>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch), Ben >>>>>>>>>> Turner [<bturner at redhat.com>](mailto:bturner at redhat.com) >>>>>>>>>> Gluster Users >>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>> >>>>>>>>>> Explore the following: >>>>>>>>>> >>>>>>>>>> - Launch index heal and look at the glustershd logs of all bricks >>>>>>>>>> for possible errors >>>>>>>>>> >>>>>>>>>> - See if the glustershd in each node is connected to all bricks. >>>>>>>>>> >>>>>>>>>> - If not try to restart shd by `volume start force` >>>>>>>>>> >>>>>>>>>> - Launch index heal again and try. >>>>>>>>>> >>>>>>>>>> - Try debugging the shd log by setting client-log-level to DEBUG >>>>>>>>>> temporarily. 
>>>>>>>>>> >>>>>>>>>> On 08/22/2017 03:19 AM, mabi wrote: >>>>>>>>>> >>>>>>>>>>> Sure, it doesn't look like a split brain based on the output: >>>>>>>>>>> >>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>> Status: Connected >>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>> >>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>> Status: Connected >>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>> >>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>> Status: Connected >>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>> >>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>> Local Time: August 21, 2017 11:35 PM >>>>>>>>>>>> UTC Time: August 21, 2017 9:35 PM >>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>> Gluster Users >>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>> >>>>>>>>>>>> Can you also provide: >>>>>>>>>>>> >>>>>>>>>>>> gluster v heal <my vol> info split-brain >>>>>>>>>>>> >>>>>>>>>>>> If it is split brain just delete the incorrect file from the brick >>>>>>>>>>>> and run heal again. I haven"t tried this with arbiter but I >>>>>>>>>>>> assume the process is the same. >>>>>>>>>>>> >>>>>>>>>>>> -b >>>>>>>>>>>> >>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>> To: "Ben Turner" >>>>>>>>>>>>> [<bturner at redhat.com>](mailto:bturner at redhat.com) >>>>>>>>>>>>> Cc: "Gluster Users" >>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>> Sent: Monday, August 21, 2017 4:55:59 PM >>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Ben, >>>>>>>>>>>>> >>>>>>>>>>>>> So it is really a 0 kBytes file everywhere (all nodes including >>>>>>>>>>>>> the arbiter >>>>>>>>>>>>> and from the client). >>>>>>>>>>>>> Here below you will find the output you requested. Hopefully that >>>>>>>>>>>>> will help >>>>>>>>>>>>> to find out why this specific file is not healing... Let me know >>>>>>>>>>>>> if you need >>>>>>>>>>>>> any more information. Btw node3 is my arbiter node. >>>>>>>>>>>>> >>>>>>>>>>>>> NODE1: >>>>>>>>>>>>> >>>>>>>>>>>>> STAT: >>>>>>>>>>>>> File: >>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? >>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>> Device: 24h/36d Inode: 10033884 Links: 2 >>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>> Birth: - >>>>>>>>>>>>> >>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg=>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo>>>>>>>>>>>>> >>>>>>>>>>>>> NODE2: >>>>>>>>>>>>> >>>>>>>>>>>>> STAT: >>>>>>>>>>>>> File: >>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? 
>>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>> Device: 26h/38d Inode: 10031330 Links: 2 >>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>> Change: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>> Birth: - >>>>>>>>>>>>> >>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw=>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE>>>>>>>>>>>>> >>>>>>>>>>>>> NODE3: >>>>>>>>>>>>> STAT: >>>>>>>>>>>>> File: >>>>>>>>>>>>> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 4096 regular empty file >>>>>>>>>>>>> Device: ca11h/51729d Inode: 405208959 Links: 2 >>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Modify: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Change: 2017-08-14 17:11:46.604380051 +0200 >>>>>>>>>>>>> Birth: - >>>>>>>>>>>>> >>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg=>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4>>>>>>>>>>>>> >>>>>>>>>>>>> CLIENT GLUSTER MOUNT: >>>>>>>>>>>>> STAT: >>>>>>>>>>>>> File: >>>>>>>>>>>>> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png" >>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 131072 regular empty file >>>>>>>>>>>>> Device: 1eh/30d Inode: 11897049013408443114 Links: 1 >>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>> Birth: - >>>>>>>>>>>>> >>>>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>> Local Time: August 21, 2017 9:34 PM >>>>>>>>>>>>>> UTC Time: August 21, 2017 7:34 PM >>>>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>> Gluster Users >>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>> >>>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>> To: "Gluster Users" >>>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>>> Sent: Monday, August 21, 2017 9:28:24 AM >>>>>>>>>>>>>>> Subject: [Gluster-users] self-heal not working >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I have a replicat 2 with arbiter GlusterFS 3.8.11 cluster and >>>>>>>>>>>>>>> there is >>>>>>>>>>>>>>> currently one file listed to be healed as you can see below >>>>>>>>>>>>>>> but never gets >>>>>>>>>>>>>>> healed by the self-heal daemon: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> As once recommended on this mailing list I have mounted that >>>>>>>>>>>>>>> glusterfs >>>>>>>>>>>>>>> volume >>>>>>>>>>>>>>> temporarily through fuse/glusterfs and ran a "stat" on that >>>>>>>>>>>>>>> file which is >>>>>>>>>>>>>>> listed above but nothing happened. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The file itself is available on all 3 nodes/bricks but on the >>>>>>>>>>>>>>> last node it >>>>>>>>>>>>>>> has a different date. By the way this file is 0 kBytes big. Is >>>>>>>>>>>>>>> that maybe >>>>>>>>>>>>>>> the reason why the self-heal does not work? >>>>>>>>>>>>>> Is the file actually 0 bytes or is it just 0 bytes on the >>>>>>>>>>>>>> arbiter(0 bytes >>>>>>>>>>>>>> are expected on the arbiter, it just stores metadata)? Can you >>>>>>>>>>>>>> send us the >>>>>>>>>>>>>> output from stat on all 3 nodes: >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ stat <file on back end brick> >>>>>>>>>>>>>> $ getfattr -d -m - <file on back end brick> >>>>>>>>>>>>>> $ stat <file from gluster mount> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Lets see what things look like on the back end, it should tell >>>>>>>>>>>>>> us why >>>>>>>>>>>>>> healing is failing. >>>>>>>>>>>>>> >>>>>>>>>>>>>> -b >>>>>>>>>>>>>> >>>>>>>>>>>>>>> And how can I now make this file to heal? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Mabi >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>> >>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
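A minimal sketch of the workaround described above (clear the dirty marker, then launch an index heal), assuming the brick paths and placeholder names quoted in this thread; the value written below simply zeroes the three pending counters, and anything like this should be verified on a test volume before touching production bricks.

    # On each node, against that node's own brick path
    # (node3's brick is under /srv/glusterfs/myvolume/brick instead):
    setfattr -n trusted.afr.dirty -v 0x000000000000000000000000 \
        /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png

    # Then trigger an index heal from any node and re-check the pending entries
    # (substitute the real volume name; the thread uses both "myvolume" and "myvol-pro"):
    gluster volume heal myvolume
    gluster volume heal myvolume info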
Excuse me for my naive questions, but how do I reset the afr.dirty xattr on the file to be healed? And do I need to do that through a FUSE mount, or simply on every brick directly?

> -------- Original Message --------
> Subject: Re: [Gluster-users] self-heal not working
> Local Time: August 28, 2017 5:58 AM
> UTC Time: August 28, 2017 3:58 AM
> From: ravishankar at redhat.com
> To: Ben Turner <bturner at redhat.com>, mabi <mabi at protonmail.ch>
> Gluster Users <gluster-users at gluster.org>
>
> On 08/28/2017 01:57 AM, Ben Turner wrote:
>> ----- Original Message -----
>>> From: "mabi" <mabi at protonmail.ch>
>>> To: "Ravishankar N" <ravishankar at redhat.com>
>>> Cc: "Ben Turner" <bturner at redhat.com>, "Gluster Users" <gluster-users at gluster.org>
>>> Sent: Sunday, August 27, 2017 3:15:33 PM
>>> Subject: Re: [Gluster-users] self-heal not working
>>>
>>> Thanks Ravi for your analysis. So as far as I understand nothing to worry about, but my question now would be: how do I get rid of this file from the heal info?
>> Correct me if I am wrong, but clearing this is just a matter of resetting the afr.dirty xattr? @Ravi - Is this correct?
>
> Yes, resetting the xattr and launching an index heal or running the heal-info command should serve as a workaround.
> -Ravi
>
>>
>> -b
>>
>>>> -------- Original Message --------
>>>> Subject: Re: [Gluster-users] self-heal not working
>>>> Local Time: August 27, 2017 3:45 PM
>>>> UTC Time: August 27, 2017 1:45 PM
>>>> From: ravishankar at redhat.com
>>>> To: mabi <mabi at protonmail.ch>
>>>> Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
>>>>
>>>> Yes, the shds did pick up the file for healing (I saw messages like " got entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards.
>>>>
>>>> Anyway, I reproduced it by manually setting the afr.dirty bit for a zero-byte file on all 3 bricks. Since there are no afr pending xattrs indicating good/bad copies and all files are zero bytes, the data self-heal algorithm just picks the file with the latest ctime as source. In your case that was the arbiter brick. In the code, there is a check to prevent data heals if the arbiter is the source. So heal was not happening and the entries were not removed from the heal-info output.
>>>>
>>>> Perhaps we should add a check in the code to just remove the entries from heal-info if the size is zero bytes on all bricks.
>>>>
>>>> -Ravi
>>>>
>>>> On 08/25/2017 06:33 PM, mabi wrote:
>>>>
>>>>> Hi Ravi,
>>>>>
>>>>> Did you get a chance to have a look at the log files I have attached in my last mail?
>>>>>
>>>>> Best,
>>>>> Mabi
>>>>>
>>>>>> -------- Original Message --------
>>>>>> Subject: Re: [Gluster-users] self-heal not working
>>>>>> Local Time: August 24, 2017 12:08 PM
>>>>>> UTC Time: August 24, 2017 10:08 AM
>>>>>> From: mabi at protonmail.ch
>>>>>> To: Ravishankar N [<ravishankar at redhat.com>](mailto:ravishankar at redhat.com)
>>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)
>>>>>>
>>>>>> Thanks for confirming the command. I have now enabled the DEBUG client-log-level, run a heal and then attached the glustershd log files of all 3 nodes to this mail.
>>>>>>
>>>>>> The volume concerned is called myvol-pro; the other 3 volumes have no problem so far.
>>>>>> >>>>>> Also note that in the mean time it looks like the file has been deleted >>>>>> by the user and as such the heal info command does not show the file >>>>>> name anymore but just is GFID which is: >>>>>> >>>>>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea >>>>>> >>>>>> Hope that helps for debugging this issue. >>>>>> >>>>>>> -------- Original Message -------- >>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>> Local Time: August 24, 2017 5:58 AM >>>>>>> UTC Time: August 24, 2017 3:58 AM >>>>>>> From: ravishankar at redhat.com >>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster >>>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>> >>>>>>> Unlikely. In your case only the afr.dirty is set, not the >>>>>>> afr.volname-client-xx xattr. >>>>>>> >>>>>>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is >>>>>>> right. >>>>>>> >>>>>>> On 08/23/2017 10:31 PM, mabi wrote: >>>>>>> >>>>>>>> I just saw the following bug which was fixed in 3.8.15: >>>>>>>> >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 >>>>>>>> >>>>>>>> Is it possible that the problem I described in this post is related to >>>>>>>> that bug? >>>>>>>> >>>>>>>>> -------- Original Message -------- >>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>> Local Time: August 22, 2017 11:51 AM >>>>>>>>> UTC Time: August 22, 2017 9:51 AM >>>>>>>>> From: ravishankar at redhat.com >>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster >>>>>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>> >>>>>>>>> On 08/22/2017 02:30 PM, mabi wrote: >>>>>>>>> >>>>>>>>>> Thanks for the additional hints, I have the following 2 questions >>>>>>>>>> first: >>>>>>>>>> >>>>>>>>>> - In order to launch the index heal is the following command correct: >>>>>>>>>> gluster volume heal myvolume >>>>>>>>> Yes >>>>>>>>> >>>>>>>>>> - If I run a "volume start force" will it have any short disruptions >>>>>>>>>> on my clients which mount the volume through FUSE? If yes, how long? >>>>>>>>>> This is a production system that"s why I am asking. >>>>>>>>> No. You can actually create a test volume on your personal linux box >>>>>>>>> to try these kinds of things without needing multiple machines. This >>>>>>>>> is how we develop and test our patches :) >>>>>>>>> "gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} >>>>>>>>> force` and so on. >>>>>>>>> >>>>>>>>> HTH, >>>>>>>>> Ravi >>>>>>>>> >>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>> Local Time: August 22, 2017 6:26 AM >>>>>>>>>>> UTC Time: August 22, 2017 4:26 AM >>>>>>>>>>> From: ravishankar at redhat.com >>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch), Ben >>>>>>>>>>> Turner [<bturner at redhat.com>](mailto:bturner at redhat.com) >>>>>>>>>>> Gluster Users >>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>> >>>>>>>>>>> Explore the following: >>>>>>>>>>> >>>>>>>>>>> - Launch index heal and look at the glustershd logs of all bricks >>>>>>>>>>> for possible errors >>>>>>>>>>> >>>>>>>>>>> - See if the glustershd in each node is connected to all bricks. 
>>>>>>>>>>> >>>>>>>>>>> - If not try to restart shd by `volume start force` >>>>>>>>>>> >>>>>>>>>>> - Launch index heal again and try. >>>>>>>>>>> >>>>>>>>>>> - Try debugging the shd log by setting client-log-level to DEBUG >>>>>>>>>>> temporarily. >>>>>>>>>>> >>>>>>>>>>> On 08/22/2017 03:19 AM, mabi wrote: >>>>>>>>>>> >>>>>>>>>>>> Sure, it doesn"t look like a split brain based on the output: >>>>>>>>>>>> >>>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>>> Status: Connected >>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>> >>>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>>> Status: Connected >>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>> >>>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>>> Status: Connected >>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>> >>>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>> Local Time: August 21, 2017 11:35 PM >>>>>>>>>>>>> UTC Time: August 21, 2017 9:35 PM >>>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>> Gluster Users >>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>> >>>>>>>>>>>>> Can you also provide: >>>>>>>>>>>>> >>>>>>>>>>>>> gluster v heal <my vol> info split-brain >>>>>>>>>>>>> >>>>>>>>>>>>> If it is split brain just delete the incorrect file from the brick >>>>>>>>>>>>> and run heal again. I haven"t tried this with arbiter but I >>>>>>>>>>>>> assume the process is the same. >>>>>>>>>>>>> >>>>>>>>>>>>> -b >>>>>>>>>>>>> >>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>> To: "Ben Turner" >>>>>>>>>>>>>> [<bturner at redhat.com>](mailto:bturner at redhat.com) >>>>>>>>>>>>>> Cc: "Gluster Users" >>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>> Sent: Monday, August 21, 2017 4:55:59 PM >>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Ben, >>>>>>>>>>>>>> >>>>>>>>>>>>>> So it is really a 0 kBytes file everywhere (all nodes including >>>>>>>>>>>>>> the arbiter >>>>>>>>>>>>>> and from the client). >>>>>>>>>>>>>> Here below you will find the output you requested. Hopefully that >>>>>>>>>>>>>> will help >>>>>>>>>>>>>> to find out why this specific file is not healing... Let me know >>>>>>>>>>>>>> if you need >>>>>>>>>>>>>> any more information. Btw node3 is my arbiter node. >>>>>>>>>>>>>> >>>>>>>>>>>>>> NODE1: >>>>>>>>>>>>>> >>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>> File: >>>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? 
>>>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>>> Device: 24h/36d Inode: 10033884 Links: 2 >>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>> >>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg=>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo>>>>>>>>>>>>>> >>>>>>>>>>>>>> NODE2: >>>>>>>>>>>>>> >>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>> File: >>>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? >>>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>>> Device: 26h/38d Inode: 10031330 Links: 2 >>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>> >>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw=>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE>>>>>>>>>>>>>> >>>>>>>>>>>>>> NODE3: >>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>> File: >>>>>>>>>>>>>> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 4096 regular empty file >>>>>>>>>>>>>> Device: ca11h/51729d Inode: 405208959 Links: 2 >>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Modify: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.604380051 +0200 >>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>> >>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg=>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4>>>>>>>>>>>>>> >>>>>>>>>>>>>> CLIENT GLUSTER MOUNT: >>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>> File: >>>>>>>>>>>>>> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png" >>>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 131072 regular empty file >>>>>>>>>>>>>> Device: 1eh/30d Inode: 11897049013408443114 Links: 1 >>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>> >>>>>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>>> Local Time: August 21, 2017 9:34 PM >>>>>>>>>>>>>>> UTC Time: August 21, 2017 7:34 PM >>>>>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>> Gluster Users >>>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at 
gluster.org) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>>> To: "Gluster Users" >>>>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>>>> Sent: Monday, August 21, 2017 9:28:24 AM >>>>>>>>>>>>>>>> Subject: [Gluster-users] self-heal not working >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have a replicat 2 with arbiter GlusterFS 3.8.11 cluster and >>>>>>>>>>>>>>>> there is >>>>>>>>>>>>>>>> currently one file listed to be healed as you can see below >>>>>>>>>>>>>>>> but never gets >>>>>>>>>>>>>>>> healed by the self-heal daemon: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As once recommended on this mailing list I have mounted that >>>>>>>>>>>>>>>> glusterfs >>>>>>>>>>>>>>>> volume >>>>>>>>>>>>>>>> temporarily through fuse/glusterfs and ran a "stat" on that >>>>>>>>>>>>>>>> file which is >>>>>>>>>>>>>>>> listed above but nothing happened. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The file itself is available on all 3 nodes/bricks but on the >>>>>>>>>>>>>>>> last node it >>>>>>>>>>>>>>>> has a different date. By the way this file is 0 kBytes big. Is >>>>>>>>>>>>>>>> that maybe >>>>>>>>>>>>>>>> the reason why the self-heal does not work? >>>>>>>>>>>>>>> Is the file actually 0 bytes or is it just 0 bytes on the >>>>>>>>>>>>>>> arbiter(0 bytes >>>>>>>>>>>>>>> are expected on the arbiter, it just stores metadata)? Can you >>>>>>>>>>>>>>> send us the >>>>>>>>>>>>>>> output from stat on all 3 nodes: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> $ stat <file on back end brick> >>>>>>>>>>>>>>> $ getfattr -d -m - <file on back end brick> >>>>>>>>>>>>>>> $ stat <file from gluster mount> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Lets see what things look like on the back end, it should tell >>>>>>>>>>>>>>> us why >>>>>>>>>>>>>>> healing is failing. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -b >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And how can I now make this file to heal? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Mabi >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>>> >>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20170828/dc304405/attachment.html>
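The brick-versus-FUSE part of the question is not settled in this excerpt, but two details from earlier in the thread are worth keeping in mind: the trusted.afr.* attributes were read directly from the brick back ends, and heal info now reports only a GFID because the original path is gone. On each brick a regular file can still be reached through its GFID hard link under .glusterfs, so checking what is left might look like the sketch below (paths taken from the thread; node3's brick root differs).

    BRICK=/data/myvolume/brick        # node3: /srv/glusterfs/myvolume/brick
    GFID=1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea

    # Backend hard links live under .glusterfs/<first 2 hex chars>/<next 2>/<full gfid>:
    ls -l "$BRICK/.glusterfs/19/85/$GFID"
    getfattr -d -m . -e hex "$BRICK/.glusterfs/19/85/$GFID"

If that hard link no longer exists on a brick, there is no xattr left to reset there and only the stale heal-info entry remains to be dealt with.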