Thank you for the command. I ran it on all my nodes and now finally the self-heal daemon does not report any files to be healed. Hopefully this scenario can get handled properly in newer versions of GlusterFS.

> -------- Original Message --------
> Subject: Re: [Gluster-users] self-heal not working
> Local Time: August 28, 2017 10:41 AM
> UTC Time: August 28, 2017 8:41 AM
> From: ravishankar at redhat.com
> To: mabi <mabi at protonmail.ch>
> Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
>
> On 08/28/2017 01:29 PM, mabi wrote:
>
>> Excuse me for my naive questions but how do I reset the afr.dirty xattr on the file to be healed? And do I need to do that through a FUSE mount, or simply on every brick directly?
>
> Directly on the bricks: `setfattr -n trusted.afr.dirty -v 0x000000000000000000000000 /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png`
> -Ravi
>
>>> -------- Original Message --------
>>> Subject: Re: [Gluster-users] self-heal not working
>>> Local Time: August 28, 2017 5:58 AM
>>> UTC Time: August 28, 2017 3:58 AM
>>> From: ravishankar at redhat.com
>>> To: Ben Turner <bturner at redhat.com>, mabi <mabi at protonmail.ch>
>>> Gluster Users <gluster-users at gluster.org>
>>>
>>> On 08/28/2017 01:57 AM, Ben Turner wrote:
>>>> ----- Original Message -----
>>>>> From: "mabi" <mabi at protonmail.ch>
>>>>> To: "Ravishankar N" <ravishankar at redhat.com>
>>>>> Cc: "Ben Turner" <bturner at redhat.com>, "Gluster Users" <gluster-users at gluster.org>
>>>>> Sent: Sunday, August 27, 2017 3:15:33 PM
>>>>> Subject: Re: [Gluster-users] self-heal not working
>>>>>
>>>>> Thanks Ravi for your analysis. So as far as I understand there is nothing to worry about, but my question now would be: how do I get rid of this file from the heal info?
>>>> Correct me if I am wrong but clearing this is just a matter of resetting the afr.dirty xattr? @Ravi - Is this correct?
>>>
>>> Yes, resetting the xattr and launching index heal or running the heal-info command should serve as a workaround.
>>> -Ravi
>>>
>>>>
>>>> -b
>>>>
>>>>>> -------- Original Message --------
>>>>>> Subject: Re: [Gluster-users] self-heal not working
>>>>>> Local Time: August 27, 2017 3:45 PM
>>>>>> UTC Time: August 27, 2017 1:45 PM
>>>>>> From: ravishankar at redhat.com
>>>>>> To: mabi <mabi at protonmail.ch>
>>>>>> Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
>>>>>>
>>>>>> Yes, the shds did pick up the file for healing (I saw messages like " got entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards.
>>>>>>
>>>>>> Anyway, I reproduced it by manually setting the afr.dirty bit for a zero-byte file on all 3 bricks. Since there are no afr pending xattrs indicating good/bad copies and all files are zero bytes, the data self-heal algorithm just picks the file with the latest ctime as source. In your case that was the arbiter brick. In the code, there is a check to prevent data heals if arbiter is the source. So heal was not happening and the entries were not removed from heal-info output.
>>>>>> >>>>>> Perhaps we should add a check in the code to just remove the entries from >>>>>> heal-info if size is zero bytes in all bricks. >>>>>> >>>>>> -Ravi >>>>>> >>>>>> On 08/25/2017 06:33 PM, mabi wrote: >>>>>> >>>>>>> Hi Ravi, >>>>>>> >>>>>>> Did you get a chance to have a look at the log files I have attached in my >>>>>>> last mail? >>>>>>> >>>>>>> Best, >>>>>>> Mabi >>>>>>> >>>>>>>> -------- Original Message -------- >>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>> Local Time: August 24, 2017 12:08 PM >>>>>>>> UTC Time: August 24, 2017 10:08 AM >>>>>>>> From: mabi at protonmail.ch >>>>>>>> To: Ravishankar N >>>>>>>> [[<ravishankar at redhat.com>](mailto:ravishankar at redhat.com)](mailto:ravishankar at redhat.com) >>>>>>>> Ben Turner [[<bturner at redhat.com>](mailto:bturner at redhat.com)](mailto:bturner at redhat.com), Gluster >>>>>>>> Users [[<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)](mailto:gluster-users at gluster.org) >>>>>>>> >>>>>>>> Thanks for confirming the command. I have now enabled DEBUG >>>>>>>> client-log-level, run a heal and then attached the glustershd log files >>>>>>>> of all 3 nodes in this mail. >>>>>>>> >>>>>>>> The volume concerned is called myvol-pro, the other 3 volumes have no >>>>>>>> problem so far. >>>>>>>> >>>>>>>> Also note that in the mean time it looks like the file has been deleted >>>>>>>> by the user and as such the heal info command does not show the file >>>>>>>> name anymore but just is GFID which is: >>>>>>>> >>>>>>>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea >>>>>>>> >>>>>>>> Hope that helps for debugging this issue. >>>>>>>> >>>>>>>>> -------- Original Message -------- >>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>> Local Time: August 24, 2017 5:58 AM >>>>>>>>> UTC Time: August 24, 2017 3:58 AM >>>>>>>>> From: ravishankar at redhat.com >>>>>>>>> To: mabi [[<mabi at protonmail.ch>](mailto:mabi at protonmail.ch)](mailto:mabi at protonmail.ch) >>>>>>>>> Ben Turner [[<bturner at redhat.com>](mailto:bturner at redhat.com)](mailto:bturner at redhat.com), Gluster >>>>>>>>> Users [[<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)](mailto:gluster-users at gluster.org) >>>>>>>>> >>>>>>>>> Unlikely. In your case only the afr.dirty is set, not the >>>>>>>>> afr.volname-client-xx xattr. >>>>>>>>> >>>>>>>>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is >>>>>>>>> right. >>>>>>>>> >>>>>>>>> On 08/23/2017 10:31 PM, mabi wrote: >>>>>>>>> >>>>>>>>>> I just saw the following bug which was fixed in 3.8.15: >>>>>>>>>> >>>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 >>>>>>>>>> >>>>>>>>>> Is it possible that the problem I described in this post is related to >>>>>>>>>> that bug? 
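For anyone following the same procedure, the debug-logging step discussed above boils down to something like the sequence below. `myvolume` stands for the affected volume (called myvol-pro in this thread) and the glustershd log path is only the usual default, so treat it as a sketch rather than exact instructions:

# Raise the client log level so that glustershd (which runs the client-side
# translators) logs at DEBUG. This affects all clients of the volume, so
# remember to put it back afterwards.
gluster volume set myvolume diagnostics.client-log-level DEBUG

# Trigger an index heal, then collect the self-heal daemon log from every
# node (default location on most installations).
gluster volume heal myvolume
less /var/log/glusterfs/glustershd.log

# Return the option to its default once the logs have been captured.
gluster volume reset myvolume diagnostics.client-log-level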
>>>>>>>>>>
>>>>>>>>>>> -------- Original Message --------
>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working
>>>>>>>>>>> Local Time: August 22, 2017 11:51 AM
>>>>>>>>>>> UTC Time: August 22, 2017 9:51 AM
>>>>>>>>>>> From: ravishankar at redhat.com
>>>>>>>>>>> To: mabi <mabi at protonmail.ch>
>>>>>>>>>>> Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
>>>>>>>>>>>
>>>>>>>>>>> On 08/22/2017 02:30 PM, mabi wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the additional hints, I have the following 2 questions first:
>>>>>>>>>>>>
>>>>>>>>>>>> - In order to launch the index heal, is the following command correct: gluster volume heal myvolume
>>>>>>>>>>> Yes
>>>>>>>>>>>
>>>>>>>>>>>> - If I run a "volume start force" will it have any short disruptions on my clients which mount the volume through FUSE? If yes, how long? This is a production system, that's why I am asking.
>>>>>>>>>>> No. You can actually create a test volume on your personal linux box to try these kinds of things without needing multiple machines. This is how we develop and test our patches :)
>>>>>>>>>>> `gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} force` and so on.
>>>>>>>>>>>
>>>>>>>>>>> HTH,
>>>>>>>>>>> Ravi
>>>>>>>>>>>
>>>>>>>>>>>>> -------- Original Message --------
>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working
>>>>>>>>>>>>> Local Time: August 22, 2017 6:26 AM
>>>>>>>>>>>>> UTC Time: August 22, 2017 4:26 AM
>>>>>>>>>>>>> From: ravishankar at redhat.com
>>>>>>>>>>>>> To: mabi <mabi at protonmail.ch>, Ben Turner <bturner at redhat.com>
>>>>>>>>>>>>> Gluster Users <gluster-users at gluster.org>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Explore the following:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Launch index heal and look at the glustershd logs of all bricks for possible errors
>>>>>>>>>>>>>
>>>>>>>>>>>>> - See if the glustershd in each node is connected to all bricks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - If not try to restart shd by `volume start force`
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Launch index heal again and try.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Try debugging the shd log by setting client-log-level to DEBUG temporarily.
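Spelled out as commands, the checklist above looks roughly as follows. The volume name is the one used elsewhere in the thread, and in the last step the `host:` prefix that `gluster volume create` normally expects for bricks has been added, so adjust to your own setup:

# Launch an index heal and see what is still pending.
gluster volume heal myvolume
gluster volume heal myvolume info

# Check that every brick and every self-heal daemon shows up as online.
gluster volume status myvolume

# If a self-heal daemon is down, restart the volume processes; already
# running bricks and client mounts are left alone.
gluster volume start myvolume force

# Optional: a throw-away replica-3 volume on a single box for experimenting,
# as suggested above.
gluster volume create testvol replica 3 $(hostname):/home/mabi/bricks/brick{1..3} force
gluster volume start testvol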
>>>>>>>>>>>>> >>>>>>>>>>>>> On 08/22/2017 03:19 AM, mabi wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Sure, it doesn"t look like a split brain based on the output: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>>>> >>>>>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>>>> >>>>>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>>>> >>>>>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>>> Local Time: August 21, 2017 11:35 PM >>>>>>>>>>>>>>> UTC Time: August 21, 2017 9:35 PM >>>>>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>>>>> To: mabi [[<mabi at protonmail.ch>](mailto:mabi at protonmail.ch)](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>> Gluster Users >>>>>>>>>>>>>>> [[<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Can you also provide: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> gluster v heal <my vol> info split-brain >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> If it is split brain just delete the incorrect file from the brick >>>>>>>>>>>>>>> and run heal again. I haven"t tried this with arbiter but I >>>>>>>>>>>>>>> assume the process is the same. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -b >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>>>> From: "mabi" [[<mabi at protonmail.ch>](mailto:mabi at protonmail.ch)](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>>> To: "Ben Turner" >>>>>>>>>>>>>>>> [[<bturner at redhat.com>](mailto:bturner at redhat.com)](mailto:bturner at redhat.com) >>>>>>>>>>>>>>>> Cc: "Gluster Users" >>>>>>>>>>>>>>>> [[<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>>>> Sent: Monday, August 21, 2017 4:55:59 PM >>>>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Ben, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> So it is really a 0 kBytes file everywhere (all nodes including >>>>>>>>>>>>>>>> the arbiter >>>>>>>>>>>>>>>> and from the client). >>>>>>>>>>>>>>>> Here below you will find the output you requested. Hopefully that >>>>>>>>>>>>>>>> will help >>>>>>>>>>>>>>>> to find out why this specific file is not healing... Let me know >>>>>>>>>>>>>>>> if you need >>>>>>>>>>>>>>>> any more information. Btw node3 is my arbiter node. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> NODE1: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>>>> File: >>>>>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? 
>>>>>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>>>>> Device: 24h/36d Inode: 10033884 Links: 2 >>>>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg=>>>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> NODE2: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>>>> File: >>>>>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? >>>>>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>>>>> Device: 26h/38d Inode: 10031330 Links: 2 >>>>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw=>>>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> NODE3: >>>>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>>>> File: >>>>>>>>>>>>>>>> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 4096 regular empty file >>>>>>>>>>>>>>>> Device: ca11h/51729d Inode: 405208959 Links: 2 >>>>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>>>> Modify: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.604380051 +0200 >>>>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg=>>>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> CLIENT GLUSTER MOUNT: >>>>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>>>> File: >>>>>>>>>>>>>>>> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png" >>>>>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 131072 regular empty file >>>>>>>>>>>>>>>> Device: 1eh/30d Inode: 11897049013408443114 Links: 1 >>>>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>>>>> Local Time: August 21, 2017 9:34 PM >>>>>>>>>>>>>>>>> UTC Time: August 21, 2017 7:34 PM >>>>>>>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>>>>>>> To: mabi [[<mabi at 
protonmail.ch>](mailto:mabi at protonmail.ch)](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>>>> Gluster Users >>>>>>>>>>>>>>>>> [[<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>>>>>> From: "mabi" [[<mabi at protonmail.ch>](mailto:mabi at protonmail.ch)](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>>>>> To: "Gluster Users" >>>>>>>>>>>>>>>>>> [[<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>>>>>> Sent: Monday, August 21, 2017 9:28:24 AM >>>>>>>>>>>>>>>>>> Subject: [Gluster-users] self-heal not working >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have a replicat 2 with arbiter GlusterFS 3.8.11 cluster and >>>>>>>>>>>>>>>>>> there is >>>>>>>>>>>>>>>>>> currently one file listed to be healed as you can see below >>>>>>>>>>>>>>>>>> but never gets >>>>>>>>>>>>>>>>>> healed by the self-heal daemon: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> As once recommended on this mailing list I have mounted that >>>>>>>>>>>>>>>>>> glusterfs >>>>>>>>>>>>>>>>>> volume >>>>>>>>>>>>>>>>>> temporarily through fuse/glusterfs and ran a "stat" on that >>>>>>>>>>>>>>>>>> file which is >>>>>>>>>>>>>>>>>> listed above but nothing happened. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The file itself is available on all 3 nodes/bricks but on the >>>>>>>>>>>>>>>>>> last node it >>>>>>>>>>>>>>>>>> has a different date. By the way this file is 0 kBytes big. Is >>>>>>>>>>>>>>>>>> that maybe >>>>>>>>>>>>>>>>>> the reason why the self-heal does not work? >>>>>>>>>>>>>>>>> Is the file actually 0 bytes or is it just 0 bytes on the >>>>>>>>>>>>>>>>> arbiter(0 bytes >>>>>>>>>>>>>>>>> are expected on the arbiter, it just stores metadata)? Can you >>>>>>>>>>>>>>>>> send us the >>>>>>>>>>>>>>>>> output from stat on all 3 nodes: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> $ stat <file on back end brick> >>>>>>>>>>>>>>>>> $ getfattr -d -m - <file on back end brick> >>>>>>>>>>>>>>>>> $ stat <file from gluster mount> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Lets see what things look like on the back end, it should tell >>>>>>>>>>>>>>>>> us why >>>>>>>>>>>>>>>>> healing is failing. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -b >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> And how can I now make this file to heal? 
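Concretely, for the file in this thread, the diagnostics Ben asks for above amount to running something like the following on node1 and node2, the corresponding /srv/glusterfs path on node3 (the arbiter), and once from a client. The `-e hex` flag is just a convenience so the xattr values are easier to compare than the base64 default:

# On each storage node, against the brick copy of the file:
stat /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
getfattr -d -m . -e hex /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png

# On node3 the brick lives under /srv/glusterfs instead:
stat /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
getfattr -d -m . -e hex /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png

# And once through the FUSE mount on a client:
stat /mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png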
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Mabi
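Putting the whole thread together, the workaround that eventually cleared the entry is essentially the following sequence. This is only a sketch assembled from the commands quoted above (brick path and volume name are the ones from this thread), and it only applies when, as here, the file really is zero bytes on all bricks and only trusted.afr.dirty is set:

# 1. Inspect the afr xattrs on every brick; the base64 value
#    0sAAAAAQAAAAAAAAAA seen earlier corresponds to hex
#    0x000000010000000000000000.
getfattr -d -m . -e hex /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png

# 2. Reset the dirty xattr directly on the brick, on every node.
setfattr -n trusted.afr.dirty -v 0x000000000000000000000000 /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png

# 3. Launch an index heal (or simply run heal-info) so the entry gets
#    re-examined and dropped from the pending list.
gluster volume heal myvolume
gluster volume heal myvolume info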
Great, can you raise a bug for the issue so that it is easier to keep track of it (plus you'll be notified when the patch is posted)? The general guidelines are @ https://gluster.readthedocs.io/en/latest/Contributors-Guide/Bug-Reporting-Guidelines but you just need to provide in the bug whatever you described in this email thread, i.e. volume info, heal info, getfattr and stat output of the file in question.

Thanks!
Ravi

On 08/28/2017 07:49 PM, mabi wrote:
> Thank you for the command. I ran it on all my nodes and now finally
> the self-heal daemon does not report any files to be healed.
> Hopefully this scenario can get handled properly in newer versions of
> GlusterFS.
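For completeness, the information Ravi asks to be attached to the bug can be collected along these lines; the volume name is the myvol-pro volume mentioned earlier in the thread, and the per-file stat/getfattr output is the same kind already shown above:

gluster volume info myvol-pro
gluster volume heal myvol-pro info
gluster volume heal myvol-pro info split-brain

# plus, from each brick, stat and getfattr -d -m . -e hex output for the
# affected file (or for its GFID entry under the brick's .glusterfs
# directory if the original path no longer exists).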
As suggested I have now opened a bug on bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1486063

> -------- Original Message --------
> Subject: Re: [Gluster-users] self-heal not working
> Local Time: August 28, 2017 4:29 PM
> UTC Time: August 28, 2017 2:29 PM
> From: ravishankar at redhat.com
> To: mabi <mabi at protonmail.ch>
> Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
>
> Great, can you raise a bug for the issue so that it is easier to keep track of it (plus you'll be notified when the patch is posted)? The general guidelines are @ https://gluster.readthedocs.io/en/latest/Contributors-Guide/Bug-Reporting-Guidelines but you just need to provide in the bug whatever you described in this email thread:
>
> i.e. volume info, heal info, getfattr and stat output of the file in question.
>
> Thanks!
> Ravi
>
> On 08/28/2017 07:49 PM, mabi wrote:
>
>> Thank you for the command. I ran it on all my nodes and now finally the self-heal daemon does not report any files to be healed. Hopefully this scenario can get handled properly in newer versions of GlusterFS.