----- Original Message -----
> From: "mabi" <mabi at protonmail.ch>
> To: "Ravishankar N" <ravishankar at redhat.com>
> Cc: "Ben Turner" <bturner at redhat.com>, "Gluster Users" <gluster-users at gluster.org>
> Sent: Sunday, August 27, 2017 3:15:33 PM
> Subject: Re: [Gluster-users] self-heal not working
>
> Thanks Ravi for your analysis. So as far as I understand nothing to worry about, but my question now would be: how do I get rid of this file from the heal info?

Correct me if I am wrong, but clearing this is just a matter of resetting the afr.dirty xattr? @Ravi - Is this correct?

-b

>
> > -------- Original Message --------
> > Subject: Re: [Gluster-users] self-heal not working
> > Local Time: August 27, 2017 3:45 PM
> > UTC Time: August 27, 2017 1:45 PM
> > From: ravishankar at redhat.com
> > To: mabi <mabi at protonmail.ch>
> > Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
> >
> > Yes, the shds did pick up the file for healing (I saw messages like " got entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards.
> >
> > Anyway, I reproduced it by manually setting the afr.dirty bit for a zero-byte file on all 3 bricks. Since there are no afr pending xattrs indicating good/bad copies and all files are zero bytes, the data self-heal algorithm just picks the file with the latest ctime as source. In your case that was the arbiter brick. In the code, there is a check to prevent data heals if the arbiter is the source. So heal was not happening and the entries were not removed from the heal-info output.
> >
> > Perhaps we should add a check in the code to just remove the entries from heal-info if the size is zero bytes on all bricks.
> >
> > -Ravi
> >
> > On 08/25/2017 06:33 PM, mabi wrote:
> >
> >> Hi Ravi,
> >>
> >> Did you get a chance to have a look at the log files I have attached in my last mail?
> >>
> >> Best,
> >> Mabi
> >>
> >>> -------- Original Message --------
> >>> Subject: Re: [Gluster-users] self-heal not working
> >>> Local Time: August 24, 2017 12:08 PM
> >>> UTC Time: August 24, 2017 10:08 AM
> >>> From: mabi at protonmail.ch
> >>> To: Ravishankar N [<ravishankar at redhat.com>](mailto:ravishankar at redhat.com)
> >>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)
> >>>
> >>> Thanks for confirming the command. I have now enabled the DEBUG client-log-level, run a heal and then attached the glustershd log files of all 3 nodes to this mail.
> >>>
> >>> The volume concerned is called myvol-pro; the other 3 volumes have no problem so far.
> >>>
> >>> Also note that in the meantime it looks like the file has been deleted by the user, and as such the heal info command does not show the file name anymore but just its GFID, which is:
> >>>
> >>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea
> >>>
> >>> Hope that helps for debugging this issue.
> >>>
> >>>> -------- Original Message --------
> >>>> Subject: Re: [Gluster-users] self-heal not working
> >>>> Local Time: August 24, 2017 5:58 AM
> >>>> UTC Time: August 24, 2017 3:58 AM
> >>>> From: ravishankar at redhat.com
> >>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch)
> >>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)
> >>>>
> >>>> Unlikely.
In your case only the afr.dirty is set, not the > >>>> afr.volname-client-xx xattr. > >>>> > >>>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is > >>>> right. > >>>> > >>>> On 08/23/2017 10:31 PM, mabi wrote: > >>>> > >>>>> I just saw the following bug which was fixed in 3.8.15: > >>>>> > >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 > >>>>> > >>>>> Is it possible that the problem I described in this post is related to > >>>>> that bug? > >>>>> > >>>>>> -------- Original Message -------- > >>>>>> Subject: Re: [Gluster-users] self-heal not working > >>>>>> Local Time: August 22, 2017 11:51 AM > >>>>>> UTC Time: August 22, 2017 9:51 AM > >>>>>> From: ravishankar at redhat.com > >>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster > >>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>> > >>>>>> On 08/22/2017 02:30 PM, mabi wrote: > >>>>>> > >>>>>>> Thanks for the additional hints, I have the following 2 questions > >>>>>>> first: > >>>>>>> > >>>>>>> - In order to launch the index heal is the following command correct: > >>>>>>> gluster volume heal myvolume > >>>>>> > >>>>>> Yes > >>>>>> > >>>>>>> - If I run a "volume start force" will it have any short disruptions > >>>>>>> on my clients which mount the volume through FUSE? If yes, how long? > >>>>>>> This is a production system that's why I am asking. > >>>>>> > >>>>>> No. You can actually create a test volume on your personal linux box > >>>>>> to try these kinds of things without needing multiple machines. This > >>>>>> is how we develop and test our patches :) > >>>>>> 'gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} > >>>>>> force` and so on. > >>>>>> > >>>>>> HTH, > >>>>>> Ravi > >>>>>> > >>>>>>>> -------- Original Message -------- > >>>>>>>> Subject: Re: [Gluster-users] self-heal not working > >>>>>>>> Local Time: August 22, 2017 6:26 AM > >>>>>>>> UTC Time: August 22, 2017 4:26 AM > >>>>>>>> From: ravishankar at redhat.com > >>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch), Ben > >>>>>>>> Turner [<bturner at redhat.com>](mailto:bturner at redhat.com) > >>>>>>>> Gluster Users > >>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>> > >>>>>>>> Explore the following: > >>>>>>>> > >>>>>>>> - Launch index heal and look at the glustershd logs of all bricks > >>>>>>>> for possible errors > >>>>>>>> > >>>>>>>> - See if the glustershd in each node is connected to all bricks. > >>>>>>>> > >>>>>>>> - If not try to restart shd by `volume start force` > >>>>>>>> > >>>>>>>> - Launch index heal again and try. > >>>>>>>> > >>>>>>>> - Try debugging the shd log by setting client-log-level to DEBUG > >>>>>>>> temporarily. 
> >>>>>>>> > >>>>>>>> On 08/22/2017 03:19 AM, mabi wrote: > >>>>>>>> > >>>>>>>>> Sure, it doesn't look like a split brain based on the output: > >>>>>>>>> > >>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick > >>>>>>>>> Status: Connected > >>>>>>>>> Number of entries in split-brain: 0 > >>>>>>>>> > >>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick > >>>>>>>>> Status: Connected > >>>>>>>>> Number of entries in split-brain: 0 > >>>>>>>>> > >>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick > >>>>>>>>> Status: Connected > >>>>>>>>> Number of entries in split-brain: 0 > >>>>>>>>> > >>>>>>>>>> -------- Original Message -------- > >>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working > >>>>>>>>>> Local Time: August 21, 2017 11:35 PM > >>>>>>>>>> UTC Time: August 21, 2017 9:35 PM > >>>>>>>>>> From: bturner at redhat.com > >>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>>>>>> Gluster Users > >>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>>>> > >>>>>>>>>> Can you also provide: > >>>>>>>>>> > >>>>>>>>>> gluster v heal <my vol> info split-brain > >>>>>>>>>> > >>>>>>>>>> If it is split brain just delete the incorrect file from the brick > >>>>>>>>>> and run heal again. I haven"t tried this with arbiter but I > >>>>>>>>>> assume the process is the same. > >>>>>>>>>> > >>>>>>>>>> -b > >>>>>>>>>> > >>>>>>>>>> ----- Original Message ----- > >>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>>>>>>> To: "Ben Turner" > >>>>>>>>>>> [<bturner at redhat.com>](mailto:bturner at redhat.com) > >>>>>>>>>>> Cc: "Gluster Users" > >>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>>>>> Sent: Monday, August 21, 2017 4:55:59 PM > >>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working > >>>>>>>>>>> > >>>>>>>>>>> Hi Ben, > >>>>>>>>>>> > >>>>>>>>>>> So it is really a 0 kBytes file everywhere (all nodes including > >>>>>>>>>>> the arbiter > >>>>>>>>>>> and from the client). > >>>>>>>>>>> Here below you will find the output you requested. Hopefully that > >>>>>>>>>>> will help > >>>>>>>>>>> to find out why this specific file is not healing... Let me know > >>>>>>>>>>> if you need > >>>>>>>>>>> any more information. Btw node3 is my arbiter node. > >>>>>>>>>>> > >>>>>>>>>>> NODE1: > >>>>>>>>>>> > >>>>>>>>>>> STAT: > >>>>>>>>>>> File: > >>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? > >>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file > >>>>>>>>>>> Device: 24h/36d Inode: 10033884 Links: 2 > >>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) > >>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 > >>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 > >>>>>>>>>>> Birth: - > >>>>>>>>>>> > >>>>>>>>>>> GETFATTR: > >>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA > >>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg=> >>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=> >>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo> >>>>>>>>>>> > >>>>>>>>>>> NODE2: > >>>>>>>>>>> > >>>>>>>>>>> STAT: > >>>>>>>>>>> File: > >>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? 
> >>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file > >>>>>>>>>>> Device: 26h/38d Inode: 10031330 Links: 2 > >>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) > >>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Modify: 2017-08-14 17:11:46.403704181 +0200 > >>>>>>>>>>> Change: 2017-08-14 17:11:46.403704181 +0200 > >>>>>>>>>>> Birth: - > >>>>>>>>>>> > >>>>>>>>>>> GETFATTR: > >>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA > >>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw=> >>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=> >>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE> >>>>>>>>>>> > >>>>>>>>>>> NODE3: > >>>>>>>>>>> STAT: > >>>>>>>>>>> File: > >>>>>>>>>>> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png > >>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 4096 regular empty file > >>>>>>>>>>> Device: ca11h/51729d Inode: 405208959 Links: 2 > >>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) > >>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Modify: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Change: 2017-08-14 17:11:46.604380051 +0200 > >>>>>>>>>>> Birth: - > >>>>>>>>>>> > >>>>>>>>>>> GETFATTR: > >>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA > >>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg=> >>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=> >>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4> >>>>>>>>>>> > >>>>>>>>>>> CLIENT GLUSTER MOUNT: > >>>>>>>>>>> STAT: > >>>>>>>>>>> File: > >>>>>>>>>>> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png" > >>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 131072 regular empty file > >>>>>>>>>>> Device: 1eh/30d Inode: 11897049013408443114 Links: 1 > >>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) > >>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 > >>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 > >>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 > >>>>>>>>>>> Birth: - > >>>>>>>>>>> > >>>>>>>>>>> > -------- Original Message -------- > >>>>>>>>>>> > Subject: Re: [Gluster-users] self-heal not working > >>>>>>>>>>> > Local Time: August 21, 2017 9:34 PM > >>>>>>>>>>> > UTC Time: August 21, 2017 7:34 PM > >>>>>>>>>>> > From: bturner at redhat.com > >>>>>>>>>>> > To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>>>>>>> > Gluster Users > >>>>>>>>>>> > [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>>>>> > > >>>>>>>>>>> > ----- Original Message ----- > >>>>>>>>>>> >> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) > >>>>>>>>>>> >> To: "Gluster Users" > >>>>>>>>>>> >> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) > >>>>>>>>>>> >> Sent: Monday, August 21, 2017 9:28:24 AM > >>>>>>>>>>> >> Subject: [Gluster-users] self-heal not working > >>>>>>>>>>> >> > >>>>>>>>>>> >> Hi, > >>>>>>>>>>> >> > >>>>>>>>>>> >> I have a replicat 2 with arbiter GlusterFS 3.8.11 cluster and > >>>>>>>>>>> >> there is > >>>>>>>>>>> >> currently one file listed to be healed as you can see below > >>>>>>>>>>> >> but never gets > >>>>>>>>>>> >> healed by the self-heal daemon: > >>>>>>>>>>> >> > >>>>>>>>>>> >> Brick node1.domain.tld:/data/myvolume/brick > >>>>>>>>>>> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png > >>>>>>>>>>> >> Status: Connected > >>>>>>>>>>> >> Number of 
entries: 1 > >>>>>>>>>>> >> > >>>>>>>>>>> >> Brick node2.domain.tld:/data/myvolume/brick > >>>>>>>>>>> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png > >>>>>>>>>>> >> Status: Connected > >>>>>>>>>>> >> Number of entries: 1 > >>>>>>>>>>> >> > >>>>>>>>>>> >> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick > >>>>>>>>>>> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png > >>>>>>>>>>> >> Status: Connected > >>>>>>>>>>> >> Number of entries: 1 > >>>>>>>>>>> >> > >>>>>>>>>>> >> As once recommended on this mailing list I have mounted that > >>>>>>>>>>> >> glusterfs > >>>>>>>>>>> >> volume > >>>>>>>>>>> >> temporarily through fuse/glusterfs and ran a "stat" on that > >>>>>>>>>>> >> file which is > >>>>>>>>>>> >> listed above but nothing happened. > >>>>>>>>>>> >> > >>>>>>>>>>> >> The file itself is available on all 3 nodes/bricks but on the > >>>>>>>>>>> >> last node it > >>>>>>>>>>> >> has a different date. By the way this file is 0 kBytes big. Is > >>>>>>>>>>> >> that maybe > >>>>>>>>>>> >> the reason why the self-heal does not work? > >>>>>>>>>>> > > >>>>>>>>>>> > Is the file actually 0 bytes or is it just 0 bytes on the > >>>>>>>>>>> > arbiter(0 bytes > >>>>>>>>>>> > are expected on the arbiter, it just stores metadata)? Can you > >>>>>>>>>>> > send us the > >>>>>>>>>>> > output from stat on all 3 nodes: > >>>>>>>>>>> > > >>>>>>>>>>> > $ stat <file on back end brick> > >>>>>>>>>>> > $ getfattr -d -m - <file on back end brick> > >>>>>>>>>>> > $ stat <file from gluster mount> > >>>>>>>>>>> > > >>>>>>>>>>> > Lets see what things look like on the back end, it should tell > >>>>>>>>>>> > us why > >>>>>>>>>>> > healing is failing. > >>>>>>>>>>> > > >>>>>>>>>>> > -b > >>>>>>>>>>> > > >>>>>>>>>>> >> > >>>>>>>>>>> >> And how can I now make this file to heal? > >>>>>>>>>>> >> > >>>>>>>>>>> >> Thanks, > >>>>>>>>>>> >> Mabi > >>>>>>>>>>> >> > >>>>>>>>>>> >> > >>>>>>>>>>> >> > >>>>>>>>>>> >> > >>>>>>>>>>> >> _______________________________________________ > >>>>>>>>>>> >> Gluster-users mailing list > >>>>>>>>>>> >> Gluster-users at gluster.org > >>>>>>>>>>> >> http://lists.gluster.org/mailman/listinfo/gluster-users > >>>>>>>>> > >>>>>>>>> _______________________________________________ > >>>>>>>>> Gluster-users mailing list > >>>>>>>>> Gluster-users at gluster.org > >>>>>>>>> > >>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
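For reference, the trusted.afr.dirty value shown in the getfattr output above (0sAAAAAQAAAAAAAAAA) is getfattr's base64 encoding of 12 bytes that AFR treats as three big-endian 32-bit counters: data, metadata and entry pending operations. A non-destructive way to inspect it on a brick node might look like the sketch below; the file path is simply the one quoted in the thread and is only an example.

    # Decode the value quoted above (standard coreutils):
    echo AAAAAQAAAAAAAAAA | base64 -d | od -An -tx1
    #  00 00 00 01 00 00 00 00 00 00 00 00   -> data=1, metadata=0, entry=0 pending

    # Or read the xattr directly from the brick in hex:
    getfattr -n trusted.afr.dirty -e hex \
        /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
    # trusted.afr.dirty=0x000000010000000000000000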
On 08/28/2017 01:57 AM, Ben Turner wrote:
> ----- Original Message -----
>> From: "mabi" <mabi at protonmail.ch>
>> To: "Ravishankar N" <ravishankar at redhat.com>
>> Cc: "Ben Turner" <bturner at redhat.com>, "Gluster Users" <gluster-users at gluster.org>
>> Sent: Sunday, August 27, 2017 3:15:33 PM
>> Subject: Re: [Gluster-users] self-heal not working
>>
>> Thanks Ravi for your analysis. So as far as I understand nothing to worry about, but my question now would be: how do I get rid of this file from the heal info?
> Correct me if I am wrong, but clearing this is just a matter of resetting the afr.dirty xattr? @Ravi - Is this correct?

Yes, resetting the xattr and launching an index heal or running the heal-info command should serve as a workaround.
-Ravi

>
> -b
>
>>> -------- Original Message --------
>>> Subject: Re: [Gluster-users] self-heal not working
>>> Local Time: August 27, 2017 3:45 PM
>>> UTC Time: August 27, 2017 1:45 PM
>>> From: ravishankar at redhat.com
>>> To: mabi <mabi at protonmail.ch>
>>> Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
>>>
>>> Yes, the shds did pick up the file for healing (I saw messages like " got entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards.
>>>
>>> Anyway, I reproduced it by manually setting the afr.dirty bit for a zero-byte file on all 3 bricks. Since there are no afr pending xattrs indicating good/bad copies and all files are zero bytes, the data self-heal algorithm just picks the file with the latest ctime as source. In your case that was the arbiter brick. In the code, there is a check to prevent data heals if the arbiter is the source. So heal was not happening and the entries were not removed from the heal-info output.
>>>
>>> Perhaps we should add a check in the code to just remove the entries from heal-info if the size is zero bytes on all bricks.
>>>
>>> -Ravi
>>>
>>> On 08/25/2017 06:33 PM, mabi wrote:
>>>
>>>> Hi Ravi,
>>>>
>>>> Did you get a chance to have a look at the log files I have attached in my last mail?
>>>>
>>>> Best,
>>>> Mabi
>>>>
>>>>> -------- Original Message --------
>>>>> Subject: Re: [Gluster-users] self-heal not working
>>>>> Local Time: August 24, 2017 12:08 PM
>>>>> UTC Time: August 24, 2017 10:08 AM
>>>>> From: mabi at protonmail.ch
>>>>> To: Ravishankar N [<ravishankar at redhat.com>](mailto:ravishankar at redhat.com)
>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)
>>>>>
>>>>> Thanks for confirming the command. I have now enabled the DEBUG client-log-level, run a heal and then attached the glustershd log files of all 3 nodes to this mail.
>>>>>
>>>>> The volume concerned is called myvol-pro; the other 3 volumes have no problem so far.
>>>>>
>>>>> Also note that in the meantime it looks like the file has been deleted by the user, and as such the heal info command does not show the file name anymore but just its GFID, which is:
>>>>>
>>>>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea
>>>>>
>>>>> Hope that helps for debugging this issue.
>>>>> >>>>>> -------- Original Message -------- >>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>> Local Time: August 24, 2017 5:58 AM >>>>>> UTC Time: August 24, 2017 3:58 AM >>>>>> From: ravishankar at redhat.com >>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster >>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>> >>>>>> Unlikely. In your case only the afr.dirty is set, not the >>>>>> afr.volname-client-xx xattr. >>>>>> >>>>>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is >>>>>> right. >>>>>> >>>>>> On 08/23/2017 10:31 PM, mabi wrote: >>>>>> >>>>>>> I just saw the following bug which was fixed in 3.8.15: >>>>>>> >>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 >>>>>>> >>>>>>> Is it possible that the problem I described in this post is related to >>>>>>> that bug? >>>>>>> >>>>>>>> -------- Original Message -------- >>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>> Local Time: August 22, 2017 11:51 AM >>>>>>>> UTC Time: August 22, 2017 9:51 AM >>>>>>>> From: ravishankar at redhat.com >>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster >>>>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>> >>>>>>>> On 08/22/2017 02:30 PM, mabi wrote: >>>>>>>> >>>>>>>>> Thanks for the additional hints, I have the following 2 questions >>>>>>>>> first: >>>>>>>>> >>>>>>>>> - In order to launch the index heal is the following command correct: >>>>>>>>> gluster volume heal myvolume >>>>>>>> Yes >>>>>>>> >>>>>>>>> - If I run a "volume start force" will it have any short disruptions >>>>>>>>> on my clients which mount the volume through FUSE? If yes, how long? >>>>>>>>> This is a production system that's why I am asking. >>>>>>>> No. You can actually create a test volume on your personal linux box >>>>>>>> to try these kinds of things without needing multiple machines. This >>>>>>>> is how we develop and test our patches :) >>>>>>>> 'gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} >>>>>>>> force` and so on. >>>>>>>> >>>>>>>> HTH, >>>>>>>> Ravi >>>>>>>> >>>>>>>>>> -------- Original Message -------- >>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>> Local Time: August 22, 2017 6:26 AM >>>>>>>>>> UTC Time: August 22, 2017 4:26 AM >>>>>>>>>> From: ravishankar at redhat.com >>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch), Ben >>>>>>>>>> Turner [<bturner at redhat.com>](mailto:bturner at redhat.com) >>>>>>>>>> Gluster Users >>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>> >>>>>>>>>> Explore the following: >>>>>>>>>> >>>>>>>>>> - Launch index heal and look at the glustershd logs of all bricks >>>>>>>>>> for possible errors >>>>>>>>>> >>>>>>>>>> - See if the glustershd in each node is connected to all bricks. >>>>>>>>>> >>>>>>>>>> - If not try to restart shd by `volume start force` >>>>>>>>>> >>>>>>>>>> - Launch index heal again and try. >>>>>>>>>> >>>>>>>>>> - Try debugging the shd log by setting client-log-level to DEBUG >>>>>>>>>> temporarily. 
>>>>>>>>>> >>>>>>>>>> On 08/22/2017 03:19 AM, mabi wrote: >>>>>>>>>> >>>>>>>>>>> Sure, it doesn't look like a split brain based on the output: >>>>>>>>>>> >>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>> Status: Connected >>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>> >>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>> Status: Connected >>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>> >>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>> Status: Connected >>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>> >>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>> Local Time: August 21, 2017 11:35 PM >>>>>>>>>>>> UTC Time: August 21, 2017 9:35 PM >>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>> Gluster Users >>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>> >>>>>>>>>>>> Can you also provide: >>>>>>>>>>>> >>>>>>>>>>>> gluster v heal <my vol> info split-brain >>>>>>>>>>>> >>>>>>>>>>>> If it is split brain just delete the incorrect file from the brick >>>>>>>>>>>> and run heal again. I haven"t tried this with arbiter but I >>>>>>>>>>>> assume the process is the same. >>>>>>>>>>>> >>>>>>>>>>>> -b >>>>>>>>>>>> >>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>> To: "Ben Turner" >>>>>>>>>>>>> [<bturner at redhat.com>](mailto:bturner at redhat.com) >>>>>>>>>>>>> Cc: "Gluster Users" >>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>> Sent: Monday, August 21, 2017 4:55:59 PM >>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Ben, >>>>>>>>>>>>> >>>>>>>>>>>>> So it is really a 0 kBytes file everywhere (all nodes including >>>>>>>>>>>>> the arbiter >>>>>>>>>>>>> and from the client). >>>>>>>>>>>>> Here below you will find the output you requested. Hopefully that >>>>>>>>>>>>> will help >>>>>>>>>>>>> to find out why this specific file is not healing... Let me know >>>>>>>>>>>>> if you need >>>>>>>>>>>>> any more information. Btw node3 is my arbiter node. >>>>>>>>>>>>> >>>>>>>>>>>>> NODE1: >>>>>>>>>>>>> >>>>>>>>>>>>> STAT: >>>>>>>>>>>>> File: >>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? >>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>> Device: 24h/36d Inode: 10033884 Links: 2 >>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>> Birth: - >>>>>>>>>>>>> >>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg=>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo>>>>>>>>>>>>> >>>>>>>>>>>>> NODE2: >>>>>>>>>>>>> >>>>>>>>>>>>> STAT: >>>>>>>>>>>>> File: >>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? 
>>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>> Device: 26h/38d Inode: 10031330 Links: 2 >>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>> Change: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>> Birth: - >>>>>>>>>>>>> >>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw=>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE>>>>>>>>>>>>> >>>>>>>>>>>>> NODE3: >>>>>>>>>>>>> STAT: >>>>>>>>>>>>> File: >>>>>>>>>>>>> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 4096 regular empty file >>>>>>>>>>>>> Device: ca11h/51729d Inode: 405208959 Links: 2 >>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Modify: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Change: 2017-08-14 17:11:46.604380051 +0200 >>>>>>>>>>>>> Birth: - >>>>>>>>>>>>> >>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg=>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4>>>>>>>>>>>>> >>>>>>>>>>>>> CLIENT GLUSTER MOUNT: >>>>>>>>>>>>> STAT: >>>>>>>>>>>>> File: >>>>>>>>>>>>> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png" >>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 131072 regular empty file >>>>>>>>>>>>> Device: 1eh/30d Inode: 11897049013408443114 Links: 1 >>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>> Birth: - >>>>>>>>>>>>> >>>>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>> Local Time: August 21, 2017 9:34 PM >>>>>>>>>>>>>> UTC Time: August 21, 2017 7:34 PM >>>>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>> Gluster Users >>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>> >>>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>> To: "Gluster Users" >>>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>>> Sent: Monday, August 21, 2017 9:28:24 AM >>>>>>>>>>>>>>> Subject: [Gluster-users] self-heal not working >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I have a replicat 2 with arbiter GlusterFS 3.8.11 cluster and >>>>>>>>>>>>>>> there is >>>>>>>>>>>>>>> currently one file listed to be healed as you can see below >>>>>>>>>>>>>>> but never gets >>>>>>>>>>>>>>> healed by the self-heal daemon: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> As once recommended on this mailing list I have mounted that >>>>>>>>>>>>>>> glusterfs >>>>>>>>>>>>>>> volume >>>>>>>>>>>>>>> temporarily through fuse/glusterfs and ran a "stat" on that >>>>>>>>>>>>>>> file which is >>>>>>>>>>>>>>> listed above but nothing happened. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The file itself is available on all 3 nodes/bricks but on the >>>>>>>>>>>>>>> last node it >>>>>>>>>>>>>>> has a different date. By the way this file is 0 kBytes big. Is >>>>>>>>>>>>>>> that maybe >>>>>>>>>>>>>>> the reason why the self-heal does not work? >>>>>>>>>>>>>> Is the file actually 0 bytes or is it just 0 bytes on the >>>>>>>>>>>>>> arbiter(0 bytes >>>>>>>>>>>>>> are expected on the arbiter, it just stores metadata)? Can you >>>>>>>>>>>>>> send us the >>>>>>>>>>>>>> output from stat on all 3 nodes: >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ stat <file on back end brick> >>>>>>>>>>>>>> $ getfattr -d -m - <file on back end brick> >>>>>>>>>>>>>> $ stat <file from gluster mount> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Lets see what things look like on the back end, it should tell >>>>>>>>>>>>>> us why >>>>>>>>>>>>>> healing is failing. >>>>>>>>>>>>>> >>>>>>>>>>>>>> -b >>>>>>>>>>>>>> >>>>>>>>>>>>>>> And how can I now make this file to heal? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Mabi >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>> >>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
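A minimal sketch of the workaround described above (clear the dirty marker, then launch an index heal), assuming the brick paths and placeholder names quoted in this thread; the value written below simply zeroes the three pending counters, and anything like this should be verified on a test volume before touching production bricks.

    # On each node, against that node's own brick path
    # (node3's brick is under /srv/glusterfs/myvolume/brick instead):
    setfattr -n trusted.afr.dirty -v 0x000000000000000000000000 \
        /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png

    # Then trigger an index heal from any node and re-check the pending entries
    # (substitute the real volume name; the thread uses both "myvolume" and "myvol-pro"):
    gluster volume heal myvolume
    gluster volume heal myvolume info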
Excuse me for my naive questions, but how do I reset the afr.dirty xattr on the file to be healed? And do I need to do that through a FUSE mount, or simply on every brick directly?

> -------- Original Message --------
> Subject: Re: [Gluster-users] self-heal not working
> Local Time: August 28, 2017 5:58 AM
> UTC Time: August 28, 2017 3:58 AM
> From: ravishankar at redhat.com
> To: Ben Turner <bturner at redhat.com>, mabi <mabi at protonmail.ch>
> Gluster Users <gluster-users at gluster.org>
>
> On 08/28/2017 01:57 AM, Ben Turner wrote:
>> ----- Original Message -----
>>> From: "mabi" <mabi at protonmail.ch>
>>> To: "Ravishankar N" <ravishankar at redhat.com>
>>> Cc: "Ben Turner" <bturner at redhat.com>, "Gluster Users" <gluster-users at gluster.org>
>>> Sent: Sunday, August 27, 2017 3:15:33 PM
>>> Subject: Re: [Gluster-users] self-heal not working
>>>
>>> Thanks Ravi for your analysis. So as far as I understand nothing to worry about, but my question now would be: how do I get rid of this file from the heal info?
>> Correct me if I am wrong, but clearing this is just a matter of resetting the afr.dirty xattr? @Ravi - Is this correct?
>
> Yes, resetting the xattr and launching an index heal or running the heal-info command should serve as a workaround.
> -Ravi
>
>>
>> -b
>>
>>>> -------- Original Message --------
>>>> Subject: Re: [Gluster-users] self-heal not working
>>>> Local Time: August 27, 2017 3:45 PM
>>>> UTC Time: August 27, 2017 1:45 PM
>>>> From: ravishankar at redhat.com
>>>> To: mabi <mabi at protonmail.ch>
>>>> Ben Turner <bturner at redhat.com>, Gluster Users <gluster-users at gluster.org>
>>>>
>>>> Yes, the shds did pick up the file for healing (I saw messages like " got entry: 1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea") but no error afterwards.
>>>>
>>>> Anyway, I reproduced it by manually setting the afr.dirty bit for a zero-byte file on all 3 bricks. Since there are no afr pending xattrs indicating good/bad copies and all files are zero bytes, the data self-heal algorithm just picks the file with the latest ctime as source. In your case that was the arbiter brick. In the code, there is a check to prevent data heals if the arbiter is the source. So heal was not happening and the entries were not removed from the heal-info output.
>>>>
>>>> Perhaps we should add a check in the code to just remove the entries from heal-info if the size is zero bytes on all bricks.
>>>>
>>>> -Ravi
>>>>
>>>> On 08/25/2017 06:33 PM, mabi wrote:
>>>>
>>>>> Hi Ravi,
>>>>>
>>>>> Did you get a chance to have a look at the log files I have attached in my last mail?
>>>>>
>>>>> Best,
>>>>> Mabi
>>>>>
>>>>>> -------- Original Message --------
>>>>>> Subject: Re: [Gluster-users] self-heal not working
>>>>>> Local Time: August 24, 2017 12:08 PM
>>>>>> UTC Time: August 24, 2017 10:08 AM
>>>>>> From: mabi at protonmail.ch
>>>>>> To: Ravishankar N [<ravishankar at redhat.com>](mailto:ravishankar at redhat.com)
>>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org)
>>>>>>
>>>>>> Thanks for confirming the command. I have now enabled the DEBUG client-log-level, run a heal and then attached the glustershd log files of all 3 nodes to this mail.
>>>>>>
>>>>>> The volume concerned is called myvol-pro; the other 3 volumes have no problem so far.
>>>>>> >>>>>> Also note that in the mean time it looks like the file has been deleted >>>>>> by the user and as such the heal info command does not show the file >>>>>> name anymore but just is GFID which is: >>>>>> >>>>>> gfid:1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea >>>>>> >>>>>> Hope that helps for debugging this issue. >>>>>> >>>>>>> -------- Original Message -------- >>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>> Local Time: August 24, 2017 5:58 AM >>>>>>> UTC Time: August 24, 2017 3:58 AM >>>>>>> From: ravishankar at redhat.com >>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster >>>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>> >>>>>>> Unlikely. In your case only the afr.dirty is set, not the >>>>>>> afr.volname-client-xx xattr. >>>>>>> >>>>>>> `gluster volume set myvolume diagnostics.client-log-level DEBUG` is >>>>>>> right. >>>>>>> >>>>>>> On 08/23/2017 10:31 PM, mabi wrote: >>>>>>> >>>>>>>> I just saw the following bug which was fixed in 3.8.15: >>>>>>>> >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1471613 >>>>>>>> >>>>>>>> Is it possible that the problem I described in this post is related to >>>>>>>> that bug? >>>>>>>> >>>>>>>>> -------- Original Message -------- >>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>> Local Time: August 22, 2017 11:51 AM >>>>>>>>> UTC Time: August 22, 2017 9:51 AM >>>>>>>>> From: ravishankar at redhat.com >>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>> Ben Turner [<bturner at redhat.com>](mailto:bturner at redhat.com), Gluster >>>>>>>>> Users [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>> >>>>>>>>> On 08/22/2017 02:30 PM, mabi wrote: >>>>>>>>> >>>>>>>>>> Thanks for the additional hints, I have the following 2 questions >>>>>>>>>> first: >>>>>>>>>> >>>>>>>>>> - In order to launch the index heal is the following command correct: >>>>>>>>>> gluster volume heal myvolume >>>>>>>>> Yes >>>>>>>>> >>>>>>>>>> - If I run a "volume start force" will it have any short disruptions >>>>>>>>>> on my clients which mount the volume through FUSE? If yes, how long? >>>>>>>>>> This is a production system that"s why I am asking. >>>>>>>>> No. You can actually create a test volume on your personal linux box >>>>>>>>> to try these kinds of things without needing multiple machines. This >>>>>>>>> is how we develop and test our patches :) >>>>>>>>> "gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} >>>>>>>>> force` and so on. >>>>>>>>> >>>>>>>>> HTH, >>>>>>>>> Ravi >>>>>>>>> >>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>> Local Time: August 22, 2017 6:26 AM >>>>>>>>>>> UTC Time: August 22, 2017 4:26 AM >>>>>>>>>>> From: ravishankar at redhat.com >>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch), Ben >>>>>>>>>>> Turner [<bturner at redhat.com>](mailto:bturner at redhat.com) >>>>>>>>>>> Gluster Users >>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>> >>>>>>>>>>> Explore the following: >>>>>>>>>>> >>>>>>>>>>> - Launch index heal and look at the glustershd logs of all bricks >>>>>>>>>>> for possible errors >>>>>>>>>>> >>>>>>>>>>> - See if the glustershd in each node is connected to all bricks. 
>>>>>>>>>>> >>>>>>>>>>> - If not try to restart shd by `volume start force` >>>>>>>>>>> >>>>>>>>>>> - Launch index heal again and try. >>>>>>>>>>> >>>>>>>>>>> - Try debugging the shd log by setting client-log-level to DEBUG >>>>>>>>>>> temporarily. >>>>>>>>>>> >>>>>>>>>>> On 08/22/2017 03:19 AM, mabi wrote: >>>>>>>>>>> >>>>>>>>>>>> Sure, it doesn"t look like a split brain based on the output: >>>>>>>>>>>> >>>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>>> Status: Connected >>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>> >>>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>>> Status: Connected >>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>> >>>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>>> Status: Connected >>>>>>>>>>>> Number of entries in split-brain: 0 >>>>>>>>>>>> >>>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>> Local Time: August 21, 2017 11:35 PM >>>>>>>>>>>>> UTC Time: August 21, 2017 9:35 PM >>>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>> Gluster Users >>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>> >>>>>>>>>>>>> Can you also provide: >>>>>>>>>>>>> >>>>>>>>>>>>> gluster v heal <my vol> info split-brain >>>>>>>>>>>>> >>>>>>>>>>>>> If it is split brain just delete the incorrect file from the brick >>>>>>>>>>>>> and run heal again. I haven"t tried this with arbiter but I >>>>>>>>>>>>> assume the process is the same. >>>>>>>>>>>>> >>>>>>>>>>>>> -b >>>>>>>>>>>>> >>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>> To: "Ben Turner" >>>>>>>>>>>>>> [<bturner at redhat.com>](mailto:bturner at redhat.com) >>>>>>>>>>>>>> Cc: "Gluster Users" >>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>> Sent: Monday, August 21, 2017 4:55:59 PM >>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Ben, >>>>>>>>>>>>>> >>>>>>>>>>>>>> So it is really a 0 kBytes file everywhere (all nodes including >>>>>>>>>>>>>> the arbiter >>>>>>>>>>>>>> and from the client). >>>>>>>>>>>>>> Here below you will find the output you requested. Hopefully that >>>>>>>>>>>>>> will help >>>>>>>>>>>>>> to find out why this specific file is not healing... Let me know >>>>>>>>>>>>>> if you need >>>>>>>>>>>>>> any more information. Btw node3 is my arbiter node. >>>>>>>>>>>>>> >>>>>>>>>>>>>> NODE1: >>>>>>>>>>>>>> >>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>> File: >>>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? 
>>>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>>> Device: 24h/36d Inode: 10033884 Links: 2 >>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>> >>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg=>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo>>>>>>>>>>>>>> >>>>>>>>>>>>>> NODE2: >>>>>>>>>>>>>> >>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>> File: >>>>>>>>>>>>>> ?/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png? >>>>>>>>>>>>>> Size: 0 Blocks: 38 IO Block: 131072 regular empty file >>>>>>>>>>>>>> Device: 26h/38d Inode: 10031330 Links: 2 >>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.403704181 +0200 >>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>> >>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw=>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE>>>>>>>>>>>>>> >>>>>>>>>>>>>> NODE3: >>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>> File: >>>>>>>>>>>>>> /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 4096 regular empty file >>>>>>>>>>>>>> Device: ca11h/51729d Inode: 405208959 Links: 2 >>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Modify: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.604380051 +0200 >>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>> >>>>>>>>>>>>>> GETFATTR: >>>>>>>>>>>>>> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA >>>>>>>>>>>>>> trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg=>>>>>>>>>>>>>> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g=>>>>>>>>>>>>>> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4>>>>>>>>>>>>>> >>>>>>>>>>>>>> CLIENT GLUSTER MOUNT: >>>>>>>>>>>>>> STAT: >>>>>>>>>>>>>> File: >>>>>>>>>>>>>> "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png" >>>>>>>>>>>>>> Size: 0 Blocks: 0 IO Block: 131072 regular empty file >>>>>>>>>>>>>> Device: 1eh/30d Inode: 11897049013408443114 Links: 1 >>>>>>>>>>>>>> Access: (0644/-rw-r--r--) Uid: ( 33/www-data) Gid: ( 33/www-data) >>>>>>>>>>>>>> Access: 2017-08-14 17:04:55.530681000 +0200 >>>>>>>>>>>>>> Modify: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>> Change: 2017-08-14 17:11:46.407404779 +0200 >>>>>>>>>>>>>> Birth: - >>>>>>>>>>>>>> >>>>>>>>>>>>>>> -------- Original Message -------- >>>>>>>>>>>>>>> Subject: Re: [Gluster-users] self-heal not working >>>>>>>>>>>>>>> Local Time: August 21, 2017 9:34 PM >>>>>>>>>>>>>>> UTC Time: August 21, 2017 7:34 PM >>>>>>>>>>>>>>> From: bturner at redhat.com >>>>>>>>>>>>>>> To: mabi [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>> Gluster Users >>>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at 
gluster.org) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>>>>>> From: "mabi" [<mabi at protonmail.ch>](mailto:mabi at protonmail.ch) >>>>>>>>>>>>>>>> To: "Gluster Users" >>>>>>>>>>>>>>>> [<gluster-users at gluster.org>](mailto:gluster-users at gluster.org) >>>>>>>>>>>>>>>> Sent: Monday, August 21, 2017 9:28:24 AM >>>>>>>>>>>>>>>> Subject: [Gluster-users] self-heal not working >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have a replicat 2 with arbiter GlusterFS 3.8.11 cluster and >>>>>>>>>>>>>>>> there is >>>>>>>>>>>>>>>> currently one file listed to be healed as you can see below >>>>>>>>>>>>>>>> but never gets >>>>>>>>>>>>>>>> healed by the self-heal daemon: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Brick node1.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Brick node2.domain.tld:/data/myvolume/brick >>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick >>>>>>>>>>>>>>>> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png >>>>>>>>>>>>>>>> Status: Connected >>>>>>>>>>>>>>>> Number of entries: 1 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As once recommended on this mailing list I have mounted that >>>>>>>>>>>>>>>> glusterfs >>>>>>>>>>>>>>>> volume >>>>>>>>>>>>>>>> temporarily through fuse/glusterfs and ran a "stat" on that >>>>>>>>>>>>>>>> file which is >>>>>>>>>>>>>>>> listed above but nothing happened. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The file itself is available on all 3 nodes/bricks but on the >>>>>>>>>>>>>>>> last node it >>>>>>>>>>>>>>>> has a different date. By the way this file is 0 kBytes big. Is >>>>>>>>>>>>>>>> that maybe >>>>>>>>>>>>>>>> the reason why the self-heal does not work? >>>>>>>>>>>>>>> Is the file actually 0 bytes or is it just 0 bytes on the >>>>>>>>>>>>>>> arbiter(0 bytes >>>>>>>>>>>>>>> are expected on the arbiter, it just stores metadata)? Can you >>>>>>>>>>>>>>> send us the >>>>>>>>>>>>>>> output from stat on all 3 nodes: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> $ stat <file on back end brick> >>>>>>>>>>>>>>> $ getfattr -d -m - <file on back end brick> >>>>>>>>>>>>>>> $ stat <file from gluster mount> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Lets see what things look like on the back end, it should tell >>>>>>>>>>>>>>> us why >>>>>>>>>>>>>>> healing is failing. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -b >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> And how can I now make this file to heal? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Mabi >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>>> >>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20170828/dc304405/attachment.html>
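The brick-versus-FUSE part of the question is not settled in this excerpt, but two details from earlier in the thread are worth keeping in mind: the trusted.afr.* attributes were read directly from the brick back ends, and heal info now reports only a GFID because the original path is gone. On each brick a regular file can still be reached through its GFID hard link under .glusterfs, so checking what is left might look like the sketch below (paths taken from the thread; node3's brick root differs).

    BRICK=/data/myvolume/brick        # node3: /srv/glusterfs/myvolume/brick
    GFID=1985e233-d5ee-4e3e-a51a-cf0b5f9f2aea

    # Backend hard links live under .glusterfs/<first 2 hex chars>/<next 2>/<full gfid>:
    ls -l "$BRICK/.glusterfs/19/85/$GFID"
    getfattr -d -m . -e hex "$BRICK/.glusterfs/19/85/$GFID"

If that hard link no longer exists on a brick, there is no xattr left to reset there and only the stale heal-info entry remains to be dealt with.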