Ravishankar N
2016-May-05 01:51 UTC
[Gluster-users] Question about "Possibly undergoing heal" on a file being reported.
On 05/05/2016 01:50 AM, Richard Klein (RSI) wrote:
>
> First time e-mailer to the group, greetings all. We are using Gluster
> 3.7.6 in Cloudstack on CentOS7 with KVM. Gluster is our primary
> storage. All is going well but we have a test VM QCOW2 volume that
> gets stuck in the "Possibly undergoing healing". By stuck I mean it
> stays in that state for over 24 hrs. This is a test VM with no
> activity on it and we have removed the swap file on the guest as well
> thinking that may be causing high I/O. All the tools show that the VM
> is basically idle with low I/O. The only way I can clear it up is to
> power the VM off, move the QCOW2 volume from the Gluster mount then
> back (basically remove and recreate it) then power the VM back on.
> Once I do this process all is well again but then it happened again on
> the same volume/file.
>
> One additional note, I have even powered off the VM completely and the
> QCOW2 file still stays in this state.
>

When this happens, can you share the output of the extended attributes of the file in question from all the bricks of the replica in which the file resides?

`getfattr -d -m . -e hex /path/to/bricks/file-name`

Also what is the size of this VM image file?

Thanks,
Ravi

> Is there a way to stop/abort or force the heal to finish? Any help
> with a direction would be appreciated.
>
> Thanks,
>
> Richard Klein
>
> RSI
Richard Klein (RSI)
2016-May-05 17:22 UTC
[Gluster-users] Question about "Possibly undergoing heal" on a file being reported.
There are 2 hosts involved and we have a replica value of 2. The hosts are called n1c1cl1 and n1c2cl1. Below is the info you requested. The file name in gluster is "/97f52c71-80bd-4c2b-8e47-3c8c77712687".

-- From the n1c1cl1 brick --

[root@n1c1cl1 ~]# ll -h /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
-rwxr--r--. 2 root root 3.7G May 5 12:10 /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687

[root@n1c1cl1 ~]# getfattr -d -m . -e hex /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
getfattr: Removing leading '/' from absolute path names
# file: data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.dirty=0xe68000000000000000000000
trusted.bit-rot.version=0x020000000000000057196a8d000e1606
trusted.gfid=0xb1a49bd1ea01479f9a8277992461e85f

-- From the n1c2cl1 brick --

[root@n1c2cl1 ~]# ll -h /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
-rwxr--r--. 2 root root 3.7G May 5 12:16 /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687

[root@n1c2cl1 ~]# getfattr -d -m . -e hex /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
getfattr: Removing leading '/' from absolute path names
# file: data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.dirty=0xd38000000000000000000000
trusted.bit-rot.version=0x020000000000000057196a8d000e20ae
trusted.gfid=0xb1a49bd1ea01479f9a8277992461e85f

--

The "trusted.afr.dirty" is changing about 2 or 3 times a minute on both files. Let me know if you need further info and thanks.
Richard Klein
RSI
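
For anyone reading the `trusted.afr.dirty` values above, here is a minimal Python sketch for splitting them into fields. It assumes the usual AFR changelog layout: 12 bytes holding three big-endian 32-bit counters, in the order data, metadata, entry. That layout is an assumption on my part and is not stated in this thread. The function name `decode_afr_xattr` is made up for illustration and is not part of any Gluster tool.

```python
import struct


def decode_afr_xattr(hex_value):
    """Split a hex AFR changelog xattr into its three counters.

    Assumption: the value is 12 bytes, laid out as three big-endian
    unsigned 32-bit integers (data, metadata, entry).
    """
    text = hex_value.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Values copied from the getfattr output posted in this thread.
    samples = {
        "n1c1cl1": "0xe68000000000000000000000",
        "n1c2cl1": "0xd38000000000000000000000",
    }
    for host, value in samples.items():
        fields = decode_afr_xattr(value)
        print(host, {name: hex(count) for name, count in fields.items()})
```

Under that assumption, only the first (data) field is non-zero on both bricks, and the metadata and entry fields are zero. What the non-zero data value means for the heal state is a question for the AFR developers; the sketch only separates the fields.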