Ravishankar N
2016-Aug-17 01:24 UTC
[Gluster-users] Self healing does not see files to heal
On 08/16/2016 10:44 PM, Dmitry Glushenok wrote:
> Hello,
>
> While testing healing after a bitrot error it was found that self healing cannot heal files which were manually deleted from a brick. Gluster 3.8.1:
>
> - Create a volume, mount it locally and copy a test file to it
> [root at srv01 ~]# gluster volume create test01 replica 2 srv01:/R1/test01 srv02:/R1/test01
> volume create: test01: success: please start the volume to access data
> [root at srv01 ~]# gluster volume start test01
> volume start: test01: success
> [root at srv01 ~]# mount -t glusterfs srv01:/test01 /mnt
> [root at srv01 ~]# cp /etc/passwd /mnt
> [root at srv01 ~]# ls -l /mnt
> total 2
> -rw-r--r--. 1 root root 1505 Aug 16 19:59 passwd
>
> - Then remove the test file from the first brick, like we have to do in case of a bitrot error in the file

You also need to remove all hard-links to the corrupted file from the brick, including the one in the .glusterfs folder.
There is a bug in heal-full that prevents it from crawling all bricks of the replica. The right way to heal the corrupted files as of now is to access them from the mount-point like you did after removing the hard-links. The list of files that are corrupted can be obtained with the scrub status command.

Hope this helps,
Ravi

> [root at srv01 ~]# rm /R1/test01/passwd
> [root at srv01 ~]# ls -l /mnt
> total 0
> [root at srv01 ~]#
>
> - Issue a full self heal
> [root at srv01 ~]# gluster volume heal test01 full
> Launching heal operation to perform full self heal on volume test01 has been successful
> Use heal info commands to check status
> [root at srv01 ~]# tail -2 /var/log/glusterfs/glustershd.log
> [2016-08-16 16:59:56.483767] I [MSGID: 108026] [afr-self-heald.c:611:afr_shd_full_healer] 0-test01-replicate-0: starting full sweep on subvol test01-client-0
> [2016-08-16 16:59:56.486560] I [MSGID: 108026] [afr-self-heald.c:621:afr_shd_full_healer] 0-test01-replicate-0: finished full sweep on subvol test01-client-0
>
> - Now we still see no files in the mount point (it becomes empty right after removing the file from the brick)
> [root at srv01 ~]# ls -l /mnt
> total 0
> [root at srv01 ~]#
>
> - Then try to access the file by its full name (lookup-optimize and readdir-optimize are turned off by default). Now glusterfs shows the file!
> [root at srv01 ~]# ls -l /mnt/passwd
> -rw-r--r--. 1 root root 1505 Aug 16 19:59 /mnt/passwd
>
> - And it reappeared in the brick
> [root at srv01 ~]# ls -l /R1/test01/
> total 4
> -rw-r--r--. 2 root root 1505 Aug 16 19:59 passwd
> [root at srv01 ~]#
>
> Is it a bug, or can we tell self heal to scan all files on all bricks in the volume?
>
> --
> Dmitry Glushenok
> Jet Infosystems
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
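For anyone hitting this on a real bitrot error, here is a rough sketch of the manual recovery Ravi describes, using the test01 volume and brick path from the example above. The gfid components in the .glusterfs path are placeholders that have to be filled in from the xattr value, and the exact scrub status invocation may differ slightly between releases:

# 1. On the affected brick, note the file's gfid (hex value of trusted.gfid):
getfattr -n trusted.gfid -e hex /R1/test01/passwd
# 2. Remove the hard-link under .glusterfs; its path is derived from the gfid
#    as .glusterfs/<first two hex chars>/<next two hex chars>/<gfid-with-dashes>:
rm /R1/test01/.glusterfs/<xx>/<yy>/<gfid>
# 3. Remove the named file itself from the brick:
rm /R1/test01/passwd
# 4. Trigger the heal with a named lookup from a client mount:
stat /mnt/passwd
# 5. List the files the scrubber has flagged as corrupted:
gluster volume bitrot test01 scrub status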
Lindsay Mathieson
2016-Aug-17 01:55 UTC
[Gluster-users] Self healing does not see files to heal
On 17 August 2016 at 11:24, Ravishankar N <ravishankar at redhat.com> wrote:
> The right way to heal the corrupted files as of now is to access them from
> the mount-point like you did after removing the hard-links. The list of
> files that are corrupted can be obtained with the scrub status command.

How does that work with sharding, where you can't see the shards from the mount point?

--
Lindsay
Дмитрий Глушенок
2016-Aug-17 08:18 UTC
[Gluster-users] Self healing does not see files to heal
Hello Ravi,

Thank you for the reply. Found the bug number (for those who will google this email): https://bugzilla.redhat.com/show_bug.cgi?id=1112158

Accessing the removed file from the mount-point does not always work, because we have to find the particular client whose DHT will point it to the brick with the removed file. Otherwise the file is accessed from the good brick and self-healing does not happen (just verified). Or by accessing did you mean something like touch?

--
Dmitry Glushenok
Jet Infosystems

> 17 Aug 2016, at 4:24, Ravishankar N <ravishankar at redhat.com> wrote:
>
> On 08/16/2016 10:44 PM, Dmitry Glushenok wrote:
>> Hello,
>>
>> While testing healing after a bitrot error it was found that self healing cannot heal files which were manually deleted from a brick. Gluster 3.8.1:
>>
>> - Create a volume, mount it locally and copy a test file to it
>> [root at srv01 ~]# gluster volume create test01 replica 2 srv01:/R1/test01 srv02:/R1/test01
>> volume create: test01: success: please start the volume to access data
>> [root at srv01 ~]# gluster volume start test01
>> volume start: test01: success
>> [root at srv01 ~]# mount -t glusterfs srv01:/test01 /mnt
>> [root at srv01 ~]# cp /etc/passwd /mnt
>> [root at srv01 ~]# ls -l /mnt
>> total 2
>> -rw-r--r--. 1 root root 1505 Aug 16 19:59 passwd
>>
>> - Then remove the test file from the first brick, like we have to do in case of a bitrot error in the file
>
> You also need to remove all hard-links to the corrupted file from the brick, including the one in the .glusterfs folder.
> There is a bug in heal-full that prevents it from crawling all bricks of the replica. The right way to heal the corrupted files as of now is to access them from the mount-point like you did after removing the hard-links. The list of files that are corrupted can be obtained with the scrub status command.
>
> Hope this helps,
> Ravi
>
>> [root at srv01 ~]# rm /R1/test01/passwd
>> [root at srv01 ~]# ls -l /mnt
>> total 0
>> [root at srv01 ~]#
>>
>> - Issue a full self heal
>> [root at srv01 ~]# gluster volume heal test01 full
>> Launching heal operation to perform full self heal on volume test01 has been successful
>> Use heal info commands to check status
>> [root at srv01 ~]# tail -2 /var/log/glusterfs/glustershd.log
>> [2016-08-16 16:59:56.483767] I [MSGID: 108026] [afr-self-heald.c:611:afr_shd_full_healer] 0-test01-replicate-0: starting full sweep on subvol test01-client-0
>> [2016-08-16 16:59:56.486560] I [MSGID: 108026] [afr-self-heald.c:621:afr_shd_full_healer] 0-test01-replicate-0: finished full sweep on subvol test01-client-0
>>
>> - Now we still see no files in the mount point (it becomes empty right after removing the file from the brick)
>> [root at srv01 ~]# ls -l /mnt
>> total 0
>> [root at srv01 ~]#
>>
>> - Then try to access the file by its full name (lookup-optimize and readdir-optimize are turned off by default). Now glusterfs shows the file!
>> [root at srv01 ~]# ls -l /mnt/passwd
>> -rw-r--r--. 1 root root 1505 Aug 16 19:59 /mnt/passwd
>>
>> - And it reappeared in the brick
>> [root at srv01 ~]# ls -l /R1/test01/
>> total 4
>> -rw-r--r--. 2 root root 1505 Aug 16 19:59 passwd
>> [root at srv01 ~]#
>>
>> Is it a bug, or can we tell self heal to scan all files on all bricks in the volume?
>>
>> --
>> Dmitry Glushenok
>> Jet Infosystems
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
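For anyone reading the archive later: in the test at the top of the thread it was a plain named lookup from the fuse mount that brought the file back, so something like the following should be equivalent to the ls -l shown earlier (same test01 volume and /mnt mount point assumed; touch would also work, but it updates the file's timestamps as a side effect):

# A named lookup by full path is enough for the client to notice the
# missing replica and trigger the heal; no read of the data is required.
stat /mnt/passwd
# Any pending or in-progress heals can then be checked with:
gluster volume heal test01 info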