thr3ads.net - Gluster users - [Gluster-users] 3.7.13, index healing broken? [Jul 2016]

Dmitry Melekhov

2016-Jul-13 09:39 UTC

[Gluster-users] 3.7.13, index healing broken?

13.07.2016 13:24, Pranith Kumar Karampuri wrote:
>
> On Wed, Jul 13, 2016 at 2:50 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>
>     13.07.2016 13:10, Pranith Kumar Karampuri wrote:
>>
>>
>>     On Wed, Jul 13, 2016 at 2:25 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>>         13.07.2016 11:40, Pranith Kumar Karampuri wrote:
>>
>>
>>             Your recipe doesn't work :-(  If there is a difference
>>             between brick directories due to direct brick
>>             manipulation, it leads to problems.
>>
>>             You have to execute "gluster volume heal <volname> full"
>>             to trigger a full heal.
>>
>>         Yeah, but I need to know when I need to execute it.
>>         Is there any help from gluster, or only an external script?
>>
>>
>>     I guess it is not too difficult to set up cron/systemd.timer to
>>     run this command once in a while right?
>
>     Too difficult? No.
>     So you are suggesting to run heal full by cron? Right?
>     Really, I don't know how many resources a full heal may need in
>     real installations.
>     If not much, why doesn't self-heal call it?
>
>
> Because we don't expect people to touch the bricks. For a corner case
> it doesn't make sense to keep doing a full filesystem scan. But we do
> provide the CLI for people who want it.
Well, why run heal every 10 minutes if no problems are expected?
From your link:

The index heal is done:
a) Every 600 seconds (can be changed via the cluster.heal-timeout volume option)
b) When it is explicitly triggered via the gluster vol heal <VOLNAME> command
c) Whenever a replica brick that was down comes back up.
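
The first two triggers in the list above map onto the gluster CLI roughly as follows. This is only a sketch, not something posted in the thread: the volume name "myvol", the helper function names and the DRY_RUN switch (print the command instead of running it) are all placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are illustrative assumptions.
# With DRY_RUN=1 the command is printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# (a) change the periodic index-heal interval, in seconds (default 600)
set_heal_timeout() {
    run gluster volume set "$1" cluster.heal-timeout "$2"
}

# (b) trigger an index heal explicitly
trigger_index_heal() {
    run gluster volume heal "$1"
}
```

For example, with DRY_RUN=1 set, "set_heal_timeout myvol 300" prints the command it would run, which makes it easy to check before touching a live volume.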


As far as I understand, this index heal runs once per volume, not on a
specific node; this is why there is a self-heal daemon, otherwise this
could be achieved by cron. If the node running cron is down, then I'll
get no full heal. I can, of course, run the next full heal from a
different node by cron :-)
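
A cron-driven full heal along these lines might look like the sketch below. The script path, the volume name "myvol", the weekly schedule and the DRY_RUN switch are assumptions for illustration; only the "gluster volume heal <volname> full" command itself comes from this thread.

```shell
#!/bin/sh
# Sketch of a cron-driven full heal. Assumed: script installed as
# /usr/local/sbin/gluster-full-heal.sh, volume "myvol", weekly runs.
#
# Example crontab entries, staggered so that a second node still
# triggers a full heal if the first node is down:
#   node1:  0 3 * * 0  /usr/local/sbin/gluster-full-heal.sh myvol
#   node2:  0 3 * * 3  /usr/local/sbin/gluster-full-heal.sh myvol

full_heal() {
    vol=$1
    if [ -z "$vol" ]; then
        echo "usage: full_heal VOLNAME" >&2
        return 2
    fi
    # DRY_RUN=1 prints the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "gluster volume heal $vol full"
        return 0
    fi
    gluster volume heal "$vol" full
}

# Run only when a volume name is given on the command line.
if [ $# -ge 1 ]; then
    full_heal "$1"
fi
```

How often to run it depends on how expensive a full scan is on the volume in question, which is exactly the open question above.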

>
>>     What script do you need to write? I didn't get you.
>>
>
>     One which compares brick directories and, if there is a real need,
>     alerts me, so I can run heal full or, maybe, just trigger file
>     heals by reading some files over fuse.
>     Could you please tell me how heal full works and why it is not
>     part of the self-heal process?
>
>
> You can read more about it at:
> https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/afr-self-heal-daemon.md
Thank you!
I think it would be wise to add a full heal interval to the self-heal daemon.
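
Until then, the brick-comparison check mentioned earlier in this message could be sketched roughly as below. It assumes both brick directories are readable as local paths (for example, one of them mounted or copied from the other node), it compares file names only, not contents, and the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: detect differences between the file listings of two bricks.
# Assumption: both bricks are reachable as local directory paths.

list_brick() {
    # Print sorted relative paths under brick $1, skipping Gluster's
    # internal .glusterfs directory.
    ( cd "$1" && find . -path ./.glusterfs -prune -o -print | sort )
}

bricks_differ() {
    # Returns 0 if the two listings differ, 1 if they are identical,
    # 2 if a temporary file could not be created.
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    list_brick "$1" > "$a"
    list_brick "$2" > "$b"
    if cmp -s "$a" "$b"; then rc=1; else rc=0; fi
    rm -f "$a" "$b"
    return $rc
}
```

A wrapper could then send an alert, or start a full heal, only when bricks_differ returns 0.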
>
>     Thank you!
>
>>
>>     -- 
>>     Pranith
>
>
>
>
> -- 
> Pranith

Pranith Kumar Karampuri

2016-Jul-13 10:34 UTC

On Wed, Jul 13, 2016 at 3:09 PM, Dmitry Melekhov <dm at belkam.com> wrote:
> 13.07.2016 13:24, Pranith Kumar Karampuri wrote:
>
> On Wed, Jul 13, 2016 at 2:50 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>
>> 13.07.2016 13:10, Pranith Kumar Karampuri wrote:
>>
>> On Wed, Jul 13, 2016 at 2:25 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>>> 13.07.2016 11:40, Pranith Kumar Karampuri wrote:
>>>
>>>> Your recipe doesn't work :-(  If there is a difference between brick
>>>> directories due to direct brick manipulation, it leads to problems.
>>>>
>>>> You have to execute "gluster volume heal <volname> full" to trigger a
>>>> full heal.
>>>
>>> Yeah, but I need to know when I need to execute it.
>>> Is there any help from gluster, or only an external script?
>>>
>>> I guess it is not too difficult to set up cron/systemd.timer to run this
>>> command once in a while right?
>>
>> Too difficult? No.
>> So you are suggesting to run heal full by cron? Right?
>> Really, I don't know how many resources a full heal may need in real
>> installations.
>> If not much, why doesn't self-heal call it?
>>
>
> Because we don't expect people to touch the bricks. For a corner case it
> doesn't make sense to keep doing a full filesystem scan. But we do provide
> the CLI for people who want it.
>
> Well, why run heal every 10 minutes if no problems are expected?
>
What we realized is that sometimes people run into space/quota-exceeded
problems, which lead to pending heals, so it is better to run an index heal
every few minutes.
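
One way to see whether such pending heals exist is to total the "Number of entries:" lines printed by "gluster volume heal <VOLNAME> info". The helper below is only a sketch: the function name is made up, and it assumes the per-brick output contains lines of the form "Number of entries: N" (non-numeric values, such as a "-" for an unreachable brick, are skipped).

```shell
#!/bin/sh
# Sketch: sum the per-brick pending-heal counts read from stdin.
# Assumes input lines of the form "Number of entries: N".

count_pending_heals() {
    awk '/^Number of entries:/ && $4 ~ /^[0-9]+$/ { total += $4 }
         END { print total + 0 }'
}

# Intended use (the volume name is a placeholder):
#   gluster volume heal myvol info | count_pending_heals
```

A result that stays above zero across several heal intervals would be a hint that something, such as a full brick or an exceeded quota, is blocking the heal.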

> From your link:
>
> The index heal is done:
> a) Every 600 seconds (can be changed via the cluster.heal-timeout volume
> option)
> b) When it is explicitly triggered via the gluster vol heal <VOLNAME>
> command
> c) Whenever a replica brick that was down comes back up.
>
> As far as I understand, this index heal runs once per volume, not on a
> specific node; this is why there is a self-heal daemon, otherwise this
> could be achieved by cron. If the node running cron is down, then I'll
> get no full heal. I can, of course, run the next full heal from a
> different node by cron :-)
>
>
>
>> What script do you need to write? I didn't get you.
>>
>>
>> One which compares brick directories and, if there is a real need,
>> alerts me, so I can run heal full or, maybe, just trigger file heals by
>> reading some files over fuse.
>> Could you please tell me how heal full works and why it is not part of
>> the self-heal process?
>>
>
> You can read more about it at:
> https://github.com/gluster/glusterfs/blob/master/doc/developer-guide/afr-self-heal-daemon.md
>
>
> Thank you!
> I think it would be wise to add a full heal interval to the self-heal daemon.
>
This may not be a bad idea. Want to raise an RFE bug?


-- 
Pranith