Pranith Kumar Karampuri
2016-Jul-13 07:40 UTC
[Gluster-users] 3.7.13, index healing broken?
On Wed, Jul 13, 2016 at 12:11 PM, Dmitry Melekhov <dm at belkam.com> wrote:

> 13.07.2016 10:24, Pranith Kumar Karampuri wrote:
>
> On Wed, Jul 13, 2016 at 11:49 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>
>> 13.07.2016 10:10, Pranith Kumar Karampuri wrote:
>>
>> On Wed, Jul 13, 2016 at 11:27 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>>> 13.07.2016 09:50, Pranith Kumar Karampuri wrote:
>>>
>>> On Wed, Jul 13, 2016 at 11:11 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>
>>>> 13.07.2016 09:36, Pranith Kumar Karampuri wrote:
>>>>
>>>> On Wed, Jul 13, 2016 at 10:58 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>
>>>>> 13.07.2016 09:26, Pranith Kumar Karampuri wrote:
>>>>>
>>>>> On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>
>>>>>> 13.07.2016 09:16, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>> On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>>
>>>>>>> 13.07.2016 09:04, Pranith Kumar Karampuri wrote:
>>>>>>>
>>>>>>> On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>>>
>>>>>>>> 13.07.2016 08:56, Pranith Kumar Karampuri wrote:
>>>>>>>>
>>>>>>>> On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>>>>
>>>>>>>>> 13.07.2016 08:46, Pranith Kumar Karampuri wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>>>>>
>>>>>>>>>> 13.07.2016 08:36, Pranith Kumar Karampuri wrote:
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> 13.07.2016 01:52, Anuradha Talur wrote:
>>>>>>>>>>>
>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>> From: "Dmitry Melekhov" <dm at belkam.com>
>>>>>>>>>>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>>>>>>>>>>> Cc: "gluster-users" <gluster-users at gluster.org>
>>>>>>>>>>>>> Sent: Tuesday, July 12, 2016 9:27:17 PM
>>>>>>>>>>>>> Subject: Re: [Gluster-users] 3.7.13, index healing broken?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 12.07.2016 17:39, Pranith Kumar Karampuri wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Wow, what are the steps to recreate the problem?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just set the file length to zero; it is always reproducible.
>>>>>>>>>>>>
>>>>>>>>>>>> If you are setting the file length to 0 on one of the bricks (looks like that is the case), it is not a bug.
>>>>>>>>>>>>
>>>>>>>>>>>> Index heal relies on failures seen from the mount point(s) to identify the files that need heal. It won't be able to recognize any file modification done directly on the bricks. The same goes for the heal info command, which is the reason heal info also shows 0 entries.
>>>>>>>>>>>
>>>>>>>>>>> Well, this makes self-heal useless then: if any file is accidentally corrupted or deleted (yes, a file deleted directly from a brick is not recognized by index heal either), then it will not be self-healed, because self-heal uses index heal.
>>>>>>>>>>
>>>>>>>>>> It is better to look into the bit-rot feature if you want to guard against these kinds of problems.
>>>>>>>>>>
>>>>>>>>>> Bit rot detects bit problems, not missing files or their wrong length, i.e. it is overhead for such a simple task.
>>>>>>>>>
>>>>>>>>> It detects wrong length, because the checksum won't match anymore.
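To make the bit-rot suggestion concrete, a minimal sketch of enabling it (the volume name is taken from the mount command further down; the option names are from memory for the 3.7 CLI, so please double-check them against the documentation):

    gluster volume bitrot pool enable                  # start checksumming files on this volume
    gluster volume bitrot pool scrub-frequency daily   # how often the scrubber re-verifies checksums

Note that the scrubber only flags a corrupted file on its next pass, so this is not instant detection either.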
>>>>>>>>> Yes, sure. I guess it will detect missing files too. But it needs far more resources than just comparing directories on the bricks?
>>>>>>>>>
>>>>>>>>> What use-case are you trying out that leads to changing things directly on the brick?
>>>>>>>>>
>>>>>>>>> I'm trying to test gluster failure tolerance, and right now I'm not happy with it...
>>>>>>>>
>>>>>>>> Which cases of fault tolerance are you not happy with? Making changes directly on the brick, or anything else as well?
>>>>>>>>
>>>>>>>> I'll repeat: as I already said, if I for some reason (a real case can only be by accident) delete a file, this will not be detected by the self-heal daemon and will thus lead to a lower replication level, i.e. lower failure tolerance.
>>>>>>>
>>>>>>> To prevent such accidents you need to set SELinux policies so that files under the brick are not modified by accident by any user. At least that is the solution I remember from when this was discussed 3-4 years back.
>>>>>>>
>>>>>>> So the only supported platform is Linux? Or maybe it is better to improve self-healing to detect missing or wrong-length files; I guess this is a very low-cost operation in terms of host resources.
>>>>>>> Just a suggestion; maybe we need to look at alternatives in the near future...
>>>>>>
>>>>>> This is a corner case, and from a design perspective it is generally not a good idea to optimize for the corner case. It is better to protect ourselves from the corner case (SELinux etc.), or you can also use snapshots to protect against these kinds of mishaps.
>>>>>>
>>>>>> Sorry, I don't agree.
>>>>>> As you know, if you access a missing or wrong-length file from a fuse client it is restored (healed), i.e. gluster recognizes that the file is wrong and heals it, so I do not see any reason why self-healing can't provide this same function.
>>>>>> Thank you!
>>>>>
>>>>> Ah! Now how do you suggest we keep track of which of tens of millions of files the user accidentally deleted from the brick without gluster's knowledge? Once it comes to gluster's knowledge we can do something, but how does gluster become aware of something it is not keeping track of? At the time you access the file, gluster knows something went wrong, so it restores it. If you change something on the bricks, even by accident, all the data gluster keeps (similar to a journal) is a waste. Even disk filesystems will ask you to run fsck if something unexpected happens, so a full self-heal is a similar operation.
>>>>>
>>>>> You are absolutely right. The question is: why does gluster not become aware of such a problem in the case of self-healing?
>>>>
>>>> Because operations that are performed directly on a brick do not go through the gluster stack.
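A sketch of where that tracking actually lives, assuming the brick path from the listings below: the index that index heal and "heal info" consult sits inside the brick's .glusterfs directory:

    ls /wall/pool/brick/.glusterfs/indices/xattrop

An operation that fails on one replica while going through a client mount leaves a gfid entry in there; a truncate or rm done straight on the brick never touches it, which is exactly why index heal and heal info come up empty.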
>>>> OK, I'll repeat:
>>>> As you know, if you access a missing or wrong-length file from a fuse client it is restored (healed), i.e. gluster recognizes that the file is wrong and heals it, so I do not see any reason why self-healing can't provide this same function.
>>>
>>> For which you need to access the file.
>>>
>>> That's right.
>>>
>>> For which you need a full crawl. You can't detect a modification which doesn't go through the stack, so this is the only possibility.
>>>
>>> OK, then: if self-heal is really useless here and no way to get it will be provided, I guess we'll use an external script to check the consistency of the brick directories; I don't think ls and diff will take many resources.
>>
>> How is this different from full self-heal?
>>
>> Self-heal does not detect deleted or wrong-length files.
>
> It does detect them when you do a full crawl, which is essentially an "ls -laR" kind of thing on the whole volume. You don't need any external scripts; keep doing a full crawl once in a while, maybe?
>
> You mean on a fuse mount?
>
> It doesn't work:
>
> [root at father ~]# mount -t glusterfs localhost:/pool gluster
> [root at father ~]#
>
> Then make the file zero-length directly on the brick:
>
> [root at father gluster]# > /wall/pool/brick/gstatus-0.64-3.el7.x86_64.rpm
> [root at father gluster]#
>
> [root at father gluster]# ls -laR /root/gluster/
> /root/gluster/:
> total 122153384
> drwxr-xr-x   4 qemu qemu        4096 Jul 11 13:36 .
> dr-xr-x---. 10 root root        4096 Jul 11 12:26 ..
> -rw-r--r--   1 root root  8589934592 Jul 11 09:14 csr1000v1.img
> -rw-r--r--   1 root root           0 Jul 13 10:34 gstatus-0.64-3.el7.x86_64.rpm
>
> As you can see, gstatus-0.64-3.el7.x86_64.rpm has 0 length. But:
>
> [root at father gluster]# touch /root/gluster/gstatus-0.64-3.el7.x86_64.rpm
> [root at father gluster]# ls -laR /root/gluster/
> /root/gluster/:
> total 122153436
> drwxr-xr-x   4 qemu qemu        4096 Jul 11 13:36 .
> dr-xr-x---. 10 root root        4096 Jul 11 12:26 ..
> -rw-r--r--   1 root root  8589934592 Jul 11 09:14 csr1000v1.img
> -rw-r--r--   1 root root       52268 Jul 13 10:36 gstatus-0.64-3.el7.x86_64.rpm
>
> I.e. only if I do some I/O on the file does it come back.
>
> By the way, there is the same problem if I delete a file directly on the brick:
>
> [root at father gluster]# rm /wall/pool/brick/gstatus-0.64-3.el7.x86_64.rpm
> rm: remove regular file '/wall/pool/brick/gstatus-0.64-3.el7.x86_64.rpm'? y
> [root at father gluster]# ls -laR /root/gluster/
> /root/gluster/:
> total 122153384
> drwxr-xr-x   4 qemu qemu        4096 Jul 13 10:38 .
> dr-xr-x---. 10 root root        4096 Jul 11 12:26 ..
> -rw-r--r--   1 root root  8589934592 Jul 11 09:14 csr1000v1.img
> -rw-r--r--   1 qemu qemu 43692064768 Jul 13 10:38 infimonitor.img
>
> I don't see it in the directory on the fuse mount at all until a touch, which restores the file too.
>
> If you need any performance improvements here, we will be happy to help. Please give us feedback.
>
> Your recipe doesn't work :-( If there is a difference between the brick directories due to direct brick manipulation, it leads to problems.

You have to execute "gluster volume heal <volname> full" for triggering full heal.
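For example, with the volume from your mount command above:

    gluster volume heal pool full    # schedule a crawl of all bricks and heal any differences found
    gluster volume heal pool info    # list the entries gluster currently knows need heal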
>> All I was saying is that it is not possible to detect them through index heal, because for the index to be populated you need the operations to go through the gluster stack.
>
> Why can't it? I don't know; you just said it is impossible in gluster because it can only track changes made through gluster, i.e. bricks can have different file sets and this is not recognized (true), because, as I understand it, gluster's self-heal assumes that the brick's underlying filesystem can't be corrupted by the server admin (not true; I can say this as an engineer with almost 25 years of experience, i.e. I have done it several times ;-) ).
>
> Thank you!
>
> p.s.
> I still can't understand why it can't be implemented in gluster... :-(

-- 
Pranith
13.07.2016 11:40, Pranith Kumar Karampuri wrote:

>> Your recipe doesn't work :-( If there is a difference between the brick directories due to direct brick manipulation, it leads to problems.
>
> You have to execute "gluster volume heal <volname> full" for triggering full heal.

Yeah, but I need to know that I need to execute it. Any help from gluster, or only an external script?
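For what it's worth, two rough, untested sketches of how this could be automated; the second host name ("son") and the brick path are placeholders based on the setup in this thread, not anything gluster provides:

    # 1) Don't try to detect at all: schedule a full heal periodically
    #    (crontab entry on one node; the path to the gluster binary may differ).
    0 3 * * * /usr/sbin/gluster volume heal pool full

    # 2) External check: compare file names and sizes across the two replica bricks,
    #    skipping gluster's internal .glusterfs directory (GNU find).
    ssh father 'cd /wall/pool/brick && find . -path ./.glusterfs -prune -o -type f -printf "%P %s\n"' | sort > /tmp/brick-a
    ssh son    'cd /wall/pool/brick && find . -path ./.glusterfs -prune -o -type f -printf "%P %s\n"' | sort > /tmp/brick-b
    diff /tmp/brick-a /tmp/brick-b    # any output means the bricks have diverged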