13.07.2016 09:26, Pranith Kumar Karampuri wrote:

> 12.07.2016 17:39, Pranith Kumar Karampuri wrote:
> Wow, what are the steps to recreate the problem?
>
> ----- Original Message -----
> From: "Dmitry Melekhov" <dm at belkam.com>
> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Tuesday, July 12, 2016 9:27:17 PM
> Subject: Re: [Gluster-users] 3.7.13, index healing broken?
>
> Just set the file length to zero, always reproducible.
>
> 13.07.2016 01:52, Anuradha Talur wrote:
> If you are setting the file length to 0 on one of the bricks (looks
> like that is the case), it is not a bug.
>
> Index heal relies on failures seen from the mount point(s) to identify
> the files that need heal. It won't be able to recognize any file
> modification done directly on bricks. The same goes for the heal info
> command, which is the reason heal info also shows 0 entries.
>
> On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov <dm at belkam.com> wrote:
> Well, this makes self-heal useless then: if any file is accidentally
> corrupted or deleted (yes, a file deleted directly from a brick is not
> recognized by index heal either), it will not be self-healed, because
> self-heal uses index heal.
>
> 13.07.2016 08:36, Pranith Kumar Karampuri wrote:
> It is better to look into the bit-rot feature if you want to guard
> against these kinds of problems.
>
> On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov <dm at belkam.com> wrote:
> Bit rot detects bit problems, not missing files or their wrong length,
> i.e. this is overhead for such a simple task.
>
> 13.07.2016 08:46, Pranith Kumar Karampuri wrote:
> It detects wrong length, because the checksum won't match anymore.
>
> What use case are you trying out that leads to changing things
> directly on the brick?
>
> On Wed, Jul 13, 2016 at 10:23 AM, Dmitry Melekhov <dm at belkam.com> wrote:
> Yes, sure. I guess that it will detect missing files too. But doesn't
> it need far more resources than just comparing directories in bricks?
>
> I'm trying to test gluster failure tolerance and right now I'm not
> happy with it...
>
> 13.07.2016 08:56, Pranith Kumar Karampuri wrote:
> Which cases of fault tolerance are you not happy with? Making changes
> directly on the brick, or anything else as well?
>
> On Wed, Jul 13, 2016 at 10:29 AM, Dmitry Melekhov <dm at belkam.com> wrote:
> I'll repeat: as I already said, if for some reason (in a real case it
> can only be by accident) I delete a file, this will not be detected by
> the self-heal daemon and thus will lead to a lower replication level,
> i.e. lower failure tolerance.
>
> 13.07.2016 09:04, Pranith Kumar Karampuri wrote:
> To prevent such accidents you need to set selinux policies so that
> files under the brick are not modified by accident by any user. At
> least that is the solution I remember when this was discussed 3-4
> years back.
>
> On Wed, Jul 13, 2016 at 10:38 AM, Dmitry Melekhov <dm at belkam.com> wrote:
> So the only supported platform is linux? Or maybe it is better to
> improve self-healing to detect missing or wrong-length files; I guess
> this is a very low-cost operation in terms of host resources.
> Just a suggestion, maybe we need to look at alternatives in the near
> future....
>
> 13.07.2016 09:16, Pranith Kumar Karampuri wrote:
> This is a corner case; from a design perspective it is generally not a
> good idea to optimize for the corner case. It is better to protect
> ourselves from the corner case (SElinux etc), or you can also use
> snapshots to protect against these kinds of mishaps.
>
> On Wed, Jul 13, 2016 at 10:50 AM, Dmitry Melekhov <dm at belkam.com> wrote:
> Sorry, I don't agree.
> As you know, when a missing or wrong-length file is accessed from a
> fuse client it is restored (healed), i.e. gluster recognizes that the
> file is wrong and heals it, so I do not see any reason not to provide
> the same function in self-healing.
> Thank you!
>
> 13.07.2016 09:26, Pranith Kumar Karampuri wrote:
> Ah! Now how do you suggest we keep track of which of 10s of millions
> of files the user accidentally deleted from the brick without
> gluster's knowledge? Once it comes to gluster's knowledge we can do
> something. But how does gluster become aware of something it is not
> keeping track of? At the time you access it gluster knows something
> went wrong, so it restores it. If you change something on the bricks,
> even by accident, all the data gluster keeps (similar to a journal) is
> a waste. Even the disk filesystems will ask you to do fsck if
> something unexpected happens, so full self-heal is a similar
> operation.
>
> --
> Pranith

You are absolutely right. The question is why gluster does not become
aware of such a problem in the case of self-healing?
Pranith Kumar Karampuri
2016-Jul-13 05:36 UTC
[Gluster-users] 3.7.13, index healing broken?
On Wed, Jul 13, 2016 at 10:58 AM, Dmitry Melekhov <dm at belkam.com> wrote:

> You are absolutely right. The question is why gluster does not become
> aware of such a problem in the case of self-healing?

Because the operations that are performed directly on the brick do not
go through the gluster stack.

--
Pranith
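
The "just comparing directories in bricks" idea raised in this thread can
be sketched roughly as below. This is only an illustration and not how
gluster's self-heal works: the function names `scan_brick` and
`compare_bricks` are hypothetical, the two bricks are assumed to be plain
local directories, and skipping a directory named `.glusterfs` is an
assumption about where gluster keeps its internal metadata. The sketch
also shows the cost being debated: every run has to walk every file on
both bricks, much like an fsck or a full heal.

```python
import os
from pathlib import Path


def scan_brick(root):
    """Walk one brick directory and map each relative file path to its size."""
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: any directory named .glusterfs holds internal
        # metadata, so it is pruned from the walk and never compared.
        if ".glusterfs" in dirnames:
            dirnames.remove(".glusterfs")
        for name in filenames:
            full = Path(dirpath) / name
            # lstat so a dangling symlink does not raise an error.
            sizes[str(full.relative_to(root))] = full.lstat().st_size
    return sizes


def compare_bricks(brick_a, brick_b):
    """Return (missing_on_a, missing_on_b, size_mismatch) as sorted lists."""
    a = scan_brick(brick_a)
    b = scan_brick(brick_b)
    missing_on_a = sorted(set(b) - set(a))
    missing_on_b = sorted(set(a) - set(b))
    size_mismatch = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing_on_a, missing_on_b, size_mismatch
```

Called as `compare_bricks("/bricks/a", "/bricks/b")` (hypothetical paths),
it reports which files differ but cannot say which copy is the good one,
and sizes may differ transiently while writes are in flight on a live
volume.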