> Pranith Kumar Karampuri (12.07.2016 17:39):
> Wow, what are the steps to recreate the problem?
>
> Dmitry Melekhov <dm at belkam.com> (Tuesday, July 12, 2016 9:27:17 PM):
> Just set the file length to zero; it is always reproducible.
>
> Anuradha Talur (13.07.2016 01:52):
> If you are setting the file length to 0 on one of the bricks (looks like
> that is the case), it is not a bug.
>
> Index heal relies on failures seen from the mount point(s) to identify the
> files that need heal. It won't be able to recognize any file modification
> done directly on bricks. The same goes for the heal info command, which is
> the reason heal info also shows 0 entries.
>
> Dmitry Melekhov (Wed, Jul 13, 2016 at 9:35 AM):
> Well, this makes self-heal useless then: if any file is accidentally
> corrupted or deleted (yes, a file deleted directly from the brick is not
> recognized by index heal either), it will not be self-healed, because
> self-heal uses index heal.
>
> Pranith Kumar Karampuri (13.07.2016 08:36):
> It is better to look into the bit-rot feature if you want to guard against
> these kinds of problems.
>
> Dmitry Melekhov (Wed, Jul 13, 2016 at 10:10 AM):
> Bit rot detects bit problems, not missing files or their wrong length,
> i.e. this is overhead for such a simple task.
>
> Pranith Kumar Karampuri (13.07.2016 08:46):
> It detects wrong length, because the checksum won't match anymore.
>
> Dmitry Melekhov (Wed, Jul 13, 2016 at 10:23 AM):
> Yes, sure. I guess that it will detect missing files too. But doesn't it
> need far more resources than just comparing directories in bricks?
>
> Pranith Kumar Karampuri (13.07.2016 08:46):
> What use case are you trying out that leads to changing things directly
> on the brick?
>
> Dmitry Melekhov (Wed, Jul 13, 2016 at 10:23 AM):
> I'm trying to test gluster failure tolerance, and right now I'm not happy
> with it...
>
> Pranith Kumar Karampuri (13.07.2016 08:56):
> Which cases of fault tolerance are you not happy with? Making changes
> directly on the brick, or anything else as well?
>
> Dmitry Melekhov (Wed, Jul 13, 2016 at 10:29 AM):
> I'll repeat:
> As I already said, if for some reason (in a real case this can only happen
> by accident) I delete a file, this will not be detected by the self-heal
> daemon and will thus lead to a lower replication level, i.e. lower failure
> tolerance.
>
> Pranith Kumar Karampuri (13.07.2016 09:04):
> To prevent such accidents you need to set SELinux policies so that files
> under the brick are not modified by accident by any user. At least that is
> the solution I remember from when this was discussed 3-4 years back.
>
> Dmitry Melekhov (Wed, Jul 13, 2016 at 10:38 AM):
> So the only supported platform is Linux? Or maybe it is better to improve
> self-healing to detect missing or wrong-length files; I guess this is a
> very low-cost operation in terms of host resources.
> Just a suggestion; maybe we need to look at alternatives in the near
> future...
>
> Pranith Kumar Karampuri (13.07.2016 09:16):
> This is a corner case, and from a design perspective it is generally not a
> good idea to optimize for the corner case. It is better to protect
> ourselves from the corner case (SELinux etc.), or you can also use
> snapshots to protect against these kinds of mishaps.
>
> Dmitry Melekhov (Wed, Jul 13, 2016 at 10:50 AM):
> Sorry, I don't agree.
> As you know, if a missing or wrong-length file is accessed from a FUSE
> client, it is restored (healed), i.e. gluster recognizes that the file is
> wrong and heals it, so I do not see any reason not to provide the same
> function as part of self-healing.
> Thank you!
>
> Pranith Kumar Karampuri (13.07.2016 09:26):
> Ah! Now how do you suggest we keep track of which of 10s of millions of
> files the user accidentally deleted from the brick without gluster's
> knowledge? Once it comes to gluster's knowledge we can do something. But
> how does gluster become aware of something it is not keeping track of? At
> the time you access it, gluster knows something went wrong, so it restores
> it. If you change something on the bricks, even by accident, all the data
> gluster keeps (similar to a journal) is a waste. Even disk filesystems
> will ask you to run fsck if something unexpected happens, so a full
> self-heal is a similar operation.
>
> Dmitry Melekhov (Wed, Jul 13, 2016 at 10:58 AM):
> You are absolutely right. The question is why gluster does not become
> aware of such a problem in the case of self-healing.
>
> Pranith Kumar Karampuri (13.07.2016 09:36):
> Because the operations that are performed directly on the brick do not go
> through the gluster stack.

OK, I'll repeat:
As you know, if a missing or wrong-length file is accessed from a FUSE
client, it is restored (healed), i.e. gluster recognizes that the file is
wrong and heals it, so I do not see any reason not to provide the same
function as part of self-healing.
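The trade-off argued over above (comparing lengths versus comparing checksums) can be sketched in a few lines. This is only an illustration: the byte strings and helper names are made up, SHA-256 is chosen here simply as a common checksum, and nothing below is taken from gluster's bit-rot implementation.

```python
import hashlib


def signature(data: bytes) -> str:
    """Checksum of the file contents (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def size_differs(good: bytes, suspect: bytes) -> bool:
    """Cheap check: only the lengths are compared."""
    return len(good) != len(suspect)


def checksum_differs(good: bytes, suspect: bytes) -> bool:
    """Expensive check: every byte of both copies has to be read and hashed."""
    return signature(good) != signature(suspect)


# Hypothetical file contents on the healthy brick and two damaged variants.
original = b"replicated file contents\n"
truncated = b""                               # length set to zero on a brick
flipped = b"replicated file contentz\n"       # same length, one byte changed

for label, damaged in (("truncated", truncated), ("flipped", flipped)):
    print(label,
          "size check:", size_differs(original, damaged),
          "checksum check:", checksum_differs(original, damaged))
```

A length comparison catches the truncated copy but not the same-length corruption, while a checksum catches both at the cost of reading the whole file; either way, something still has to visit the file for the comparison to happen.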
Pranith Kumar Karampuri
2016-Jul-13 05:50 UTC
[Gluster-users] 3.7.13, index healing broken?
On Wed, Jul 13, 2016 at 11:11 AM, Dmitry Melekhov <dm at belkam.com> wrote:

> OK, I'll repeat:
> As you know, if a missing or wrong-length file is accessed from a FUSE
> client, it is restored (healed), i.e. gluster recognizes that the file is
> wrong and heals it, so I do not see any reason not to provide the same
> function as part of self-healing.

For that, the file has to be accessed, and accessing every file means a full
crawl. You can't detect a modification that doesn't go through the stack, so
this is the only possibility.

--
Pranith
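The distinction made in this thread (index heal only knows about failures that passed through the stack, while a full crawl visits everything) can be illustrated with a toy model. Everything here is hypothetical: `ToyReplica`, its methods, and the "longer copy wins" rule are inventions for this sketch and say nothing about how gluster actually stores its index or picks a heal source.

```python
class ToyReplica:
    """Toy two-brick replica. NOT gluster code; names and rules are made up."""

    def __init__(self):
        self.bricks = [{}, {}]   # each brick maps file name -> contents
        self.index = {}          # file name -> brick number that missed a write

    def write(self, name, data, failed_brick=None):
        """Write through the 'stack'; a brick that misses the write is indexed."""
        for number, brick in enumerate(self.bricks):
            if number == failed_brick:
                self.index[name] = number
            else:
                brick[name] = data

    def _copy(self, name, source):
        """Copy a file from the source brick to the other brick."""
        self.bricks[1 - source][name] = self.bricks[source][name]

    def index_heal(self):
        """Heal only the files recorded in the index; return their names."""
        healed = sorted(self.index)
        for name in healed:
            self._copy(name, source=1 - self.index[name])
        self.index.clear()
        return healed

    def full_crawl_heal(self):
        """Visit every file on every brick and heal any mismatch found."""
        healed = []
        for name in sorted(set(self.bricks[0]) | set(self.bricks[1])):
            copies = [brick.get(name) for brick in self.bricks]
            if copies[0] == copies[1]:
                continue
            # Toy rule (an assumption): a missing or shorter copy is the bad one.
            sizes = [-1 if c is None else len(c) for c in copies]
            self._copy(name, source=0 if sizes[0] >= sizes[1] else 1)
            healed.append(name)
        return healed


volume = ToyReplica()
volume.write("a.txt", b"hello")
volume.write("b.txt", b"world", failed_brick=1)   # failure seen by the stack
volume.bricks[1]["a.txt"] = b""                    # truncated behind the stack

index_result = volume.index_heal()        # only knows about b.txt
crawl_result = volume.full_crawl_heal()   # finds a.txt by visiting everything
print(index_result, crawl_result)
```

The index stays small because it records only what the stack saw, which is why the out-of-band truncation is invisible to it; the crawl finds it, but its cost grows with the number of files. On a real volume the full crawl discussed here is requested with `gluster volume heal <VOLNAME> full`.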