Pranith Kumar Karampuri
2016-Jul-13 04:46 UTC
[Gluster-users] 3.7.13, index healing broken?
On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov <dm at belkam.com> wrote:

> 13.07.2016 08:36, Pranith Kumar Karampuri wrote:
>
>> On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>>> 13.07.2016 01:52, Anuradha Talur wrote:
>>>
>>>> ----- Original Message -----
>>>> From: "Dmitry Melekhov" <dm at belkam.com>
>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>> Cc: "gluster-users" <gluster-users at gluster.org>
>>>> Sent: Tuesday, July 12, 2016 9:27:17 PM
>>>> Subject: Re: [Gluster-users] 3.7.13, index healing broken?
>>>>
>>>>> 12.07.2016 17:39, Pranith Kumar Karampuri wrote:
>>>>>
>>>>>> Wow, what are the steps to recreate the problem?
>>>>>
>>>>> Just set the file length to zero, always reproducible.
>>>>
>>>> If you are setting the file length to 0 on one of the bricks (looks
>>>> like that is the case), it is not a bug.
>>>>
>>>> Index heal relies on failures seen from the mount point(s) to
>>>> identify the files that need heal. It won't be able to recognize any
>>>> file modification done directly on bricks. The same goes for the heal
>>>> info command, which is the reason heal info also shows 0 entries.
>>>
>>> Well, this makes self-heal useless then: if any file is accidentally
>>> corrupted or deleted (yes! if a file is deleted directly from a brick,
>>> this is not recognized by index heal either), then it will not be
>>> self-healed, because self-heal uses index heal.
>>
>> It is better to look into the bit-rot feature if you want to guard
>> against these kinds of problems.
>
> Bit rot detects bit problems, not missing files or their wrong length,
> i.e. this is overhead for such a simple task.

It detects wrong length, because the checksum won't match anymore.

What use case are you trying out that leads to changing things directly
on the brick?

> Thank you!
>
>>>> Heal full, on the other hand, will individually compare certain
>>>> aspects of all files/dirs to identify files to be healed. This is why
>>>> heal full works in this case but index heal doesn't.
>>>
>>> OK, thank you for the explanation, but once again, how about
>>> self-healing and data consistency?
>>> And if I access this deleted or broken file from a client, then it
>>> will be healed; I guess this is what self-heal needs to do.
>>>
>>> Thank you!
>>>
>>>>>> On Tue, Jul 12, 2016 at 3:09 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>>
>>>>>>> 12.07.2016 13:33, Pranith Kumar Karampuri wrote:
>>>>>>>
>>>>>>>> What was "gluster volume heal <volname> info" showing when you
>>>>>>>> saw this issue?
>>>>>>>
>>>>>>> Just reproduced:
>>>>>>>
>>>>>>> [root@father brick]# > gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>
>>>>>>> [root@father brick]# gluster volume heal pool
>>>>>>> Launching heal operation to perform index self heal on volume pool has been successful
>>>>>>> Use heal info commands to check status
>>>>>>> [root@father brick]# gluster volume heal pool info
>>>>>>> Brick father:/wall/pool/brick
>>>>>>> Status: Connected
>>>>>>> Number of entries: 0
>>>>>>>
>>>>>>> Brick son:/wall/pool/brick
>>>>>>> Status: Connected
>>>>>>> Number of entries: 0
>>>>>>>
>>>>>>> Brick spirit:/wall/pool/brick
>>>>>>> Status: Connected
>>>>>>> Number of entries: 0
>>>>>>>
>>>>>>> [root@father brick]#
>>>>>>>
>>>>>>>> On Mon, Jul 11, 2016 at 3:28 PM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>>>>>>
>>>>>>>>> Hello!
>>>>>>>>>
>>>>>>>>> 3.7.13, 3-brick volume.
>>>>>>>>>
>>>>>>>>> Inside one of the bricks:
>>>>>>>>>
>>>>>>>>> [root@father brick]# ls -l gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> -rw-r--r-- 2 root root 52268 Jul 11 13:00 gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> [root@father brick]#
>>>>>>>>>
>>>>>>>>> [root@father brick]# > gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> [root@father brick]# ls -l gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> -rw-r--r-- 2 root root 0 Jul 11 13:54 gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> [root@father brick]#
>>>>>>>>>
>>>>>>>>> So now the file has 0 length.
>>>>>>>>>
>>>>>>>>> Try to heal:
>>>>>>>>>
>>>>>>>>> [root@father brick]# gluster volume heal pool
>>>>>>>>> Launching heal operation to perform index self heal on volume pool has been successful
>>>>>>>>> Use heal info commands to check status
>>>>>>>>> [root@father brick]# ls -l gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> -rw-r--r-- 2 root root 0 Jul 11 13:54 gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> [root@father brick]#
>>>>>>>>>
>>>>>>>>> Nothing!
>>>>>>>>>
>>>>>>>>> [root@father brick]# gluster volume heal pool full
>>>>>>>>> Launching heal operation to perform full self heal on volume pool has been successful
>>>>>>>>> Use heal info commands to check status
>>>>>>>>> [root@father brick]# ls -l gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> -rw-r--r-- 2 root root 52268 Jul 11 13:00 gstatus-0.64-3.el7.x86_64.rpm
>>>>>>>>> [root@father brick]#
>>>>>>>>>
>>>>>>>>> Full heal is OK.
>>>>>>>>>
>>>>>>>>> But self-heal is doing index heal, according to
>>>>>>>>>
>>>>>>>>> http://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Developer-guide/afr-self-heal-daemon/
>>>>>>>>>
>>>>>>>>> Is this a bug?
>>>>>>>>>
>>>>>>>>> As far as I remember it worked in 3.7.10....
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Gluster-users mailing list
>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>
>>>>>>>> --
>>>>>>>> Pranith
>>>>>>
>>>>>> --
>>>>>> Pranith
>>
>> --
>> Pranith

--
Pranith
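The distinction drawn in the thread (index heal only knows about failures seen through the mount point, while full heal compares every file) can be illustrated with a toy model. The sketch below is not GlusterFS code and makes several assumptions: `ToyReplica`, its methods, and the "most common copy wins" repair rule are all hypothetical names and simplifications invented for illustration.

```python
from collections import Counter


class ToyReplica:
    """Toy stand-in for a replicated volume. This is NOT GlusterFS code."""

    def __init__(self, brick_names):
        # Each brick is modelled as a dict mapping path -> file contents.
        self.bricks = {name: {} for name in brick_names}
        self.down = set()    # bricks currently unreachable
        self.index = set()   # paths recorded as needing heal

    def write(self, path, data):
        """Write through the 'mount point'; a failed brick is noted in the index."""
        for name, brick in self.bricks.items():
            if name in self.down:
                self.index.add(path)
            else:
                brick[path] = data

    def _repair(self, path):
        # Simplification (assumption): treat the most common copy as the good one.
        copies = [brick.get(path) for brick in self.bricks.values()]
        good = Counter(copies).most_common(1)[0][0]
        for brick in self.bricks.values():
            if good is None:
                brick.pop(path, None)
            else:
                brick[path] = good

    def index_heal(self):
        """Heal only the paths that the index already knows about."""
        healed = sorted(self.index)
        for path in healed:
            self._repair(path)
        self.index.clear()
        return healed

    def full_heal(self):
        """Crawl every path on every brick and compare the copies."""
        all_paths = set()
        for brick in self.bricks.values():
            all_paths.update(brick)
        healed = []
        for path in sorted(all_paths):
            copies = {brick.get(path) for brick in self.bricks.values()}
            if len(copies) > 1:
                self._repair(path)
                healed.append(path)
        return healed


volume = ToyReplica(["father", "son", "spirit"])
volume.write("gstatus.rpm", b"payload")
volume.bricks["father"]["gstatus.rpm"] = b""  # truncate directly on one brick

index_result = volume.index_heal()  # the index never saw the change
full_result = volume.full_heal()    # the crawl notices the mismatch
print(index_result, full_result)
```

In this model the direct truncation bypasses `write`, so nothing is ever added to the index and `index_heal` has nothing to do, while `full_heal` finds the mismatch by comparing copies. A write made while a brick is marked down would, by contrast, be recorded and picked up by `index_heal`.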
13.07.2016 08:46, Pranith Kumar Karampuri wrote:

> On Wed, Jul 13, 2016 at 10:10 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>
>> 13.07.2016 08:36, Pranith Kumar Karampuri wrote:
>>
>>> On Wed, Jul 13, 2016 at 9:35 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>>
>>>> 13.07.2016 01:52, Anuradha Talur wrote:
>>>>
>>>>> ----- Original Message -----
>>>>> From: "Dmitry Melekhov" <dm at belkam.com>
>>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>>> Cc: "gluster-users" <gluster-users at gluster.org>
>>>>> Sent: Tuesday, July 12, 2016 9:27:17 PM
>>>>> Subject: Re: [Gluster-users] 3.7.13, index healing broken?
>>>>>
>>>>>> 12.07.2016 17:39, Pranith Kumar Karampuri wrote:
>>>>>>
>>>>>>> Wow, what are the steps to recreate the problem?
>>>>>>
>>>>>> Just set the file length to zero, always reproducible.
>>>>>
>>>>> If you are setting the file length to 0 on one of the bricks (looks
>>>>> like that is the case), it is not a bug.
>>>>>
>>>>> Index heal relies on failures seen from the mount point(s) to
>>>>> identify the files that need heal. It won't be able to recognize any
>>>>> file modification done directly on bricks. The same goes for the heal
>>>>> info command, which is the reason heal info also shows 0 entries.
>>>>
>>>> Well, this makes self-heal useless then: if any file is accidentally
>>>> corrupted or deleted (yes! if a file is deleted directly from a brick,
>>>> this is not recognized by index heal either), then it will not be
>>>> self-healed, because self-heal uses index heal.
>>>
>>> It is better to look into the bit-rot feature if you want to guard
>>> against these kinds of problems.
>>
>> Bit rot detects bit problems, not missing files or their wrong length,
>> i.e. this is overhead for such a simple task.
>
> It detects wrong length, because the checksum won't match anymore.

Yes, sure. I guess that it will detect missing files too.
But it needs far more resources than just comparing directories in bricks?

> What use case are you trying out that leads to changing things directly
> on the brick?

I'm trying to test gluster failure tolerance and right now I'm not happy
with it...

>> Thank you!
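The trade-off raised here, comparing directories versus comparing checksums, can be made concrete with a small sketch. It assumes two plain directory trees standing in for bricks and is not how GlusterFS or its bit-rot feature is implemented; `snapshot`, `differing_paths`, and `write_file` are hypothetical helpers, SHA-256 is chosen only for illustration, and a real brick also contains the internal `.glusterfs` directory, which this sketch does not handle.

```python
import hashlib
import os
import tempfile


def snapshot(root, with_checksum=False):
    """Map each relative file path under root to (size, sha256-or-None)."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digest = None
            if with_checksum:
                # Reads the whole file: this is the expensive part.
                with open(full, "rb") as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
            entries[os.path.relpath(full, root)] = (os.path.getsize(full), digest)
    return entries


def differing_paths(root_a, root_b, with_checksum=False):
    """Return the sorted paths that are missing on one side or differ."""
    a = snapshot(root_a, with_checksum)
    b = snapshot(root_b, with_checksum)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))


def write_file(root, name, data):
    with open(os.path.join(root, name), "wb") as handle:
        handle.write(data)


with tempfile.TemporaryDirectory() as good, tempfile.TemporaryDirectory() as bad:
    for name in ("same.txt", "truncated.txt", "missing.txt", "flipped.txt"):
        write_file(good, name, b"payload")
    write_file(bad, "same.txt", b"payload")
    write_file(bad, "truncated.txt", b"")
    write_file(bad, "flipped.txt", b"pbyload")  # same length, different bytes

    cheap = differing_paths(good, bad)                         # names and sizes only
    thorough = differing_paths(good, bad, with_checksum=True)  # also hashes contents

print(cheap)     # ['missing.txt', 'truncated.txt']
print(thorough)  # ['flipped.txt', 'missing.txt', 'truncated.txt']
```

The name-and-size comparison only needs directory listings and file metadata, and it catches the truncated and missing files discussed in the thread. It cannot see a same-length corruption, which only the checksum pass finds, at the cost of reading every file.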