ABHISHEK PALIWAL
2016-Mar-03 11:24 UTC
[Gluster-users] [Gluster-devel] Query on healing process
On Thu, Mar 3, 2016 at 4:10 PM, Ravishankar N <ravishankar at redhat.com> wrote:

> Hi,
>
> On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:
>
> Hi Ravi,
>
> As I discussed earlier, I investigated this issue and found that healing
> is not triggered because the "gluster volume heal c_glusterfs info
> split-brain" command does not show any entries in its output, even though
> the file is in a split-brain-like state.
>
>
> Couple of observations from the 'commands_output' file.
>
> getfattr -d -m . -e hex
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> The afr xattrs do not indicate that the file is in split brain:
> # file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000000b56d6dd1d000ec7a9
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> getfattr -d -m . -e hex
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
> trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
> trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
> trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000000b56d6dcb7000c87e7
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> 1. There doesn't seem to be a split-brain going by the trusted.afr* xattrs.

If it is not a split-brain problem, then how can I resolve this?

> 2. You seem to have re-used the bricks from another volume/setup. For
> replica 2, only trusted.afr.c_glusterfs-client-0 and
> trusted.afr.c_glusterfs-client-1 must be present but I see 4 xattrs -
> client-0,2,4 and 6

Could you please suggest why these entries are there? I am not able to
work out the scenario. I am rebooting one board multiple times to
reproduce the issue, and after every reboot I do a remove-brick and
add-brick on the same volume for the second board.

> 3. On the rebooted node, do you have ssl enabled by any chance? There is a
> bug for "Not able to fetch volfile" when ssl is enabled:
> https://bugzilla.redhat.com/show_bug.cgi?id=1258931
>
> Btw, for data and metadata split-brains you can use the gluster CLI
> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
> instead of modifying the file from the back end.

But you are saying it is not a split-brain problem, and the split-brain
command is not showing any file, so how can I find the bigger file?
Also, in my case the file size is fixed at 2 MB; it is overwritten every
time.

> -Ravi
>
> So, what I have done is manually delete the gfid entry of that file from
> the .glusterfs directory and follow the instructions in the following
> link to do the heal:
>
> https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md
>
> and this works fine for me.
>
> But my question is why the split-brain command is not showing any file in
> its output.
>
> Here I am attaching all the logs which I got from the node for you, and
> also the output of the commands from both of the boards.
>
> In this tar file two directories are present:
>
> 000300 - logs for the board which is running continuously
> 002500 - logs for the board which was rebooted
>
> I am waiting for your reply; please help me out on this issue.
>
> Thanks in advance.
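
(For reference, a minimal sketch of the CLI-based resolution described in
the heal-info-and-split-brain-resolution.md link above, using the volume
and brick names from this thread. Since the log file is a fixed 2 MB on
both bricks, the bigger-file policy would not help, so source-brick is the
relevant one; this only applies if "heal info split-brain" actually lists
the file:

    # list files the CLI currently considers to be in split-brain
    gluster volume heal c_glusterfs info split-brain

    # pick the brick whose copy is known to be good and use it as the source
    gluster volume heal c_glusterfs split-brain source-brick \
        10.32.0.48:/opt/lvmdir/c2/brick \
        /logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

The file argument is given as its path from the volume root, not the brick
path on disk.)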
> Regards,
> Abhishek
>
> On Fri, Feb 26, 2016 at 1:21 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
>
>> On Fri, Feb 26, 2016 at 10:28 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>>
>>> On 02/26/2016 10:10 AM, ABHISHEK PALIWAL wrote:
>>>
>>> Yes, correct.
>>>
>>> Okay, so when you say the files are not in sync until some time, are you
>>> getting stale data when accessing them from the mount?
>>> I'm not able to figure out why heal info shows zero entries when the
>>> files are not in sync, despite all IO happening from the mounts. Could
>>> you provide the output of getfattr -d -m . -e hex /brick/file-name from
>>> both bricks when you hit this issue?
>>>
>>> I'll provide the logs once I get them. Here, the delay means we are
>>> powering on the second board after 10 minutes.
>>>
>>> On Feb 26, 2016 9:57 AM, "Ravishankar N" <ravishankar at redhat.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:
>>>>
>>>> Hi Ravi,
>>>>
>>>> Thanks for the response.
>>>>
>>>> We are using GlusterFS 3.7.8.
>>>>
>>>> Here is the use case:
>>>>
>>>> We have a logging file which saves logs of the events for every board
>>>> of a node, and these files are kept in sync using glusterfs. The system
>>>> is in replica 2 mode, which means that when one brick in a replicated
>>>> volume goes offline, the glusterd daemons on the other nodes keep track
>>>> of all the files that are not replicated to the offline brick. When the
>>>> offline brick becomes available again, the cluster initiates a healing
>>>> process, replicating the updated files to that brick. But in our case,
>>>> we see that the log file of one board is not in sync and its format is
>>>> corrupted, i.e. the files are not in sync.
>>>>
>>>> Just to understand you correctly, you have mounted the 2-node replica-2
>>>> volume on both these nodes and are writing to a logging file from the
>>>> mounts, right?
>>>>
>>>> Even the outcome of # gluster volume heal c_glusterfs info shows that
>>>> there are no pending heals.
>>>>
>>>> Also, the logging file which is updated is of fixed size, and the new
>>>> entries wrap around, overwriting the old entries.
>>>>
>>>> This way we have seen that after a few restarts the contents of the
>>>> same file on the two bricks are different, but the volume heal info
>>>> shows zero entries.
>>>>
>>>> Solution:
>>>>
>>>> But when we tried to put a delay of > 5 min before the healing,
>>>> everything works fine.
>>>>
>>>> Regards,
>>>> Abhishek
>>>>
>>>> On Fri, Feb 26, 2016 at 6:35 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>>>>
>>>>> On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Here I have one query regarding the time taken by the healing process.
>>>>> In the current two-node setup, when we rebooted one node the
>>>>> self-healing process started in less than a 5-minute interval on the
>>>>> board, which resulted in corruption of some files' data.
>>>>>
>>>>> Heal should start immediately after the brick process comes up. What
>>>>> version of gluster are you using? What do you mean by corruption of
>>>>> data? Also, how did you observe that the heal started after 5 minutes?
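
(A minimal capture sketch for the getfattr output requested here, using the
brick path from this setup; running it on both boards at the moment the
mismatch is seen allows the xattrs and the actual on-brick content to be
compared side by side:

    F=/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

    getfattr -d -m . -e hex "$F"    # AFR changelog xattrs, dirty flag and gfid
    md5sum "$F"                     # checksum of the on-brick copy
    stat -c '%s %Y' "$F"            # size and mtime for a quick comparison
)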
>>>>> -Ravi
>>>>>
>>>>> And to resolve it I searched on Google and found the following link:
>>>>> https://support.rackspace.com/how-to/glusterfs-troubleshooting/
>>>>>
>>>>> It mentions that the healing process can take up to 10 minutes to
>>>>> start.
>>>>>
>>>>> Here is the statement from the link:
>>>>>
>>>>> "Healing replicated volumes
>>>>>
>>>>> When any brick in a replicated volume goes offline, the glusterd
>>>>> daemons on the remaining nodes keep track of all the files that are not
>>>>> replicated to the offline brick. When the offline brick becomes
>>>>> available again, the cluster initiates a healing process, replicating
>>>>> the updated files to that brick. *The start of this process can take up
>>>>> to 10 minutes, based on observation.*"
>>>>>
>>>>> After allowing more than 5 minutes, the file corruption problem was
>>>>> resolved.
>>>>>
>>>>> So, my question is: is there any way through which we can reduce the
>>>>> time taken by the healing process to start?
>>>>>
>>>>> Regards,
>>>>> Abhishek Paliwal
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel at gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>
>>>> --
>>>> Regards
>>>> Abhishek Paliwal
>>
>> --
>> Regards
>> Abhishek Paliwal
>
> --
> Regards
> Abhishek Paliwal

-- 
Regards
Abhishek Paliwal
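
(On the question of reducing the time before healing starts: a sketch,
assuming GlusterFS 3.7.x and the volume name used in this thread. The
self-heal daemon crawls its heal index periodically; the interval is
controlled by the cluster.heal-timeout option, whose 600-second default
lines up with the "up to 10 minutes" observation quoted above, and a heal
can also be kicked off manually:

    # confirm the self-heal daemon is up on both nodes after the reboot
    gluster volume status c_glusterfs

    # trigger an index heal immediately instead of waiting for the next crawl
    gluster volume heal c_glusterfs

    # or force a full crawl of the bricks
    gluster volume heal c_glusterfs full

    # shorten the self-heal daemon's crawl interval (default 600 seconds)
    gluster volume set c_glusterfs cluster.heal-timeout 120
)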
ABHISHEK PALIWAL
2016-Mar-04 06:40 UTC
[Gluster-users] [Gluster-devel] Query on healing process
Hi Ravi,

> 3. On the rebooted node, do you have ssl enabled by any chance? There is a
> bug for "Not able to fetch volfile" when ssl is enabled:
> https://bugzilla.redhat.com/show_bug.cgi?id=1258931

I have checked: ssl is disabled, but I am still getting these errors:

# gluster volume heal c_glusterfs info
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.

# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.

And based on your observation I understood that this is not a split-brain
problem, but is there any way through which I can find the file which is
not in split-brain but is also not in sync?

# getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000000000000000000000
trusted.afr.c_glusterfs-client-2=0x000000000000000000000000
trusted.afr.c_glusterfs-client-4=0x000000000000000000000000
trusted.afr.c_glusterfs-client-6=0x000000000000000000000000
trusted.afr.c_glusterfs-client-8=0x000000060000000000000000
    // client-8 is the latest client in our case, and the first 8 digits
    // (00000006...) say there is something pending in the changelog data
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
    // and here we can say that there is no split-brain, but the file is
    // out of sync
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on

# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on

# gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
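
(On finding files that are out of sync but not reported by heal info or the
split-brain commands: each trusted.afr.<volume>-client-N value packs three
32-bit counters - data, metadata and entry pending operations, in that
order - so 0x000000060000000000000000 above means six pending data
operations against client-8, i.e. writes that still need to be synced to
the other brick. A rough sketch that scans a brick for any file carrying a
non-zero pending counter, using the brick path from this setup; run it on
each board:

    BRICK=/opt/lvmdir/c2/brick
    # walk the brick, skip the .glusterfs metadata tree, and print every
    # file whose AFR changelog xattrs are non-zero
    find "$BRICK" -path "$BRICK/.glusterfs" -prune -o -type f -print0 |
      xargs -0 getfattr -d -m . -e hex 2>/dev/null |
      awk '/^# file:/ { f = $0 }
           /^trusted\.afr\.c_glusterfs-client-/ && !/=0x0+$/ { print f; print }'
)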
# gluster volume heal info heal-failed
Usage: volume heal <VOLNAME> [enable | disable | full |
       statistics [heal-count [replica <HOSTNAME:BRICKNAME>]] |
       info [healed | heal-failed | split-brain] |
       split-brain {bigger-file <FILE> |
                    source-brick <HOSTNAME:BRICKNAME> [<FILE>]}]

# gluster volume heal c_glusterfs info heal-failed
Command not supported. Please use "gluster volume heal c_glusterfs info"
and logs to find the heal information.

# lhsh 002500
002500> gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
002500>

Regards,
Abhishek