thr3ads.net - Gluster users - [Gluster-users] [Gluster-devel] Query on healing process [Mar 2016]

Ravishankar N

2016-Mar-03 10:40 UTC

[Gluster-users] [Gluster-devel] Query on healing process

Hi,

On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:
> Hi Ravi,
>
> As I discussed earlier, I investigated this issue and found
> that healing is not triggered because the "gluster volume heal
> c_glusterfs info split-brain" command does not show any entries
> in its output, even though the file is in split-brain.
A couple of observations from the 'commands_output' file.

getfattr -d -m . -e hex opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
The afr xattrs do not indicate that the file is in split-brain:
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000000b56d6dd1d000ec7a9
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae



getfattr -d -m . -e hex opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000000b56d6dcb7000c87e7
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
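
As a reading aid for the values above: each trusted.afr.<volume>-client-N value is a 12-byte changelog holding three 32-bit big-endian counters, for pending data, metadata and entry operations against that client. A minimal Python sketch of the decoding follows; the function name decode_afr is made up for illustration and is not part of any gluster tooling.

```python
import struct

def decode_afr(hex_value):
    """Split a trusted.afr.* hex value into (data, metadata, entry) pending counts."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Three unsigned 32-bit integers in network (big-endian) byte order.
    return struct.unpack(">III", raw)

# Values copied from the getfattr output in this mail.
print(decode_afr("0x000000080000000000000000"))  # client-0, second output
print(decode_afr("0x000000000000000000000000"))  # client-1, first output
```

Read this way, the client-0 value in the second output means eight pending data operations and nothing pending for metadata or entries, while the client-1 value in the first output is all zeros.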

1. There doesn't seem to be a split-brain going by the trusted.afr* xattrs.
2. You seem to have re-used the bricks from another volume/setup. For 
replica 2, only trusted.afr.c_glusterfs-client-0 and 
trusted.afr.c_glusterfs-client-1 must be present but I see 4 xattrs - 
client-0,2,4 and 6
3. On the rebooted node, do you have ssl enabled by any chance? There is 
a bug for "Not able to fetch volfile" when ssl is enabled: 
https://bugzilla.redhat.com/show_bug.cgi?id=1258931
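
The first two observations can be sketched in Python. For a replica 2 file, a split-brain needs each brick's xattrs to record pending operations against the other brick; here only one side records anything. The sketch is simplified (it treats any non-zero counter as blame rather than looking at data, metadata and entry counters separately), the helper names are hypothetical, and the mapping of the two outputs to client-0 and client-1 is an assumption, since the outputs do not say which brick is which.

```python
def pending(hex_value):
    """Return True if any counter packed into the xattr value is non-zero."""
    return int(hex_value, 16) != 0

def unexpected_clients(xattrs, volume, replica_count):
    """List trusted.afr client xattrs whose index is outside 0..replica_count-1."""
    prefix = "trusted.afr.%s-client-" % volume
    found = [k for k in xattrs if k.startswith(prefix)]
    return sorted(k for k in found if int(k[len(prefix):]) >= replica_count)

# Values copied from the two getfattr outputs in this mail.
brick_a = {"trusted.afr.c_glusterfs-client-1": "0x000000000000000000000000"}
brick_b = {
    "trusted.afr.c_glusterfs-client-0": "0x000000080000000000000000",
    "trusted.afr.c_glusterfs-client-2": "0x000000020000000000000000",
    "trusted.afr.c_glusterfs-client-4": "0x000000020000000000000000",
    "trusted.afr.c_glusterfs-client-6": "0x000000020000000000000000",
}

# Assumption: brick_a is client-0 and brick_b is client-1.
a_blames_b = pending(brick_a.get("trusted.afr.c_glusterfs-client-1", "0x0"))
b_blames_a = pending(brick_b.get("trusted.afr.c_glusterfs-client-0", "0x0"))
split_brain = a_blames_b and b_blames_a

print(split_brain)
print(unexpected_clients(brick_b, "c_glusterfs", 2))
```

Under that assumption only the second brick blames the first, so the check reports no split-brain, and the client-2, client-4 and client-6 entries show up as unexpected for a replica 2 volume.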

Btw, for data and metadata split-brains you can use the gluster CLI 
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
instead of modifying the file from the back end.
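
For illustration, the CLI-based resolution described at that link takes roughly the following shape. This Python sketch only builds the argument lists and prints them; it does not run anything. The volume name comes from this thread, while the file path (as seen from the volume root) and the brick address are hypothetical placeholders, so please check the exact syntax against the linked document for your version.

```python
def heal_bigger_file(volume, path):
    """argv that selects the larger copy of a file as the heal source."""
    return ["gluster", "volume", "heal", volume,
            "split-brain", "bigger-file", path]

def heal_source_brick(volume, brick, path):
    """argv that selects the copy on one named brick as the heal source."""
    return ["gluster", "volume", "heal", volume,
            "split-brain", "source-brick", brick, path]

# Hypothetical values: path relative to the volume root, and a made-up brick address.
path = "/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml"
brick = "board1:/opt/lvmdir/c2/brick"

print(" ".join(heal_bigger_file("c_glusterfs", path)))
print(" ".join(heal_source_brick("c_glusterfs", brick, path)))
```

The bigger-file policy only helps when the two copies differ in size; source-brick names the good copy explicitly.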

-Ravi
>
> So what I have done is manually delete the gfid entry of that file 
> from the .glusterfs directory and follow the instructions in the 
> following link to heal it:
>
> https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md
>
> and this works fine for me.
>
> But my question is why the split-brain command does not show any file in 
> its output.
>
> Here I am attaching all the logs which I got from the node, and 
> also the output of the commands from both boards.
>
> In this tar file two directories are present:
>
> 000300 - log for the board which is running continuously
> 002500 - log for the board which is rebooted
>
> I am waiting for your reply; please help me out on this issue.
>
> Thanks in advance.
>
> Regards,
> Abhishek
>
> On Fri, Feb 26, 2016 at 1:21 PM, ABHISHEK PALIWAL 
> <abhishpaliwal at gmail.com> wrote:
>
>     On Fri, Feb 26, 2016 at 10:28 AM, Ravishankar N
>     <ravishankar at redhat.com> wrote:
>
>         On 02/26/2016 10:10 AM, ABHISHEK PALIWAL wrote:
>>
>>         Yes correct
>>
>
>         Okay, so when you say the files are not in sync until some
>         time, are you getting stale data when accessing from the mount?
>         I'm not able to figure out why heal info shows zero when the
>         files are not in sync, despite all IO happening from the
>         mounts. Could you provide the output of getfattr -d -m . -e
>         hex /brick/file-name from both bricks when you hit this issue?
>
>         I'll provide the logs once I get them. Here, delay means we are
>         powering on the second board after 10 minutes.
>
>
>>         On Feb 26, 2016 9:57 AM, "Ravishankar N"
>>         <ravishankar at redhat.com> wrote:
>>
>>             Hello,
>>
>>             On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:
>>>             Hi Ravi,
>>>
>>>             Thanks for the response.
>>>
>>>             We are using Glusterfs-3.7.8
>>>
>>>             Here is the use case:
>>>
>>>             We have a logging file which saves logs of the events
>>>             for every board of a node, and these files are kept in
>>>             sync using glusterfs. The system is in replica 2 mode,
>>>             which means that when one brick in a replicated volume
>>>             goes offline, the glusterd daemons on the other nodes
>>>             keep track of all the files that are not replicated to
>>>             the offline brick. When the offline brick becomes
>>>             available again, the cluster initiates a healing
>>>             process, replicating the updated files to that brick.
>>>             But in our case, we see that the log file of one board
>>>             is not in sync and its format is corrupted.
>>
>>             Just to understand you correctly, you have mounted the 2
>>             node replica-2 volume on both these nodes and writing to
>>             a logging file from the mounts right?
>>
>>>
>>>             Even the output of #gluster volume heal c_glusterfs
>>>             info shows that there are no pending heals.
>>>
>>>             Also, the logging file which is updated is of fixed
>>>             size, and new entries wrap around, overwriting
>>>             the old entries.
>>>
>>>             This way we have seen that after a few restarts, the
>>>             contents of the same file on the two bricks are different,
>>>             but the volume heal info shows zero entries.
>>>
>>>             Solution:
>>>
>>>             But when we tried to put a delay > 5 min before the
>>>             healing, everything works fine.
>>>
>>>             Regards,
>>>             Abhishek
>>>
>>>             On Fri, Feb 26, 2016 at 6:35 AM, Ravishankar N
>>>             <ravishankar at redhat.com> wrote:
>>>
>>>                 On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:
>>>>                 Hi,
>>>>
>>>>                 Here, I have one query regarding the time taken by
>>>>                 the healing process.
>>>>                 In the current two-node setup, when we rebooted one
>>>>                 node, the self-healing process started within less
>>>>                 than 5 min on the board, which resulted in the
>>>>                 corruption of some files' data.
>>>
>>>                 Heal should start immediately after the brick
>>>                 process comes up. What version of gluster are you
>>>                 using? What do you mean by corruption of data? Also,
>>>                 how did you observe that the heal started after 5
>>>                 minutes?
>>>                 -Ravi
>>>>
>>>>                 To resolve it I searched on Google and found
>>>>                 the following link:
>>>>                 https://support.rackspace.com/how-to/glusterfs-troubleshooting/
>>>>
>>>>                 It mentions that the healing process can take up to
>>>>                 10 min to start.
>>>>
>>>>                 Here is the statement from the link:
>>>>
>>>>                 "Healing replicated volumes
>>>>
>>>>                 When any brick in a replicated volume goes offline,
>>>>                 the glusterd daemons on the remaining nodes keep
>>>>                 track of all the files that are not replicated to
>>>>                 the offline brick. When the offline brick becomes
>>>>                 available again, the cluster initiates a healing
>>>>                 process, replicating the updated files to that
>>>>                 brick. *The start of this process can take up to 10
>>>>                 minutes, based on observation.*"
>>>>
>>>>                 After allowing more than 5 min, the file
>>>>                 corruption problem was resolved.
>>>>
>>>>                 So my question is: is there any way we can
>>>>                 reduce the time the healing process takes
>>>>                 to start?
>>>>
>>>>
>>>>                 Regards,
>>>>                 Abhishek Paliwal
>>>>
>>>>
>>>>
>>>>
>>>>                 _______________________________________________
>>>>                 Gluster-devel mailing list
>>>>                 Gluster-devel at gluster.org
>>>>                 http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>>
>>>
>>>
>>>
>>>             -- 
>>>
>>>
>>>
>>>
>>>             Regards
>>>             Abhishek Paliwal
>>
>>


ABHISHEK PALIWAL

2016-Mar-03 11:24 UTC

[Gluster-users] [Gluster-devel] Query on healing process

On Thu, Mar 3, 2016 at 4:10 PM, Ravishankar N <ravishankar at redhat.com> wrote:
> Hi,
>
> On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:
>
> Hi Ravi,
>
> As I discussed earlier, I investigated this issue and found that
> healing is not triggered because the "gluster volume heal c_glusterfs
> info split-brain" command does not show any entries in its output,
> even though the file is in split-brain.
>
>
> A couple of observations from the 'commands_output' file.
>
> getfattr -d -m . -e hex opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> The afr xattrs do not indicate that the file is in split-brain:
> # file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000000b56d6dd1d000ec7a9
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
>
>
> getfattr -d -m . -e hex opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
> trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
> trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
> trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000000b56d6dcb7000c87e7
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> 1. There doesn't seem to be a split-brain going by the trusted.afr* xattrs.
>
If it is not a split-brain problem, then how can I resolve this?

> 2. You seem to have re-used the bricks from another volume/setup. For
> replica 2, only trusted.afr.c_glusterfs-client-0 and
> trusted.afr.c_glusterfs-client-1 must be present but I see 4 xattrs -
> client-0,2,4 and 6
>
Could you please suggest why these entries are there? I am not able
to work out the scenario. I am rebooting one board multiple times to
reproduce the issue, and after every reboot I do a remove-brick and
add-brick on the same volume for the second board.

> 3. On the rebooted node, do you have ssl enabled by any chance? There is a
> bug for "Not able to fetch volfile" when ssl is enabled:
> https://bugzilla.redhat.com/show_bug.cgi?id=1258931
>
> Btw, for data and metadata split-brains you can use the gluster CLI
> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
> instead of modifying the file from the back end.
>
But you are saying it is not a split-brain problem, and even the split-brain
command is not showing any file, so how can I find the file that is bigger in size?
Also, in my case the file size is fixed at 2 MB; it is overwritten every time.
>
> -Ravi

-- 
Regards
Abhishek Paliwal
