Ravishankar N
2016-Mar-04 13:06 UTC
[Gluster-users] [Gluster-devel] Query on healing process
On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:

>> Ok, just to confirm, glusterd and the other brick processes are running
>> after this node rebooted?
>> When you run the above command, you need to check
>> /var/log/glusterfs/glfsheal-volname.log for errors. Setting
>> client-log-level to DEBUG would give you a more verbose message.
>
> Yes, glusterd and the other brick processes are running fine. I have
> checked the /var/log/glusterfs/glfsheal-volname.log file without
> log-level=DEBUG. Here are the logs from that file:
>
> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open
> the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved
> ports info [No such file or directory]
> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
> get reserved ports, hence there is a possibility that glusterfs may
> consume reserved port
> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)

Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
restarting glusterd, and running the heal info command again.

> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
> remote-host: localhost (Transport endpoint is not connected) [Transport
> endpoint is not connected]
> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
> servers [Transport endpoint is not connected]
>
>> # gluster volume heal c_glusterfs info split-brain
>> c_glusterfs: Not able to fetch volfile from glusterd
>> Volume heal failed.
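As a first sanity check for the "Connection refused" to 127.0.0.1:24007 seen in the log above, one can probe glusterd's default management port directly before resorting to iptables -F. This is only a sketch; the host, port and messages are assumptions based on the log excerpt, not a documented gluster tool:

```python
import socket


def glusterd_reachable(host: str = "127.0.0.1", port: int = 24007,
                       timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to glusterd's management port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers ConnectionRefusedError and socket.timeout
        return False


if __name__ == "__main__":
    if glusterd_reachable():
        print("glusterd reachable on 24007")
    else:
        print("glusterd not reachable on 24007 (firewall rule or daemon down?)")
```

If this reports the port as unreachable while glusterd is running, a firewall rule is the likely culprit, which is consistent with the iptables suggestion above.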
>> And based on your observation I understood that this is not a
>> split-brain problem, but is there any way to find a file which is not
>> in split-brain yet also not in sync?
>
> `gluster volume heal c_glusterfs info split-brain` should give you the
> files that need heal.

Sorry, I meant 'gluster volume heal c_glusterfs info' should give you the
files that need heal, and 'gluster volume heal c_glusterfs info
split-brain' the list of files in split-brain. The commands are detailed in
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md

> I have run the "gluster volume heal c_glusterfs info split-brain"
> command, but it does not show the file which is out of sync. That is the
> issue: the file is not in sync across the two bricks, and the
> split-brain command does not list it as needing heal.
>
> That is why I am asking whether there is any command other than this
> split-brain command, so that I can find the files that require a heal
> operation but are not displayed in the output of "gluster volume heal
> c_glusterfs info split-brain".
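To find files that need heal but are not in split-brain programmatically, one hedged approach is to diff the entry lists printed by the two commands above. The sketch below parses sample output (the brick address and the exact output layout are illustrative assumptions, not taken from this thread; in practice you would capture the real output via subprocess):

```python
def parse_heal_entries(output: str) -> set:
    """Collect entry paths (lines starting with '/') from heal-info output."""
    return {line.strip() for line in output.splitlines()
            if line.strip().startswith("/")}


# Sample output of `gluster volume heal c_glusterfs info` (illustrative):
heal_info = """\
Brick 10.32.0.48:/opt/lvmdir/c2/brick
/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
Number of entries: 1
"""

# Sample output of `gluster volume heal c_glusterfs info split-brain`:
split_brain_info = """\
Brick 10.32.0.48:/opt/lvmdir/c2/brick
Number of entries in split-brain: 0
"""

# Entries reported as needing heal but absent from the split-brain list:
needs_heal_only = parse_heal_entries(heal_info) - parse_heal_entries(split_brain_info)
print(sorted(needs_heal_only))
# → ['/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml']
```

This set difference directly answers the "needs heal but not in split-brain" question for the parsed snapshot of output.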
ABHISHEK PALIWAL
2016-Mar-04 13:30 UTC
[Gluster-users] [Gluster-devel] Query on healing process
On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N <ravishankar at redhat.com>
wrote:

>>> Ok, just to confirm, glusterd and the other brick processes are
>>> running after this node rebooted?
>>> When you run the above command, you need to check
>>> /var/log/glusterfs/glfsheal-volname.log for errors. Setting
>>> client-log-level to DEBUG would give you a more verbose message.
>>
>> Yes, glusterd and the other brick processes are running fine. I have
>> checked the /var/log/glusterfs/glfsheal-volname.log file without
>> log-level=DEBUG. Here are the logs from that file:
>>
>> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
>> thread with index 1
>> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
>> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not
>> open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting
>> reserved ports info [No such file or directory]
>> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
>> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able
>> to get reserved ports, hence there is a possibility that glusterfs may
>> consume reserved port
>> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
>> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
>
> Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
> restarting glusterd, and running the heal info command again.

No hint from the logs?
I'll try your suggestion.

>> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
>> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
>> remote-host: localhost (Transport endpoint is not connected)
>> [Transport endpoint is not connected]
>> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
>> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
>> servers [Transport endpoint is not connected]
>>
>>> # gluster volume heal c_glusterfs info split-brain
>>> c_glusterfs: Not able to fetch volfile from glusterd
>>> Volume heal failed.
>>>
>>> And based on your observation I understood that this is not a
>>> split-brain problem, but is there any way to find a file which is not
>>> in split-brain yet also not in sync?
>>
>> `gluster volume heal c_glusterfs info split-brain` should give you the
>> files that need heal.
>
> Sorry, I meant 'gluster volume heal c_glusterfs info' should give you
> the files that need heal, and 'gluster volume heal c_glusterfs info
> split-brain' the list of files in split-brain.
> The commands are detailed in
> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md

Yes, I have tried this as well. It also gives "Number of entries: 0",
meaning no healing is required, yet the file
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml is
not in sync: the two bricks show different versions of this file. You can
see it in the getfattr output as well.

# getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000000000000000000000
trusted.afr.c_glusterfs-client-2=0x000000000000000000000000
trusted.afr.c_glusterfs-client-4=0x000000000000000000000000
trusted.afr.c_glusterfs-client-6=0x000000000000000000000000
trusted.afr.c_glusterfs-client-8=0x000000060000000000000000    <-- client-8 is the latest client in our case; the first 8 hex digits (00000006) say there is something pending in the changelog data
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000    <-- all zeroes: no split-brain here, but the file is out of sync
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

Regards,
Abhishek
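The trusted.afr hex values above can be decoded for inspection: in the usual AFR changelog layout the 12-byte value holds three big-endian 32-bit counters for pending data, metadata and entry operations. A minimal sketch, assuming that standard layout:

```python
def decode_afr_xattr(value: str) -> dict:
    """Split a trusted.afr.* hex value into its three pending-op counters."""
    hexdigits = value.removeprefix("0x")
    data, metadata, entry = (int(hexdigits[i:i + 8], 16) for i in (0, 8, 16))
    return {"data": data, "metadata": metadata, "entry": entry}


# client-8's value from the first brick above:
print(decode_afr_xattr("0x000000060000000000000000"))
# → {'data': 6, 'metadata': 0, 'entry': 0}
```

A non-zero data counter on one brick while the other side's counters are all zero indicates a pending heal, not a split-brain (a data split-brain requires both bricks to blame each other), which matches the behaviour reported in this thread.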