Ravishankar N
2016-Mar-04 13:06 UTC
[Gluster-users] [Gluster-devel] Query on healing process
On 03/04/2016 06:23 PM, ABHISHEK PALIWAL wrote:

>> Ok, just to confirm, glusterd and the other brick processes are running
>> after this node rebooted?
>> When you run the above command, you need to check
>> /var/log/glusterfs/glfsheal-volname.log for errors. Setting
>> client-log-level to DEBUG would give you a more verbose message.
>
> Yes, glusterd and the other brick processes are running fine. I have
> checked the /var/log/glusterfs/glfsheal-volname.log file without
> log-level=DEBUG. Here are the logs from that file:
>
> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread
> with index 1
> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not open
> the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved
> ports info [No such file or directory]
> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able to
> get reserved ports, hence there is a possibility that glusterfs may
> consume reserved port
> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)

Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
restarting glusterd, and running the heal info command again.

> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
> remote-host: localhost (Transport endpoint is not connected) [Transport
> endpoint is not connected]
> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
> servers [Transport endpoint is not connected]
>
>> # gluster volume heal c_glusterfs info split-brain
>> c_glusterfs: Not able to fetch volfile from glusterd
>> Volume heal failed.
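As a first sanity check for the "Connection refused" to 127.0.0.1:24007 seen in the log above, one can probe glusterd's default management port directly before resorting to iptables -F. This is only a sketch; the host, port and messages are assumptions based on the log excerpt, not a documented gluster tool:

```python
import socket


def glusterd_reachable(host: str = "127.0.0.1", port: int = 24007,
                       timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to glusterd's management port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers ConnectionRefusedError and socket.timeout
        return False


if __name__ == "__main__":
    if glusterd_reachable():
        print("glusterd reachable on 24007")
    else:
        print("glusterd not reachable on 24007 (firewall rule or daemon down?)")
```

If this reports the port as unreachable while glusterd is running, a firewall rule is the likely culprit, which is consistent with the iptables suggestion above.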
>> And based on your observation I understood that this is not a
>> split-brain problem, but is there any way to find a file which is not
>> in split-brain yet also not in sync?
>
> `gluster volume heal c_glusterfs info split-brain` should give you the
> files that need heal.

Sorry, I meant 'gluster volume heal c_glusterfs info' should give you the
files that need heal, and 'gluster volume heal c_glusterfs info
split-brain' the list of files in split-brain. The commands are detailed in
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md

> I have run the "gluster volume heal c_glusterfs info split-brain"
> command, but it does not show the file which is out of sync. That is the
> issue: the file is not in sync across the two bricks, and the
> split-brain command does not list it as needing heal.
>
> That is why I am asking whether there is any command other than this
> split-brain command, so that I can find the files that require a heal
> operation but are not displayed in the output of "gluster volume heal
> c_glusterfs info split-brain".
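To find files that need heal but are not in split-brain programmatically, one hedged approach is to diff the entry lists printed by the two commands above. The sketch below parses sample output (the brick address and the exact output layout are illustrative assumptions, not taken from this thread; in practice you would capture the real output via subprocess):

```python
def parse_heal_entries(output: str) -> set:
    """Collect entry paths (lines starting with '/') from heal-info output."""
    return {line.strip() for line in output.splitlines()
            if line.strip().startswith("/")}


# Sample output of `gluster volume heal c_glusterfs info` (illustrative):
heal_info = """\
Brick 10.32.0.48:/opt/lvmdir/c2/brick
/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
Number of entries: 1
"""

# Sample output of `gluster volume heal c_glusterfs info split-brain`:
split_brain_info = """\
Brick 10.32.0.48:/opt/lvmdir/c2/brick
Number of entries in split-brain: 0
"""

# Entries reported as needing heal but absent from the split-brain list:
needs_heal_only = parse_heal_entries(heal_info) - parse_heal_entries(split_brain_info)
print(sorted(needs_heal_only))
# → ['/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml']
```

This set difference directly answers the "needs heal but not in split-brain" question for the parsed snapshot of output.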
ABHISHEK PALIWAL
2016-Mar-04 13:30 UTC
[Gluster-users] [Gluster-devel] Query on healing process
On Fri, Mar 4, 2016 at 6:36 PM, Ravishankar N <ravishankar at redhat.com>
wrote:

>>> Ok, just to confirm, glusterd and the other brick processes are
>>> running after this node rebooted?
>>> When you run the above command, you need to check
>>> /var/log/glusterfs/glfsheal-volname.log for errors. Setting
>>> client-log-level to DEBUG would give you a more verbose message.
>>
>> Yes, glusterd and the other brick processes are running fine. I have
>> checked the /var/log/glusterfs/glfsheal-volname.log file without
>> log-level=DEBUG. Here are the logs from that file:
>>
>> [2016-03-02 13:51:39.059440] I [MSGID: 101190]
>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
>> thread with index 1
>> [2016-03-02 13:51:39.072172] W [MSGID: 101012]
>> [common-utils.c:2776:gf_get_reserved_ports] 0-glusterfs: could not
>> open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting
>> reserved ports info [No such file or directory]
>> [2016-03-02 13:51:39.072228] W [MSGID: 101081]
>> [common-utils.c:2810:gf_process_reserved_ports] 0-glusterfs: Not able
>> to get reserved ports, hence there is a possibility that glusterfs may
>> consume reserved port
>> [2016-03-02 13:51:39.072583] E [socket.c:2278:socket_connect_finish]
>> 0-gfapi: connection to 127.0.0.1:24007 failed (Connection refused)
>
> Not sure why ^^ occurs. You could try flushing iptables (iptables -F),
> restarting glusterd, and running the heal info command again.

No hint from the logs?
I'll try your suggestion.

>> [2016-03-02 13:51:39.072663] E [MSGID: 104024]
>> [glfs-mgmt.c:738:mgmt_rpc_notify] 0-glfs-mgmt: failed to connect with
>> remote-host: localhost (Transport endpoint is not connected)
>> [Transport endpoint is not connected]
>> [2016-03-02 13:51:39.072700] I [MSGID: 104025]
>> [glfs-mgmt.c:744:mgmt_rpc_notify] 0-glfs-mgmt: Exhausted all volfile
>> servers [Transport endpoint is not connected]
>>
>>> # gluster volume heal c_glusterfs info split-brain
>>> c_glusterfs: Not able to fetch volfile from glusterd
>>> Volume heal failed.
>>>
>>> And based on your observation I understood that this is not a
>>> split-brain problem, but is there any way to find a file which is not
>>> in split-brain yet also not in sync?
>>
>> `gluster volume heal c_glusterfs info split-brain` should give you the
>> files that need heal.
>
> Sorry, I meant 'gluster volume heal c_glusterfs info' should give you
> the files that need heal, and 'gluster volume heal c_glusterfs info
> split-brain' the list of files in split-brain.
> The commands are detailed in
> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md

Yes, I have tried this as well. It also gives "Number of entries: 0",
meaning no healing is required, yet the file
/opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml is
not in sync: the two bricks show different versions of this file. You can
see it in the getfattr output as well.

# getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000000000000000000000
trusted.afr.c_glusterfs-client-2=0x000000000000000000000000
trusted.afr.c_glusterfs-client-4=0x000000000000000000000000
trusted.afr.c_glusterfs-client-6=0x000000000000000000000000
trusted.afr.c_glusterfs-client-8=0x000000060000000000000000    <-- client-8 is the latest client in our case; the first 8 hex digits (00000006) say there is something pending in the changelog data
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# lhsh 002500 getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000    <-- all zeroes: no split-brain here, but the file is out of sync
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

Regards,
Abhishek
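The trusted.afr hex values above can be decoded for inspection: in the usual AFR changelog layout the 12-byte value holds three big-endian 32-bit counters for pending data, metadata and entry operations. A minimal sketch, assuming that standard layout:

```python
def decode_afr_xattr(value: str) -> dict:
    """Split a trusted.afr.* hex value into its three pending-op counters."""
    hexdigits = value.removeprefix("0x")
    data, metadata, entry = (int(hexdigits[i:i + 8], 16) for i in (0, 8, 16))
    return {"data": data, "metadata": metadata, "entry": entry}


# client-8's value from the first brick above:
print(decode_afr_xattr("0x000000060000000000000000"))
# → {'data': 6, 'metadata': 0, 'entry': 0}
```

A non-zero data counter on one brick while the other side's counters are all zero indicates a pending heal, not a split-brain (a data split-brain requires both bricks to blame each other), which matches the behaviour reported in this thread.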