Atin Mukherjee
2015-Feb-26 09:40 UTC
[Gluster-users] Peer Rejected(Connected) and Self heal daemon is not running causing split brain
Could you check the N/W firewall settings? Flush the iptables rules with
iptables -F and retry.

~Atin
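A minimal sketch of the firewall check suggested above, assuming iptables is
in use. The port numbers are assumptions drawn from the status output later
in this thread (24007 for glusterd management, 49152 and up for the bricks,
2049 for Gluster NFS); opening the ports explicitly is a less drastic
alternative to flushing every rule:

    # Quick test only -- this flushes every rule in the filter table:
    iptables -F

    # Less drastic: keep the firewall and open the Gluster ports between peers.
    # Adjust the brick range to the ports shown by "gluster volume status".
    iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT   # glusterd management
    iptables -A INPUT -p tcp --dport 49152:49160 -j ACCEPT   # brick ports (one per brick)
    iptables -A INPUT -p tcp --dport 2049 -j ACCEPT          # Gluster NFS
    iptables -A INPUT -p tcp --dport 111 -j ACCEPT           # portmapper, needed for NFS
    iptables -A INPUT -p udp --dport 111 -j ACCEPT
    service iptables save   # persist the rules, if the iptables init script is in use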
On 02/26/2015 02:55 PM, Kaamesh Kamalaaharan wrote:
> Hi guys,
>
> I managed to get gluster running, but I'm having a couple of issues with my
> setup: 1) my peer status is Rejected but Connected, and 2) my self-heal
> daemon is not running on one server and I'm getting split-brain files.
> My setup is two gluster servers (gfs1 and gfs2) in a replicate volume, each
> with one brick.
>
> 1) My peer status does not go into Peer in Cluster. Running a peer status
> command gives me State: Peer Rejected (Connected). At this point, the brick
> on gfs2 does not come online and I get this output:
>
> # gluster volume status
> Status of volume: gfsvolume
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick                    49153   Y       15025
> NFS Server on localhost                         2049    Y       15039
> Self-heal Daemon on localhost                   N/A     Y       15044
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> I have followed the methods used in one of the threads and performed the
> following:
>
> a) stop glusterd
> b) rm all files in /var/lib/glusterd/ except for glusterd.info
> c) start glusterd, probe gfs1 from gfs2, and check peer status, which gives me:
>
> # gluster peer status
> Number of Peers: 1
>
> Hostname: gfs1
> Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
> State: Sent and Received peer request (Connected)
>
> The same thread mentioned that changing the status of the peer in
> /var/lib/glusterd/peer/{UUID} from status=5 to status=3 fixes this, and on
> restart of gfs1 the peer status goes to:
>
> # gluster peer status
> Number of Peers: 1
>
> Hostname: gfs1
> Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
> State: Peer in Cluster (Connected)
>
> This fixes the connection between the peers, and the volume status shows:
>
> Status of volume: gfsvolume
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick                    49153   Y       10852
> Brick gfs2:/export/sda/brick                    49152   Y       17024
> NFS Server on localhost                         N/A     N       N/A
> Self-heal Daemon on localhost                   N/A     N       N/A
> NFS Server on gfs2                              N/A     N       N/A
> Self-heal Daemon on gfs2                        N/A     N       N/A
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> Which brings us to problem 2.
>
> 2) My self-heal daemon is not alive.
>
> I fixed the self-heal on gfs1 by running:
>
> # find <gluster-mount> -noleaf -print0 | xargs --null stat >/dev/null \
>       2>/var/log/gluster/<gluster-mount>-selfheal.log
>
> and running a volume status command gives me:
>
> # gluster volume status
> Status of volume: gfsvolume
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick                    49152   Y       16660
> Brick gfs2:/export/sda/brick                    49152   Y       21582
> NFS Server on localhost                         2049    Y       16674
> Self-heal Daemon on localhost                   N/A     Y       16679
> NFS Server on gfs2                              N/A     N       21596
> Self-heal Daemon on gfs2                        N/A     N       21600
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> However, running this on gfs2 does not fix the daemon.
>
> Restarting the gfs2 server brings me back to problem 1 and the cycle
> continues.
>
> Can anyone assist me with these issues? Thank you.
>
> Thank You Kindly,
> Kaamesh
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
~Atin
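For reference, the reset procedure quoted above (stop glusterd, clear
everything under /var/lib/glusterd except glusterd.info, restart and
re-probe) amounts to roughly the following on the rejected peer. This is
only a sketch of the steps described in the thread, it is destructive to
the local glusterd configuration, and the service commands assume a
SysV-style init script; back up the directory first.

    # Run on the rejected peer (gfs2 in this thread).
    service glusterd stop                            # or: systemctl stop glusterd
    cp -a /var/lib/glusterd /var/lib/glusterd.bak    # keep a copy of the old state

    # Remove everything under /var/lib/glusterd except glusterd.info,
    # which holds this node's UUID:
    find /var/lib/glusterd -mindepth 1 ! -name glusterd.info -delete

    service glusterd start                           # or: systemctl start glusterd
    gluster peer probe gfs1                          # re-probe the healthy peer from gfs2
    gluster peer status                              # should move towards "Peer in Cluster"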
Kaamesh Kamalaaharan
2015-Feb-27 00:51 UTC
[Gluster-users] Peer Rejected(Connected) and Self heal daemon is not running causing split brain
Hi Atin,

I have tried flushing the iptables rules and this time I managed to get the
peer into the cluster. However, the self-heal daemon is still offline and I
am unable to bring it back online on gfs2. Doing a heal on either server
gives me a successful output, but when I check the heal info I am getting
many split-brain errors on gfs2.

Thank You Kindly,
Kaamesh

On Thu, Feb 26, 2015 at 5:40 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> Could you check the N/W firewall settings? Flush the iptables rules with
> iptables -F and retry.
>
> ~Atin
>
> [...]
>
> --
> ~Atin
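A quick sketch of how the split-brain symptom and the self-heal daemon state
can be inspected, assuming the volume name gfsvolume from this thread; the
exact output format varies between GlusterFS releases:

    # List entries still pending heal, and those detected as split-brain:
    gluster volume heal gfsvolume info
    gluster volume heal gfsvolume info split-brain

    # Trigger a full heal; this crawls the bricks instead of relying only on
    # the heal indices, similar in effect to the find/stat walk used earlier:
    gluster volume heal gfsvolume full

    # The self-heal daemon (glustershd) is started by glusterd, so restarting
    # glusterd on gfs2 should respawn it once the peer state is clean:
    service glusterd restart                         # or: systemctl restart glusterd
    ps aux | grep glustershd                         # check that the daemon is running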