Atin Mukherjee
2015-Feb-26 09:40 UTC
[Gluster-users] Peer Rejected(Connected) and Self heal daemon is not running causing split brain
Could you check the N/W firewall settings? Flush the iptables rules with
iptables -F and retry.

~Atin
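A minimal sketch of the firewall check suggested above, assuming iptables is
in use. The port numbers are assumptions drawn from the status output later
in this thread (24007 for glusterd management, 49152 and up for the bricks,
2049 for Gluster NFS); opening the ports explicitly is a less drastic
alternative to flushing every rule:

    # Quick test only -- this flushes every rule in the filter table:
    iptables -F

    # Less drastic: keep the firewall and open the Gluster ports between peers.
    # Adjust the brick range to the ports shown by "gluster volume status".
    iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT   # glusterd management
    iptables -A INPUT -p tcp --dport 49152:49160 -j ACCEPT   # brick ports (one per brick)
    iptables -A INPUT -p tcp --dport 2049 -j ACCEPT          # Gluster NFS
    iptables -A INPUT -p tcp --dport 111 -j ACCEPT           # portmapper, needed for NFS
    iptables -A INPUT -p udp --dport 111 -j ACCEPT
    service iptables save   # persist the rules, if the iptables init script is in use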
On 02/26/2015 02:55 PM, Kaamesh Kamalaaharan wrote:
> Hi guys,
>
> I managed to get gluster running, but I'm having a couple of issues with my
> setup: 1) my peer status is Rejected but Connected, and 2) my self-heal
> daemon is not running on one server and I'm getting split-brain files.
> My setup is two gluster servers (gfs1 and gfs2) in a replicate volume, each
> with one brick.
>
> 1) My peer status does not go into Peer in Cluster. Running a peer status
> command gives me State: Peer Rejected (Connected). At this point, the brick
> on gfs2 does not come online and I get this output:
>
> # gluster volume status
> Status of volume: gfsvolume
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick                    49153   Y       15025
> NFS Server on localhost                         2049    Y       15039
> Self-heal Daemon on localhost                   N/A     Y       15044
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> I have followed the methods used in one of the threads and performed the
> following:
>
> a) stop glusterd
> b) rm all files in /var/lib/glusterd/ except for glusterd.info
> c) start glusterd, probe gfs1 from gfs2, and check peer status, which gives me:
>
> # gluster peer status
> Number of Peers: 1
>
> Hostname: gfs1
> Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
> State: Sent and Received peer request (Connected)
>
> The same thread mentioned that changing the status of the peer in
> /var/lib/glusterd/peer/{UUID} from status=5 to status=3 fixes this, and on
> restart of gfs1 the peer status goes to:
>
> # gluster peer status
> Number of Peers: 1
>
> Hostname: gfs1
> Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
> State: Peer in Cluster (Connected)
>
> This fixes the connection between the peers, and the volume status shows:
>
> Status of volume: gfsvolume
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick                    49153   Y       10852
> Brick gfs2:/export/sda/brick                    49152   Y       17024
> NFS Server on localhost                         N/A     N       N/A
> Self-heal Daemon on localhost                   N/A     N       N/A
> NFS Server on gfs2                              N/A     N       N/A
> Self-heal Daemon on gfs2                        N/A     N       N/A
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> Which brings us to problem 2.
>
> 2) My self-heal daemon is not alive.
>
> I fixed the self-heal on gfs1 by running:
>
> # find <gluster-mount> -noleaf -print0 | xargs --null stat >/dev/null \
>       2>/var/log/gluster/<gluster-mount>-selfheal.log
>
> and running a volume status command gives me:
>
> # gluster volume status
> Status of volume: gfsvolume
> Gluster process                                 Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick                    49152   Y       16660
> Brick gfs2:/export/sda/brick                    49152   Y       21582
> NFS Server on localhost                         2049    Y       16674
> Self-heal Daemon on localhost                   N/A     Y       16679
> NFS Server on gfs2                              N/A     N       21596
> Self-heal Daemon on gfs2                        N/A     N       21600
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> However, running this on gfs2 does not fix the daemon.
>
> Restarting the gfs2 server brings me back to problem 1 and the cycle
> continues.
>
> Can anyone assist me with these issues? Thank you.
>
> Thank You Kindly,
> Kaamesh
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

--
~Atin
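For reference, the reset procedure quoted above (stop glusterd, clear
everything under /var/lib/glusterd except glusterd.info, restart and
re-probe) amounts to roughly the following on the rejected peer. This is
only a sketch of the steps described in the thread, it is destructive to
the local glusterd configuration, and the service commands assume a
SysV-style init script; back up the directory first.

    # Run on the rejected peer (gfs2 in this thread).
    service glusterd stop                            # or: systemctl stop glusterd
    cp -a /var/lib/glusterd /var/lib/glusterd.bak    # keep a copy of the old state

    # Remove everything under /var/lib/glusterd except glusterd.info,
    # which holds this node's UUID:
    find /var/lib/glusterd -mindepth 1 ! -name glusterd.info -delete

    service glusterd start                           # or: systemctl start glusterd
    gluster peer probe gfs1                          # re-probe the healthy peer from gfs2
    gluster peer status                              # should move towards "Peer in Cluster"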
Kaamesh Kamalaaharan
2015-Feb-27 00:51 UTC
[Gluster-users] Peer Rejected(Connected) and Self heal daemon is not running causing split brain
Hi Atin,

I have tried flushing the iptables rules and this time I managed to get the
peer into the cluster. However, the self-heal daemon is still offline and I
am unable to bring it back online on gfs2. Doing a heal on either server
gives me a successful output, but when I check the heal info I am getting
many split-brain errors on gfs2.

Thank You Kindly,
Kaamesh

On Thu, Feb 26, 2015 at 5:40 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> Could you check the N/W firewall settings? Flush the iptables rules with
> iptables -F and retry.
>
> ~Atin
>
> [...]
>
> --
> ~Atin
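A quick sketch of how the split-brain symptom and the self-heal daemon state
can be inspected, assuming the volume name gfsvolume from this thread; the
exact output format varies between GlusterFS releases:

    # List entries still pending heal, and those detected as split-brain:
    gluster volume heal gfsvolume info
    gluster volume heal gfsvolume info split-brain

    # Trigger a full heal; this crawls the bricks instead of relying only on
    # the heal indices, similar in effect to the find/stat walk used earlier:
    gluster volume heal gfsvolume full

    # The self-heal daemon (glustershd) is started by glusterd, so restarting
    # glusterd on gfs2 should respawn it once the peer state is clean:
    service glusterd restart                         # or: systemctl restart glusterd
    ps aux | grep glustershd                         # check that the daemon is running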