Lindsay Mathieson
2017-Jun-10 23:38 UTC
[Gluster-users] How to remove dead peer, sorry urgent again :(
Since my node died on Friday I have a dead peer (vna) that needs to be removed.

I had major issues this morning that I haven't resolved yet, with all VMs going offline when I rebooted a node, which I *hope* was due to quorum issues, as I now have four peers in the cluster: one dead, three live.

Confidence level is not high.

-- Lindsay Mathieson
On 6/10/2017 4:38 PM, Lindsay Mathieson wrote:
> Since my node died on Friday I have a dead peer (vna) that needs to be
> removed.
>
> I had major issues this morning that I haven't resolved yet, with all
> VMs going offline when I rebooted a node, which I *hope* was due to
> quorum issues, as I now have four peers in the cluster: one dead, three
> live.

Let's see: according to your previous note, you had vna, vnb and vng, all replica 3 in a working cluster. vna died, so you had two 'good' nodes left. All was good.

You replaced vna with vnd, but it is probably not fully healed yet because you had 3.8T worth of chunks to copy.

So you had two good nodes (vnb and vng) working and you rebooted one of them? If so, yes, based on my experience learning how to deal with failed nodes, you would get a quorum lock under those circumstances UNLESS you turned off the quorums prior to the reboot.

Do you show any split brains?

As an aside, I wonder if a strategy would have been to first replace vna with an arbiter, get the metadata synced up for quorum purposes, AND then turn the arbiter into a full node by catching up the chunks. Is that even possible?

-bill
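For reference, the node replacement discussed above is typically done by probing the new peer and then pointing the volume at its empty brick with replace-brick; self-heal then copies the data across. A rough sketch only - the volume name `datastore` and the brick path `/bricks/brick1` are hypothetical stand-ins, not taken from this thread:

```shell
# Probe the replacement node into the trusted pool
gluster peer probe vnd.proxmox.softlog

# Swap the dead node's brick for the new node's empty brick;
# self-heal then copies the data across (the 3.8T of chunks above)
gluster volume replace-brick datastore \
    vna.proxmox.softlog:/bricks/brick1 \
    vnd.proxmox.softlog:/bricks/brick1 \
    commit force

# Watch heal progress until no entries remain
gluster volume heal datastore info
```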
Lindsay Mathieson
2017-Jun-11 00:12 UTC
[Gluster-users] How to remove dead peer, sorry urgent again :(
On 11/06/2017 10:01 AM, WK wrote:
> You replaced vna with vnd but it is probably not fully healed yet cuz
> you had 3.8T worth of chunks to copy.

No, the heal had completed. It finished about 9 hours before I rebooted.

> So you had two good nodes (vnb and vng) working and you rebooted one
> of them?

Three good nodes - vnb, vng, vnh - and one dead - vna.

From node vng:

root@vng:~# gluster peer status
Number of Peers: 3

Hostname: vna.proxmox.softlog
Uuid: de673495-8cb2-4328-ba00-0419357c03d7
State: Peer in Cluster (Disconnected)

Hostname: vnb.proxmox.softlog
Uuid: 43a1bf8c-3e69-4581-8e16-f2e1462cfc36
State: Peer in Cluster (Connected)

Hostname: vnh.proxmox.softlog
Uuid: 9eb54c33-7f79-4a75-bc2b-67111bf3eae7
State: Peer in Cluster (Connected)

-- Lindsay Mathieson
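Once no bricks reference the dead node, removing it from the pool (the question in the subject line) is normally a peer detach - a sketch; `force` is needed because a disconnected peer like vna cannot acknowledge the detach:

```shell
# Remove the dead peer from the trusted pool; force is required
# because vna is permanently offline
gluster peer detach vna.proxmox.softlog force

# Confirm only the live peers remain
gluster peer status
```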
Lindsay Mathieson
2017-Jun-11 00:57 UTC
[Gluster-users] How to remove dead peer, sorry urgent again :(
On 11/06/2017 9:38 AM, Lindsay Mathieson wrote:
> Since my node died on Friday I have a dead peer (vna) that needs to be
> removed.
>
> I had major issues this morning that I haven't resolved yet, with all
> VMs going offline when I rebooted a node, which I *hope* was due to
> quorum issues, as I now have four peers in the cluster: one dead, three
> live.
>
> Confidence level is not high.

It definitely appears to be quorum issues. Rebooting a node makes the volume inaccessible; all is fine once it's back up.

I did a:

    gluster volume set all cluster.server-quorum-ratio 51%

And that has resolved my issue for now, as it allows two servers to form a quorum.

-- Lindsay Mathieson
Lindsay Mathieson
2017-Jun-11 01:01 UTC
[Gluster-users] How to remove dead peer, sorry urgent again :(
On 11/06/2017 10:57 AM, Lindsay Mathieson wrote:
> I did a:
>
>     gluster volume set all cluster.server-quorum-ratio 51%
>
> And that has resolved my issue for now, as it allows two servers to
> form a quorum.

Edit :) Actually:

    gluster volume set all cluster.server-quorum-ratio 50%

-- Lindsay Mathieson
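The difference between 51% and 50% comes down to rounding: with four peers in the pool, 51% of 4 is 2.04, which rounds up to 3 required live servers, while 50% of 4 is exactly 2. A small sketch of that arithmetic - this assumes server quorum is satisfied when active peers reach ceil(ratio x pool size); glusterd's exact comparison may differ slightly:

```shell
# Live peers needed under cluster.server-quorum-ratio (assumed model:
# ceil(ratio% * total); glusterd's exact comparison may differ)
quorum_needed() {  # usage: quorum_needed RATIO_PERCENT TOTAL_PEERS
    echo $(( ($1 * $2 + 99) / 100 ))
}

quorum_needed 51 4   # -> 3: at 51%, two live servers are not enough
quorum_needed 50 4   # -> 2: at 50%, two live servers form quorum
```

So with one dead peer and one node rebooting, 51% still left the volume read-only, and dropping to 50% let the two remaining servers keep it writable.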