tommy.yardley at baesystems.com
2016-Mar-15 15:38 UTC
[Gluster-users] GlusterFS cluster peer stuck in state: Sent and Received peer request (Connected)
Hi All,

I'm running GlusterFS on a cluster hosted in AWS. I have a script which provisions my instances and sets up GlusterFS as part of that (specifically: glusterfs 3.5.8).

My issue is that this only works ~50% of the time. The other 50% of the time, one of the peers gets 'stuck' in the following state:

    root@ip-xx-xx-xx-1:/home/ubuntu# gluster peer status
    Number of Peers: 3

    Hostname: xx.xx.xx.2
    Uuid: 3b4c1fb9-b325-4204-98fd-2eb739fa867f
    State: Peer in Cluster (Connected)

    Hostname: xx.xx.xx.3
    Uuid: acfc1794-9080-4eb0-8f69-3abe78bbee16
    State: Sent and Received peer request (Connected)

    Hostname: xx.xx.xx.4
    Uuid: af33463d-1b32-4ffb-a4f0-46ce16151e2f
    State: Peer in Cluster (Connected)

Running gluster peer status on the affected instance yields:

    root@ip-xx-xx-xx-3:/var/log/glusterfs# gluster peer status
    Number of Peers: 1

    Hostname: xx.xx.xx.1
    Uuid: c4f17e9a-893b-48f0-a014-1a05cca09d01
    State: Peer is connected and Accepted (Connected)

In this case the connection status shown in parentheses fluctuates between 'Connected' and 'Disconnected'.

I have been unable to locate the cause of this issue. Has this been encountered before, and if so, is there a general fix? I haven't been able to find anything as of yet.

Many thanks,
Tommy
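For anyone scripting provisioning in a similar way, the stuck state above can be detected automatically before the script moves on. The following is only a minimal sketch, assuming the `gluster peer status` output keeps the `Hostname:` / `Uuid:` / `State:` layout shown in this message. The function names (`parse_peer_status`, `unhealthy_peers`) and the `SAMPLE` text are illustrative and not part of GlusterFS; treating "Peer in Cluster (Connected)" as the only healthy state is likewise an assumption drawn from the output above. It detects the problem but does not fix it.

```python
import re
from typing import Dict, List

# Assumption: a fully joined peer reports this state, as in the output above.
HEALTHY_STATE = "Peer in Cluster"

# Sample text copied from the first `gluster peer status` output in this message.
SAMPLE = """\
Number of Peers: 3

Hostname: xx.xx.xx.2
Uuid: 3b4c1fb9-b325-4204-98fd-2eb739fa867f
State: Peer in Cluster (Connected)

Hostname: xx.xx.xx.3
Uuid: acfc1794-9080-4eb0-8f69-3abe78bbee16
State: Sent and Received peer request (Connected)

Hostname: xx.xx.xx.4
Uuid: af33463d-1b32-4ffb-a4f0-46ce16151e2f
State: Peer in Cluster (Connected)
"""


def parse_peer_status(text: str) -> List[Dict[str, str]]:
    """Parse `gluster peer status` text into one dict per peer."""
    peers: List[Dict[str, str]] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            # A Hostname line starts a new peer record.
            current = {"hostname": line.split(":", 1)[1].strip()}
            peers.append(current)
        elif current is not None and line.startswith("Uuid:"):
            current["uuid"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("State:"):
            value = line.split(":", 1)[1].strip()
            # Split "Some state text (Connected)" into state and connection.
            match = re.match(r"^(.*?)\s*\((\w+)\)$", value)
            if match:
                current["state"] = match.group(1)
                current["connection"] = match.group(2)
            else:
                current["state"] = value
                current["connection"] = ""
    return peers


def unhealthy_peers(peers: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return peers that are not both in the cluster and connected."""
    return [
        p for p in peers
        if p.get("state") != HEALTHY_STATE or p.get("connection") != "Connected"
    ]


if __name__ == "__main__":
    for peer in unhealthy_peers(parse_peer_status(SAMPLE)):
        print(peer["hostname"], "->", peer["state"], f"({peer['connection']})")
```

In a provisioning script, the text would come from running `gluster peer status` on the node (for example via `subprocess.run(["gluster", "peer", "status"], capture_output=True, text=True).stdout`) rather than from the sample string.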
Atin Mukherjee
2016-Mar-15 15:58 UTC
[Gluster-users] GlusterFS cluster peer stuck in state: Sent and Received peer request (Connected)
This indicates that the peer handshake didn't go through properly and your cluster is in an inconsistent state. Are you running version 3.5.8 on all the nodes? Could you get me the glusterd log from all the nodes and tell me the sequence in which you ran the peer probes? I'll only be able to look at it tomorrow and get back to you then.

-Atin