Andreas Hollaus
2015-Jun-16 16:01 UTC
[Gluster-users] GlusterFS 3.6.2: Seems like the peers don't agree anymore
Hi,

I discovered this strange situation when I rebooted one of the nodes. After the reboot I removed the brick on node 2, but for some reason it seems that information didn't reach node 1.
Any idea what could have gone wrong and how to troubleshoot? Up until now I've never seen the nodes disagree on peer status and volume definitions.

Node 1:
Correct volume definition but no peers available.

IP: 10.32.0.32

# gluster peer status
Number of Peers: 0
# gluster volume info

Volume Name: c_glstr
Type: Replicate
Volume ID: d1e56a0c-fdbe-47cf-8f3d-edd54d1c73e2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.0.64:/opt/lvmdir/c2/brick
Options Reconfigured:
network.ping-timeout: 4


Node 2
----------
At least the peer is connected, but a brick is missing from the volume.

IP: 10.32.0.64

# gluster peer status
Number of Peers: 1

Hostname: 10.32.0.48
Uuid: db59a429-9f0f-4512-8d3a-e19b54c88370
State: Accepted peer request (Connected)

# gluster volume info

Volume Name: c_glstr
Type: Distribute
Volume ID: d1e56a0c-fdbe-47cf-8f3d-edd54d1c73e2
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Options Reconfigured:
network.ping-timeout: 4


Regards
Andreas
Atin Mukherjee
2015-Jun-16 16:12 UTC
[Gluster-users] GlusterFS 3.6.2: Seems like the peers don't agree anymore
On Jun 16, 2015 9:32 PM, "Andreas Hollaus" <Andreas.Hollaus at ericsson.com> wrote:
>
> Hi,
>
> I discovered this strange situation when I rebooted one of the nodes. After the
> reboot I removed the brick on node 2, but for some reason it seems that
> information didn't reach node 1.
> Any idea what could have gone wrong and how to troubleshoot? Up until now I've never
> seen the nodes disagree on peer status and volume definitions.
>
> Node 1:
> Correct volume definition but no peers available.
>
> IP: 10.32.0.32
>
> # gluster peer status
> Number of Peers: 0
> # gluster volume info
>
> Volume Name: c_glstr
> Type: Replicate
> Volume ID: d1e56a0c-fdbe-47cf-8f3d-edd54d1c73e2
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
> Brick2: 10.32.0.64:/opt/lvmdir/c2/brick
> Options Reconfigured:
> network.ping-timeout: 4
>
>
> Node 2
> ----------
> At least the peer is connected, but a brick is missing from the volume.
>
> IP: 10.32.0.64
>
> # gluster peer status
> Number of Peers: 1
>
> Hostname: 10.32.0.48

This is where the problem starts. Peer status shows x.x.x.48 as its peer, but your node 1 indicates it is x.x.x.32. It seems like this node is stale now. Has it undergone any IP change? Could you attach the glusterd log files for all three of these nodes?

> Uuid: db59a429-9f0f-4512-8d3a-e19b54c88370
> State: Accepted peer request (Connected)
>
> # gluster volume info
>
> Volume Name: c_glstr
> Type: Distribute
> Volume ID: d1e56a0c-fdbe-47cf-8f3d-edd54d1c73e2
> Status: Started
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
> Options Reconfigured:
> network.ping-timeout: 4
>
>
> Regards
> Andreas
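
The disagreement discussed in this thread (one node listing two bricks, the other listing one) can be spotted mechanically by comparing saved command output from each node. The following is only a rough sketch, not something posted in the thread: it assumes the output of `gluster volume info` was redirected to a file on each node, and the helper names `brick_hosts` and `compare_bricks` as well as the file names are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes each node's output was saved beforehand, for example:
#   gluster volume info > node1.volinfo     (file names are hypothetical)

# brick_hosts FILE
# Print the host part of every "BrickN: host:/path" line in FILE, sorted.
# The "Bricks:" header and "Number of Bricks:" lines do not match the pattern.
brick_hosts() {
    sed -n 's/^Brick[0-9][0-9]*: \([^:]*\):.*$/\1/p' "$1" | sort
}

# compare_bricks FILE_A FILE_B
# Return 0 if both files list the same brick hosts, 1 otherwise.
compare_bricks() {
    hosts_a=$(brick_hosts "$1")
    hosts_b=$(brick_hosts "$2")
    if [ "$hosts_a" = "$hosts_b" ]; then
        echo "brick hosts agree"
        return 0
    fi
    echo "brick hosts differ"
    echo "$1:"; echo "$hosts_a"
    echo "$2:"; echo "$hosts_b"
    return 1
}

# Run the comparison only when two file names are given on the command line.
if [ "$#" -eq 2 ]; then
    compare_bricks "$1" "$2"
fi
```

Saved as, say, `compare_bricks.sh`, it could be run as `sh compare_bricks.sh node1.volinfo node2.volinfo`. With the output quoted above, node 1 lists hosts 10.32.0.48 and 10.32.0.64 while node 2 lists only 10.32.0.48, so the two files would be reported as differing. This only shows that the definitions diverge; the glusterd logs are still needed to find out why.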