Hi Team,

I am facing an issue with peer status, and because of it remove-brick on a replica volume fails.

Here is the scenario, and what I am doing with gluster:

1. I have two boards, A and B, and gluster is running on both boards.
2. I have created a replicated volume with one brick on each board.
3. Created one glusterfs mount point through which the volume (both bricks) is mounted.
4. Started the volume with nfs.disable=true.
5. Up to this point everything is in sync between the two bricks.

Now I manually pull board B out of its slot and plug it back in.

1. After board B boots up, I start glusterd on board B.

The following is some gluster command output on board B after step 1:

# gluster peer status
Number of Peers: 2

Hostname: 10.32.0.48
Uuid: f4ebe3c5-b6a4-4795-98e0-732337f76faf
State: Accepted peer request (Connected)

Hostname: 10.32.0.48
Uuid: 4bf982c0-b21b-415c-b870-e72f36c7f2e7
State: Peer is connected and Accepted (Connected)

Why is this peer status showing two peers with different UUIDs?

# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c11f1f13-64a0-4aca-98b5-91d609a4a18d
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on

# gluster volume heal c_glusterfs info
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.

# gluster volume status c_glusterfs
Status of volume: c_glusterfs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.32.1.144:/opt/lvmdir/c2/brick      N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       3922

Task Status of Volume c_glusterfs
------------------------------------------------------------------------------
There are no active volume tasks

At the same time, board A has the following gluster command output:

# gluster peer status
Number of Peers: 1

Hostname: 10.32.1.144
Uuid: c6b64e36-76da-4e98-a616-48e0e52c7006
State: Peer in Cluster (Connected)

Why is it showing the older UUID of host 10.32.1.144 when that UUID has changed and the new UUID is 267a92c3-fd28-4811-903c-c1d54854bda9?

# gluster volume heal c_glusterfs info
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.

# gluster volume status c_glusterfs
Status of volume: c_glusterfs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.32.0.48:/opt/lvmdir/c2/brick       49169     0          Y       2427
Brick 10.32.1.144:/opt/lvmdir/c2/brick      N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       3388
Self-heal Daemon on 10.32.1.144             N/A       N/A        Y       3922

Task Status of Volume c_glusterfs
------------------------------------------------------------------------------
There are no active volume tasks

As you can see, "gluster volume status" shows that brick "10.32.1.144:/opt/lvmdir/c2/brick" is offline, so we tried to remove it, but on board A we get the error:

volume remove-brick c_glusterfs replica 1 10.32.1.144:/opt/lvmdir/c2/brick force : FAILED : Incorrect brick 10.32.1.144:/opt/lvmdir/c2/brick for volume c_glusterfs

Please reply to this post, because I always hit this error in this scenario.
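For reference, the setup in steps 1-5 above corresponds roughly to the commands below. This is only a sketch: the volume name, brick paths, option values and addresses are taken from the output above, while the mount point /mnt/c_glusterfs and the choice of board A as the node the volume is created from are assumptions.

(on board A, 10.32.0.48; assumed)
# gluster peer probe 10.32.1.144
# gluster volume create c_glusterfs replica 2 10.32.0.48:/opt/lvmdir/c2/brick 10.32.1.144:/opt/lvmdir/c2/brick
# gluster volume set c_glusterfs nfs.disable on
# gluster volume set c_glusterfs network.ping-timeout 4
# gluster volume start c_glusterfs

(on each board; mount point path assumed)
# mount -t glusterfs localhost:/c_glusterfs /mnt/c_glusterfs

With the FUSE mount in place, anything written through the mount point is replicated to both bricks by the replicate (AFR) translator, which is why everything stays in sync up to step 5.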
For more detail I am also attaching the logs from both boards, which include some manually created files in which you can find the output of gluster commands from both boards. In the logs, 00030 is board A and 00250 is board B.

Thanks in advance; waiting for your reply.

Regards,
Abhishek Paliwal

(Attachment: HU57609_debug.zip)
Atin Mukherjee
2016-Mar-14 06:42 UTC
[Gluster-users] [Gluster-devel] Messup with peer status!!
On 03/14/2016 10:52 AM, ABHISHEK PALIWAL wrote:
> Hi Team,
>
> I am facing an issue with peer status, and because of it remove-brick
> on a replica volume fails.
>
> Here is the scenario, and what I am doing with gluster:
>
> 1. I have two boards, A and B, and gluster is running on both boards.
> 2. I have created a replicated volume with one brick on each board.
> 3. Created one glusterfs mount point through which the volume (both
> bricks) is mounted.
> 4. Started the volume with nfs.disable=true.
> 5. Up to this point everything is in sync between the two bricks.
>
> Now I manually pull board B out of its slot and plug it back in.
>
> 1. After board B boots up, I start glusterd on board B.
>
> The following is some gluster command output on board B after step 1:
>
> # gluster peer status
> Number of Peers: 2
>
> Hostname: 10.32.0.48
> Uuid: f4ebe3c5-b6a4-4795-98e0-732337f76faf
> State: Accepted peer request (Connected)
>
> Hostname: 10.32.0.48
> Uuid: 4bf982c0-b21b-415c-b870-e72f36c7f2e7
> State: Peer is connected and Accepted (Connected)
>
> Why is this peer status showing two peers with different UUIDs?

GlusterD doesn't generate a new UUID on init if it has already generated
one earlier. This clearly indicates that on reboot of board B the
contents of /var/lib/glusterd were wiped out. I've asked you this
question multiple times: is that the case?

> # gluster volume info
>
> Volume Name: c_glusterfs
> Type: Replicate
> Volume ID: c11f1f13-64a0-4aca-98b5-91d609a4a18d
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
> Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
> Options Reconfigured:
> performance.readdir-ahead: on
> network.ping-timeout: 4
> nfs.disable: on
>
> # gluster volume heal c_glusterfs info
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
>
> # gluster volume status c_glusterfs
> Status of volume: c_glusterfs
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.32.1.144:/opt/lvmdir/c2/brick      N/A       N/A        N       N/A
> Self-heal Daemon on localhost               N/A       N/A        Y       3922
>
> Task Status of Volume c_glusterfs
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> At the same time, board A has the following gluster command output:
>
> # gluster peer status
> Number of Peers: 1
>
> Hostname: 10.32.1.144
> Uuid: c6b64e36-76da-4e98-a616-48e0e52c7006
> State: Peer in Cluster (Connected)
>
> Why is it showing the older UUID of host 10.32.1.144 when that UUID has
> changed and the new UUID is 267a92c3-fd28-4811-903c-c1d54854bda9?
>
> # gluster volume heal c_glusterfs info
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
> # gluster volume status c_glusterfs
> Status of volume: c_glusterfs
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.32.0.48:/opt/lvmdir/c2/brick       49169     0          Y       2427
> Brick 10.32.1.144:/opt/lvmdir/c2/brick      N/A       N/A        N       N/A
> Self-heal Daemon on localhost               N/A       N/A        Y       3388
> Self-heal Daemon on 10.32.1.144             N/A       N/A        Y       3922
>
> Task Status of Volume c_glusterfs
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> As you can see, "gluster volume status" shows that brick
> "10.32.1.144:/opt/lvmdir/c2/brick" is offline, so we tried to remove it,
> but on board A we get the error "volume remove-brick c_glusterfs replica 1
> 10.32.1.144:/opt/lvmdir/c2/brick force : FAILED : Incorrect brick
> 10.32.1.144:/opt/lvmdir/c2/brick for volume c_glusterfs".
>
> Please reply to this post, because I always hit this error in this
> scenario.
>
> For more detail I am also attaching the logs from both boards, which
> include some manually created files in which you can find the output of
> gluster commands from both boards.
>
> In the logs, 00030 is board A and 00250 is board B.

This attachment doesn't help much. Could you attach the full glusterd log
files from both nodes?

> Thanks in advance; waiting for your reply.
>
> Regards,
> Abhishek Paliwal
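A quick way to check the wiped-/var/lib/glusterd theory is to look at what glusterd has persisted on board B before and after a plug-out/plug-in cycle. This is a sketch assuming the default working directory /var/lib/glusterd; adjust the path if the build on these boards puts it elsewhere.

# cat /var/lib/glusterd/glusterd.info    (the UUID= line is this node's persistent identity)
# ls /var/lib/glusterd/peers/            (one file per known peer, named after the peer's UUID)
# ls /var/lib/glusterd/vols/             (persisted volume definitions, e.g. c_glusterfs)

If the UUID in glusterd.info changes across the cycle, or peers/ and vols/ come back empty, then /var/lib/glusterd is not surviving the reboot of board B, which matches the behaviour described above. The glusterd log requested here is typically /var/log/glusterfs/etc-glusterfs-glusterd.vol.log on releases of this vintage.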