Mohammed Rafi K C
2016-Feb-25 20:23 UTC
[Gluster-users] Gluster 3.7.6 add new node state Peer Rejected (Connected)
On 02/26/2016 01:32 AM, Steve Dainard wrote:
> I haven't done anything more than peer thus far, so I'm a bit confused
> as to how the volume info fits in. Can you expand on this a bit?
>
> Failed commits? Is this split brain on the replica volumes? I don't
> get any return from 'gluster volume heal <volname> info' on any of the
> replica volumes, but if I try 'gluster volume heal <volname> full' I
> get: 'Launching heal operation to perform full self heal on volume
> <volname> has been unsuccessful'.

Forget about this; it is not for metadata self-heal.

> I have 5 volumes total.
>
> 'Replica 3' volumes running on gluster01/02/03:
> vm-storage
> iso-storage
> export-domain-storage
> env-modules
>
> And one distributed-only volume, 'storage'; its info is shown below:
>
> *From existing hosts gluster01/02:*
> type=0
> count=4
> status=1
> sub_count=0
> stripe_count=1
> replica_count=1
> disperse_count=0
> redundancy_count=0
> version=25
> transport-type=0
> volume-id=26d355cb-c486-481f-ac16-e25390e73775
> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
> password
> op-version=3
> client-op-version=3
> quota-version=1
> parent_volname=N/A
> restored_from_snap=00000000-0000-0000-0000-000000000000
> snap-max-hard-limit=256
> features.quota-deem-statfs=on
> features.inode-quota=on
> diagnostics.brick-log-level=WARNING
> features.quota=on
> performance.readdir-ahead=on
> performance.cache-size=1GB
> performance.stat-prefetch=on
> brick-0=10.0.231.50:-mnt-raid6-storage-storage
> brick-1=10.0.231.51:-mnt-raid6-storage-storage
> brick-2=10.0.231.52:-mnt-raid6-storage-storage
> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>
> *From existing hosts gluster03/04:*
> type=0
> count=4
> status=1
> sub_count=0
> stripe_count=1
> replica_count=1
> disperse_count=0
> redundancy_count=0
> version=25
> transport-type=0
> volume-id=26d355cb-c486-481f-ac16-e25390e73775
> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
> password
> op-version=3
> client-op-version=3
> quota-version=1
> parent_volname=N/A
> restored_from_snap=00000000-0000-0000-0000-000000000000
> snap-max-hard-limit=256
> features.quota-deem-statfs=on
> features.inode-quota=on
> performance.stat-prefetch=on
> performance.cache-size=1GB
> performance.readdir-ahead=on
> features.quota=on
> diagnostics.brick-log-level=WARNING
> brick-0=10.0.231.50:-mnt-raid6-storage-storage
> brick-1=10.0.231.51:-mnt-raid6-storage-storage
> brick-2=10.0.231.52:-mnt-raid6-storage-storage
> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>
> So far between gluster01/02 and gluster03/04 the configs are the same,
> although the ordering of some of the features differs.
>
> On gluster05/06 the ordering is different again, and
> quota-version=0 instead of 1.

This is why the peer shows as rejected. Can you check the op-version of
all the glusterd instances, including the one that is in the rejected
state? You can find the op-version in /var/lib/glusterd/glusterd.info.

Rafi KC

> *From new hosts gluster05/gluster06:*
> type=0
> count=4
> status=1
> sub_count=0
> stripe_count=1
> replica_count=1
> disperse_count=0
> redundancy_count=0
> version=25
> transport-type=0
> volume-id=26d355cb-c486-481f-ac16-e25390e73775
> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
> password
> op-version=3
> client-op-version=3
> quota-version=0
> parent_volname=N/A
> restored_from_snap=00000000-0000-0000-0000-000000000000
> snap-max-hard-limit=256
> performance.stat-prefetch=on
> performance.cache-size=1GB
> performance.readdir-ahead=on
> features.quota=on
> diagnostics.brick-log-level=WARNING
> features.inode-quota=on
> features.quota-deem-statfs=on
> brick-0=10.0.231.50:-mnt-raid6-storage-storage
> brick-1=10.0.231.51:-mnt-raid6-storage-storage
> brick-2=10.0.231.52:-mnt-raid6-storage-storage
> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>
> Also, I forgot to mention that when I initially peered the two new
> hosts, glusterd crashed on gluster03 and had to be restarted (log
> attached), but it has been fine since.
>
> Thanks,
> Steve
>
> On Thu, Feb 25, 2016 at 11:27 AM, Mohammed Rafi K C
> <rkavunga at redhat.com> wrote:
>
>
> On 02/25/2016 11:45 PM, Steve Dainard wrote:
>> Hello,
>>
>> I upgraded from 3.6.6 to 3.7.6 a couple of weeks ago. I just peered
>> 2 new nodes to a 4-node cluster, and gluster peer status is:
>>
>> # gluster peer status *<-- from node gluster01*
>> Number of Peers: 5
>>
>> Hostname: 10.0.231.51
>> Uuid: b01de59a-4428-486b-af49-cb486ab44a07
>> State: Peer in Cluster (Connected)
>>
>> Hostname: 10.0.231.52
>> Uuid: 75143760-52a3-4583-82bb-a9920b283dac
>> State: Peer in Cluster (Connected)
>>
>> Hostname: 10.0.231.53
>> Uuid: 2c0b8bb6-825a-4ddd-9958-d8b46e9a2411
>> State: Peer in Cluster (Connected)
>>
>> Hostname: 10.0.231.54 *<-- new node gluster05*
>> Uuid: 408d88d6-0448-41e8-94a3-bf9f98255d9c
>> *State: Peer Rejected (Connected)*
>>
>> Hostname: 10.0.231.55 *<-- new node gluster06*
>> Uuid: 9c155c8e-2cd1-4cfc-83af-47129b582fd3
>> *State: Peer Rejected (Connected)*
>
> It looks like your configuration files are mismatched, i.e. the
> checksum calculation differs on these two nodes compared with the others.
>
> Did you have any failed commits?
>
> Compare /var/lib/glusterd/vols/<volname>/info on the failed node
> against a good one; most likely you will see some difference.
>
> Can you paste the /var/lib/glusterd/vols/<volname>/info?
>
> Regards
> Rafi KC
>
>> I followed the write-up here:
>> http://www.gluster.org/community/documentation/index.php/Resolving_Peer_Rejected
>> and the two new nodes peered properly, but after a reboot of the
>> two new nodes I'm seeing the same Peer Rejected (Connected) state.
>>
>> I've attached logs from an existing node and the two new nodes.
>>
>> Thanks for any suggestions,
>> Steve
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20160226/112aad19/attachment.html>
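[Editor's note] Rafi's diagnostic advice above (compare the volume's info file between a good node and a rejected one) can be sketched as follows. This is a minimal illustration against mock files, not the thread's actual data: on a real cluster the file lives at /var/lib/glusterd/vols/<volname>/info on each peer (fetched e.g. via ssh), and sorting before diffing hides the harmless key-ordering differences Steve observed so that only real value mismatches remain.

```shell
# Mock copies of /var/lib/glusterd/vols/storage/info stand in for the
# real files; only the quota-version line differs, as in Steve's report.
mkdir -p /tmp/gluster-demo
printf 'version=25\nquota-version=1\nfeatures.quota=on\n' > /tmp/gluster-demo/info.good
printf 'version=25\nfeatures.quota=on\nquota-version=0\n' > /tmp/gluster-demo/info.rejected

# Sort both files so that key-ordering differences (harmless) disappear
# and only genuine value mismatches show up in the diff.
sort /tmp/gluster-demo/info.good > /tmp/gluster-demo/good.sorted
sort /tmp/gluster-demo/info.rejected > /tmp/gluster-demo/rejected.sorted
diff /tmp/gluster-demo/good.sorted /tmp/gluster-demo/rejected.sorted || true

# The op-version Rafi asks about is the 'operating-version' line of
# /var/lib/glusterd/glusterd.info on each node, e.g.:
#   grep operating-version /var/lib/glusterd/glusterd.info
```

With these mock files the diff reports only the quota-version line, which is exactly the kind of mismatch that makes the peers' config checksums disagree.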
Mohammed Rafi K C
2016-Feb-25 20:49 UTC
[Gluster-users] Gluster 3.7.6 add new node state Peer Rejected (Connected)
On 02/26/2016 01:53 AM, Mohammed Rafi K C wrote:
>
> On 02/26/2016 01:32 AM, Steve Dainard wrote:
>> I haven't done anything more than peer thus far, so I'm a bit
>> confused as to how the volume info fits in. Can you expand on this a bit?
>>
>> Failed commits? Is this split brain on the replica volumes? I don't
>> get any return from 'gluster volume heal <volname> info' on any of the
>> replica volumes, but if I try 'gluster volume heal <volname> full' I
>> get: 'Launching heal operation to perform full self heal on volume
>> <volname> has been unsuccessful'.
>
> Forget about this; it is not for metadata self-heal.
>
>> I have 5 volumes total.
>>
>> 'Replica 3' volumes running on gluster01/02/03:
>> vm-storage
>> iso-storage
>> export-domain-storage
>> env-modules
>>
>> And one distributed-only volume, 'storage'; its info is shown below:
>>
>> *From existing hosts gluster01/02:*
>> type=0
>> count=4
>> status=1
>> sub_count=0
>> stripe_count=1
>> replica_count=1
>> disperse_count=0
>> redundancy_count=0
>> version=25
>> transport-type=0
>> volume-id=26d355cb-c486-481f-ac16-e25390e73775
>> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
>> password
>> op-version=3
>> client-op-version=3
>> quota-version=1
>> parent_volname=N/A
>> restored_from_snap=00000000-0000-0000-0000-000000000000
>> snap-max-hard-limit=256
>> features.quota-deem-statfs=on
>> features.inode-quota=on
>> diagnostics.brick-log-level=WARNING
>> features.quota=on
>> performance.readdir-ahead=on
>> performance.cache-size=1GB
>> performance.stat-prefetch=on
>> brick-0=10.0.231.50:-mnt-raid6-storage-storage
>> brick-1=10.0.231.51:-mnt-raid6-storage-storage
>> brick-2=10.0.231.52:-mnt-raid6-storage-storage
>> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>>
>> *From existing hosts gluster03/04:*
>> type=0
>> count=4
>> status=1
>> sub_count=0
>> stripe_count=1
>> replica_count=1
>> disperse_count=0
>> redundancy_count=0
>> version=25
>> transport-type=0
>> volume-id=26d355cb-c486-481f-ac16-e25390e73775
>> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
>> password
>> op-version=3
>> client-op-version=3
>> quota-version=1
>> parent_volname=N/A
>> restored_from_snap=00000000-0000-0000-0000-000000000000
>> snap-max-hard-limit=256
>> features.quota-deem-statfs=on
>> features.inode-quota=on
>> performance.stat-prefetch=on
>> performance.cache-size=1GB
>> performance.readdir-ahead=on
>> features.quota=on
>> diagnostics.brick-log-level=WARNING
>> brick-0=10.0.231.50:-mnt-raid6-storage-storage
>> brick-1=10.0.231.51:-mnt-raid6-storage-storage
>> brick-2=10.0.231.52:-mnt-raid6-storage-storage
>> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>>
>> So far between gluster01/02 and gluster03/04 the configs are the
>> same, although the ordering of some of the features differs.
>>
>> On gluster05/06 the ordering is different again, and
>> quota-version=0 instead of 1.
>
> This is why the peer shows as rejected. Can you check the op-version
> of all the glusterd instances, including the one that is in the
> rejected state? You can find the op-version in
> /var/lib/glusterd/glusterd.info.

If all the op-versions are the same, 3.7.6, then as a work-around you
can manually set quota-version=1; restarting glusterd will then solve
the problem. But I would strongly recommend that you figure out the
root cause (RCA). Maybe you can file a bug for this.
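[Editor's note] Rafi's work-around can be sketched as below. This is an illustration against a mock copy of the info file, not commands from the thread: on a real rejected node (gluster05/06) the file would be /var/lib/glusterd/vols/<volname>/info, glusterd should be stopped before editing and restarted afterwards, and the file should be backed up first. The `vols/storage` path and the systemctl commands are assumptions about the poster's setup.

```shell
# Work-around sketch for the quota-version mismatch. On a real rejected
# node the sequence would be (paths assumed, back up first):
#   systemctl stop glusterd
#   cp /var/lib/glusterd/vols/storage/info /root/info.bak
#   sed -i 's/^quota-version=0$/quota-version=1/' /var/lib/glusterd/vols/storage/info
#   systemctl start glusterd
# Demonstrated here on a mock copy of the file:
mkdir -p /tmp/gluster-fix
printf 'version=25\nquota-version=0\nfeatures.quota=on\n' > /tmp/gluster-fix/info
sed -i 's/^quota-version=0$/quota-version=1/' /tmp/gluster-fix/info
grep '^quota-version=' /tmp/gluster-fix/info   # prints: quota-version=1
```

If the checksum mismatch was the only problem, `gluster peer status` on an existing node should show the fixed peers back in "Peer in Cluster" state after the restart.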
Rafi

> Rafi KC
>
>> *From new hosts gluster05/gluster06:*
>> type=0
>> count=4
>> status=1
>> sub_count=0
>> stripe_count=1
>> replica_count=1
>> disperse_count=0
>> redundancy_count=0
>> version=25
>> transport-type=0
>> volume-id=26d355cb-c486-481f-ac16-e25390e73775
>> username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
>> password
>> op-version=3
>> client-op-version=3
>> quota-version=0
>> parent_volname=N/A
>> restored_from_snap=00000000-0000-0000-0000-000000000000
>> snap-max-hard-limit=256
>> performance.stat-prefetch=on
>> performance.cache-size=1GB
>> performance.readdir-ahead=on
>> features.quota=on
>> diagnostics.brick-log-level=WARNING
>> features.inode-quota=on
>> features.quota-deem-statfs=on
>> brick-0=10.0.231.50:-mnt-raid6-storage-storage
>> brick-1=10.0.231.51:-mnt-raid6-storage-storage
>> brick-2=10.0.231.52:-mnt-raid6-storage-storage
>> brick-3=10.0.231.53:-mnt-raid6-storage-storage
>>
>> Also, I forgot to mention that when I initially peered the two new
>> hosts, glusterd crashed on gluster03 and had to be restarted (log
>> attached), but it has been fine since.
>>
>> Thanks,
>> Steve
>>
>> On Thu, Feb 25, 2016 at 11:27 AM, Mohammed Rafi K C
>> <rkavunga at redhat.com> wrote:
>>
>>
>> On 02/25/2016 11:45 PM, Steve Dainard wrote:
>>> Hello,
>>>
>>> I upgraded from 3.6.6 to 3.7.6 a couple of weeks ago.
>>> I just peered 2 new nodes to a 4-node cluster, and gluster peer status is:
>>>
>>> # gluster peer status *<-- from node gluster01*
>>> Number of Peers: 5
>>>
>>> Hostname: 10.0.231.51
>>> Uuid: b01de59a-4428-486b-af49-cb486ab44a07
>>> State: Peer in Cluster (Connected)
>>>
>>> Hostname: 10.0.231.52
>>> Uuid: 75143760-52a3-4583-82bb-a9920b283dac
>>> State: Peer in Cluster (Connected)
>>>
>>> Hostname: 10.0.231.53
>>> Uuid: 2c0b8bb6-825a-4ddd-9958-d8b46e9a2411
>>> State: Peer in Cluster (Connected)
>>>
>>> Hostname: 10.0.231.54 *<-- new node gluster05*
>>> Uuid: 408d88d6-0448-41e8-94a3-bf9f98255d9c
>>> *State: Peer Rejected (Connected)*
>>>
>>> Hostname: 10.0.231.55 *<-- new node gluster06*
>>> Uuid: 9c155c8e-2cd1-4cfc-83af-47129b582fd3
>>> *State: Peer Rejected (Connected)*
>>
>> It looks like your configuration files are mismatched, i.e. the
>> checksum calculation differs on these two nodes compared with the others.
>>
>> Did you have any failed commits?
>>
>> Compare /var/lib/glusterd/vols/<volname>/info on the failed node
>> against a good one; most likely you will see some difference.
>>
>> Can you paste the /var/lib/glusterd/vols/<volname>/info?
>>
>> Regards
>> Rafi KC
>>
>>> I followed the write-up here:
>>> http://www.gluster.org/community/documentation/index.php/Resolving_Peer_Rejected
>>> and the two new nodes peered properly, but after a reboot of the
>>> two new nodes I'm seeing the same Peer Rejected (Connected) state.
>>>
>>> I've attached logs from an existing node and the two new nodes.
>>>
>>> Thanks for any suggestions,
>>> Steve
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20160226/3c764e75/attachment.html>