ABHISHEK PALIWAL
2016-Nov-21 09:17 UTC
[Gluster-users] Duplicate UUID entries in "gluster peer status" command
On Mon, Nov 21, 2016 at 2:28 PM, Atin Mukherjee <amukherj at redhat.com> wrote:

> On Mon, Nov 21, 2016 at 10:00 AM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
>
>> Hi Atin,
>>
>> This is an embedded system, and those dates are from before the system
>> got into time sync.
>>
>> Yes, I have also seen these two files in the peers directory on the
>> 002500 board, and I want to know why gluster creates the second file
>> when the old file already exists, especially when the contents of the
>> two files are the same.
>>
>> If we fall into this situation, is it possible for gluster to take care
>> of it itself, instead of us manually doing the steps you mentioned
>> above?
>
> We shouldn't have any unwanted data in /var/lib/glusterd in the first
> place; that is a prerequisite of a gluster installation, failing which
> inconsistencies in the configuration data can't be handled automatically
> and require manual intervention.

Does that mean /var/lib/glusterd should always be empty before the gluster
installation starts? Because in this case nothing unwanted was present
before glusterd was installed.

>> I have some questions:
>>
>> 1. Based on the logs, can we find out the reason for having two peer
>> files with the same contents?
>
> No, we can't: the log file doesn't have any entry for
> 26ae19a6-b58f-446a-b079-411d4ee57450, which indicates that this entry is
> a stale one and has been there for a long time, while the log files are
> the latest.

I agree that 26ae19a6-b58f-446a-b079-411d4ee57450 does not appear in the
logs, but as we checked, that file is the newer one in peers and
5be8603b-18d0-4333-8590-38f918a22857 is the older file.

Also, below are some more entries from the etc-glusterfs-glusterd.log file
on the 002500 board:

The message "I [MSGID: 106004] [glusterd-handler.c:5065:__glusterd_peer_rpc_notify] 0-management: Peer <10.32.0.48> (<5be8603b-18d0-4333-8590-38f918a22857>), in state <Peer in Cluster>, has disconnected from glusterd." repeated 3 times between [2016-11-17 22:01:23.542556] and [2016-11-17 22:01:36.993584]
The message "W [MSGID: 106118] [glusterd-handler.c:5087:__glusterd_peer_rpc_notify] 0-management: Lock not released for c_glusterfs" repeated 3 times between [2016-11-17 22:01:23.542973] and [2016-11-17 22:01:36.993855]
[2016-11-17 22:01:48.860555] I [MSGID: 106487] [glusterd-handler.c:1411:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2016-11-17 22:01:49.137733] I [MSGID: 106163] [glusterd-handshake.c:1193:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30706
[2016-11-17 22:01:49.240986] I [MSGID: 106493] [glusterd-rpc-ops.c:694:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 5be8603b-18d0-4333-8590-38f918a22857
[2016-11-17 22:11:58.658884] E [rpc-clnt.c:201:call_bail] 0-management: bailing out frame type(glusterd mgmt) op(--(3)) xid = 0x15 sent 2016-11-17 22:01:48.945424. timeout = 600 for 10.32.0.48:24007
[2016-11-17 22:11:58.658987] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on 10.32.0.48. Please check log file for details.
[2016-11-17 22:11:58.659243] I [socket.c:3382:socket_submit_reply] 0-socket.management: not connected (priv->connected = 255)
[2016-11-17 22:11:58.659265] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2016-11-17 22:11:58.659305] E [MSGID: 106430] [glusterd-utils.c:400:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2016-11-17 22:13:58.674343] E [rpc-clnt.c:201:call_bail] 0-management: bailing out frame type(glusterd mgmt) op(--(3)) xid = 0x11 sent 2016-11-17 22:03:50.268751. timeout = 600 for 10.32.0.48:24007
[2016-11-17 22:13:58.674414] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on 10.32.0.48. Please check log file for details.
[2016-11-17 22:13:58.674604] I [socket.c:3382:socket_submit_reply] 0-socket.management: not connected (priv->connected = 255)
[2016-11-17 22:13:58.674627] E [rpcsvc.c:1314:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2016-11-17 22:13:58.674667] E [MSGID: 106430] [glusterd-utils.c:400:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2016-11-17 22:15:58.687737] E [rpc-clnt.c:201:call_bail] 0-management: bailing out frame type(glusterd mgmt) op(--(3)) xid = 0x17 sent 2016-11-17 22:05:51.341614. timeout = 600 for 10.32.0.48:24007

Are these logs causing the duplicate UUID, or is the duplicate UUID
causing these logs?
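For reference, one quick way to confirm that the two peer files really are
byte-identical is to checksum them. This is only a sketch: the paths assume
the relocated /system/glusterd configuration root mentioned later in this
thread, and the key=value contents shown in comments are illustrative of a
typical peer file, not output captured from these boards:

  # Checksum both peer files on board 002500; matching sums confirm that
  # glusterd stored the same peer twice under two different filenames.
  cd /system/glusterd/peers    # /var/lib/glusterd/peers on a stock install
  md5sum 5be8603b-18d0-4333-8590-38f918a22857 \
         26ae19a6-b58f-446a-b079-411d4ee57450

  # A peer file is a small key=value store; contents typically look like:
  cat 5be8603b-18d0-4333-8590-38f918a22857
  # uuid=5be8603b-18d0-4333-8590-38f918a22857
  # state=3
  # hostname1=10.32.0.48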
>> 2. Is there any way to do it from the gluster code?
>
> Ditto as above.
>
>> Regards,
>> Abhishek
>>
>> On Mon, Nov 21, 2016 at 9:52 AM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>
>>> atin at dhcp35-96:~/Downloads/gluster_users/abhishek_dup_uuid/duplicate_uuid/glusterd_2500/peers$ ls -lrt
>>> total 8
>>> -rw-------. 1 atin wheel 71 *Jan  1  1970* 5be8603b-18d0-4333-8590-38f918a22857
>>> -rw-------. 1 atin wheel 71 Nov 18 03:31 26ae19a6-b58f-446a-b079-411d4ee57450
>>>
>>> On board 2500, look at the date of the file
>>> 5be8603b-18d0-4333-8590-38f918a22857 (marked in bold). I am not sure
>>> how you ended up with this file at such a timestamp. I am guessing
>>> this could be because the setup was not cleaned properly at the time
>>> of re-installation.
>>>
>>> Here are the steps I'd recommend for now:
>>>
>>> 1. Rename 26ae19a6-b58f-446a-b079-411d4ee57450 to
>>>    5be8603b-18d0-4333-8590-38f918a22857; you should have only one
>>>    entry in the peers folder on board 2500.
>>> 2. Bring down both glusterd instances.
>>> 3. Bring them back one by one.
>>>
>>> And then restart glusterd to see if the issue persists.
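Spelled out as shell commands, the recovery above would look roughly like
this. It is a sketch that assumes glusterd is managed by systemd and uses
the stock /var/lib/glusterd path (substitute /system/glusterd on this
setup):

  systemctl stop glusterd          # step 2: bring down both instances first
  cd /var/lib/glusterd/peers
  # Step 1: the two files have identical contents, so the rename simply
  # overwrites one with the other, leaving a single entry named after the
  # UUID that the logs actually reference.
  mv 26ae19a6-b58f-446a-b079-411d4ee57450 5be8603b-18d0-4333-8590-38f918a22857
  systemctl start glusterd         # step 3: bring nodes back one by one
  gluster peer status              # should now list exactly one peer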
>>> On Mon, Nov 21, 2016 at 9:34 AM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
>>>
>>>> Hope you will see it in the logs...
>>>>
>>>> On Mon, Nov 21, 2016 at 9:17 AM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
>>>>
>>>>> Hi Atin,
>>>>>
>>>>> It does not get wiped off; we have changed the configuration path
>>>>> from /var/lib/glusterd to /system/glusterd.
>>>>>
>>>>> So the contents remain the same as before.
>>>>>
>>>>> On Mon, Nov 21, 2016 at 9:15 AM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>>>>
>>>>>> Abhishek,
>>>>>>
>>>>>> Rebooting the board does wipe the /var/lib/glusterd contents in
>>>>>> your setup, right (as per my earlier conversation with you)? In
>>>>>> that case, how are you ensuring that the same node gets back the
>>>>>> older UUID? If you don't, then this is bound to happen.
>>>>>>
>>>>>> On Mon, Nov 21, 2016 at 9:11 AM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Team,
>>>>>>>
>>>>>>> Please look into this problem, as it is very widely seen in our
>>>>>>> system.
>>>>>>>
>>>>>>> We have a replicate volume setup with two bricks, but after
>>>>>>> restarting the second board I get a duplicate entry in the
>>>>>>> "gluster peer status" output, like below:
>>>>>>>
>>>>>>> # gluster peer status
>>>>>>> Number of Peers: 2
>>>>>>>
>>>>>>> Hostname: 10.32.0.48
>>>>>>> Uuid: 5be8603b-18d0-4333-8590-38f918a22857
>>>>>>> State: Peer in Cluster (Connected)
>>>>>>>
>>>>>>> Hostname: 10.32.0.48
>>>>>>> Uuid: 5be8603b-18d0-4333-8590-38f918a22857
>>>>>>> State: Peer in Cluster (Connected)
>>>>>>> #
>>>>>>>
>>>>>>> I am attaching all the logs from both boards and the command
>>>>>>> outputs as well.
>>>>>>>
>>>>>>> So could you please check what the reason is for getting into
>>>>>>> this situation, as it happens very frequently and in multiple
>>>>>>> cases.
>>>>>>>
>>>>>>> Also, we are not replacing any board in the setup, just rebooting
>>>>>>> it.
>>>>>>>
>>>>>>> --
>>>>>>> Regards
>>>>>>> Abhishek Paliwal
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Gluster-users mailing list
>>>>>>> Gluster-users at gluster.org
>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users

--
Regards
Abhishek Paliwal
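A note on Atin's question about the node getting back its older UUID:
glusterd reads its own identity from the glusterd.info file under the
configuration directory, so the UUID survives a reboot only if that file
does. A minimal illustration, assuming the /system/glusterd path used on
these boards; the UUID value shown is made up:

  cat /system/glusterd/glusterd.info   # /var/lib/glusterd/glusterd.info normally
  # UUID=26ae19a6-b58f-446a-b079-411d4ee57450    <- illustrative value
  # operating-version=30706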
ABHISHEK PALIWAL
2016-Nov-21 10:10 UTC
[Gluster-users] Duplicate UUID entries in "gluster peer status" command
I have another set of logs for this problem. In this case we don't have
the timestamp problem, but we still get the same duplicate UUID entries on
board B:

ls -lart
total 16
drwxrwxr-x 12 abhishek abhishek 4096 Oct  3 08:58 ..
drwxrwxr-x  2 abhishek abhishek 4096 Oct  3 08:58 .
-rw-rw-r--  1 abhishek abhishek   71 Oct  3 13:36 d8f66ce8-4154-4246-8084-c63b9cfc1af4
-rw-rw-r--  1 abhishek abhishek   71 Oct  3 13:36 28d37300-425a-4781-b1e0-c13efa2ceee6

I provided logs to you previously as well; I am attaching logs again for
you to analyze. I just want to know why this second peer file is getting
created when there is no entry for it in the logs.

Regards,
Abhishek
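Until the root cause is understood, a quick check along these lines can at
least flag the condition. This is a sketch that assumes GNU coreutils (for
uniq -D -w) and the key=value peer-file layout described earlier in the
thread:

  # Print "uuid filename" for every peer file, then show any UUID value
  # that occurs more than once (a UUID is 36 characters wide).
  awk -F= '/^uuid=/ {print $2, FILENAME}' /system/glusterd/peers/* \
      | sort | uniq -D -w 36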
On Mon, Nov 21, 2016 at 2:47 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:

> [...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: logs.tar.gz
Type: application/x-gzip
Size: 160108 bytes
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161121/ebbb194c/attachment-0001.gz>