thr3ads.net - Gluster users - [Gluster-users] Failure while upgrading gluster to 3.10.1 [May 2017]


Pawan Alwandi

2017-May-22 15:35 UTC

[Gluster-users] Failure while upgrading gluster to 3.10.1

On Mon, May 22, 2017 at 8:36 PM, Atin Mukherjee <amukherj at redhat.com>
wrote:
>
>
> On Mon, May 22, 2017 at 7:51 PM, Atin Mukherjee <amukherj at redhat.com>
> wrote:
>
>> Sorry Pawan, I did miss the other part of the attachments. So looking
>> from the glusterd.info file from all the hosts, it looks like host2 and
>> host3 do not have the correct op-version. Can you please set the op-version
>> as "operating-version=30702" in host2 and host3 and restart glusterd
>> instance one by one on all the nodes?
>>
>
> Please ensure that all the hosts are upgraded to the same bits before
> doing this change.
>
Having to upgrade all 3 hosts to a newer version before gluster can work
successfully on any of them means application downtime.  The applications
running on these hosts are expected to be highly available.  So with the
way things are right now, is an online upgrade possible?  My upgrade
steps are: (1) stop the applications, (2) umount the gluster volume, and
then (3) upgrade gluster one host at a time.

Our goal is to get gluster upgraded to 3.11 from 3.6.9, and to make this an
online upgrade we are okay with taking two steps: 3.6.9 -> 3.7, and then
3.7 -> 3.11.
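
For reference, the workaround quoted in this thread (setting "operating-version=30702" in glusterd.info on host2 and host3) could be sketched roughly as below. This is only an illustration, not an official procedure: the helper name set_op_version and the backup suffix are made up here, and it assumes glusterd is stopped on the host while the file is edited. Restarting glusterd afterwards is left out because the command depends on the init system.

```shell
#!/bin/sh
# Hypothetical helper (not part of gluster): rewrite the operating-version
# line of a glusterd.info file, keeping a backup copy next to it.
set_op_version() {
    info_file=$1
    new_version=$2

    # Refuse to touch a file that has no operating-version line at all.
    grep -q '^operating-version=' "$info_file" || return 1

    tmp_file=$(mktemp) || return 1
    # Replace only the operating-version line; UUID and other keys pass through.
    if ! sed "s/^operating-version=.*/operating-version=${new_version}/" \
            "$info_file" > "$tmp_file"; then
        rm -f "$tmp_file"
        return 1
    fi

    # Keep the original so the change can be reverted by hand.
    if ! cp -p "$info_file" "${info_file}.bak"; then
        rm -f "$tmp_file"
        return 1
    fi

    # Write in place so ownership and permissions of the file are unchanged.
    cat "$tmp_file" > "$info_file"
    rm -f "$tmp_file"
}
```

On host2 and host3 this would be called as `set_op_version /var/lib/glusterd/glusterd.info 30702`, followed by restarting glusterd on the nodes one by one as described in the quoted message.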

>
>
>>
>> Apparently it looks like there is a bug which you have uncovered, during
>> peer handshaking if one of the glusterd instance is running with old bits
>> then during validating the handshake request there is a possibility that
>> uuid received will be blank and the same was ignored however there was a
>> patch http://review.gluster.org/13519 which had some additional changes
>> which was always looking at this field and doing some extra checks which
>> was causing the handshake to fail. For now, the above workaround should
>> suffice. I'll be sending a patch pretty soon.
>>
>
> Posted a patch https://review.gluster.org/#/c/17358 .
>
>
>>
>>
>>
>> On Mon, May 22, 2017 at 11:35 AM, Pawan Alwandi <pawan at platform.sh>
>> wrote:
>>
>>> Hello Atin,
>>>
>>> The tar's have the content of `/var/lib/glusterd` too for all 3 nodes,
>>> please check again.
>>>
>>> Thanks
>>>
>>> On Mon, May 22, 2017 at 11:32 AM, Atin Mukherjee <amukherj at redhat.com>
>>> wrote:
>>>
>>>> Pawan,
>>>>
>>>> I see you have provided the log files from the nodes, however it'd be
>>>> really helpful if you can provide me the content of /var/lib/glusterd from
>>>> all the nodes to get to the root cause of this issue.
>>>>
>>>> On Fri, May 19, 2017 at 12:09 PM, Pawan Alwandi <pawan at platform.sh>
>>>> wrote:
>>>>
>>>>> Hello Atin,
>>>>>
>>>>> Thanks for continued support.  I've attached requested files from all
>>>>> 3 nodes.
>>>>>
>>>>> (I think we already verified the UUIDs to be correct, anyway let us
>>>>> know if you find any more info in the logs)
>>>>>
>>>>> Pawan
>>>>>
>>>>> On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <amukherj at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <pawan at platform.sh>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello Atin,
>>>>>>>>
>>>>>>>> I realized that these
>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>> instructions only work for upgrades from 3.7, while we are running
>>>>>>>> 3.6.2.  Are there instructions/suggestion you have for us to
>>>>>>>> upgrade from 3.6 version?
>>>>>>>>
>>>>>>>> I believe upgrade from 3.6 to 3.7 and then to 3.10 would work, but
>>>>>>>> I see similar errors reported when I upgraded to 3.7 too.
>>>>>>>>
>>>>>>>> For what its worth, I was able to set the op-version (gluster v set
>>>>>>>> all cluster.op-version 30702) but that doesn't seem to help.
>>>>>>>>
>>>>>>>> [2017-05-17 06:48:33.700014] I [MSGID: 100030] [glusterfsd.c:2338:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.20 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>>>> [2017-05-17 06:48:33.703808] I [MSGID: 106478] [glusterd.c:1383:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>>>> [2017-05-17 06:48:33.703836] I [MSGID: 106479] [glusterd.c:1432:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>>>> [2017-05-17 06:48:33.708866] W [MSGID: 103071] [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>>>> [2017-05-17 06:48:33.709011] W [MSGID: 103055] [rdma.c:4901:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>> [2017-05-17 06:48:33.709033] W [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>>>> [2017-05-17 06:48:33.709088] W [rpcsvc.c:1642:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>>>> [2017-05-17 06:48:33.709105] E [MSGID: 106243] [glusterd.c:1656:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>>>> [2017-05-17 06:48:35.480043] I [MSGID: 106513] [glusterd-store.c:2068:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
>>>>>>>> [2017-05-17 06:48:35.605779] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>> [2017-05-17 06:48:35.607059] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>> [2017-05-17 06:48:35.607670] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>> [2017-05-17 06:48:35.607025] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>> [2017-05-17 06:48:35.608125] I [MSGID: 106544] [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>
>>>>>>>
>>>>>>>> Final graph:
>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>   1: volume management
>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>   8:     option event-threads 1
>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>  13:     option transport-type rdma
>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>  15: end-volume
>>>>>>>>  16:
>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>> [2017-05-17 06:48:35.609868] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>>>>>>>> [2017-05-17 06:48:35.610839] W [socket.c:596:__socket_rwv] 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>> [2017-05-17 06:48:35.611907] E [rpc-clnt.c:370:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-17 06:48:35.609965 (xid=0x1)
>>>>>>>> [2017-05-17 06:48:35.611928] E [MSGID: 106167] [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>> [2017-05-17 06:48:35.611944] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>> [2017-05-17 06:48:35.612024] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>>>>>> [2017-05-17 06:48:35.612039] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>> [2017-05-17 06:48:35.612079] W [socket.c:596:__socket_rwv] 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>> [2017-05-17 06:48:35.612179] E [rpc-clnt.c:370:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-17 06:48:35.610007 (xid=0x1)
>>>>>>>> [2017-05-17 06:48:35.612197] E [MSGID: 106167] [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>> [2017-05-17 06:48:35.612211] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>> [2017-05-17 06:48:35.612292] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>>>>>> [2017-05-17 06:48:35.613432] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>> [2017-05-17 06:48:35.614317] E [MSGID: 106170] [glusterd-handshake.c:1051:gd_validate_mgmt_hndsk_req] 0-management: Request from peer 192.168.0.6:991 has an entry in peerinfo, but uuid does not match
>>>>>>>>
>>>>>>>
>>>>>>> Apologies for delay. My initial suspect was correct. You have an
>>>>>>> incorrect UUID in the peer file which is causing this. Can you please
>>>>>>> provide me the
>>>>>>>
>>>>>>
>>>>>> Clicked the send button accidentally!
>>>>>>
>>>>>> Can you please send me the content of /var/lib/glusterd & glusterd
>>>>>> log from all the nodes?
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, May 15, 2017 at 10:31 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, 15 May 2017 at 11:58, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Atin,
>>>>>>>>>>
>>>>>>>>>> I see below error.  Do I require gluster to be upgraded on all 3
>>>>>>>>>> hosts for this to work?  Right now I have host 1 running 3.10.1 and host 2
>>>>>>>>>> & 3 running 3.6.2
>>>>>>>>>>
>>>>>>>>>> # gluster v set all cluster.op-version 31001
>>>>>>>>>> volume set: failed: Required op_version (31001) is not supported
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes you should given 3.6 version is EOLed.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Allright, I see that you haven't bumped up the op-version. Can
>>>>>>>>>>>> you please execute:
>>>>>>>>>>>>
>>>>>>>>>>>> gluster v set all cluster.op-version 30101  and then restart
>>>>>>>>>>>> glusterd on all the nodes and check the brick status?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> s/30101/31001
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for looking at this. Below is the output you requested
>>>>>>>>>>>>> for.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Again, I'm seeing those errors after upgrading gluster on host
>>>>>>>>>>>>> 1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Host 1
>>>>>>>>>>>>>
>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>
>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>> state=3
>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>> state=3
>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>
>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>> glusterfs 3.10.1
>>>>>>>>>>>>>
>>>>>>>>>>>>> Host 2
>>>>>>>>>>>>>
>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>
>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>> state=3
>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>> state=3
>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>
>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>
>>>>>>>>>>>>> Host 3
>>>>>>>>>>>>>
>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>
>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>> state=3
>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>> state=3
>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>
>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have already asked for the following earlier:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you please provide output of following from all the nodes:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>> cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, 13 May 2017 at 12:22, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello folks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does anyone have any idea whats going on here?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but
>>>>>>>>>>>>>>>> don't see the glusterfsd and glusterfs processes coming up.
>>>>>>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>>>>>>> is the process that I'm trying to follow.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is a 3 node server setup with a replicated volume
>>>>>>>>>>>>>>>> having replica count of 3.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Logs below:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030] [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478] [glusterd.c:1449:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479] [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520426] W [MSGID: 103071] [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520452] W [MSGID: 103055] [rdma.c:4897:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520465] W [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520518] W [rpcsvc.c:1661:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520534] E [MSGID: 106243] [glusterd.c:1720:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.931764] I [MSGID: 106513] [glusterd-store.c:2197:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.964354] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.993944] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995864] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995879] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995903] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996325] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>> Final graph:
>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>   1: volume management
>>>>>>>>>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>>>>>>>>>   8:     option event-threads 1
>>>>>>>>>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>>>>>>>>>  13:     option transport-type rdma
>>>>>>>>>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>>>>>>>>>  15: end-volume
>>>>>>>>>>>>>>>>  16:
>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996310] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.000461] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001493] W [socket.c:593:__socket_rwv] 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001513] I [MSGID: 106004] [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001677] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001696] W [MSGID: 106118] [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003099] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-10 09:07:05.000627 (xid=0x1)
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003129] E [MSGID: 106167] [glusterd-handshake.c:2181:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003251] W [socket.c:593:__socket_rwv] 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003267] I [MSGID: 106004] [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003318] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003329] W [MSGID: 106118] [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003457] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-10 09:07:05.001407 (xid=0x1)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There are a bunch of errors reported but I'm not sure which
>>>>>>>>>>>>>>>> is signal and which ones are noise.  Does anyone have any idea whats going
>>>>>>>>>>>>>>>> on here?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>> - Atin (atinm)
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>> - Atin (atinm)
>>>>>>>
>>>>>> --
>>>>>> - Atin (atinm)
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>

Atin Mukherjee

2017-May-22 16:12 UTC

[Gluster-users] Failure while upgrading gluster to 3.10.1

On Mon, May 22, 2017 at 9:05 PM, Pawan Alwandi <pawan at platform.sh>
wrote:
>
> On Mon, May 22, 2017 at 8:36 PM, Atin Mukherjee <amukherj at redhat.com>
> wrote:
>
>>
>>
>> On Mon, May 22, 2017 at 7:51 PM, Atin Mukherjee <amukherj at redhat.com>
>> wrote:
>>
>>> Sorry Pawan, I did miss the other part of the attachments. So looking
>>> from the glusterd.info file from all the hosts, it looks like host2 and
>>> host3 do not have the correct op-version. Can you please set the op-version
>>> as "operating-version=30702" in host2 and host3 and restart glusterd
>>> instance one by one on all the nodes?
>>>
>>
>> Please ensure that all the hosts are upgraded to the same bits before
>> doing this change.
>>
>
> Having to upgrade all 3 hosts to newer version before gluster could work
> successfully on any of them means application downtime.  The applications
> running on these hosts are expected to be highly available.  So with the
> way the things are right now, is an online upgrade possible?  My upgrade
> steps are: (1) stop the applications (2) umount the gluster volume, and
> then (3) upgrade gluster one host at a time.
>
One way to mitigate this is to first do an online upgrade to
glusterfs-3.7.9 (op-version: 30707), given that this bug was introduced in
3.7.10, and then move to 3.11.
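
Between the steps of such a staged upgrade, it may help to confirm that every host reports the same operating-version before moving on. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and it assumes a copy of each host's /var/lib/glusterd/glusterd.info has already been collected into local files.

```shell
#!/bin/sh
# Hypothetical helpers (not part of gluster) for comparing the
# operating-version recorded in several glusterd.info copies.

# Print the operating-version value from one glusterd.info file,
# or nothing if the key is absent.
read_op_version() {
    sed -n 's/^operating-version=//p' "$1"
}

# Return 0 if all given files carry the same operating-version,
# 1 if they differ, 2 if any file has no operating-version line.
op_versions_match() {
    first=""
    for f in "$@"; do
        v=$(read_op_version "$f")
        [ -n "$v" ] || return 2
        if [ -z "$first" ]; then
            first=$v
        elif [ "$v" != "$first" ]; then
            return 1
        fi
    done
    return 0
}
```

For example, `op_versions_match host1.info host2.info host3.info` (file names chosen here for illustration) succeeds only when all three copies agree.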

> Our goal is to get gluster upgraded to 3.11 from 3.6.9, and to make this
> an online upgrade we are okay to take two steps 3.6.9 -> 3.7 and then 3.7
> to 3.11.
>
>
>>
>>
>>>
>>> Apparently it looks like there is a bug which you have uncovered, during
>>> peer handshaking if one of the glusterd instance is running with old bits
>>> then during validating the handshake request there is a possibility that
>>> uuid received will be blank and the same was ignored however there was a
>>> patch http://review.gluster.org/13519 which had some additional changes
>>> which was always looking at this field and doing some extra checks which
>>> was causing the handshake to fail. For now, the above workaround should
>>> suffice. I'll be sending a patch pretty soon.
>>>
>>
>> Posted a patch https://review.gluster.org/#/c/17358 .
>>
>>
>>>
>>>
>>>
>>> On Mon, May 22, 2017 at 11:35 AM, Pawan Alwandi <pawan at platform.sh>
>>> wrote:
>>>
>>>> Hello Atin,
>>>>
>>>> The tar's have the content of `/var/lib/glusterd` too for all 3 nodes,
>>>> please check again.
>>>>
>>>> Thanks
>>>>
>>>> On Mon, May 22, 2017 at 11:32 AM, Atin Mukherjee <amukherj at redhat.com>
>>>> wrote:
>>>>
>>>>> Pawan,
>>>>>
>>>>> I see you have provided the log files from the nodes, however it'd be
>>>>> really helpful if you can provide me the content of /var/lib/glusterd from
>>>>> all the nodes to get to the root cause of this issue.
>>>>>
>>>>> On Fri, May 19, 2017 at 12:09 PM, Pawan Alwandi <pawan at platform.sh>
>>>>> wrote:
>>>>>
>>>>>> Hello Atin,
>>>>>>
>>>>>> Thanks for continued support.  I've attached requested files from all
>>>>>> 3 nodes.
>>>>>>
>>>>>> (I think we already verified the UUIDs to be correct, anyway let us
>>>>>> know if you find any more info in the logs)
>>>>>>
>>>>>> Pawan
>>>>>>
>>>>>> On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <amukherj at redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <pawan at platform.sh>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Atin,
>>>>>>>>>
>>>>>>>>> I realized that these
>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>> instructions only work for upgrades from 3.7, while we are running
>>>>>>>>> 3.6.2.  Are there instructions/suggestion you have for us to
>>>>>>>>> upgrade from 3.6 version?
>>>>>>>>>
>>>>>>>>> I believe upgrade from 3.6 to 3.7 and then to 3.10 would work, but
>>>>>>>>> I see similar errors reported when I upgraded to 3.7 too.
>>>>>>>>>
>>>>>>>>> For what its worth, I was able to set the op-version (gluster v
>>>>>>>>> set all cluster.op-version 30702) but that doesn't seem to help.
>>>>>>>>>
>>>>>>>>> [2017-05-17 06:48:33.700014] I [MSGID: 100030] [glusterfsd.c:2338:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.20 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>>>>> [2017-05-17 06:48:33.703808] I [MSGID: 106478] [glusterd.c:1383:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>>>>> [2017-05-17 06:48:33.703836] I [MSGID: 106479] [glusterd.c:1432:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>>>>> [2017-05-17 06:48:33.708866] W [MSGID: 103071] [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>>>>> [2017-05-17 06:48:33.709011] W [MSGID: 103055] [rdma.c:4901:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>> [2017-05-17 06:48:33.709033] W [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>>>>> [2017-05-17 06:48:33.709088] W [rpcsvc.c:1642:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>>>>> [2017-05-17 06:48:33.709105] E [MSGID: 106243] [glusterd.c:1656:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>>>>> [2017-05-17 06:48:35.480043] I [MSGID: 106513] [glusterd-store.c:2068:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
>>>>>>>>> [2017-05-17 06:48:35.605779] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>> [2017-05-17 06:48:35.607059] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>> [2017-05-17 06:48:35.607670] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>> [2017-05-17 06:48:35.607025] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>> [2017-05-17 06:48:35.608125] I [MSGID: 106544] [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>
>>>>>>>>
>>>>>>>>> Final graph:
>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>   1: volume management
>>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>>   8:     option event-threads 1
>>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>>  13:     option transport-type rdma
>>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>>  15: end-volume
>>>>>>>>>  16:
>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>> [2017-05-17 06:48:35.609868] I [MSGID: 101190]
>>>>>>>>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
>>>>>>>>> thread with index 1
>>>>>>>>> [2017-05-17 06:48:35.610839] W [socket.c:596:__socket_rwv]
>>>>>>>>> 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>>> [2017-05-17 06:48:35.611907] E [rpc-clnt.c:370:saved_frames_unwind]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380]
>>>>>>>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
>>>>>>>>> called at 2017-05-17 06:48:35.609965 (xid=0x1)
>>>>>>>>> [2017-05-17 06:48:35.611928] E [MSGID: 106167]
>>>>>>>>> [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk]
>>>>>>>>> 0-management: Error through RPC layer, retry again later
>>>>>>>>> [2017-05-17 06:48:35.611944] I [MSGID: 106004]
>>>>>>>>> [glusterd-handler.c:5201:__glusterd_peer_rpc_notify]
>>>>>>>>> 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>),
>>>>>>>>> in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>> [2017-05-17 06:48:35.612024] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b]
>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0]
>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] )
>>>>>>>>> 0-management: Lock for vol shared not held
>>>>>>>>> [2017-05-17 06:48:35.612039] W [MSGID: 106118]
>>>>>>>>> [glusterd-handler.c:5223:__glusterd_peer_rpc_notify]
>>>>>>>>> 0-management: Lock not released for shared
>>>>>>>>> [2017-05-17 06:48:35.612079] W [socket.c:596:__socket_rwv]
>>>>>>>>> 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>>> [2017-05-17 06:48:35.612179] E [rpc-clnt.c:370:saved_frames_unwind]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39]
>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380]
>>>>>>>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
>>>>>>>>> called at 2017-05-17 06:48:35.610007 (xid=0x1)
>>>>>>>>> [2017-05-17 06:48:35.612197] E [MSGID: 106167]
>>>>>>>>> [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk]
>>>>>>>>> 0-management: Error through RPC layer, retry again later
>>>>>>>>> [2017-05-17 06:48:35.612211] I [MSGID: 106004]
>>>>>>>>> [glusterd-handler.c:5201:__glusterd_peer_rpc_notify]
>>>>>>>>> 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>),
>>>>>>>>> in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>> [2017-05-17 06:48:35.612292] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b]
>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0]
>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] )
>>>>>>>>> 0-management: Lock for vol shared not held
>>>>>>>>> [2017-05-17 06:48:35.613432] W [MSGID: 106118]
>>>>>>>>> [glusterd-handler.c:5223:__glusterd_peer_rpc_notify]
>>>>>>>>> 0-management: Lock not released for shared
>>>>>>>>> [2017-05-17 06:48:35.614317] E [MSGID: 106170]
>>>>>>>>> [glusterd-handshake.c:1051:gd_validate_mgmt_hndsk_req]
>>>>>>>>> 0-management: Request from peer 192.168.0.6:991 has an entry in
>>>>>>>>> peerinfo, but uuid does not match
>>>>>>>>>
>>>>>>>>
>>>>>>>> Apologies for the delay. My initial suspicion was correct: you have an
>>>>>>>> incorrect UUID in the peer file, which is causing this. Can you please
>>>>>>>> provide me the
>>>>>>>>
>>>>>>>
>>>>>>> Clicked the send button accidentally!
>>>>>>>
>>>>>>> Can you please send me the contents of /var/lib/glusterd and the
>>>>>>> glusterd log from all the nodes?
>>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, May 15, 2017 at 10:31 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, 15 May 2017 at 11:58, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Atin,
>>>>>>>>>>>
>>>>>>>>>>> I see the error below.  Does gluster need to be upgraded on all 3
>>>>>>>>>>> hosts for this to work?  Right now I have host 1 running 3.10.1 and
>>>>>>>>>>> hosts 2 and 3 running 3.6.2.
>>>>>>>>>>>
>>>>>>>>>>> # gluster v set all cluster.op-version 31001
>>>>>>>>>>> volume set: failed: Required op_version (31001) is not supported
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, you should, given that the 3.6 version has reached EOL.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Alright, I see that you haven't bumped up the op-version. Can
>>>>>>>>>>>>> you please execute:
>>>>>>>>>>>>>
>>>>>>>>>>>>> gluster v set all cluster.op-version 30101
>>>>>>>>>>>>>
>>>>>>>>>>>>> and then restart glusterd on all the nodes and check the brick status?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> s/30101/31001
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for looking at this.  Below is the output you requested.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Again, I'm seeing those errors after upgrading gluster on host 1.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Host 1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>> glusterfs 3.10.1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Host 2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Host 3
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have already asked for the following earlier:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can you please provide the output of the following from all the
>>>>>>>>>>>>>>> nodes:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>> cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, 13 May 2017 at 12:22, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello folks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Does anyone have any idea what's going on here?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't see the
>>>>>>>>>>>>>>>>> glusterfsd and glusterfs processes coming up.
>>>>>>>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>>>>>>>> is the process that I'm trying to follow.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This is a 3-node server setup with a replicated volume that has a
>>>>>>>>>>>>>>>>> replica count of 3.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Logs below:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030]
>>>>>>>>>>>>>>>>> [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running
>>>>>>>>>>>>>>>>> /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p
>>>>>>>>>>>>>>>>> /var/run/glusterd.pid)
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478]
>>>>>>>>>>>>>>>>> [glusterd.c:1449:init] 0-management: Maximum allowed open file descriptors
>>>>>>>>>>>>>>>>> set to 65536
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479]
>>>>>>>>>>>>>>>>> [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working
>>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520426] W [MSGID: 103071]
>>>>>>>>>>>>>>>>> [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma:
>>>>>>>>>>>>>>>>> rdma_cm event channel creation failed [No such device]
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520452] W [MSGID: 103055]
>>>>>>>>>>>>>>>>> [rdma.c:4897:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520465] W
>>>>>>>>>>>>>>>>> [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport:
>>>>>>>>>>>>>>>>> 'rdma' initialization failed
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520518] W
>>>>>>>>>>>>>>>>> [rpcsvc.c:1661:rpcsvc_create_listener] 0-rpc-service:
>>>>>>>>>>>>>>>>> cannot create listener, initing the transport failed
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520534] E [MSGID: 106243]
>>>>>>>>>>>>>>>>> [glusterd.c:1720:init] 0-management: creation of 1 listeners failed,
>>>>>>>>>>>>>>>>> continuing with succeeded transport
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.931764] I [MSGID: 106513]
>>>>>>>>>>>>>>>>> [glusterd-store.c:2197:glusterd_restore_op_version]
>>>>>>>>>>>>>>>>> 0-glusterd: retrieved op-version: 30600
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.964354] I [MSGID: 106544]
>>>>>>>>>>>>>>>>> [glusterd.c:158:glusterd_uuid_init] 0-management:
>>>>>>>>>>>>>>>>> retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.993944] I [MSGID: 106498]
>>>>>>>>>>>>>>>>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo]
>>>>>>>>>>>>>>>>> 0-management: connect returned 0
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995864] I [MSGID: 106498]
>>>>>>>>>>>>>>>>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo]
>>>>>>>>>>>>>>>>> 0-management: connect returned 0
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995879] W [MSGID: 106062]
>>>>>>>>>>>>>>>>> [glusterd-handler.c:3466:glusterd_transport_inet_options_build]
>>>>>>>>>>>>>>>>> 0-glusterd: Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995903] I
>>>>>>>>>>>>>>>>> [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management:
>>>>>>>>>>>>>>>>> setting frame-timeout to 600
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996325] I
>>>>>>>>>>>>>>>>> [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management:
>>>>>>>>>>>>>>>>> setting frame-timeout to 600
>>>>>>>>>>>>>>>>> Final graph:
>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>>   1: volume management
>>>>>>>>>>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>>>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>>>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>>>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>>>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>>>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>>>>>>>>>>   8:     option event-threads 1
>>>>>>>>>>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>>>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>>>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>>>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>>>>>>>>>>  13:     option transport-type rdma
>>>>>>>>>>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>>>>>>>>>>  15: end-volume
>>>>>>>>>>>>>>>>>  16:
>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996310] W [MSGID: 106062]
>>>>>>>>>>>>>>>>> [glusterd-handler.c:3466:glusterd_transport_inet_options_build]
>>>>>>>>>>>>>>>>> 0-glusterd: Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.000461] I [MSGID: 101190]
>>>>>>>>>>>>>>>>> [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll:
>>>>>>>>>>>>>>>>> Started thread with index 1
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001493] W [socket.c:593:__socket_rwv]
>>>>>>>>>>>>>>>>> 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001513] I [MSGID: 106004]
>>>>>>>>>>>>>>>>> [glusterd-handler.c:5882:__glusterd_peer_rpc_notify]
>>>>>>>>>>>>>>>>> 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>),
>>>>>>>>>>>>>>>>> in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001677] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
>>>>>>>>>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559]
>>>>>>>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0]
>>>>>>>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] )
>>>>>>>>>>>>>>>>> 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001696] W [MSGID: 106118]
>>>>>>>>>>>>>>>>> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify]
>>>>>>>>>>>>>>>>> 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003099] E [rpc-clnt.c:365:saved_frames_unwind]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710]
>>>>>>>>>>>>>>>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
>>>>>>>>>>>>>>>>> called at 2017-05-10 09:07:05.000627 (xid=0x1)
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003129] E [MSGID: 106167]
>>>>>>>>>>>>>>>>> [glusterd-handshake.c:2181:__glusterd_peer_dump_version_cbk]
>>>>>>>>>>>>>>>>> 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003251] W [socket.c:593:__socket_rwv]
>>>>>>>>>>>>>>>>> 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003267] I [MSGID: 106004]
>>>>>>>>>>>>>>>>> [glusterd-handler.c:5882:__glusterd_peer_rpc_notify]
>>>>>>>>>>>>>>>>> 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>),
>>>>>>>>>>>>>>>>> in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003318] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
>>>>>>>>>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559]
>>>>>>>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0]
>>>>>>>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] )
>>>>>>>>>>>>>>>>> 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003329] W [MSGID: 106118]
>>>>>>>>>>>>>>>>> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify]
>>>>>>>>>>>>>>>>> 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003457] E [rpc-clnt.c:365:saved_frames_unwind]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21]
>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710]
>>>>>>>>>>>>>>>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
>>>>>>>>>>>>>>>>> called at 2017-05-10 09:07:05.001407 (xid=0x1)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> There are a bunch of errors reported, but I'm not sure which are
>>>>>>>>>>>>>>>>> signal and which are noise.  Does anyone have any idea what's
>>>>>>>>>>>>>>>>> going on here?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>> - Atin (atinm)
>>>>>>>>
>>>>>>> --
>>>>>>> - Atin (atinm)
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
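
The per-host dumps of /var/lib/glusterd/glusterd.info and
/var/lib/glusterd/peers/* quoted in this thread can be cross-checked
mechanically rather than by eye. The following is only a minimal sketch, not
a gluster tool: the function names (parse_kv, parse_peers, check_peer_uuids,
op_versions) are hypothetical, the key=value layout is assumed from the dumps
as posted, and the mapping of Host 1/2/3 to 192.168.0.5/.6/.7 is inferred
from the peer files, not stated by the posters.

```python
def parse_kv(text):
    """Parse key=value lines (layout assumed from the glusterd.info dumps)."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        result[key.strip()] = value.strip()
    return result


def parse_peers(text):
    """Split the output of `cat peers/*` into records; each starts at uuid=."""
    peers = []
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key == "uuid":
            peers.append({})
        if peers:
            peers[-1][key] = value
    return peers


def check_peer_uuids(hosts):
    """Return a list of mismatches between peer files and each host's own UUID."""
    own = {ip: parse_kv(h["info"]).get("UUID") for ip, h in hosts.items()}
    problems = []
    for ip, h in hosts.items():
        for peer in parse_peers(h["peers"]):
            peer_ip = peer.get("hostname1")
            expected = own.get(peer_ip)
            if expected is None:
                problems.append(f"{ip}: peer {peer_ip} is not a known host")
            elif peer.get("uuid") != expected:
                problems.append(
                    f"{ip}: peer entry for {peer_ip} has uuid {peer.get('uuid')}, "
                    f"but {peer_ip} reports {expected}"
                )
    return problems


def op_versions(hosts):
    """Map each host to the operating-version stored in its glusterd.info."""
    return {ip: parse_kv(h["info"]).get("operating-version") for ip, h in hosts.items()}


# Data copied from the dumps in this thread; the host-to-IP mapping is inferred.
HOSTS = {
    "192.168.0.5": {
        "info": "UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\noperating-version=30600",
        "peers": "uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\nstate=3\nhostname1=192.168.0.7\n"
                 "uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95\nstate=3\nhostname1=192.168.0.6",
    },
    "192.168.0.6": {
        "info": "UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95\noperating-version=30600",
        "peers": "uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\nstate=3\nhostname1=192.168.0.7\n"
                 "uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\nstate=3\nhostname1=192.168.0.5",
    },
    "192.168.0.7": {
        "info": "UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\noperating-version=30600",
        "peers": "uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\nstate=3\nhostname1=192.168.0.5\n"
                 "uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95\nstate=3\nhostname1=192.168.0.6",
    },
}

if __name__ == "__main__":
    print(check_peer_uuids(HOSTS) or "peer UUIDs are consistent")
    print(op_versions(HOSTS))
```

On the dumps as posted, this check reports no UUID mismatches and the same
stored operating-version (30600) on all three hosts, which agrees with the
remark in the thread that the UUIDs had already been verified.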
