thr3ads.net - Gluster users - [Gluster-users] Failure while upgrading gluster to 3.10.1 [May 2017]

Pawan Alwandi

2017-May-29 11:20 UTC

[Gluster-users] Failure while upgrading gluster to 3.10.1

Sorry for the big attachment in the previous mail. The last 1000 lines of
those logs are attached now.

On Mon, May 29, 2017 at 4:44 PM, Pawan Alwandi <pawan at platform.sh>
wrote:
>
>
> On Thu, May 25, 2017 at 9:54 PM, Atin Mukherjee <amukherj at redhat.com>
> wrote:
>
>>
>> On Thu, 25 May 2017 at 19:11, Pawan Alwandi <pawan at platform.sh> wrote:
>>
>>> Hello Atin,
>>>
>>> Yes, glusterd on the other instances is up and running.  Below is the
>>> requested output from all three hosts.
>>>
>>> Host 1
>>>
>>> # gluster peer status
>>> Number of Peers: 2
>>>
>>> Hostname: 192.168.0.7
>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>> State: Peer in Cluster (Disconnected)
>>>
>>
>> Glusterd is disconnected here.
>>
>>>
>>>
>>> Hostname: 192.168.0.6
>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>> State: Peer in Cluster (Disconnected)
>>>
>>
>> Same as above
>>
>> Can you please check what the glusterd log has to say about these
>> disconnects?
>>
>
> glusterd keeps logging this every 3s
>
> [2017-05-29 11:04:52.182782] W [socket.c:852:__socket_keepalive] 0-socket: failed to set keep idle -1 on socket 5, Invalid argument
> [2017-05-29 11:04:52.182808] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
> [2017-05-29 11:04:52.183032] W [socket.c:852:__socket_keepalive] 0-socket: failed to set keep idle -1 on socket 20, Invalid argument
> [2017-05-29 11:04:52.183052] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
> [2017-05-29 11:04:52.183622] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-29 11:04:52.183210 (xid=0x23419)
> [2017-05-29 11:04:52.183735] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] ) 0-management: Lock for vol shared not held
> [2017-05-29 11:04:52.183928] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-29 11:04:52.183422 (xid=0x23419)
> [2017-05-29 11:04:52.184027] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] ) 0-management: Lock for vol shared not held
>
>
>
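A repeating pattern like the one above is easier to read once the entries are grouped by where they were logged. Below is a minimal Python sketch, assuming each entry sits on a single unwrapped line as it would in the glusterd log file itself. The regular expression is inferred only from the entries quoted in this thread, and the name `count_by_source` is made up for illustration.

```python
import re
from collections import Counter

# Layout inferred from the entries quoted in this thread (an assumption):
# [timestamp] LEVEL [optional MSGID] [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"(?:\[MSGID: \d+\] )?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]"
)


def count_by_source(lines):
    """Count log entries per (level, function); skip lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(match["level"], match["func"])] += 1
    return counts


# Three entries copied from the log excerpt above.
sample = [
    "[2017-05-29 11:04:52.182782] W [socket.c:852:__socket_keepalive] "
    "0-socket: failed to set keep idle -1 on socket 5, Invalid argument",
    "[2017-05-29 11:04:52.182808] E [socket.c:2966:socket_connect] "
    "0-management: Failed to set keep-alive: Invalid argument",
    "[2017-05-29 11:04:52.183032] W [socket.c:852:__socket_keepalive] "
    "0-socket: failed to set keep idle -1 on socket 20, Invalid argument",
]

for (level, func), n in count_by_source(sample).most_common():
    print(level, func, n)
```

Feeding it the whole log file (for example `count_by_source(open(path))`) should show quickly whether the keep-alive warnings and the forced unwinds occur in equal numbers.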
>>
>>
>>>
>>> # gluster volume status
>>> Status of volume: shared
>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick 192.168.0.5:/data/exports/shared      49152     0          Y       2105
>>> NFS Server on localhost                     2049      0          Y       2089
>>> Self-heal Daemon on localhost               N/A       N/A        Y       2097
>>>
>>
>> Volume status output does show all the bricks are up, so I'm not sure why
>> you are seeing the volume as read-only. Can you please provide the mount
>> log?
>>
>
> The attached tar has nfs.log, etc-glusterfs-glusterd.vol.log,
> glustershd.log from host1.
>
>
>>
>>
>>>
>>> Task Status of Volume shared
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>> Host 2
>>>
>>> # gluster peer status
>>> Number of Peers: 2
>>>
>>> Hostname: 192.168.0.7
>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>> State: Peer in Cluster (Connected)
>>>
>>> Hostname: 192.168.0.5
>>> Uuid: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>> State: Peer in Cluster (Connected)
>>>
>>>
>>> # gluster volume status
>>> Status of volume: shared
>>> Gluster process                                 Port     Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick 192.168.0.5:/data/exports/shared          49152    Y       2105
>>> Brick 192.168.0.6:/data/exports/shared          49152    Y       2188
>>> Brick 192.168.0.7:/data/exports/shared          49152    Y       2453
>>> NFS Server on localhost                         2049     Y       2194
>>> Self-heal Daemon on localhost                   N/A      Y       2199
>>> NFS Server on 192.168.0.5                       2049     Y       2089
>>> Self-heal Daemon on 192.168.0.5                 N/A      Y       2097
>>> NFS Server on 192.168.0.7                       2049     Y       2458
>>> Self-heal Daemon on 192.168.0.7                 N/A      Y       2463
>>>
>>> Task Status of Volume shared
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>> Host 3
>>>
>>> # gluster peer status
>>> Number of Peers: 2
>>>
>>> Hostname: 192.168.0.5
>>> Uuid: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>> State: Peer in Cluster (Connected)
>>>
>>> Hostname: 192.168.0.6
>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>> State: Peer in Cluster (Connected)
>>>
>>> # gluster volume status
>>> Status of volume: shared
>>> Gluster process                                 Port     Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick 192.168.0.5:/data/exports/shared          49152    Y       2105
>>> Brick 192.168.0.6:/data/exports/shared          49152    Y       2188
>>> Brick 192.168.0.7:/data/exports/shared          49152    Y       2453
>>> NFS Server on localhost                         2049     Y       2458
>>> Self-heal Daemon on localhost                   N/A      Y       2463
>>> NFS Server on 192.168.0.6                       2049     Y       2194
>>> Self-heal Daemon on 192.168.0.6                 N/A      Y       2199
>>> NFS Server on 192.168.0.5                       2049     Y       2089
>>> Self-heal Daemon on 192.168.0.5                 N/A      Y       2097
>>>
>>> Task Status of Volume shared
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, May 24, 2017 at 8:32 PM, Atin Mukherjee <amukherj at redhat.com>
>>> wrote:
>>>
>>>> Are the other glusterd instances up? Can you please share the output of
>>>> gluster peer status and gluster volume status?
>>>>
>>>> On Wed, May 24, 2017 at 4:20 PM, Pawan Alwandi <pawan at platform.sh>
>>>> wrote:
>>>>
>>>>> Thanks Atin,
>>>>>
>>>>> So I got gluster downgraded to 3.7.9 on host 1, and the glusterfs and
>>>>> glusterfsd processes now come up.  But I see the volume is mounted
>>>>> read-only.
>>>>>
>>>>> I see these being logged every 3s:
>>>>>
>>>>> [2017-05-24 10:45:44.440435] W [socket.c:852:__socket_keepalive] 0-socket: failed to set keep idle -1 on socket 17, Invalid argument
>>>>> [2017-05-24 10:45:44.440475] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>>>>> [2017-05-24 10:45:44.440734] W [socket.c:852:__socket_keepalive] 0-socket: failed to set keep idle -1 on socket 20, Invalid argument
>>>>> [2017-05-24 10:45:44.440754] E [socket.c:2966:socket_connect] 0-management: Failed to set keep-alive: Invalid argument
>>>>> [2017-05-24 10:45:44.441354] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-24 10:45:44.440945 (xid=0xbf)
>>>>> [2017-05-24 10:45:44.441505] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] ) 0-management: Lock for vol shared not held
>>>>> [2017-05-24 10:45:44.441660] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-24 10:45:44.441086 (xid=0xbf)
>>>>> [2017-05-24 10:45:44.441790] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] ) 0-management: Lock for vol shared not held
>>>>>
>>>>> The heal info says this:
>>>>>
>>>>> # gluster volume heal shared info
>>>>> Brick 192.168.0.5:/data/exports/shared
>>>>> Number of entries: 0
>>>>>
>>>>> Brick 192.168.0.6:/data/exports/shared
>>>>> Status: Transport endpoint is not connected
>>>>>
>>>>> Brick 192.168.0.7:/data/exports/shared
>>>>> Status: Transport endpoint is not connected
>>>>>
>>>>> Any idea what's up here?
>>>>>
>>>>> Pawan
>>>>>
>>>>> On Mon, May 22, 2017 at 9:42 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, May 22, 2017 at 9:05 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Mon, May 22, 2017 at 8:36 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, May 22, 2017 at 7:51 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Sorry Pawan, I did miss the other part of the attachments. Looking
>>>>>>>>> at the glusterd.info files from all the hosts, it looks like host2
>>>>>>>>> and host3 do not have the correct op-version. Can you please set the
>>>>>>>>> op-version to "operating-version=30702" on host2 and host3 and
>>>>>>>>> restart the glusterd instances one by one on all the nodes?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Please ensure that all the hosts are upgraded to the same bits
>>>>>>>> before making this change.
>>>>>>>>
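The glusterd.info change suggested above can be sketched in Python. This is only an illustration under stated assumptions: the file is assumed to hold plain key=value lines, as in the `cat /var/lib/glusterd/glusterd.info` output quoted further down in this thread, and the helper name `set_operating_version` is hypothetical. It works on the file's text only, so writing the result back and restarting glusterd remain manual steps.

```python
def set_operating_version(info_text, op_version):
    """Return glusterd.info text with operating-version set to op_version."""
    out = []
    replaced = False
    for line in info_text.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key.strip() == "operating-version":
            out.append(f"operating-version={op_version}")
            replaced = True
        else:
            out.append(line)  # keep UUID and any other keys untouched
    if not replaced:
        # Assumption: a missing key may simply be appended.
        out.append(f"operating-version={op_version}")
    return "\n".join(out) + "\n"


# Values from host 2 as posted in this thread.
before = (
    "UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95\n"
    "operating-version=30600\n"
)
print(set_operating_version(before, 30702), end="")
```

Keeping the function free of file I/O makes it easy to check the new content by eye before anything under /var/lib/glusterd is touched.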
>>>>>>>
>>>>>>> Having to upgrade all 3 hosts to the newer version before gluster can
>>>>>>> work successfully on any of them means application downtime.  The
>>>>>>> applications running on these hosts are expected to be highly
>>>>>>> available.  So with the way things are right now, is an online upgrade
>>>>>>> possible?  My upgrade steps are: (1) stop the applications, (2) umount
>>>>>>> the gluster volume, and then (3) upgrade gluster one host at a time.
>>>>>>>
>>>>>>
>>>>>> One way to mitigate this is to first do an online upgrade to
>>>>>> glusterfs-3.7.9 (op-version: 30707), given this bug was introduced in
>>>>>> 3.7.10, and then move to 3.11.
>>>>>>
>>>>>>
>>>>>>> Our goal is to get gluster upgraded to 3.11 from 3.6.9, and to make
>>>>>>> this an online upgrade we are okay with taking two steps: 3.6.9 -> 3.7
>>>>>>> and then 3.7 -> 3.11.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> It looks like you have uncovered a bug. During peer handshaking, if
>>>>>>>>> one of the glusterd instances is running with old bits, then while
>>>>>>>>> validating the handshake request the uuid received may be blank.
>>>>>>>>> That used to be ignored, but the patch
>>>>>>>>> http://review.gluster.org/13519 made some additional changes that
>>>>>>>>> always look at this field and do some extra checks, which causes the
>>>>>>>>> handshake to fail. For now, the above workaround should suffice.
>>>>>>>>> I'll be sending a patch pretty soon.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Posted a patch: https://review.gluster.org/#/c/17358 .
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, May 22, 2017 at 11:35 AM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello Atin,
>>>>>>>>>>
>>>>>>>>>> The tars have the content of `/var/lib/glusterd` too for all 3
>>>>>>>>>> nodes, please check again.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>> On Mon, May 22, 2017 at 11:32 AM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Pawan,
>>>>>>>>>>>
>>>>>>>>>>> I see you have provided the log files from the nodes; however, it
>>>>>>>>>>> would be really helpful if you could provide the content of
>>>>>>>>>>> /var/lib/glusterd from all the nodes so I can get to the root cause
>>>>>>>>>>> of this issue.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 19, 2017 at 12:09 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the continued support.  I've attached the requested
>>>>>>>>>>>> files from all 3 nodes.
>>>>>>>>>>>>
>>>>>>>>>>>> (I think we already verified the UUIDs to be correct; anyway, let
>>>>>>>>>>>> us know if you find any more info in the logs.)
>>>>>>>>>>>>
>>>>>>>>>>>> Pawan
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I realized that the instructions at
>>>>>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>>>>>> only work for upgrades from 3.7, while we are running 3.6.2.  Do
>>>>>>>>>>>>>>> you have instructions or suggestions for upgrading from 3.6?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I believe upgrading from 3.6 to 3.7 and then to 3.10 would work,
>>>>>>>>>>>>>>> but I see similar errors reported when I upgraded to 3.7 too.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For what it's worth, I was able to set the op-version
>>>>>>>>>>>>>>> (gluster v set all cluster.op-version 30702), but that doesn't
>>>>>>>>>>>>>>> seem to help.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [2017-05-17 06:48:33.700014] I [MSGID: 100030] [glusterfsd.c:2338:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.20 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>>>>>>>>>>> [2017-05-17 06:48:33.703808] I [MSGID: 106478] [glusterd.c:1383:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>>>>>>>>>>> [2017-05-17 06:48:33.703836] I [MSGID: 106479] [glusterd.c:1432:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>>>>>>>>>>> [2017-05-17 06:48:33.708866] W [MSGID: 103071] [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>>>>>>>>>>> [2017-05-17 06:48:33.709011] W [MSGID: 103055] [rdma.c:4901:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>>>>>>>> [2017-05-17 06:48:33.709033] W [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>>>>>>>>>>> [2017-05-17 06:48:33.709088] W [rpcsvc.c:1642:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>>>>>>>>>>> [2017-05-17 06:48:33.709105] E [MSGID: 106243] [glusterd.c:1656:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.480043] I [MSGID: 106513] [glusterd-store.c:2068:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.605779] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.607059] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.607670] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.607025] I [MSGID: 106498] [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.608125] I [MSGID: 106544] [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Final graph:
>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>   1: volume management
>>>>>>>>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>>>>>>>>   8:     option event-threads 1
>>>>>>>>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>>>>>>>>  13:     option transport-type rdma
>>>>>>>>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>>>>>>>>  15: end-volume
>>>>>>>>>>>>>>>  16:
>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.609868] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.610839] W [socket.c:596:__socket_rwv] 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611907] E [rpc-clnt.c:370:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-17 06:48:35.609965 (xid=0x1)
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611928] E [MSGID: 106167] [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611944] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612024] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612039] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612079] W [socket.c:596:__socket_rwv] 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612179] E [rpc-clnt.c:370:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-17 06:48:35.610007 (xid=0x1)
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612197] E [MSGID: 106167] [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612211] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612292] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.613432] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>> [2017-05-17 06:48:35.614317] E [MSGID: 106170] [glusterd-handshake.c:1051:gd_validate_mgmt_hndsk_req] 0-management: Request from peer 192.168.0.6:991 has an entry in peerinfo, but uuid does not match
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Apologies for the delay. My initial suspicion was correct. You have
>>>>>>>>>>>>>> an incorrect UUID in the peer file which is causing this. Can you
>>>>>>>>>>>>>> please provide me the
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Clicked the send button accidentally!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you please send me the content of /var/lib/glusterd and the
>>>>>>>>>>>>> glusterd log from all the nodes?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, May 15, 2017 at 10:31 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, 15 May 2017 at 11:58, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Atin,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I see the error below.  Do I need gluster to be upgraded on all 3
>>>>>>>>>>>>>>>>> hosts for this to work?  Right now I have host 1 running 3.10.1
>>>>>>>>>>>>>>>>> and hosts 2 & 3 running 3.6.2.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> # gluster v set all cluster.op-version 31001
>>>>>>>>>>>>>>>>> volume set: failed: Required op_version (31001) is not supported
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, you should, given that the 3.6 version is EOLed.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Alright, I see that you haven't bumped up the op-version. Can
>>>>>>>>>>>>>>>>>>> you please execute:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> gluster v set all cluster.op-version 30101
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> and then restart glusterd on all the nodes and check the brick
>>>>>>>>>>>>>>>>>>> status?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> s/30101/31001
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks for looking at this.  Below is the output you requested.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Again, I'm seeing those errors after upgrading gluster on host 1.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Host 1
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>> glusterfs 3.10.1
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Host 2
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Host 3
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>>>>>>>>
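The three outputs above can be cross-checked mechanically: every peer entry a host stores should carry the UUID that the named host reports in its own glusterd.info. Below is a minimal Python sketch of that check. The helper names `parse_peers` and `mismatches` are made up for illustration, and the mapping of host 1, 2 and 3 to 192.168.0.5, .6 and .7 is inferred from the peer entries rather than stated outright in the thread.

```python
def parse_peers(text):
    """Parse concatenated peer files into a {hostname: uuid} mapping."""
    peers = {}
    uuid = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue
        if key == "uuid":
            uuid = value
        elif key == "hostname1" and uuid is not None:
            peers[value] = uuid
            uuid = None
    return peers


def mismatches(own_uuid, peer_views):
    """List (observer, host, uuid) where a stored peer entry disagrees
    with the UUID that host reports for itself."""
    return [
        (observer, host, uuid)
        for observer, peers in peer_views.items()
        for host, uuid in peers.items()
        if own_uuid.get(host) != uuid
    ]


# Data copied from the three hosts' output above; host addresses inferred.
A = "7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073"  # 192.168.0.5 (host 1)
B = "83e9a0b9-6bd5-483b-8516-d8928805ed95"  # 192.168.0.6 (host 2)
C = "5ec54b4f-f60c-48c6-9e55-95f2bb58f633"  # 192.168.0.7 (host 3)
own_uuid = {"192.168.0.5": A, "192.168.0.6": B, "192.168.0.7": C}
peer_views = {
    "192.168.0.5": parse_peers(
        f"uuid={C}\nstate=3\nhostname1=192.168.0.7\n"
        f"uuid={B}\nstate=3\nhostname1=192.168.0.6\n"
    ),
    "192.168.0.6": parse_peers(
        f"uuid={C}\nstate=3\nhostname1=192.168.0.7\n"
        f"uuid={A}\nstate=3\nhostname1=192.168.0.5\n"
    ),
    "192.168.0.7": parse_peers(
        f"uuid={A}\nstate=3\nhostname1=192.168.0.5\n"
        f"uuid={B}\nstate=3\nhostname1=192.168.0.6\n"
    ),
}
print(mismatches(own_uuid, peer_views))
```

With the values as posted, the list comes back empty: the stored peer UUIDs agree across all three hosts.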
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee <amukherj at redhat.com>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I have already asked for the following earlier:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Can you please provide the output of the following from all the
>>>>>>>>>>>>>>>>>>>>> nodes:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>> cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, 13 May 2017 at 12:22, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hello folks,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Does anyone have any idea what's going on here?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't see
>>>>>>>>>>>>>>>>>>>>>>> the glusterfsd and glusterfs processes coming up.
>>>>>>>>>>>>>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>>>>>>>>>>>>>> is the process that I'm trying to follow.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> This is a 3-node server setup with a replicated volume that has a
>>>>>>>>>>>>>>>>>>>>>>> replica count of 3.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Logs below:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030]
>>>>>>>>>>>>>>>>>>>>>>> [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running
>>>>>>>>>>>>>>>>>>>>>>> /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p
>>>>>>>>>>>>>>>>>>>>>>> /var/run/glusterd.pid)
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd.c:1449:init] 0-management: Maximum allowed open file descriptors
>>>>>>>>>>>>>>>>>>>>>>> set to 65536
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working
>>>>>>>>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520426] W [MSGID: 103071]
>>>>>>>>>>>>>>>>>>>>>>> [rdma.c:4590:__gf_rdma_ctx_create]
>>>>>>>>>>>>>>>>>>>>>>> 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520452] W [MSGID: 103055]
>>>>>>>>>>>>>>>>>>>>>>> [rdma.c:4897:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520465] W
>>>>>>>>>>>>>>>>>>>>>>> [rpc-transport.c:350:rpc_transport_load]
>>>>>>>>>>>>>>>>>>>>>>> 0-rpc-transport: 'rdma' initialization failed
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520518] W
>>>>>>>>>>>>>>>>>>>>>>> [rpcsvc.c:1661:rpcsvc_create_listener]
>>>>>>>>>>>>>>>>>>>>>>> 0-rpc-service: cannot create listener, initing the transport failed
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520534] E [MSGID: 106243]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd.c:1720:init] 0-management: creation of 1 listeners failed,
>>>>>>>>>>>>>>>>>>>>>>> continuing with succeeded transport
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.931764] I [MSGID: 106513]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-store.c:2197:glusterd_restore_op_version]
>>>>>>>>>>>>>>>>>>>>>>> 0-glusterd: retrieved op-version: 30600
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.964354] I [MSGID: 106544]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd.c:158:glusterd_uuid_init] 0-management:
>>>>>>>>>>>>>>>>>>>>>>> retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.993944] I [MSGID: 106498]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: connect returned 0
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995864] I [MSGID: 106498]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: connect returned 0
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995879] W [MSGID: 106062]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd:
>>>>>>>>>>>>>>>>>>>>>>> Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995903] I
>>>>>>>>>>>>>>>>>>>>>>> [rpc-clnt.c:1059:rpc_clnt_connection_init]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996325] I
>>>>>>>>>>>>>>>>>>>>>>> [rpc-clnt.c:1059:rpc_clnt_connection_init]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>>>>>>>>> Final graph:
>>>>>>>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>>>>>>>>   1: volume management
>>>>>>>>>>>>>>>>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>>>>>>>>>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>>>>>>>>>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>>>>>>>>>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>>>>>>>>>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>>>>>>>>>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>>>>>>>>>>>>>>>>   8:     option event-threads 1
>>>>>>>>>>>>>>>>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>>>>>>>>>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>>>>>>>>>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>>>>>>>>>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>>>>>>>>>>>>>>>>  13:     option transport-type rdma
>>>>>>>>>>>>>>>>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>>>>>>>>>>>>>>>>  15: end-volume
>>>>>>>>>>>>>>>>>>>>>>>  16:
>>>>>>>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996310] W [MSGID: 106062]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd:
>>>>>>>>>>>>>>>>>>>>>>> Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.000461] I [MSGID: 101190]
>>>>>>>>>>>>>>>>>>>>>>> [event-epoll.c:629:event_dispatch_epoll_worker]
>>>>>>>>>>>>>>>>>>>>>>> 0-epoll: Started thread with index 1
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001493] W
>>>>>>>>>>>>>>>>>>>>>>> [socket.c:593:__socket_rwv] 0-management: readv on
>>>>>>>>>>>>>>>>>>>>>>> 192.168.0.7:24007 failed (No data available)
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001513] I [MSGID: 106004]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handler.c:5882:__glusterd_peer_rpc_notify]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>),
>>>>>>>>>>>>>>>>>>>>>>> in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001677] W
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
>>>>>>>>>>>>>>>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559)
>>>>>>>>>>>>>>>>>>>>>>> [0x7f0bf9d74559]
>>>>>>>>>>>>>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0)
>>>>>>>>>>>>>>>>>>>>>>> [0x7f0bf9d7dcf0]
>>>>>>>>>>>>>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3)
>>>>>>>>>>>>>>>>>>>>>>> [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001696] W [MSGID: 106118]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003099] E
>>>>>>>>>>>>>>>>>>>>>>> [rpc-clnt.c:365:saved_frames_unwind] (-->
>>>>>>>>>>>>>>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c]
>>>>>>>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf]
>>>>>>>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de]
>>>>>>>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21]
>>>>>>>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710]
>>>>>>>>>>>>>>>>>>>>>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
>>>>>>>>>>>>>>>>>>>>>>> called at 2017-05-10 09:07:05.000627 (xid=0x1)
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003129] E [MSGID: 106167]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handshake.c:2181:__glusterd_peer_dump_version_cbk]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003251] W
>>>>>>>>>>>>>>>>>>>>>>> [socket.c:593:__socket_rwv] 0-management: readv on
>>>>>>>>>>>>>>>>>>>>>>> 192.168.0.6:24007 failed (No data available)
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003267] I [MSGID: 106004]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handler.c:5882:__glusterd_peer_rpc_notify]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>),
>>>>>>>>>>>>>>>>>>>>>>> in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003318] W
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-locks.c:675:glusterd_mgmt_v3_unlock]
>>>>>>>>>>>>>>>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559)
>>>>>>>>>>>>>>>>>>>>>>> [0x7f0bf9d74559]
>>>>>>>>>>>>>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0)
>>>>>>>>>>>>>>>>>>>>>>> [0x7f0bf9d7dcf0]
>>>>>>>>>>>>>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3)
>>>>>>>>>>>>>>>>>>>>>>> [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003329] W [MSGID: 106118]
>>>>>>>>>>>>>>>>>>>>>>> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify]
>>>>>>>>>>>>>>>>>>>>>>> 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003457] E
>>>>>>>>>>>>>>>>>>>>>>> [rpc-clnt.c:365:saved_frames_unwind] (-->
>>>>>>>>>>>>>>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c]
>>>>>>>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf]
>>>>>>>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de]
>>>>>>>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21]
>>>>>>>>>>>>>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710]
>>>>>>>>>>>>>>>>>>>>>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1))
>>>>>>>>>>>>>>>>>>>>>>> called at 2017-05-10 09:07:05.001407 (xid=0x1)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> There are a bunch of errors reported but I'm not
>>>>>>>>>>>>>>>>>>>>>>> sure which is signal and which ones are noise.  Does anyone have any idea
>>>>>>>>>>>>>>>>>>>>>>> whats going on here?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: log_host1.tar.gz
Type: application/x-gzip
Size: 88542 bytes
Desc: not available
URL:
<http://lists.gluster.org/pipermail/gluster-users/attachments/20170529/297e12d8/attachment.gz>

Atin Mukherjee

2017-May-30 05:10 UTC

[Gluster-users] Failure while upgrading gluster to 3.10.1

Pawan - I couldn't reach any conclusive analysis so far. But looking at
the client (nfs) and glusterd log files, it does look like there is an
issue with the peer connections. Does restarting all the glusterd instances
one by one solve this?
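
To check at a glance which peers a node reports as disconnected after each restart, the `gluster peer status` output can be parsed. This is only a minimal sketch: it assumes the output keeps the Hostname/Uuid/State layout quoted in this thread, and the function names `parse_peer_status` and `disconnected_peers` are hypothetical helpers, not part of any gluster tooling.

```python
import re

# Sample text copied from the Host 1 output quoted in this thread.
SAMPLE = """\
Number of Peers: 2

Hostname: 192.168.0.7
Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
State: Peer in Cluster (Disconnected)

Hostname: 192.168.0.6
Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
State: Peer in Cluster (Disconnected)
"""

# Splits "Peer in Cluster (Disconnected)" into state and connection parts.
STATE_RE = re.compile(r"^(.*?)\s*\((\w+)\)$")


def parse_peer_status(text):
    """Return one dict per peer found in `gluster peer status` text."""
    peers = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            current = {"hostname": value}
            peers.append(current)
        elif current is None:
            continue  # e.g. the "Number of Peers" header line
        elif key == "Uuid":
            current["uuid"] = value
        elif key == "State":
            match = STATE_RE.match(value)
            if match:
                current["state"] = match.group(1)
                current["connection"] = match.group(2)
            else:
                current["state"] = value
                current["connection"] = None
    return peers


def disconnected_peers(peers):
    """Hostnames of peers whose connection is anything but Connected."""
    return [p["hostname"] for p in peers if p.get("connection") != "Connected"]


if __name__ == "__main__":
    print(disconnected_peers(parse_peer_status(SAMPLE)))
```

In practice the text would come from running `gluster peer status` on each host in turn; the sample above is only the Host 1 output from this thread.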

On Mon, May 29, 2017 at 4:50 PM, Pawan Alwandi <pawan at platform.sh>
wrote:
> Sorry for big attachment in previous mail...last 1000 lines of those logs
> attached now.
>
> On Mon, May 29, 2017 at 4:44 PM, Pawan Alwandi <pawan at platform.sh>
wrote:
>
>>
>>
>> On Thu, May 25, 2017 at 9:54 PM, Atin Mukherjee <amukherj at
redhat.com>
>> wrote:
>>
>>>
>>> On Thu, 25 May 2017 at 19:11, Pawan Alwandi <pawan at
platform.sh> wrote:
>>>
>>>> Hello Atin,
>>>>
>>>> Yes, glusterd on other instances are up and running.  Below is
the
>>>> requested output on all the three hosts.
>>>>
>>>> Host 1
>>>>
>>>> # gluster peer status
>>>> Number of Peers: 2
>>>>
>>>> Hostname: 192.168.0.7
>>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>> State: Peer in Cluster (Disconnected)
>>>>
>>>
>>> Glusterd is disconnected here.
>>>
>>>>
>>>>
>>>> Hostname: 192.168.0.6
>>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>> State: Peer in Cluster (Disconnected)
>>>>
>>>
>>> Same as above
>>>
>>> Can you please check what does glusterd log have to say here about
these
>>> disconnects?
>>>
>>
>> glusterd keeps logging this every 3s
>>
>> [2017-05-29 11:04:52.182782] W [socket.c:852:__socket_keepalive]
>> 0-socket: failed to set keep idle -1 on socket 5, Invalid argument
>> [2017-05-29 11:04:52.182808] E [socket.c:2966:socket_connect]
>> 0-management: Failed to set keep-alive: Invalid argument
>> [2017-05-29 11:04:52.183032] W [socket.c:852:__socket_keepalive]
>> 0-socket: failed to set keep idle -1 on socket 20, Invalid argument
>> [2017-05-29 11:04:52.183052] E [socket.c:2966:socket_connect]
>> 0-management: Failed to set keep-alive: Invalid argument
>> [2017-05-29 11:04:52.183622] E [rpc-clnt.c:362:saved_frames_unwind]
(-->
>>
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP)
op(DUMP(1))
>> called at 2017-05-29 11:04:52.183210 (xid=0x23419)
>> [2017-05-29 11:04:52.183735] W
[glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/gl
>> usterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb]
>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glu
>> sterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a]
>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glu
>> sterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] )
0-management:
>> Lock for vol shared not held
>> [2017-05-29 11:04:52.183928] E [rpc-clnt.c:362:saved_frames_unwind]
(-->
>>
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP)
op(DUMP(1))
>> called at 2017-05-29 11:04:52.183422 (xid=0x23419)
>> [2017-05-29 11:04:52.184027] W
[glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/gl
>> usterd.so(glusterd_big_locked_notify+0x4b) [0x7f767734dffb]
>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glu
>> sterd.so(__glusterd_peer_rpc_notify+0x14a) [0x7f7677357c6a]
>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glu
>> sterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7f76773f0ef3] )
0-management:
>> Lock for vol shared not held
>>
>>
>>
>>>
>>>
>>>>
>>>> # gluster volume status
>>>> Status of volume: shared
>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick 192.168.0.5:/data/exports/shared      49152     0          Y       2105
>>>> NFS Server on localhost                     2049      0          Y       2089
>>>> Self-heal Daemon on localhost               N/A       N/A        Y       2097
>>>>
>>>
>>> Volume status output does show all the bricks are up. So I'm
not sure
>>> why are you seeing the volume as read only. Can you please provide
the
>>> mount log?
>>>
>>
>> The attached tar has nfs.log, etc-glusterfs-glusterd.vol.log,
>> glustershd.log from host1.
>>
>>
>>>
>>>
>>>>
>>>> Task Status of Volume shared
>>>> ------------------------------------------------------------
>>>> ------------------
>>>> There are no active volume tasks
>>>>
>>>> Host 2
>>>>
>>>> # gluster peer status
>>>> Number of Peers: 2
>>>>
>>>> Hostname: 192.168.0.7
>>>> Uuid: 5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>> State: Peer in Cluster (Connected)
>>>>
>>>> Hostname: 192.168.0.5
>>>> Uuid: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>> State: Peer in Cluster (Connected)
>>>>
>>>>
>>>> # gluster volume status
>>>> Status of volume: shared
>>>> Gluster process                             Port    Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick 192.168.0.5:/data/exports/shared      49152   Y       2105
>>>> Brick 192.168.0.6:/data/exports/shared      49152   Y       2188
>>>> Brick 192.168.0.7:/data/exports/shared      49152   Y       2453
>>>> NFS Server on localhost                     2049    Y       2194
>>>> Self-heal Daemon on localhost               N/A     Y       2199
>>>> NFS Server on 192.168.0.5                   2049    Y       2089
>>>> Self-heal Daemon on 192.168.0.5             N/A     Y       2097
>>>> NFS Server on 192.168.0.7                   2049    Y       2458
>>>> Self-heal Daemon on 192.168.0.7             N/A     Y       2463
>>>>
>>>> Task Status of Volume shared
>>>> ------------------------------------------------------------
>>>> ------------------
>>>> There are no active volume tasks
>>>>
>>>> Host 3
>>>>
>>>> # gluster peer status
>>>> Number of Peers: 2
>>>>
>>>> Hostname: 192.168.0.5
>>>> Uuid: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>> State: Peer in Cluster (Connected)
>>>>
>>>> Hostname: 192.168.0.6
>>>> Uuid: 83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>> State: Peer in Cluster (Connected)
>>>>
>>>> # gluster volume status
>>>> Status of volume: shared
>>>> Gluster process                             Port    Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick 192.168.0.5:/data/exports/shared      49152   Y       2105
>>>> Brick 192.168.0.6:/data/exports/shared      49152   Y       2188
>>>> Brick 192.168.0.7:/data/exports/shared      49152   Y       2453
>>>> NFS Server on localhost                     2049    Y       2458
>>>> Self-heal Daemon on localhost               N/A     Y       2463
>>>> NFS Server on 192.168.0.6                   2049    Y       2194
>>>> Self-heal Daemon on 192.168.0.6             N/A     Y       2199
>>>> NFS Server on 192.168.0.5                   2049    Y       2089
>>>> Self-heal Daemon on 192.168.0.5             N/A     Y       2097
>>>>
>>>> Task Status of Volume shared
>>>> ------------------------------------------------------------
>>>> ------------------
>>>> There are no active volume tasks
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, May 24, 2017 at 8:32 PM, Atin Mukherjee <amukherj at
redhat.com>
>>>> wrote:
>>>>
>>>>> Are the other glusterd instances up? Output of gluster peer status
>>>>> & gluster volume status please?
>>>>>
>>>>> On Wed, May 24, 2017 at 4:20 PM, Pawan Alwandi <pawan at
platform.sh>
>>>>> wrote:
>>>>>
>>>>>> Thanks Atin,
>>>>>>
>>>>>> So I got gluster downgraded to 3.7.9 on host 1 and now have the
>>>>>> glusterfs and glusterfsd processes come up.  But I see the volume is
>>>>>> mounted read only.
>>>>>>
>>>>>> I see these being logged every 3s:
>>>>>>
>>>>>> [2017-05-24 10:45:44.440435] W
[socket.c:852:__socket_keepalive]
>>>>>> 0-socket: failed to set keep idle -1 on socket 17,
Invalid argument
>>>>>> [2017-05-24 10:45:44.440475] E
[socket.c:2966:socket_connect]
>>>>>> 0-management: Failed to set keep-alive: Invalid
argument
>>>>>> [2017-05-24 10:45:44.440734] W
[socket.c:852:__socket_keepalive]
>>>>>> 0-socket: failed to set keep idle -1 on socket 20,
Invalid argument
>>>>>> [2017-05-24 10:45:44.440754] E
[socket.c:2966:socket_connect]
>>>>>> 0-management: Failed to set keep-alive: Invalid
argument
>>>>>> [2017-05-24 10:45:44.441354] E
[rpc-clnt.c:362:saved_frames_unwind]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
>>>>>> ))))) 0-management: forced unwinding frame
type(GLUSTERD-DUMP) op(DUMP(1))
>>>>>> called at 2017-05-24 10:45:44.440945 (xid=0xbf)
>>>>>> [2017-05-24 10:45:44.441505] W
[glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
>>>>>>
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/gl
>>>>>> usterd.so(glusterd_big_locked_notify+0x4b)
[0x7f767734dffb]
>>>>>>
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glu
>>>>>> sterd.so(__glusterd_peer_rpc_notify+0x14a)
[0x7f7677357c6a]
>>>>>>
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glu
>>>>>> sterd.so(glusterd_mgmt_v3_unlock+0x4c3)
[0x7f76773f0ef3] )
>>>>>> 0-management: Lock for vol shared not held
>>>>>> [2017-05-24 10:45:44.441660] E
[rpc-clnt.c:362:saved_frames_unwind]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7f767c46d483]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f767c2383af]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f767c2384ce]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7f767c239c8e]
>>>>>> (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f767c23a4a8]
>>>>>> ))))) 0-management: forced unwinding frame
type(GLUSTERD-DUMP) op(DUMP(1))
>>>>>> called at 2017-05-24 10:45:44.441086 (xid=0xbf)
>>>>>> [2017-05-24 10:45:44.441790] W
[glusterd-locks.c:681:glusterd_mgmt_v3_unlock]
>>>>>>
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/gl
>>>>>> usterd.so(glusterd_big_locked_notify+0x4b)
[0x7f767734dffb]
>>>>>>
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glu
>>>>>> sterd.so(__glusterd_peer_rpc_notify+0x14a)
[0x7f7677357c6a]
>>>>>>
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glu
>>>>>> sterd.so(glusterd_mgmt_v3_unlock+0x4c3)
[0x7f76773f0ef3] )
>>>>>> 0-management: Lock for vol shared not held
>>>>>>
>>>>>> The heal info says this:
>>>>>>
>>>>>> # gluster volume heal shared info
>>>>>> Brick 192.168.0.5:/data/exports/shared
>>>>>> Number of entries: 0
>>>>>>
>>>>>> Brick 192.168.0.6:/data/exports/shared
>>>>>> Status: Transport endpoint is not connected
>>>>>>
>>>>>> Brick 192.168.0.7:/data/exports/shared
>>>>>> Status: Transport endpoint is not connected
>>>>>>
>>>>>> Any idea whats up here?
>>>>>>
>>>>>> Pawan
>>>>>>
>>>>>> On Mon, May 22, 2017 at 9:42 PM, Atin Mukherjee
<amukherj at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, May 22, 2017 at 9:05 PM, Pawan Alwandi
<pawan at platform.sh>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, May 22, 2017 at 8:36 PM, Atin Mukherjee
<
>>>>>>>> amukherj at redhat.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, May 22, 2017 at 7:51 PM, Atin
Mukherjee <
>>>>>>>>> amukherj at redhat.com> wrote:
>>>>>>>>>
>>>>>>>>>> Sorry Pawan, I did miss the other part of the attachments. So
>>>>>>>>>> looking from the glusterd.info file from all the hosts, it looks
>>>>>>>>>> like host2 and host3 do not have the correct op-version. Can you please set
>>>>>>>>>> the op-version as "operating-version=30702" in host2 and host3 and restart
>>>>>>>>>> glusterd instance one by one on all the nodes?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Please ensure that all the hosts are upgraded to the same bits
>>>>>>>>> before doing this change.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Having to upgrade all 3 hosts to newer version before gluster could
>>>>>>>> work successfully on any of them means application downtime.  The
>>>>>>>> applications running on these hosts are expected to be highly available.
>>>>>>>> So with the way the things are right now, is an online upgrade possible?
>>>>>>>> My upgrade steps are: (1) stop the applications (2) umount the gluster
>>>>>>>> volume, and then (3) upgrade gluster one host at a time.
>>>>>>>>
>>>>>>>
>>>>>>> One way to mitigate this is to first do an online upgrade to
>>>>>>> glusterfs-3.7.9 (op-version:30707), given this bug was introduced in 3.7.10,
>>>>>>> and then come to 3.11.
>>>>>>>
>>>>>>>
>>>>>>>> Our goal is to get gluster upgraded to 3.11 from 3.6.9, and to make
>>>>>>>> this an online upgrade we are okay to take two steps 3.6.9 -> 3.7 and then
>>>>>>>> 3.7 to 3.11.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It looks like you have uncovered a bug. During peer handshaking,
>>>>>>>>>> if one of the glusterd instances is running with old bits, the uuid
>>>>>>>>>> received while validating the handshake request can be blank. That
>>>>>>>>>> used to be ignored, but the patch http://review.gluster.org/13519
>>>>>>>>>> made some additional changes that always look at this field and do
>>>>>>>>>> some extra checks, which causes the handshake to fail. For now, the
>>>>>>>>>> above workaround should suffice. I'll be sending a patch pretty soon.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Posted a patch https://review.gluster.org/#/c/17358 .
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, May 22, 2017 at 11:35 AM, Pawan
Alwandi <
>>>>>>>>>> pawan at platform.sh> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>
>>>>>>>>>>> The tar's have the content of `/var/lib/glusterd` too for all 3
>>>>>>>>>>> nodes, please check again.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>> On Mon, May 22, 2017 at 11:32 AM,
Atin Mukherjee <
>>>>>>>>>>> amukherj at redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Pawan,
>>>>>>>>>>>>
>>>>>>>>>>>> I see you have provided the log files from the nodes, however
>>>>>>>>>>>> it'd be really helpful if you can provide me the content of
>>>>>>>>>>>> /var/lib/glusterd from all the nodes to get to the root cause of this issue.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 19, 2017 at 12:09
PM, Pawan Alwandi <
>>>>>>>>>>>> pawan at platform.sh> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for continued support.  I've attached requested files
>>>>>>>>>>>>> from all 3 nodes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (I think we already verified the UUIDs to be correct, anyway
>>>>>>>>>>>>> let us know if you find any more info in the logs)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, May 18, 2017 at
11:45 PM, Atin Mukherjee <
>>>>>>>>>>>>> amukherj at redhat.com>
wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, 18 May 2017 at
23:40, Atin Mukherjee <
>>>>>>>>>>>>>> amukherj at
redhat.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, 17 May 2017
at 12:47, Pawan Alwandi
>>>>>>>>>>>>>>> <pawan at
platform.sh> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I realized that these
>>>>>>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>>>>>>> instructions only work for upgrades from 3.7, while we are running 3.6.2.
>>>>>>>>>>>>>>>> Are there instructions/suggestion you have for us to upgrade from 3.6 version?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I believe upgrade from 3.6 to 3.7 and then to 3.10 would
>>>>>>>>>>>>>>>> work, but I see similar errors reported when I upgraded to 3.7 too.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For what its worth, I was able to set the op-version
>>>>>>>>>>>>>>>> (gluster v set all cluster.op-version 30702) but that doesn't seem to help.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [2017-05-17
06:48:33.700014] I [MSGID: 100030]
>>>>>>>>>>>>>>>>
[glusterfsd.c:2338:main] 0-/usr/sbin/glusterd: Started running
>>>>>>>>>>>>>>>>
/usr/sbin/glusterd version 3.7.20 (args: /usr/sbin/glusterd -p
>>>>>>>>>>>>>>>>
/var/run/glusterd.pid)
>>>>>>>>>>>>>>>> [2017-05-17
06:48:33.703808] I [MSGID: 106478]
>>>>>>>>>>>>>>>>
[glusterd.c:1383:init] 0-management: Maximum allowed open file descriptors
>>>>>>>>>>>>>>>> set to 65536
>>>>>>>>>>>>>>>> [2017-05-17
06:48:33.703836] I [MSGID: 106479]
>>>>>>>>>>>>>>>>
[glusterd.c:1432:init] 0-management: Using /var/lib/glusterd as working
>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>> [2017-05-17
06:48:33.708866] W [MSGID: 103071]
>>>>>>>>>>>>>>>>
[rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma:
>>>>>>>>>>>>>>>> rdma_cm event
channel creation failed [No such device]
>>>>>>>>>>>>>>>> [2017-05-17
06:48:33.709011] W [MSGID: 103055]
>>>>>>>>>>>>>>>>
[rdma.c:4901:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>>>>>>>>> [2017-05-17
06:48:33.709033] W
>>>>>>>>>>>>>>>>
[rpc-transport.c:359:rpc_transport_load] 0-rpc-transport:
>>>>>>>>>>>>>>>> 'rdma'
initialization failed
>>>>>>>>>>>>>>>> [2017-05-17
06:48:33.709088] W
>>>>>>>>>>>>>>>>
[rpcsvc.c:1642:rpcsvc_create_listener] 0-rpc-service:
>>>>>>>>>>>>>>>> cannot create
listener, initing the transport failed
>>>>>>>>>>>>>>>> [2017-05-17
06:48:33.709105] E [MSGID: 106243]
>>>>>>>>>>>>>>>>
[glusterd.c:1656:init] 0-management: creation of 1 listeners failed,
>>>>>>>>>>>>>>>> continuing with
succeeded transport
>>>>>>>>>>>>>>>> [2017-05-17
06:48:35.480043] I [MSGID: 106513]
>>>>>>>>>>>>>>>>
[glusterd-store.c:2068:glusterd_restore_op_version]
>>>>>>>>>>>>>>>> 0-glusterd:
retrieved op-version: 30600
>>>>>>>>>>>>>>>> [2017-05-17
06:48:35.605779] I [MSGID: 106498]
>>>>>>>>>>>>>>>>
[glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo]
>>>>>>>>>>>>>>>> 0-management:
connect returned 0
>>>>>>>>>>>>>>>> [2017-05-17
06:48:35.607059] I
>>>>>>>>>>>>>>>>
[rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management:
>>>>>>>>>>>>>>>> setting
frame-timeout to 600
>>>>>>>>>>>>>>>> [2017-05-17
06:48:35.607670] I
>>>>>>>>>>>>>>>>
[rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management:
>>>>>>>>>>>>>>>> setting
frame-timeout to 600
>>>>>>>>>>>>>>>> [2017-05-17
06:48:35.607025] I [MSGID: 106498]
>>>>>>>>>>>>>>>>
[glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo]
>>>>>>>>>>>>>>>> 0-management:
connect returned 0
>>>>>>>>>>>>>>>> [2017-05-17
06:48:35.608125] I [MSGID: 106544]
>>>>>>>>>>>>>>>>
[glusterd.c:159:glusterd_uuid_init] 0-management:
>>>>>>>>>>>>>>>> retrieved UUID:
7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Final graph:
>>>>>>>>>>>>>>>>
+-----------------------------
>>>>>>>>>>>>>>>>
-------------------------------------------------+
>>>>>>>>>>>>>>>>   1: volume
management
>>>>>>>>>>>>>>>>   2:     type
mgmt/glusterd
>>>>>>>>>>>>>>>>   3:     option
rpc-auth.auth-glusterfs on
>>>>>>>>>>>>>>>>   4:     option
rpc-auth.auth-unix on
>>>>>>>>>>>>>>>>   5:     option
rpc-auth.auth-null on
>>>>>>>>>>>>>>>>   6:     option
rpc-auth-allow-insecure on
>>>>>>>>>>>>>>>>   7:     option
transport.socket.listen-backlog 128
>>>>>>>>>>>>>>>>   8:     option
event-threads 1
>>>>>>>>>>>>>>>>   9:     option
ping-timeout 0
>>>>>>>>>>>>>>>>  10:     option
transport.socket.read-fail-log off
>>>>>>>>>>>>>>>>  11:     option
transport.socket.keepalive-interval 2
>>>>>>>>>>>>>>>>  12:     option
transport.socket.keepalive-time 10
>>>>>>>>>>>>>>>>  13:     option
transport-type rdma
>>>>>>>>>>>>>>>>  14:     option
working-directory /var/lib/glusterd
>>>>>>>>>>>>>>>>  15: end-volume
>>>>>>>>>>>>>>>>  16:
>>>>>>>>>>>>>>>>
+-----------------------------
>>>>>>>>>>>>>>>>
-------------------------------------------------+
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.609868] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.610839] W [socket.c:596:__socket_rwv] 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611907] E [rpc-clnt.c:370:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-17 06:48:35.609965 (xid=0x1)
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611928] E [MSGID: 106167] [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.611944] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612024] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612039] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612079] W [socket.c:596:__socket_rwv] 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612179] E [rpc-clnt.c:370:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-17 06:48:35.610007 (xid=0x1)
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612197] E [MSGID: 106167] [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612211] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.612292] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.613432] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>> [2017-05-17 06:48:35.614317] E [MSGID: 106170] [glusterd-handshake.c:1051:gd_validate_mgmt_hndsk_req] 0-management: Request from peer 192.168.0.6:991 has an entry in peerinfo, but uuid does not match
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Apologies for the delay. My initial suspicion was correct. You
>>>>>>>>>>>>>>> have an incorrect UUID in the peer file, which is causing this.
>>>>>>>>>>>>>>> Can you please provide me the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Clicked the send button accidentally!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you please send me the contents of /var/lib/glusterd and the
>>>>>>>>>>>>>> glusterd log from all the nodes?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, May 15, 2017 at 10:31 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, 15 May 2017 at 11:58, Pawan Alwandi <pawan at platform.sh> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Atin,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I see the error below. Do I need to upgrade gluster on all
>>>>>>>>>>>>>>>>>> 3 hosts for this to work? Right now I have host 1 running
>>>>>>>>>>>>>>>>>> 3.10.1 and hosts 2 & 3 running 3.6.2.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> # gluster v set all cluster.op-version 31001
>>>>>>>>>>>>>>>>>> volume set: failed: Required op_version (31001) is not supported
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, you should, given that version 3.6 is EOL.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <amukherj at redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> All right, I see that you haven't bumped up the op-version.
>>>>>>>>>>>>>>>>>>>> Can you please execute:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> gluster v set all cluster.op-version 30101
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> and then restart glusterd on all the nodes and check the brick status?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> s/30101/31001
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <pawan at platform.sh> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hello Atin,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks for looking at this. Below is the output you requested.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Again, I'm seeing those errors after upgrading gluster on host 1.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Host 1
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>>> glusterfs 3.10.1
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Host 2
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.7
>>>>>>>>>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Host 3
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633
>>>>>>>>>>>>>>>>>>>>> operating-version=30600
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.5
>>>>>>>>>>>>>>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>>>>>>>>>>>>>>> state=3
>>>>>>>>>>>>>>>>>>>>> hostname1=192.168.0.6
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> # gluster --version
>>>>>>>>>>>>>>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I have already asked for the following earlier:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Can you please provide the output of the following from all the nodes:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> cat /var/lib/glusterd/glusterd.info
>>>>>>>>>>>>>>>>>>>>>> cat /var/lib/glusterd/peers/*
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, 13 May 2017 at 12:22, Pawan Alwandi <pawan at platform.sh> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hello folks,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Does anyone have any idea what's going on here?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi <pawan at platform.sh> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't see
>>>>>>>>>>>>>>>>>>>>>>>> the glusterfsd and glusterfs processes coming up.
>>>>>>>>>>>>>>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>>>>>>>>>>>>>>> is the process that I'm trying to follow.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> This is a 3 node server setup with a replicated volume having
>>>>>>>>>>>>>>>>>>>>>>>> replica count of 3.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Logs below:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030] [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid)
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478] [glusterd.c:1449:init] 0-management: Maximum allowed open file descriptors set to 65536
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479] [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working directory
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520426] W [MSGID: 103071] [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520452] W [MSGID: 103055] [rdma.c:4897:init] 0-rdma.management: Failed to initialize IB Device
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520465] W [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520518] W [rpcsvc.c:1661:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:03.520534] E [MSGID: 106243] [glusterd.c:1720:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.931764] I [MSGID: 106513] [glusterd-store.c:2197:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 30600
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.964354] I [MSGID: 106544] [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.993944] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995864] I [MSGID: 106498] [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995879] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.995903] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996325] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
>>>>>>>>>>>>>>>>>>>>>>>> Final graph:
>>>>>>>>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>>>>>>>>>   1: volume management
>>>>>>>>>>>>>>>>>>>>>>>>   2:     type mgmt/glusterd
>>>>>>>>>>>>>>>>>>>>>>>>   3:     option rpc-auth.auth-glusterfs on
>>>>>>>>>>>>>>>>>>>>>>>>   4:     option rpc-auth.auth-unix on
>>>>>>>>>>>>>>>>>>>>>>>>   5:     option rpc-auth.auth-null on
>>>>>>>>>>>>>>>>>>>>>>>>   6:     option rpc-auth-allow-insecure on
>>>>>>>>>>>>>>>>>>>>>>>>   7:     option transport.socket.listen-backlog 128
>>>>>>>>>>>>>>>>>>>>>>>>   8:     option event-threads 1
>>>>>>>>>>>>>>>>>>>>>>>>   9:     option ping-timeout 0
>>>>>>>>>>>>>>>>>>>>>>>>  10:     option transport.socket.read-fail-log off
>>>>>>>>>>>>>>>>>>>>>>>>  11:     option transport.socket.keepalive-interval 2
>>>>>>>>>>>>>>>>>>>>>>>>  12:     option transport.socket.keepalive-time 10
>>>>>>>>>>>>>>>>>>>>>>>>  13:     option transport-type rdma
>>>>>>>>>>>>>>>>>>>>>>>>  14:     option working-directory /var/lib/glusterd
>>>>>>>>>>>>>>>>>>>>>>>>  15: end-volume
>>>>>>>>>>>>>>>>>>>>>>>>  16:
>>>>>>>>>>>>>>>>>>>>>>>> +------------------------------------------------------------------------------+
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:04.996310] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.000461] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001493] W [socket.c:593:__socket_rwv] 0-management: readv on 192.168.0.7:24007 failed (No data available)
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001513] I [MSGID: 106004] [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001677] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.001696] W [MSGID: 106118] [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003099] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-10 09:07:05.000627 (xid=0x1)
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003129] E [MSGID: 106167] [glusterd-handshake.c:2181:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003251] W [socket.c:593:__socket_rwv] 0-management: readv on 192.168.0.6:24007 failed (No data available)
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003267] I [MSGID: 106004] [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003318] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) [0x7f0bf9d7dcf0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003329] W [MSGID: 106118] [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>>>>>>>>>>>>>>>>>>>>>> [2017-05-10 09:07:05.003457] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21] (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called at 2017-05-10 09:07:05.001407 (xid=0x1)
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> There are a bunch of errors reported, but I'm not sure which are
>>>>>>>>>>>>>>>>>>>>>>>> signal and which are noise. Does anyone have any idea what's
>>>>>>>>>>>>>>>>>>>>>>>> going on here?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>> Pawan
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> - Atin (atinm)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>> --
>>> - Atin (atinm)
>>>
>>
>>
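The cross-check discussed in this thread (each node's UUID in glusterd.info against the uuid= entries that the other nodes keep under /var/lib/glusterd/peers/) can be sketched in Python. This is only a minimal sketch under stated assumptions, not a gluster tool: the function names are hypothetical helpers, the file contents are pasted in as strings from the outputs quoted above, and the mapping host 1 = 192.168.0.5, host 2 = 192.168.0.6, host 3 = 192.168.0.7 is inferred from the peer files rather than stated by the poster.

```python
# Sketch: cross-check glusterd UUIDs between nodes.
# All function names are hypothetical helpers, not gluster APIs.


def parse_info(text):
    """Return the key=value pairs of a glusterd.info file as a dict."""
    pairs = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            pairs[key] = value
    return pairs


def parse_peers(text):
    """Return {hostname: uuid} from concatenated peer files.

    Assumes every record has a uuid= line followed by a hostname1= line,
    as in the `cat /var/lib/glusterd/peers/*` output quoted in the thread.
    """
    peers = {}
    uuid = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if key == "uuid" and sep:
            uuid = value
        elif key == "hostname1" and sep and uuid is not None:
            peers[value] = uuid
            uuid = None
    return peers


def find_mismatches(nodes):
    """nodes maps address -> (glusterd.info text, peers text)."""
    own = {addr: parse_info(info).get("UUID") for addr, (info, _) in nodes.items()}
    problems = []
    for addr, (_, peers_text) in sorted(nodes.items()):
        for peer, recorded in sorted(parse_peers(peers_text).items()):
            actual = own.get(peer)
            if actual != recorded:
                problems.append(
                    f"{addr} records {recorded} for {peer}, but {peer} reports {actual}"
                )
    return problems


# File contents copied from the 14 May outputs; the address of each host is inferred.
NODES = {
    "192.168.0.5": (
        "UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\noperating-version=30600\n",
        "uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\nstate=3\nhostname1=192.168.0.7\n"
        "uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95\nstate=3\nhostname1=192.168.0.6\n",
    ),
    "192.168.0.6": (
        "UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95\noperating-version=30600\n",
        "uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\nstate=3\nhostname1=192.168.0.7\n"
        "uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\nstate=3\nhostname1=192.168.0.5\n",
    ),
    "192.168.0.7": (
        "UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\noperating-version=30600\n",
        "uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\nstate=3\nhostname1=192.168.0.5\n"
        "uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95\nstate=3\nhostname1=192.168.0.6\n",
    ),
}

for message in find_mismatches(NODES) or ["no UUID mismatch found"]:
    print(message)
```

Fed the three outputs quoted above, the check finds no mismatch; a stale or wrong entry in any peer file would show up as one line per offending record. To use it against live nodes, the strings would be replaced with the current contents of the files on each host.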
