Pawan Alwandi
2017-May-19 06:39 UTC
[Gluster-users] Failure while upgrading gluster to 3.10.1
Hello Atin,

Thanks for continued support. I've attached requested files from all 3 nodes.

(I think we already verified the UUIDs to be correct; anyway, let us know if you find any more info in the logs.)

Pawan

On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>
> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <amukherj at redhat.com> wrote:
>
>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <pawan at platform.sh> wrote:
>>
>>> Hello Atin,
>>>
>>> I realized that these
>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>> instructions only work for upgrades from 3.7, while we are running 3.6.2.
>>> Are there instructions/suggestions you have for us to upgrade from 3.6?
>>>
>>> I believe an upgrade from 3.6 to 3.7 and then to 3.10 would work, but I see
>>> similar errors reported when I upgraded to 3.7 too.
>>>
>>> For what it's worth, I was able to set the op-version (gluster v set all
>>> cluster.op-version 30702) but that doesn't seem to help.
>>> >>> [2017-05-17 06:48:33.700014] I [MSGID: 100030] [glusterfsd.c:2338:main] >>> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.20 >>> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid) >>> [2017-05-17 06:48:33.703808] I [MSGID: 106478] [glusterd.c:1383:init] >>> 0-management: Maximum allowed open file descriptors set to 65536 >>> [2017-05-17 06:48:33.703836] I [MSGID: 106479] [glusterd.c:1432:init] >>> 0-management: Using /var/lib/glusterd as working directory >>> [2017-05-17 06:48:33.708866] W [MSGID: 103071] >>> [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event >>> channel creation failed [No such device] >>> [2017-05-17 06:48:33.709011] W [MSGID: 103055] [rdma.c:4901:init] >>> 0-rdma.management: Failed to initialize IB Device >>> [2017-05-17 06:48:33.709033] W [rpc-transport.c:359:rpc_transport_load] >>> 0-rpc-transport: 'rdma' initialization failed >>> [2017-05-17 06:48:33.709088] W [rpcsvc.c:1642:rpcsvc_create_listener] >>> 0-rpc-service: cannot create listener, initing the transport failed >>> [2017-05-17 06:48:33.709105] E [MSGID: 106243] [glusterd.c:1656:init] >>> 0-management: creation of 1 listeners failed, continuing with succeeded >>> transport >>> [2017-05-17 06:48:35.480043] I [MSGID: 106513] [glusterd-store.c:2068:glusterd_restore_op_version] >>> 0-glusterd: retrieved op-version: 30600 >>> [2017-05-17 06:48:35.605779] I [MSGID: 106498] [glusterd-handler.c:3640: >>> glusterd_friend_add_from_peerinfo] 0-management: connect returned 0 >>> [2017-05-17 06:48:35.607059] I [rpc-clnt.c:1046:rpc_clnt_connection_init] >>> 0-management: setting frame-timeout to 600 >>> [2017-05-17 06:48:35.607670] I [rpc-clnt.c:1046:rpc_clnt_connection_init] >>> 0-management: setting frame-timeout to 600 >>> [2017-05-17 06:48:35.607025] I [MSGID: 106498] [glusterd-handler.c:3640: >>> glusterd_friend_add_from_peerinfo] 0-management: connect returned 0 >>> [2017-05-17 06:48:35.608125] I [MSGID: 106544] >>> 
[glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID: >>> 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>> >> >>> Final graph: >>> +----------------------------------------------------------- >>> -------------------+ >>> 1: volume management >>> 2: type mgmt/glusterd >>> 3: option rpc-auth.auth-glusterfs on >>> 4: option rpc-auth.auth-unix on >>> 5: option rpc-auth.auth-null on >>> 6: option rpc-auth-allow-insecure on >>> 7: option transport.socket.listen-backlog 128 >>> 8: option event-threads 1 >>> 9: option ping-timeout 0 >>> 10: option transport.socket.read-fail-log off >>> 11: option transport.socket.keepalive-interval 2 >>> 12: option transport.socket.keepalive-time 10 >>> 13: option transport-type rdma >>> 14: option working-directory /var/lib/glusterd >>> 15: end-volume >>> 16: >>> +----------------------------------------------------------- >>> -------------------+ >>> [2017-05-17 06:48:35.609868] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] >>> 0-epoll: Started thread with index 1 >>> [2017-05-17 06:48:35.610839] W [socket.c:596:__socket_rwv] 0-management: >>> readv on 192.168.0.7:24007 failed (No data available) >>> [2017-05-17 06:48:35.611907] E [rpc-clnt.c:370:saved_frames_unwind] >>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_ >>> callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/ >>> libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> >>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] >>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_ >>> connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/ >>> libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) >>> 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called >>> at 2017-05-17 06:48:35.609965 (xid=0x1) >>> [2017-05-17 06:48:35.611928] E [MSGID: 106167] >>> [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] >>> 0-management: Error through RPC 
layer, retry again later >>> [2017-05-17 06:48:35.611944] I [MSGID: 106004] >>> [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer >>> <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer >>> in Cluster>, has disconnected from glusterd. >>> [2017-05-17 06:48:35.612024] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/ >>> glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/ >>> glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/ >>> glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) >>> 0-management: Lock for vol shared not held >>> [2017-05-17 06:48:35.612039] W [MSGID: 106118] >>> [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock >>> not released for shared >>> [2017-05-17 06:48:35.612079] W [socket.c:596:__socket_rwv] 0-management: >>> readv on 192.168.0.6:24007 failed (No data available) >>> [2017-05-17 06:48:35.612179] E [rpc-clnt.c:370:saved_frames_unwind] >>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_ >>> callingfn+0x1a3)[0x7fd6c2d70bb3] (--> /usr/lib/x86_64-linux-gnu/ >>> libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] (--> >>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] >>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_ >>> connection_cleanup+0x89)[0x7fd6c2b3ba39] (--> /usr/lib/x86_64-linux-gnu/ >>> libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] ))))) >>> 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called >>> at 2017-05-17 06:48:35.610007 (xid=0x1) >>> [2017-05-17 06:48:35.612197] E [MSGID: 106167] >>> [glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] >>> 0-management: Error through RPC layer, retry again later >>> [2017-05-17 06:48:35.612211] I [MSGID: 106004] >>> 
[glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>> [2017-05-17 06:48:35.612292] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>> [2017-05-17 06:48:35.613432] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>> [2017-05-17 06:48:35.614317] E [MSGID: 106170] [glusterd-handshake.c:1051:gd_validate_mgmt_hndsk_req] 0-management: Request from peer 192.168.0.6:991 has an entry in peerinfo, but uuid does not match
>>>
>>
>> Apologies for the delay. My initial suspicion was correct. You have an
>> incorrect UUID in the peer file which is causing this. Can you please
>> provide me the
>>
>
> Clicked the send button accidentally!
>
> Can you please send me the content of /var/lib/glusterd & glusterd log
> from all the nodes?
>
>>>
>>> On Mon, May 15, 2017 at 10:31 PM, Atin Mukherjee <amukherj at redhat.com>
>>> wrote:
>>>
>>>> On Mon, 15 May 2017 at 11:58, Pawan Alwandi <pawan at platform.sh> wrote:
>>>>
>>>>> Hi Atin,
>>>>>
>>>>> I see the error below. Do I require gluster to be upgraded on all 3 hosts
>>>>> for this to work? Right now I have host 1 running 3.10.1 and hosts 2 & 3
>>>>> running 3.6.2.
>>>>>
>>>>> # gluster v set all cluster.op-version 31001
>>>>> volume set: failed: Required op_version (31001) is not supported
>>>>>
>>>>
>>>> Yes, you should, given that version 3.6 is EOL.
>>>> >>>>> >>>>> >>>>> >>>>> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <amukherj at redhat.com> >>>>> wrote: >>>>> >>>>>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <amukherj at redhat.com> >>>>>> wrote: >>>>>> >>>>>>> Allright, I see that you haven't bumped up the op-version. Can you >>>>>>> please execute: >>>>>>> >>>>>>> gluster v set all cluster.op-version 30101 and then restart >>>>>>> glusterd on all the nodes and check the brick status? >>>>>>> >>>>>> >>>>>> s/30101/31001 >>>>>> >>>>>> >>>>>>> >>>>>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <pawan at platform.sh> >>>>>>> wrote: >>>>>>> >>>>>>>> Hello Atin, >>>>>>>> >>>>>>>> Thanks for looking at this. Below is the output you requested for. >>>>>>>> >>>>>>>> Again, I'm seeing those errors after upgrading gluster on host 1. >>>>>>>> >>>>>>>> Host 1 >>>>>>>> >>>>>>>> # cat /var/lib/glusterd/glusterd.info >>>>>>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>>>>>>> operating-version=30600 >>>>>>>> >>>>>>>> # cat /var/lib/glusterd/peers/* >>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633 >>>>>>>> state=3 >>>>>>>> hostname1=192.168.0.7 >>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95 >>>>>>>> state=3 >>>>>>>> hostname1=192.168.0.6 >>>>>>>> >>>>>>>> # gluster --version >>>>>>>> glusterfs 3.10.1 >>>>>>>> >>>>>>>> Host 2 >>>>>>>> >>>>>>>> # cat /var/lib/glusterd/glusterd.info >>>>>>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95 >>>>>>>> operating-version=30600 >>>>>>>> >>>>>>>> # cat /var/lib/glusterd/peers/* >>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633 >>>>>>>> state=3 >>>>>>>> hostname1=192.168.0.7 >>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>>>>>>> state=3 >>>>>>>> hostname1=192.168.0.5 >>>>>>>> >>>>>>>> # gluster --version >>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44 >>>>>>>> >>>>>>>> Host 3 >>>>>>>> >>>>>>>> # cat /var/lib/glusterd/glusterd.info >>>>>>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633 >>>>>>>> operating-version=30600 >>>>>>>> >>>>>>>> # cat 
/var/lib/glusterd/peers/*
>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073
>>>>>>>> state=3
>>>>>>>> hostname1=192.168.0.5
>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95
>>>>>>>> state=3
>>>>>>>> hostname1=192.168.0.6
>>>>>>>>
>>>>>>>> # gluster --version
>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44
>>>>>>>>
>>>>>>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>>>>>>>
>>>>>>>>> I have already asked for the following earlier:
>>>>>>>>>
>>>>>>>>> Can you please provide the output of the following from all the nodes:
>>>>>>>>>
>>>>>>>>> cat /var/lib/glusterd/glusterd.info
>>>>>>>>> cat /var/lib/glusterd/peers/*
>>>>>>>>>
>>>>>>>>> On Sat, 13 May 2017 at 12:22, Pawan Alwandi <pawan at platform.sh>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello folks,
>>>>>>>>>>
>>>>>>>>>> Does anyone have any idea what's going on here?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Pawan
>>>>>>>>>>
>>>>>>>>>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi <pawan at platform.sh> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't see
>>>>>>>>>>> the glusterfsd and glusterfs processes coming up.
>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>>>>>>>>> is the process that I'm trying to follow.
>>>>>>>>>>>
>>>>>>>>>>> This is a 3 node server setup with a replicated volume having
>>>>>>>>>>> replica count of 3.
>>>>>>>>>>> >>>>>>>>>>> Logs below: >>>>>>>>>>> >>>>>>>>>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030] >>>>>>>>>>> [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running >>>>>>>>>>> /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p >>>>>>>>>>> /var/run/glusterd.pid) >>>>>>>>>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478] >>>>>>>>>>> [glusterd.c:1449:init] 0-management: Maximum allowed open file descriptors >>>>>>>>>>> set to 65536 >>>>>>>>>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479] >>>>>>>>>>> [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working >>>>>>>>>>> directory >>>>>>>>>>> [2017-05-10 09:07:03.520426] W [MSGID: 103071] >>>>>>>>>>> [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: >>>>>>>>>>> rdma_cm event channel creation failed [No such device] >>>>>>>>>>> [2017-05-10 09:07:03.520452] W [MSGID: 103055] >>>>>>>>>>> [rdma.c:4897:init] 0-rdma.management: Failed to initialize IB Device >>>>>>>>>>> [2017-05-10 09:07:03.520465] W [rpc-transport.c:350:rpc_transport_load] >>>>>>>>>>> 0-rpc-transport: 'rdma' initialization failed >>>>>>>>>>> [2017-05-10 09:07:03.520518] W [rpcsvc.c:1661:rpcsvc_create_listener] >>>>>>>>>>> 0-rpc-service: cannot create listener, initing the transport failed >>>>>>>>>>> [2017-05-10 09:07:03.520534] E [MSGID: 106243] >>>>>>>>>>> [glusterd.c:1720:init] 0-management: creation of 1 listeners failed, >>>>>>>>>>> continuing with succeeded transport >>>>>>>>>>> [2017-05-10 09:07:04.931764] I [MSGID: 106513] >>>>>>>>>>> [glusterd-store.c:2197:glusterd_restore_op_version] 0-glusterd: >>>>>>>>>>> retrieved op-version: 30600 >>>>>>>>>>> [2017-05-10 09:07:04.964354] I [MSGID: 106544] >>>>>>>>>>> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved >>>>>>>>>>> UUID: 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>>>>>>>>>> [2017-05-10 09:07:04.993944] I [MSGID: 106498] >>>>>>>>>>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] >>>>>>>>>>> 0-management: connect returned 0 
>>>>>>>>>>> [2017-05-10 09:07:04.995864] I [MSGID: 106498] >>>>>>>>>>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] >>>>>>>>>>> 0-management: connect returned 0 >>>>>>>>>>> [2017-05-10 09:07:04.995879] W [MSGID: 106062] >>>>>>>>>>> [glusterd-handler.c:3466:glusterd_transport_inet_options_build] >>>>>>>>>>> 0-glusterd: Failed to get tcp-user-timeout >>>>>>>>>>> [2017-05-10 09:07:04.995903] I [rpc-clnt.c:1059:rpc_clnt_connection_init] >>>>>>>>>>> 0-management: setting frame-timeout to 600 >>>>>>>>>>> [2017-05-10 09:07:04.996325] I [rpc-clnt.c:1059:rpc_clnt_connection_init] >>>>>>>>>>> 0-management: setting frame-timeout to 600 >>>>>>>>>>> Final graph: >>>>>>>>>>> +----------------------------------------------------------- >>>>>>>>>>> -------------------+ >>>>>>>>>>> 1: volume management >>>>>>>>>>> 2: type mgmt/glusterd >>>>>>>>>>> 3: option rpc-auth.auth-glusterfs on >>>>>>>>>>> 4: option rpc-auth.auth-unix on >>>>>>>>>>> 5: option rpc-auth.auth-null on >>>>>>>>>>> 6: option rpc-auth-allow-insecure on >>>>>>>>>>> 7: option transport.socket.listen-backlog 128 >>>>>>>>>>> 8: option event-threads 1 >>>>>>>>>>> 9: option ping-timeout 0 >>>>>>>>>>> 10: option transport.socket.read-fail-log off >>>>>>>>>>> 11: option transport.socket.keepalive-interval 2 >>>>>>>>>>> 12: option transport.socket.keepalive-time 10 >>>>>>>>>>> 13: option transport-type rdma >>>>>>>>>>> 14: option working-directory /var/lib/glusterd >>>>>>>>>>> 15: end-volume >>>>>>>>>>> 16: >>>>>>>>>>> +----------------------------------------------------------- >>>>>>>>>>> -------------------+ >>>>>>>>>>> [2017-05-10 09:07:04.996310] W [MSGID: 106062] >>>>>>>>>>> [glusterd-handler.c:3466:glusterd_transport_inet_options_build] >>>>>>>>>>> 0-glusterd: Failed to get tcp-user-timeout >>>>>>>>>>> [2017-05-10 09:07:05.000461] I [MSGID: 101190] >>>>>>>>>>> [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: >>>>>>>>>>> Started thread with index 1 >>>>>>>>>>> [2017-05-10 09:07:05.001493] W 
[socket.c:593:__socket_rwv] >>>>>>>>>>> 0-management: readv on 192.168.0.7:24007 failed (No data >>>>>>>>>>> available) >>>>>>>>>>> [2017-05-10 09:07:05.001513] I [MSGID: 106004] >>>>>>>>>>> [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] >>>>>>>>>>> 0-management: Peer <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), >>>>>>>>>>> in state <Peer in Cluster>, h >>>>>>>>>>> as disconnected from glusterd. >>>>>>>>>>> [2017-05-10 09:07:05.001677] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] >>>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) >>>>>>>>>>> [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu >>>>>>>>>>> /glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) >>>>>>>>>>> [0x7f0bf9d7dcf0] -->/usr/lib/x86_64-linux-gnu/ >>>>>>>>>>> glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) >>>>>>>>>>> [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared no >>>>>>>>>>> t held >>>>>>>>>>> [2017-05-10 09:07:05.001696] W [MSGID: 106118] >>>>>>>>>>> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] >>>>>>>>>>> 0-management: Lock not released for shared >>>>>>>>>>> [2017-05-10 09:07:05.003099] E [rpc-clnt.c:365:saved_frames_unwind] >>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_ >>>>>>>>>>> callingfn+0x13c)[0x7f0bfeeca73c] (--> /usr/lib/x86_64-linux-gnu/ >>>>>>>>>>> libgfrpc.so.0(s >>>>>>>>>>> aved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> >>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] >>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_ >>>>>>>>>>> connection_cleanup+0x >>>>>>>>>>> 91)[0x7f0bfec91c21] (--> /usr/lib/x86_64-linux-gnu/ >>>>>>>>>>> libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] ))))) >>>>>>>>>>> 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) called >>>>>>>>>>> at 2017-05-10 09:0 >>>>>>>>>>> 7:05.000627 (xid=0x1) >>>>>>>>>>> [2017-05-10 09:07:05.003129] E [MSGID: 106167] >>>>>>>>>>> 
[glusterd-handshake.c:2181:__glusterd_peer_dump_version_cbk] >>>>>>>>>>> 0-management: Error through RPC layer, retry again later >>>>>>>>>>> [2017-05-10 09:07:05.003251] W [socket.c:593:__socket_rwv] >>>>>>>>>>> 0-management: readv on 192.168.0.6:24007 failed (No data >>>>>>>>>>> available) >>>>>>>>>>> [2017-05-10 09:07:05.003267] I [MSGID: 106004] >>>>>>>>>>> [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] >>>>>>>>>>> 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), >>>>>>>>>>> in state <Peer in Cluster>, h >>>>>>>>>>> as disconnected from glusterd. >>>>>>>>>>> [2017-05-10 09:07:05.003318] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] >>>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) >>>>>>>>>>> [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu >>>>>>>>>>> /glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) >>>>>>>>>>> [0x7f0bf9d7dcf0] -->/usr/lib/x86_64-linux-gnu/ >>>>>>>>>>> glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) >>>>>>>>>>> [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared no >>>>>>>>>>> t held >>>>>>>>>>> [2017-05-10 09:07:05.003329] W [MSGID: 106118] >>>>>>>>>>> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] >>>>>>>>>>> 0-management: Lock not released for shared >>>>>>>>>>> [2017-05-10 09:07:05.003457] E [rpc-clnt.c:365:saved_frames_unwind] >>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_ >>>>>>>>>>> callingfn+0x13c)[0x7f0bfeeca73c] (--> /usr/lib/x86_64-linux-gnu/ >>>>>>>>>>> libgfrpc.so.0(s >>>>>>>>>>> aved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> >>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] >>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_ >>>>>>>>>>> connection_cleanup+0x >>>>>>>>>>> 91)[0x7f0bfec91c21] (--> /usr/lib/x86_64-linux-gnu/ >>>>>>>>>>> libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] ))))) >>>>>>>>>>> 0-management: forced unwinding frame type(GLUSTERD-DUMP) 
op(DUMP(1)) called at 2017-05-10 09:07:05.001407 (xid=0x1)
>>>>>>>>>>>
>>>>>>>>>>> There are a bunch of errors reported but I'm not sure which are
>>>>>>>>>>> signal and which are noise. Does anyone have any idea what's going on
>>>>>>>>>>> here?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Pawan
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> - Atin (atinm)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: host1.tar.gz
Type: application/x-gzip
Size: 17175 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20170519/683efd8f/attachment.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: host2.tar.gz
Type: application/x-gzip
Size: 9542 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20170519/683efd8f/attachment-0001.gz>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: host3.tar.gz
Type: application/x-gzip
Size: 9625 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20170519/683efd8f/attachment-0002.gz>
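
Editor's sketch: the UUID question raised in this thread ("has an entry in peerinfo, but uuid does not match") can be checked by hand by comparing each node's own UUID in glusterd.info with the uuid recorded for that node in the other nodes' peers/* files. The Python below is only an illustration of that comparison using the outputs pasted earlier in the thread. It is not how glusterd validates the handshake. The function names are hypothetical, and the mapping of Host 1/2/3 to 192.168.0.5/.6/.7 is inferred from the pasted peer files rather than stated by the poster. It also assumes peers are recorded by IP address (hostname1=...), as they are here.

```python
# Minimal sketch: cross-check glusterd.info UUIDs against peers/* entries.
# All names below are illustrative; none of this is a gluster API.

def parse_kv(text):
    """Parse 'key=value' lines into a list of (key, value) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if line and "=" in line:
            key, value = line.split("=", 1)
            pairs.append((key, value))
    return pairs


def parse_peers(text):
    """Split the output of `cat peers/*` into records.

    The concatenated output has no separators, so a new record is
    started at every 'uuid=' line.
    """
    records = []
    for key, value in parse_kv(text):
        if key == "uuid":
            records.append({})
        if records:
            records[-1][key] = value
    return records


def find_mismatches(nodes):
    """Return a list of problems; an empty list means the files agree.

    `nodes` maps a node address to {'info': <glusterd.info text>,
    'peers': <concatenated peers/* text>}.
    """
    own_uuid = {
        addr: dict(parse_kv(node["info"])).get("UUID")
        for addr, node in nodes.items()
    }
    problems = []
    for addr, node in nodes.items():
        for peer in parse_peers(node["peers"]):
            peer_addr = peer.get("hostname1")
            expected = own_uuid.get(peer_addr)
            if expected is None:
                problems.append(f"{addr}: peer {peer_addr} is not a known node")
            elif expected != peer.get("uuid"):
                problems.append(
                    f"{addr}: peer {peer_addr} recorded as {peer.get('uuid')}, "
                    f"but that node reports {expected}"
                )
    return problems


# Outputs as pasted in the thread (addresses per host are inferred).
NODES = {
    "192.168.0.5": {  # Host 1
        "info": "UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\noperating-version=30600\n",
        "peers": (
            "uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\nstate=3\nhostname1=192.168.0.7\n"
            "uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95\nstate=3\nhostname1=192.168.0.6\n"
        ),
    },
    "192.168.0.6": {  # Host 2
        "info": "UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95\noperating-version=30600\n",
        "peers": (
            "uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\nstate=3\nhostname1=192.168.0.7\n"
            "uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\nstate=3\nhostname1=192.168.0.5\n"
        ),
    },
    "192.168.0.7": {  # Host 3
        "info": "UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633\noperating-version=30600\n",
        "peers": (
            "uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073\nstate=3\nhostname1=192.168.0.5\n"
            "uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95\nstate=3\nhostname1=192.168.0.6\n"
        ),
    },
}

if __name__ == "__main__":
    for problem in find_mismatches(NODES):
        print(problem)
```

A check like this only covers the files as pasted on May 14; the contents of /var/lib/glusterd may have changed since, which is why the full directory and logs were requested.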
Atin Mukherjee
2017-May-21 02:38 UTC
[Gluster-users] Failure while upgrading gluster to 3.10.1
Hi Pawan,

I'll get back to you on this by Monday.

On Fri, 19 May 2017 at 12:09, Pawan Alwandi <pawan at platform.sh> wrote:
> Hello Atin,
>
> Thanks for continued support. I've attached requested files from all 3
> nodes.
>
> (I think we already verified the UUIDs to be correct; anyway, let us know
> if you find any more info in the logs.)
>
> Pawan
>
> On Thu, May 18, 2017 at 11:45 PM, Atin Mukherjee <amukherj at redhat.com>
> wrote:
>
>> On Thu, 18 May 2017 at 23:40, Atin Mukherjee <amukherj at redhat.com> wrote:
>>
>>> On Wed, 17 May 2017 at 12:47, Pawan Alwandi <pawan at platform.sh> wrote:
>>>
>>>> Hello Atin,
>>>>
>>>> I realized that these
>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/
>>>> instructions only work for upgrades from 3.7, while we are running 3.6.2.
>>>> Are there instructions/suggestions you have for us to upgrade from 3.6?
>>>>
>>>> I believe an upgrade from 3.6 to 3.7 and then to 3.10 would work, but I
>>>> see similar errors reported when I upgraded to 3.7 too.
>>>>
>>>> For what it's worth, I was able to set the op-version (gluster v set all
>>>> cluster.op-version 30702) but that doesn't seem to help.
>>>> >>>> [2017-05-17 06:48:33.700014] I [MSGID: 100030] [glusterfsd.c:2338:main] >>>> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.20 >>>> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid) >>>> [2017-05-17 06:48:33.703808] I [MSGID: 106478] [glusterd.c:1383:init] >>>> 0-management: Maximum allowed open file descriptors set to 65536 >>>> [2017-05-17 06:48:33.703836] I [MSGID: 106479] [glusterd.c:1432:init] >>>> 0-management: Using /var/lib/glusterd as working directory >>>> [2017-05-17 06:48:33.708866] W [MSGID: 103071] >>>> [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event >>>> channel creation failed [No such device] >>>> [2017-05-17 06:48:33.709011] W [MSGID: 103055] [rdma.c:4901:init] >>>> 0-rdma.management: Failed to initialize IB Device >>>> [2017-05-17 06:48:33.709033] W [rpc-transport.c:359:rpc_transport_load] >>>> 0-rpc-transport: 'rdma' initialization failed >>>> [2017-05-17 06:48:33.709088] W [rpcsvc.c:1642:rpcsvc_create_listener] >>>> 0-rpc-service: cannot create listener, initing the transport failed >>>> [2017-05-17 06:48:33.709105] E [MSGID: 106243] [glusterd.c:1656:init] >>>> 0-management: creation of 1 listeners failed, continuing with succeeded >>>> transport >>>> [2017-05-17 06:48:35.480043] I [MSGID: 106513] >>>> [glusterd-store.c:2068:glusterd_restore_op_version] 0-glusterd: retrieved >>>> op-version: 30600 >>>> [2017-05-17 06:48:35.605779] I [MSGID: 106498] >>>> [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: >>>> connect returned 0 >>>> [2017-05-17 06:48:35.607059] I >>>> [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting >>>> frame-timeout to 600 >>>> [2017-05-17 06:48:35.607670] I >>>> [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting >>>> frame-timeout to 600 >>>> [2017-05-17 06:48:35.607025] I [MSGID: 106498] >>>> [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo] 0-management: >>>> connect returned 0 >>>> [2017-05-17 
06:48:35.608125] I [MSGID: 106544] >>>> [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID: >>>> 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>>> >>> >>>> Final graph: >>>> >>>> +------------------------------------------------------------------------------+ >>>> 1: volume management >>>> 2: type mgmt/glusterd >>>> 3: option rpc-auth.auth-glusterfs on >>>> 4: option rpc-auth.auth-unix on >>>> 5: option rpc-auth.auth-null on >>>> 6: option rpc-auth-allow-insecure on >>>> 7: option transport.socket.listen-backlog 128 >>>> 8: option event-threads 1 >>>> 9: option ping-timeout 0 >>>> 10: option transport.socket.read-fail-log off >>>> 11: option transport.socket.keepalive-interval 2 >>>> 12: option transport.socket.keepalive-time 10 >>>> 13: option transport-type rdma >>>> 14: option working-directory /var/lib/glusterd >>>> 15: end-volume >>>> 16: >>>> >>>> +------------------------------------------------------------------------------+ >>>> [2017-05-17 06:48:35.609868] I [MSGID: 101190] >>>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread >>>> with index 1 >>>> [2017-05-17 06:48:35.610839] W [socket.c:596:__socket_rwv] >>>> 0-management: readv on 192.168.0.7:24007 failed (No data available) >>>> [2017-05-17 06:48:35.611907] E [rpc-clnt.c:370:saved_frames_unwind] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] >>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) >>>> called at 2017-05-17 06:48:35.609965 (xid=0x1) >>>> [2017-05-17 06:48:35.611928] E [MSGID: 106167] >>>> 
[glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: >>>> Error through RPC layer, retry again later >>>> [2017-05-17 06:48:35.611944] I [MSGID: 106004] >>>> [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer >>>> <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in >>>> Cluster>, has disconnected from glusterd. >>>> [2017-05-17 06:48:35.612024] W >>>> [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] >>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) >>>> [0x7fd6bdc4912b] >>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) >>>> [0x7fd6bdc52dd0] >>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) >>>> [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held >>>> [2017-05-17 06:48:35.612039] W [MSGID: 106118] >>>> [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not >>>> released for shared >>>> [2017-05-17 06:48:35.612079] W [socket.c:596:__socket_rwv] >>>> 0-management: readv on 192.168.0.6:24007 failed (No data available) >>>> [2017-05-17 06:48:35.612179] E [rpc-clnt.c:370:saved_frames_unwind] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x1a3)[0x7fd6c2d70bb3] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7fd6c2b3a2df] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fd6c2b3a3fe] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7fd6c2b3ba39] >>>> (--> >>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x160)[0x7fd6c2b3c380] >>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) >>>> called at 2017-05-17 06:48:35.610007 (xid=0x1) >>>> [2017-05-17 06:48:35.612197] E [MSGID: 106167] >>>> 
[glusterd-handshake.c:2091:__glusterd_peer_dump_version_cbk] 0-management: Error through RPC layer, retry again later
>>>> [2017-05-17 06:48:35.612211] I [MSGID: 106004] [glusterd-handler.c:5201:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in Cluster>, has disconnected from glusterd.
>>>> [2017-05-17 06:48:35.612292] W [glusterd-locks.c:681:glusterd_mgmt_v3_unlock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4b) [0x7fd6bdc4912b] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x160) [0x7fd6bdc52dd0] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.7.20/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x4c3) [0x7fd6bdcef1b3] ) 0-management: Lock for vol shared not held
>>>> [2017-05-17 06:48:35.613432] W [MSGID: 106118] [glusterd-handler.c:5223:__glusterd_peer_rpc_notify] 0-management: Lock not released for shared
>>>> [2017-05-17 06:48:35.614317] E [MSGID: 106170] [glusterd-handshake.c:1051:gd_validate_mgmt_hndsk_req] 0-management: Request from peer 192.168.0.6:991 has an entry in peerinfo, but uuid does not match
>>>>
>>>
>>> Apologies for the delay. My initial suspicion was correct. You have an
>>> incorrect UUID in the peer file which is causing this. Can you please
>>> provide me the
>>>
>>
>> Clicked the send button accidentally!
>>
>> Can you please send me the content of /var/lib/glusterd & glusterd log
>> from all the nodes?
>>
>>>>
>>>> On Mon, May 15, 2017 at 10:31 PM, Atin Mukherjee <amukherj at redhat.com>
>>>> wrote:
>>>>
>>>>> On Mon, 15 May 2017 at 11:58, Pawan Alwandi <pawan at platform.sh> wrote:
>>>>>
>>>>>> Hi Atin,
>>>>>>
>>>>>> I see the error below. Do I require gluster to be upgraded on all 3
>>>>>> hosts for this to work?
>>>>>> Right now I have host 1 running 3.10.1 and hosts 2 & 3 running 3.6.2.
>>>>>>
>>>>>> # gluster v set all cluster.op-version 31001
>>>>>> volume set: failed: Required op_version (31001) is not supported
>>>>>>
>>>>>
>>>>> Yes, you should, given that version 3.6 is EOL.
>>>>>
>>>>>>
>>>>>> On Mon, May 15, 2017 at 3:32 AM, Atin Mukherjee <amukherj at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Sun, 14 May 2017 at 21:43, Atin Mukherjee <amukherj at redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> All right, I see that you haven't bumped up the op-version. Can you
>>>>>>>> please execute:
>>>>>>>>
>>>>>>>> gluster v set all cluster.op-version 30101 and then restart
>>>>>>>> glusterd on all the nodes and check the brick status?
>>>>>>>>
>>>>>>>
>>>>>>> s/30101/31001
>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, May 14, 2017 at 8:55 PM, Pawan Alwandi <pawan at platform.sh>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Atin,
>>>>>>>>>
>>>>>>>>> Thanks for looking at this. Below is the output you requested.
>>>>>>>>>
>>>>>>>>> Again, I'm seeing those errors after upgrading gluster on host 1.
>>>>>>>>> >>>>>>>>> Host 1 >>>>>>>>> >>>>>>>>> # cat /var/lib/glusterd/glusterd.info >>>>>>>>> UUID=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>>>>>>>> operating-version=30600 >>>>>>>>> >>>>>>>>> # cat /var/lib/glusterd/peers/* >>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633 >>>>>>>>> state=3 >>>>>>>>> hostname1=192.168.0.7 >>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95 >>>>>>>>> state=3 >>>>>>>>> hostname1=192.168.0.6 >>>>>>>>> >>>>>>>>> # gluster --version >>>>>>>>> glusterfs 3.10.1 >>>>>>>>> >>>>>>>>> Host 2 >>>>>>>>> >>>>>>>>> # cat /var/lib/glusterd/glusterd.info >>>>>>>>> UUID=83e9a0b9-6bd5-483b-8516-d8928805ed95 >>>>>>>>> operating-version=30600 >>>>>>>>> >>>>>>>>> # cat /var/lib/glusterd/peers/* >>>>>>>>> uuid=5ec54b4f-f60c-48c6-9e55-95f2bb58f633 >>>>>>>>> state=3 >>>>>>>>> hostname1=192.168.0.7 >>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>>>>>>>> state=3 >>>>>>>>> hostname1=192.168.0.5 >>>>>>>>> >>>>>>>>> # gluster --version >>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44 >>>>>>>>> >>>>>>>>> Host 3 >>>>>>>>> >>>>>>>>> # cat /var/lib/glusterd/glusterd.info >>>>>>>>> UUID=5ec54b4f-f60c-48c6-9e55-95f2bb58f633 >>>>>>>>> operating-version=30600 >>>>>>>>> >>>>>>>>> # cat /var/lib/glusterd/peers/* >>>>>>>>> uuid=7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>>>>>>>> state=3 >>>>>>>>> hostname1=192.168.0.5 >>>>>>>>> uuid=83e9a0b9-6bd5-483b-8516-d8928805ed95 >>>>>>>>> state=3 >>>>>>>>> hostname1=192.168.0.6 >>>>>>>>> >>>>>>>>> # gluster --version >>>>>>>>> glusterfs 3.6.2 built on Jan 21 2015 14:23:44 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Sat, May 13, 2017 at 6:28 PM, Atin Mukherjee < >>>>>>>>> amukherj at redhat.com> wrote: >>>>>>>>> >>>>>>>>>> I have already asked for the following earlier: >>>>>>>>>> >>>>>>>>>> Can you please provide output of following from all the nodes: >>>>>>>>>> >>>>>>>>>> cat /var/lib/glusterd/glusterd.info >>>>>>>>>> cat /var/lib/glusterd/peers/* >>>>>>>>>> >>>>>>>>>> On Sat, 13 May 2017 at 12:22, Pawan Alwandi <pawan at 
platform.sh> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hello folks, >>>>>>>>>>> >>>>>>>>>>> Does anyone have any idea whats going on here? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Pawan >>>>>>>>>>> >>>>>>>>>>> On Wed, May 10, 2017 at 5:02 PM, Pawan Alwandi < >>>>>>>>>>> pawan at platform.sh> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hello, >>>>>>>>>>>> >>>>>>>>>>>> I'm trying to upgrade gluster from 3.6.2 to 3.10.1 but don't >>>>>>>>>>>> see the glusterfsd and glusterfs processes coming up. >>>>>>>>>>>> http://gluster.readthedocs.io/en/latest/Upgrade-Guide/upgrade_to_3.10/ >>>>>>>>>>>> is the process that I'm trying to follow. >>>>>>>>>>>> >>>>>>>>>>>> This is a 3 node server setup with a replicated volume having >>>>>>>>>>>> replica count of 3. >>>>>>>>>>>> >>>>>>>>>>>> Logs below: >>>>>>>>>>>> >>>>>>>>>>>> [2017-05-10 09:07:03.507959] I [MSGID: 100030] >>>>>>>>>>>> [glusterfsd.c:2460:main] 0-/usr/sbin/glusterd: Started running >>>>>>>>>>>> /usr/sbin/glusterd version 3.10.1 (args: /usr/sbin/glusterd -p >>>>>>>>>>>> /var/run/glusterd.pid) >>>>>>>>>>>> [2017-05-10 09:07:03.512827] I [MSGID: 106478] >>>>>>>>>>>> [glusterd.c:1449:init] 0-management: Maximum allowed open file descriptors >>>>>>>>>>>> set to 65536 >>>>>>>>>>>> [2017-05-10 09:07:03.512855] I [MSGID: 106479] >>>>>>>>>>>> [glusterd.c:1496:init] 0-management: Using /var/lib/glusterd as working >>>>>>>>>>>> directory >>>>>>>>>>>> [2017-05-10 09:07:03.520426] W [MSGID: 103071] >>>>>>>>>>>> [rdma.c:4590:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event >>>>>>>>>>>> channel creation failed [No such device] >>>>>>>>>>>> [2017-05-10 09:07:03.520452] W [MSGID: 103055] >>>>>>>>>>>> [rdma.c:4897:init] 0-rdma.management: Failed to initialize IB Device >>>>>>>>>>>> [2017-05-10 09:07:03.520465] W >>>>>>>>>>>> [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma' >>>>>>>>>>>> initialization failed >>>>>>>>>>>> [2017-05-10 09:07:03.520518] W >>>>>>>>>>>> [rpcsvc.c:1661:rpcsvc_create_listener] 0-rpc-service: 
cannot create >>>>>>>>>>>> listener, initing the transport failed >>>>>>>>>>>> [2017-05-10 09:07:03.520534] E [MSGID: 106243] >>>>>>>>>>>> [glusterd.c:1720:init] 0-management: creation of 1 listeners failed, >>>>>>>>>>>> continuing with succeeded transport >>>>>>>>>>>> [2017-05-10 09:07:04.931764] I [MSGID: 106513] >>>>>>>>>>>> [glusterd-store.c:2197:glusterd_restore_op_version] 0-glusterd: retrieved >>>>>>>>>>>> op-version: 30600 >>>>>>>>>>>> [2017-05-10 09:07:04.964354] I [MSGID: 106544] >>>>>>>>>>>> [glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID: >>>>>>>>>>>> 7f2a6e11-2a53-4ab4-9ceb-8be6a9f2d073 >>>>>>>>>>>> [2017-05-10 09:07:04.993944] I [MSGID: 106498] >>>>>>>>>>>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: >>>>>>>>>>>> connect returned 0 >>>>>>>>>>>> [2017-05-10 09:07:04.995864] I [MSGID: 106498] >>>>>>>>>>>> [glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management: >>>>>>>>>>>> connect returned 0 >>>>>>>>>>>> [2017-05-10 09:07:04.995879] W [MSGID: 106062] >>>>>>>>>>>> [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: >>>>>>>>>>>> Failed to get tcp-user-timeout >>>>>>>>>>>> [2017-05-10 09:07:04.995903] I >>>>>>>>>>>> [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting >>>>>>>>>>>> frame-timeout to 600 >>>>>>>>>>>> [2017-05-10 09:07:04.996325] I >>>>>>>>>>>> [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting >>>>>>>>>>>> frame-timeout to 600 >>>>>>>>>>>> Final graph: >>>>>>>>>>>> >>>>>>>>>>>> +------------------------------------------------------------------------------+ >>>>>>>>>>>> 1: volume management >>>>>>>>>>>> 2: type mgmt/glusterd >>>>>>>>>>>> 3: option rpc-auth.auth-glusterfs on >>>>>>>>>>>> 4: option rpc-auth.auth-unix on >>>>>>>>>>>> 5: option rpc-auth.auth-null on >>>>>>>>>>>> 6: option rpc-auth-allow-insecure on >>>>>>>>>>>> 7: option transport.socket.listen-backlog 128 >>>>>>>>>>>> 8: option event-threads 1 >>>>>>>>>>>> 9: option 
ping-timeout 0 >>>>>>>>>>>> 10: option transport.socket.read-fail-log off >>>>>>>>>>>> 11: option transport.socket.keepalive-interval 2 >>>>>>>>>>>> 12: option transport.socket.keepalive-time 10 >>>>>>>>>>>> 13: option transport-type rdma >>>>>>>>>>>> 14: option working-directory /var/lib/glusterd >>>>>>>>>>>> 15: end-volume >>>>>>>>>>>> 16: >>>>>>>>>>>> >>>>>>>>>>>> +------------------------------------------------------------------------------+ >>>>>>>>>>>> [2017-05-10 09:07:04.996310] W [MSGID: 106062] >>>>>>>>>>>> [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: >>>>>>>>>>>> Failed to get tcp-user-timeout >>>>>>>>>>>> [2017-05-10 09:07:05.000461] I [MSGID: 101190] >>>>>>>>>>>> [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread >>>>>>>>>>>> with index 1 >>>>>>>>>>>> [2017-05-10 09:07:05.001493] W [socket.c:593:__socket_rwv] >>>>>>>>>>>> 0-management: readv on 192.168.0.7:24007 failed (No data >>>>>>>>>>>> available) >>>>>>>>>>>> [2017-05-10 09:07:05.001513] I [MSGID: 106004] >>>>>>>>>>>> [glusterd-handler.c:5882:__glusterd_peer_rpc_notify] 0-management: Peer >>>>>>>>>>>> <192.168.0.7> (<5ec54b4f-f60c-48c6-9e55-95f2bb58f633>), in state <Peer in >>>>>>>>>>>> Cluster>, has disconnected from glusterd. 
>>>>>>>>>>>> [2017-05-10 09:07:05.001677] W >>>>>>>>>>>> [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] >>>>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) >>>>>>>>>>>> [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) >>>>>>>>>>>> [0x7f0bf9d7dcf0] >>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) >>>>>>>>>>>> [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held >>>>>>>>>>>> [2017-05-10 09:07:05.001696] W [MSGID: 106118] >>>>>>>>>>>> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] 0-management: Lock not >>>>>>>>>>>> released for shared >>>>>>>>>>>> [2017-05-10 09:07:05.003099] E >>>>>>>>>>>> [rpc-clnt.c:365:saved_frames_unwind] (--> >>>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c] >>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> >>>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] >>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21] (--> >>>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] >>>>>>>>>>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) >>>>>>>>>>>> called at 2017-05-10 09:07:05.000627 (xid=0x1) >>>>>>>>>>>> [2017-05-10 09:07:05.003129] E [MSGID: 106167] >>>>>>>>>>>> [glusterd-handshake.c:2181:__glusterd_peer_dump_version_cbk] 0-management: >>>>>>>>>>>> Error through RPC layer, retry again later >>>>>>>>>>>> [2017-05-10 09:07:05.003251] W [socket.c:593:__socket_rwv] >>>>>>>>>>>> 0-management: readv on 192.168.0.6:24007 failed (No data >>>>>>>>>>>> available) >>>>>>>>>>>> [2017-05-10 09:07:05.003267] I [MSGID: 106004] >>>>>>>>>>>> 
[glusterd-handler.c:5882:__glusterd_peer_rpc_notify] 0-management: Peer >>>>>>>>>>>> <192.168.0.6> (<83e9a0b9-6bd5-483b-8516-d8928805ed95>), in state <Peer in >>>>>>>>>>>> Cluster>, has disconnected from glusterd. >>>>>>>>>>>> [2017-05-10 09:07:05.003318] W >>>>>>>>>>>> [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] >>>>>>>>>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x20559) >>>>>>>>>>>> [0x7f0bf9d74559] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0x29cf0) >>>>>>>>>>>> [0x7f0bf9d7dcf0] >>>>>>>>>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.10.1/xlator/mgmt/glusterd.so(+0xd5ba3) >>>>>>>>>>>> [0x7f0bf9e29ba3] ) 0-management: Lock for vol shared not held >>>>>>>>>>>> [2017-05-10 09:07:05.003329] W [MSGID: 106118] >>>>>>>>>>>> [glusterd-handler.c:5907:__glusterd_peer_rpc_notify] 0-management: Lock not >>>>>>>>>>>> released for shared >>>>>>>>>>>> [2017-05-10 09:07:05.003457] E >>>>>>>>>>>> [rpc-clnt.c:365:saved_frames_unwind] (--> >>>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x13c)[0x7f0bfeeca73c] >>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_unwind+0x1cf)[0x7f0bfec904bf] (--> >>>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f0bfec905de] >>>>>>>>>>>> (--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x91)[0x7f0bfec91c21] (--> >>>>>>>>>>>> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x290)[0x7f0bfec92710] >>>>>>>>>>>> ))))) 0-management: forced unwinding frame type(GLUSTERD-DUMP) op(DUMP(1)) >>>>>>>>>>>> called at 2017-05-10 09:07:05.001407 (xid=0x1) >>>>>>>>>>>> >>>>>>>>>>>> There are a bunch of errors reported but I'm not sure which are >>>>>>>>>>>> signal and which are noise. Does anyone have any idea what's going on >>>>>>>>>>>> here? 
>>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Pawan >>>>>>>>>>>> >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>> Gluster-users at gluster.org >>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> - Atin (atinm)
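The UUID comparison discussed in this thread (each node's own UUID in glusterd.info against the uuid= entries the other nodes hold under peers/) can be sketched as a small shell function. This is only an illustration, not something posted in the thread: the function name check_peer_uuids and the directory layout (one sub-directory per node, named after the node's IP address, holding copies of glusterd.info and peers/) are assumptions made for the example. Only the file keys (UUID=, uuid=, hostname1=) are taken from the output shown above.

```shell
#!/bin/sh
# Sketch: cross-check peer UUIDs across nodes.
# Assumed (hypothetical) layout of the collected files:
#   <root>/<node-ip>/glusterd.info
#   <root>/<node-ip>/peers/<one file per peer>
check_peer_uuids() {
    root=$1
    status=0
    for peerfile in "$root"/*/peers/*; do
        # Skip the unexpanded glob when no peer files exist.
        [ -f "$peerfile" ] || continue
        peer_uuid=$(sed -n 's/^uuid=//p' "$peerfile")
        peer_host=$(sed -n 's/^hostname1=//p' "$peerfile")
        info="$root/$peer_host/glusterd.info"
        if [ ! -f "$info" ]; then
            echo "UNKNOWN $peerfile: no glusterd.info collected for $peer_host"
            status=1
            continue
        fi
        # The UUID the peer itself reports in its own glusterd.info.
        own_uuid=$(sed -n 's/^UUID=//p' "$info")
        if [ "$peer_uuid" != "$own_uuid" ]; then
            echo "MISMATCH $peerfile: has $peer_uuid, $peer_host reports $own_uuid"
            status=1
        fi
    done
    return $status
}
```

A call such as `check_peer_uuids ./collected && echo "peer UUIDs consistent"` prints one line per inconsistent entry and returns non-zero if any is found. It only compares the first hostname1= entry of each peer file, so it is a quick sanity check rather than a replacement for inspecting the full /var/lib/glusterd contents.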
Atin Mukherjee
2017-May-22 06:02 UTC
[Gluster-users] Failure while upgrading gluster to 3.10.1
Pawan, I see you have provided the log files from the nodes, however it'd be really helpful if you can provide me the content of /var/lib/glusterd from all the nodes to get to the root cause of this issue. On Fri, May 19, 2017 at 12:09 PM, Pawan Alwandi <pawan at platform.sh> wrote:> Hello Atin, > > Thanks for continued support. I've attached requested files from all 3 > nodes. > > (I think we already verified the UUIDs to be correct, anyway let us know > if you find any more info in the logs) > > Pawan
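One way to gather what is being asked for here, the content of /var/lib/glusterd from every node, is to pack the directory into one archive per node. The function below is a minimal sketch under stated assumptions: the name pack_glusterd_dir, the node label argument, and the output file naming are all hypothetical and chosen for the example. It relies only on standard tar, dirname, and basename.

```shell
#!/bin/sh
# Sketch: pack a glusterd working directory into a per-node tarball.
# Arguments (all chosen for this example):
#   $1 = working directory, e.g. /var/lib/glusterd
#   $2 = label identifying the node, e.g. host1
#   $3 = existing directory in which to write the archive
pack_glusterd_dir() {
    workdir=$1
    label=$2
    outdir=$3
    archive="$outdir/glusterd-$label.tar.gz"
    # -C keeps the paths inside the archive relative (glusterd/...).
    tar -czf "$archive" -C "$(dirname "$workdir")" "$(basename "$workdir")" || return 1
    # Print the archive path so the caller can copy or attach it.
    echo "$archive"
}
```

Run as root on each node, for example `pack_glusterd_dir /var/lib/glusterd host1 /tmp`, and it prints the path of the archive it wrote. It is worth reviewing the archive before sending it to a public list, since the directory holds cluster configuration you may not want to publish.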