Nithya Balachandran
2019-Jan-04 10:41 UTC
[Gluster-users] update to 4.1.6-1 and fix-layout failing
On Fri, 4 Jan 2019 at 15:48, mohammad kashif <kashif.alig at gmail.com> wrote:

> Hi
>
> I have updated our distributed gluster storage from 3.12.9-1 to 4.1.6-1.
> The existing cluster had seven servers totalling around 450 TB. The OS is
> CentOS 7. The update went OK and I could access files.
> I then added two more servers of 90 TB each to the cluster and started
> fix-layout:
>
> gluster volume rebalance atlasglust fix-layout start
>
> Some directories were created on the new servers, but creation then
> stopped, although the rebalance status showed it was still running. I
> think it stopped creating new directories after this error:
>
> E [MSGID: 106061]
> [glusterd-utils.c:10697:glusterd_volume_rebalance_use_rsp_dict]
> 0-glusterd: failed to get index
> The message "E [MSGID: 106061]
> [glusterd-utils.c:10697:glusterd_volume_rebalance_use_rsp_dict]
> 0-glusterd: failed to get index" repeated 7 times between
> [2019-01-03 13:16:31.146779] and [2019-01-03 13:16:31.158612]
>
> There are also many warnings like this:
>
> [2019-01-03 16:04:34.120777] I [MSGID: 106499]
> [glusterd-handler.c:4314:__glusterd_handle_status_volume] 0-management:
> Received status volume req for volume atlasglust
> [2019-01-03 17:04:28.541805] W [rpc-clnt.c:1753:rpc_clnt_submit]
> 0-management: error returned while attempting to connect to host:(null),
> port:0

These are the glusterd logs. Do you see any errors in the rebalance logs
for this volume?

> I waited for around 12 hours, then stopped fix-layout and started it
> again. I can see the same error again:
>
> [2019-01-04 09:59:20.825930] E [MSGID: 106061]
> [glusterd-utils.c:10697:glusterd_volume_rebalance_use_rsp_dict]
> 0-glusterd: failed to get index
> The message "E [MSGID: 106061]
> [glusterd-utils.c:10697:glusterd_volume_rebalance_use_rsp_dict]
> 0-glusterd: failed to get index" repeated 7 times between
> [2019-01-04 09:59:20.825930] and [2019-01-04 09:59:20.837068]
>
> Please advise, as this is our production service.
>
> At the moment, I have stopped clients from using the file system. Would
> it be OK to let clients access the file system while fix-layout is still
> running?
>
> Thanks
>
> Kashif
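The rebalance process writes its own log on each node, separate from
glusterd.log. Assuming the default log location (the path below is the
usual default, not something confirmed from your setup), a quick way to
scan it for errors would be:

    # On each server; <volname>-rebalance.log is the rebalance
    # daemon's own log, distinct from glusterd.log
    less /var/log/glusterfs/atlasglust-rebalance.log

    # Show only recent error-level (" E ") entries
    grep ' E ' /var/log/glusterfs/atlasglust-rebalance.log | tail -n 50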
mohammad kashif
2019-Jan-04 11:40 UTC
[Gluster-users] update to 4.1.6-1 and fix-layout failing
Hi Nithya

The rebalance log has only these warnings:

[2019-01-04 09:59:20.826261] W [rpc-clnt.c:1753:rpc_clnt_submit]
0-atlasglust-client-5: error returned while attempting to connect to
host:(null), port:0
[2019-01-04 09:59:20.828113] W [rpc-clnt.c:1753:rpc_clnt_submit]
0-atlasglust-client-6: error returned while attempting to connect to
host:(null), port:0
[2019-01-04 09:59:20.832017] W [rpc-clnt.c:1753:rpc_clnt_submit]
0-atlasglust-client-4: error returned while attempting to connect to
host:(null), port:0

gluster volume rebalance atlasglust status

Node                              status                   run time in h:m:s
---------                         -----------              ------------
localhost                         fix-layout in progress   1:0:59
pplxgluster02.physics.ox.ac.uk    fix-layout in progress   1:0:59
pplxgluster03.physics.ox.ac.uk    fix-layout in progress   1:0:59
pplxgluster04.physics.ox.ac.uk    fix-layout in progress   1:0:59
pplxgluster05.physics.ox.ac.uk    fix-layout in progress   1:0:59
pplxgluster06.physics.ox.ac.uk    fix-layout in progress   1:0:59
pplxgluster07.physics.ox.ac.uk    fix-layout in progress   1:0:59
pplxgluster08.physics.ox.ac.uk    fix-layout in progress   1:0:59
pplxgluster09.physics.ox.ac.uk    fix-layout in progress   1:0:59

But there has been no new entry in the logs for the last hour, and I
can't see any new directories being created.

Thanks

Kashif
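P.S. In case it helps with diagnosis, this is how I plan to check on the
bricks directly whether fix-layout is still making progress. It is only a
sketch, and /brick/path below is a placeholder for our actual brick
directories:

    # On one of the new servers: count directories on the brick, then
    # rerun after a few minutes to see whether the number still grows
    find /brick/path -type d | wc -l

    # A directory whose layout has been fixed carries the DHT layout
    # xattr on the brick
    getfattr -n trusted.glusterfs.dht -e hex /brick/path/some/dir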