Serkan Çoban
2016-Mar-30 10:33 UTC
[Gluster-users] Gluster management commands give errors
Any workarounds other than not running these commands?

On Wed, Mar 30, 2016 at 12:52 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>
> On 03/30/2016 02:49 PM, Serkan Çoban wrote:
>> I think the issue happens because none of the gluster v status v0
>> [mem|clients|..] commands work on my cluster.
> This is a known issue :(
> If you have quite a number of bricks (as in your case), it incurs a
> brick-op RPC for every brick, so the command takes long enough to
> execute that the CLI timeout kicks in first.
>> Those commands give a request timeout error and never produce any output.
>> Maybe it is because of the brick count (1560) and client count (60), but
>> somehow they continue in the background.
>> After some time I can run the normal gluster volume status|get|set
>> commands, but again when I try to run gluster v status v0
>> [mem|clients|..] it times out.
>> The Gluster op-version is 30707; this is a fresh 3.7.9 install.
>>
>> On Tue, Mar 29, 2016 at 3:53 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>>
>>> On 03/29/2016 05:34 PM, Serkan Çoban wrote:
>>>> Hi, I am on 3.7.9 and currently none of the gluster commands (gluster
>>>> peer status, gluster volume status) work. I have the lines below in the
>>>> logs:
>>>>
>>>> [2016-03-29 11:25:23.878845] W [glusterd-locks.c:692:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x4c) [0x7fdbd9c53eec] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x162) [0x7fdbd9c5e432] -->/usr/lib64/glusterfs/3.7.9/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x37a) [0x7fdbd9cff0ca] ) 0-management: Lock owner mismatch. Lock for vol v0 held by c95f1a84-761e-493a-a462-54fcd6d72122
>>>> [2016-03-29 11:27:39.310977] I [MSGID: 106499] [glusterd-handler.c:4329:__glusterd_handle_status_volume] 0-management: Received status volume req for volume v0
>>>> [2016-03-29 11:27:39.312654] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h1.domain.net. Please check log file for details.
>>> Check whether you see any error message(s) in the glusterd log on h1.domain.net.
>>> If by any chance you are running a script in the background which triggers
>>> multiple commands on the same volume concurrently, then this is expected.
>>> Also, is your cluster op-version (gluster volume get <volname>
>>> cluster.op-version) up to date?
>>>
>>>> [2016-03-29 11:27:39.312755] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h2.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:39.312781] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h3.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:39.312821] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h4.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:39.312849] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h44.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:39.312879] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h35.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:39.312920] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h22.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:39.312950] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h11.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:39.312981] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h31.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:39.313077] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on h33.domain.net. Please check log file for details.
>>>> [2016-03-29 11:27:46.490894] E [MSGID: 106151] [glusterd-syncop.c:1868:gd_sync_task_begin] 0-management: Locking Peers Failed.
>>>>
>>>> What can I do to solve the problem? The command sometimes gives a timeout
>>>> error and sometimes "locking failed on host ...".
>>>>
>>>> Serkan
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-users
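For reference, a minimal sketch of the command forms under discussion, assuming the volume name v0 used throughout the thread:

# Volume-wide status and volume get/set reportedly still complete:
gluster volume status v0
gluster volume get v0 cluster.op-version

# The volume-wide detail variants are the ones reported to time out
# against 1560 bricks and 60 clients:
gluster volume status v0 mem
gluster volume status v0 clients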
Atin Mukherjee
2016-Mar-30 12:10 UTC
[Gluster-users] Gluster management commands give errors
On 03/30/2016 04:03 PM, Serkan Çoban wrote:
> Any workarounds other than not running these commands?
Well, if you can execute gluster v status <volname> <brick> [mem|clients|...] then it shouldn't time out. Basically, you can write a script that parses the brick details from gluster v status and then passes individual bricks to the gluster v status command to fetch the client/mem/... details. Makes sense?

~Atin
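A minimal sketch of the kind of script Atin describes, assuming the volume name v0 from this thread and that a plain "gluster volume status" still completes; the parsing and option handling below are illustrative rather than taken from the thread:

#!/bin/bash
# Sketch of the per-brick workaround described above: instead of one
# volume-wide "gluster v status <vol> clients|mem|...", query each
# brick individually so no single request has to wait on all 1560 bricks.
# The default volume name "v0" and sub-command are assumptions taken
# from this thread; adjust as needed.

VOL="${1:-v0}"
SUB="${2:-clients}"    # one of: clients mem inode fd callpool detail

# Parse the brick list (host:/path) out of the plain status output,
# which the thread says still completes on this cluster.
gluster volume status "$VOL" | awk '$1 == "Brick" {print $2}' |
while read -r brick; do
    echo "=== $brick ==="
    gluster volume status "$VOL" "$brick" "$SUB"
done

Querying one brick per invocation keeps each management transaction small, so the CLI timeout applies to a single brick-op at a time rather than to one operation spanning all 1560 bricks.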