Did you find any clue in the log files?
I can try an upgrade to 7.5 in case some recent bugs were fixed; what's your
opinion?
From: "Sanju Rakonde" <srakonde at redhat.com>
To: nico at furyweb.fr
Cc: "gluster-users" <gluster-users at gluster.org>
Sent: Wednesday, 22 April 2020 08:39:42
Subject: Re: [Gluster-users] never ending logging
Thanks for all the information.
For pstack output, the gluster-debuginfo package has to be installed. I will
look through the provided information and get back to you.
On Wed, Apr 22, 2020 at 11:54 AM <nico at furyweb.fr> wrote:
I think all issues are linked with the same underlying problem.
1. All peers were in the Connected state from every node yesterday, but node 2
is "semi-connected" now:
root at glusterDevVM1:/var/log/glusterfs# gluster peer status
Number of Peers: 2
Hostname: glusterDevVM3
Uuid: 0d8a3686-9e37-4ce7-87bf-c85d1ec40974
State: Peer in Cluster (Connected)
Hostname: glusterDevVM2
Uuid: 7f6c3023-144b-4db2-9063-d90926dbdd18
State: Peer in Cluster (Connected)
root at glusterDevVM2:~# gluster peer status
Number of Peers: 2
Hostname: glusterDevVM1
Uuid: e2263e4d-a307-45d5-9cec-e1791f7a45fb
State: Peer in Cluster (Disconnected)
Hostname: glusterDevVM3
Uuid: 0d8a3686-9e37-4ce7-87bf-c85d1ec40974
State: Peer in Cluster (Connected)
root at glusterDevVM3:~# gluster peer status
Number of Peers: 2
Hostname: glusterDevVM2
Uuid: 7f6c3023-144b-4db2-9063-d90926dbdd18
State: Peer in Cluster (Connected)
Hostname: glusterDevVM1
Uuid: e2263e4d-a307-45d5-9cec-e1791f7a45fb
State: Peer in Cluster (Connected)
2, 3, 4. A simple "gluster volume status" shows a different error on each node:
root at glusterDevVM1:~# gluster volume status tmp
Locking failed on glusterDevVM2. Please check log file for details.
root at glusterDevVM2:~# gluster volume status tmp
Another transaction is in progress for tmp. Please try again after some time.
root at glusterDevVM3:~# gluster volume status tmp
Error : Request timed out
Logs for each node (except SSL errors):
root at glusterDevVM1:~# egrep -v '0-socket.management'
/var/log/glusterfs/glusterd.log
[2020-04-22 05:38:32.278618] E [rpc-clnt.c:346:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x138)[0x7fd28d99fda8]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xcd97)[0x7fd28d745d97] (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xcebe)[0x7fd28d745ebe] (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)[0x7fd28d746e93]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xea18)[0x7fd28d747a18] )))))
0-management: forced unwinding frame type(Peer mgmt) op(--(4)) called at
2020-04-22 05:38:32.243087 (xid=0x8d)
[2020-04-22 05:38:32.278638] E [MSGID: 106157]
[glusterd-rpc-ops.c:665:__glusterd_friend_update_cbk] 0-management: RPC Error
[2020-04-22 05:38:32.278651] I [MSGID: 106493]
[glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received RJT
from uuid: 00000000-0000-0000-0000-000000000000
[2020-04-22 05:38:33.256401] I [MSGID: 106498]
[glusterd-svc-helper.c:747:__glusterd_send_svc_configure_req] 0-management: not
connected yet
[2020-04-22 05:38:35.279149] I [socket.c:4347:ssl_setup_connection_params]
0-management: SSL support on the I/O path is ENABLED
[2020-04-22 05:38:35.279169] I [socket.c:4350:ssl_setup_connection_params]
0-management: SSL support for glusterd is ENABLED
[2020-04-22 05:38:35.279178] I [socket.c:4360:ssl_setup_connection_params]
0-management: using certificate depth 1
The message "I [MSGID: 106004]
[glusterd-handler.c:6204:__glusterd_peer_rpc_notify] 0-management: Peer
<glusterDevVM2> (<7f6c3023-144b-4db2-9063-d90926dbdd18>), in state
<Peer in Cluster>, has disconnected from glusterd." repeated 3 times
between [2020-04-22 05:38:25.232116] and [2020-04-22 05:38:35.667153]
[2020-04-22 05:38:35.667255] W [glusterd-locks.c:796:glusterd_mgmt_v3_unlock]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/7.3/xlator/mgmt/glusterd.so(+0x22119)
[0x7fd287da7119]
-->/usr/lib/x86_64-linux-gnu/glusterfs/7.3/xlator/mgmt/glusterd.so(+0x2caae)
[0x7fd287db1aae]
-->/usr/lib/x86_64-linux-gnu/glusterfs/7.3/xlator/mgmt/glusterd.so(+0xdf0d3)
[0x7fd287e640d3] ) 0-management: Lock for vol <vol> not held
[2020-04-22 05:38:35.667275] W [MSGID: 106117]
[glusterd-handler.c:6225:__glusterd_peer_rpc_notify] 0-management: Lock not
released for <vol>
The last 2 lines are repeated for each volume.
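The glusterd log excerpts above follow a fixed layout ("[timestamp] LEVEL
[MSGID: n] [file:line:function] subsystem: message"). As a side note (this is
an illustrative sketch, not something from the thread), a short script can
tally which message IDs dominate a log, which makes the repeating
disconnect/lock pattern easy to spot:

```python
import re
from collections import Counter

# Matches glusterd log lines that carry a message ID, e.g.:
# [2020-04-22 05:38:35.667275] W [MSGID: 106117] [glusterd-handler.c:...] ...
MSGID_RE = re.compile(
    r"^\[(?P<ts>[\d\- :\.]+)\] (?P<level>[EWI]) \[MSGID: (?P<msgid>\d+)\]"
)

def tally_msgids(lines):
    """Count occurrences of each (level, MSGID) pair in a glusterd log."""
    counts = Counter()
    for line in lines:
        m = MSGID_RE.match(line)
        if m:  # lines without a MSGID (e.g. backtraces) are skipped
            counts[(m.group("level"), m.group("msgid"))] += 1
    return counts
```

Feeding it the whole of glusterd.log and printing `counts.most_common()` would
show, for example, how often MSGID 106117 ("Lock not released") fires per
volume.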
root at glusterDevVM2:~# egrep -v '0-socket.management'
/var/log/glusterfs/glusterd.log
[2020-04-22 05:51:57.493574] E [rpc-clnt.c:346:saved_frames_unwind] (-->
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x138)[0x7f30411dbda8]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xcd97)[0x7f3040f81d97] (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xcebe)[0x7f3040f81ebe] (-->
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)[0x7f3040f82e93]
(--> /usr/lib/x86_64-linux-gnu/libgfrpc.so.0(+0xea18)[0x7f3040f83a18] )))))
0-management: forced unwinding frame type(Gluster MGMT Handshake)
op(MGMT-VERS(1)) called at 2020-04-22 05:51:57.483579 (xid=0x563)
[2020-04-22 05:51:57.493623] E [MSGID: 106167]
[glusterd-handshake.c:2040:__glusterd_mgmt_hndsk_version_cbk] 0-management:
Error through RPC layer, retry again later
[2020-04-22 05:52:00.501474] I [socket.c:4347:ssl_setup_connection_params]
0-management: SSL support on the I/O path is ENABLED
[2020-04-22 05:52:00.501542] I [socket.c:4350:ssl_setup_connection_params]
0-management: SSL support for glusterd is ENABLED
[2020-04-22 05:52:00.501569] I [socket.c:4360:ssl_setup_connection_params]
0-management: using certificate depth 1
[2020-04-22 05:52:00.983720] I [MSGID: 106004]
[glusterd-handler.c:6204:__glusterd_peer_rpc_notify] 0-management: Peer
<glusterDevVM1> (<e2263e4d-a307-45d5-9cec-e1791f7a45fb>), in state
<Peer in Cluster>, has disconnected from glusterd.
[2020-04-22 05:52:00.983886] W [glusterd-locks.c:796:glusterd_mgmt_v3_unlock]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/7.3/xlator/mgmt/glusterd.so(+0x22119)
[0x7f303b5e3119]
-->/usr/lib/x86_64-linux-gnu/glusterfs/7.3/xlator/mgmt/glusterd.so(+0x2caae)
[0x7f303b5edaae]
-->/usr/lib/x86_64-linux-gnu/glusterfs/7.3/xlator/mgmt/glusterd.so(+0xdf0d3)
[0x7f303b6a00d3] ) 0-management: Lock for vol <vol> not held
[2020-04-22 05:52:00.983909] W [MSGID: 106117]
[glusterd-handler.c:6225:__glusterd_peer_rpc_notify] 0-management: Lock not
released for <vol>
The last 2 lines are repeated for each volume.
root at glusterDevVM3:~# egrep -v '0-socket.management'
/var/log/glusterfs/glusterd.log
[2020-04-22 05:38:33.229959] I [MSGID: 106499]
[glusterd-handler.c:4264:__glusterd_handle_status_volume] 0-management: Received
status volume req for volume tmp
[2020-04-22 05:41:33.230170] I
[glusterd-locks.c:729:gd_mgmt_v3_unlock_timer_cbk] 0-management: unlock timer is
cancelled for volume_type tmp_vol
[2020-04-22 05:48:34.908289] E [rpc-clnt.c:183:call_bail] 0-management: bailing
out frame type(glusterd mgmt v3), op(--(1)), xid = 0x108, unique = 918, sent =
2020-04-22 05:38:33.230268, timeout = 600 for 10.5.1.7:24007
[2020-04-22 05:48:34.908339] E [MSGID: 106115]
[glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on
glusterDevVM1. Please check log file for details.
[2020-04-22 05:48:40.288539] E [rpc-clnt.c:183:call_bail] 0-management: bailing
out frame type(glusterd mgmt v3), op(--(1)), xid = 0x27, unique = 917, sent =
2020-04-22 05:38:33.230258, timeout = 600 for 10.5.1.8:24007
[2020-04-22 05:48:40.288568] E [MSGID: 106115]
[glusterd-mgmt.c:117:gd_mgmt_v3_collate_errors] 0-management: Locking failed on
glusterDevVM2. Please check log file for details.
[2020-04-22 05:48:40.288631] E [MSGID: 106150]
[glusterd-syncop.c:1918:gd_sync_task_begin] 0-management: Locking Peers Failed.
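The call_bail entries above record both when the frame was sent and the
configured timeout (600 s), so the "Request timed out" on node 3 is simply
glusterd abandoning the frame once the RPC frame-timeout elapses. A quick
sanity check (an illustrative sketch, using the timestamps copied from the log
above):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps taken from the node 3 call_bail log entry for 10.5.1.7:24007.
sent = datetime.strptime("2020-04-22 05:38:33.230268", FMT)
bailed = datetime.strptime("2020-04-22 05:48:34.908289", FMT)

# The frame is only bailed out after the 600 s timeout has passed.
elapsed = (bailed - sent).total_seconds()
```

Here `elapsed` comes out slightly above 600 seconds, matching the logged
timeout value.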
I'm not familiar with pstack; when running it on node 3 (arbiter) I only get
these few lines:
root at glusterDevVM3:~# pstack 13700
13700: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
(No symbols found)
0x7fafd747a6cd: ????
Which Debian stretch package should I install?
To be more explicit, I stopped glusterd on all 3 nodes and then restarted them
sequentially in this order: node1, node3 (arbiter), then node2.
Log files can be downloaded at
https://www.dropbox.com/s/rcgcw7jrud2wkj1/glusterd-logs.tar.bz2?dl=0
Thanks for your help.
From: "Sanju Rakonde" <srakonde at redhat.com>
To: nico at furyweb.fr
Cc: "gluster-users" <gluster-users at gluster.org>
Sent: Wednesday, 22 April 2020 07:23:52
Objet: Re: [Gluster-users] never ending logging
Hi,
This email covers many issues. Let me ask a few questions to get the whole
picture.
1. Are the peers in the Connected state now, or are they still in the Rejected
state?
2. What led you to see the "locking failed" messages? We would like to know if
there is a reproducer so we can fix the issue, if any.
3. The "another transaction is in progress" message appears when an operation
is already going on. Are you seeing it when no such transaction is running?
4. When did you hit the timeouts? Did you try to look at the pstack output of
the glusterd process? If so, please share it.
On Tue, Apr 21, 2020 at 7:08 PM <nico at furyweb.fr> wrote:
Hi all.
We're using a 3-node Gluster 7.3 cluster (2 + 1 arbiter). Yesterday node 2 was
rejected from the cluster and I applied the following steps to fix it:
https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Administrator%20Guide/Resolving%20Peer%20Rejected/
I also saw
https://docs.gluster.org/en/latest/Troubleshooting/troubleshooting-glusterd/
but that solution isn't applicable, as cluster.max-op-version doesn't exist
and the op-version is the same on all 3 nodes.
After renewing the SSL certs and several restarts, all volumes came back
online, but the glusterd log file on all 3 nodes is filled with nothing but
the following 3 lines:
[2020-04-21 13:05:19.478913] I [socket.c:4347:ssl_setup_connection_params]
0-socket.management: SSL support on the I/O path is ENABLED
[2020-04-21 13:05:19.478972] I [socket.c:4350:ssl_setup_connection_params]
0-socket.management: SSL support for glusterd is ENABLED
[2020-04-21 13:05:19.478986] I [socket.c:4360:ssl_setup_connection_params]
0-socket.management: using certificate depth 1
Moreover, I get "Locking failed", "Another transaction is in progress" and
"Error : Request timed out" when running gluster volume status volxxx.
All SSL certs on clients have also been renewed and all volumes were remounted.
All 3 nodes were alternatively restarted (glusterd) and rebooted.
The cluster is not in a production environment, but there are about ~250
clients for ~75 volumes. I don't know how to troubleshoot and fix this
problem; any ideas are welcome.
________
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
--
Thanks,
Sanju
--
Thanks,
Sanju