Atin Mukherjee
2017-Jul-26 13:28 UTC
[Gluster-users] glusterd-locks.c:572:glusterd_mgmt_v3_lock
Technically, if only one node is issuing all these status commands, you shouldn't get into this situation. Can you please help me with the latest cmd_history and glusterd log files from all the nodes?

On Wed, Jul 26, 2017 at 1:41 PM, Paolo Margara <paolo.margara at polito.it> wrote:

> Hi Atin,
>
> I initially disabled the gluster status check on all nodes except one on
> my nagios instance, as you recommended, but this issue happened again.
>
> So I disabled it on every node, but the error still happens; currently
> only oVirt is monitoring gluster.
>
> I cannot modify this behaviour in the oVirt GUI. Is there anything I
> could do from the gluster perspective to solve this issue? Considering
> that 3.8 is near EOL, upgrading to 3.10 could also be an option.
>
> Greetings,
>
> Paolo
>
> On 20/07/2017 15:37, Paolo Margara wrote:
>
> OK, on my nagios instance I've disabled the gluster status check on all
> nodes except one; I'll check whether this is enough.
>
> Thanks,
>
> Paolo
>
> On 20/07/2017 13:50, Atin Mukherjee wrote:
>
> So from the cmd_history.logs across all the nodes it's evident that
> multiple commands on the same volume are run simultaneously, which can
> result in transaction collisions, and you can end up with one command
> succeeding and the others failing. Ideally, if you are running the
> volume status command for monitoring, it should be run from only one
> node.
>
> On Thu, Jul 20, 2017 at 3:54 PM, Paolo Margara <paolo.margara at polito.it>
> wrote:
>
>> In attachment, the requested logs for all three nodes.
>>
>> Thanks,
>>
>> Paolo
>>
>> On 20/07/2017 11:38, Atin Mukherjee wrote:
>>
>> Please share the cmd_history.log file from all the storage nodes.
>>
>> On Thu, Jul 20, 2017 at 2:34 PM, Paolo Margara <paolo.margara at polito.it>
>> wrote:
>>
>>> Hi list,
>>>
>>> Recently I've noticed a strange behaviour of my gluster storage:
>>> sometimes, while executing a simple command like "gluster volume status
>>> vm-images-repo", I get the response "Another transaction is in progress
>>> for vm-images-repo. Please try again after sometime.". The situation is
>>> not resolved simply by waiting; I have to restart glusterd on the node
>>> that holds (and does not release) the lock. This situation recurs
>>> randomly after some days. In the meantime, both before and after the
>>> issue appears, everything works as expected.
>>>
>>> I'm using gluster 3.8.12 on CentOS 7.3. The only relevant information
>>> that I found in the log file (etc-glusterfs-glusterd.vol.log) of my
>>> three nodes is the following:
>>>
>>> * node1, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:43.130203] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f373f25f00f]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f373f250a25]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f373f2f548f] )
>>> 0-management: Lock for vm-images-repo held by 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>> [2017-07-19 15:07:43.128242] I [MSGID: 106499]
>>> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
>>> Received status volume req for volume vm-images-repo
>>> [2017-07-19 15:07:43.130244] E [MSGID: 106119]
>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
>>> acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.130320] E [MSGID: 106376]
>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>> [2017-07-19 15:07:43.130665] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on virtnode-0-1-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.131293] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking
>>> failed on virtnode-0-2-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.131360] E [MSGID: 106151]
>>> [glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers
>>> Failed.
>>> [2017-07-19 15:07:43.132005] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking
>>> failed on virtnode-0-2-gluster. Please check log file for details.
>>> [2017-07-19 15:07:43.132182] E [MSGID: 106116]
>>> [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Unlocking
>>> failed on virtnode-0-1-gluster. Please check log file for details.
>>>
>>> * node2, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:43.131975] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f17b5b8fa25]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f17b5c3448f] )
>>> 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.132019] E [MSGID: 106119]
>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
>>> acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.133568] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f17b5b9e00f]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f17b5b8f712]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f17b5c3482a] )
>>> 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by
>>> d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.133597] E [MSGID: 106118]
>>> [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to
>>> release lock for vm-images-repo
>>> The message "E [MSGID: 106376] [glusterd-op-sm.c:7775:glusterd_op_sm]
>>> 0-management: handler returned: -1" repeated 3 times between
>>> [2017-07-19 15:07:42.976193] and [2017-07-19 15:07:43.133646]
>>>
>>> * node3, at the moment the issue begins:
>>>
>>> [2017-07-19 15:07:42.976593] I [MSGID: 106499]
>>> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
>>> Received status volume req for volume vm-images-repo
>>> [2017-07-19 15:07:43.129941] W [glusterd-locks.c:572:glusterd_mgmt_v3_lock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2ba25) [0x7f6133f4ca25]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd048f) [0x7f6133ff148f] )
>>> 0-management: Lock for vm-images-repo held by d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.129981] E [MSGID: 106119]
>>> [glusterd-op-sm.c:3782:glusterd_op_ac_lock] 0-management: Unable to
>>> acquire lock for vm-images-repo
>>> [2017-07-19 15:07:43.130034] E [MSGID: 106376]
>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>> [2017-07-19 15:07:43.130131] E [MSGID: 106275]
>>> [glusterd-rpc-ops.c:876:glusterd_mgmt_v3_lock_peers_cbk_fn] 0-management:
>>> Received mgmt_v3 lock RJT from uuid: 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>> [2017-07-19 15:07:43.130710] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock]
>>> (-->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x3a00f) [0x7f6133f5b00f]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0x2b712) [0x7f6133f4c712]
>>> -->/usr/lib64/glusterfs/3.8.12/xlator/mgmt/glusterd.so(+0xd082a) [0x7f6133ff182a] )
>>> 0-management: Lock owner mismatch. Lock for vol vm-images-repo held by
>>> d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>> [2017-07-19 15:07:43.130733] E [MSGID: 106118]
>>> [glusterd-op-sm.c:3845:glusterd_op_ac_unlock] 0-management: Unable to
>>> release lock for vm-images-repo
>>> [2017-07-19 15:07:43.130771] E [MSGID: 106376]
>>> [glusterd-op-sm.c:7775:glusterd_op_sm] 0-management: handler returned: -1
>>>
>>> The really strange thing is that in this case the UUID of node3 is
>>> d9047ecd-26b5-467b-8e91-50f76a0c4d16!
>>>
>>> The nodename-to-UUID mapping is:
>>>
>>> * (node1) virtnode-0-0-gluster: 2c6f154f-efe3-4479-addc-b2021aa9d5df
>>> * (node2) virtnode-0-1-gluster: e93ebee7-5d95-4100-a9df-4a3e60134b73
>>> * (node3) virtnode-0-2-gluster: d9047ecd-26b5-467b-8e91-50f76a0c4d16
>>>
>>> In this case restarting glusterd on node3 usually solves the issue.
>>>
>>> What could be the root cause of this behaviour? How can I fix this once
>>> and for all?
>>>
>>> If needed I can provide the full log file.
>>>
>>> Greetings,
>>>
>>> Paolo Margara
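[Editor's note] The advice quoted above, that the volume status check used for monitoring should be issued from a single node, can be combined with a short back-off when glusterd reports a concurrent transaction. A minimal sketch in shell follows, assuming a Nagios-style exit-code convention; the script name, lock-file path and retry counts are hypothetical and not taken from the thread.

    #!/bin/bash
    # check_gluster_status.sh -- hypothetical monitoring wrapper (not part of
    # gluster, nagios or oVirt). It serializes "gluster volume status" calls
    # issued from this host and retries briefly when glusterd reports that
    # another management transaction currently holds the volume lock.

    VOL="${1:-vm-images-repo}"
    LOCKFILE=/var/run/check_gluster_status.lock

    for attempt in 1 2 3; do
        # flock(1) ensures only one status query runs at a time from this host
        out=$(flock -w 10 "$LOCKFILE" gluster volume status "$VOL" 2>&1)
        if [ $? -eq 0 ]; then
            echo "OK - status for volume $VOL collected"
            exit 0
        fi
        # back off and retry if another transaction holds the cluster-wide lock
        if echo "$out" | grep -qi "Another transaction is in progress"; then
            sleep 5
            continue
        fi
        break
    done

    echo "CRITICAL - could not query volume $VOL: $out"
    exit 2

Deployed only on the single designated monitoring node, a wrapper like this avoids the transaction collisions described above instead of triggering them from every host.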
Paolo Margara
2017-Jul-27 11:18 UTC
[Gluster-users] glusterd-locks.c:572:glusterd_mgmt_v3_lock
Hi Atin,

in attachment, all the requested logs.

Considering that I'm using gluster as a storage system for oVirt, I've also checked its logs, and I've seen that almost all of the commands on all three nodes are executed by the supervdsm daemon, not only by the SPM node. Could this be the root cause of this problem?

Greetings,

Paolo

PS: could you suggest a better method than attachments for sharing log files?

On 26/07/2017 15:28, Atin Mukherjee wrote:

> Technically, if only one node is issuing all these status commands, you
> shouldn't get into this situation. Can you please help me with the
> latest cmd_history and glusterd log files from all the nodes?
>
> [...]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: node1.tbz
Type: application/x-bzip-compressed-tar
Size: 141878 bytes
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20170727/98931475/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: node2.tbz
Type: application/x-bzip-compressed-tar
Size: 149455 bytes
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20170727/98931475/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: node3.tbz
Type: application/x-bzip-compressed-tar
Size: 466936 bytes
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20170727/98931475/attachment-0005.bin>
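[Editor's note] Paolo's observation that supervdsm issues gluster commands from every host can be cross-checked in the cmd_history logs themselves. Below is a rough check, assuming the default log path on CentOS 7 and ssh access to the peers (host names are the ones listed earlier in the thread); the exact wording recorded in cmd_history.log may differ between gluster versions.

    # Count how many "volume status" transactions each node initiated locally
    # (cmd_history.log records the gluster commands executed on that node).
    for h in virtnode-0-0-gluster virtnode-0-1-gluster virtnode-0-2-gluster; do
        printf '%s: ' "$h"
        ssh "$h" "grep -c 'volume status' /var/log/glusterfs/cmd_history.log"
    done

If all three counters keep growing at a similar rate, the status queries are indeed being originated from every node rather than from a single one.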
Atin Mukherjee
2017-Jul-29 01:55 UTC
[Gluster-users] glusterd-locks.c:572:glusterd_mgmt_v3_lock
On Thu, 27 Jul 2017 at 16:48, Paolo Margara <paolo.margara at polito.it> wrote:

> Hi Atin,
>
> in attachment, all the requested logs.
>
> Considering that I'm using gluster as a storage system for oVirt, I've
> also checked its logs, and I've seen that almost all of the commands on
> all three nodes are executed by the supervdsm daemon, not only by the
> SPM node. Could this be the root cause of this problem?

Indeed.

> Greetings,
>
> Paolo
>
> PS: could you suggest a better method than attachments for sharing log
> files?
>
> [...]
--
Atin (atinm)
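[Editor's note] Until the monitoring behaviour is changed (or the cluster is upgraded), the workaround already described in the thread, restarting glusterd on the node holding the stale lock, requires mapping the UUID printed in the "Lock for ... held by <UUID>" message to a peer. Below is a sketch assuming standard paths on CentOS 7; the UUID is the one taken from the log excerpts in this thread.

    # Identify which peer owns the UUID reported as the stale lock holder,
    # then restart glusterd only on that node.
    UUID=d9047ecd-26b5-467b-8e91-50f76a0c4d16     # value from the log excerpts above

    gluster pool list | grep -i "$UUID"           # UUID-to-hostname mapping for peers
    cat /var/lib/glusterd/glusterd.info           # the local node's own UUID

    # On the node that matches:
    #   systemctl restart glusterd

"gluster pool list" prints the UUID and hostname of every peer (the local node appears as localhost, hence the comparison with glusterd.info), which avoids guessing which glusterd to restart.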