Rafał Radecki
2017-May-16 05:33 UTC
[Gluster-users] 3.9.1 in docker: problems when one of peers is unavailable.
Hi All.

I have a 9 node dockerized glusterfs cluster and I am seeing the following situation:

1) The docker daemon on the 8th node fails and as a result glusterd on this node leaves the cluster.

2) As a result, on the 1st node I see a message about the 8th node being unavailable:

[2017-05-15 12:48:22.142865] I [MSGID: 106004] [glusterd-handler.c:5808:__glusterd_peer_rpc_notify] 0-management: Peer <10.10.10.8> (<5cb55b7a-1e04-4fb8-bd1d-55ee647719d2>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-15 12:48:22.167746] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0x2035a) [0x7f7d9d62535a] -->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0x29f48) [0x7f7d9d62ef48] -->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0xd50aa) [0x7f7d9d6da0aa] ) 0-management: Lock for vol csv not held
[2017-05-15 12:48:22.167767] W [MSGID: 106118] [glusterd-handler.c:5833:__glusterd_peer_rpc_notify] 0-management: Lock not released for csv

The gluster share is then unavailable, and when I try to list it I get:

Transport endpoint is not connected

3) Then on the 5th node I see a message similar to 2) about the 1st node being unavailable, and the 5th node also disconnects from the cluster:

[2017-05-15 12:52:54.321189] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0x2035a) [0x7f7fda22335a] -->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0x29f48) [0x7f7fda22cf48] -->/usr/lib64/glusterfs/3.9.1/xlator/mgmt/glusterd.so(+0xd50aa) [0x7f7fda2d80aa] ) 0-management: Lock for vol csv not held
[2017-05-15 12:52:54.321200] W [MSGID: 106118] [glusterd-handler.c:5833:__glusterd_peer_rpc_notify] 0-management: Lock not released for csv
[2017-05-15 12:53:04.659418] E [socket.c:2307:socket_connect_finish] 0-management: connection to 10.10.10.:24007 failed (Connection refused)

I am quite new to gluster, but as far as I can see this is a chain reaction in which the failure of one node leads to the disconnect of two other nodes. Any hints on how to solve this? Are there any settings for retries/timeouts/reconnects in gluster which could help in my case?

Thanks for all the help!

BR,
Rafal.
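For the retries/timeouts question, one knob commonly suggested in situations like this (an assumption on my part, not something confirmed in the thread) is `network.ping-timeout`, which controls how long clients wait before declaring a brick unreachable. A minimal sketch, assuming the volume name `csv` from the logs and a client mount at the hypothetical path `/mnt/csv`:

```shell
# Hedged sketch: inspecting peer state and tuning reconnect behaviour.
# "csv" is the volume name from the logs; /mnt/csv is a hypothetical
# client mount point -- adjust both to your setup.

# Which peers does glusterd currently consider connected?
gluster peer status

# Current client-side ping timeout (default 42 seconds). Note this
# governs client-to-brick connections, not the glusterd peer RPC
# that is failing in the logs above:
gluster volume get csv network.ping-timeout

# Raise it so a brief docker-daemon hiccup does not immediately
# mark bricks as gone:
gluster volume set csv network.ping-timeout 60

# A "Transport endpoint is not connected" FUSE mount usually needs
# to be remounted once the peers are back:
umount /mnt/csv
mount -t glusterfs 10.10.10.1:/csv /mnt/csv
```

Note the ping timeout will not by itself fix the "Lock not released" warnings on the management side; those come from glusterd's volume lock handling during the disconnect.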
Pranith Kumar Karampuri
2017-May-18 04:10 UTC
[Gluster-users] 3.9.1 in docker: problems when one of peers is unavailable.
Hey,

3.9.1 reached its EndOfLife; you can use either 3.8.x or 3.10.x, which are active at the moment.

On Tue, May 16, 2017 at 11:03 AM, Rafał Radecki <radecki.rafal at gmail.com> wrote:
> [quoted message trimmed]

--
Pranith
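Before planning the upgrade Pranith suggests, it can help to record what each node is actually running. A minimal per-node check, assuming a CentOS/RHEL-based container image (the thread does not say which distro the containers use, so the package-manager line is an assumption):

```shell
# Hedged sketch: pre-upgrade checks on each gluster node/container.

# Version of the glusterfs binaries currently installed:
glusterfs --version | head -n1

# The cluster-wide operating version glusterd has negotiated;
# all peers should show the same value before and after an upgrade:
grep operating-version /var/lib/glusterd/glusterd.info

# Example of pinning the target version at install time on
# CentOS/RHEL (package name and repo availability are assumptions):
# yum install glusterfs-server-3.10\*
```

Upgrading one node at a time and confirming `gluster peer status` shows all peers connected before moving on keeps the volume available throughout.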