peljasz
2017-Aug-01 17:31 UTC
[Gluster-users] connection to 10.5.6.32:49155 failed (Connection refused); disconnecting socket
how critical is the above? I get plenty of these on all three peers.

hi guys

I've recently upgraded from 3.8 to 3.10 and I'm seeing weird behavior.

I see: $ gluster vol status $_vol detail; takes a long time and mostly times out.

I do:
$ gluster vol heal $_vol info
and I see:

Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA
Status: Transport endpoint is not connected
Number of entries: -

Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA
Status: Connected
Number of entries: 0

Brick 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA
Status: Transport endpoint is not connected
Number of entries: -

I begin to worry that 3.10 on CentOS 7.3 might not have been a good idea.

many thanks.
L.
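A quick first check, independent of gluster itself (a sketch; the host and port are taken from the error in the subject line, and glusterfsd is the usual brick process name):

# from any peer or client: is anything accepting connections on the failing port?
$ nc -zv 10.5.6.32 49155

# on the brick host itself: are the brick processes running at all?
$ ps -ef | grep glusterfsd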
Atin Mukherjee
2017-Aug-02 01:19 UTC
[Gluster-users] connection to 10.5.6.32:49155 failed (Connection refused); disconnecting socket
This means the shd (self-heal daemon) client is not able to establish a connection with the brick on port 49155. This can happen if glusterd has handed back a stale port which is not the one the brick is actually listening on. If you killed any brick process with SIGKILL instead of SIGTERM, this is expected: glusterd never receives the portmap signout in that case, so the old portmap entry is never wiped off.

Please restart the glusterd service. This should fix the problem.

--
- Atin (atinm)
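To confirm the stale-port theory before restarting (a sketch; CYTO-DATA is assumed to be the volume name, after its brick directory - adjust to the real name):

# the port glusterd advertises for each brick:
$ gluster volume status CYTO-DATA

# on 10.5.6.32: the port(s) the brick process is really bound to:
$ ss -tlnp | grep glusterfsd

If the two disagree, restarting the management daemon is safe - it does not touch the brick data or the brick processes:

$ systemctl restart glusterd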
lejeczek
2017-Aug-02 05:57 UTC
[Gluster-users] connection to 10.5.6.32:49155 failed (Connection refused); disconnecting socket
But I had not killed anything - unless the system did it silently for some reason, but I wouldn't think so. It seems that one brick is particularly ill about it all. I'd have to restart it, but mostly this would not do and I'd actually have to reboot the system; then for a short while it would be OK, only to show up again soon after as:

Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-GROUP-WORK
    N/A       N/A        N       N/A
Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-GROUP-WORK
    49153     0          Y       2391260
Brick 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-GROUP-WORK
    49153     0          Y       9717

(columns: TCP Port / RDMA Port / Online / Pid)

and the logs show:

[2017-08-02 05:51:48.306839] E [socket.c:2316:socket_connect_finish] 0-GROUP-WORK-client-6: connection to 10.5.6.32:49153 failed (Connection refused); disconnecting socket

But systemd on that brick says the processes/daemons are fine. And all three bricks are, in general config, virtually identical. Not sure what to think.

thanks.
L
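One way to see whether glusterd's record and reality have drifted apart again (a sketch; the file layout under /var/lib/glusterd is an assumption based on the usual scheme, and GROUP-WORK stands in for the real volume name):

# on 10.5.6.32: the port glusterd has on record for this brick
$ grep listen-port /var/lib/glusterd/vols/GROUP-WORK/bricks/10.5.6.32*

# is anything actually listening on that port?
$ ss -tlnp | grep 49153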
lejeczek
2017-Aug-02 06:10 UTC
[Gluster-users] connection to 10.5.6.32:49155 failed (Connection refused); disconnecting socket
also, now after the upgrade, gluster shows on some vols a long list in heal info, with entries like these amongst them:

Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-USER-HOME
<gfid:ea647c38-004d-4f2c-a533-ba75682869d2>
Status: Connected

what are these entries?
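A <gfid:...> line is a file that self-heal is tracking by its internal gluster file id only, before the name has been resolved to a path. On the brick, every regular file is hard-linked under .glusterfs by the first two byte pairs of its gfid, so the real path can be recovered (a sketch, with the brick root taken from the output above):

$ BRICK=/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-USER-HOME
$ ls -l $BRICK/.glusterfs/ea/64/ea647c38-004d-4f2c-a533-ba75682869d2

# for a regular file the above is a hard link; locate its real name:
$ find $BRICK -samefile $BRICK/.glusterfs/ea/64/ea647c38-004d-4f2c-a533-ba75682869d2 -not -path '*/.glusterfs/*'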
lejeczek
2017-Aug-02 06:18 UTC
[Gluster-users] connection to 10.5.6.32:49155 failed (Connection refused); disconnecting socket
what I've just noticed - the brick in question shows up as:

Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-GROUP-WORK
    N/A       N/A        N       N/A

for one particular vol. Status for the other vols (so far) shows it OK.

Would this be a volume problem or a brick problem, or both? And most importantly, how do I troubleshoot it?

many thanks,
L.
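Since only one volume's brick process is down, two things worth trying (a sketch; the log file name is an assumption based on gluster's usual scheme of deriving it from the brick path, and GROUP-WORK stands in for the real volume name):

# on 10.5.6.32: the per-brick log usually records why the process died
$ less /var/log/glusterfs/bricks/__.aLocalStorages-0-0-GLUSTERs-0GLUSTER-GROUP-WORK.log

# restart only the bricks of that volume that are down, without a reboot:
$ gluster volume start GROUP-WORK force
$ gluster volume status GROUP-WORK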