mabi
2018-Nov-08 12:39 UTC
[Gluster-users] Self-healing not healing 27k files on GlusterFS 4.1.5 3 nodes replica
-------- Original Message --------
On Thursday, November 8, 2018 11:05 AM, Ravishankar N <ravishankar at redhat.com> wrote:

> It is not a split-brain. Nodes 1 and 3 have xattrs indicating a pending
> entry heal on node2, so heal should have happened ideally. Can you check
> a few things?
> - Are there any disconnects between each of the shds and the brick
> processes (check via statedump or look for disconnect messages in
> glustershd.log)? Does restarting shd via a `volume start force` solve
> the problem?

Yes, there is one disconnect at 14:21 (UTC 13:21) because node2 ran out of memory (although it has 32 GB of RAM) and I had to reboot it. Here are the relevant log entries taken from glustershd.log on node1:

[2018-11-05 13:21:16.284239] C [rpc-clnt-ping.c:166:rpc_clnt_ping_timer_expired] 0-myvol-pro-client-1: server 192.168.10.33:49154 has not responded in the last 42 seconds, disconnecting.
[2018-11-05 13:21:16.284385] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-myvol-pro-client-1: disconnected from myvol-pro-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2018-11-05 13:21:16.284889] W [rpc-clnt-ping.c:222:rpc_clnt_ping_cbk] 0-myvol-pro-client-1: socket disconnected

I also just ran a "volume start force" and saw that the glustershd processes got restarted on all 3 nodes, but that did not trigger any healing. There is still the same number of files/dirs pending heal.

> - Is the symlink pointing to oc_dir present inside .glusterfs/25/e2 in
> all 3 bricks?

Yes, it is present on node1 and node3, but on node2 there is no such symlink.

I hope that helps to debug the issue further; otherwise please let me know if you need more info.
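For anyone repeating this check on their own cluster, a small filter over glustershd.log can surface the same kind of shd-to-brick disconnect events. This is only a sketch: `find_shd_disconnects` is a hypothetical helper name, and the grep patterns are taken from the ping-timeout and disconnect messages quoted above.

```shell
# Sketch: scan a glustershd.log for shd <-> brick disconnect events.
# find_shd_disconnects is an illustrative name, not a GlusterFS tool.
find_shd_disconnects() {
    # $1: path to glustershd.log (commonly /var/log/glusterfs/glustershd.log)
    # Patterns match the ping-timeout, client-disconnect and socket-disconnect
    # messages seen in the log excerpt above.
    grep -E 'has not responded in the last|disconnected from|socket disconnected' "$1"
}
```

Usage would be something like `find_shd_disconnects /var/log/glusterfs/glustershd.log`; correlate any hits with brick restarts or out-of-memory events, as happened on node2 here.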
Ravishankar N
2018-Nov-09 01:11 UTC
[Gluster-users] Self-healing not healing 27k files on GlusterFS 4.1.5 3 nodes replica
On 11/08/2018 06:09 PM, mabi wrote:

> -------- Original Message --------
> On Thursday, November 8, 2018 11:05 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>
>> It is not a split-brain. Nodes 1 and 3 have xattrs indicating a pending
>> entry heal on node2, so heal should have happened ideally. Can you check
>> a few things?
>> - Are there any disconnects between each of the shds and the brick
>> processes (check via statedump or look for disconnect messages in
>> glustershd.log)? Does restarting shd via a `volume start force` solve
>> the problem?
>
> Yes, there is one disconnect at 14:21 (UTC 13:21) because node2 ran out of memory (although it has 32 GB of RAM) and I had to reboot it. Here are the relevant log entries taken from glustershd.log on node1:
>
> [2018-11-05 13:21:16.284239] C [rpc-clnt-ping.c:166:rpc_clnt_ping_timer_expired] 0-myvol-pro-client-1: server 192.168.10.33:49154 has not responded in the last 42 seconds, disconnecting.
> [2018-11-05 13:21:16.284385] I [MSGID: 114018] [client.c:2254:client_rpc_notify] 0-myvol-pro-client-1: disconnected from myvol-pro-client-1. Client process will keep trying to connect to glusterd until brick's port is available
> [2018-11-05 13:21:16.284889] W [rpc-clnt-ping.c:222:rpc_clnt_ping_cbk] 0-myvol-pro-client-1: socket disconnected
>
> I also just ran a "volume start force" and saw that the glustershd processes got restarted on all 3 nodes, but that did not trigger any healing. There is still the same number of files/dirs pending heal.
>
>> - Is the symlink pointing to oc_dir present inside .glusterfs/25/e2 in
>> all 3 bricks?
>
> Yes, it is present on node1 and node3, but on node2 there is no such symlink.
>
> I hope that helps to debug the issue further; otherwise please let me know if you need more info.

Please re-create the symlink on node 2 to match how it is on the other nodes and launch heal again. Check whether this is the case for other entries too.

-Ravi
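The re-create step above could be sketched roughly as follows. Everything here is a placeholder: `fix_gfid_symlink` is a hypothetical helper, the brick root and gfid are examples, and the symlink target must be copied verbatim (via `readlink`) from one of the healthy bricks (node1 or node3), not invented.

```shell
# Sketch: ensure the .glusterfs gfid symlink for a directory exists on a
# brick, recreating it to match the healthy bricks if it is missing.
# fix_gfid_symlink and all paths used with it are illustrative placeholders.
fix_gfid_symlink() {
    # $1: brick root   $2: gfid of the directory   $3: target read off a good brick
    brick=$1; gfid=$2; target=$3
    # .glusterfs lays directory symlinks out as <first 2 hex chars>/<next 2>/<full gfid>
    d1=$(printf '%s' "$gfid" | cut -c1-2)
    d2=$(printf '%s' "$gfid" | cut -c3-4)
    link="$brick/.glusterfs/$d1/$d2/$gfid"
    if [ -L "$link" ]; then
        echo "present: $link -> $(readlink "$link")"
    else
        mkdir -p "${link%/*}"
        ln -s "$target" "$link"
        echo "recreated: $link -> $target"
    fi
}
```

On a healthy brick one would first run something like `readlink /path/to/brick/.glusterfs/25/e2/<gfid>` to capture the target, then call `fix_gfid_symlink` with that target on node2's brick, and finally trigger healing again (e.g. `gluster volume heal myvol-pro`).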