mabi
2018-Nov-08 07:28 UTC
[Gluster-users] Self-healing not healing 27k files on GlusterFS 4.1.5 3 nodes replica
Dear Ravi,

Thank you for your answer. I will start by sending you below the getfattr output from the first entry which does not get healed (it is in fact a directory). It is the following path/dir from the output of one of my previous mails: /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir

# NODE 1
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.myvol-pro-client-1=0x000000000000000300000003
trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

# NODE 2
trusted.gfid=0xd9ac192ce85e4402af105551f587ed9a
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

# NODE 3 (arbiter)
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.myvol-pro-client-1=0x000000000000000300000003
trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

Notice here that node 2 does not seem to have any AFR attributes, which must be problematic. Also, that specific directory on node 2 has the oldest timestamp (14:12), whereas the same directory on nodes 1 and 3 has a timestamp of 14:19.

I did run "volume heal myvol-pro" and on the console it shows:

Launching heal operation to perform index self heal on volume myvol-pro has been successful
Use heal info commands to check status.

but then nothing new has been logged in the glustershd.log file of any of the 3 nodes.

The log file cmd_history.log shows:
[2018-11-08 07:20:24.481603] : volume heal myvol-pro : SUCCESS

and glusterd.log:
[2018-11-08 07:20:24.474032] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustershd: error returned while attempting to connect to host:(null), port:0

That's it... To me it looks like a split-brain, but GlusterFS does not report it as split-brain and no self-heal happens on it either.

What do you think?

Regards,
M.
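For reference, per-brick xattrs like the ones above are normally collected directly on each brick path (not on the FUSE mount) with getfattr; a minimal sketch, reusing the brick and directory names from this thread (adjust the brick prefix per node, e.g. /srv/glusterfs/myvol-pro/brick on node 3):

getfattr -d -m . -e hex /data/myvol-pro/brick/data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir

# Per the AFR changelog convention, a trusted.afr.<volume>-client-N value packs
# three 32-bit hex counters: data / metadata / entry. So
#   0x 00000000 00000003 00000003
# on nodes 1 and 3 marks 3 pending metadata and 3 pending entry operations
# against client-1 (the brick on node 2) that have not yet been healed.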
------- Original Message -------
On Thursday, November 8, 2018 5:00 AM, Ravishankar N <ravishankar at redhat.com> wrote:

> Can you share the getfattr output of all 4 entries from all 3 bricks?
>
> Also, can you tailf glustershd.log on all nodes and see if anything is
> logged for these entries when you run 'gluster volume heal $volname'?
>
> Regards,
> Ravi
>
> On 11/07/2018 01:22 PM, mabi wrote:
>
> > To my eyes this specific case looks like a split-brain scenario, but the output of "volume info split-brain" does not show any files. Should I still use the process for split-brain files as documented in the glusterfs documentation, or what do you recommend here?
> >
> > ------- Original Message -------
> > On Monday, November 5, 2018 4:36 PM, mabi mabi at protonmail.ch wrote:
> >
> > > Ravi, I did not yet modify the cluster.data-self-heal parameter to off because in the meantime node2 of my cluster had a memory shortage (this node has 32 GB of RAM) and as such I had to reboot it. After that reboot all locks got released and there are no more files left to heal on that volume. So the reboot of node2 did the trick (but this still seems to be a bug).
> > >
> > > Now on another volume of this same cluster I have a total of 8 files (4 of them being directories) unsynced from node1 and node3 (arbiter), as you can see below:
> > >
> > > Brick node1:/data/myvol-pro/brick
> > > /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
> > > gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360
> > > /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/le_dir
> > > gfid:aae4098a-1a71-4155-9cc9-e564b89957cf
> > > Status: Connected
> > > Number of entries: 4
> > >
> > > Brick node2:/data/myvol-pro/brick
> > > Status: Connected
> > > Number of entries: 0
> > >
> > > Brick node3:/srv/glusterfs/myvol-pro/brick
> > > /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
> > > /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/le_dir
> > > gfid:aae4098a-1a71-4155-9cc9-e564b89957cf
> > > gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360
> > > Status: Connected
> > > Number of entries: 4
> > >
> > > If I check "/data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/" with an "ls -l" on the client (gluster fuse mount) I get the following garbage:
> > >
> > > drwxr-xr-x 4 www-data www-data 4096 Nov 5 14:19 .
> > > drwxr-xr-x 31 www-data www-data 4096 Nov 5 14:23 ..
> > > d????????? ? ? ? ? ? le_dir
> > >
> > > I checked on the nodes and indeed node1 and node3 have the same directory from the time 14:19, but node2 has a directory from the time 14:12.
> > >
> > > Again here the self-heal daemon doesn't seem to be doing anything... What do you recommend I do in order to heal these unsynced files?
> > >
> > > ------- Original Message -------
> > > On Monday, November 5, 2018 2:42 AM, Ravishankar N ravishankar at redhat.com wrote:
> > >
> > > > On 11/03/2018 04:13 PM, mabi wrote:
> > > >
> > > > > Ravi (or anyone else who can help), I now have even more files which are pending for healing.
> > > > > If the count is increasing, there is likely a network (disconnect)
> > > > > problem between the gluster clients and the bricks that needs fixing.
> > > > >
> > > > > Here is the output of a "volume heal info summary":
> > > > >
> > > > > Brick node1:/data/myvol-private/brick
> > > > > Status: Connected
> > > > > Total Number of entries: 49845
> > > > > Number of entries in heal pending: 49845
> > > > > Number of entries in split-brain: 0
> > > > > Number of entries possibly healing: 0
> > > > >
> > > > > Brick node2:/data/myvol-private/brick
> > > > > Status: Connected
> > > > > Total Number of entries: 26644
> > > > > Number of entries in heal pending: 26644
> > > > > Number of entries in split-brain: 0
> > > > > Number of entries possibly healing: 0
> > > > >
> > > > > Brick node3:/srv/glusterfs/myvol-private/brick
> > > > > Status: Connected
> > > > > Total Number of entries: 0
> > > > > Number of entries in heal pending: 0
> > > > > Number of entries in split-brain: 0
> > > > > Number of entries possibly healing: 0
> > > > >
> > > > > Should I try to set the "cluster.data-self-heal" parameter of that volume to "off" as mentioned in the bug?
> > > > > Yes, as mentioned in the workaround in the thread that I shared.
> > > > >
> > > > > And by doing that, does it mean that my files pending heal are in danger of being lost?
> > > > > No.
> > > > >
> > > > > Also, is it dangerous to leave "cluster.data-self-heal" off?
> > > > > No. This is only disabling client-side data healing. The self-heal daemon
> > > > > would still heal the files.
> > > > > -Ravi
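For reference, the workaround discussed in the exchange above amounts to toggling a single volume option; a minimal sketch, using the volume name from this thread:

# Disable client-side data self-heal only; the self-heal daemon keeps healing.
gluster volume set myvol-private cluster.data-self-heal off

# Verify the current setting.
gluster volume get myvol-private cluster.data-self-heal

# Re-enable later, once the underlying bug is resolved.
gluster volume set myvol-private cluster.data-self-heal on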
> > > > >
> > > > > ------- Original Message -------
> > > > > On Saturday, November 3, 2018 1:31 AM, Ravishankar N ravishankar at redhat.com wrote:
> > > > >
> > > > > > Mabi,
> > > > > > If bug 1637953 is what you are experiencing, then you need to follow the
> > > > > > workarounds mentioned in
> > > > > > https://lists.gluster.org/pipermail/gluster-users/2018-October/035178.html.
> > > > > > Can you see if this works?
> > > > > > -Ravi
> > > > > >
> > > > > > On 11/02/2018 11:40 PM, mabi wrote:
> > > > > >
> > > > > > > I tried again to manually run a heal by using the "gluster volume heal" command because still no files have been healed, and noticed the following warning in the glusterd.log file:
> > > > > > >
> > > > > > > [2018-11-02 18:04:19.454702] I [MSGID: 106533] [glusterd-volume-ops.c:938:__glusterd_handle_cli_heal_volume] 0-management: Received heal vol req for volume myvol-private
> > > > > > > [2018-11-02 18:04:19.457311] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustershd: error returned while attempting to connect to host:(null), port:0
> > > > > > >
> > > > > > > It looks like glustershd can't connect to "host:(null)"; could that be the reason why there is no healing taking place? If yes, why do I see "host:(null)" here, and what needs fixing?
> > > > > > > This seems to have happened since I upgraded from 3.12.14 to 4.1.5.
> > > > > > > I would really appreciate some help here; I suspect this is an issue with GlusterFS 4.1.5.
> > > > > > > Thank you in advance for any feedback.
> > > > > > >
> > > > > > > ------- Original Message -------
> > > > > > > On Wednesday, October 31, 2018 11:13 AM, mabi mabi at protonmail.ch wrote:
> > > > > > >
> > > > > > > > Hello,
> > > > > > > > I have a GlusterFS 4.1.5 cluster with 3 nodes (including 1 arbiter) and currently have a volume with around 27174 files which are not being healed. The "volume heal info" command shows the same 27k files under the first node and the second node, but there is nothing under the 3rd node (arbiter).
> > > > > > > > I already tried running a "volume heal" but none of the files got healed.
> > > > > > > > In the glfsheal log file for that particular volume the only error I see is a few of these entries:
> > > > > > > >
> > > > > > > > [2018-10-31 10:06:41.524300] E [rpc-clnt.c:184:call_bail] 0-myvol-private-client-0: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x108b sent = 2018-10-31 09:36:41.314203. timeout = 1800 for 127.0.1.1:49152
> > > > > > > >
> > > > > > > > and then a few of these warnings:
> > > > > > > >
> > > > > > > > [2018-10-31 10:08:12.161498] W [dict.c:671:dict_ref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/cluster/replicate.so(+0x6734a) [0x7f2a6dff434a] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x5da84) [0x7f2a798e8a84] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_ref+0x58) [0x7f2a798a37f8] ) 0-dict: dict is NULL [Invalid argument]
> > > > > > > >
> > > > > > > > The glustershd.log file shows the following:
> > > > > > > >
> > > > > > > > [2018-10-31 10:10:52.502453] E [rpc-clnt.c:184:call_bail] 0-myvol-private-client-0: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0xaa398 sent = 2018-10-31 09:40:50.927816. timeout = 1800 for 127.0.1.1:49152
> > > > > > > > [2018-10-31 10:10:52.502502] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-myvol-private-client-0: remote operation failed [Transport endpoint is not connected]
> > > > > > > >
> > > > > > > > Any idea what could be wrong here?
> > > > > > > > Regards,
> > > > > > > > Mabi
> > > > > > > >
> > > > > > > > Gluster-users mailing list
> > > > > > > > Gluster-users at gluster.org
> > > > > > > > https://lists.gluster.org/mailman/listinfo/gluster-users
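The INODELK bail-outs quoted above, together with the earlier observation that rebooting node2 released all locks, suggest checking for locks stuck on the bricks. A rough sketch of one way to inspect them, assuming the default statedump location under /var/run/gluster (the path and dump file names can differ per distribution and configuration):

# Ask the brick processes of the volume to dump their state.
gluster volume statedump myvol-private

# On each node, look for inode locks that remain granted or blocked long
# after the call_bail timeouts shown in the logs above.
grep -iA4 inodelk /var/run/gluster/*.dump.*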
Ravishankar N
2018-Nov-08 10:05 UTC
[Gluster-users] Self-healing not healing 27k files on GlusterFS 4.1.5 3 nodes replica
On 11/08/2018 12:58 PM, mabi wrote:
> Dear Ravi,
>
> Thank you for your answer. I will start by sending you below the getfattr output from the first entry which does not get healed (it is in fact a directory). It is the following path/dir from the output of one of my previous mails: /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
>
> # NODE 1
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.myvol-pro-client-1=0x000000000000000300000003
> trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> # NODE 2
> trusted.gfid=0xd9ac192ce85e4402af105551f587ed9a
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> # NODE 3 (arbiter)
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.myvol-pro-client-1=0x000000000000000300000003
> trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> Notice here that node 2 does not seem to have any AFR attributes, which must be problematic. Also, that specific directory on node 2 has the oldest timestamp (14:12), whereas the same directory on nodes 1 and 3 has a timestamp of 14:19.
>
> I did run "volume heal myvol-pro" and on the console it shows:
>
> Launching heal operation to perform index self heal on volume myvol-pro has been successful
> Use heal info commands to check status.
>
> but then nothing new has been logged in the glustershd.log file of any of the 3 nodes.
>
> The log file cmd_history.log shows:
> [2018-11-08 07:20:24.481603] : volume heal myvol-pro : SUCCESS
>
> and glusterd.log:
> [2018-11-08 07:20:24.474032] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-glustershd: error returned while attempting to connect to host:(null), port:0

Not sure of this part.

> That's it... To me it looks like a split-brain, but GlusterFS does not report it as split-brain and no self-heal happens on it either.

It is not a split-brain. Nodes 1 and 3 have xattrs indicating a pending entry heal on node 2, so the heal should ideally have happened. Can you check a few things?

- Are there any disconnects between each of the shds and the brick processes (check via statedump or look for disconnect messages in glustershd.log)? Does restarting shd via a `volume start force` solve the problem?
- Is the symlink pointing to oc_dir present inside .glusterfs/25/e2 on all 3 bricks?

Regards,
Ravi
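For reference, the three checks suggested above could be run roughly as follows; a sketch only, reusing the volume name, brick paths and the node-1/node-3 gfid from earlier in the thread (the glustershd.log location shown is the usual /var/log/glusterfs path and may differ):

# 1. Look for disconnects between the self-heal daemon and the bricks, on each node.
grep -iE 'disconnect|connected to' /var/log/glusterfs/glustershd.log | tail -n 20

# 2. Restart the self-heal daemon (and any brick process that is down) without
#    taking the volume offline.
gluster volume start myvol-pro force

# 3. Check whether the gfid symlink for oc_dir exists on every brick; directories
#    are stored under .glusterfs as symlinks named after their gfid (taken here
#    from the node 1/3 getfattr output; node 2 reported a different trusted.gfid,
#    so the link may simply be absent there).
ls -l /data/myvol-pro/brick/.glusterfs/25/e2/25e2616b-4fb6-4b2a-8945-1afc956fff19          # nodes 1 and 2
ls -l /srv/glusterfs/myvol-pro/brick/.glusterfs/25/e2/25e2616b-4fb6-4b2a-8945-1afc956fff19  # node 3 (arbiter)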