Ravishankar N
2021-Oct-30 11:36 UTC
[Gluster-users] GlusterFS 9.3 - Replicate Volume (2 Bricks / 1 Arbiter) - Self-healing does not always work
On Fri, Oct 29, 2021 at 12:28 PM Thorsten Walk <darkiop at gmail.com> wrote:

> After a certain time it always reaches a state where there are files in
> the GFS that cannot be healed (in the example below:
> <gfid:26c5396c-86ff-408d-9cda-106acd2b0768>).
>
> Currently I have the GlusterFS volume in test mode with only 1-2 VMs
> running on it. So far there are no negative effects. Replication and
> self-heal basically work; only now and then something remains that
> cannot be healed.
>
> Does anyone have an idea how to prevent or heal this? I have already
> completely rebuilt the volume, incl. partitions and glusterd, to rule
> out leftovers from the old setup.
>
> If you need more information, please contact me.

The next time this occurs, can you check whether disabling `cluster.eager-lock` helps heal the file? Also share the xattrs output from all 3 bricks for the file or its gfid, e.g.:

getfattr -d -m. -e hex /brick-path/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768

Regards,
Ravi
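A minimal sketch of the suggested check (<VOLNAME> is a placeholder for the actual volume name, and the brick path must be adjusted for each node):

# Disable eager locking (reversible later with "on"):
gluster volume set <VOLNAME> cluster.eager-lock off

# Trigger a heal and re-check the list of entries pending heal:
gluster volume heal <VOLNAME>
gluster volume heal <VOLNAME> info

# Then, on each of the three bricks, dump the AFR xattrs for the gfid:
getfattr -d -m. -e hex /brick-path/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768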
Thorsten Walk
2021-Oct-30 13:13 UTC
[Gluster-users] GlusterFS 9.3 - Replicate Volume (2 Bricks / 1 Arbiter) - Self-healing does not always work
Hi Ravi & Strahil, thanks a lot for your answers!

The file in the path .glusterfs/26/c5/.. only exists on node1 (=pve01). On node2 (pve02) and the arbiter (freya), the file does not exist:

[14:35:48] [ssh:root@pve01(192.168.1.50): ~ (700)]
# getfattr -d -m. -e hex /data/glusterfs/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768
getfattr: Removing leading '/' from absolute path names
# file: data/glusterfs/.glusterfs/26/c5/26c5396c-86ff-408d-9cda-106acd2b0768
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.glusterfs-1-volume-client-1=0x000000010000000100000000
trusted.afr.glusterfs-1-volume-client-2=0x000000010000000100000000
trusted.gfid=0x26c5396c86ff408d9cda106acd2b0768
trusted.glusterfs.mdata=0x01000000000000000000000000617880a3000000003b2f011700000000617880a3000000003b2f011700000000617880a3000000003983a635

[14:36:49] [ssh:root@pve02(192.168.1.51): /data/glusterfs/.glusterfs/26/c5 (700)]
# ll
drwx------ root root   6B  3 days ago   ./
drwx------ root root 8.0K  6 hours ago  ../

[14:36:58] [ssh:root@freya(192.168.1.40): /data/glusterfs/.glusterfs/26/c5 (700)]
# ll
drwx------ root root   6B  3 days ago   ./
drwx------ root root 8.0K  3 hours ago  ../

After this, I disabled the option you mentioned:

gluster volume set glusterfs-1-volume cluster.eager-lock off

After that I started another healing process manually, unfortunately without success.

@Strahil: For your idea with https://docs.gluster.org/en/latest/Troubleshooting/gfid-to-path/ I need more time; maybe I can try it tomorrow. I'll be in touch.

Thanks again and best regards,
Thorsten
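For the gfid-to-path step from the linked doc, a minimal sketch of its aux-gfid-mount method (the hostname pve01 and the mountpoint /mnt/gfid-resolve are assumptions; the volume name is taken from above):

# Mount the volume with gfid access enabled (assumed host and mountpoint):
mount -t glusterfs -o aux-gfid-mount pve01:/glusterfs-1-volume /mnt/gfid-resolve

# Resolve the gfid to its path(s) on the backend bricks:
getfattr -n trusted.glusterfs.pathinfo -e text /mnt/gfid-resolve/.gfid/26c5396c-86ff-408d-9cda-106acd2b0768

Since the file currently exists only on pve01's brick, the pathinfo output may list only that brick.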