mabi
2018-Nov-14 09:49 UTC
[Gluster-users] Self-healing not healing 27k files on GlusterFS 4.1.5 3 nodes replica
------- Original Message -------
On Wednesday, November 14, 2018 5:34 AM, Ravishankar N <ravishankar at redhat.com> wrote:

> I thought it was missing which is why I asked you to create it. The
> trusted.gfid xattr for any given file or directory must be the same in all 3
> bricks. But it looks like that isn't the case. Are the gfids and the
> symlinks for all the dirs leading to the parent dir of oc_dir the same on
> all nodes? (i.e. every directory in
> /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/)?

I have now checked the GFIDs of all directories leading back down to the parent dir (13 directories in total). On node 1 and node 3 the GFIDs of all underlying directories match each other. On node 2 they are also all the same except for the two highest directories (".../dir11" and ".../dir11/oc_dir"). It's exactly these two directories which are also listed in the "volume heal info" output under node 1 and node 2 and which do not get healed.

For your reference I have pasted below the GFIDs for all underlying directories up to the parent directory and for all 3 nodes. I start at the top with the highest directory and at the bottom of the list is the parent directory (/data).

# NODE 1

trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19 # /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
trusted.gfid=0x70c894ca422b4bceacf15cfb4669abbd # /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11
trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269 # ...
trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
trusted.gfid=0xf120657977274247900db4e9cc8129dd
trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
trusted.gfid=0x2174086880fc4fd19b187d1384300add
trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01 # ...
trusted.gfid=0xa7d78519db61459399e01fad2badf3fb # /data/dir1/dir2
trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4 # /data/dir1
trusted.gfid=0x2683990126724adbb6416b911180e62b # /data

# NODE 2

trusted.gfid=0xd9ac192ce85e4402af105551f587ed9a
trusted.gfid=0x10ec1eb1c8544ff2a36c325681713093
trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269
trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
trusted.gfid=0xf120657977274247900db4e9cc8129dd
trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
trusted.gfid=0x2174086880fc4fd19b187d1384300add
trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01
trusted.gfid=0xa7d78519db61459399e01fad2badf3fb
trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4
trusted.gfid=0x2683990126724adbb6416b911180e62b

# NODE 3

trusted.gfid=0x25e2616b4fb64b2a89451afc956fff19
trusted.gfid=0x70c894ca422b4bceacf15cfb4669abbd
trusted.gfid=0x7d7d2165f4804edf8c93de01c8768269
trusted.gfid=0xdbc0bfa0a052405ca3fad2d1ca137f82
trusted.gfid=0xbb75051c24ba4c119351bef938c55ad4
trusted.gfid=0x0002ad0c3fbe4806a75f8e68304f5b94
trusted.gfid=0xf120657977274247900db4e9cc8129dd
trusted.gfid=0x8afeb00bb1e74cbab932acea705b7dd9
trusted.gfid=0x2174086880fc4fd19b187d1384300add
trusted.gfid=0x2057e87cf4cc43f9bbad160cbec43d01
trusted.gfid=0xa7d78519db61459399e01fad2badf3fb
trusted.gfid=0xfaa0ed7ccaf84f6c8bdb20a7f657c4b4
trusted.gfid=0x2683990126724adbb6416b911180e62b

> Let us see if the parents' gfids are the same before deleting anything.
> Is the heal info still showing 4 entries? Please also share the getfattr
> output of the parent directory (i.e. dir11).

Yes, the heal info still shows the 4 entries, but on node 1 the directory name is no longer shown, just the GFID. This is the actual output of a "volume heal info":

Brick node1:/data/myvol-pro/brick
<gfid:25e2616b-4fb6-4b2a-8945-1afc956fff19>
<gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360>
<gfid:70c894ca-422b-4bce-acf1-5cfb4669abbd>
<gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>
Status: Connected
Number of entries: 4

Brick node2:/data/myvol-pro/brick
Status: Connected
Number of entries: 0

Brick node3:/srv/glusterfs/myvol-pro/brick
/data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11
/data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir
<gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>
<gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360>
Status: Connected
Number of entries: 4

What are the next steps in order to fix that?
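The per-node comparison described in this message can also be done mechanically. The following is a minimal sketch, not part of the original thread: it takes the GFIDs pasted above (only the three deepest levels per node, for brevity), reports the levels at which the bricks disagree, and names the node that deviates from the majority. The names `GFIDS`, `find_mismatches` and `odd_nodes` are hypothetical.

```python
from collections import Counter

# GFIDs copied from the post, deepest directory first (oc_dir, dir11, then
# the next parent). Only the first three levels are included for brevity.
GFIDS = {
    "node1": ["0x25e2616b4fb64b2a89451afc956fff19",
              "0x70c894ca422b4bceacf15cfb4669abbd",
              "0x7d7d2165f4804edf8c93de01c8768269"],
    "node2": ["0xd9ac192ce85e4402af105551f587ed9a",
              "0x10ec1eb1c8544ff2a36c325681713093",
              "0x7d7d2165f4804edf8c93de01c8768269"],
    "node3": ["0x25e2616b4fb64b2a89451afc956fff19",
              "0x70c894ca422b4bceacf15cfb4669abbd",
              "0x7d7d2165f4804edf8c93de01c8768269"],
}


def find_mismatches(gfids_by_node):
    """Return {level: {node: gfid}} for every level where nodes disagree."""
    mismatches = {}
    for level, values in enumerate(zip(*gfids_by_node.values())):
        if len(set(values)) > 1:
            mismatches[level] = dict(zip(gfids_by_node, values))
    return mismatches


def odd_nodes(per_node):
    """Return the nodes whose GFID differs from the most common value."""
    majority, _ = Counter(per_node.values()).most_common(1)[0]
    return sorted(node for node, gfid in per_node.items() if gfid != majority)


if __name__ == "__main__":
    for level, per_node in find_mismatches(GFIDS).items():
        print(level, odd_nodes(per_node))
```

Level 0 is the deepest directory (oc_dir) and level 1 is its parent (dir11), matching the order of the lists above. The majority vote is only a convenience for spotting the outlier; it does not by itself decide which copy is the good one.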
Ravishankar N
2018-Nov-15 04:57 UTC
[Gluster-users] Self-healing not healing 27k files on GlusterFS 4.1.5 3 nodes replica
On 11/14/2018 03:19 PM, mabi wrote:
> What are the next steps in order to fix that?

1. Could you provide the getfattr output of the following 3 dirs from all 3 nodes?
   i)   /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10
   ii)  /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/
   iii) /data/dir1/dir2/dir3/dir4/dir5/dir6/dir7/dir8/dir9/dir10/dir11/oc_dir

2. Do you know the file (or directory) names corresponding to the other 2 gfids in the heal info output, i.e.
   <gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>
   <gfid:3c92459b-8fa1-4669-9a3d-b38b8d41c360>
   Please share the getfattr output of them as well.

Regards,
Ravi
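For the second question, a gfid from the heal info output can be mapped to its entry on a brick. The sketch below is an illustration and not something posted in the thread: it assumes the usual brick layout, in which each gfid has an entry at `.glusterfs/<first two hex digits>/<next two>/<gfid>` under the brick root. The function names are hypothetical; the brick path and gfids are taken from the output quoted earlier in the thread.

```python
import uuid
from pathlib import PurePosixPath


def xattr_to_uuid(hex_value):
    """Convert a getfattr hex value (0x...) to the dashed form heal info prints."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    return str(uuid.UUID(hex=digits))


def heal_entry_to_gfid(entry):
    """Extract the dashed gfid from a heal info line such as '<gfid:...>'."""
    entry = entry.strip()
    if entry.startswith("<gfid:") and entry.endswith(">"):
        return entry[len("<gfid:"):-1]
    raise ValueError(f"not a gfid entry: {entry!r}")


def backend_path(brick, gfid):
    """Build the assumed .glusterfs path of a gfid under a brick root."""
    return PurePosixPath(brick) / ".glusterfs" / gfid[:2] / gfid[2:4] / gfid


if __name__ == "__main__":
    gfid = heal_entry_to_gfid("<gfid:aae4098a-1a71-4155-9cc9-e564b89957cf>")
    print(backend_path("/data/myvol-pro/brick", gfid))
    print(xattr_to_uuid("0x25e2616b4fb64b2a89451afc956fff19"))
```

On a brick, the entry for a directory is normally a symlink, so `ls -l` on that path shows the directory name. For a regular file it is a hard link, so something like `find <brick> -samefile <path>` should reveal the real name. Treat both as hints to verify on the actual bricks before changing anything.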