> - Was this an upgraded setup or a fresh v9.0 install?

It was a fresh install from the 8.3 CentOS RPMs, later upgraded to 9.0. I enabled granular after the upgrade.

> - When there are entries yet to be healed, the CLI should
> have prevented you toggling this option - was that not the
> case?

Indeed, enabling granular was only possible when there were 0 files to heal. Re-disabling it, however, did not impose this limitation.

> - Can you find the directory name corresponding to the gfid
> 011fcc1b-4d90-4c36-86ec-488aaa4db3b8 (use
> https://github.com/gluster/glusterfs/blob/master/extras/gfid-to-dirname.sh
> if needed) and see if all files/ sub directories (first level
> only) inside it are same on all 3 bricks?

Nifty little script!

[root@node03 ~]# ./gfid2dirname.sh /gfs/gv0 011fcc1b-4d90-4c36-86ec-488aaa4db3b8
Location of the directory corresponding to gfid:011fcc1b-4d90-4c36-86ec-488aaa4db3b8 is
/gfs/gv0/vmail/net/provocation/oracle/Maildir/.Sent/cur/

I get the same answer on all three nodes. This directory contains no subdirectories, only files.

[root@node01 ~]# find /gfs/gv0/vmail/net/provocation/oracle/Maildir/.Sent/cur/ -type f |wc -l
10264

[root@node02 ~]# find /gfs/gv0/vmail/net/provocation/oracle/Maildir/.Sent/cur/ -type f |wc -l
10604

[root@node03 ~]# find /gfs/gv0/vmail/net/provocation/oracle/Maildir/.Sent/cur/ -type f |wc -l
10603

The figures don't fully add up to 4/343/344, but are very close. Nothing is in split-brain, so it simply looks like node01 is lagging behind the other two.
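The counts above only show how many files differ, not which ones. A minimal sketch of how the per-brick listings could be compared by name follows. The helper names `list_names` and `only_in_second` and the /tmp file names are made up for illustration, and it assumes file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helpers, not part of GlusterFS.

# Print the names of regular files directly inside directory $1, sorted.
list_names() {
    find "$1" -maxdepth 1 -type f | sed 's|.*/||' | LC_ALL=C sort
}

# Print lines that appear in sorted listing $2 but not in sorted listing $1.
only_in_second() {
    LC_ALL=C comm -13 "$1" "$2"
}

# Possible usage (paths are examples; the /tmp names are assumptions):
#   on each node:
#     list_names /gfs/gv0/vmail/net/provocation/oracle/Maildir/.Sent/cur/ \
#         > /tmp/cur.$(hostname -s).txt
#   after copying the listings to one machine:
#     only_in_second /tmp/cur.node01.txt /tmp/cur.node02.txt
```

The last command would print the files that node02 has and node01 lacks. Both listings are sorted under the same locale (`LC_ALL=C`) because `comm` expects its inputs in matching sort order.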
On 15/03/21 5:11 pm, Zenon Panoussis wrote:
>
> Indeed, enabling granular was only possible when there were
> 0 files to heal. Re-disabling it, however, did not impose this
> limitation.

Ah yes, this is expected behavior because even if we disable it, there should be enough information to do the entry heal in the non-granular way.

>
> I get the same answer on all three nodes. This directory contains
> no subdirectories, only files.

Hmm, then the client4_0_mkdir_cbk failures in the glustershd.log must be for a parallel heal of a directory which contains subdirs.

>
> [root@node01 ~]# find /gfs/gv0/vmail/net/provocation/oracle/Maildir/.Sent/cur/ -type f |wc -l
> 10264
>
> [root@node02 ~]# find /gfs/gv0/vmail/net/provocation/oracle/Maildir/.Sent/cur/ -type f |wc -l
> 10604
>
> [root@node03 ~]# find /gfs/gv0/vmail/net/provocation/oracle/Maildir/.Sent/cur/ -type f |wc -l
> 10603
>
> The figures don't fully add up to 4/343/344, but are very close.
> Nothing is in split-brain, so it simply looks like node01 is
> lagging behind the other two.
>

Are there any file names inside /gfs/gv0/.glusterfs/indices/entry-changes/011fcc1b-4d90-4c36-86ec-488aaa4db3b8 in any of the bricks? If this heal backlog was introduced when granular-entry-heal was enabled, it must contain the list of files that need to be healed.
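The check asked for above can be run by hand with `ls` on each brick. As a rough sketch, it could also be wrapped as below. The function name `pending_entries` is made up for illustration, and treating a missing index directory as "nothing recorded" is an assumption of this sketch, not documented GlusterFS behaviour.

```shell
#!/bin/sh
# Hypothetical helper, not part of GlusterFS.
# List the names recorded under a brick's entry-changes index for one
# directory gfid. $1 is the brick root, $2 is the gfid.
pending_entries() {
    idx="$1/.glusterfs/indices/entry-changes/$2"
    # Assumption: no index directory means nothing is recorded for this gfid.
    [ -d "$idx" ] || return 0
    ls -A "$idx"
}

# Possible usage on each node:
#   pending_entries /gfs/gv0 011fcc1b-4d90-4c36-86ec-488aaa4db3b8
#   pending_entries /gfs/gv0 011fcc1b-4d90-4c36-86ec-488aaa4db3b8 | wc -l
```

Running it on all three bricks and comparing the output against the files missing on node01 would show whether the index accounts for the whole backlog.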