On 15/03/21 3:39 pm, Zenon Panoussis wrote:
> Does anyone know what healing error 22 "invalid argument" is
> and how to fix it, or at least how to troubleshoot it?
>
> while true; do date; gluster volume heal gv0 statistics heal-count; echo -e "--------------\n"; sleep 297; done
>
> Fri Mar 12 14:58:36 CET 2021
> Gathering count of entries to be healed on volume gv0 has been successful
>
> Brick node01:/gfs/gv0
> Number of entries: 4
>
> Brick node02:/gfs/gv0
> Number of entries: 343
>
> Brick node03:/gfs/gv0
> Number of entries: 344
> --------------
>
> Three days later...
>
> Mon Mar 15 10:57:23 CET 2021
> Gathering count of entries to be healed on volume gv0 has been successful
>
> Brick node01:/gfs/gv0
> Number of entries: 4
>
> Brick node02:/gfs/gv0
> Number of entries: 343
>
> Brick node03:/gfs/gv0
> Number of entries: 344
> --------------
>
> glustershd.log is full of entries like these:
>
> [2021-03-15 05:38:01.991945 +0000] I [MSGID: 108026] [afr-self-heal-entry.c:1053:afr_selfheal_entry_do] 0-gv0-replicate-0: performing entry selfheal on 011fcc1b-4d90-4c36-86ec-488aaa4db3b8
> [2021-03-15 05:59:02.812770 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-2: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2021-03-15 05:59:02.813933 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-1: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2021-03-15 05:59:03.061068 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-gv0-client-0: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2021-03-15 05:59:05.547156 +0000] I [MSGID: 108026] [afr-self-heal-entry.c:1053:afr_selfheal_entry_do] 0-gv0-replicate-0: performing entry selfheal on 1a5121eb-a90b-4b23-92ba-f277124cb82a
>
> That is, it starts healing an object, fails a few times, moves on
> to the next, fails on it too, and so on ad infinitum. The volume
> is a replica 3, gluster is v9.0, and all three bricks are up and
> connected.
>
> This behaviour started shortly after I enabled granular-entry-heal.
> Whether that has anything to do with the problem or not, I don't
> know. Switching granular-entry-heal back to disabled did not help.
- Was this an upgraded setup or a fresh v9.0 install? Asking because v9.0
has granular-entry-heal on by default for new volumes.
- When there are entries yet to be healed, the CLI should have prevented
you from toggling this option - was that not the case?
- Can you find the directory name corresponding to the gfid
011fcc1b-4d90-4c36-86ec-488aaa4db3b8 (use
https://github.com/gluster/glusterfs/blob/master/extras/gfid-to-dirname.sh
if needed) and check whether all files/sub-directories (first level only)
inside it are the same on all 3 bricks? A rough sketch of the commands is
below.
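
Something along these lines should do (a rough sketch, not tested on your
setup; it assumes the brick path is /gfs/gv0 as in your heal-count output,
that the gfid refers to a directory, and that the volume option key is
cluster.granular-entry-heal):

  # Check the current granular-entry-heal setting (option key assumed):
  gluster volume get gv0 cluster.granular-entry-heal

  # On one brick, resolve the gfid. For a directory, the .glusterfs entry
  # is a symlink whose target shows the parent gfid and the directory name:
  GFID=011fcc1b-4d90-4c36-86ec-488aaa4db3b8
  ls -l /gfs/gv0/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID

  # Once the full path is known, list the first-level entries on each node:
  ls -1a /gfs/gv0/<resolved/path> | sort > /tmp/$(hostname).list

  # Copy the three lists to one node and compare them, e.g.:
  diff node01.list node02.list

If the lists differ, the names present on one brick but missing on another
would be the first things to look at.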