Olaf Buitelaar
2021-Feb-22 11:52 UTC
[Gluster-users] brick process crashes on "Structure needs cleaning"
Dear Users,

Somehow the brick processes seem to crash on XFS filesystem errors, and whether this happens seems to depend on how the gluster process is started. When it occurs, gluster also sends a message to the console announcing that the process will go down, although it doesn't actually seem to go down:

M [MSGID: 113075] [posix-helpers.c:2185:posix_health_check_thread_proc] 0-ovirt-engine-posix: health-check failed, going down
M [MSGID: 113075] [posix-helpers.c:2203:posix_health_check_thread_proc] 0-ovirt-engine-posix: still alive! -> SIGTERM

In the brick log a message like this is logged:

[posix-helpers.c:2111:posix_fs_health_check] 0-ovirt-data-posix: aio_read_cmp_buf() on /data5/gfs/bricks/brick1/ovirt-data/.glusterfs/health_check returned ret is -1 error is Structure needs cleaning

or like this:

W [MSGID: 113075] [posix-helpers.c:2111:posix_fs_health_check] 0-ovirt-mon-2-posix: aio_read_buf() on /data0/gfs/bricks/bricka/ovirt-mon-2/.glusterfs/health_check returned ret is -1 error is Success

When I check the actual file, it just seems to contain a timestamp:

cat /data0/gfs/bricks/bricka/ovirt-mon-2/.glusterfs/health_check
2021-01-28 09:08:01

I don't see any errors in dmesg about problems accessing it. When I unmount the filesystem and run xfs_repair on it, no errors or issues are reported. When I mount the filesystem again, it is reported as a clean mount:

[2478552.169540] XFS (dm-23): Mounting V5 Filesystem
[2478552.180645] XFS (dm-23): Ending clean mount

When I kill the brick process and start it with "gluster v start x force", the issue seems much less likely to occur. But after a fresh reboot, or when I kill the process and let glusterd start it (e.g. service glusterd start), the error appears after a couple of minutes.

I am using LVM cache (in write-through mode); maybe that's related. The disks themselves are backed by a hardware RAID controller, and I did inspect all disks for SMART errors.
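To test the filesystem independently of Gluster, a probe along these lines could be run by hand. This is only a rough sketch. It mimics the write-a-timestamp-then-read-it-back pattern that the log messages suggest, but it is not Gluster's actual health-check code. The function name probe_directory and the scratch file name are made up for illustration. It also uses ordinary buffered I/O rather than AIO, so it may well not reproduce the failure.

```python
import errno
import os
import sys
import tempfile
import time

# EUCLEAN is the errno behind "Structure needs cleaning" on Linux (value 117).
EUCLEAN = getattr(errno, "EUCLEAN", 117)


def probe_directory(directory, name=".manual_health_probe"):
    """Write a timestamp into `directory`, read it back, and compare.

    Returns (ok, detail). Hypothetical helper, not part of Gluster.
    """
    path = os.path.join(directory, name)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S").encode()
    try:
        fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o600)
        try:
            os.write(fd, stamp)
            os.fsync(fd)                      # push the write down to the device
            os.lseek(fd, 0, os.SEEK_SET)
            data = os.read(fd, len(stamp) + 1)
        finally:
            os.close(fd)
        os.unlink(path)                       # clean up the scratch file
    except OSError as exc:
        hint = " (EUCLEAN)" if exc.errno == EUCLEAN else ""
        return False, f"errno {exc.errno}{hint}: {exc.strerror}"
    if data != stamp:
        return False, f"read back {data!r}, expected {stamp!r}"
    return True, "ok"


if __name__ == "__main__":
    # Pass a scratch directory on the suspect filesystem as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    print(probe_directory(target))
```

It is probably safer to point this at a scratch directory on the same XFS filesystem than at the brick directory itself, and to run it in a loop for a few minutes, since the error only shows up after a while.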
Does anybody have experience with this, or a clue about what might be causing it?

Thanks,
Olaf