thr3ads.net - Gluster users - [Gluster-users] Glusterfs 10.5-1 healing issues [Apr 2024]

If this information is useful, please help other people find it:
Share via:

Ilias Chasapakis forumZFD

2024-Apr-09 08:05 UTC

[Gluster-users] Glusterfs 10.5-1 healing issues

Dear all,

we would like to describe the situation that we have and that does not 
solve since a long time, that means after many minor
and major upgrades of GlusterFS

We use a KVM environment for VMs for glusterfs and host servers are 
updated regularly. Hosts are disomogeneous hardware,
but configured with same characteristics.

The VMs have been also harmonized to use the virtio drivers where 
available for devices and resources reserved are the same
on each host.

Physical switch for hosts has been substituted with a reliable one.

Probing peers has been and is quite quick in the heartbeat network and 
communication between the servers for apparently has no issues on 
disruptions.

And I say apparently because what we have is:

- always pending failed heals that used to resolve by a rotated reboot 
of the gluster vms (replica 3). Restarting only
glusterfs related services (daemon, events etc.) has no effect, only 
reboot brings results
- very often failed heals are directories

We lately removed a brick that was on a vm on a host that has been 
entirely substituted. Re-added the brick, sync went on and
all data was eventually synced and started with 0 pending failed heals. 
Now it develops failed heals too like its fellow
bricks. Please take into account we healed all the failed entries 
(manually with various methods) before adding the third brick.

After some days of operating, the count of failed heals rises again, not 
really fast but with new entries for sure (which might solve
with rotated reboots, or not).

We have gluster clients also on ctdbs that connect to the gluster and 
mount via glusterfs client. Windows roaming profiles shared via smb 
become frequently corrupted,(they are composed of a great number small 
files and are though of big total dimension). Gluster nodes are 
formatted with xfs.

Also what we observer is that mounting with the vfs option in smb on the 
ctdbs has some kind of delay. This means that you can see the shared 
folder on for example
a Windows client machine on a ctdb, but not on another ctdb in the 
cluster and then after a while it appears there too. And this frequently st


This is an excerpt of entries on our shd logs:
> 2024-04-08 10:13:26.213596 +0000] I [MSGID: 108026] 
> [afr-self-heal-entry.c:1080:afr_selfheal_entry_do] 
> 0-gv-ho-replicate-0: performing full entry selfheal on 
> 2c621415-6223-4b66-a4ca-3f6f267a448d
> [2024-04-08 10:14:08.135911 +0000] W [MSGID: 114031] 
> [client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: 
> remote operation failed. 
> [{source=<gfid:91d83f0e-1864-4ff3-9174-b7c956e20596>}, 
> {target=(null)}, {errno=116}, {error=Veraltete Dateizugriffsnummer 
> (file handle)}]
> [2024-04-08 10:15:59.135908 +0000] W [MSGID: 114061] 
> [client-common.c:2992:client_pre_readdir_v2] 0-gv-ho-client-5: 
> remote_fd is -1. EBADFD [{gfid=6b5e599e-c836-4ebe-b16a-8224425b88c7}, 
> {errno=77}, {error=Die Dateizugriffsnummer ist in schlechter Verfassung}]
> [2024-04-08 10:30:25.013592 +0000] I [MSGID: 108026] 
> [afr-self-heal-entry.c:1080:afr_selfheal_entry_do] 
> 0-gv-ho-replicate-0: performing full entry selfheal on 
> 24e82e12-5512-4679-9eb3-8bd098367db7
> [2024-04-08 10:33:17.613594 +0000] W [MSGID: 114031] 
> [client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: 
> remote operation failed. 
> [{source=<gfid:ef9068fc-a329-4a21-88d2-265ecd3d208c>}, 
> {target=(null)}, {errno=116}, {error=Veraltete Dateizugriffsnummer 
> (file handle)}]
> [2024-04-08 10:33:21.201359 +0000] W [MSGID: 114031] 
> [client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: 
> remote operation failed. [{sourceHow are he clients mapped to real hosts in order to know on which one?s 
logs to look at?

We would like to go by exclusion to finally eradicate this, possibly in 
a conservative way (not rebuilding everything) and we

are becoming clueless as to where to look at as we also tried various 
options settings regarding performance etc.

Here is the set on our main volume:
> cluster.lookup-unhashed????????????????? on (DEFAULT)
> cluster.lookup-optimize????????????????? on (DEFAULT)
> cluster.min-free-disk??????????????????? 10% (DEFAULT)
> cluster.min-free-inodes????????????????? 5% (DEFAULT)
> cluster.rebalance-stats????????????????? off (DEFAULT)
> cluster.subvols-per-directory??????????? (null) (DEFAULT)
> cluster.readdir-optimize???????????????? off (DEFAULT)
> cluster.rsync-hash-regex???????????????? (null) (DEFAULT)
> cluster.extra-hash-regex???????????????? (null) (DEFAULT)
> cluster.dht-xattr-name?????????????????? trusted.glusterfs.dht (DEFAULT)
> cluster.randomize-hash-range-by-gfid???? off (DEFAULT)
> cluster.rebal-throttle?????????????????? normal (DEFAULT)
> cluster.lock-migration off
> cluster.force-migration off
> cluster.local-volume-name??????????????? (null) (DEFAULT)
> cluster.weighted-rebalance?????????????? on (DEFAULT)
> cluster.switch-pattern?????????????????? (null) (DEFAULT)
> cluster.entry-change-log???????????????? on (DEFAULT)
> cluster.read-subvolume?????????????????? (null) (DEFAULT)
> cluster.read-subvolume-index???????????? -1 (DEFAULT)
> cluster.read-hash-mode?????????????????? 1 (DEFAULT)
> cluster.background-self-heal-count?????? 8 (DEFAULT)
> cluster.metadata-self-heal on
> cluster.data-self-heal on
> cluster.entry-self-heal on
> cluster.self-heal-daemon enable
> cluster.heal-timeout???????????????????? 600 (DEFAULT)
> cluster.self-heal-window-size??????????? 8 (DEFAULT)
> cluster.data-change-log????????????????? on (DEFAULT)
> cluster.metadata-change-log????????????? on (DEFAULT)
> cluster.data-self-heal-algorithm???????? (null) (DEFAULT)
> cluster.eager-lock?????????????????????? on (DEFAULT)
> disperse.eager-lock????????????????????? on (DEFAULT)
> disperse.other-eager-lock??????????????? on (DEFAULT)
> disperse.eager-lock-timeout????????????? 1 (DEFAULT)
> disperse.other-eager-lock-timeout??????? 1 (DEFAULT)
> cluster.quorum-type auto
> cluster.quorum-count 2
> cluster.choose-local???????????????????? true (DEFAULT)
> cluster.self-heal-readdir-size?????????? 1KB (DEFAULT)
> cluster.post-op-delay-secs?????????????? 1 (DEFAULT)
> cluster.ensure-durability??????????????? on (DEFAULT)
> cluster.consistent-metadata????????????? no (DEFAULT)
> cluster.heal-wait-queue-length?????????? 128 (DEFAULT)
> cluster.favorite-child-policy none
> cluster.full-lock??????????????????????? yes (DEFAULT)
> cluster.optimistic-change-log??????????? on (DEFAULT)
> diagnostics.latency-measurement off
> diagnostics.dump-fd-stats??????????????? off (DEFAULT)
> diagnostics.count-fop-hits off
> diagnostics.brick-log-level INFO
> diagnostics.client-log-level INFO
> diagnostics.brick-sys-log-level????????? CRITICAL (DEFAULT)
> diagnostics.client-sys-log-level???????? CRITICAL (DEFAULT)
> diagnostics.brick-logger???????????????? (null) (DEFAULT)
> diagnostics.client-logger??????????????? (null) (DEFAULT)
> diagnostics.brick-log-format???????????? (null) (DEFAULT)
> diagnostics.client-log-format??????????? (null) (DEFAULT)
> diagnostics.brick-log-buf-size?????????? 5 (DEFAULT)
> diagnostics.client-log-buf-size????????? 5 (DEFAULT)
> diagnostics.brick-log-flush-timeout????? 120 (DEFAULT)
> diagnostics.client-log-flush-timeout???? 120 (DEFAULT)
> diagnostics.stats-dump-interval????????? 0 (DEFAULT)
> diagnostics.fop-sample-interval????????? 0 (DEFAULT)
> diagnostics.stats-dump-format??????????? json (DEFAULT)
> diagnostics.fop-sample-buf-size????????? 65535 (DEFAULT)
> diagnostics.stats-dnscache-ttl-sec?????? 86400 (DEFAULT)
> performance.cache-max-file-size 10
> performance.cache-min-file-size????????? 0 (DEFAULT)
> performance.cache-refresh-timeout??????? 1 (DEFAULT)
> performance.cache-priority (DEFAULT)
> performance.io-cache-size??????????????? 32MB (DEFAULT)
> performance.cache-size?????????????????? 32MB (DEFAULT)
> performance.io-thread-count????????????? 16 (DEFAULT)
> performance.high-prio-threads??????????? 16 (DEFAULT)
> performance.normal-prio-threads????????? 16 (DEFAULT)
> performance.low-prio-threads???????????? 16 (DEFAULT)
> performance.least-prio-threads?????????? 1 (DEFAULT)
> performance.enable-least-priority??????? on (DEFAULT)
> performance.iot-watchdog-secs??????????? (null) (DEFAULT)
> performance.iot-cleanup-disconnected-reqs off (DEFAULT)
> performance.iot-pass-through???????????? false (DEFAULT)
> performance.io-cache-pass-through??????? false (DEFAULT)
> performance.quick-read-cache-size??????? 128MB (DEFAULT)
> performance.cache-size?????????????????? 128MB (DEFAULT)
> performance.quick-read-cache-timeout???? 1 (DEFAULT)
> performance.qr-cache-timeout 600
> performance.quick-read-cache-invalidation false (DEFAULT)
> performance.ctime-invalidation?????????? false (DEFAULT)
> performance.flush-behind???????????????? on (DEFAULT)
> performance.nfs.flush-behind???????????? on (DEFAULT)
> performance.write-behind-window-size 4MB
> performance.resync-failed-syncs-after-fsync off (DEFAULT)
> performance.nfs.write-behind-window-size 1MB (DEFAULT)
> performance.strict-o-direct????????????? off (DEFAULT)
> performance.nfs.strict-o-direct????????? off (DEFAULT)
> performance.strict-write-ordering??????? off (DEFAULT)
> performance.nfs.strict-write-ordering??? off (DEFAULT)
> performance.write-behind-trickling-writes on (DEFAULT)
> performance.aggregate-size?????????????? 128KB (DEFAULT)
> performance.nfs.write-behind-trickling-writes on (DEFAULT)
> performance.lazy-open??????????????????? yes (DEFAULT)
> performance.read-after-open????????????? yes (DEFAULT)
> performance.open-behind-pass-through???? false (DEFAULT)
> performance.read-ahead-page-count??????? 4 (DEFAULT)
> performance.read-ahead-pass-through????? false (DEFAULT)
> performance.readdir-ahead-pass-through?? false (DEFAULT)
> performance.md-cache-pass-through??????? false (DEFAULT)
> performance.write-behind-pass-through??? false (DEFAULT)
> performance.md-cache-timeout 600
> performance.cache-swift-metadata???????? false (DEFAULT)
> performance.cache-samba-metadata on
> performance.cache-capability-xattrs????? true (DEFAULT)
> performance.cache-ima-xattrs???????????? true (DEFAULT)
> performance.md-cache-statfs????????????? off (DEFAULT)
> performance.xattr-cache-list (DEFAULT)
> performance.nl-cache-pass-through??????? false (DEFAULT)
> network.frame-timeout??????????????????? 1800 (DEFAULT)
> network.ping-timeout 20
> network.tcp-window-size????????????????? (null) (DEFAULT)
> client.ssl off
> network.remote-dio?????????????????????? disable (DEFAULT)
> client.event-threads 4
> client.tcp-user-timeout 0
> client.keepalive-time 20
> client.keepalive-interval 2
> client.keepalive-count 9
> client.strict-locks off
> network.tcp-window-size????????????????? (null) (DEFAULT)
> network.inode-lru-limit 200000
> auth.allow *
> auth.reject????????????????????????????? (null) (DEFAULT)
> transport.keepalive 1
> server.allow-insecure??????????????????? on (DEFAULT)
> server.root-squash?????????????????????? off (DEFAULT)
> server.all-squash??????????????????????? off (DEFAULT)
> server.anonuid?????????????????????????? 65534 (DEFAULT)
> server.anongid?????????????????????????? 65534 (DEFAULT)
> server.statedump-path??????????????????? /var/run/gluster (DEFAULT)
> server.outstanding-rpc-limit???????????? 64 (DEFAULT)
> server.ssl off
> auth.ssl-allow *
> server.manage-gids?????????????????????? off (DEFAULT)
> server.dynamic-auth????????????????????? on (DEFAULT)
> client.send-gids???????????????????????? on (DEFAULT)
> server.gid-timeout?????????????????????? 300 (DEFAULT)
> server.own-thread??????????????????????? (null) (DEFAULT)
> server.event-threads 4
> server.tcp-user-timeout????????????????? 42 (DEFAULT)
> server.keepalive-time 20
> server.keepalive-interval 2
> server.keepalive-count 9
> transport.listen-backlog 1024
> ssl.own-cert???????????????????????????? (null) (DEFAULT)
> ssl.private-key????????????????????????? (null) (DEFAULT)
> ssl.ca-list????????????????????????????? (null) (DEFAULT)
> ssl.crl-path???????????????????????????? (null) (DEFAULT)
> ssl.certificate-depth??????????????????? (null) (DEFAULT)
> ssl.cipher-list????????????????????????? (null) (DEFAULT)
> ssl.dh-param???????????????????????????? (null) (DEFAULT)
> ssl.ec-curve???????????????????????????? (null) (DEFAULT)
> transport.address-family inet
> performance.write-behind off
> performance.read-ahead on
> performance.readdir-ahead on
> performance.io-cache off
> performance.open-behind on
> performance.quick-read on
> performance.nl-cache on
> performance.stat-prefetch on
> performance.client-io-threads off
> performance.nfs.write-behind on
> performance.nfs.read-ahead off
> performance.nfs.io-cache off
> performance.nfs.quick-read off
> performance.nfs.stat-prefetch off
> performance.nfs.io-threads off
> performance.force-readdirp?????????????? true (DEFAULT)
> performance.cache-invalidation on
> performance.global-cache-invalidation??? true (DEFAULT)
> features.uss off
> features.snapshot-directory .snaps
> features.show-snapshot-directory off
> features.tag-namespaces off
> network.compression off
> network.compression.window-size????????? -15 (DEFAULT)
> network.compression.mem-level??????????? 8 (DEFAULT)
> network.compression.min-size???????????? 0 (DEFAULT)
> network.compression.compression-level??? -1 (DEFAULT)
> network.compression.debug??????????????? false (DEFAULT)
> features.default-soft-limit????????????? 80% (DEFAULT)
> features.soft-timeout??????????????????? 60 (DEFAULT)
> features.hard-timeout??????????????????? 5 (DEFAULT)
> features.alert-time????????????????????? 86400 (DEFAULT)
> features.quota-deem-statfs off
> geo-replication.indexing off
> geo-replication.indexing off
> geo-replication.ignore-pid-check off
> geo-replication.ignore-pid-check off
> features.quota off
> features.inode-quota off
> features.bitrot disable
> debug.trace off
> debug.log-history??????????????????????? no (DEFAULT)
> debug.log-file?????????????????????????? no (DEFAULT)
> debug.exclude-ops??????????????????????? (null) (DEFAULT)
> debug.include-ops??????????????????????? (null) (DEFAULT)
> debug.error-gen off
> debug.error-failure????????????????????? (null) (DEFAULT)
> debug.error-number?????????????????????? (null) (DEFAULT)
> debug.random-failure???????????????????? off (DEFAULT)
> debug.error-fops???????????????????????? (null) (DEFAULT)
> nfs.disable on
> features.read-only?????????????????????? off (DEFAULT)
> features.worm off
> features.worm-file-level off
> features.worm-files-deletable on
> features.default-retention-period??????? 120 (DEFAULT)
> features.retention-mode????????????????? relax (DEFAULT)
> features.auto-commit-period????????????? 180 (DEFAULT)
> storage.linux-aio??????????????????????? off (DEFAULT)
> storage.linux-io_uring?????????????????? off (DEFAULT)
> storage.batch-fsync-mode???????????????? reverse-fsync (DEFAULT)
> storage.batch-fsync-delay-usec?????????? 0 (DEFAULT)
> storage.owner-uid??????????????????????? -1 (DEFAULT)
> storage.owner-gid??????????????????????? -1 (DEFAULT)
> storage.node-uuid-pathinfo?????????????? off (DEFAULT)
> storage.health-check-interval??????????? 30 (DEFAULT)
> storage.build-pgfid????????????????????? off (DEFAULT)
> storage.gfid2path??????????????????????? on (DEFAULT)
> storage.gfid2path-separator????????????? : (DEFAULT)
> storage.reserve????????????????????????? 1 (DEFAULT)
> storage.health-check-timeout???????????? 20 (DEFAULT)
> storage.fips-mode-rchecksum on
> storage.force-create-mode??????????????? 0000 (DEFAULT)
> storage.force-directory-mode???????????? 0000 (DEFAULT)
> storage.create-mask????????????????????? 0777 (DEFAULT)
> storage.create-directory-mask??????????? 0777 (DEFAULT)
> storage.max-hardlinks??????????????????? 100 (DEFAULT)
> features.ctime?????????????????????????? on (DEFAULT)
> config.gfproxyd off
> cluster.server-quorum-type server
> cluster.server-quorum-ratio 51
> changelog.changelog????????????????????? off (DEFAULT)
> changelog.changelog-dir????????????????? {{ brick.path 
> }}/.glusterfs/changelogs (DEFAULT)
> changelog.encoding?????????????????????? ascii (DEFAULT)
> changelog.rollover-time????????????????? 15 (DEFAULT)
> changelog.fsync-interval???????????????? 5 (DEFAULT)
> changelog.changelog-barrier-timeout 120
> changelog.capture-del-path?????????????? off (DEFAULT)
> features.barrier disable
> features.barrier-timeout 120
> features.trash?????????????????????????? off (DEFAULT)
> features.trash-dir?????????????????????? .trashcan (DEFAULT)
> features.trash-eliminate-path??????????? (null) (DEFAULT)
> features.trash-max-filesize????????????? 5MB (DEFAULT)
> features.trash-internal-op?????????????? off (DEFAULT)
> cluster.enable-shared-storage disable
> locks.trace????????????????????????????? off (DEFAULT)
> locks.mandatory-locking????????????????? off (DEFAULT)
> cluster.disperse-self-heal-daemon??????? enable (DEFAULT)
> cluster.quorum-reads???????????????????? no (DEFAULT)
> client.bind-insecure???????????????????? (null) (DEFAULT)
> features.timeout???????????????????????? 45 (DEFAULT)
> features.failover-hosts????????????????? (null) (DEFAULT)
> features.shard off
> features.shard-block-size??????????????? 64MB (DEFAULT)
> features.shard-lru-limit???????????????? 16384 (DEFAULT)
> features.shard-deletion-rate???????????? 100 (DEFAULT)
> features.scrub-throttle lazy
> features.scrub-freq biweekly
> features.scrub?????????????????????????? false (DEFAULT)
> features.expiry-time 120
> features.signer-threads 4
> features.cache-invalidation on
> features.cache-invalidation-timeout 600
> ganesha.enable off
> features.leases off
> features.lease-lock-recall-timeout?????? 60 (DEFAULT)
> disperse.background-heals??????????????? 8 (DEFAULT)
> disperse.heal-wait-qlength?????????????? 128 (DEFAULT)
> cluster.heal-timeout???????????????????? 600 (DEFAULT)
> dht.force-readdirp?????????????????????? on (DEFAULT)
> disperse.read-policy???????????????????? gfid-hash (DEFAULT)
> cluster.shd-max-threads 4
> cluster.shd-wait-qlength???????????????? 1024 (DEFAULT)
> cluster.locking-scheme?????????????????? full (DEFAULT)
> cluster.granular-entry-heal????????????? no (DEFAULT)
> features.locks-revocation-secs?????????? 0 (DEFAULT)
> features.locks-revocation-clear-all????? false (DEFAULT)
> features.locks-revocation-max-blocked??? 0 (DEFAULT)
> features.locks-monkey-unlocking????????? false (DEFAULT)
> features.locks-notify-contention???????? yes (DEFAULT)
> features.locks-notify-contention-delay?? 5 (DEFAULT)
> disperse.shd-max-threads???????????????? 1 (DEFAULT)
> disperse.shd-wait-qlength 4096
> disperse.cpu-extensions????????????????? auto (DEFAULT)
> disperse.self-heal-window-size?????????? 32 (DEFAULT)
> cluster.use-compound-fops off
> performance.parallel-readdir on
> performance.rda-request-size 131072
> performance.rda-low-wmark??????????????? 4096 (DEFAULT)
> performance.rda-high-wmark?????????????? 128KB (DEFAULT)
> performance.rda-cache-limit 10MB
> performance.nl-cache-positive-entry????? false (DEFAULT)
> performance.nl-cache-limit 10MB
> performance.nl-cache-timeout 600
> cluster.brick-multiplex disable
> cluster.brick-graceful-cleanup disable
> glusterd.vol_count_per_thread 100
> cluster.max-bricks-per-process 250
> disperse.optimistic-change-log?????????? on (DEFAULT)
> disperse.stripe-cache??????????????????? 4 (DEFAULT)
> cluster.halo-enabled???????????????????? False (DEFAULT)
> cluster.halo-shd-max-latency???????????? 99999 (DEFAULT)
> cluster.halo-nfsd-max-latency??????????? 5 (DEFAULT)
> cluster.halo-max-latency???????????????? 5 (DEFAULT)
> cluster.halo-max-replicas??????????????? 99999 (DEFAULT)
> cluster.halo-min-replicas??????????????? 2 (DEFAULT)
> features.selinux on
> cluster.daemon-log-level INFO
> debug.delay-gen off
> delay-gen.delay-percentage?????????????? 10% (DEFAULT)
> delay-gen.delay-duration???????????????? 100000 (DEFAULT)
> delay-gen.enable (DEFAULT)
> disperse.parallel-writes???????????????? on (DEFAULT)
> disperse.quorum-count??????????????????? 0 (DEFAULT)
> features.sdfs off
> features.cloudsync off
> features.ctime on
> ctime.noatime on
> features.cloudsync-storetype???????????? (null) (DEFAULT)
> features.enforce-mandatory-lock off
> config.global-threading off
> config.client-threads 16
> config.brick-threads 16
> features.cloudsync-remote-read off
> features.cloudsync-store-id????????????? (null) (DEFAULT)
> features.cloudsync-product-id??????????? (null) (DEFAULT)
> features.acl enable
> cluster.use-anonymous-inode yes
> rebalance.ensure-durability????????????? on (DEFAULT)
Again, sorry for the long post. We would be happy to have this solved as 
we are excited using glusterfs and we would like to go back to having a 
stable configuration.

We always appreciate the spirit of collaboration and reciprocal help on 
this list.

Best
Ilias

-- 
?forumZFD
Entschieden f?r Frieden | Committed to Peace

Ilias Chasapakis
Referent IT | IT Consultant

Forum Ziviler Friedensdienst e.V. | Forum Civil Peace Service
Am K?lner Brett 8 | 50825 K?ln | Germany

Tel 0221 91273243 | Fax 0221 91273299 | http://www.forumZFD.de

Vorstand nach ? 26 BGB, einzelvertretungsberechtigt|Executive Board:
Alexander Mauz, Sonja Wiekenberg-Mlalandle, Jens von Bargen
VR 17651 Amtsgericht K?ln

Spenden|Donations: IBAN DE90 4306 0967 4103 7264 00   BIC GENODEM1GLS

-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature.asc
Type: application/pgp-signature
Size: 665 bytes
Desc: OpenPGP digital signature
URL:
<http://lists.gluster.org/pipermail/gluster-users/attachments/20240409/681609fe/attachment.sig>

Darrell Budic

2024-Apr-09 16:26 UTC

head link

[Gluster-users] Glusterfs 10.5-1 healing issues

The big one I see of you is to investigate and enable sharding. It can improve
performance and makes it much easier to heal VM style workloads. Be aware that
once you turn it on, you can?t go back easily, and you need to copy the VM disk
images around to get them to be sharded before it will show any real effect. A
couple other recommendations from my main volume (three dedicated host servers
with HDDs and SDD/NVM caching and log volumes on ZFS ). The cluster.shd-*
entries are especially recommended. This is on gluster 9.4 at the moment, so
some of these won?t map exactly.

Volume Name: gv1
Type: Replicate
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Options Reconfigured:
cluster.read-hash-mode: 3
performance.client-io-threads: on
performance.write-behind-window-size: 64MB
performance.cache-size: 1G
nfs.disable: on
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: on
performance.io-cache: off
performance.stat-prefetch: on
cluster.eager-lock: enable
network.remote-dio: enable
server.event-threads: 4
client.event-threads: 8
performance.io-thread-count: 64
performance.low-prio-threads: 32
features.shard: on
features.shard-block-size: 64MB
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10240
cluster.choose-local: false
cluster.granular-entry-heal: enable

Otherwise, more details about your servers, CPU, RAM, and Disks would be useful
for suggestions, and details of your network as well. And if you haven?t done
kernel level tuning on the servers, you should address that as well. These all
vary a lot by your work load and hardware setup, so there aren?t many generic
recommendations I can give other than to make sure you tuned your tcp stack and
enabled the none disk elevator on SSDs or disks used by ZFS.

There?s a lot of tuning suggesting in the archives if you go searching as well.

  -Darrell

> On Apr 9, 2024, at 3:05?AM, Ilias Chasapakis forumZFD <chasapakis at
forumZFD.de> wrote:
> 
> Dear all,
> 
> we would like to describe the situation that we have and that does not
solve since a long time, that means after many minor
> and major upgrades of GlusterFS
> 
> We use a KVM environment for VMs for glusterfs and host servers are updated
regularly. Hosts are disomogeneous hardware,
> but configured with same characteristics.
> 
> The VMs have been also harmonized to use the virtio drivers where available
for devices and resources reserved are the same
> on each host.
> 
> Physical switch for hosts has been substituted with a reliable one.
> 
> Probing peers has been and is quite quick in the heartbeat network and
communication between the servers for apparently has no issues on disruptions.
> 
> And I say apparently because what we have is:
> 
> - always pending failed heals that used to resolve by a rotated reboot of
the gluster vms (replica 3). Restarting only
> glusterfs related services (daemon, events etc.) has no effect, only reboot
brings results
> - very often failed heals are directories
> 
> We lately removed a brick that was on a vm on a host that has been entirely
substituted. Re-added the brick, sync went on and
> all data was eventually synced and started with 0 pending failed heals. Now
it develops failed heals too like its fellow
> bricks. Please take into account we healed all the failed entries (manually
with various methods) before adding the third brick.
> 
> After some days of operating, the count of failed heals rises again, not
really fast but with new entries for sure (which might solve
> with rotated reboots, or not).
> 
> We have gluster clients also on ctdbs that connect to the gluster and mount
via glusterfs client. Windows roaming profiles shared via smb become frequently
corrupted,(they are composed of a great number small files and are though of big
total dimension). Gluster nodes are formatted with xfs.
> 
> Also what we observer is that mounting with the vfs option in smb on the
ctdbs has some kind of delay. This means that you can see the shared folder on
for example
> a Windows client machine on a ctdb, but not on another ctdb in the cluster
and then after a while it appears there too. And this frequently st
> 
> 
> This is an excerpt of entries on our shd logs:
> 
>> 2024-04-08 10:13:26.213596 +0000] I [MSGID: 108026]
[afr-self-heal-entry.c:1080:afr_selfheal_entry_do] 0-gv-ho-replicate-0:
performing full entry selfheal on 2c621415-6223-4b66-a4ca-3f6f267a448d
>> [2024-04-08 10:14:08.135911 +0000] W [MSGID: 114031]
[client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: remote
operation failed. [{source=<gfid:91d83f0e-1864-4ff3-9174-b7c956e20596>},
{target=(null)}, {errno=116}, {error=Veraltete Dateizugriffsnummer (file
handle)}]
>> [2024-04-08 10:15:59.135908 +0000] W [MSGID: 114061]
[client-common.c:2992:client_pre_readdir_v2] 0-gv-ho-client-5: remote_fd is -1.
EBADFD [{gfid=6b5e599e-c836-4ebe-b16a-8224425b88c7}, {errno=77}, {error=Die
Dateizugriffsnummer ist in schlechter Verfassung}]
>> [2024-04-08 10:30:25.013592 +0000] I [MSGID: 108026]
[afr-self-heal-entry.c:1080:afr_selfheal_entry_do] 0-gv-ho-replicate-0:
performing full entry selfheal on 24e82e12-5512-4679-9eb3-8bd098367db7
>> [2024-04-08 10:33:17.613594 +0000] W [MSGID: 114031]
[client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: remote
operation failed. [{source=<gfid:ef9068fc-a329-4a21-88d2-265ecd3d208c>},
{target=(null)}, {errno=116}, {error=Veraltete Dateizugriffsnummer (file
handle)}]
>> [2024-04-08 10:33:21.201359 +0000] W [MSGID: 114031]
[client-rpc-fops_v2.c:2457:client4_0_link_cbk] 0-gv-ho-client-5: remote
operation failed. [{source>
> How are he clients mapped to real hosts in order to know on which one?s
logs to look at?
> 
> We would like to go by exclusion to finally eradicate this, possibly in a
conservative way (not rebuilding everything) and we
> 
> are becoming clueless as to where to look at as we also tried various
options settings regarding performance etc.
> 
> Here is the set on our main volume:
> 
>> cluster.lookup-unhashed                  on (DEFAULT)
>> cluster.lookup-optimize                  on (DEFAULT)
>> cluster.min-free-disk                    10% (DEFAULT)
>> cluster.min-free-inodes                  5% (DEFAULT)
>> cluster.rebalance-stats                  off (DEFAULT)
>> cluster.subvols-per-directory            (null) (DEFAULT)
>> cluster.readdir-optimize                 off (DEFAULT)
>> cluster.rsync-hash-regex                 (null) (DEFAULT)
>> cluster.extra-hash-regex                 (null) (DEFAULT)
>> cluster.dht-xattr-name                   trusted.glusterfs.dht
(DEFAULT)
>> cluster.randomize-hash-range-by-gfid     off (DEFAULT)
>> cluster.rebal-throttle                   normal (DEFAULT)
>> cluster.lock-migration off
>> cluster.force-migration off
>> cluster.local-volume-name                (null) (DEFAULT)
>> cluster.weighted-rebalance               on (DEFAULT)
>> cluster.switch-pattern                   (null) (DEFAULT)
>> cluster.entry-change-log                 on (DEFAULT)
>> cluster.read-subvolume                   (null) (DEFAULT)
>> cluster.read-subvolume-index             -1 (DEFAULT)
>> cluster.read-hash-mode                   1 (DEFAULT)
>> cluster.background-self-heal-count       8 (DEFAULT)
>> cluster.metadata-self-heal on
>> cluster.data-self-heal on
>> cluster.entry-self-heal on
>> cluster.self-heal-daemon enable
>> cluster.heal-timeout                     600 (DEFAULT)
>> cluster.self-heal-window-size            8 (DEFAULT)
>> cluster.data-change-log                  on (DEFAULT)
>> cluster.metadata-change-log              on (DEFAULT)
>> cluster.data-self-heal-algorithm         (null) (DEFAULT)
>> cluster.eager-lock                       on (DEFAULT)
>> disperse.eager-lock                      on (DEFAULT)
>> disperse.other-eager-lock                on (DEFAULT)
>> disperse.eager-lock-timeout              1 (DEFAULT)
>> disperse.other-eager-lock-timeout        1 (DEFAULT)
>> cluster.quorum-type auto
>> cluster.quorum-count 2
>> cluster.choose-local                     true (DEFAULT)
>> cluster.self-heal-readdir-size           1KB (DEFAULT)
>> cluster.post-op-delay-secs               1 (DEFAULT)
>> cluster.ensure-durability                on (DEFAULT)
>> cluster.consistent-metadata              no (DEFAULT)
>> cluster.heal-wait-queue-length           128 (DEFAULT)
>> cluster.favorite-child-policy none
>> cluster.full-lock                        yes (DEFAULT)
>> cluster.optimistic-change-log            on (DEFAULT)
>> diagnostics.latency-measurement off
>> diagnostics.dump-fd-stats                off (DEFAULT)
>> diagnostics.count-fop-hits off
>> diagnostics.brick-log-level INFO
>> diagnostics.client-log-level INFO
>> diagnostics.brick-sys-log-level          CRITICAL (DEFAULT)
>> diagnostics.client-sys-log-level         CRITICAL (DEFAULT)
>> diagnostics.brick-logger                 (null) (DEFAULT)
>> diagnostics.client-logger                (null) (DEFAULT)
>> diagnostics.brick-log-format             (null) (DEFAULT)
>> diagnostics.client-log-format            (null) (DEFAULT)
>> diagnostics.brick-log-buf-size           5 (DEFAULT)
>> diagnostics.client-log-buf-size          5 (DEFAULT)
>> diagnostics.brick-log-flush-timeout      120 (DEFAULT)
>> diagnostics.client-log-flush-timeout     120 (DEFAULT)
>> diagnostics.stats-dump-interval          0 (DEFAULT)
>> diagnostics.fop-sample-interval          0 (DEFAULT)
>> diagnostics.stats-dump-format            json (DEFAULT)
>> diagnostics.fop-sample-buf-size          65535 (DEFAULT)
>> diagnostics.stats-dnscache-ttl-sec       86400 (DEFAULT)
>> performance.cache-max-file-size 10
>> performance.cache-min-file-size          0 (DEFAULT)
>> performance.cache-refresh-timeout        1 (DEFAULT)
>> performance.cache-priority (DEFAULT)
>> performance.io-cache-size                32MB (DEFAULT)
>> performance.cache-size                   32MB (DEFAULT)
>> performance.io-thread-count              16 (DEFAULT)
>> performance.high-prio-threads            16 (DEFAULT)
>> performance.normal-prio-threads          16 (DEFAULT)
>> performance.low-prio-threads             16 (DEFAULT)
>> performance.least-prio-threads           1 (DEFAULT)
>> performance.enable-least-priority        on (DEFAULT)
>> performance.iot-watchdog-secs            (null) (DEFAULT)
>> performance.iot-cleanup-disconnected-reqs off (DEFAULT)
>> performance.iot-pass-through             false (DEFAULT)
>> performance.io-cache-pass-through        false (DEFAULT)
>> performance.quick-read-cache-size        128MB (DEFAULT)
>> performance.cache-size                   128MB (DEFAULT)
>> performance.quick-read-cache-timeout     1 (DEFAULT)
>> performance.qr-cache-timeout 600
>> performance.quick-read-cache-invalidation false (DEFAULT)
>> performance.ctime-invalidation           false (DEFAULT)
>> performance.flush-behind                 on (DEFAULT)
>> performance.nfs.flush-behind             on (DEFAULT)
>> performance.write-behind-window-size 4MB
>> performance.resync-failed-syncs-after-fsync off (DEFAULT)
>> performance.nfs.write-behind-window-size 1MB (DEFAULT)
>> performance.strict-o-direct              off (DEFAULT)
>> performance.nfs.strict-o-direct          off (DEFAULT)
>> performance.strict-write-ordering        off (DEFAULT)
>> performance.nfs.strict-write-ordering    off (DEFAULT)
>> performance.write-behind-trickling-writes on (DEFAULT)
>> performance.aggregate-size               128KB (DEFAULT)
>> performance.nfs.write-behind-trickling-writes on (DEFAULT)
>> performance.lazy-open                    yes (DEFAULT)
>> performance.read-after-open              yes (DEFAULT)
>> performance.open-behind-pass-through     false (DEFAULT)
>> performance.read-ahead-page-count        4 (DEFAULT)
>> performance.read-ahead-pass-through      false (DEFAULT)
>> performance.readdir-ahead-pass-through   false (DEFAULT)
>> performance.md-cache-pass-through        false (DEFAULT)
>> performance.write-behind-pass-through    false (DEFAULT)
>> performance.md-cache-timeout 600
>> performance.cache-swift-metadata         false (DEFAULT)
>> performance.cache-samba-metadata on
>> performance.cache-capability-xattrs      true (DEFAULT)
>> performance.cache-ima-xattrs             true (DEFAULT)
>> performance.md-cache-statfs              off (DEFAULT)
>> performance.xattr-cache-list (DEFAULT)
>> performance.nl-cache-pass-through        false (DEFAULT)
>> network.frame-timeout                    1800 (DEFAULT)
>> network.ping-timeout 20
>> network.tcp-window-size                  (null) (DEFAULT)
>> client.ssl off
>> network.remote-dio                       disable (DEFAULT)
>> client.event-threads 4
>> client.tcp-user-timeout 0
>> client.keepalive-time 20
>> client.keepalive-interval 2
>> client.keepalive-count 9
>> client.strict-locks off
>> network.tcp-window-size                  (null) (DEFAULT)
>> network.inode-lru-limit 200000
>> auth.allow *
>> auth.reject                              (null) (DEFAULT)
>> transport.keepalive 1
>> server.allow-insecure                    on (DEFAULT)
>> server.root-squash                       off (DEFAULT)
>> server.all-squash                        off (DEFAULT)
>> server.anonuid                           65534 (DEFAULT)
>> server.anongid                           65534 (DEFAULT)
>> server.statedump-path                    /var/run/gluster (DEFAULT)
>> server.outstanding-rpc-limit             64 (DEFAULT)
>> server.ssl off
>> auth.ssl-allow *
>> server.manage-gids                       off (DEFAULT)
>> server.dynamic-auth                      on (DEFAULT)
>> client.send-gids                         on (DEFAULT)
>> server.gid-timeout                       300 (DEFAULT)
>> server.own-thread                        (null) (DEFAULT)
>> server.event-threads 4
>> server.tcp-user-timeout                  42 (DEFAULT)
>> server.keepalive-time 20
>> server.keepalive-interval 2
>> server.keepalive-count 9
>> transport.listen-backlog 1024
>> ssl.own-cert                             (null) (DEFAULT)
>> ssl.private-key                          (null) (DEFAULT)
>> ssl.ca-list                              (null) (DEFAULT)
>> ssl.crl-path                             (null) (DEFAULT)
>> ssl.certificate-depth                    (null) (DEFAULT)
>> ssl.cipher-list                          (null) (DEFAULT)
>> ssl.dh-param                             (null) (DEFAULT)
>> ssl.ec-curve                             (null) (DEFAULT)
>> transport.address-family inet
>> performance.write-behind off
>> performance.read-ahead on
>> performance.readdir-ahead on
>> performance.io-cache off
>> performance.open-behind on
>> performance.quick-read on
>> performance.nl-cache on
>> performance.stat-prefetch on
>> performance.client-io-threads off
>> performance.nfs.write-behind on
>> performance.nfs.read-ahead off
>> performance.nfs.io-cache off
>> performance.nfs.quick-read off
>> performance.nfs.stat-prefetch off
>> performance.nfs.io-threads off
>> performance.force-readdirp               true (DEFAULT)
>> performance.cache-invalidation on
>> performance.global-cache-invalidation    true (DEFAULT)
>> features.uss off
>> features.snapshot-directory .snaps
>> features.show-snapshot-directory off
>> features.tag-namespaces off
>> network.compression off
>> network.compression.window-size          -15 (DEFAULT)
>> network.compression.mem-level            8 (DEFAULT)
>> network.compression.min-size             0 (DEFAULT)
>> network.compression.compression-level    -1 (DEFAULT)
>> network.compression.debug                false (DEFAULT)
>> features.default-soft-limit              80% (DEFAULT)
>> features.soft-timeout                    60 (DEFAULT)
>> features.hard-timeout                    5 (DEFAULT)
>> features.alert-time                      86400 (DEFAULT)
>> features.quota-deem-statfs off
>> geo-replication.indexing off
>> geo-replication.indexing off
>> geo-replication.ignore-pid-check off
>> geo-replication.ignore-pid-check off
>> features.quota off
>> features.inode-quota off
>> features.bitrot disable
>> debug.trace off
>> debug.log-history                        no (DEFAULT)
>> debug.log-file                           no (DEFAULT)
>> debug.exclude-ops                        (null) (DEFAULT)
>> debug.include-ops                        (null) (DEFAULT)
>> debug.error-gen off
>> debug.error-failure                      (null) (DEFAULT)
>> debug.error-number                       (null) (DEFAULT)
>> debug.random-failure                     off (DEFAULT)
>> debug.error-fops                         (null) (DEFAULT)
>> nfs.disable on
>> features.read-only                       off (DEFAULT)
>> features.worm off
>> features.worm-file-level off
>> features.worm-files-deletable on
>> features.default-retention-period        120 (DEFAULT)
>> features.retention-mode                  relax (DEFAULT)
>> features.auto-commit-period              180 (DEFAULT)
>> storage.linux-aio                        off (DEFAULT)
>> storage.linux-io_uring                   off (DEFAULT)
>> storage.batch-fsync-mode                 reverse-fsync (DEFAULT)
>> storage.batch-fsync-delay-usec           0 (DEFAULT)
>> storage.owner-uid                        -1 (DEFAULT)
>> storage.owner-gid                        -1 (DEFAULT)
>> storage.node-uuid-pathinfo               off (DEFAULT)
>> storage.health-check-interval            30 (DEFAULT)
>> storage.build-pgfid                      off (DEFAULT)
>> storage.gfid2path                        on (DEFAULT)
>> storage.gfid2path-separator              : (DEFAULT)
>> storage.reserve                          1 (DEFAULT)
>> storage.health-check-timeout             20 (DEFAULT)
>> storage.fips-mode-rchecksum on
>> storage.force-create-mode                0000 (DEFAULT)
>> storage.force-directory-mode             0000 (DEFAULT)
>> storage.create-mask                      0777 (DEFAULT)
>> storage.create-directory-mask            0777 (DEFAULT)
>> storage.max-hardlinks                    100 (DEFAULT)
>> features.ctime                           on (DEFAULT)
>> config.gfproxyd off
>> cluster.server-quorum-type server
>> cluster.server-quorum-ratio 51
>> changelog.changelog                      off (DEFAULT)
>> changelog.changelog-dir                  {{ brick.path
}}/.glusterfs/changelogs (DEFAULT)
>> changelog.encoding                       ascii (DEFAULT)
>> changelog.rollover-time                  15 (DEFAULT)
>> changelog.fsync-interval                 5 (DEFAULT)
>> changelog.changelog-barrier-timeout 120
>> changelog.capture-del-path               off (DEFAULT)
>> features.barrier disable
>> features.barrier-timeout 120
>> features.trash                           off (DEFAULT)
>> features.trash-dir                       .trashcan (DEFAULT)
>> features.trash-eliminate-path            (null) (DEFAULT)
>> features.trash-max-filesize              5MB (DEFAULT)
>> features.trash-internal-op               off (DEFAULT)
>> cluster.enable-shared-storage disable
>> locks.trace                              off (DEFAULT)
>> locks.mandatory-locking                  off (DEFAULT)
>> cluster.disperse-self-heal-daemon        enable (DEFAULT)
>> cluster.quorum-reads                     no (DEFAULT)
>> client.bind-insecure                     (null) (DEFAULT)
>> features.timeout                         45 (DEFAULT)
>> features.failover-hosts                  (null) (DEFAULT)
>> features.shard off
>> features.shard-block-size                64MB (DEFAULT)
>> features.shard-lru-limit                 16384 (DEFAULT)
>> features.shard-deletion-rate             100 (DEFAULT)
>> features.scrub-throttle lazy
>> features.scrub-freq biweekly
>> features.scrub                           false (DEFAULT)
>> features.expiry-time 120
>> features.signer-threads 4
>> features.cache-invalidation on
>> features.cache-invalidation-timeout 600
>> ganesha.enable off
>> features.leases off
>> features.lease-lock-recall-timeout       60 (DEFAULT)
>> disperse.background-heals                8 (DEFAULT)
>> disperse.heal-wait-qlength               128 (DEFAULT)
>> cluster.heal-timeout                     600 (DEFAULT)
>> dht.force-readdirp                       on (DEFAULT)
>> disperse.read-policy                     gfid-hash (DEFAULT)
>> cluster.shd-max-threads 4
>> cluster.shd-wait-qlength                 1024 (DEFAULT)
>> cluster.locking-scheme                   full (DEFAULT)
>> cluster.granular-entry-heal              no (DEFAULT)
>> features.locks-revocation-secs           0 (DEFAULT)
>> features.locks-revocation-clear-all      false (DEFAULT)
>> features.locks-revocation-max-blocked    0 (DEFAULT)
>> features.locks-monkey-unlocking          false (DEFAULT)
>> features.locks-notify-contention         yes (DEFAULT)
>> features.locks-notify-contention-delay   5 (DEFAULT)
>> disperse.shd-max-threads                 1 (DEFAULT)
>> disperse.shd-wait-qlength 4096
>> disperse.cpu-extensions                  auto (DEFAULT)
>> disperse.self-heal-window-size           32 (DEFAULT)
>> cluster.use-compound-fops off
>> performance.parallel-readdir on
>> performance.rda-request-size 131072
>> performance.rda-low-wmark                4096 (DEFAULT)
>> performance.rda-high-wmark               128KB (DEFAULT)
>> performance.rda-cache-limit 10MB
>> performance.nl-cache-positive-entry      false (DEFAULT)
>> performance.nl-cache-limit 10MB
>> performance.nl-cache-timeout 600
>> cluster.brick-multiplex disable
>> cluster.brick-graceful-cleanup disable
>> glusterd.vol_count_per_thread 100
>> cluster.max-bricks-per-process 250
>> disperse.optimistic-change-log           on (DEFAULT)
>> disperse.stripe-cache                    4 (DEFAULT)
>> cluster.halo-enabled                     False (DEFAULT)
>> cluster.halo-shd-max-latency             99999 (DEFAULT)
>> cluster.halo-nfsd-max-latency            5 (DEFAULT)
>> cluster.halo-max-latency                 5 (DEFAULT)
>> cluster.halo-max-replicas                99999 (DEFAULT)
>> cluster.halo-min-replicas                2 (DEFAULT)
>> features.selinux on
>> cluster.daemon-log-level INFO
>> debug.delay-gen off
>> delay-gen.delay-percentage               10% (DEFAULT)
>> delay-gen.delay-duration                 100000 (DEFAULT)
>> delay-gen.enable (DEFAULT)
>> disperse.parallel-writes                 on (DEFAULT)
>> disperse.quorum-count                    0 (DEFAULT)
>> features.sdfs off
>> features.cloudsync off
>> features.ctime on
>> ctime.noatime on
>> features.cloudsync-storetype             (null) (DEFAULT)
>> features.enforce-mandatory-lock off
>> config.global-threading off
>> config.client-threads 16
>> config.brick-threads 16
>> features.cloudsync-remote-read off
>> features.cloudsync-store-id              (null) (DEFAULT)
>> features.cloudsync-product-id            (null) (DEFAULT)
>> features.acl enable
>> cluster.use-anonymous-inode yes
>> rebalance.ensure-durability              on (DEFAULT)
> 
> Again, sorry for the long post. We would be happy to have this solved as we
are excited using glusterfs and we would like to go back to having a stable
configuration.
> 
> We always appreciate the spirit of collaboration and reciprocal help on
this list.
> 
> Best
> Ilias
> 
> -- 
> ?forumZFD
> Entschieden f?r Frieden | Committed to Peace
> 
> Ilias Chasapakis
> Referent IT | IT Consultant
> 
> Forum Ziviler Friedensdienst e.V. | Forum Civil Peace Service
> Am K?lner Brett 8 | 50825 K?ln | Germany
> 
> Tel 0221 91273243 | Fax 0221 91273299 | http://www.forumZFD.de
> 
> Vorstand nach ? 26 BGB, einzelvertretungsberechtigt|Executive Board:
> Alexander Mauz, Sonja Wiekenberg-Mlalandle, Jens von Bargen
> VR 17651 Amtsgericht K?ln
> 
> Spenden|Donations: IBAN DE90 4306 0967 4103 7264 00   BIC GENODEM1GLS
> 
> ________
> 
> 
> 
> Community Meeting Calendar:
> 
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.gluster.org/pipermail/gluster-users/attachments/20240409/6d49db9c/attachment.html>

Gluster users - Apr 2024 - Glusterfs 10.5-1 healing issues

[Gluster-users] Glusterfs 10.5-1 healing issues

[Gluster-users] Glusterfs 10.5-1 healing issues