thr3ads.net - Gluster users - [Gluster-users] SelfHeal/AutoHeal Thread Cap on metadata heavy workload [Mar 2016]

Kayra Otaner | BilgiO

2016-Mar-20 18:49 UTC

[Gluster-users] SelfHeal/AutoHeal Thread Cap on metadata heavy workload

Hello there,

We're working on migrating over 100 million small files to Gluster 3.7, with
bricks sitting on XFS filesystems. We started with a single node and
optimized NFS and other aspects of GlusterFS to perform best for our
workload. When we switched the second node on, we started experiencing very
heavy CPU utilization, mostly due to the self-heal daemon (SHD). We tried
turning SHD off and letting autoheal take care of replicating data across
nodes, yet with some directories holding over 2 million files, it proved
very difficult to control CPU utilization.
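
For readers following along, turning the self-heal daemon off is normally done
per volume with the gluster CLI. The Python sketch below only builds the
command line and does not run it. The volume name "myvol" is hypothetical, and
the option name cluster.self-heal-daemon is assumed to be the one meant by
"turning SHD off"; the original message does not say which setting was changed.

```python
from typing import List


def shd_toggle_command(volume: str, enabled: bool) -> List[str]:
    """Build the argv for switching the self-heal daemon on or off.

    Assumption: the relevant volume option is cluster.self-heal-daemon.
    """
    value = "on" if enabled else "off"
    return ["gluster", "volume", "set", volume,
            "cluster.self-heal-daemon", value]


if __name__ == "__main__":
    # "myvol" is a placeholder volume name, not taken from the post.
    print(" ".join(shd_toggle_command("myvol", enabled=False)))
```

Building the argv as a list keeps the sketch free of side effects; passing the
result to subprocess.run would apply it on a real cluster.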

Our setup has 6 bricks on each of 2 nodes in a distributed-replicated
configuration, accessed over an NFS mount, with lots of small files;
server-threads, client-threads and io-threads are set to 16. We tried
reducing them to 4 but still see the same symptoms. I've spent a fair amount
of time analyzing thread usage with strace, sysdig and such; no matter what
we set for the thread configuration, SHD seems to use 24 threads (apparently
4 for each brick).
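
The thread count described above can also be cross-checked without strace or
sysdig: on Linux, /proc/<pid>/task holds one directory per thread. The sketch
below is a minimal illustration of that check, not the method used in the
original analysis. Matching on the string "glustershd" in the process command
line is an assumption about how the self-heal daemon appears on a given host.

```python
import os


def count_threads(pid, proc_root="/proc"):
    """Count the threads of a process: one numeric entry per thread in task/."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sum(1 for name in os.listdir(task_dir) if name.isdigit())


def find_pids(needle, proc_root="/proc"):
    """Return sorted PIDs whose command line contains `needle`."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "cmdline"), "rb") as f:
                # Arguments in cmdline are NUL-separated.
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if needle in cmdline:
            pids.append(int(name))
    return sorted(pids)


if __name__ == "__main__":
    # "glustershd" is an assumed marker for the self-heal daemon process.
    for pid in find_pids("glustershd"):
        try:
            print(pid, count_threads(pid))
        except OSError:
            pass  # process went away between the two calls
```

With 6 bricks and 4 threads per brick, a count near 24 (plus a few housekeeping
threads) would be consistent with the observation above.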

Does anyone know how to throttle SHD and autoheal so that they don't
consume too much CPU? Any other ideas on how to tune GlusterFS for a
small-file / metadata-heavy workload are appreciated, especially when
adding/removing nodes/bricks.

Thank you

--
Kayra Otaner
BilgiO A.Ş. - SecOps Experts
PGP KeyID : A945251E | Manager, Enterprise Linux Solutions
www.bilgio.com | TR +90 (532) 111-7240 x 1001 | US +1 (201) 206-2592