Kayra Otaner | BilgiO
2016-Mar-20 18:49 UTC
[Gluster-users] SelfHeal/AutoHeal Thread Cap on metadata heavy workload
Hello there,

We're working on migrating over 100 million small files to Gluster 3.7, with bricks sitting on XFS filesystems. We started with a single node and tuned NFS and other aspects of GlusterFS to perform best for our workload. When we switched the second node on, we started experiencing very heavy CPU utilization, mostly due to the self-heal daemon (SHD). We tried turning SHD off and letting autoheal take care of replicating data across nodes, yet with some directories holding over 2 million files, CPU utilization proved very difficult to control.

Our setup is 2 nodes with 6 bricks on each, distributed and replicated, accessed over an NFS mount, with lots of small files. server-threads, client-threads and io-threads are set to 16. We tried reducing them to 4 but still see the same symptoms.

I've spent a fair amount of time analyzing thread usage with strace, sysdig and such. No matter what we set in the thread configuration, SHD seems to use 24 threads (apparently 4 for each brick).

Does anyone know how to throttle SHD and autoheal so that they don't consume too much CPU? Any other ideas on how to tune GlusterFS for a small-file / metadata-heavy workload are appreciated, especially when adding or removing nodes or bricks.

Thank you

--
Kayra Otaner
BilgiO A.Ş. - SecOps Experts
PGP KeyID : A945251E | Manager, Enterprise Linux Solutions
www.bilgio.com | TR +90 (532) 111-7240 x 1001 | US +1 (201) 206-2592
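
P.S. For anyone who wants to compare thread counts on their own nodes, here is a minimal sketch of one way to do it on Linux by reading /proc. It is an illustration only, not the exact strace/sysdig procedure used for the numbers above. The helper names `thread_count` and `find_pids` are hypothetical, and matching the SHD process by the string "glustershd" in its command line is an assumption that may need adjusting for your installation.

```python
import os


def thread_count(pid):
    """Count the threads of a process by listing /proc/<pid>/task (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/task"))


def find_pids(marker):
    """Return the PIDs of processes whose command line contains `marker`."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                # Arguments are NUL-separated; join them with spaces.
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if marker in cmdline:
            pids.append(int(entry))
    return pids


if __name__ == "__main__":
    # Assumption: the self-heal daemon has "glustershd" in its command line.
    for pid in find_pids("glustershd"):
        try:
            print(pid, thread_count(pid))
        except OSError:
            pass  # process went away between the two reads
```

Running it before and after changing the thread options should show whether the per-process thread count actually moves.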