thr3ads.net - Gluster users - [Gluster-users] Gluster-users Digest, Vol 74, Issue 11 [Jun 2014]

Jorick Astrego

2014-Jun-10 21:15 UTC

[Gluster-users] Gluster-users Digest, Vol 74, Issue 11

On 06/10/2014 02:00 PM, gluster-users-request at gluster.org wrote:
> From: Laurent Chouinard <laurent.chouinard at ubisoft.com>
> To: Pranith Kumar Karampuri <pkarampu at redhat.com>
> Cc: "gluster-users at gluster.org" <gluster-users at gluster.org>
> Subject: Re: [Gluster-users] Unavailability during self-heal for large
>          volumes
> Message-ID:
>   <95ea1865fac2484980d020c6a3b7f0cd at MSR-MAIL-EXCH02.ubisoft.org>
> Content-Type: text/plain; charset="utf-8"
>
>> Laurent,
>>     This has been improved significantly in afr-v2 (enhanced version of
>> replication translator in gluster) which will be released with 3.6 I
>> believe. The issue happens because of the directory self-heal in the
>> older versions. In the new version per file healing in a directory is
>> performed instead of Full directory heal at-once which was creating a
>> lot of traffic. Unfortunately this is too big a change to backport to
>> older releases :-(.
>>
>> Pranith
>
> Hi Pranith,
>
> Thank you for this information.
>
> Do you think there is a way to limit/throttle the current directory
> self-heal then? I don't mind if it takes a long time.
>
> Alternatively, is there a way to completely disable the complete healing
> system? I would consider running a manual healing operation by STAT'ing
> every file, which would allow me to throttle the speed to a more
> manageable level.
>
> Thanks,
>
> Laurent Chouinard

You could try this:

http://www.gluster.org/author/andrew-lau/

by Andrew Lau <http://www.gluster.org/author/andrew-lau/> on February 3, 2014

    Controlling glusterfsd CPU outbreaks with cgroups

<http://www.andrewklau.com//controlling-glusterfsd-cpu-outbreaks-with-cgroups/>

Some of you may know that same feeling when adding a new brick to your
gluster replicated volume which already holds more than 1TB of data,
and suddenly your gluster server has shot up to 500% CPU usage. What's
worse is that my hosts run alongside oVirt, so while gluster hogged all
the CPU, my VMs started to crawl; even running simple commands like top
would take 30+ seconds. Not a good feeling.

On my first attempt I limited the NIC's bandwidth to 200Mbps rather than
the 2x1Gbps aggregated link, and this calmed glusterfsd down to a healthy
50%. It was only a temporary fix, however, because clients accessing
gluster storage would be bottlenecked by that shared limit.

So off to the mailing list - a great suggestion from James/purpleidea 
(https://ttboj.wordpress.com/code/puppet-gluster/) on using cgroups.

The concept is simple: we limit the total CPU glusterfsd sees, so when it
comes to doing the checksums for self-heals, replication etc., they won't
have the high priority which other services such as running VMs would
have. This effectively slows down the replication rate in return for
lower CPU usage.
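To make the idea concrete, here is a minimal sketch of capping the CPU
time of a set of processes with the cgroup v1 cpu controller. It is not
taken from the linked post, which may well use a different mechanism
(for example cpu.shares or cgconfig). The function name limit_cpu, the
group name glusterfsd_limited and the default mount point
/sys/fs/cgroup/cpu are assumptions for illustration, and a cgroup v2
host would use cpu.max instead.

```python
from pathlib import Path


def limit_cpu(pids, percent, group="glusterfsd_limited",
              cpu_root="/sys/fs/cgroup/cpu", period_us=100_000):
    """Cap `pids` at `percent` of one core using the cgroup v1 CFS quota.

    `group` and `cpu_root` are illustrative defaults, not values from the
    original post. 100 means one full core, 200 means two cores.
    """
    if percent <= 0:
        raise ValueError("percent must be positive")

    group_dir = Path(cpu_root) / group
    # On a real cgroup mount, creating the directory creates the group and
    # the kernel populates its control files.
    group_dir.mkdir(parents=True, exist_ok=True)

    # Allowed run time per scheduling period, in microseconds.
    quota_us = period_us * percent // 100
    (group_dir / "cpu.cfs_period_us").write_text(f"{period_us}\n")
    (group_dir / "cpu.cfs_quota_us").write_text(f"{quota_us}\n")

    # Processes are moved into the group one PID per write.
    for pid in pids:
        (group_dir / "cgroup.procs").write_text(f"{pid}\n")
    return quota_us
```

Run as root, something like limit_cpu(pids, 150) would hold the listed
glusterfsd processes to roughly one and a half cores in total; the PIDs
could come from pgrep glusterfsd. A hard quota like this differs from a
share-based limit, which only matters when the CPU is contended.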

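The throttled manual heal mentioned in the quoted message (STAT'ing every
file at a controlled rate) could be sketched roughly as below. This
assumes, as the message suggests, that a stat issued through a client
mount of the volume is what triggers the heal of that entry; it would
have to be run against the client mount, not against a brick directory.
The function name throttled_stat_walk and its parameters are
hypothetical.

```python
import os
import time


def throttled_stat_walk(mount_point, files_per_second=50.0, sleep=time.sleep):
    """stat every entry under `mount_point`, pausing after each call.

    Returns the number of entries that were stat'ed successfully.
    `sleep` is injectable so the pacing can be checked without waiting.
    """
    if files_per_second <= 0:
        raise ValueError("files_per_second must be positive")
    delay = 1.0 / files_per_second

    visited = 0
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            try:
                # lstat looks the entry up without following symlinks,
                # so a dangling link does not raise.
                os.lstat(os.path.join(dirpath, name))
            except OSError:
                # Entry vanished or is unreadable; skip it and carry on.
                continue
            visited += 1
            sleep(delay)  # this pause is the throttle
    return visited
```

Lowering files_per_second trades a longer heal for less load, which is
the trade-off asked about above. For example,
throttled_stat_walk("/mnt/glustervol", files_per_second=20) would touch
about 20 entries per second, where /mnt/glustervol stands in for wherever
the volume is mounted.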
Kind regards,

Jorick Astrego
Netbulae B.V.
http://www.netbulae.eu


Laurent Chouinard

2014-Jun-11 20:51 UTC

Hi Jorick,
> You could try this:
>
> http://www.gluster.org/author/andrew-lau/
That's an interesting approach. If the other suggestion doesn't work
out, I think we'll try this one next. Thanks for sharing!

Regards,

Laurent
