thr3ads.net - Gluster users - [Gluster-users] Fwd: disperse heal speed up [Apr 2016]

Serkan Çoban

2016-Apr-13 11:59 UTC

[Gluster-users] Fwd: disperse heal speed up

Hi Xavier,

Can you help me with the issue below? How can I increase the disperse
heal speed?
I would also be grateful for any detailed documentation about disperse heal:
why does heal happen on a disperse volume, and how is it triggered? Which
nodes participate in the heal process? Is there any client interaction?

Serkan


---------- Forwarded message ----------
From: Serkan Çoban <cobanserkan at gmail.com>
Date: Fri, Apr 8, 2016 at 5:46 PM
Subject: disperse heal speed up
To: Gluster Users <gluster-users at gluster.org>


Hi,

I am testing the heal speed of a disperse volume, and what I see is
5-10MB/s per node.
I increased disperse.background-heals to 32 and
disperse.heal-wait-qlength to 256, but it made no difference.
One thing I noticed is that when I kill a brick process, reformat the
brick and restart it, heal speed is nearly 20x faster (200MB/s/node).
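
For readers who want to reproduce the settings above, here is a minimal sketch that only builds the corresponding `gluster volume set` command lines and prints them; it does not run anything. The volume name `testvol` is a placeholder and not from this thread, and the final `gluster volume heal ... info` command is added here only as a common way to watch pending heals.

```python
import shlex

def volume_set_cmd(volume, key, value):
    """Build (but do not run) a `gluster volume set` command line."""
    return ["gluster", "volume", "set", volume, key, str(value)]

VOLUME = "testvol"  # hypothetical volume name, not taken from the thread

# Option names and values as quoted in the message above.
heal_options = {
    "disperse.background-heals": 32,
    "disperse.heal-wait-qlength": 256,
}

commands = [volume_set_cmd(VOLUME, k, v) for k, v in heal_options.items()]
# Assumption: listing pending heals is useful while testing heal speed.
commands.append(["gluster", "volume", "heal", VOLUME, "info"])

for cmd in commands:
    print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run on a machine without Gluster installed; the lines can be pasted into a shell on a server node.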

But when I kill the brick, then write 100TB of data, and start the brick
afterwards, heal is slow (5-10MB/s/node).

What is the difference between the two scenarios? Why is one heal slow and
the other fast? How can I increase disperse heal speed? Should I
increase the thread count to 128 or 256? I am on a 78x(16+4) disperse volume
and my servers are pretty strong (2x14 cores with 512GB RAM; each node
has 26x8TB disks).
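
As a quick aid for reading the numbers above, the following sketch works out what a 78x(16+4) layout implies and how far apart the two observed heal rates are. The node count is an assumption: it presumes one brick per disk, which the message does not state.

```python
# Figures quoted in the message above.
subvolumes = 78
data_bricks = 16
redundancy_bricks = 4
disks_per_node = 26

bricks_per_subvolume = data_bricks + redundancy_bricks
total_bricks = subvolumes * bricks_per_subvolume
usable_fraction = data_bricks / bricks_per_subvolume

# Assumption (not stated in the thread): one brick per disk.
nodes_if_one_brick_per_disk = total_bricks // disks_per_node

slow_mb_s = (5, 10)   # heal rate after writing data while the brick was down
fast_mb_s = 200       # heal rate after reformatting the brick
speedup = (fast_mb_s / slow_mb_s[1], fast_mb_s / slow_mb_s[0])

print(total_bricks)                 # 1560
print(usable_fraction)              # 0.8
print(nodes_if_one_brick_per_disk)  # 60
print(speedup)                      # (20.0, 40.0)
```

So the "nearly 20x" figure corresponds to the upper end of the slow range; against 5MB/s the gap is closer to 40x.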

Gluster version is 3.7.10.

Thanks,
Serkan

Xavier Hernandez

2016-Apr-15 06:27 UTC

Hi Serkan,

Sorry for the delay, I've been a bit busy lately.

On 13/04/16 13:59, Serkan Çoban wrote:
> Hi Xavier,
>
> Can you help me about the below issue? How can I increase the disperse
> heal speed?
It seems weird. Are there any related messages in the logs?

In this particular test, were the 100TB written to existing (modified)
files or to files newly created while the brick was down?

How many files have been modified?
> Also I would be grateful if you have detailed documentation about disperse heal,
> why heal happens on disperse volume, how it is triggered? Which nodes
> participate in heal process? Any client interaction?
The heal process is basically the same as the one used for replicate
volumes. There are two ways to trigger a self-heal:

* when an inconsistency is detected, the client initiates a background
self-heal of the inode

* the self-heal daemon scans the list of modified files, which the
index xlator creates when a modification is made while some node is
down. All these files are self-healed.

Xavi
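
The two trigger paths described in the reply above can be pictured with a toy model. This is an illustrative sketch only, not Gluster code: `ToyVolume` and its methods are hypothetical names, the set stands in for the list kept by the index xlator, and the client heal runs synchronously here whereas the reply describes it as a background operation.

```python
class ToyVolume:
    """Toy model of the two self-heal triggers; not Gluster's implementation."""

    def __init__(self):
        self.index = set()   # files modified while a brick was down
        self.healed = []     # (path, trigger) pairs, in heal order

    def write_while_brick_down(self, path):
        # The index records every file touched during the outage.
        self.index.add(path)

    def _heal(self, path, trigger):
        self.index.discard(path)
        self.healed.append((path, trigger))

    def client_access(self, path):
        """Trigger 1: a client touches a file and notices it is inconsistent."""
        if path in self.index:
            self._heal(path, "client")
            return True
        return False

    def daemon_scan(self):
        """Trigger 2: the self-heal daemon walks the index and heals every entry."""
        for path in sorted(self.index):  # sorted() copies, so healing is safe
            self._heal(path, "daemon")


vol = ToyVolume()
for name in ["a", "b", "c"]:
    vol.write_while_brick_down(name)

vol.client_access("b")   # healed on access by the client
vol.daemon_scan()        # remaining entries healed by the daemon
print(vol.healed)        # [('b', 'client'), ('a', 'daemon'), ('c', 'daemon')]
```

In this picture, files that no client touches are healed only when the daemon reaches them in its scan.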
