Hi Serkan,
sorry for the delay, I've been a bit busy lately.
On 13/04/16 13:59, Serkan Çoban wrote:
> Hi Xavier,
>
> Can you help me about the below issue? How can I increase the disperse
> heal speed?
It seems weird. Are there any related messages in the logs?
In this particular test, was the 100TB written to existing files (modified)
or to files newly created while the brick was down?
How many files have been modified?
> Also I would be grateful if you have detailed documentation about disperse
> heal,
> why heal happens on disperse volume, how it is triggered? Which nodes
> participate in heal process? Any client interaction?
The heal process is basically the same as the one used for replicate
volumes. There are two ways to trigger a self-heal:
* when an inconsistency is detected, the client initiates a background
self-heal of the inode
* the self-heal daemon scans the lists of modified files that the index
xlator creates whenever a modification is made while some node is down.
All these files are self-healed.
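
To make the two triggers a bit more concrete, here is a toy model in
Python. It is only a sketch: the class and method names (ToyVolume,
client_access, shd_sweep) are made up for illustration and are not
GlusterFS internals, and a "heal" here just bumps a version number
instead of rebuilding real fragments.

```python
class ToyVolume:
    """Toy model of heal triggering; not real GlusterFS code."""

    def __init__(self, bricks):
        self.up = {b: True for b in bricks}        # brick -> is it running?
        self.versions = {b: {} for b in bricks}    # brick -> {path: version}
        self.index = set()                         # paths written while a brick was down

    def write(self, path):
        new_version = 1 + max(v.get(path, 0) for v in self.versions.values())
        for brick, alive in self.up.items():
            if alive:
                self.versions[brick][path] = new_version
        if not all(self.up.values()):
            # Stand-in for what the index xlator records.
            self.index.add(path)

    def is_consistent(self, path):
        return len({v.get(path, 0) for v in self.versions.values()}) == 1

    def heal(self, path):
        latest = max(v.get(path, 0) for v in self.versions.values())
        for brick, alive in self.up.items():
            if alive:
                self.versions[brick][path] = latest
        if all(self.up.values()):
            self.index.discard(path)

    def client_access(self, path):
        # Trigger 1: a client notices an inconsistency and heals that inode.
        if not self.is_consistent(path):
            self.heal(path)

    def shd_sweep(self):
        # Trigger 2: the self-heal daemon walks the index and heals every entry.
        for path in sorted(self.index):
            self.heal(path)
```

In this model, files written while a brick is down end up in the index;
accessing one of them heals just that file, while a daemon sweep heals
everything that is still listed.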
Xavi
>
> Serkan
>
>
> ---------- Forwarded message ----------
> From: Serkan Çoban <cobanserkan at gmail.com>
> Date: Fri, Apr 8, 2016 at 5:46 PM
> Subject: disperse heal speed up
> To: Gluster Users <gluster-users at gluster.org>
>
>
> Hi,
>
> I am testing heal speed of disperse volume and what I see is 5-10MB/s per
> node.
> I increased disperse.background-heals to 32 and
> disperse.heal-wait-qlength to 256, but still no difference.
> One thing I noticed is that, when I kill a brick process, reformat it
> and restart it, heal speed is nearly 20x (200MB/s/node)
>
> But when I kill the brick, then write 100TB data, and start brick
> afterwards heal is slow (5-10MB/s/node)
>
> What is the difference between two scenarios? Why one heal is slow and
> other is fast? How can I increase disperse heal speed? Should I
> increase thread count to 128 or 256? I am on 78x(16+4) disperse volume
> and my servers are pretty strong (2x14 cores with 512GB ram, each node
> has 26x8TB disks)
>
> Gluster version is 3.7.10.
>
> Thanks,
> Serkan
>