Amudhan P
2017-Apr-18 08:36 UTC
[Gluster-users] How to Speed UP heal process in Glusterfs 3.10.1
I actually ran this command (find /mnt/gluster -d -exec getfattr -h -n
trusted.ec.heal {} \; > /dev/null) on a specific folder to trigger heal,
but it also showed no difference in speed.

I was asking about reading data in the same disperse set, like an 8+2
disperse config: if one disk is replaced and heal is in process, and the
client reads data which is available on the rest of the 9 disks.

I am sure there was no bottleneck on network/disk IO in my case.

I have tested 3.10.1 heal with disperse.shd-max-threads = 4; heal
completed 27GB of data in 13m15s. So it works well in a test environment,
but the production environment differs.

On Tue, Apr 18, 2017 at 12:47 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:
> You can increase heal speed by running the below command from a client:
> find /mnt/gluster -d -exec getfattr -h -n trusted.ec.heal {} \; > /dev/null
>
> You can write a script with different folders to make it parallel.
>
> In my case I saw 6TB of data healed within 7-8 days with the above
> command running.
>
>> did you face any issue in reading data from the rest of the good bricks
>> in the set, like slow read < KB/s?
> No, nodes generally have balanced network/disk IO during heal.
>
> You should make detailed tests on a non-prod cluster and try to find the
> optimum heal configuration for your use case.
> Our new servers are on the way; in a couple of months I will also do
> detailed tests with 3.10.x and parallel disperse heal, and will post the
> results here.
>
> On Tue, Apr 18, 2017 at 9:51 AM, Amudhan P <amudhan83 at gmail.com> wrote:
>> Serkan,
>>
>> I initially changed shd-max-thread from 1 to 2 and saw a little
>> difference; changing it to 4 & 8 doesn't make any difference.
>> Disk write speed was about <1MB, and data passed through the network to
>> the healing node from the other nodes was 4MB combined.
>>
>> Also, I tried ls -l from the mount point on the folders and files which
>> need to be healed, but have not seen any difference in performance.
>>
>> But after 3 days of the heal process running, disk write speed had
>> increased to 9-11MB, and data passed through the network to the healing
>> node from the other nodes was 40MB combined.
>>
>> Still, 14GB of data remains to be healed when comparing to the other
>> disks in the set.
>>
>> I saw in another thread you also had the issue with heal speed; did you
>> face any issue in reading data from the rest of the good bricks in the
>> set, like slow read < KB/s?
>>
>> On Mon, Apr 17, 2017 at 2:05 PM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>> Normally I see 8-10MB/sec/brick heal speed with gluster 3.7.11.
>>> I tested parallel heal for disperse with version 3.9.0 and saw that it
>>> increases the heal speed to 20-40MB/sec.
>>> I tested with shd-max-threads 2, 4, and 8 and saw that the best
>>> performance was achieved with 2 or 4 threads.
>>> You can try to start with 2, then test with 4 and 8 and compare the
>>> results.
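The "script with different folders to make it parallel" suggested above
might be sketched as follows. This is only a sketch: /mnt/gluster, the
worker count of 4, and -depth (the portable spelling of find's -d) are
assumptions to adapt to your cluster.

```shell
#!/bin/sh
# Sketch: run the heal-trigger find in parallel, one worker per
# top-level directory of the mount, 4 workers at a time.
MOUNT=${MOUNT:-/mnt/gluster}

# List first-level directories and hand each one to a worker that walks
# it, reading trusted.ec.heal on every entry to trigger self-heal.
# The trailing ':' keeps each worker's exit status at 0 so xargs does
# not abort the sweep on per-file getfattr errors.
find "$MOUNT" -mindepth 1 -maxdepth 1 -type d -print0 2>/dev/null |
  xargs -0 -r -n 1 -P 4 sh -c \
    'find "$0" -depth -exec getfattr -h -n trusted.ec.heal {} \; >/dev/null 2>&1; :'

echo "heal trigger sweep finished"
```

Raising -P spreads the getfattr load across more directories at once,
but also raises read load on the good bricks, so it is worth testing
small values first on a non-prod cluster.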
Serkan Çoban
2017-Apr-18 09:59 UTC
[Gluster-users] How to Speed UP heal process in Glusterfs 3.10.1
> I was asking about reading data in the same disperse set, like an 8+2
> disperse config: if one disk is replaced and heal is in process, and the
> client reads data which is available on the rest of the 9 disks.

My use case is write heavy; we barely read data, so I do not know whether
read speed degrades during heal. But I do know that write speed does not
change during heal.

How big are your files? How many files are there on average in each
directory?

On Tue, Apr 18, 2017 at 11:36 AM, Amudhan P <amudhan83 at gmail.com> wrote:
> I actually ran this command (find /mnt/gluster -d -exec getfattr -h -n
> trusted.ec.heal {} \; > /dev/null) on a specific folder to trigger heal,
> but it also showed no difference in speed.
>
> I was asking about reading data in the same disperse set, like an 8+2
> disperse config: if one disk is replaced and heal is in process, and the
> client reads data which is available on the rest of the 9 disks.
>
> I am sure there was no bottleneck on network/disk IO in my case.
>
> I have tested 3.10.1 heal with disperse.shd-max-threads = 4; heal
> completed 27GB of data in 13m15s. So it works well in a test
> environment, but the production environment differs.
>
> [snip]
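For context, the test-environment figure quoted above (27GB healed in
13m15s with disperse.shd-max-threads = 4) works out to roughly 34 MB/s
overall, a quick back-of-envelope check:

```shell
# Back-of-envelope heal rate from the figures in the thread:
# 27GB healed in 13m15s.
size_mb=$((27 * 1024))      # 27GB expressed in MB
secs=$((13 * 60 + 15))      # 13m15s = 795 seconds
echo "$((size_mb / secs)) MB/s"   # prints "34 MB/s"
```

That is consistent with the 20-40MB/sec range reported earlier for
parallel disperse heal, versus the 8-10MB/sec/brick seen on 3.7.11.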